Overview
The following diagram shows the recommended configuration for running one or more clusters, each with multiple nodes (at least 2, though we recommend at least 4: 2 or more API nodes and 2 or more DB nodes).
Note that sharding is not fully supported (or at least not fully tested) as of the March 2012 release. Apart from one weekly maintenance script (which is awaiting a new MongoDB feature), we believe it should work. In any case, sharding is not necessary as of 3M documents indexed.
Step 1: Configure AWS settings
There are 2 things that need to be done in the AWS management console to prepare for an Infinit.e install:
- Set up security groups
- Set up S3 storage for index and DB backups
Security groups
The only port that is needed is port 80, though allowing ssh, at least from authorized IP addresses, is standard.
There is no functional need to separate the different clusters into different groups, but there are obvious safety/security reasons to do so, eg to stop someone logged in to cluster "X" from deliberately or inadvertently accessing the technology stack on cluster "Y".
So having one group per cluster that disallows internal traffic (eg 10.*.*.*) from outside the group is probably desirable. Note that nodes within a group still have unrestricted access to each other, which is what we want.
An even stricter configuration would be to have 2 groups per cluster, one for API nodes and one for DB nodes, only allowing access on ports 27017 and 27016 between them.
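As a rough illustration of the stricter 2-groups-per-cluster layout, the rules can be written down as plain data before being entered in the AWS management console. This is only a sketch: the group naming scheme ("<cluster>-api", "<cluster>-db"), the IngressRule type, the example address range, and the choice to open ports 27017 and 27016 on the DB group to traffic from the API group are all assumptions made for the example, not part of the Infinit.e install itself. It also assumes ssh on its standard port 22.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """One inbound rule (hypothetical type, for illustration only)."""
    port: int
    source: str  # a CIDR block, or the name of another security group


def strict_cluster_rules(cluster, ssh_cidrs):
    """Return the inbound rules for one cluster, keyed by group name.

    Assumed layout: one group for API nodes, one for DB nodes.
    """
    api_group = f"{cluster}-api"  # hypothetical naming scheme
    db_group = f"{cluster}-db"    # hypothetical naming scheme
    # ssh (port 22) only from the authorized address ranges
    ssh = [IngressRule(22, cidr) for cidr in ssh_cidrs]
    return {
        # port 80 is the only port needed from outside
        api_group: [IngressRule(80, "0.0.0.0/0")] + ssh,
        # assumption: DB nodes accept 27017 and 27016 from the API group only
        db_group: [IngressRule(port, api_group) for port in (27017, 27016)] + ssh,
    }


# Example address range taken from the block reserved for documentation
rules = strict_cluster_rules("clusterX", ["203.0.113.0/24"])
for group, group_rules in rules.items():
    for rule in group_rules:
        print(group, rule.port, rule.source)
```

Keeping the rules as data like this makes it easy to compare clusters "X" and "Y" and confirm that neither group references the other cluster.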
S3 storage
Choose a root S3 path (called S3ROOT here), eg "infinit.e-saas.ikanow.com"; this is entered into the properties.configuration file (see below). The following buckets are then required:
- mongo.<S3ROOT>: daily database backups, put in the same region as the cluster.
- elasticsearch.<S3ROOT>: daily index backups, put in the same region as the cluster.
- backup.mongo.<S3ROOT>: weekly database backups, put in a different region (and ideally country) to the cluster.
- backup.elasticsearch.<S3ROOT>: weekly index backups, put in a different region (and ideally country) to the cluster.
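The bucket naming above is mechanical, so a small helper can generate the full list from S3ROOT, which may be handy as a checklist when creating the buckets. This is a minimal sketch: the function name and the "same-region" / "different-region" labels are made up for the example, and it does not talk to AWS at all.

```python
def required_buckets(s3_root):
    """Return (bucket name, contents, placement) for each required bucket.

    The prefixes follow the list above; the placement labels are
    illustrative strings only, not AWS region names.
    """
    layout = [
        ("mongo", "daily database backups", "same-region"),
        ("elasticsearch", "daily index backups", "same-region"),
        ("backup.mongo", "weekly database backups", "different-region"),
        ("backup.elasticsearch", "weekly index backups", "different-region"),
    ]
    return [(f"{prefix}.{s3_root}", contents, placement)
            for prefix, contents, placement in layout]


for name, contents, placement in required_buckets("infinit.e-saas.ikanow.com"):
    print(f"{name}: {contents} ({placement})")
```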
Step 2: Create a properties.configuration file
TODO
Step 3: Start a load balancer
TODO
Step 4: Start database nodes
TODO
Step 5: Start API nodes
TODO
Step 6: Connect the API nodes to the load balancer
TODO
Miscellaneous notes
TODO What to do next
TODO note that we're not using the CloudFormation templates quite like they're supposed to be used