...
- Load balancers: don't scale themselves but (a) require very little resource because are very simple, (b) can scale linearly using DNS round robin.
- API nodes: elasticsearch scales Lucene reads and writes "horizontally" (in the above diagram), Hadoop also scales horizontally, the harvesters scale by partitioning the sources (via the scalable database), and the per-query processing scales by virtue of the load balancer.
- DB nodes: MongoDB writes scale horizontally (shards), and reads scale vertically (replicas). Each API node has a complete list of (horizontal) shards and (vertical) replicas (cached from the config server) so there are no bottlenecks.
- Config server nodes: These nodes store primary key ranges and regularly update the caches in the API and DB nodes, so the resource requirements remain approximately constant.
Licensing of different components
Gliffy |
---|
| |
---|
size | 500 |
---|
name | ikanow_licensing |
---|
|
For reference, the top level data model:
...