Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Load balancers: don't scale themselves but (a) require very little resource because are very simple, (b) can scale linearly using DNS round robin.
  • API nodes: elasticsearch scales Lucene reads and writes "horizontally" (in the above diagram), Hadoop also scales horizontally, the harvesters scale by partitioning the sources (via the scalable database), and the per-query processing scales by virtue of the load balancer.
  • DB nodes: MongoDB writes scale horizontally (shards), and reads scale vertically (replicas). Each API node has a complete list of (horizontal) shards and (vertical) replicas (cached from the  config server) so there are no bottlenecks.
  • Config server nodes: These nodes store primary key ranges and regularly update the caches in the API and DB nodes, so the resource requirements remain approximately constant. 

Licensing of different components

Gliffy
size500
nameikanow_licensing

Anchor
db
db
For reference, the top level data model:

...