...

This section describes steps that can be taken to squeeze the most performance out of a cluster (at the expense of a more complex configuration).

Hardware

This page assumes that the user is running more powerful machines, for example 12 cores and 64GB of RAM, with 1 or 2 fast RAID volumes (see below). It also assumes that the (Elasticsearch) real-time index and the (MongoDB) data store are on different nodes.

The configuration suggested below assumes at least this specification. Where more CPU or memory would change the suggested configuration, this is noted.

Disk Configuration

(This section focuses on magnetic disks; SSD is briefly mentioned at the bottom.)

...

Info

We have not tested Infinit.e using SSD, though both MongoDB and Elasticsearch have been used on SSD. The general approach to utilizing SSD is:

  • If you have enough SSD then use it as the /raidarray or /dbarray
  • If not then set it up as an additional cache in between memory and disk

Java version

Infinit.e is currently tested against Oracle's JDK6 and JDK7. Oracle JDK8 testing is ongoing. Once tested, JDK8 is expected to be significantly faster, for at least two reasons:

...

For now the recommended version is the latest Oracle JDK7.
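
To confirm which Java version a node is actually running, the output of "java -version" can be inspected. The following is a minimal sketch only: the function name java_major_version is hypothetical (not part of Infinit.e), and it assumes the usual version string format such as java version "1.7.0_80".

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Returns None if no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.7.0_80" means Java 7.
    # Newer scheme: "11.0.2" means Java 11.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first
```

Note that "java -version" writes to stderr rather than stdout, so a wrapper that runs the command needs to capture stderr before passing the text to this function.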

Virtual Memory

It is recommended that swap space of at least 10GB be configured, and probably 20GB for a machine with 60GB of RAM.
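
The swap recommendation can be checked on a Linux node by reading /proc/meminfo. The following is a minimal sketch, not part of Infinit.e: the function names parse_meminfo and swap_shortfall_gb are hypothetical, and the 10GB default simply mirrors the minimum suggested above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value.

    Sizes in /proc/meminfo are reported in kB.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_shortfall_gb(meminfo, min_swap_gb=10.0):
    """Return how many GB of swap are missing relative to min_swap_gb.

    Returns 0.0 when the configured swap already meets the minimum.
    """
    swap_gb = meminfo.get("SwapTotal", 0) / (1024.0 * 1024.0)
    return max(0.0, min_swap_gb - swap_gb)
```

On a node, this would be used by passing the contents of /proc/meminfo to parse_meminfo and then calling swap_shortfall_gb with 10.0 (or 20.0 for the larger machines); a non-zero result means more swap should be added.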

Configuration file settings

(Relative to the central configuration file described here):

  • TODO

RPM to node distribution

Assuming Nx API nodes and Mx DB nodes, the "standard" mapping is:

...

To maximize ingest performance, you can also install the infinit.e-processing-engine RPM on the DB nodes. This doubles the number of harvesters. Note that any additional JARs must be copied into the DB nodes' plugins/extractors/unbundled directories (see here), just as for the API nodes.

Post install configuration

...

Hadoop configuration

In the Hadoop configuration guide, the following settings are recommended:

  • mappers (XXX): XXXX
  • reducers (XXX): XXX

TODO

Post install configuration

This section describes changes that are made to the processes' configuration files after RPMs have been installed.

Warning

The files described here are designated as RPM configuration files, meaning that they are only overwritten if the RPM-side version of the file is updated (in which case the "old" version is saved to "<filename>.rpmsave"). Care must therefore be taken when updating RPMs to check whether this has happened; if so, the user must merge the files by hand.

  • /etc/sysconfig/tomcat6-interface-engine:
    • XXX
    • (After changing, restart the corresponding service with: XXX)
  • /etc/sysconfig/infinite-index-engine:
    • XXX
    • (After changing, restart the corresponding service with: XXX)
  • /opt/infinite-home/bin/infinite-px-engine.sh:
    • XXX
    • (After changing, restart the corresponding service with: XXX)
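
After an RPM update, it can help to check whether any of the files listed above were set aside and now need merging by hand. The following is a minimal sketch, not part of Infinit.e: the function name find_rpm_leftovers is hypothetical, and the check for ".rpmnew" is an added assumption based on RPM's general behaviour for configuration files marked "noreplace". The match is case-insensitive so that either capitalisation of the suffix is found.

```python
import os

# Configuration files named in the list above.
CONFIG_FILES = [
    "/etc/sysconfig/tomcat6-interface-engine",
    "/etc/sysconfig/infinite-index-engine",
    "/opt/infinite-home/bin/infinite-px-engine.sh",
]

# Suffixes RPM appends when it sets aside an old file or defers a new one.
SUFFIXES = (".rpmsave", ".rpmnew")


def find_rpm_leftovers(paths, suffixes=SUFFIXES):
    """Return files next to each path named '<filename><suffix>'.

    The comparison ignores case; paths whose directory does not
    exist are skipped.
    """
    found = []
    for path in paths:
        directory, name = os.path.split(path)
        if not os.path.isdir(directory):
            continue
        wanted = {(name + suffix).lower() for suffix in suffixes}
        for entry in sorted(os.listdir(directory)):
            if entry.lower() in wanted:
                found.append(os.path.join(directory, entry))
    return found
```

Calling find_rpm_leftovers(CONFIG_FILES) after an update returns an empty list when nothing was set aside; any path it returns should be compared against the live file and merged manually.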

Source JSON configuration

The file extractor is the most optimized, so it should be used wherever possible.

XXX

XXX roadmap