...

Note: This document covers both OFFLINE and ONLINE versions of the file.

1. Parameters

...

that should (almost) always be set

1.1 Basic Infinit.e Settings

...

1.13 Entity Extractor Properties

 The following properties are required to configure the use of Alchemy or Open Calais.

#-------------------------------------------------------------------------------
# 1.13] Entity Extractor Properties
#-------------------------------------------------------------------------------
# Alchemy and Open Calais Keys:
# (Obtain from alchemyapi.com or opencalais.com)
extractor.key.alchemyapi=
extractor.key.opencalais=
#----------------------------------------------
# Entity extraction type selection: opencalais or alchemyapi or none
# ("opencalais" has a much higher daily limit than "alchemyapi" (1000/day), so it is recommended for free use.
#  "alchemyapi" extracts sentiment; "opencalais" extracts entity associations. Note this can be overridden per source.)
extractor.entity.default=
# Text extraction type selection: boilerplate or alchemyapi or none
# ("alchemyapi" is much better, but has the limit discussed above. Note this can be overridden per source)
extractor.text.default=
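
As an illustration, a free-use deployment that follows the recommendation in the comments above might be filled in roughly as follows. This is only a sketch: the key value is a hypothetical placeholder, and choosing "boilerplate" for text extraction is an assumption made to stay clear of the AlchemyAPI daily limit.

```properties
# Hypothetical example values, not defaults shipped with the platform
extractor.key.alchemyapi=
# Placeholder: substitute the key obtained from opencalais.com
extractor.key.opencalais=MY_OPENCALAIS_KEY
# Higher free limit; extracts entity associations
extractor.entity.default=opencalais
# Avoids the AlchemyAPI limit discussed above
extractor.text.default=boilerplate
```

Both defaults can still be overridden per source, so a single source that needs sentiment could opt in to "alchemyapi" without changing these values.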

...

The ui.end.point.url property is used to tell the UI where to connect to the Infinit.e API.

#-------------------------------------------------------------------------------
# 1.14] Interface Related Properties for the AppConstants.js file found in:
#       /mnt/opt/infinite-tomcat/interface-engine/webapps/ROOT/
#-------------------------------------------------------------------------------
# The REST end point of the server (or a DNS/AWS load balancer across multiple rest end points):
# (Will normally end "/api/") 
ui.end.point.url=http://MY_REST_ENDPOINT/api/
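
For example, if the API nodes sit behind a load balancer, the property might look like the sketch below. The host name is purely hypothetical; only the trailing "/api/" comes from the comment above.

```properties
# Hypothetical load balancer address in front of several REST end points
ui.end.point.url=http://infinite-lb.example.com/api/
```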

...

2.2 Software as a Service Properties

 Properties that are modified only if Infinit.e is deployed in SaaS mode (which is uncommon).

#-------------------------------------------------------------------------------
# 2.2] Software as a service (SAAS) settings
#-------------------------------------------------------------------------------
# If true, allows admin requests that come from trusted sources to have admin privileges: 
app.saas=false
# A list of trusted DNS/IP addresses (eg from CMS):
app.saas.trusted.dns=
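
A SaaS deployment might be sketched as below. The host name is hypothetical, and only a single trusted entry is shown because the separator for multiple entries is not stated here.

```properties
# Hypothetical SaaS example: admin requests from the CMS host are trusted
app.saas=true
# Placeholder DNS name of the trusted CMS server
app.saas.trusted.dns=cms.example.com
```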

2.3 Amazon Services Properties

 The use.aws property is used to configure whether or not the platform is deployed on Amazon EC2.

#-------------------------------------------------------------------------------
# 2.3] Amazon services properties
#-------------------------------------------------------------------------------
# Values: 0=false, 1=true
# If deployed on an EC2 cluster set this to 1:
use.aws=0

2.6 API Search Test

 Default search test terms and expected results values used to monitor the Infinit.e service.

#-------------------------------------------------------------------------------
# 2.6] API Search Test Terms and Expected Results
#-------------------------------------------------------------------------------
# List of terms formatted like: "*" "something" "something":
# (The continuous testing randomly selects one of these for querying the API)
api.search.test.terms="*"
# The expected number of results (max 100); if a different number comes back, the system alerts:
api.search.expected.results=0
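
Because one term is picked at random and compared against a single expected count, every listed term presumably needs to return the same number of results. Under that reading, a populated system might be monitored with something like the sketch below. The search terms are hypothetical, and the value 100 assumes each term matches at least 100 documents, so the count is capped at the stated maximum.

```properties
# Hypothetical terms, each assumed to match at least 100 documents
api.search.test.terms="*" "election" "weather"
# Capped at the documented maximum of 100
api.search.expected.results=100
```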

2.7 Amazon AWS Settings

 Property used by s3cmd to connect to Amazon to move files around.

#-------------------------------------------------------------------------------
# 2.7] Amazon AWS Settings
#-------------------------------------------------------------------------------
# Used for s3cmd, see their web page for details: 
s3.gpg.passphrase=

2.8 MongoDB Properties

 MongoDB database configuration properties.

#-------------------------------------------------------------------------------
# 2.8] MongoDB Properties
#-------------------------------------------------------------------------------
# (server/port should normally point to localhost:27017, where API nodes have a mongos)
db.server=localhost
db.port=27017
# db.sharded - 0 = false and 1 = true
db.sharded=0
# The max number of documents to store (eg 10M). Docs will be dropped in order of age.
# (Not currently supported):
db.capacity=10000000
# MongoDB config server or servers (must be 1 or 3 comma separated IPs), non-EC2/AWS installations only
db.config.servers=
db.replica.sets=
#----------------------------------------------
# db.cluster.subnet - used for non-EC2/AWS only installations to help mongodb configurations
# identify proper host ip addresses, e.g. 127.0.0.
db.cluster.subnet=
#----------------------------------------------
# The location from which to fetch the geo.bson dump used for feature.geo
# can start s3://, http:// or https://, else is assumed to be a file, eg
#db.geo_archive=s3://config.saas.infinite.ikanow.com/geo.bson.tar.gz
# Can always be retrieved here
db.geo_archive=http://www.ikanow.com/infinit.e-preinstall/geo.bson.tar.gz
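
For a non-EC2/AWS installation, the cluster-related properties might be filled in along these lines. This is a sketch under assumptions: all IP addresses are hypothetical, a sharded setup is assumed, and db.replica.sets is left blank because its format is not described here.

```properties
# Hypothetical non-EC2 example
db.server=localhost
db.port=27017
# Assumes a sharded cluster
db.sharded=1
# Must be 1 or 3 comma separated IPs (placeholder addresses)
db.config.servers=10.0.0.11,10.0.0.12,10.0.0.13
# Format not described in this document, so left unset
db.replica.sets=
# Subnet prefix shared by the placeholder addresses above
db.cluster.subnet=10.0.0.
```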

2.9 UI Inactivity Timeout

 

#-------------------------------------------------------------------------------
# 2.9] UI inactivity timeout (in seconds)
#-------------------------------------------------------------------------------
access.timeout=1800

2.10 Elasticsearch Properties

 

#-------------------------------------------------------------------------------
# 2.10] Elasticsearch Properties
#----------------------------------------------
# Discovery mode = ec2 (if running on AWS) or zen (specify a list of IPs below):
elastic.node.discovery=ec2
#----------------------------------------------
# ES nodes, e.g.: elastic.search.nodes='NODE1:9300','NODE2:9300','NODE3:9300':
# Needed if elastic.node.discovery=zen (not EC2/AWS): a set of IPs to try (>= 1 must be running elasticsearch)
elastic.search.nodes=
#-------------------------------------------------------------------------------
# mlockall = should equal true except if running on a machine with < 4GB of RAM
bootstrap.mlockall=true
# (Should normally be localhost:9300, unless an API node is running with no index node) 
elastic.url=localhost:9300
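
Outside AWS, a zen discovery setup might look like the following sketch. The node addresses are hypothetical; the quoting and port follow the format shown in the comment above.

```properties
# Hypothetical non-AWS example using zen discovery
elastic.node.discovery=zen
# Placeholder IPs; at least one must be running elasticsearch
elastic.search.nodes='10.0.0.21:9300','10.0.0.22:9300'
# Assumes a machine with at least 4GB of RAM
bootstrap.mlockall=true
elastic.url=localhost:9300
```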

2.11 Harvester Properties

 

#-------------------------------------------------------------------------------
# 2.11] Harvester Properties
#-------------------------------------------------------------------------------
# Comma-separated-list from File,Database,Feed (note Database and Feed need jars not bundled with the RPM)
harvester.types=File,Database,Feed
# Web crawling etiquette: the time (in ms) to wait between consecutive accesses to the same site (10s is standard)
harvest.feed.wait=10000
# The minimum time between consecutive harvests (avoids thrashing FS/DB/RSS when there's nothing to get)
harvest.mintime.ms=300000
# Restricts the number of docs that can be harvested per cycle for memory reasons:
harvest.maxdocs_persource=5000
# Threading configuration type:num_threads (type from above):
# (eg for RSS heavy increase the "feed", for DB heavy increase the "database" etc. Beyond 20 there is limited benefit).
harvest.threads=file:5,database:5,feed:20
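
As a sketch of how these values interact, a more conservative configuration for a memory-constrained node might look like this. All values are hypothetical illustrations rather than recommendations; the units follow the defaults above (10000 ms = 10 s, 300000 ms = 5 minutes).

```properties
# Hypothetical low-memory, polite-crawling example
harvester.types=File,Database,Feed
# 20 s between accesses to the same site (double the standard 10 s)
harvest.feed.wait=20000
# At least 10 minutes between consecutive harvests
harvest.mintime.ms=600000
# Fewer docs per source per cycle to reduce memory use
harvest.maxdocs_persource=2000
# Fewer feed threads than the default of 20
harvest.threads=file:5,database:5,feed:10
```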

2.12 Hadoop Config Path

...

The Hadoop config path is a local folder where Infinit.e stores MapReduce jobs if Hadoop is used.

#-------------------------------------------------------------------------------
# 2.12] Hadoop config path
#-------------------------------------------------------------------------------
hadoop.configpath=/mnt/opt/hadoop-infinite/mapreduce/

2.13 Entity Extractor Properties

...

#-------------------------------------------------------------------------------
# 2.13] Entity Extractor Properties
#-------------------------------------------------------------------------------
# Alchemy extraction level
# 1==people postproc, 2==geo postproc, 3==both
# (This uses some hard-coded heuristics to work around known AlchemyAPI errors)
app.alchemy.postproc=3

2.14 UI Related Properties

 

#-------------------------------------------------------------------------------
# 2.14] Interface Related Properties for the AppConstants.js file found in:
#       /mnt/opt/infinite-tomcat/interface-engine/webapps/ROOT/
#-------------------------------------------------------------------------------
# For SaaS applications, the URL of the web page (eg containing CMS links for forgot password/logout etc):
# (Can be left blank otherwise)
ui.domain.url=
# Forgot password URL: (SaaS only, ie integrated with a CMS)
# (relative to ui.domain.url):
ui.forgot.password=forgot-password/
# Logout URL: (SaaS only, ie integrated with a CMS)
# (relative to ui.domain.url):
ui.logout=?action=logout

2.15 Map API Key

 Obsolete: Google has ceased support for this API and is not generating any new keys.

#-------------------------------------------------------------------------------
# 2.15] Maps API key:
#-------------------------------------------------------------------------------
# Not needed for MapQuest open API
google.maps.api.key=

 

...