Note: This document covers both OFFLINE and ONLINE versions of the file.

Key configuration properties - must be set

1. Parameters that should (almost) always be set

1.1 Basic Infinit.e Settings

Change the admin.password and test.user.password values from the default of infinit.e!2012.

#-------------------------------------------------------------------------------
# 1.1] Basic Infinit.e Settings
#-------------------------------------------------------------------------------
# Default admin and test user passwords
# Admin: admin_infinite@ikanow.com
# CHANGE THIS:
admin.password=infinit.e!2012
# Test User: test_user@ikanow.com
# CHANGE THIS:
test.user.password=infinit.e!2012

Note: These password values are encrypted when the admin and test user accounts are created in the Infinit.e database.
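Because both passwords ship with the same well-known default, it can be worth checking a properties file before deployment. The following is a minimal sketch, not part of Infinit.e: the helper names `parse_properties` and `unchanged_passwords` are hypothetical, and the parser only handles simple `key=value` lines and `#` comments.

```python
# Hypothetical pre-deployment check; not shipped with Infinit.e.
DEFAULT_PASSWORD = "infinit.e!2012"
PASSWORD_KEYS = ("admin.password", "test.user.password")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def unchanged_passwords(props):
    """Return the password keys that still hold the shipped default."""
    return [key for key in PASSWORD_KEYS if props.get(key) == DEFAULT_PASSWORD]


sample = """
# CHANGE THIS:
admin.password=infinit.e!2012
test.user.password=s0me-new-value
"""
print(unchanged_passwords(parse_properties(sample)))
```

An empty list means neither password is still at its default.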

1.3 Amazon Services Properties

The s3.url property is required for backups when Infinit.e is hosted on Amazon.

#-------------------------------------------------------------------------------
# 1.3] Amazon services properties
#-------------------------------------------------------------------------------
# This is the root s3 bucket name to be used for backups (use.aws=1 only):
# (The following names are used: mongo.<s3.url>, elasticsearch.<s3.url> .. daily backups in the same region
#  backup.mongo.<s3.url>, backup.elasticsearch.<s3.url> ... monthly backups in a different region
#  Note these dirs need to be set up manually)
s3.url=

1.4 Email Server Settings

The following properties must be filled out for Infinit.e to send messages via email (system errors, communications to/from users, etc.).

#-------------------------------------------------------------------------------
# 1.4] EMail Server Settings
#-------------------------------------------------------------------------------
# The server to be used for mail transactions (eg smtp.google.com if Internet-enabled, contact your sysadmin if not):
mail.server=
# Base-64 encoded SHA-256 hash of username:
mail.username=
# Base-64 encoded SHA-256 hash of password:
mail.password=
# This URL is used as the base for links included in the 
# So should point to an accessible REST endpoint (eg the same as ui.end.point.url below)
url.root=http://MY_REST_ENDPOINT/api/
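The comments above describe mail.username and mail.password as Base-64 encoded SHA-256 hashes. A minimal sketch of producing such a value is shown here; the function name `encode_credential` is hypothetical, and hashing the UTF-8 bytes of the plain value is an assumption that should be confirmed against your installation.

```python
import base64
import hashlib


def encode_credential(value):
    """Return the Base-64 encoding of the SHA-256 digest of value.

    Assumption: the value is hashed as UTF-8 bytes with no salt.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder input; substitute the real username or password.
print(encode_credential("my_mail_user"))
```

A SHA-256 digest is 32 bytes, so the encoded value is always 44 characters long and ends with "=".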

1.5 Email Addresses for Log Files

These properties set the addresses that log files are sent from and to.

#-------------------------------------------------------------------------------
# 1.5] EMail Addresses for log files etc.
#-------------------------------------------------------------------------------
# All emails come from this user:
log.files.mail.from=
# System alert emails are sent to this user:
log.files.mail.to=

1.7 Amazon AWS Settings

The AWS access and secret keys are required for the Infinit.e platform to access AWS. They are only needed if use.aws=1.

#-------------------------------------------------------------------------------
# 1.7] Amazon AWS Settings
#-------------------------------------------------------------------------------
# AWS keys (only needed if use.aws=1)
aws.access.key=
aws.secret.key=

1.8 MongoDB Properties

These MongoDB configuration properties must be set on any non-EC2/AWS installation.

#-------------------------------------------------------------------------------
# 1.8] MongoDB Properties
#-------------------------------------------------------------------------------
# MongoDB config server or servers (must be 1 or 3 comma separated IPs), non-EC2/AWS installations only
db.config.servers=
db.replica.sets=
#----------------------------------------------
# db.cluster.subnet - used for non-EC2/AWS only installations to help mongodb configurations
# identify proper host ip addresses, e.g. 127.0.0.
db.cluster.subnet=
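The "1 or 3 comma separated IPs" rule for db.config.servers is easy to get wrong by hand. The sketch below checks a candidate value before it goes into the file. The helper names are hypothetical, and treating db.cluster.subnet as a plain string prefix (as the "127.0.0." example suggests) is an assumption.

```python
import ipaddress


def parse_config_servers(value):
    """Split db.config.servers and enforce the '1 or 3 IPs' rule."""
    servers = [item.strip() for item in value.split(",") if item.strip()]
    if len(servers) not in (1, 3):
        raise ValueError(f"expected 1 or 3 config servers, got {len(servers)}")
    for server in servers:
        ipaddress.ip_address(server)  # raises ValueError if not a valid IP
    return servers


def outside_subnet(servers, subnet_prefix):
    """Return servers that do not start with the db.cluster.subnet prefix.

    Assumption: the subnet is matched as a string prefix such as "10.0.0.".
    """
    return [s for s in servers if not s.startswith(subnet_prefix)]


# Example addresses are placeholders.
servers = parse_config_servers("10.0.0.1, 10.0.0.2, 10.0.0.3")
print(servers)
print(outside_subnet(servers, "10.0.0."))
```

A value with two addresses, or one that contains a host name instead of an IP, raises ValueError.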

1.10 Elasticsearch Properties

The elastic.cluster property is required by all installations. The elastic.search.nodes property is only used in non-EC2/AWS installations.

#-------------------------------------------------------------------------------
# 1.10] Elasticsearch Properties
#-------------------------------------------------------------------------------
# Cluster name 
# Any unique name within the EC2 cluster/subnet: 
elastic.cluster=
#----------------------------------------------
# ES nodes, e.g.: elastic.search.nodes='NODE1:9300','NODE2:9300','NODE3:9300':
# Needed if discovery.mode=zen (not EC2/AWS), a set of IPs to try (>= 1 must be running elasticsearch)
elastic.search.nodes=
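The elastic.search.nodes value is a comma-separated list of single-quoted host:port pairs, as in the example comment above. A small sketch for building that string from a list of hosts follows; the function name `format_es_nodes` is hypothetical, and 9300 is taken from the example rather than from any confirmed default.

```python
def format_es_nodes(hosts, port=9300):
    """Build the quoted, comma-separated node list used by elastic.search.nodes."""
    return ",".join(f"'{host}:{port}'" for host in hosts)


# NODE1..NODE3 are the placeholder names from the comment above.
print("elastic.search.nodes=" + format_es_nodes(["NODE1", "NODE2", "NODE3"]))
```

With the placeholder hosts this reproduces the format shown in the comment: 'NODE1:9300','NODE2:9300','NODE3:9300'.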

1.13 Entity Extractor Properties

 

#-------------------------------------------------------------------------------
# 1.13] Entity Extractor Properties
#-------------------------------------------------------------------------------
# Alchemy and Open Calais Keys:
# (Obtain from alchemyapi.com or opencalais.com)
extractor.key.alchemyapi=
extractor.key.opencalais=
#----------------------------------------------
# Entity extraction type selection: opencalais or alchemyapi or none
# ("opencalais" has a much higher limit than "alchemyapi" (1000/day) so is recommended for free use.
#  "alchemyapi" extracts sentiment, "opencalais" extracts entity associations. Note this can be overridden per source)
extractor.entity.default=
# Text extraction type selection: boilerplate or alchemyapi or none
# ("alchemyapi" is much better, but has the limit discussed above. Note this can be overridden per source)
extractor.text.default=

1.14 Interface Related Properties

The ui.end.point.url property is used to set the REST end point of the server (or a DNS/AWS load balancer across multiple REST end points).

#-------------------------------------------------------------------------------
# 1.14] Interface Related Properties for the AppConstants.js file found in:
#       /mnt/opt/infinite-tomcat/interface-engine/webapps/ROOT/
#-------------------------------------------------------------------------------
# The REST end point of the server (or a DNS/AWS load balancer across multiple rest end points):
# (Will normally end "/api/") 
ui.end.point.url=http://MY_REST_ENDPOINT/api/
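Both url.root and ui.end.point.url ship with the MY_REST_ENDPOINT placeholder and are expected to end in "/api/". The following sketch flags the most common mistakes before the file is deployed. The function name `endpoint_warnings` and the example host are hypothetical.

```python
from urllib.parse import urlparse


def endpoint_warnings(url):
    """Return a list of human-readable problems with a REST endpoint URL."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if not parsed.netloc or parsed.netloc == "MY_REST_ENDPOINT":
        problems.append("host is missing or still the placeholder")
    if not parsed.path.endswith("/api/"):
        problems.append('path does not end with "/api/"')
    return problems


# The shipped default still contains the placeholder host.
print(endpoint_warnings("http://MY_REST_ENDPOINT/api/"))
# infinite.example.com is an illustrative host name only.
print(endpoint_warnings("https://infinite.example.com/api/"))
```

An empty list means none of these three checks failed; it does not confirm that the endpoint is reachable.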

 

2. Properties that can normally be left at their default

2.2 Software as a Service Properties

SaaS settings should normally be left at their default values (shown below).

#-------------------------------------------------------------------------------
# 2.2] Software as a service (SAAS) settings
#-------------------------------------------------------------------------------
# If true, allows admin requests that come from trusted sources to have admin privileges: 
app.saas=false
# A list of trusted DNS/IP addresses (eg from CMS):
app.saas.trusted.dns=

2.3 Amazon Services Properties

Amazon services properties need to be set if Infinit.e is deployed to servers hosted by Amazon.

#-------------------------------------------------------------------------------
# 2.3] Amazon services properties
#-------------------------------------------------------------------------------
# Values: 0=false, 1=true
# If deployed on an EC2 cluster set this to 1:
use.aws=0

2.6 API Search Test

 

#-------------------------------------------------------------------------------
# 2.6] API Search Test Terms and Expected Results
#-------------------------------------------------------------------------------
# List of terms formatted like: "*" "something" "something":
# (The continuous testing randomly selects one of these for querying the API)
api.search.test.terms="*"
# The expected results (max 100), if a different number comes back, the system alerts:
api.search.expected.results=0

2.7 Amazon AWS Settings

 

#-------------------------------------------------------------------------------
# 2.7] Amazon AWS Settings
#-------------------------------------------------------------------------------
# Used for s3cmd, see their web page for details: 
s3.gpg.passphrase=

2.8 MongoDB Properties

 

#-------------------------------------------------------------------------------
# 2.8] MongoDB Properties
#-------------------------------------------------------------------------------
# (server/port should normally point to localhost:27017), where API nodes have a mongos
db.server=localhost
db.port=27017
# db.sharded - 0 = false and 1 = true
db.sharded=0
# The max number of documents to store (eg 10M). Docs will be dropped in order of age.
# (Not currently supported):
db.capacity=10000000
#----------------------------------------------
# The location from which to fetch the geo.bson dump used for feature.geo
# can start s3://, http:// or https://, else is assumed to be a file, eg
#db.geo_archive=s3://config.saas.infinite.ikanow.com/geo.bson.tar.gz
# Can always be retrieved here
db.geo_archive=http://www.ikanow.com/infinit.e-preinstall/geo.bson.tar.gz

2.9 UI Inactivity Timeout

 

#-------------------------------------------------------------------------------
# 2.9] UI inactivity timeout (in seconds)
#-------------------------------------------------------------------------------
access.timeout=1800

2.10 Elasticsearch Properties

 

#-------------------------------------------------------------------------------
# 2.10] Elasticsearch Properties
#----------------------------------------------
# Discovery mode = ec2 (if running on AWS) or zen (specify a list of IPs below):
elastic.node.discovery=ec2
#-------------------------------------------------------------------------------
# mlockall = should equal true except if running on a machine with < 4GB of RAM
bootstrap.mlockall=true
# (Should normally be localhost:9300, unless an API node is running with no index node) 
elastic.url=localhost:9300

2.11 Harvester Properties

 

#-------------------------------------------------------------------------------
# 2.11] Harvester Properties
#-------------------------------------------------------------------------------
# Comma-separated-list from File,Database,Feed (note Database and Feed need jars not bundled with the RPM)
harvester.types=File,Database,Feed
# Web crawling etiquette: the time to wait between consecutive accesses to the same site (10s is standard)
harvest.feed.wait=10000
# The minimum time between consecutive harvests (avoids thrashing FS/DB/RSS when there's nothing to get)
harvest.mintime.ms=300000
# Restricts the number of docs that can be harvested per cycle for memory reasons:
harvest.maxdocs_persource=5000
# Threading configuration type:num_threads (type from above):
# (eg for RSS heavy increase the "feed", for DB heavy increase the "file" etc. Beyond 20 there is limited benefit). 
harvest.threads=file:5,database:5,feed:20
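The harvest.threads value packs a type:num_threads pair for each harvester type into one string. A minimal sketch of parsing it and cross-checking it against harvester.types is shown here. The helper names are hypothetical, and comparing type names case-insensitively is an assumption based on the differing capitalization of the two default values.

```python
HARVESTER_TYPES = "File,Database,Feed"
HARVEST_THREADS = "file:5,database:5,feed:20"


def parse_harvest_threads(value):
    """Parse 'type:num_threads' pairs into a dict of thread counts."""
    threads = {}
    for item in value.split(","):
        kind, sep, count = item.strip().partition(":")
        if not sep:
            raise ValueError(f"missing ':' in {item!r}")
        threads[kind.lower()] = int(count)
    return threads


def unknown_types(threads, harvester_types):
    """Return thread types that are not listed in harvester.types."""
    enabled = {t.strip().lower() for t in harvester_types.split(",")}
    return sorted(set(threads) - enabled)


threads = parse_harvest_threads(HARVEST_THREADS)
print(threads)
print(unknown_types(threads, HARVESTER_TYPES))
```

If a type is removed from harvester.types, `unknown_types` reports any thread setting that no longer has a matching harvester.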

2.12 Hadoop Config Path

 

#-------------------------------------------------------------------------------
# 2.12] Hadoop config path
#-------------------------------------------------------------------------------
hadoop.configpath=/mnt/opt/hadoop-infinite/mapreduce/

2.13 Entity Extractor Properties

 

#-------------------------------------------------------------------------------
# 2.13] Entity Extractor Properties
#-------------------------------------------------------------------------------
# Alchemy extraction level
# 1==people postproc, 2==geo postproc, 3==both
# (This uses some hard-coded heuristics to work around known AlchemyAPI errors)
app.alchemy.postproc=3

2.14 UI Related Properties

 

#-------------------------------------------------------------------------------
# 2.14] Interface Related Properties for the AppConstants.js file found in:
#       /mnt/opt/infinite-tomcat/interface-engine/webapps/ROOT/
#-------------------------------------------------------------------------------
# For SaaS applications, the URL of the web page (eg containing CMS links for forgot password/logout etc):
# (Can be left blank otherwise)
ui.domain.url=
# Forgot password URL: (SaaS only, ie integrated with a CMS)
# (relative to ui.domain.url):
ui.forgot.password=forgot-password/
# Logout URL: (SaaS only, ie integrated with a CMS)
# (relative to ui.domain.url):
ui.logout=?action=logout

2.15 Map API Key

 

#-------------------------------------------------------------------------------
# 2.15] Maps API key:
#-------------------------------------------------------------------------------
# Not needed for MapQuest open API
google.maps.api.key=