Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

2.5 Email Addresses  for Log Files
2.6 API Settings

 

  • Default search test terms and expected results values used to monitor the Infinit.e service.
Code Block
#-------------------------------------------------------------------------------
# 2.6] API Settings
#-------------------------------------------------------------------------------
# API Search Test Terms and Expected Results
# List of terms formatted like: "*" "something" "something":
# (The continuous testing randomly selects one of these for querying the API)
api.search.test.terms="*"
# The expected results (max 100), if a different number comes back, the system alerts:
api.search.expected.results=0

...

Code Block
#-------------------------------------------------------------------------------
# 2.11] Harvester Properties
#-------------------------------------------------------------------------------
# Comma-separated-list from File,Database,Feed (note Database and Feed need jars not bundled with the RPM)
harvester.types=File,Database,Feed
# Web crawling etiquette: the time to way between consecutive accesses to the same time (10s is standard)
harvest.feed.wait=10000
# The minimum time between consecutive harvests (avoids thrashing FS/DB/RSS when there's nothing to get)
harvest.mintime.ms=300000
# The minimum time between consecutive source harvests (set if needs to be longer than harvest.mintime.ms,
# eg if you want to pick up a source quickly the first time but then not update so frequently)
harvest.source.mintime.ms=
# Restricts the number of docs that can be harvested per cycle for memory reasons:
harvest.maxdocs_persource=5000
# Threading configuration type:num_threads (type from above):
# (eg for RSS heavy increase the "feed", for DB heavy increase the "file" etc. Beyond 20 there is limited benefit). 
harvest.threads=file:5,database:5,feed:20
# This controls the batch size of sources picked up by a thread, this does not normally need to be changed (its default is shown)
# (It can be reduced in cases where a small number of very long-running sources need to be harvested).
#harvest.distribution.batch.harvest=20
# This disables entity and association aggregation. For almost all applications you will not want to set this.
#harvest.disable_aggregation=false
# This parameter uses the Java Security Manager to prevent scripts accessing local network services (at the expense of some performance)
# It can be turned off for uses of the platform where sources must be approved before being added (etc)
harvest.security=true
# This is a comma-separated list of hosts in the following format "http://<HOST>[:<PORT>]" or "socks://<HOST>:<PORT>"
# When specified, all requests for external content from the harvester are proxied (round-robin) through the specified hosts
harvest.proxy=
2.12 Hadoop Properties

The Hadoop config path is a local folder where Infinit.e stores map reduce jobs if Hadoop is used.

...