...

The following table describes the parameters of the file feed extractor configuration.

Field

Description

feedType

Currently not used - reserved for selecting between RSS and Atom in future releases (only RSS is currently supported).

waitTimeOverride_ms

Optional - if specified, controls the wait time in milliseconds between successive reads from a site (default: 10000 ms). If a site is timing out, it may be limiting the number of accesses from a given IP, so set the value higher. For large sites, setting the value lower can increase the performance of the harvester.

updateCycle_secs

Optional - if present, a harvested URL may be replaced when it is older than this time (in seconds) and is encountered again in the RSS feed or in "extraUrls".

regexInclude

Optional - if specified, only URLs matching the regex will be harvested.

regexExclude

Optional - if specified, URLs matching the regex will not be harvested.

extraUrls

This array allows manually specified URLs to be harvested once: { "url": string // The URL

userAgent

(Optional) If present, overrides the system default user agent string.

proxyOverride

(Optional) Either "direct" to bypass the proxy (the default), or a proxy specification of the form "(http|socks)://host:port".

httpFields

(Optional) Additional HTTP fields to add to the request headers.
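
To make the semantics of a few of these fields concrete, the following Python sketch shows one possible reading of "regexInclude", "regexExclude", "updateCycle_secs" and "proxyOverride". It is not the harvester's actual implementation. The function names, the example values, and the choice of whole-URL regex matching are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical configuration using field names from the table above.
# All values are illustrative assumptions, not defaults of the real harvester
# (except waitTimeOverride_ms, whose documented default is 10000).
config = {
    "waitTimeOverride_ms": 10000,
    "updateCycle_secs": 86400,
    "regexInclude": r".*/news/.*",
    "regexExclude": r".*\.pdf$",
    "proxyOverride": "http://proxy.example.com:8080",
}


def should_harvest(url, cfg):
    """Apply regexInclude and regexExclude to a candidate URL.

    Assumption: a regex must match the whole URL (re.fullmatch); the
    real harvester may use search-style matching instead.
    """
    include = cfg.get("regexInclude")
    exclude = cfg.get("regexExclude")
    if include is not None and re.fullmatch(include, url) is None:
        return False  # an include regex is set and the URL does not match it
    if exclude is not None and re.fullmatch(exclude, url) is not None:
        return False  # the URL matches the exclude regex
    return True


def is_replaceable(age_secs, cfg):
    """True if an already harvested URL is older than updateCycle_secs."""
    cycle = cfg.get("updateCycle_secs")
    return cycle is not None and age_secs > cycle


def parse_proxy_override(value):
    """Return None for "direct" (or no value), else (scheme, host, port)."""
    if value is None or value == "direct":
        return None
    parts = urlsplit(value)
    if parts.scheme not in ("http", "socks") or not parts.hostname or parts.port is None:
        raise ValueError(f"unsupported proxy specification: {value!r}")
    return parts.scheme, parts.hostname, parts.port
```

With the example values, a URL under /news/ is harvested unless it ends in .pdf, a document older than one day becomes eligible for replacement, and the proxy string is split into its scheme, host and port.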

...