JSON format
A separate overview describes how to use the Structured Analysis Harvester. This page is reference information only.
...
    {
        "scriptEngine" : "string", // OPTIONAL: String, Infinit.e currently only supports "javascript" (or "JavaScript"), which is the default
        "script" : "string", // OPTIONAL: String, can contain one or more JavaScript functions,
                             // e.g. "function func() { var foo = 'test'; return foo; }"
        "scriptFiles" : [ "string" ], // OPTIONAL: Array of Strings, URLs of JavaScript
                                      // files to import at runtime
        "caches": { "string": "string", ... }, // A list of caches in the format <CACHE_NAME>:<ID> where <ID> is the "_id" of a JSON share, see overview

        "title" : "string", // OPTIONAL: String, else document title is whatever is generated by the harvester (eg from RSS/filename)
        "fullText" : "string", // OPTIONAL: String, else full text is taken from the document contents as per usual.
        "description" : "string", // OPTIONAL: String, else document description is whatever is generated by the harvester (or an entity extractor if supported)
        "displayUrl": "string", // OPTIONAL: String, this field is just used for display
        "publishedDate" : "string", // OPTIONAL: String, must return a date string in a standard format (eg Java, JavaScript, ISO, SMTP, MM/dd/yy, MM/dd/yyyy etc)
                                    // If not present, published date either comes from harvester (eg created date for files), or is the current time

        "entities" : [ { ... } ], // OPTIONAL: to create entities from the metadata (see below)
        "associations" : [ { ... } ], // OPTIONAL: to create associations (events/facts/summaries) from the metadata (see below)

        "docGeo" : { // (OPTIONAL, to specify the document geo tag)
            // ONE OF THE FOLLOWING 2 SETS OF FIELDS:
            // Specify directly:
            "lat" : "string", // latitude
            "lon" : "string", // longitude
            // Or fill in as many search options as possible, if a match can be found it populates the lat/long
            "city" : "string", // String
            "stateProvince" : "string", // String
            "country" : "string", // String
            "countryCode" : "string" // String
        },

        "rejectDocCriteria": "string", // OPTIONAL: String, an optional script that returns null to keep the document, any string to reject the doc (the string is logged)
        "metadataFields": "string", // OPTIONAL: String, if present a comma-separated list of top-level metadata fields to either exclude (if "metadataFields" starts with '-'),
                                    // or only include (starts with '+', default) - the fields are deleted after all processing but before indexing and storage.
                                    // In addition for "-" only, nested objects can be deleted using dot notation, eg "json.object.nested.field"
        "onUpdateScript": "string" // OPTIONAL: Used to preserve existing metadata when documents are updated, and also to generate new metadata based on the differences
                                   // between old and new documents. This script is discussed further in the Structured Analysis Overview linked at the top of this page
    }
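The "metadataFields" and "rejectDocCriteria" rules described in the comments above can be easier to follow when written out as code. The sketch below is illustrative only and is not the harvester's implementation: `applyMetadataFields` and `rejectIfNoTitle` are hypothetical names, the `doc` parameter is an assumption (this page does not specify how the harvester exposes the document to a script), and nested deletion is assumed to walk plain nested objects only.

```javascript
// Illustrative sketch only: mirrors the documented "metadataFields" rules,
// not the harvester's real code. Function and parameter names are hypothetical.
function applyMetadataFields(metadata, spec) {
  // Work on a deep copy so the caller's object is left untouched.
  const result = JSON.parse(JSON.stringify(metadata));
  if (!spec) return result; // no spec given: keep every field

  // A leading '-' means "exclude"; a leading '+' or no prefix means "only include".
  const exclude = spec.startsWith("-");
  const names = spec
    .replace(/^[+-]/, "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  if (exclude) {
    for (const path of names) {
      // Dot notation (exclude mode only) addresses a nested field.
      const parts = path.split(".");
      const last = parts.pop();
      let node = result;
      for (const part of parts) {
        node = node !== null && typeof node === "object" ? node[part] : undefined;
      }
      if (node !== null && typeof node === "object") delete node[last];
    }
    return result;
  }

  // Include mode: drop every top-level field that is not listed.
  for (const key of Object.keys(result)) {
    if (!names.includes(key)) delete result[key];
  }
  return result;
}

// Hypothetical reject check in the style of "rejectDocCriteria":
// null keeps the document, any string rejects it (the string would be logged).
function rejectIfNoTitle(doc) {
  if (typeof doc.title !== "string" || doc.title.trim() === "") {
    return "rejected: missing title";
  }
  return null;
}

const sample = {
  author: ["a"],
  tmp: ["x"],
  json: { object: { nested: { field: 1, keep: 2 } } }
};
console.log(JSON.stringify(applyMetadataFields(sample, "-tmp,json.object.nested.field")));
// prints {"author":["a"],"json":{"object":{"nested":{"keep":2}}}}
console.log(JSON.stringify(applyMetadataFields(sample, "+author")));
// prints {"author":["a"]}
console.log(rejectIfNoTitle({ title: "Report" }));
// prints null
```

In a real source, the equivalent configuration would be the string values themselves, for example "metadataFields": "-tmp,json.object.nested.field", with the harvester applying the filter after all processing but before indexing and storage.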
...