Overview

The Infinit.e platform supports scripting the transformation of source data using JavaScript via Rhino, Mozilla's open-source JavaScript implementation (http://www.mozilla.org/rhino/). This document provides an introduction to specifying JavaScript-based data transformation via the Structured Analysis Harvester object.

...

Info

The creation criteria script is executed before any other scripts in the specification object.

Lookup tables in the Unstructured Analysis Handler

It is possible to add lookup tables from JSON shares that can be used in all the JavaScript scripts in the structured analysis handler (and also the unstructured analysis handler).

These lookup tables provide a limited form of aliasing at harvest time (also check out the full query-time aliasing capability), and are useful in many other cases where a potentially large and dynamic lookup table is needed.

Using the lookup technology is easy:

  • At the top level of the "structuredAnalysis" object, create a "caches" object that consists of the following:
Code Block
"structuredAnalysis": {
	"caches": {
		"myLookupTable": "4e0c7e99eb5af0fbdcfbf697"
	}
}
  • Then within any script in the "structuredAnalysis", you can access the JSON object by the local name specified as above. For example, say the following JSON object has been uploaded:
Code Block
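{
	//...
	// (assumed intent: each alias is a key that maps to its canonical name)
	"US": "United States",
	"USA": "United States",
	"United States of America": "United States",
	"UK": "United Kingdom",
	"Great Britain": "United Kingdom",
	"GB": "United Kingdom",
	//...
}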

Then the lookup table could be used as follows:

Code Block
{
	"structuredAnalysis": {
		// (caches object specified as above)
		//...
		"entities": [
			//...
			{
				"iterateOver": "geo.countries",
				"disambiguatedName": "$SCRIPT( return myLookupTable[ _value ];)",
				"type": "Country"
			}
		],
		//...
	}
}
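
One point the specification above leaves open is what happens when "_value" is not a key in the lookup table: a plain property access then returns undefined. The following is a minimal standalone sketch of a fallback, not the platform's own behavior. The table contents and the "disambiguate" helper are hypothetical, and "console.log" is used only so the sketch can be run outside the harvester (for example in Node.js).

```javascript
// Hypothetical stand-in for the JSON share bound to "myLookupTable" under "caches".
var myLookupTable = {
	"US": "United States",
	"USA": "United States",
	"UK": "United Kingdom",
	"GB": "United Kingdom"
};

// Hypothetical helper: return the canonical name when the alias is a key of the
// table itself, otherwise fall back to the original value instead of undefined.
// hasOwnProperty avoids matching inherited properties such as "toString".
function disambiguate(table, value) {
	if (Object.prototype.hasOwnProperty.call(table, value)) {
		return table[value];
	}
	return value;
}

console.log(disambiguate(myLookupTable, "USA"));    // United States
console.log(disambiguate(myLookupTable, "France")); // France
```

Inside a specification, the same idea could presumably be written inline, for example "$SCRIPT( return myLookupTable[ _value ] || _value; )". How the harvester treats an undefined return value is not stated here, so this should be treated as an assumption to verify.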
Further Reading