Overview

Some of the pipeline processing elements let you use JavaScript to obtain metadata, create entities and associations, and perform various other operations.

...

  • specifying a JavaScript code block
  • specifying a list of URLs of JavaScript files that can be imported.

...

In general terms, the use of JavaScript in IKANOW's Infinit.e falls into several major categories:

...

When data is ingested into IKANOW's Infinit.e, it is converted into documents.  The various elements of the processing pipeline can then act on these documents to extract metadata.  The resulting metadata objects can then be made available to functions and inline scripts.

You can configure what a script receives as its input by setting script flags.  For example, a script can receive the data as _doc, as _metadata, or as full text.

Also, when iterating over a JSON array, each item in the array is passed into the ScriptEngine and is made accessible via an object named _iterator.

Examples

_doc.metadata

Code Block
"city": "$SCRIPT( return _doc.metadata.location[0].citystateprovince.city; )",
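To see what the $SCRIPT expression above evaluates to, the following standalone sketch wraps the script body in a function and runs it against a sample document. The sampleDoc contents and the runScript helper are hypothetical: they only imitate how a _doc object might be exposed to the script and are not part of Infinit.e.

```javascript
// Hypothetical sample document; only the fields read by the script are shown.
const sampleDoc = {
  metadata: {
    location: [
      { citystateprovince: { city: "Springfield", stateprovince: "IL" } }
    ]
  }
};

// Hypothetical helper: exposes the document under the name `_doc`
// and runs the script body as the body of a function.
function runScript(body, doc) {
  return new Function("_doc", body)(doc);
}

const city = runScript(
  "return _doc.metadata.location[0].citystateprovince.city;",
  sampleDoc
);
console.log(city);
```

If the location array were empty, the expression would throw, so a production script would likely need to guard against missing fields.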

_metadata

Code Block
"contentMetadata": [
    {
        "fieldName": "email_meta",
        "script": "var x=_metadata._FILE_METADATA_[0].metadata;x;",
        "scriptlang": "javascript",
        "flags": "m"
    }
]
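The script in this fragment ends with a bare x; statement. The sketch below illustrates one way to read that, assuming the engine takes the value of the script's final expression as its result, which is how JavaScript's own eval behaves. The sampleMetadata object and the runMetadataScript helper are hypothetical and are not part of Infinit.e.

```javascript
// Hypothetical metadata object shaped like the one the script expects.
const sampleMetadata = {
  _FILE_METADATA_: [
    { metadata: { from: "alice@example.com", subject: "Status" } }
  ]
};

// Hypothetical helper: the parameter `_metadata` is visible to the
// evaluated script as a local name.
function runMetadataScript(script, _metadata) {
  // eval yields the value of the last expression statement, here `x;`
  return eval(script);
}

const emailMeta = runMetadataScript(
  "var x=_metadata._FILE_METADATA_[0].metadata;x;",
  sampleMetadata
);
console.log(emailMeta.subject);
```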

_iterator

Code Block
var make = _iterator.make;
var model = _iterator.model;
var year = _iterator.year;
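The per-item behaviour can be imitated outside the platform. The sketch below is a minimal simulation, assuming a hypothetical array of vehicle records; the describe function stands in for an inline script, and the map call stands in for the engine passing each array item in as _iterator.

```javascript
// Hypothetical JSON array, e.g. a metadata field holding several vehicles.
const vehicles = [
  { make: "Ford", model: "Focus", year: 2012 },
  { make: "Honda", model: "Civic", year: 2010 }
];

// Stand-in for an inline script: it only sees the current item as `_iterator`.
function describe(_iterator) {
  var make = _iterator.make;
  var model = _iterator.model;
  var year = _iterator.year;
  return year + " " + make + " " + model;
}

// Each array item is passed in turn, imitating the engine's iteration.
const descriptions = vehicles.map(function (item) {
  return describe(item);
});
console.log(descriptions);
```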

 

For more information about using JavaScript to get metadata, including detailed examples and descriptions, see Content metadata and Manual text transformation.

...

You can use JavaScript to create entities and associations by accessing the metadata through the $SCRIPT and $FUNC scripting conventions.

For example, when document metadata is iterated over as a JSON array, each item in the array is passed into the ScriptEngine and is made accessible via an object named _iterator.

For more information about using JavaScript to create entities and associations, see Manual entities and Manual association of entities.

Criteria


...

IN PROGRESS

 

The criteria field is shared by all of the pipeline elements.

It can be used to specify a JavaScript expression that controls the order in which entity extractors are applied to the ingested documents.  The JavaScript expression can be set up to choose entity extractors based on the content and metadata extracted so far in the pipeline.

For example, Infinit.e supports the Open Calais and Salience extraction engines, so a criteria expression can determine which of them is applied to a given document.
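The idea of a per-document expression selecting an extractor can be sketched as follows. This is only an illustration: the element objects, the description field, the 200-character threshold, and the applicableEngines helper are all hypothetical, and the real criteria syntax is documented on the Automated text extraction and Feature extraction pages.

```javascript
// Hypothetical pipeline elements; the criteria bodies are illustrative only.
const elements = [
  {
    engine: "Open Calais",
    criteria: "return _doc.description != null && _doc.description.length > 200;"
  },
  {
    engine: "Salience",
    criteria: "return _doc.description != null && _doc.description.length <= 200;"
  }
];

// Evaluates each element's criteria against the document and returns
// the engines whose criteria evaluated to true, in pipeline order.
function applicableEngines(doc) {
  return elements
    .filter(function (el) {
      return new Function("_doc", el.criteria)(doc) === true;
    })
    .map(function (el) {
      return el.engine;
    });
}

const shortDoc = { description: "A short summary." };
console.log(applicableEngines(shortDoc)); // [ 'Salience' ]
```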

The criteria field is most useful with the following pipeline processing elements:

  • automated text extraction
  • feature extraction

For more information about the use of criteria, and for detailed examples, see Automated text extraction and Feature extraction.

$PATH, $SETPATH and $SCRIPT

In addition to criteria conditions that represent logical conditions for extraction, some criteria values are generated automatically if the pipeline contains conditional elements:

  • Each conditional element creates a $SETPATH(<branchA>,<branchB>) statement.  As an example, a conditional element with node-id = 3 would create $SETPATH(3_True,3_False).
  • Subsequent elements will have a $PATH(<branch>) statement as part of the criteria value.  A node in the True branch placed after the conditional node (id = 3) would have $PATH(3_True) as part of the criteria statement.
  • Logical conditions that control the order in which entity extractors are applied will still be placed within a $SCRIPT() statement.

These $PATH, $SETPATH and $SCRIPT statements are internally assembled by the flow-builder and will become part of criteria fields of the elements.
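The naming pattern for the generated statements can be sketched with two small helpers. The function names are hypothetical, and the sketch only reproduces the string formats quoted above; it does not show how the flow-builder combines them with $SCRIPT() in a criteria field.

```javascript
// Statement generated for the conditional element itself.
function setPathStatement(nodeId) {
  return "$SETPATH(" + nodeId + "_True," + nodeId + "_False)";
}

// Statement generated for an element placed in one branch
// after the conditional element.
function pathStatement(nodeId, inTrueBranch) {
  return "$PATH(" + nodeId + "_" + (inTrueBranch ? "True" : "False") + ")";
}

console.log(setPathStatement(3));     // $SETPATH(3_True,3_False)
console.log(pathStatement(3, true));  // $PATH(3_True)
console.log(pathStatement(3, false)); // $PATH(3_False)
```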

If the sourceBuilder() function creates more than one source element per node, the criteria script will be generated for all elements, with one exception: only the last element created by a conditional node will contain the criteria value.

Creation Criteria Scripts

Both entity and association specification objects provide a field called "creationCriteriaScript". This must be JavaScript (though you still need to set the engine and enclose it in either $SCRIPT or $FUNC), and you can return one of two things from it:

  • A boolean, in which case the entity (or association) object is only created if the value is true.
  • A string, in which case any non-null string is treated like a boolean false, and in addition the string is logged as an error that can be accessed from the "harvest.harvest_message" field of sources.

Info

The creation criteria script is executed before any other scripts in the specification object.
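The two return conventions can be imitated with a small interpreter. This is a sketch under stated assumptions, not Infinit.e's implementation: the interpretCreationCriteria helper and the harvestMessages array are hypothetical, and the sketch assumes that a boolean true means "create the object".

```javascript
// Hypothetical stand-in for the messages that end up in harvest.harvest_message.
const harvestMessages = [];

// Interprets the value returned by a creation criteria script.
// Returns true if the entity or association should be created.
function interpretCreationCriteria(result) {
  if (typeof result === "string") {
    // A non-null string counts as false and is also logged as an error.
    harvestMessages.push(result);
    return false;
  }
  // Assumption: only a boolean true allows creation.
  return result === true;
}

// Hypothetical criteria script body using the `_iterator` object.
function criteria(_iterator) {
  if (_iterator.year == null) {
    return "missing year for " + _iterator.make;
  }
  return _iterator.year >= 2011;
}

console.log(interpretCreationCriteria(criteria({ make: "Ford", year: 2012 })));  // true
console.log(interpretCreationCriteria(criteria({ make: "Honda", year: 2010 }))); // false
console.log(interpretCreationCriteria(criteria({ make: "Fiat" })));              // false
console.log(harvestMessages);                                                    // [ 'missing year for Fiat' ]
```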