Overview

Once data is ingested into Infinit.e from the various extractors, it is stored in JSON format, including its metadata fields and content. Each stored document also contains sub-objects such as entities and associations.

...

Field    Description
rejectDocCriteria

OPTIONAL: If populated, runs a user script function; if the return value is non-null, the object is not created and the output is logged. *Not* wrapped in $SCRIPT().

onUpdateScript

OPTIONAL: Used to preserve existing metadata when documents are updated, and also to generate new metadata based on the differences between old and new documents. *Not* wrapped in $SCRIPT().

metadataFieldStorage

OPTIONAL: A comma-separated list of top-level metadata fields to either exclude (if the list starts with '-') or exclusively include (if it starts with '+', which is the default). The fields are deleted at that point in the pipeline.

If the negative filter (i.e. the list starts with '-') is used, metadata fields can be nested using dot notation. For the positive filter (the default), the fields must be top-level.

Use Cases

The fields of the document storage settings configuration can be used to support the following use cases:

...

Document Storage Settings

Metadata Field Storage

...

Consider this document:

Code Block
{
	//...
	"metadata": {
		"field1": {
			//...
		},
		"field2": {
			"field2.1": "test",
			"field2.2": "object"
		}
	}
}

Here are some example metadataFieldStorage values, and the resulting documents after the pipeline element is complete.

Code Block
"metadataFieldStorage": "+" 
{
	//...
	"metadata": {
	}
}
 
"metadataFieldStorage": "field1" 
{
	//...
	"metadata": {
		"field1": {
			//...
		}
	}
}
 
"metadataFieldStorage": "-field2.2" 
{
	//...
	"metadata": {
		"field1": {
			//...
		},
		"field2": {
			"field2.1": "test"
		}
	}
}
 
"metadataFieldStorage": "field2.2" 
// NOT ALLOWED
 
"metadataFieldStorage": "-field1,field2.2" 
{
	//...
	"metadata": {
		"field2": {
			"field2.1": "test"
		}
	}
}
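
The filter rules above can be sketched in plain JavaScript. This is only an illustration of the documented behaviour, not the platform's implementation: the helper name `applyMetadataFieldStorage` is hypothetical, and the sample metadata uses the assumed sub-field names `a` and `b` (instead of dotted names) so that the dot notation stays unambiguous.

```javascript
// Hypothetical helper that mimics the documented metadataFieldStorage rules.
// It is an illustration only, not Infinit.e's own code.
function applyMetadataFieldStorage(metadata, spec) {
    var negative = spec.charAt(0) === "-";
    // Strip the leading '+' or '-'; a bare list is treated as '+'.
    var list = /^[+-]/.test(spec) ? spec.slice(1) : spec;
    var fields = list.split(",")
        .map(function (f) { return f.trim(); })
        .filter(function (f) { return f.length > 0; });

    if (!negative) {
        // Positive filter: keep only the listed top-level fields.
        var kept = {};
        fields.forEach(function (f) {
            if (f.indexOf(".") >= 0) {
                throw new Error("nested field not allowed in positive filter: " + f);
            }
            if (Object.prototype.hasOwnProperty.call(metadata, f)) {
                kept[f] = metadata[f];
            }
        });
        return kept;
    }

    // Negative filter: delete each listed field, following dot notation.
    var result = JSON.parse(JSON.stringify(metadata)); // deep copy
    fields.forEach(function (path) {
        var parts = path.split(".");
        var node = result;
        for (var i = 0; i < parts.length - 1; i++) {
            node = node[parts[i]];
            if (node === null || typeof node !== "object") return; // path absent
        }
        delete node[parts[parts.length - 1]];
    });
    return result;
}

// Assumed sample metadata, shaped like the document above.
var sampleMetadata = {
    field1: { x: 1 },
    field2: { a: "test", b: "object" }
};

console.log(JSON.stringify(applyMetadataFieldStorage(sampleMetadata, "field1")));
console.log(JSON.stringify(applyMetadataFieldStorage(sampleMetadata, "-field1,field2.b")));
```

In this sketch a '-' prefix applies to every field in the list, which matches the "-field1,field2.2" example above, where both fields are removed.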

Filtering Creation of Entities and Associations

rejectDocCriteria provides a way to evaluate a document against a specific set of criteria. If the return value is non-null (i.e. the criteria matched some of the data), the document is discarded and will not be used to generate entities or associations; the returned value is logged. This parameter is a good way to filter the creation of metadata entities and associations.

The example source below shows how to populate this parameter with a script for filtering purposes. If the JSON object obtained from a Twitter aggregation service is missing either of the two fields "link" or "object", the document is discarded.

Code Block
        },
        {
            "storageSettings": {
                "rejectDocCriteria": "$SCRIPT( if (null == _doc.metadata.json[0].link || null == _doc.metadata.json[0].object) return 'reject'; )"
            }
        }
    ]
}
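
To try the rejection logic outside the platform, the script body from the example can be wrapped in an ordinary function. This is a minimal sketch: the function name `rejectDocCriteria` and the two sample documents are assumptions made for illustration, and only the condition itself is taken from the example above.

```javascript
// Hypothetical stand-in for the script body in the example above.
// Returns a non-null value when the document should be rejected.
function rejectDocCriteria(_doc) {
    var json = _doc.metadata.json[0];
    // "null == x" is true for both null and undefined (missing) fields.
    if (null == json.link || null == json.object) return "reject";
    return null;
}

// Assumed sample documents, for illustration only.
var completeDoc = { metadata: { json: [{ link: "http://example.com", object: { id: 1 } }] } };
var missingLinkDoc = { metadata: { json: [{ object: { id: 2 } }] } };

console.log(rejectDocCriteria(completeDoc));    // null, so the document is kept
console.log(rejectDocCriteria(missingLinkDoc)); // "reject", so the document is discarded
```

Because the two checks are joined with `||`, a document is rejected when either field is absent, not only when both are.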

 

 

...

Updating Documents

...

Info

 The "$SCRIPT" convention used in entity/association scriptlets is not required here.

...

This script has access to the following JavaScript objects:

...

The last evaluated expression in the script is placed in a metadata field called "_PERSISTENT_". In other words, you don't write "return val;"; you just end the script with "val;". The value can be a string, an object, or an array of objects.
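
The "last evaluated expression" convention is the same one JavaScript's own `eval` follows, so it can be tried locally. The sketch below is a hypothetical harness, not the platform's mechanism: `runOnUpdateScript` is an invented name, and passing the new document as `_doc` is an assumption borrowed from the rejectDocCriteria example.

```javascript
// Hypothetical harness: evaluates an onUpdateScript-style string and
// returns the value of its last evaluated expression.
function runOnUpdateScript(script, _old_doc, _doc) {
    // Direct eval can see this function's parameters, and its result is
    // the completion value of the script (no "return" statement needed).
    return eval(script);
}

// Assumed sample documents, for illustration only.
var oldDoc = { metadata: { test1: "test", test2: { field: "value" } } };
var newDoc = { metadata: { test1: "test", test2: { field: "value" } } };

var persistent = runOnUpdateScript("var retVal = _old_doc.metadata; retVal;", oldDoc, newDoc);
console.log(JSON.stringify(persistent)); // {"test1":"test","test2":{"field":"value"}}
```

Writing `return retVal;` at the top level of such a script would be a syntax error under `eval`, which is why the script simply ends with the expression.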

Preserving Metadata From Old Versions Of Documents

The following code saves the entirety of the old document's metadata:

...

Code Block
 "onUpdateScript": "var retVal = _old_doc.metadata; retVal;"}
// RESULT (IN THE CASE OF A DOCUMENT THAT DOESN'T CHANGE):
{
    // Usual document fields
    "metadata": {
        "test1": "test",
        "test2": { "field": "value" },
        "_PERSISTENT_": [{
            "test1": "test",
            "test2": { "field": "value" }
        }]
    }
}

Generating New Metadata Based On Both New And Old Versions Of Documents

In this example, the return value will represent the delta of the two documents under comparison.

...
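
As a rough illustration of what a delta computation might look like (this is not the example elided above), the sketch below compares the top-level metadata fields of two documents. The function name `metadataDelta`, the `added`/`removed`/`changed` result shape, and the use of `_doc` for the new document are all assumptions.

```javascript
// Hypothetical delta logic, written as a function so it can be run locally.
function metadataDelta(_old_doc, _doc) {
    var delta = { added: [], removed: [], changed: [] };
    var oldMeta = _old_doc.metadata || {};
    var newMeta = _doc.metadata || {};

    Object.keys(newMeta).forEach(function (key) {
        if (!(key in oldMeta)) {
            delta.added.push(key);
        } else if (JSON.stringify(oldMeta[key]) !== JSON.stringify(newMeta[key])) {
            // Cheap structural comparison; sensitive to key order.
            delta.changed.push(key);
        }
    });
    Object.keys(oldMeta).forEach(function (key) {
        if (!(key in newMeta)) delta.removed.push(key);
    });
    return delta;
}

// Assumed sample documents, for illustration only.
var before = { metadata: { test1: "test", test2: { field: "value" } } };
var after = { metadata: { test1: "changed", test3: [1, 2] } };

console.log(JSON.stringify(metadataDelta(before, after)));
// {"added":["test3"],"removed":["test2"],"changed":["test1"]}
```

Inside an actual onUpdateScript the body would end with the bare expression `delta;` rather than `return delta;`. Whether the platform's script engine supports helpers such as `Object.keys` is not confirmed here.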

Panel

Footnotes:

Legacy documentation:

TODO