Overview
Once data is ingested into Infinit.e from the various extractors, it is stored in JSON format, including its metadata fields and content. Each document also contains sub-objects such as entities and associations.
...
Field | Description |
---|---|
rejectDocCriteria | OPTIONAL: If populated, runs a user script function; if the return value is non-null, the document object is not created and the returned value is logged. *Not* wrapped in $SCRIPT(). |
onUpdateScript | OPTIONAL: Used to preserve existing metadata when documents are updated, and also to generate new metadata based on the differences between old and new documents. *Not* wrapped in $SCRIPT(). |
metadataFieldStorage | OPTIONAL: A comma-separated list of metadata fields to either exclude (if the value starts with '-') or exclusively include (if it starts with '+', the default); the fields are deleted at that point in the pipeline. With the negative filter (ie starts with '-'), metadata fields can be nested, using dot notation. With the positive filter (default), the fields must be top-level. |
Use Cases
The fields of the document storage settings configuration support the following use cases:
...
Metadata Field Storage
...
Consider this document:
Code Block |
---|
{
//...
"metadata": {
"field1": {
//...
},
"field2": {
"field2.1": "test",
"field2.2": "object"
}
}
} |
Here are some example metadataFieldStorage values, and the resulting documents after the pipeline element is complete.
Code Block |
---|
"metadataFieldStorage": "+"
{
//...
"metadata": {
}
}
"metadataFieldStorage": "field1"
{
//...
"metadata": {
"field1": {
//...
}
}
}
"metadataFieldStorage": "-field2.2"
{
//...
"metadata": {
"field1": {
//...
},
"field2": {
"field2.1": "test"
}
}
}
"metadataFieldStorage": "field2.2"
// NOT ALLOWED
"metadataFieldStorage": "-field1,field2.2"
{
//...
"metadata": {
"field2": {
"field2.1": "test"
}
}
} |
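The filter rules above can be tried out with a small standalone sketch. This is an illustration only, not the platform's implementation: `applyMetadataFieldStorage` is a hypothetical helper name, and the sample metadata uses made-up field names (`title`, `author.email`) so that dotted paths are unambiguous.

```javascript
// Illustrative sketch only: NOT the Infinit.e implementation.
// applyMetadataFieldStorage and the sample field names are hypothetical.
function applyMetadataFieldStorage(metadata, spec) {
    // A leading '-' selects the negative filter; '+' or no prefix selects the positive one.
    var negative = spec.charAt(0) === '-';
    var body = (negative || spec.charAt(0) === '+') ? spec.substring(1) : spec;
    var fields = body.split(',')
        .map(function (f) { return f.trim(); })
        .filter(function (f) { return f.length > 0; });
    // Deep copy so the input object is left untouched (metadata is plain JSON).
    var result = JSON.parse(JSON.stringify(metadata));

    if (!negative) {
        // Positive filter: only top-level names are accepted.
        fields.forEach(function (f) {
            if (f.indexOf('.') >= 0) {
                throw new Error('positive filter only accepts top-level fields: ' + f);
            }
        });
        Object.keys(result).forEach(function (key) {
            if (fields.indexOf(key) < 0) delete result[key];
        });
        return result;
    }

    // Negative filter: each entry is a dot-separated path to delete.
    fields.forEach(function (path) {
        var parts = path.split('.');
        var parent = result;
        for (var i = 0; i < parts.length - 1; i++) {
            parent = parent[parts[i]];
            if (parent === null || typeof parent !== 'object') return; // path not present
        }
        delete parent[parts[parts.length - 1]];
    });
    return result;
}

var sample = { title: "t", author: { name: "n", email: "e" } };
console.log(JSON.stringify(applyMetadataFieldStorage(sample, "+")));                   // {}
console.log(JSON.stringify(applyMetadataFieldStorage(sample, "title")));               // {"title":"t"}
console.log(JSON.stringify(applyMetadataFieldStorage(sample, "-author.email")));       // {"title":"t","author":{"name":"n"}}
console.log(JSON.stringify(applyMetadataFieldStorage(sample, "-title,author.email"))); // {"author":{"name":"n"}}
```

In this sketch a nested path in a positive filter (for example "author.email") throws an error, mirroring the "NOT ALLOWED" case; how the platform itself reports that case is not covered here.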
Filtering Creation of Entities and Associations
rejectDocCriteria provides a way to evaluate a document against a specific set of criteria. If the return value is non-null (ie the criteria matched on some of the data), the document is discarded and the returned value is logged; the document is not used to generate entities or associations. This makes the parameter a good way to filter the creation of metadata, entities and associations.
The example source below shows how to populate this parameter with a script for filtering purposes. In the example, which analyzes data from a Twitter feed, if the JSON object obtained from a Twitter aggregation service is missing either of the two fields "link" or "object", then the document is discarded.
Code Block |
---|
},
{
    "storageSettings": {
        "rejectDocCriteria": "$SCRIPT( if (null == _doc.metadata.json[0].link || null == _doc.metadata.json[0].object) return 'reject'; )"
    }
}
]
} |
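The criteria in the configuration above can be exercised outside the platform with a small sketch. The wrapper function, the explicit `return null`, and the mock documents are assumptions made for illustration; only the `if` statement comes from the configuration.

```javascript
// Hypothetical stand-in for how the criteria might be evaluated;
// only the if-statement is taken from the configuration above.
function rejectDocCriteria(_doc) {
    if (null == _doc.metadata.json[0].link || null == _doc.metadata.json[0].object) return 'reject';
    return null; // assumption: no match means "keep the document"
}

// Mock documents (made-up values) standing in for the _doc script object.
var complete = { metadata: { json: [{ link: "http://example.com/1", object: { type: "note" } }] } };
var missingLink = { metadata: { json: [{ object: { type: "note" } }] } };

console.log(rejectDocCriteria(complete));    // null
console.log(rejectDocCriteria(missingLink)); // reject
```

Note that the expression assumes `metadata.json[0]` exists; in plain JavaScript a document without it raises an error instead of returning 'reject'.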
...
...
Info |
---|
The "$SCRIPT" convention used in entity/association scriptlets is not required here. |
...
This script has access to the following Javascript objects:
...
The value of the last evaluated expression in the script is placed in a metadata field called "_PERSISTENT_". This value can be a string, an object, or an array of objects. Note that the script does not use "return val;"; it simply ends with "val;".
Preserving Metadata From Old Versions Of Documents
The following code saves the entirety of the old document's metadata:
...
Code Block |
---|
"onUpdateScript": "var retVal = _old_doc.metadata; retVal;"}
// RESULT (IN THE CASE OF A DOCUMENT THAT DOESN'T CHANGE):
{
    // Usual document fields
    "metadata": {
        "test1": "test",
        "test2": { "field": "value" },
        "_PERSISTENT_": [{
            "test1": "test",
            "test2": { "field": "value" }
        }]
    }
} |
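The "last evaluated expression" convention used by the script above can be reproduced in plain JavaScript with `eval`, which returns the value of the last expression statement it evaluates. This is only a way to experiment with the convention; it is an assumption-free JavaScript fact about `eval`, but it says nothing about which script engine the platform uses, and the `_old_doc` object here is a mock.

```javascript
// Mock of the previous document version; in the platform this object is supplied to the script.
var _old_doc = { metadata: { test1: "test", test2: { field: "value" } } };

// The script text as it appears in "onUpdateScript".
var script = "var retVal = _old_doc.metadata; retVal;";

// eval returns the value of the last evaluated expression statement,
// which mirrors the "end the script with val;" convention.
var persistent = eval(script);

console.log(JSON.stringify(persistent));
// {"test1":"test","test2":{"field":"value"}}
```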
Generating New Metadata Based On Both New And Old Versions Of Documents
In this example, the return value will represent the delta of the two documents under comparison.
...
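A delta script of this kind might look like the following sketch. It is an illustration, not the platform's documented example: `metadataDelta` is a hypothetical helper, the added/removed/changed shape of the result is an assumption, and so is the availability of `_doc` (the new document) alongside `_old_doc` in the update script. Both documents are mocked here.

```javascript
// Sketch only: metadataDelta is a hypothetical helper, and the shape of the
// returned object (added / removed / changed) is an assumption.
function metadataDelta(oldMeta, newMeta) {
    var delta = { added: [], removed: [], changed: [] };
    Object.keys(newMeta).forEach(function (key) {
        if (!(key in oldMeta)) {
            delta.added.push(key);
        } else if (JSON.stringify(oldMeta[key]) !== JSON.stringify(newMeta[key])) {
            // Simple comparison via JSON text; sensitive to key order inside nested objects.
            delta.changed.push(key);
        }
    });
    Object.keys(oldMeta).forEach(function (key) {
        if (!(key in newMeta)) delta.removed.push(key);
    });
    return delta;
}

// Mock documents standing in for the _old_doc and _doc script objects.
var _old_doc = { metadata: { test1: "test", test2: { field: "value" } } };
var _doc = { metadata: { test1: "test", test2: { field: "new value" }, test3: "extra" } };

var retVal = metadataDelta(_old_doc.metadata, _doc.metadata);
console.log(JSON.stringify(retVal));
// {"added":["test3"],"removed":[],"changed":["test2"]}
```

Inside an actual onUpdateScript the final line would be the bare expression "retVal;" so that the delta becomes the script's result.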