Overview

This toolkit element generates one or more types of association between existing entities, based on the document or content metadata. Expressions default to replacement strings; alternatively, $SCRIPT(...) can be used to return a string using JavaScript.

 

Format

{
	"display": string,
	"associations": [
	{
	    "iterateOver":string, //  OPTIONAL: If specified as a list of entity types, steps over entities with matching types (again, lock step or combinatorially)
	                    // Can also specify a metadata field (nesting supported using dot notation), in which case its values are looped over to generate calls with _value/_iterator/_index
	    "entity1":string, //  OPTIONAL: String/script: In "iterateOver"/type cases, the disambiguated name of the entity type; otherwise using entity1_index is preferred.
	    "entity1_index":string, // OPTIONAL: String/script: should return the 'disambiguated_name/type' string, must resolve to an entity or is discarded
	    "entity2":string, // OPTIONAL: String/script: In "iterateOver"/type cases, the disambiguated name of the entity type; otherwise using entity2_index is preferred.
	    "entity2_index":string, // OPTIONAL: String/script: should return the 'disambiguated_name/type' string, must resolve to an entity or is discarded
	    "verb":string, // MANDATORY: String/script
	    "verb_category":string, // MANDATORY: String/script
	    "assoc_type":string, // MANDATORY: Must specify/return one of "Fact", "Event", "Summary"; may be overridden (eg converted to "Summary" if there is only 1 index)
	                    // (if left blank, the Structured Analysis Handler will auto-generate this field reasonably accurately based on the contents)
	    "time_start":string, // OPTIONAL: String/script: Must specify/return a time in ISO date format ("yyyy-MM-dd'T'HH:mm:ss") or Javascript time format
	    "time_end":string, // OPTIONAL: String/script: Must specify/return a time in ISO date format ("yyyy-MM-dd'T'HH:mm:ss") or Javascript time format
	    "geo_index":string, // OPTIONAL: String/script: The entity index corresponding to the "geotag" below (or the Type in "iterateOver" cases)
	    "geotag": { // OPTIONAL: Format is identical to the docGeo format specified above
    	    "lat":string, "lon":string,
	        "city":string, "stateProvince":string, "country":string, "countryCode":string
	    },
	    //(note the ontology_type for associations is always "point" - use geo_index to specify larger areas)
	    "creationCriteriaScript":string // OPTIONAL: script: If populated, runs a user script function; if it returns false, the association is not created
	}
	]
}
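
The "time_start" and "time_end" fields above expect the ISO form "yyyy-MM-dd'T'HH:mm:ss". As a minimal sketch in standalone JavaScript, a helper along the following lines could produce that form. The name toAssocTime is hypothetical and not part of the toolkit, the result is in UTC, and whether the toolkit's script engine supports these calls is not confirmed here.

```javascript
// Hypothetical helper (not part of the toolkit): formats a timestamp as
// "yyyy-MM-dd'T'HH:mm:ss" in UTC, the ISO form expected by time_start/time_end.
function toAssocTime(value) {
    const date = new Date(value);
    if (isNaN(date.getTime())) {
        return null; // unparseable input: let the caller decide what to do
    }
    // toISOString() yields "yyyy-MM-ddTHH:mm:ss.SSSZ"; keep the first 19 characters
    return date.toISOString().slice(0, 19);
}

console.log(toAssocTime(Date.UTC(2011, 0, 29))); // 2011-01-29T00:00:00
```

Returning null for unparseable input is a design choice of this sketch; a real scriptlet might instead skip the association.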

 

Description

Associations can be "something that happens or is regarded as happening; an occurrence, especially one of some importance", "the outcome, issue, or result of anything", or "something that occurs in a certain place during a particular interval of time" (definitions found here: http://dictionary.reference.com/browse/event). Within Infinit.e, events are typically a combination of entities assembled in the form Noun - Verb - Noun, e.g. "a car crashed into a building", "the plane flew to San Diego". In addition to the Noun - Verb - Noun form, events can include geographic information (i.e. where an event happened) as well as a start and/or end time.

Examples

Basic Association

In the basic association example, the code specifies an entity from the document metadata.  The association specifies the relationship between the type of offense and the date and time at which it was reported.

{
    "associations": [
        {
            "entity1": "$metadata.offense,$metadata.method",
            "geotag": {},
            "time_start": "$metadata.reportdatetime",
            "verb": "reported",
            "verb_category": "crime"
        }
    ]
}

 

Sample Output:

The resulting output displays the details of the date and time, and the specific type of offense reported.

 "associations": [
        {
            "entity1": "theft,2",
            "entity1_index": "theft,2/criminalactivity",
            "verb": "reported",
            "verb_category": "crime",
            "time_start": "2011-01-29T00:00:00",
            "geotag": {
                "lat": 38.9099278028729,
                "lon": -77.0436067765966
            },
            "assoc_type": "Summary"
        }

 

iterateOver

iterateOver handles more advanced cases, where one specification generates several associations.

Iterate Over a Metadata Field:

Specify a metadata field to generate associations from its values.

eg.  "iterateOver": "json.twitter_entities.user_mentions"

This is the same as iterating over a metadata array to obtain entities using Manual entities.

See detailed example below

Iterate Over a Single Entity Type:

Use "dummy" to iterate over only one entity type

eg. "iterateOver": "entity1/dummy" or "iterateOver": "entity2,dummy"

See detailed example below

Multiplicative:

Create one association for every combination of entities of specified types

eg. "iterateOver": "entity1/entity2/geo_index",

See detailed example below

Associative:

Create one association for every pair (/set) of entities of specified types, taken in lock step

eg. "iterateOver": "entity1,entity2"

See detailed example below

creationCriteriaScripts

Association fields generated from the entity loop are placed in "_iterator". For example, for "iterateOver": "entity1/entity2/geo_index", an _iterator object with the following fields is available in the Javascript: "_iterator.entity1_index", "_iterator.entity2_index", "_iterator.geo_index".

These fields can be combined with "creationCriteriaScript" scriptlets to filter out unwanted associations. For example, when looping over entity1 and entity2 with the same entity type, the following script ensures that an association never links an entity to itself:

"creationCriteriaScript": "$SCRIPT( return _iterator.entity1_index != _iterator.entity2_index; )",
"iterateOver": "entity1/entity2",
"entity1": "EmailAddress",
"entity2": "EmailAddress",
//etc

The creationCriteriaScript runs before the association is generated, so it can safely be used to remove items that would otherwise return errors.
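
The filtering described above can be sketched in standalone JavaScript. This is only an illustration of the idea, not the toolkit's implementation: the entity indexes are made-up sample data, and the function name creationCriteria is an assumption used to stand in for the scriptlet.

```javascript
// Illustrative only: mimics how a creationCriteriaScript predicate could
// filter the candidate pairs produced by "iterateOver": "entity1/entity2".
// The entity indexes below are made-up sample data.
const emailIndexes = ["a@example.com/emailaddress", "b@example.com/emailaddress"];

// Same test as the scriptlet above, written as a plain function
function creationCriteria(_iterator) {
    return _iterator.entity1_index != _iterator.entity2_index;
}

const kept = [];
for (const entity1_index of emailIndexes) {
    for (const entity2_index of emailIndexes) {
        const _iterator = { entity1_index, entity2_index };
        // The predicate runs first; rejected candidates are never built
        if (creationCriteria(_iterator)) {
            kept.push(_iterator);
        }
    }
}

console.log(kept.length); // 2 of the 4 candidate pairs remain
```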

See detailed example below

Examples

iterateOver

iterateOver a Metadata Field

The following code example is used to process email communication between two parties.  The association between the two entities is formed by iterating over the metadata object: "email_meta.Message-To"

{
    "associations": [
        {
            "assoc_type": "Event",
            "entity1": "$SCRIPT( return _doc.metadata._FILE_METADATA_[0].metadata.Author[0];)",
            "entity2": "$SCRIPT(return _value;)",
            "iterateOver": "email_meta.Message-To",
            "time_start": "$SCRIPT( return _doc.publishedDate;)",
            "verb": "emailed",
            "verb_category": "emailed/communicated"
        }
    ]
}

Output:

The output displays the association between the sender and receiver in the email correspondence.

 ],
    "associations": [
        {
            "entity1": "cara.semperger@enron.com",
            "entity1_index": "cara.semperger@enron.com/account",
            "verb": "emailed",
            "verb_category": "emailed/communicated",
            "entity2": "will.smith@enron.com",
            "entity2_index": "will.smith@enron.com/account",
            "time_start": "2001-07-09T14:33:32",
            "assoc_type": "Event"
        }
    ],
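
As a rough standalone sketch of the looping behaviour used in this example, each element of the metadata array is handed to the specification as _value, with its position as _index. The function name iterateOverField and the second recipient address are assumptions for illustration; this is not the toolkit's code.

```javascript
// Illustrative only: one association per element of a metadata array,
// with each element exposed as _value and its position as _index.
function iterateOverField(values, buildAssociation) {
    return values.map(function (_value, _index) {
        return buildAssociation(_value, _index);
    });
}

// Sample recipients; the second address is made up for the illustration
const recipients = ["will.smith@enron.com", "jane.doe@example.com"];

const associations = iterateOverField(recipients, function (_value, _index) {
    return {
        entity1: "cara.semperger@enron.com",
        entity2: _value, // plays the role of "$SCRIPT(return _value;)"
        verb: "emailed",
        verb_category: "emailed/communicated"
    };
});

console.log(associations.length); // one association per recipient
```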

 

Multiplicative

Multiplicative associations are created by "multiplying" a combination of entities, locations, and times together: one association is generated for every combination, which determines the number of associations extracted from the source data.

In the example, a perpetrator (Sunni Islamic Extremist) attacked multiple types of victims (an adult and a child) in Sri Lanka.

The association specification uses the multiplicative format, so the total number of associations is given by: Entity1 (Person Perpetrator) * Entity2 (Victim Type) * Geo_index (Location) = Total Number of Associations.
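
The arithmetic above can be illustrated with a small standalone sketch that enumerates every combination. The function name multiplicative and the entity index strings are assumptions made up to match the scenario; the toolkit's own implementation may differ.

```javascript
// Illustrative only: every combination of entity1 x entity2 x geo_index.
function multiplicative(entity1s, entity2s, geos) {
    const combos = [];
    for (const entity1_index of entity1s) {
        for (const entity2_index of entity2s) {
            for (const geo_index of geos) {
                combos.push({ entity1_index, entity2_index, geo_index });
            }
        }
    }
    return combos;
}

// Made-up indexes: one perpetrator, two victim types, one location
const combos = multiplicative(
    ["sunni islamic extremist/personperpetrator"],
    ["adult/victimtype", "child/victimtype"],
    ["sri lanka/location"]
);

console.log(combos.length); // 1 * 2 * 1 = 2
```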

 

},
                {
                    "creationCriteriaScript": "$FUNC( isOrganizationSpecified(); )",
                    "entity1": "Organization",
                    "entity2": "FacilityType",
                    "geo_index": "Location",
                    "iterateOver": "entity1/entity2/geo_index",
                    "time_start": "$SCRIPT( return _doc.metadata.incidentdate[0]; )",
                    "verb": "attacked",
                    "verb_category": "assault/attack"
                },
                {
                    "creationCriteriaScript": "$FUNC( isOrganizationSpecified(); )",
                    "entity1": "Organization",
                    "entity2": "VictimType",
                    "geo_index": "Location",
                    "iterateOver": "entity1/entity2/geo_index",
                    "time_start": "$SCRIPT( return _doc.metadata.incidentdate[0]; )",
                    "verb": "attacked",
                    "verb_category": "assault/attack"
                },
                {
                    "creationCriteriaScript": "$FUNC( isOrganizationSpecified(); )",
                    "entity1": "Organization",
                    "entity2": "HostageType",
                    "geo_index": "Location",
                    "iterateOver": "entity1/entity2/geo_index",
                    "time_start": "$SCRIPT( return _doc.metadata.incidentdate[0]; )",
                    "verb": "took hostage",
                    "verb_category": "assault/attack"
                },
                {
                    "creationCriteriaScript": "$SCRIPT( if (isOrganizationSpecified() == false) return true; )",
                    "entity1": "PersonPerpetrator",
                    "entity2": "FacilityType",
                    "geo_index": "Location",
                    "iterateOver": "entity1/entity2/geo_index",
                    "time_start": "$SCRIPT( return _doc.metadata.incidentdate[0]; )",
                    "verb": "attacked",
                    "verb_category": "assault/attack"
                },
                {
                    "creationCriteriaScript": "$SCRIPT( if (isOrganizationSpecified() == false) return true; )",
                    "entity1": "PersonPerpetrator",
                    "entity2": "VictimType",
                    "geo_index": "Location",
                    "iterateOver": "entity1/entity2/geo_index",
                    "time_start": "$SCRIPT( return _doc.metadata.incidentdate[0]; )",
                    "verb": "attacked",
                    "verb_category": "assault/attack"
                },
                {
                    "creationCriteriaScript": "$SCRIPT( if (isOrganizationSpecified() == false) return true; )",
                    "entity1": "PersonPerpetrator",
                    "entity2": "HostageType",
                    "geo_index": "Location",
                    "iterateOver": "entity1/entity2/geo_index",
                    "time_start": "$SCRIPT( return _doc.metadata.incidentdate[0]; )",
                    "verb": "took hostage",
                    "verb_category": "assault/attack"
                }

 

Sample Output:

TODO add sample output.

Associative

Associative (or "additive") associations cover the less common case where several entity types (eg 2) have the same number of elements and are ordered "in lock step". For example:

"entities": [
	{ "index": "alex/person", ... },
	{ "index": "craig/person", ... },
	{ "index": "baltimore/city", ... },
	{ "index": "washington dc/city", ...},
	...
]

In this case the additive association specification:

{
	"iterateOver": "entity1,entity2", // note "," instead of "/"
	"entity1": "Person",
	"entity2": "City",
	"verb_category": "lives in",
	...
}

would generate the 2 associations "alex/person lives in baltimore/city" and "craig/person lives in washington dc/city".
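
The lock-step pairing can be sketched in standalone JavaScript as follows. The function name lockStep is hypothetical, and stopping at the shorter list when the lengths differ is an assumption of this sketch rather than documented toolkit behaviour.

```javascript
// Illustrative only: pairs the i-th entity of one type with the i-th
// entity of the other, instead of building every combination.
function lockStep(entity1s, entity2s) {
    // Assumption: if the lists differ in length, stop at the shorter one
    const count = Math.min(entity1s.length, entity2s.length);
    const pairs = [];
    for (let i = 0; i < count; i++) {
        pairs.push({ entity1_index: entity1s[i], entity2_index: entity2s[i] });
    }
    return pairs;
}

const people = ["alex/person", "craig/person"];
const cities = ["baltimore/city", "washington dc/city"];
const pairs = lockStep(people, cities);

pairs.forEach(function (p) {
    console.log(p.entity1_index + " lives in " + p.entity2_index);
});
```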

 
