The Infinit.e Structured Analysis Harvester takes data ingested from structured data sources (database tables, XML documents, etc.) and enriches it by assigning geospatial information, entities, and events. It can also transform source data, either through basic string concatenation (with simple regular expression support) or through more complex transformations written in JavaScript. The example Source.structuredAnalysis object below demonstrates the basic features used to specify how harvested structured data is enriched.

Source.structuredAnalysis object
source : {
   ... 
   structuredAnalysis : {
        docGeo : {"lat":"$metadata.latitude","lon":"$metadata.longitude"},
        description : "$metadata.reportdatetime: $metadata.offense,$metadata.method was reported at: $metadata.blocksiteaddress",
        entities : [
            {disambiguous_name:"$metadata.offense,$metadata.method", dimension:"What", 
                type:"CriminalActivity"},
            {disambiguous_name:"$metadata.blocksiteaddress,$metadata.city,$metadata.state",
                dimension:"Where",type:"Place", geotag: {latitude:"$metadata.latitude",
                longitude:"$metadata.longitude"}}],
        events : [ 
            {entity1:"$metadata.offense,$metadata.method",verb:"reported",verb_category:"crime",
                time_start:"$metadata.reportdatetime","geo_index" : "Location", 
                geotag: {latitude:"$metadata.latitude",longitude:"$metadata.longitude"} }]
   }
   ...
}
Using the $ Operator to Extract Document Data

When structured data is extracted from a source (via the File, Database, or other harvester), each extracted field is captured in the Feed.metadata object. Within the Structured Analysis Harvester, data stored in the metadata object can be accessed using the $ operator, which signals that a value should be retrieved from a field in the document. For example, in the structuredAnalysis object above, the Offense field is extracted using the following syntax:

$metadata.offense

Note: When data is extracted and added to the metadata object, all field names are converted to lowercase.
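
To make the substitution concrete, the following is a minimal JavaScript sketch of how a $metadata.field reference could be resolved against a record whose field names have been lowercased. It is not the harvester's actual implementation: the helper names lowercaseKeys and resolveTemplate, the regular expression, and the sample record are all assumptions made for illustration.

```javascript
// Hypothetical helper: lowercases field names, mirroring the note above
// about field names being converted when added to the metadata object.
function lowercaseKeys(record) {
  const out = {};
  for (const [key, value] of Object.entries(record)) {
    out[key.toLowerCase()] = value;
  }
  return out;
}

// Hypothetical helper: replaces each $metadata.<field> reference in a
// template string with the matching value. References to fields that do
// not exist are left untouched (an assumption, not documented behavior).
function resolveTemplate(template, metadata) {
  return template.replace(/\$metadata\.([A-Za-z0-9_]+)/g, (match, field) =>
    Object.prototype.hasOwnProperty.call(metadata, field)
      ? String(metadata[field])
      : match
  );
}

// Made-up sample record, for illustration only.
const metadata = lowercaseKeys({ Offense: "THEFT", Method: "OTHER" });

// Same pattern as the disambiguous_name of the first entity above.
const resolved = resolveTemplate("$metadata.offense,$metadata.method", metadata);
console.log(resolved);
```

Because the keys are lowercased before lookup, a reference such as $metadata.Offense would not match in this sketch, which is why the examples on this page use lowercase field names throughout.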
