Specifying Document Level Geographical Location

The Structured Analysis Harvester can assign a geographical location to a document using latitude and longitude values. There are two ways to have the harvester assign a location to a document during harvesting:

Specify the Latitude and Longitude Explicitly

The docGeo object lets you specify which fields of your structured data source supply the latitude and longitude values. In the example below, the "lat" (latitude) value is extracted from the document's metadata.latitude field, and the "lon" (longitude) value is extracted from the metadata.longitude field of the extracted document.

Source.structuredAnalysis object
"structuredAnalysis" : {
    ...
    "docGeo" : {
        "lat" : "$metadata.latitude",
        "lon" : "$metadata.longitude"
    },
    ...
}

Specify City, State/Province, and Country

The second method of specifying a document level geographical location is to specify city, state/province, and country in the docGeo object, as shown in the example below. (Note: The example uses embedded JavaScript. Information on using JavaScript within the structuredAnalysis object can be found here: Transforming data with JavaScript.)

Source.structuredAnalysis object
"structuredAnalysis" : {
    ...
    "docGeo" : {
        "city" : "$SCRIPT( return _doc.metadata.location[0].citystateprovince.city; )",
        "stateProvince" : "$SCRIPT( return _doc.metadata.location[0].citystateprovince.stateprovince; )",
        "country" : "$SCRIPT( return _doc.metadata.location[0].country; )"
    },
    ...
}
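
To make the script expressions above concrete, the sketch below evaluates the same property paths against a hypothetical document. The sample values and the resolveDocGeo helper are assumptions made for illustration; only the property paths come from the example above.

```javascript
// Hypothetical document shape implied by the $SCRIPT expressions above.
// The values are made up for illustration.
const _doc = {
  metadata: {
    location: [
      {
        citystateprovince: { city: "Baltimore", stateprovince: "Maryland" },
        country: "United States"
      }
    ]
  }
};

// Plain-JavaScript equivalents of the three $SCRIPT bodies (hypothetical helper).
function resolveDocGeo(doc) {
  const loc = doc.metadata.location[0];
  return {
    city: loc.citystateprovince.city,
    stateProvince: loc.citystateprovince.stateprovince,
    country: loc.country
  };
}

console.log(resolveDocGeo(_doc));
```

Note that each script reads location[0], so only the first entry of the location array contributes to the document level location.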

The Structured Analysis Harvester attempts to use the supplied city, state/province, and country information to retrieve latitude and longitude values from the Infinit.e GeoReference table via the GeoReference.enrichGeoInfo() method.

Note: The enrichGeoInfo() function takes a Boolean parameter, exactMatchOnly, that specifies whether only an exact match should be returned. If exactMatchOnly is set to false, enrichGeoInfo() attempts to broaden its search when it cannot match the original search parameters (city, state, country). As a last resort, it applies the average latitude and longitude values for the location's country (the geographical center point) if a more precise match cannot be obtained.
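
The broadening behavior described in the note can be sketched as follows. This is an illustrative approximation, not the Infinit.e implementation: the in-memory GEO_TABLE, the lookupGeo function, the sample coordinates, and the order in which fields are dropped are all assumptions. Only the exactMatchOnly flag and the final country level fallback come from the description above.

```javascript
// Hypothetical stand-in for the GeoReference table (values are illustrative).
const GEO_TABLE = [
  { city: "Paris", stateProvince: "Ile-de-France", country: "France", lat: 48.86, lon: 2.35 },
  { city: "Lyon", stateProvince: "Auvergne-Rhone-Alpes", country: "France", lat: 45.76, lon: 4.84 }
];

// Return the rows that match every field defined in the query.
function matchRows(query) {
  return GEO_TABLE.filter(row =>
    Object.keys(query).every(k => query[k] === undefined || row[k] === query[k]));
}

// Average the coordinates of a non-empty set of rows.
function centroid(rows) {
  const sum = rows.reduce(
    (acc, r) => ({ lat: acc.lat + r.lat, lon: acc.lon + r.lon }),
    { lat: 0, lon: 0 });
  return { lat: sum.lat / rows.length, lon: sum.lon / rows.length };
}

// Assumed broadening order: full query, then state + country, then country only.
function lookupGeo({ city, stateProvince, country }, exactMatchOnly) {
  const exact = matchRows({ city, stateProvince, country });
  if (exact.length > 0) return { lat: exact[0].lat, lon: exact[0].lon };
  if (exactMatchOnly) return null; // caller asked for exact matches only

  if (stateProvince !== undefined) {
    const byState = matchRows({ stateProvince, country });
    if (byState.length > 0) return centroid(byState);
  }
  const byCountry = matchRows({ country });
  return byCountry.length > 0 ? centroid(byCountry) : null;
}

// "Nice" is not in the toy table, so only the non-exact call returns a value.
console.log(lookupGeo({ city: "Nice", country: "France" }, true));
console.log(lookupGeo({ city: "Nice", country: "France" }, false));
```

In this sketch the country fallback averages the known rows for that country, which stands in for the country center point mentioned in the note.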

Alternatives

In some cases, it will not be clear what geographical type a field is (e.g., a freeform field that might be a city, state, or country). The geographical specification allows you to specify alternatives, e.g.:

"structuredAnalysis" : {
    ...
    "docGeo" : {
        "city" : "$SCRIPT( return _doc.metadata.cityOrStateOrProvinceOrCountry; )",
        "country" : "USA",
        "alternatives" : [
            {
                "stateProvince" : "$SCRIPT( return _doc.metadata.cityOrStateOrProvinceOrCountry; )",
                "country" : "USA"
            },
            {
                "country" : "$SCRIPT( return _doc.metadata.cityOrStateOrProvinceOrCountry; )"
            }
        ]
    },
    ...
}

The alternatives are tried in order until one of them works or there are no more to try.
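
The ordering rule can be sketched as below. The resolveWithAlternatives function and the geocode callback are hypothetical names, and the sketch assumes that the primary specification is tried before the alternatives and that an attempt "works" when the lookup returns coordinates.

```javascript
// Try the primary docGeo specification, then each alternative in order.
// `geocode` is a hypothetical lookup that returns {lat, lon} or null.
function resolveWithAlternatives(docGeo, geocode) {
  const { alternatives = [], ...primary } = docGeo;
  for (const candidate of [primary, ...alternatives]) {
    const result = geocode(candidate);
    if (result !== null) return result; // first successful candidate wins
  }
  return null; // no candidate could be resolved
}

// Example: a freeform value that turns out to be a state name.
const value = "Texas";
const docGeo = {
  city: value,
  country: "USA",
  alternatives: [
    { stateProvince: value, country: "USA" },
    { country: value }
  ]
};

// Toy lookup that only recognizes Texas as a US state (illustrative coordinates).
const toyGeocode = spec =>
  spec.stateProvince === "Texas" && spec.country === "USA"
    ? { lat: 31.0, lon: -100.0 }
    : null;

console.log(resolveWithAlternatives(docGeo, toyGeocode));
```

Here the city interpretation fails, the state/province alternative succeeds, and the country alternative is never tried.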

Further Reading