Geo JSON format

Geo Format

This section describes the JSON format used to represent geospatial information, together with some of the rationale behind that representation (specifically the meaning and use of the "ontology_type" field).

Overview

Geospatial information is present in the following objects:

There are essentially two object formats. The first applies to the first three objects in the above list:

Most common geo format
// (entity or document object)
{
	"geotag": { // (or "docGeo" for documents)
		"lat": number,
		"lon": number
	},
	"ontology_type": string // Only present for entities; for documents/associations, defaults to "point" (see below)
}
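
As a minimal sketch of how a client might read this format (Python is used purely for illustration; the helper name read_geo and the returned dictionary shape are assumptions, not part of the platform API), the following function accepts either an entity ("geotag") or a document ("docGeo") and applies the "point" default described above:

```python
from typing import Any, Optional

# Default stated in the format description above for objects without an ontology type.
DEFAULT_ONTOLOGY_TYPE = "point"


def read_geo(obj: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Hypothetical helper: normalize an entity or document into lat/lon/ontology_type.

    Returns None when the object carries no geospatial information.
    """
    # Entities use "geotag"; documents use "docGeo".
    geo = obj.get("geotag") or obj.get("docGeo")
    if not geo:
        return None
    return {
        "lat": float(geo["lat"]),
        "lon": float(geo["lon"]),
        # "ontology_type" is only present for entities, so fall back to "point".
        "ontology_type": obj.get("ontology_type", DEFAULT_ONTOLOGY_TYPE),
    }
```

Normalizing both object kinds into one shape means downstream code (for example a map widget) does not need to know whether a point came from an entity or a document.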

And then for geo aggregation:

Geo aggregations
// (query object)
{
	"geo": [
		{
			"type": string, // this is the ontology type
			"count": integer, // the number of times this lat/long/type appears in the query
			"lat": number,
			"lon": number
		},
		//etc
	]
}
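
The following sketch shows one way a client could consume the "geo" array, for example to feed a heatmap. It is illustrative only: the sample values, SAMPLE_RESPONSE, and the helper names filter_by_type and heat_weights are all assumptions, not platform code.

```python
# Hypothetical query response fragment following the geo aggregation format above.
SAMPLE_RESPONSE = {
    "geo": [
        {"type": "city", "count": 3, "lat": 38.9, "lon": -77.0},
        {"type": "country", "count": 5, "lat": 38.0, "lon": -97.0},
        {"type": "point", "count": 2, "lat": 38.9, "lon": -77.0},
    ]
}


def filter_by_type(geo, include_types):
    """Keep only the aggregation entries whose ontology type is in include_types."""
    return [entry for entry in geo if entry["type"] in include_types]


def heat_weights(geo):
    """Sum the counts per (lat, lon) pair, ignoring the ontology type."""
    weights = {}
    for entry in geo:
        key = (entry["lat"], entry["lon"])
        weights[key] = weights.get(key, 0) + entry["count"]
    return weights


# Exclude country-sized entries, then merge the remaining counts per location.
small = filter_by_type(SAMPLE_RESPONSE["geo"], {"city", "point"})
weights = heat_weights(small)
```

Because the same lat/lon can appear once per ontology type, the counts are summed after filtering rather than before.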

Usage Guide

Ontology types are currently used in three ways in the Infinit.e platform (including the stock visualizations):

  • When querying: If the user specifies the "ontology_type" in the geo query, then only strictly "smaller" types are searched (eg if countrysubsidiary is specified, then only the city and point types will match). For the purpose of this heuristic, geographicalregion counts as being at the same level as continent.
  • Entities returned from a query (either as aggregations or child objects of documents): The ontology type is returned and the map widget allows the user to include or exclude flags based on type.
  • Geo aggregations: The ontology type is returned and the map widget allows the user to include or exclude points from the heatmap based on type.
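
The "strictly smaller" query heuristic in the first bullet can be sketched as a lookup over numeric levels. This is an illustration, not the platform's implementation: the TYPE_LEVEL table and the function searchable_types are hypothetical, and the position of "country" between countrysubsidiary and continent is an assumption inferred from the type names. Only the countrysubsidiary example and the geographicalregion/continent equivalence come from the text above.

```python
# Assumed ordering from smallest to largest; equal numbers mean "same level".
TYPE_LEVEL = {
    "point": 0,
    "city": 1,
    "countrysubsidiary": 2,
    "country": 3,             # assumption: sits between countrysubsidiary and continent
    "continent": 4,
    "geographicalregion": 4,  # treated as the same level as continent
}


def searchable_types(query_type):
    """Return the ontology types strictly smaller than query_type, smallest first."""
    level = TYPE_LEVEL[query_type]
    smaller = [t for t, lvl in TYPE_LEVEL.items() if lvl < level]
    return sorted(smaller, key=TYPE_LEVEL.get)
```

For example, searchable_types("countrysubsidiary") yields point and city, matching the example in the first bullet, and searchable_types("point") yields nothing because no type is strictly smaller.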

Field Guide

Ontology Type Overview

The meaning and usage of the "ontology_type" field merits further discussion.

One of the issues with using a single (lat,long) point to geotag locations is that actual locations are areas (ignoring height, which is a problem for another day). Whether you care about this depends on the scale at which the query is being launched, eg:

  • If you're looking from a national level then viewing cities as points is fine...
  • ... If you're looking at a city plus its surrounding area, then it likely isn't.
  • If you just want to see which countries are being talked about on a map, then viewing a country as a point is fine...
  • ... If you're viewing a mix of geotagged entities including cities and countries, then it becomes confusing.

The ideal solution is to represent "where" entities with a polygon of points, but this is computationally expensive, difficult to generate, etc (and also doesn't solve many of the visualization problems - eg how do you render a set of entities with some polygons, some points on a map?). The Community Edition roadmap includes plans to incorporate an initially limited set of polygons in an initially limited way, which was the "last story out" in V0.

In the meantime, the "ontology type" provides a consistent terminology for describing the size of a geotagged entity, so that it can be either included in or ignored by searches and visualizations.

For visualizations (eg inside actionscript or javascript), you can also use the type as an indicator to treat the point differently (eg go fetch its polygon from a third party source).

The remainder of this page clarifies this overview by explaining which values are allowed, how they are generated, and how they are used.

Specification

The values of the ontology type are a small subset of geographic types from the OpenCyc OWL ontology:

Note that apart from "geographicalregion", there is a clear hierarchy to the different types.

When entities are generated from OpenCalais or AlchemyAPI, the following regex mappings are applied (substring matching is allowed):

When generated manually (Manual entities) or by a different (plugin) entity extractor, the following rules are applied:

  • The user/plugin developer can specify the ontology type manually
  • The above matching rules are applied (ie defaulting to "point")
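
The two rules above can be sketched as a small resolver. Everything named here is hypothetical: the function resolve_ontology_type, the EXAMPLE_MAPPINGS patterns (placeholders only, not the platform's actual regex mappings), and the case-insensitive matching. Treating a manually specified type as taking precedence over the matching rules is also an assumption about how the two rules combine.

```python
import re

# Placeholder (pattern, ontology type) pairs for illustration only;
# these are NOT the platform's real mappings.
EXAMPLE_MAPPINGS = [
    (r"city", "city"),
    (r"continent", "continent"),
]


def resolve_ontology_type(explicit_type, extractor_type, mappings):
    """Pick an ontology type for an entity.

    explicit_type: type set manually by the user/plugin developer, or None.
    extractor_type: the type string reported by the entity extractor, or None.
    mappings: ordered (regex, ontology type) pairs; the first match wins.
    """
    # Assumed precedence: a manually specified type is used as-is.
    if explicit_type:
        return explicit_type
    for pattern, ontology_type in mappings:
        # re.search permits substring matches; ignoring case is an assumption.
        if re.search(pattern, extractor_type or "", re.IGNORECASE):
            return ontology_type
    # Nothing matched, so fall back to the documented default.
    return "point"
```

Passing the mappings in as a parameter keeps the sketch independent of any particular extractor's type names.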

Other values may be added in the future if needed (together with the possibility of allowing users to generate their own ontology type mappings).