Advanced Options

The Advanced Options appear when you click Options > Advanced Options.

Description

Use the Advanced Options to set document analysis thresholds, scoring weights, geo decay, and other settings that affect the overall functionality of the platform and the widgets.

Field | Description | Notes
Enable/Disable Scoring

When disabled, documents and entities are returned without significance or relevance scoring.

Default setting is enabled.

 
Number of documents to analyze

The maximum number of documents returned from the Lucene/Elasticsearch query and analyzed by the significance algorithm.

Default is 1,000 Documents.

 
Scoring Weights

Ratio of Significance vs. Relevance scoring weights

Default setting is 2:1 (Significance:Relevance)
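
One way to picture the ratio is as a weighted average of the two scores. The sketch below is illustrative only: the function name `combined_score` and the weighted-average formula are assumptions, and the platform's actual combination rule is not documented here.

```python
def combined_score(significance: float, relevance: float,
                   sig_weight: float = 2.0, rel_weight: float = 1.0) -> float:
    """Blend two scores as a weighted average (assumed formula).

    The defaults mirror the documented 2:1 Significance:Relevance ratio.
    """
    total = sig_weight + rel_weight
    if total <= 0:
        raise ValueError("the weights must sum to a positive number")
    return (sig_weight * significance + rel_weight * relevance) / total


# With the default 2:1 ratio, significance counts twice as much as relevance.
print(combined_score(90.0, 30.0))  # 70.0
```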

 
Weight by

Applies half-life principle to results ranking, based on desired time interval.

Time decay date:

Hour:

Half Life:

 

Example: with a time decay of 1m (one month), results within 1 month of the entered date are promoted to the top of the results; results 1 to 2 months from the decay date have their scores halved; results 2 to 3 months from the decay date have their scores quartered, and so on.
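
The half-life example for time decay can be sketched as a multiplier applied to each result's score. This is a minimal illustration that follows the stepwise wording of the example; the function name `time_decay_factor`, the 30-day approximation of "1m", and the stepwise (rather than smooth) decay are all assumptions, not the platform's confirmed implementation.

```python
from datetime import datetime, timedelta


def time_decay_factor(doc_date: datetime, decay_date: datetime,
                      half_life: timedelta) -> float:
    """Return the score multiplier for a document (assumed stepwise decay).

    The score is halved once for every full half-life interval that
    separates the document date from the decay date.
    """
    if half_life <= timedelta(0):
        raise ValueError("half_life must be positive")
    age = abs(decay_date - doc_date)
    full_intervals = age // half_life  # timedelta // timedelta gives an int
    return 0.5 ** full_intervals


decay_date = datetime(2024, 1, 1)
one_month = timedelta(days=30)  # assumption: "1m" approximated as 30 days

for days_old in (10, 45, 75):
    doc_date = decay_date - timedelta(days=days_old)
    print(days_old, time_decay_factor(doc_date, decay_date, one_month))
```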
Geo Decay

Applies half-life principle to results ranking based on distance (kilometers) from lat/long centerpoint.

Default: None

Example: a geo decay of 10k (10 kilometers) means that results 10-20k from the centerpoint have their scores reduced by 50%, results 20-30k from the centerpoint have their scores reduced to 25% of the original, and so on.
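
Geo decay follows the same half-life idea, with distance from the centerpoint in place of time. The sketch below is a hedged illustration: `haversine_km` and `geo_decay_factor` are hypothetical helper names, the great-circle distance formula is a standard choice rather than one confirmed for the platform, and the stepwise halving mirrors the wording of the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_decay_factor(distance_km: float, half_life_km: float) -> float:
    """Halve the score once per full half-life distance (assumed stepwise)."""
    if half_life_km <= 0:
        raise ValueError("half_life_km must be positive")
    return 0.5 ** int(distance_km // half_life_km)


# With a 10 km geo decay: 5 km keeps the full score, 15 km is halved,
# and 25 km is quartered.
for distance in (5.0, 15.0, 25.0):
    print(distance, geo_decay_factor(distance, 10.0))
```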
Aggregate Significance Weighting

If true, aggregated entities are weighted by relevance so that entities occurring in more relevant documents are weighted up.

Automatic: The platform attempts to decide for itself based on the query. 

Always Weight: Entities in more relevant documents are always weighted up.

Never Weight: Entities in more relevant documents are never weighted up.

Default: Automatic

 
Manual Weightings (in order of precedence)

Source Weights: Documents from a weighted source are promoted to the top of the query results

Example: "www.google.com.search.56.6.":1.25,"www.google.com.search.532.23":0.75

Type Weights: Documents matching a weighted source type are promoted to the top of the query results

Example:

"News":1.1,"Social":4.0

 Tag Weights: Documents matching weighted source tag are promoted to top of query results

Example:

"mysql":2.0,"katrina":1.1


 Default: None 

The weights are applied as follows:

  • First the source weights are applied.

  • If no source weight matches the document, then the type weights are applied.

  • If no type weight matches the document, then a tag weight is generated by averaging all matching entries from the "tagWeights" map.

  • If no weight matches the document, then its total score is preserved. 
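
The precedence rules above can be sketched as a single lookup function. This is an illustration under stated assumptions: the document field names (`sourceKey`, `mediaType`, `tags`) and the function name `manual_weight` are hypothetical, and returning a multiplier of 1.0 is simply one way to express that the total score is preserved.

```python
def manual_weight(doc: dict, source_weights: dict, type_weights: dict,
                  tag_weights: dict) -> float:
    """Pick the weight for one document, in the documented order of precedence."""
    # 1. A matching source weight wins outright.
    source = doc.get("sourceKey")
    if source in source_weights:
        return source_weights[source]

    # 2. Otherwise fall back to a matching type weight.
    media_type = doc.get("mediaType")
    if media_type in type_weights:
        return type_weights[media_type]

    # 3. Otherwise average every tag weight that matches the document.
    matches = [tag_weights[tag] for tag in doc.get("tags", [])
               if tag in tag_weights]
    if matches:
        return sum(matches) / len(matches)

    # 4. Nothing matched: leave the score unchanged.
    return 1.0


source_weights = {"www.google.com.search.532.23": 0.75}
type_weights = {"News": 1.1, "Social": 4.0}
tag_weights = {"mysql": 2.0, "katrina": 1.1}

doc = {"sourceKey": "example.source", "mediaType": "Blog",
       "tags": ["mysql", "katrina"]}
print(manual_weight(doc, source_weights, type_weights, tag_weights))
```

In this usage example neither the source nor the type matches, so the result is the average of the two matching tag weights.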

Return documents

If disabled, documents are excluded from query results, so no results are displayed in the Doc Viewer. Only entities, events, facts, geo-tags,

Default: Enabled

 
Return Standalone Events

Aggregates events, facts and summaries while retaining the temporal element (unlike event/fact aggregation)

 Standalone events are only viewable in the event timeline widget (vs event/fact aggregations in the event graph)

Default: Disabled 

 
Max Documents to Return

Maximum number of documents to return in the Doc Viewer results list

Default: 100 Documents

 
Documents to Skip

Discards the first X documents. If set to 100, documents 1-100 are skipped.

Default: None
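
Taken together, Documents to Skip and Max Documents to Return behave like an offset and a page size. The sketch below illustrates that reading with a plain list slice; the function name `page_of_documents` is hypothetical, and it assumes the skip is applied before the maximum.

```python
def page_of_documents(ranked_docs: list, skip: int = 0, limit: int = 100) -> list:
    """Drop the first `skip` documents, then return at most `limit` of the rest."""
    if skip < 0 or limit < 0:
        raise ValueError("skip and limit must be non-negative")
    return ranked_docs[skip:skip + limit]


ranked = list(range(1, 301))  # stand-in for 300 ranked documents, numbered 1-300
second_page = page_of_documents(ranked, skip=100, limit=100)
print(second_page[0], second_page[-1])  # 101 200
```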

 
Include Entities

If disabled, entities and entity scoring are not returned with queries, which slightly improves performance and query time.

Default: Enabled

 
Score Entities

If disabled, entities are still returned with queries, but significance/relevance scoring is not performed, which slightly improves performance and query time.

Default: Enabled

 
Include Geotags

If disabled, geotags are excluded from query results. Map Widget will not display docs.

Default: Enabled

 
Include Metadata

If disabled, source-specific metadata is not returned with query results.

Default: Enabled

 
Include Summaries

If disabled, document summaries are excluded from query results.

Default: Enabled

 
Include Events

If disabled, events are excluded from query results.

Default: Enabled

 
Include Facts

If disabled, facts are excluded from query results.

Default: Enabled

 
Aggregate Geotags

If disabled, most common geotags are not aggregated.

Default: Enabled

Max Geotags to Return - Default: 1,000

 
Aggregate Times

If disabled, document counts of query results are not aggregated.

Default: Enabled

Aggregation Interval - Default: 1w (one week)

 
Aggregate Entities

If disabled, top ranking entities are not aggregated.

Default: Enabled

Max Entities to Return - Default: 250 (recommend setting to 3,000+)

 
Aggregate Events

If disabled, top ranking events are not aggregated.

Default: Enabled

Max Events to Return - Default: 100

 
Aggregate Facts

If disabled, top ranking facts are not aggregated.

Default: Enabled

Max Facts to Return - Default: 100

 
Aggregate Sources

Default: Disabled

Aggregate Source Metadata

Default: Disabled

Entity Filters

Docs not containing an entity of that type will be discarded

 Other entities types will be discarded from docs that are promoted

Negative filters: entering a minus (-) before an entity type discards that entity type from all results, with no effect on the query (e.g. -keyword will omit all keywords).

The entity type filter or association verb category filter can be specified in one of two ways (in Advanced Options):

  • Negatively, as a comma-separated list starting with "-" 
    • (see under "Entity Filter" in the screenshot below: no entities with type "Topic" or "Keyword" would be included in the query dataset)
  • Positively, as a comma-separated list 
    • (see under "Association Filter" in the screenshot below: in that case only associations with verb category "retweet" or "mentions" would be returned in the query results)
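
The positive and negative filter forms can be sketched as a small parser plus a filtering pass over the results. This is an illustration only: the function names, the document layout (an `entities` list whose items carry a `type` field), and the case-insensitive matching are assumptions rather than the platform's documented behavior.

```python
def parse_type_filter(spec: str) -> tuple[bool, set[str]]:
    """Split a filter string into (is_negative, set of lower-cased types)."""
    spec = spec.strip()
    negative = spec.startswith("-")
    if negative:
        spec = spec[1:]
    types = {part.strip().lower() for part in spec.split(",") if part.strip()}
    return negative, types


def filter_entities(docs: list[dict], spec: str) -> list[dict]:
    """Apply an entity type filter to a list of result documents."""
    negative, types = parse_type_filter(spec)
    if not types:
        return docs
    result = []
    for doc in docs:
        entities = doc.get("entities", [])
        if negative:
            # Negative filter: strip the listed types, keep every document.
            kept = [e for e in entities if e["type"].lower() not in types]
            result.append({**doc, "entities": kept})
        else:
            # Positive filter: keep only the listed types and discard
            # documents that contain none of them.
            kept = [e for e in entities if e["type"].lower() in types]
            if kept:
                result.append({**doc, "entities": kept})
    return result


docs = [
    {"id": 1, "entities": [{"name": "Jane Doe", "type": "Person"},
                           {"name": "election", "type": "Keyword"}]},
    {"id": 2, "entities": [{"name": "storm", "type": "Keyword"}]},
]
print([d["id"] for d in filter_entities(docs, "Person")])          # [1]
print([d["id"] for d in filter_entities(docs, "-Topic,Keyword")])  # [1, 2]
```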
Association Filters

Docs not containing an association of that type will be discarded

 Other association types will be discarded from docs that are promoted

Negative filters: entering a minus (-) before an association type discards that association type from all results, with no effect on the query (e.g. -generic relations).
