Search index settings

Overview

Once data is ingested into Infinit.e from the various extractors, it is stored in JSON format, including its metadata fields and content. Each stored document also contains sub-objects such as entities and associations.

To improve performance, you can tell Infinit.e which fields should be searchable.

By configuring search index settings, you can determine whether the following fields are indexed into Lucene or excluded:

  • entity fields
  • association fields
  • entity fields with geo data
  • association fields with geo data
  • document fields

Format

{
    "display": string,
    "searchIndex": {
        "entityFilter": string, // (regex applied to entity indexes, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
        "assocFilter": string, // (regex applied to new-line separated entity indexes in associations, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
        "entityGeoFilter": string, // (regex applied to entity indexes if the entity has geo, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
        "assocGeoFilter": string, // (regex applied to new-line separated entity indexes in associations with geo, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
        "fieldList": [ string ], // (comma-separated list of doc fields, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
        "metadataFieldList": [ string ] // (comma-separated list of doc fields, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
    }
}

 

Description

The following table describes the parameters of the search index settings configuration.

Parameter            Description
entityFilter         Regex applied to entity indexes. Starts with "+" or "-" to indicate inclusion/exclusion; defaults to include-only.
assocFilter          Regex applied to the new-line separated entity indexes in associations. Starts with "+" or "-" to indicate inclusion/exclusion; defaults to include-only.
entityGeoFilter      Regex applied to entity indexes if the entity has geo. Starts with "+" or "-" to indicate inclusion/exclusion; defaults to include-only.
assocGeoFilter       Regex applied to the new-line separated entity indexes in associations with geo. Starts with "+" or "-" to indicate inclusion/exclusion; defaults to include-only.
fieldList            Comma-separated list of doc fields. Starts with "+" or "-" to indicate inclusion/exclusion; defaults to include-only.
metadataFieldList    Comma-separated list of doc fields. Starts with "+" or "-" to indicate inclusion/exclusion; defaults to include-only.
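
As a rough illustration of how the "+" or "-" prefix on the regex filters can be read, here is a minimal Python sketch. The names RegexFilter, parse_regex_filter and is_indexed are hypothetical and are not part of Infinit.e. The sketch also assumes that the regex may match anywhere in the value; this page does not say whether Infinit.e requires a full match, so treat that as an assumption.

```python
import re
from typing import NamedTuple


class RegexFilter(NamedTuple):
    include: bool        # True: index only matching values; False: index everything else
    pattern: re.Pattern  # the regex with the "+"/"-" prefix removed


def parse_regex_filter(spec: str) -> RegexFilter:
    """Split an optional leading '+' or '-' from the regex (no prefix means include)."""
    include = True
    if spec.startswith("+"):
        spec = spec[1:]
    elif spec.startswith("-"):
        include, spec = False, spec[1:]
    return RegexFilter(include, re.compile(spec))


def is_indexed(value: str, flt: RegexFilter) -> bool:
    # Assumption: a match anywhere in the value counts (re.search, not re.fullmatch).
    matched = flt.pattern.search(value) is not None
    return matched if flt.include else not matched


entity_filter = parse_regex_filter("-Keyword|Topic")
print(is_indexed("Keyword", entity_filter))  # False: excluded by the filter
print(is_indexed("Person", entity_filter))   # True: does not match, so still indexed
```

For association filters the value passed in would be the new-line separated entity indexes of the association, so a filter such as "-.*theme.*" excludes an association when any of its lines contains "theme".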

 

Use Cases

There are two main use cases for search index settings:

  • Provide a regex that filters entity or association fields out of search indexing.
  • Provide a comma-separated list of document fields to be filtered out of search indexing.

See the examples below.
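
To make the field-list notation concrete, the following Python sketch shows one way a comma-separated list with an optional "+" or "-" prefix could be interpreted. The helper names parse_field_list and field_is_indexed are hypothetical, and exact matching of field names is an assumption; this page does not describe how Infinit.e compares names internally.

```python
from typing import List, Tuple


def parse_field_list(spec: str) -> Tuple[bool, List[str]]:
    """Return (include, fields) for a spec such as "-title" or "+a,b"."""
    include = True  # no prefix is treated as include-only
    if spec.startswith(("+", "-")):
        include = spec[0] == "+"
        spec = spec[1:]
    fields = [name.strip() for name in spec.split(",") if name.strip()]
    return include, fields


def field_is_indexed(field: str, spec: str) -> bool:
    # Assumption: field names are compared exactly (no wildcards or prefixes).
    include, fields = parse_field_list(spec)
    return (field in fields) if include else (field not in fields)


print(field_is_indexed("title", "-title"))                             # False
print(field_is_indexed("fieldSet1", "fieldSet1,fieldSet2.indexable"))  # True
```

With an exclusion list, every field not named is still indexed; with an inclusion list (or no prefix), only the named fields are.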

Examples

Indexing Metadata

In this example, the title is not indexed, entities of type Keyword or Topic are not indexed, associations containing the string "theme" are not indexed, and only the listed metadata fields (fieldSet1 and fieldSet2.indexable) are indexed.

        },
        {
            "searchIndex": {
                "fieldList": "-title",
                "entityFilter": "-Keyword|Topic",
                "assocFilter": "-.*theme.*",
                "metadataFieldList": "fieldSet1,fieldSet2.indexable"
            }
        }
    ]
}

 

 

Footnotes:

Legacy documentation:

  • See legacy documentation under "Format"