Search index settings
Overview
Once data is ingested into Infinit.e from the various extractors, it is stored in JSON format, including its metadata fields and content. Each stored document also contains sub-objects such as entities and associations.
You can tell Infinit.e which fields should be searchable in order to improve performance.
By configuring the search index settings, you determine whether the following fields are indexed into Lucene or excluded:
- entity fields
- association fields
- entity fields with geo data
- association fields with geo data
- document fields
This page is organized into the following sections for ease of navigation: Format, Description, Use Cases, and Examples.
Format
    {
        "display": string,
        "searchIndex": {
            "entityFilter": string, // (regex applied to entity indexes, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
            "assocFilter": string, // (regex applied to new-line separated entity indexes in associations, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
            "entityGeoFilter": string, // (regex applied to entity indexes if the entity has geo, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
            "assocGeoFilter": string, // (regex applied to new-line separated entity indexes in associations with geo, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
            "fieldList": [ string ], // (comma-separated list of doc fields, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
            "metadataFieldList": [ string ] // (comma-separated list of doc metadata fields, starts with "+" or "-" to indicate inclusion/exclusion, defaults to include-only)
        }
    }
Description
The following table describes the parameters of the search index settings configuration.
Parameter | Description |
---|---|
entityFilter | Regex applied to entity indexes. Starts with "+" or "-" to indicate inclusion or exclusion; defaults to include-only. |
assocFilter | Regex applied to the new-line separated entity indexes in associations. Starts with "+" or "-" to indicate inclusion or exclusion; defaults to include-only. |
entityGeoFilter | Regex applied to entity indexes if the entity has geo data. Starts with "+" or "-" to indicate inclusion or exclusion; defaults to include-only. |
assocGeoFilter | Regex applied to the new-line separated entity indexes in associations with geo data. Starts with "+" or "-" to indicate inclusion or exclusion; defaults to include-only. |
fieldList | Comma-separated list of document fields. Starts with "+" or "-" to indicate inclusion or exclusion; defaults to include-only. |
metadataFieldList | Comma-separated list of document metadata fields. Starts with "+" or "-" to indicate inclusion or exclusion; defaults to include-only. |
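As a rough illustration of how the two geo-specific filters might be set, the fragment below excludes some geo-tagged entities and includes some geo-tagged associations. Both regex values are hypothetical and chosen only for illustration. The `//` comments are annotations in the style of the Format section and would need to be removed from a real configuration.

```json
{
    "searchIndex": {
        // hypothetical pattern: do not index geo-tagged entities whose index contains "test"
        "entityGeoFilter": "-.*test.*",
        // hypothetical pattern: presumably indexes only geo-tagged associations
        // whose entity indexes contain "city"
        "assocGeoFilter": "+.*city.*"
    }
}
```

The `.*text.*` form mirrors the "containing the string" pattern used in the Examples section; whether a shorter pattern would match the same indexes depends on how Infinit.e applies the regex, which this page does not specify.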
Use Cases
There are two main use cases for search index settings:
- Provide a regex to exclude entity or association fields from search indexing
- Provide a comma-separated list of document fields to exclude from search indexing
See the examples below.
Examples
Indexing Metadata
In this example, the title is not indexed, entities of type Keyword or Topic are not indexed, associations containing the string "theme" are not indexed, and only a few metadata fields are indexed.
    },
    {
        "searchIndex": {
            "fieldList": "-title",
            "entityFilter": "-Keyword|Topic",
            "assocFilter": "-.*theme.*",
            "metadataFieldList": "fieldSet1,fieldSet2.indexable"
        }
    }
    ]
    }
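For contrast, an inclusion-style configuration might look like the sketch below. It is a hypothetical variant of the example above, not taken from a real source: the field name `description` is an assumption, and the `//` comments are annotations only and would need to be removed from a real configuration.

```json
{
    "searchIndex": {
        // explicit "+": presumably only these document fields are indexed
        // ("description" is an assumed field name)
        "fieldList": "+title,description",
        // no prefix: per the Description table, this defaults to inclusion
        "entityFilter": "Keyword|Topic"
    }
}
```

Writing the "+" explicitly makes the intent easier to see when inclusion and exclusion filters sit side by side in the same source.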
Footnotes:
Legacy documentation:
- Source configuration objects - legacy under "searchIndexSettings"
- See legacy documentation under "Format"