...

  • The user query is turned into an ElasticSearch query and applied across the cluster.
  • The number of documents returned from ElasticSearch is capped at a "large" number (default 1000, eg 10x the number of documents to return). The documents are ordered by their Lucene score (or, optionally, just by descending date).
  • Each returned document is then assigned a significance score as described below.
  • The significance and relevance scores are then normalized against each other based on a relative importance specified by the user (default 2:1 in favor of significance) and combined, with the mean score set to 100 (like the "+" stats in baseball, eg 120 is 20% higher than average).
  • The top scoring documents or entities are returned to the client.
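
The combination step above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the exact normalization, so the sketch assumes each score set is divided by its own mean, weighted by the relative importance (2:1 by default) and rescaled so that the mean combined score is 100. The function name `combine_scores` and its parameters are hypothetical, not part of the platform's API.

```python
def combine_scores(significance, relevance, sig_weight=2.0, rel_weight=1.0):
    """Combine per-document significance and relevance scores (illustrative only).

    Assumption: each score list is normalized by its own mean, then the two
    are mixed in the ratio sig_weight:rel_weight and scaled so the mean
    combined score is 100.
    """
    if len(significance) != len(relevance):
        raise ValueError("score lists must have the same length")
    if not significance:
        return []

    n = len(significance)
    sig_mean = sum(significance) / n
    rel_mean = sum(relevance) / n
    if sig_mean <= 0 or rel_mean <= 0:
        raise ValueError("mean scores must be positive to normalize")

    total_weight = sig_weight + rel_weight
    return [
        100.0 * (sig_weight * s / sig_mean + rel_weight * r / rel_mean) / total_weight
        for s, r in zip(significance, relevance)
    ]


# Three documents with equal relevance but rising significance.
scores = combine_scores([1.0, 2.0, 3.0], [10.0, 10.0, 10.0])
```

With this scheme a combined score of 120 reads as "20% above the average document in this result set", matching the baseball-style interpretation described above.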

...

"decay" is the "half life" of the decay (ie the duration from "time" at which the score is halved). It is in the format "N[dmwy]", where N is an integer and d, m, w, y denote "day", "month", "week" and "year" respectively (eg "1w", "1m"; note that currently, if "m" is used, the duration is always 1 month regardless of N).
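
As a rough illustration of the "decay" format, the sketch below parses a value such as "1w" into a number of days and applies it as a half life. Several details are assumptions rather than documented behavior: the helper names `parse_decay_days` and `decay_factor` are hypothetical, a month is taken as 30 days and a year as 365 days, and the decay curve is assumed to be exponential (the text only states that the score is halved at the given duration).

```python
import re

# Assumed unit lengths in days; the month and year lengths are approximations.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "y": 365}


def parse_decay_days(decay):
    """Parse a decay string of the form N[dmwy] into a half life in days."""
    match = re.fullmatch(r"(\d+)([dmwy])", decay)
    if match is None:
        raise ValueError(f"invalid decay format: {decay!r}")
    count, unit = int(match.group(1)), match.group(2)
    if unit == "m":
        # Per the note above, "m" currently always means exactly 1 month.
        count = 1
    if count == 0:
        raise ValueError("half life must be positive")
    return count * _UNIT_DAYS[unit]


def decay_factor(age_days, decay):
    """Multiplier applied to a score for a document age_days away from "time".

    Assumption: exponential decay, so the factor is 0.5 at one half life.
    """
    half_life = parse_decay_days(decay)
    return 0.5 ** (abs(age_days) / half_life)


one_week_old = decay_factor(7, "1w")
```

Under these assumptions, a document exactly one half life away from "time" keeps half of its score, and one two half lives away keeps a quarter.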

...