Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 2 Next »

Overview

Starting with either the raw content (or the content transformed by a preceding manual or automated text pipeline element), applies the javascript, regex, or xpath transformation and writes the output to the document's full text (or description, or title, or one of the textual metadata fields).

TODO

Format

TODO convert to JSON

{
	"display": string,
	"text": [
	{} // see ManualTextExtractionSpecPojo below
	]
}
//////////////////////////////////
	public static class ManualTextExtractionSpecPojo {
		public String fieldName; // One of "fullText", "description", "title"
		public String script; // The script/xpath/javascript expression (see scriptlang below)
		public String flags; // Standard Java regex field (regex/xpath only), plus "H" to decode HTML
		public String replacement; // Replacement string for regex/xpath+regex matches, can include capturing groups as $1 etc
		public String scriptlang; // One of "javascript", "regex", "xpath"
	}

Legacy documentation:

TODO

Description

Legacy documentation:

TODO

Examples

TODO

  • No labels