...
Info |
---|
Setting the URL to the default above is often undesirable: unlike title, description, fullText, and displayUrl, the document "url" field cannot be changed afterwards, because it is used for deduplication. A more complex syntax therefore allows the URL to be derived from one or more fields:
|
...
Code Block | ||||
---|---|---|---|---|
| ||||
//// The source { //... "processingPipeline": [ //... { "splitter": { "scriptlang": "automatic_json", "script": "fullText.object, http://test/{01}/{12}, url, meta" } } //... ] //... } //// would map the extracted document { "url": "blahurl", "title": "blah", "fullText": "<objects><object><meta>1</meta><url>blah1</url></object><object><meta>2</meta><url>blah2</url></object></objects>" } //// to the 2 derived docs: { "title": "blah (1)", "url": "http://test/blah1/1", "fullText": "<object><meta>1</meta><url>blah1</url></object>", "metadata": { "json": [ { "meta": "1", "url": "blah1" } ] } }, { "title": "blah (2)", "url": "http://test/blah2/2", "fullText": "<object><meta>2</meta><url>blah2</url></object>", "metadata": { "json": [ { "meta": "2", "url": "blah2" } ] } } //// Of course, subsequent pipeline elements can then manipulate/add fields other than "url" as per usual |
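
To make the mapping concrete, here is a minimal Python sketch of what such a splitter might do with the example document. It is not the platform's implementation: the function `split_document` and its parameters are hypothetical. It assumes that the URL template is filled positionally from the listed fields, in the order given, and that the derived title is the parent title with a 1-based index appended, as the example output suggests. The placeholders use Python's zero-based `str.format` numbering, which may differ from the platform's own placeholder syntax.

```python
import xml.etree.ElementTree as ET


def split_document(doc, object_tag, url_template, url_fields):
    """Hypothetical helper: one derived document per <object_tag> element in fullText."""
    root = ET.fromstring(doc["fullText"])
    derived = []
    for index, element in enumerate(root.iter(object_tag), start=1):
        # Flatten the element's children into a {tag: text} record.
        record = {child.tag: (child.text or "") for child in element}
        # Assumption: the template is filled positionally from the named fields.
        url = url_template.format(*(record[name] for name in url_fields))
        derived.append({
            "title": f"{doc['title']} ({index})",
            "url": url,
            "fullText": ET.tostring(element, encoding="unicode"),
            "metadata": {"json": [record]},
        })
    return derived


source_doc = {
    "url": "blahurl",
    "title": "blah",
    "fullText": "<objects><object><meta>1</meta><url>blah1</url></object>"
                "<object><meta>2</meta><url>blah2</url></object></objects>",
}

docs = split_document(source_doc, "object", "http://test/{0}/{1}", ["url", "meta"])
for d in docs:
    print(d["url"], "|", d["title"])
```

Because each derived URL is built from fields of the object itself, the same object should produce the same URL on every harvest, which is what deduplication on "url" relies on.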
...