...

  • If "automatic" is used, no metadata is generated and the fullText is the split object itself. For example, the XML example above yields two documents with no metadata and the following fullText fields:
    • "<meta>1</meta><url>http://blah1</url>"
    • "<meta>2</meta><url>http://blah2</url>"
  • If "automatic_json" is used, the fullText is the same as for "automatic", but the metadata object contains a single field, "json", holding the JSON-ified object, e.g.:
    • "metadata": { "json": [ { "meta": "1", "url": "http://blah1" } ] }
    • "metadata": { "json": [ { "meta": "2", "url": "http://blah2" } ] }
  • If "automatic_xml" is used, the result is similar, except that the metadata object contains one element for each field of the JSON-ified object, e.g.:
    • "metadata": { "meta": [ "1" ], "url": [ "http://blah1" ] }
    • "metadata": { "meta": [ "2" ], "url": [ "http://blah2" ] }
  • (i.e. the "automatic_json" and "automatic_xml" metadata formats are consistent with those derived from the "file" extractor element)
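
The three modes above can be sketched in a few lines of Python. This is only an illustration of the output shapes described in the list, not the platform's actual implementation: the function name build_document and the flat one-level XML-to-JSON conversion are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_document(full_text, mode):
    """Hypothetical helper: build one document from one split object.

    full_text is the XML fragment for a single split object, e.g.
    "<meta>1</meta><url>http://blah1</url>".
    """
    # Wrap the fragment in a synthetic root so it parses as well-formed XML.
    root = ET.fromstring("<doc>" + full_text + "</doc>")
    # Assumed "JSON-ified" form: a flat mapping of tag name to text value.
    obj = {child.tag: child.text for child in root}

    if mode == "automatic":
        metadata = {}  # no metadata is generated
    elif mode == "automatic_json":
        metadata = {"json": [obj]}  # single "json" field holding the object
    elif mode == "automatic_xml":
        # one metadata element per field, each value wrapped in a list
        metadata = {key: [value] for key, value in obj.items()}
    else:
        raise ValueError("unknown mode: " + mode)

    # fullText is the split object itself in all three modes.
    return {"fullText": full_text, "metadata": metadata}
```

Calling build_document('<meta>2</meta><url>http://blah2</url>', "automatic_xml") gives a metadata object with "meta" and "url" fields, each holding a one-element list, while "automatic_json" nests the same object under a single "json" field.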

...