...
In the following example, manual text transformation is used to parse a log file over the web, using a script of type "javascript".
Code Block |
---|
{ "globals": { "scripts": [ "function decode(x)\n{\n var info = {}; \n var rec = x.split(','); \n info.device = rec[0];\n info.date = rec[1];\n info.srcIP = rec[2];\n info.dstIP = rec[3];\n info.alert = rec[4];\n info.country = rec[5];\n return info;\n}" ] } },
{ "harvest": { "searchCycle_secs": 3600 } },
{ "docMetadata": { "title": "$metadata.info.alert @ $metadata.info.date [$metadata.info.device]: $metadata.info.dstIP -> $metadata.info.srcIP", "publishedDate": "$SCRIPT( return _doc.metadata.info[0].date; )" } },
{ "contentMetadata": [ { "fieldName": "info", "script": "var info = decode(text); info;", "scriptlang": "javascript" } ] }, |
...
The "globals" element defines a function called "decode", which the "contentMetadata" script then calls to capture the metadata for the sample input data in a field called "info". The metadata captured in this example is as follows:
- info.device
- info.date
- info.srcIP
- info.dstIP
- info.alert
- info.country
The metadata captured from the sample input data then appears in the output document:
Code Block |
---|
], "fullText": "SCANNER_1 , 2012-01-01T13:43:00 , 10.0.0.1 , 66.66.66.66 , DUMMY_ALERT_TYPE_1 , United States", "mediaType": ["Log"], "metadata": {"info": [{ "alert": "DUMMY_ALERT_TYPE_1 ", "country": "United States", "date": "2012-01-01T13:43:00", "device": "SCANNER_1 ", "dstIP": "66.66.66.66", "srcIP": " 10.0.0.1" }]}, |
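To make the field mapping concrete, the "decode" function can be run outside the platform. The following is a minimal sketch, assuming a raw input line reconstructed from the "fullText" value above; the variable `text` stands in for the document text that the harvester would supply, and the exact raw line is an assumption.

```javascript
// decode() as defined in the "globals" script: split one
// comma-separated log line into named fields.
function decode(x) {
  var info = {};
  var rec = x.split(',');
  info.device = rec[0];
  info.date = rec[1];
  info.srcIP = rec[2];
  info.dstIP = rec[3];
  info.alert = rec[4];
  info.country = rec[5];
  return info;
}

// Assumed raw input line (hypothetical; reconstructed from the sample fullText).
var text = 'SCANNER_1,2012-01-01T13:43:00,10.0.0.1,66.66.66.66,DUMMY_ALERT_TYPE_1,United States';

// Equivalent of the contentMetadata script "var info = decode(text); info;".
var info = decode(text);
console.log(JSON.stringify(info, null, 2));
```

Note that "decode" does not trim whitespace, so any spaces around the commas in the text it receives remain in the field values, which would be consistent with values such as "SCANNER_1 " in the sample output.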
...
As a result, Infinit.e supports XPath 1.0 (with one minor extension to allow combined XPath regex).
In this example, an XPath script is used as part of manual text extraction to convert a sample XML document into JSON.
Original XML Source:
Code Block |
---|
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description>thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food>
<name>Homestyle Breakfast</name>
<price>$6.95</price>
<description>two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu> |
Source Configuration:
In the source configuration example below, an XPath script is specified to perform the JSON conversion.
Code Block |
---|
{ "links": { "extraMeta": [ { "context": "First", "fieldName": "convert_to_json", "flags": "o", "script": "//breakfast_menu/food[*]", "scriptlang": "xpath" } ], "script": "function convert_to_docs(jsonarray, url)\n{\n var docs = [];\n for (var docIt in jsonarray) {\n var predoc = jsonarray[docIt];\n delete predoc.content;\n var doc = {};\n doc.url = _doc.url.replace(/[?].*/,\"\") + '#' + docIt;\n doc.fullText = predoc;\n doc.title = \"TBD\";\n doc.description = \"TBD\";\n docs.push(doc);\n }\n return docs;\n}\nvar docs = convert_to_docs(_doc.metadata['convert_to_json'], _doc.url);\ndocs;", "scriptflags": "d" } }, |
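Because the "script" value is stored as an escaped string, its logic is easier to follow when unescaped. The following is a minimal sketch that runs the same "convert_to_docs" function against a stand-in `_doc` object; the URL and the shape of each converted food entry (including the `content` field) are assumptions made for illustration, not the platform's actual XPath output.

```javascript
// Stand-in for the document object that the platform passes to the script.
// The URL and the shape of each converted <food> entry are assumptions.
var _doc = {
  url: 'http://example.com/menu.xml?refresh=1',
  metadata: {
    convert_to_json: [
      { name: 'Belgian Waffles', price: '$5.95', calories: '650', content: '...' },
      { name: 'French Toast', price: '$4.50', calories: '600', content: '...' }
    ]
  }
};

// Same logic as the "script" field, unescaped. As in the original,
// the url parameter is accepted but the function reads _doc.url directly.
function convert_to_docs(jsonarray, url) {
  var docs = [];
  for (var docIt in jsonarray) {
    var predoc = jsonarray[docIt];
    delete predoc.content; // drop the content field before storing the entry
    var doc = {};
    // Strip any query string and give each entry its own URL fragment.
    doc.url = _doc.url.replace(/[?].*/, '') + '#' + docIt;
    doc.fullText = predoc;
    doc.title = 'TBD';
    doc.description = 'TBD';
    docs.push(doc);
  }
  return docs;
}

var docs = convert_to_docs(_doc.metadata['convert_to_json'], _doc.url);
console.log(JSON.stringify(docs, null, 2));
```

Each entry selected by the XPath expression becomes one document, distinguished by a "#" plus index suffix on the source URL.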
The sample output is then a series of JSON-formatted documents. For example,
...