Code Block

{
    "description": "wits test",
    "isPublic": true,
    "mediaType": "Report",
    "searchCycle_secs": -1,
    "tags": [
        "incidents",
        "nctc",
        "terrorism",
        "wits",
        "events",
        "worldwide"
    ],
    "title": "wits test",
    "processingPipeline": [
        {
            "file": {
                "XmlIgnoreValues": [
                    "DefiningCharacteristicList",
                    "TargetedCharacteristicList",
                    "WeaponTypeList",
                    "PerpetratorList",
                    "VictimList",
                    "EventTypeList",
                    "CityStateProvinceList",
                    "FacilityList"
                ],
                "XmlPrimaryKey": "icn",
                "XmlRootLevelValues": [
                    "Incident"
                ],
                "XmlSourceName": "https://wits.nctc.gov/FederalDiscoverWITS/index.do?N=0&Ntk=ICN&Ntx=mode%20match&Ntt=",
                "domain": "XXX",
                "password": "XXX",
                "username": "XXX",
                "url": "smb://modus:139/wits/allfiles/"
            }
        },

Configuring CSV/SV

There are two options for configuring CSV:

Specify the field names manually
Derive the field names from the header

Both options are described below.

Specifying the field names manually

You can use XmlRootLevelValues to set the field names for CSV/SV file parsing.

When you do this, CSV parsing occurs automatically and the records are mapped into a metadata object called "csv" with the field names corresponding to the values of this array.
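The mapping described above can be illustrated with a minimal Python sketch. It is not the extractor's own code: the function name `map_records`, the sample field names, and the comma separator are all assumptions made for illustration. It only shows how a list of field names (playing the role of XmlRootLevelValues) pairs up with the values on each line to form a "csv" object.

```python
import csv
import io


def map_records(text, field_names, separator=","):
    """Map each delimited line onto a {"csv": {...}} metadata object.

    field_names plays the role of XmlRootLevelValues (hypothetical helper,
    not part of the platform).
    """
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Pair the configured names with the values found on this line.
        records.append({"csv": dict(zip(field_names, row))})
    return records


# Hypothetical log content and field names, for illustration only.
sample = "10.0.0.1,443,allow\n10.0.0.2,22,deny\n"
records = map_records(sample, ["src_ip", "port", "action"])
```

Each entry in `records` holds one line of the file, keyed by the configured field names in order.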

In the following sample source, the file extractor is configured to act on .csv content ("type": "csv"). The field names would be listed in "XmlRootLevelValues", which is left empty here.

Code Block

{
    "description": "For cyber demo",
    "isPublic": false,
    "mediaType": "Log",
    "searchCycle_secs": 3600,
    "tags": [
        "cyber",
        "structured"
    ],
    "title": "Cyber Logs Test",
    "processingPipeline": [
        {
            "file": {
                "XmlRootLevelValues": [],
                "domain": "DOMAIN",
                "password": "PASSWORD",
                "type": "csv",
                "username": "USER",
                "url": "smb://FILESHARE:139/cyber_logs/"
            }
        },

Using XmlIgnoreValues to Derive Field Names Automatically

The field names can also be derived automatically from a header line by setting XmlIgnoreValues. In this case, XmlRootLevelValues need not be set.

For "*sv" files

TODO source example


The "XmlIgnoreValues" field is used to identify header lines. The start of each line is compared to each element of "XmlIgnoreValues"; if it matches, that line is designated as a header and does not generate a document.

Furthermore, if a header line contains the right number of fields (it is the first matching line that consists of more than one separator-delimited field), it is used to generate the field names used in the "csv" object.

Example:

If "XmlIgnoreValues": ["#"], and the first three lines are "#", "#header", and "#field1,field2,field3", then processing will assume that the three fields are named field1, field2, and field3.
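The example can be traced with a short Python sketch of the header-detection rule. This is an illustration under stated assumptions, not the platform's implementation: the function name `parse_with_header` is hypothetical, the separator is assumed to be a comma, the matched prefix is assumed to be stripped before the names are read, and data lines that appear before any usable header are simply skipped.

```python
def parse_with_header(lines, ignore_values, separator=","):
    """Derive field names from the first multi-field header line.

    ignore_values plays the role of XmlIgnoreValues (hypothetical helper).
    """
    field_names = None
    records = []
    for line in lines:
        # A line is a header if it starts with any of the ignore values.
        prefix = next((p for p in ignore_values if line.startswith(p)), None)
        if prefix is not None:
            # Header lines never generate a document.
            tokens = line[len(prefix):].split(separator)  # assumption: prefix is stripped
            if field_names is None and len(tokens) > 1:
                field_names = tokens
            continue
        if field_names is not None:
            values = line.split(separator)
            records.append({"csv": dict(zip(field_names, values))})
    return field_names, records


# The three header lines from the example, plus one hypothetical data line.
lines = ["#", "#header", "#field1,field2,field3", "a,b,c"]
field_names, records = parse_with_header(lines, ["#"])
```

The lines "#" and "#header" match the ignore value but contain only a single field, so they are skipped without setting any names. The third line supplies the three field names, and the data line is then mapped onto them.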
