CSV Data

Ingest .CSV Data and Generate Entities

A common Manager task is to ingest a .csv file and perform some basic source configuration on it in order to generate usable Entities.

Uploading the .CSV File

After you have located a .csv file that is appropriate for the platform, you can upload it using the File Uploader.

To upload the file

  1. Navigate to Manager>File Uploader.
  2. Fill in the fields of the File Uploader and choose the file.
  3. Click on Submit.

Make sure you take note of the generated share ID, which is displayed after submission. The ID is the alphanumeric character string displayed to the right of get/. Copy it, because you will paste it into the source in a later step.
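
If you copy the whole link rather than just the ID, the ID can be cut out of it programmatically. The following is a minimal sketch only: the function name extract_share_id and the example link are hypothetical, and the only assumption taken from this page is that the ID is the alphanumeric string to the right of get/.

```python
def extract_share_id(link: str) -> str:
    """Return the alphanumeric text to the right of the last 'get/' in a link."""
    marker = "get/"
    _, found, tail = link.rpartition(marker)
    if not found:
        raise ValueError("link does not contain 'get/'")
    # Drop any query string or trailing slash that may follow the ID.
    share_id = tail.split("?", 1)[0].strip("/")
    if not share_id.isalnum():
        raise ValueError(f"unexpected share ID: {share_id!r}")
    return share_id


# Hypothetical link; the host and path before get/ are placeholders.
print(extract_share_id("https://example.com/api/share/get/5421edf6e4b00c006cf54cd6"))
```

The function raises an error instead of guessing when the link does not look as expected, so a wrong value is less likely to end up in the source configuration.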

Creating the New Source

Once the .csv file is uploaded you can create a new source.

To create the new source

  1. Navigate to Source Editor>New Source.
  2. Select template: Infinit.e ZIP Archives/JSON Share Example.
  3. Click on Select.
  4. Fill in the remaining information, and ensure you select the correct Community.
  5. Click on Save Source.


Editing the Source

You can edit the source using the JSON editor or the Source Builder.*

*Enterprise edition only

To edit the source using Source Builder

  1. From the newly created source, click on SRC UI. The Source Builder is displayed.
  2. Delete the elements in the Source View, except for the File Extractor.
  3. Paste the previously copied share ID into the url field of the Form View.
  4. Scroll down and set type to "line-separated".
  5. Escape from the Source Builder.
  6. Click on Save Source.
  7. Click on Test Source.

The tested source is displayed in a new window.

At this point you should review the tested source to ensure it is as expected. A common problem at this stage is that a badly formatted .csv file can make it difficult to properly identify the .csv headers.
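
Header problems of this kind can often be spotted in the local file before it is uploaded. The sketch below is illustrative and not part of the platform: the function name find_header_problems and the specific checks (empty names, surrounding whitespace, stray quotes, duplicates, mismatched field counts) are assumptions about what "badly formatted" typically means.

```python
import csv
import io


def find_header_problems(csv_text: str) -> list[str]:
    """Return human-readable warnings about the header row of a CSV document."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows, None)
    if header is None:
        return ["file is empty"]
    problems = []
    seen = set()
    for index, name in enumerate(header):
        if not name.strip():
            problems.append(f"column {index}: empty header")
        elif name != name.strip():
            problems.append(f"column {index}: surrounding whitespace in {name!r}")
        if '"' in name or "'" in name:
            problems.append(f"column {index}: stray quote in {name!r}")
        if name in seen:
            problems.append(f"column {index}: duplicate header {name!r}")
        seen.add(name)
    # A data row with a different field count usually means the header was misread.
    for row_no, row in enumerate(rows, start=2):
        if row and len(row) != len(header):
            problems.append(f"row {row_no}: {len(row)} fields, expected {len(header)}")
    return problems


sample = 'entity_name,entity_type, doc_count\n"Kenya","Country",5\n'
for warning in find_header_problems(sample):
    print(warning)
```

An empty result does not guarantee that the source test will succeed, but any warning listed here is worth fixing in the file or compensating for in the advanced configuration.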

Advanced Configuration for .CSV Files

The three key fields for the File Extractor when extracting .csv files are the following:

  • RootLevelValues
    • TODO:definition
  • IgnoreValues
    • TODO definition
  • AttributePrefix
    • TODO definition

Common Advanced Configurations to Rectify Problems:

  • Try pasting the problematic headers from the Source Test Output into RootLevelValues, and remove any problematic quotes or other characters.
  • Use IgnoreValues to indicate any values that should be ignored, e.g. #.
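
The two fixes above can be tried out locally before editing the source. This is a rough sketch under stated assumptions: clean_headers and drop_ignored_lines are hypothetical helper names, and the assumption that an ignore value such as # marks whole lines to be skipped is an illustration, not a statement of how the File Extractor behaves.

```python
def clean_headers(raw_header_line: str, separator: str = ",") -> list[str]:
    """Split a header line and strip quotes and whitespace from each name."""
    return [
        name.strip().strip("\"'").strip()
        for name in raw_header_line.split(separator)
    ]


def drop_ignored_lines(lines, ignore_values=("#",)):
    """Keep lines that do not start with an ignore value (assumed semantics)."""
    prefixes = tuple(ignore_values)
    return [line for line in lines if not line.lstrip().startswith(prefixes)]


# Header text as it might be pasted from the Source Test Output.
print(clean_headers('"entity_name", "entity_type",\'doc_count\''))

# Lines beginning with # are treated as comments and removed.
print(drop_ignored_lines(["# exported 2014", "Kenya,Country,5"]))
```

The header split is deliberately naive and does not handle a separator that appears inside a quoted name; for such files, clean the names by hand.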

Re-testing

After you have made your advanced configurations, you can re-test.

A successful test result should show populated metadata for each record, as indicated in the JSON example below.

{
    "communityId": ["53add292e4b015f8f5817611"],
    "created": "Sep 23, 2014 10:46:27 PM UTC",
    "description": "\"St. Louis-area police\",\"Organization\",\"Who\",5.263157894736842,1.232778207145495,0.05263157894736842,1,1,0,0.0,0.0",
    "mediaType": ["Report"],
    "metadata": {"csv": [{
        "doc_count": "1",
        "entity_dimension": "Who",
        "entity_name": "St. Louis-area police",
        "entity_type": "Organization",
        "query_avg_frequency": "0.05263157894736842",
        "query_coverage": "5.263157894736842",
        "query_significance": "1.232778207145495",
        "total_frequency": "1"
    }]},
    "modified": "Sep 23, 2014 10:02:30 PM UTC",
    "publishedDate": "Sep 23, 2014 10:02:30 PM UTC",
    "source": ["CSV Example"],
    "sourceKey": ["inf...share.5421edf6e4b00c006cf54cd6.miscDescription."],
    "sourceUrl": "inf://share/5421edf6e4b00c006cf54cd6/miscDescription/csv file",
    "tags": ["csv"],
    "title": "csv file",
    "url": "inf://share/5421edf6e4b00c006cf54cd6/miscDescription/csv file/0fadc6615d7a5a77625881f8bb61092b.sv"
}
{
    "communityId": ["53add292e4b015f8f5817611"],
    "created": "Sep 23, 2014 10:46:27 PM UTC",
    "description": "\"Kenya\",\"Country\",\"Where\",5.263157894736842,1.188316079123999,0.05263157894736842,5,8,0,0.0,0.0",
    "mediaType": ["Report"],
    "metadata": {"csv": [{
        "doc_count": "5",
        "entity_dimension": "Where",
        "entity_name": "Kenya",
        "entity_type": "Country",
        "query_avg_frequency": "0.05263157894736842",
        "query_coverage": "5.263157894736842",
        "query_significance": "1.188316079123999",
        "total_frequency": "8"
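
The test output is a run of JSON objects one after another rather than a single JSON array, so a plain json.loads call on the whole text would fail. The sketch below shows one way to walk through such output and confirm that each document carries csv metadata. It assumes the complete, untruncated output has been copied into a string; iter_json_objects and csv_records are hypothetical names, and the sample data is abbreviated from the example above.

```python
import json


def iter_json_objects(text: str):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def csv_records(doc: dict) -> list:
    """Return the list stored under metadata.csv, or an empty list if absent."""
    return doc.get("metadata", {}).get("csv", [])


sample = """
{"title": "csv file", "metadata": {"csv": [{"entity_name": "Kenya", "entity_type": "Country"}]}}
{"title": "csv file"}
"""
for doc in iter_json_objects(sample):
    status = "ok" if csv_records(doc) else "no csv metadata"
    print(doc["title"], "-", status)
```

A document reported as having no csv metadata suggests that the headers were not identified and that the advanced configuration needs another pass.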

 

Entity Enrichment

Once the source is successfully returning metadata, you can add either Manual Entities or Automated Entities in order to enrich the .csv data.

To add Entities

TODO

 


Publish the Source

TODO

 


Related User Documentation:

Source Manager User Guide