Source gallery

Note that the source format is being modified, so this gallery is not very active. Once the new format is finalized, the gallery will be converted to a set of examples in the new format.

By category

Harvest Types
Search and update cycles
Generating metadata
Generating entities and associations using NLP
Generating entities from metadata
Generating associations from metadata
Generating associations from entities
Retaining and discarding metadata for storage and/or indexing
  • Retaining/discarding metadata for storage
  • Retaining/discarding entities, associations, metadata for indexing: GNIP sample

By source

GNIP

Source, example documents and output

Categories:

  • File (JSON)
  • Unstructured Analysis
    • Javascript
  • Structured Analysis
    • Entities
    • Associations
    • Entities and associations from arrays
    • Javascript
WITS

Source, example documents and output

Categories:

  • File (XML)
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
DC crime data

Source, example documents and output

Categories:

  • Database (MySQL)
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
Enron data

Source, example documents and output

Categories:

  • File ("office")
  • Entities from NLP
  • Unstructured Analysis
    • regex
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
Web-hosted XML

Source, example documents and output

Categories:

  • Feed (Web)
  • Following Links
    • xpath
    • javascript
  • Unstructured Analysis
    • Javascript
Log file data

Source, example documents and output

Categories:

  • Feed (Web), File (line-separated)
  • Following Links
    • xpath
    • javascript
  • Unstructured Analysis
    • Javascript
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
Feed Source

Source, example documents and output

Categories:

  • Feed (RSS)
  • Entities from NLP