Info

Note that the source format is being modified, so this gallery is not very active. Once the new format is finalized, it will be converted to a set of examples in the new format.

By category

Harvest Types
Search and update cycles
Generating metadata
Generating entities and associations using NLP
Generating entities from metadata

...

Generating associations from entities
Retaining and discarding metadata for storage and/or indexing
  • Retaining/discarding metadata for storage
  • Retaining/discarding entities, associations, metadata for indexing: GNIP sample

By source

GNIP

...

Source, example documents and output

Categories:

  • File (JSON)
  • Unstructured Analysis
    • Javascript
  • Structured Analysis
    • Entities
    • Associations
    • Entities and associations from arrays
    • Javascript
WITS

...

Source, example documents and output

Categories:

  • File (XML)
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
DC crime data

...

Source, example documents and output

Categories:

  • Database (MySQL)
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
Enron data

...

Source, example documents and output

Categories:

  • File ("office")
  • Entities from NLP
  • Unstructured Analysis
    • regex
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
Web-hosted XML

Source, example documents and output

Categories:

  • Feed (Web)
  • Following Links
    • xpath
    • javascript
  • Unstructured Analysis
    • javascript
Log file data

Source, example documents and output

Categories:

  • Feed (Web), File (line-separated)
  • Following Links
    • xpath
    • javascript
  • Unstructured Analysis
    • javascript
  • Structured Analysis
    • Entities
    • Associations
    • Entities from arrays
    • Javascript
Feed Source

Source, example documents and output

Categories:

  • Feed (RSS)
  • Entities from NLP