Note that the source format is being modified, so this gallery is not very active. Once the new format is finalized, the gallery will be converted to a set of examples in the new format.
By category
Harvest Types
- Feed
- RSS
- HTML: Log File Source Gallery
- Following links: Log File Source Gallery
- File
- "Office": Enron sample
- Line-separated: Log File Source Gallery
- XML: WITS sample
- JSON: GNIP sample
- Database
- SQL: DC crime sample
Search and update cycles
- Search cycles
- Update cycles: DC crime sample
Generating metadata
- Using regex
- Using javascript: GNIP sample, Log File Source Gallery, Enron sample
- Global functions
- Accessing external content
- Using xpath
- Metadata pipelines
Generating entities and associations using NLP
- Text cleansing: Log File Source Gallery, Enron sample
- Specifying the text extraction engine
- Specifying the entity extraction engine
Generating entities from metadata
- With strings/replacement: GNIP sample, WITS sample, DC crime sample, Log File Source Gallery, Enron sample
- With javascript: GNIP sample, WITS sample, Log File Source Gallery, Enron sample
- Global functions: WITS sample
- From metadata arrays: GNIP sample
Generating associations from metadata
- With strings/replacement: GNIP sample, Log File Source Gallery, Enron sample
- With javascript: GNIP sample, Log File Source Gallery, Enron sample
- Global functions: WITS sample
- From metadata arrays: GNIP sample, Log File Source Gallery, Enron sample
Generating associations from entities
- With strings: DC crime sample, Log File Source Gallery, Enron sample
- With javascript: WITS sample, Log File Source Gallery, Enron sample
Retaining and discarding metadata for storage and/or indexing
- Retaining/discarding metadata for storage
- Retaining/discarding entities, associations, metadata for indexing: GNIP sample
By source
GNIP
Source, example documents and output
Categories:
- File
- JSON
- Unstructured Analysis
- Javascript
- Structured Analysis
- Entities
- Associations
- Entities and associations from arrays
- Javascript
WITS
Source, example documents and output
Categories:
- File
- XML
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript
DC crime data
Source, example documents and output
Categories:
- Database
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript
Enron data
Source, example documents and output
Categories:
- File
- "Office"
- Entities from NLP
- Unstructured Analysis
- Regex
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript