...
By category
Harvest Types
- Feed
- RSS: Feed Source
- HTML: Log File Source Gallery, Web-hosted XML
- Following links: Log File Source Gallery, Web-hosted XML
- File
- "office": Enron sample
- Line-separated: Log File Source Gallery
- XML: WITS sample
- JSON: GNIP sample
- Database
- SQL: DC crime sample
...
- Using regex
- Using javascript: GNIP sample, Log File Source Gallery, Enron sample
- Global functions
- Accessing external content
- Using xpath: Web-hosted XML (web following context)
- Metadata pipelines
Generating entities and associations using NLP
- Text cleansing: Log File Source Gallery, Enron sample
- Specifying the text extraction engine: Feed Source
- Specifying the entity extraction engine: Enron sample, Feed Source
Generating entities from metadata
...
Source, example documents and output
Categories:
- File (JSON)
- Unstructured Analysis
- Javascript
- Structured Analysis
- Entities
- Associations
- Entities and associations from arrays
- Javascript
...
Source, example documents and output
Categories:
- File (XML)
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript
...
Source, example documents and output
Categories:
- Database (mysql)
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript
...
Source, example documents and output
Categories:
- File ("office")
- Entities from NLP
- Unstructured Analysis
- regex
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript
Web-hosted XML
Source, example documents and output
Categories:
- Feed (Web)
- Following Links
- xpath
- Javascript
- Unstructured Analysis
- Javascript
Log file data
Source, example documents and output
Categories:
- Feed (Web), File (line-separated)
- Following Links
- xpath
- Javascript
- Unstructured Analysis
- Javascript
- Structured Analysis
- Entities
- Associations
- Entities from arrays
- Javascript
Feed Source
Source, example documents and output
Categories:
- Feed (RSS)
- Entities from NLP