RSS Source
A common Source Editor task is to process an RSS feed and configure the IKANOW text and feature extractors to produce the desired results.
...
- Create a new RSS Source using the Source Manager
- Edit the source using the Source Builder
- Publish the Source
Info |
---|
Another way to quickly create an RSS source is to use the Google Chrome plugin. |
...
4. Under the Source Template drop-down on the left side: select ‘RSS Source Template’ and click ‘Select’
5. Under the New Source template on the right side:
Enter a Title (e.g. NY Times Front Page) and Description (e.g. NY Times RSS)
Select a Community for your source (e.g. General News)
- Select ‘Save Source’
6. When saved, the source template reloads with:
Unique ‘Share ID’ for your source
- Title, Description, and Community ID entered in Step 5 are added to the corresponding fields in the JSON
7. Enter tags directly into the field provided by the GUI.
Note: Do not put a comma after the last tag, as in the example below
e.g. "tags": ["NY Times", "Front Page", "RSS"],
TODO: determine how to get "New York Times" added as a tag
8. Replace the template URL in the "url": field with the .rss or .xml URL of the desired RSS feed, keeping the comma after the closing quotation mark
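Because the tags and URL are edited directly in the source JSON, a trailing comma or a curly quote pasted from a word processor is an easy mistake to make. As a minimal sketch (not part of Infinit.e; the helper name `check_source_json` and the example.com URL are hypothetical), Python's standard `json` module can confirm that an edited fragment still parses before it is pasted into the Source Editor:

```python
import json


def check_source_json(text):
    """Return (True, None) if text parses as JSON, else (False, reason)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# Hypothetical fragment: only the "tags" and "url" keys come from the steps above.
good = '{"tags": ["NY Times", "Front Page", "RSS"], "url": "http://example.com/feed.xml"}'
# Same fragment with a comma after the last tag, which strict JSON rejects.
bad = '{"tags": ["NY Times", "Front Page", "RSS",], "url": "http://example.com/feed.xml"}'

for label, fragment in (("good", good), ("bad", bad)):
    ok, reason = check_source_json(fragment)
    print(label, "parses" if ok else f"fails: {reason}")
```

The same check also catches curly quotation marks around a tag, since JSON strings must use straight double quotes.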
Test the Source
Once you have correctly configured the URL, you can test the source
To test the source
- Click on Test Source
If the source has been correctly configured, Infinit.e returns the newly generated documents in a pop-up window.
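If the test returns no documents, one possible cause is that the URL does not point at a well-formed RSS feed. The following is a rough local check, independent of Infinit.e, that parses feed XML with Python's standard library and lists the items it finds. The sample feed, its example.com links, and the function name `list_rss_items` are all made up for illustration; in practice you would paste in the XML returned by your own feed URL.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for the content of a real feed URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def list_rss_items(xml_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    if root.tag != "rss":
        raise ValueError(f"root element is <{root.tag}>, expected <rss>")
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in list_rss_items(SAMPLE_FEED):
    print(title, link)
```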
Editing the Source
You will likely want to edit the source to tailor your extraction settings so that entities and associations are generated properly.
Two fields must be added to the end of the JSON to select the text and metadata extractors; which values to use depends on the content and the desired results:
For extraction of entities and associations when sentiment scoring is NOT desired, copy and paste the following:
"useExtractor":"OpenCalais",
"useTextExtractor": "boilerpipe"
For keyword extraction AND sentiment scoring, or for foreign language sources, copy and paste the following:
"useExtractor":"AlchemyAPI-metadata",
"useTextExtractor":"AlchemyAPI"
10. Next to Test Parameters, change the ‘Number of Documents:’ from ‘10’ to ‘2’, and select ‘Test Source’
11. Depending on your settings, you may need to accept pop-ups in order to receive the success or error message
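The two extractor configurations described in this section can also be applied programmatically when you prepare source JSON outside the GUI. This is only a sketch: the preset names `entities` and `sentiment`, the function `with_extractors`, and the example source are hypothetical, and it assumes the text extractor key is `useTextExtractor` in both configurations. The field values are the ones given in this section.

```python
import json

# Extractor pairs taken from this section; the preset names are hypothetical.
EXTRACTOR_PRESETS = {
    # Entities and associations, no sentiment scoring.
    "entities": {"useExtractor": "OpenCalais",
                 "useTextExtractor": "boilerpipe"},
    # Keywords plus sentiment scoring, or foreign language sources.
    "sentiment": {"useExtractor": "AlchemyAPI-metadata",
                  "useTextExtractor": "AlchemyAPI"},
}


def with_extractors(source, preset):
    """Return a copy of the source dict with the preset's two fields set."""
    updated = dict(source)
    updated.update(EXTRACTOR_PRESETS[preset])
    return updated


# Hypothetical source; a real one also carries the fields set in the template.
source = {"title": "NY Times Front Page", "tags": ["NY Times", "RSS"]}
print(json.dumps(with_extractors(source, "sentiment"), indent=2))
```

Serializing with `json.dumps` also takes care of comma placement, so the last field never ends with a trailing comma.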
Publishing the Source
13a. If a second test results in an error, copy the error message and send it to your IKANOW point of contact
13b. If successful, select ‘Publish Source’ to begin harvesting
For more information concerning text extraction and feature extraction, see section Toolkit.
Using Source Builder to Edit the Source
Source Builder provides an intuitive user interface for editing sources. You can use it to change the Text Extraction and Feature Extraction settings.
To edit the extraction settings
- From the Source Editor, click on SRC UI. The Source Builder is displayed.
- Use Source View and Form View to change the enginename using the drop-down, as indicated in the screenshot below.
In this example Automated Text extraction has been set to alchemyapi, and Automated Entities has been set to opencalais.
For more information concerning text extraction and feature extraction, see section Toolkit.
Re-testing
Code Block |
---|
{
  "associations": [
    {
      "assoc_type": "Summary",
      "entity1": "direct contact",
      "entity2": "Ebola",
      "entity2_index": "ebola/medicalcondition",
      "verb": "spread",
      "verb_category": "generic relations"
    },
    {
      "assoc_type": "Summary",
      "entity2": "Thomas R. Frieden",
      "entity2_index": "thomas r. frieden/person",
      "verb": "screen",
      "verb_category": "generic relations"
    },
    {
      "assoc_type": "Fact",
      "entity1": "Thomas R. Frieden",
      "entity1_index": "thomas r. frieden/person",
      "entity2": "CNN",
      "entity2_index": "cnn international/company",
      "verb": "tell",
      "verb_category": "generic relations"
    },
    {
      "assoc_type": "Fact",
      "entity1": "Michael S. Rawlings",
      "entity1_index": "michael s. rawlings/person",
      "entity2": "Mayor",
      "entity2_index": "mayor/position",
      "verb": "current",
      "verb_category": "career"
    },
    {
      "assoc_type": "Summary",
      "entity1": "Thomas R. Frieden",
      "entity1_index": "thomas r. frieden/person",
      "entity2": "whether the patient is an american citizen",
      "verb_category": "quotation"
    },
    {
      "assoc_type": "Summary",
      "entity1": "Thomas R. Frieden",
      "entity1_index": "thomas r. frieden/person",
      "verb": "announced",
      "verb_category": "person communication"
    }, |
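When comparing test runs after changing extractors, it can help to summarize the returned associations rather than read them one by one. The sketch below is an illustration only: it uses an abbreviated copy of three associations from the test output, wrapped into a complete JSON document, and the function name `summarize_associations` is hypothetical. Some associations omit `entity1`, `entity2`, or `verb`, so the fields are read with `.get`.

```python
import json
from collections import Counter

# Abbreviated sample: three associations copied from the test output,
# closed off so that the text is valid JSON.
SAMPLE = """
{
  "associations": [
    {"assoc_type": "Summary", "entity1": "direct contact",
     "entity2": "Ebola", "verb": "spread"},
    {"assoc_type": "Summary", "entity2": "Thomas R. Frieden",
     "verb": "screen"},
    {"assoc_type": "Fact", "entity1": "Thomas R. Frieden",
     "entity2": "CNN", "verb": "tell"}
  ]
}
"""


def summarize_associations(doc):
    """Count associations per assoc_type and list (entity1, verb, entity2)."""
    associations = doc.get("associations", [])
    counts = Counter(a.get("assoc_type", "unknown") for a in associations)
    # Missing fields come back as None instead of raising KeyError.
    triples = [(a.get("entity1"), a.get("verb"), a.get("entity2"))
               for a in associations]
    return counts, triples


counts, triples = summarize_associations(json.loads(SAMPLE))
print(dict(counts))
for triple in triples:
    print(triple)
```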
Publishing the Source
- Ensure that you have saved the source since your last modifications
- Click on Publish Source. The source is published and progress is available from Source Monitor.
Info |
---|
This page does not cover Infinit.e Visualizations using the Infinit.e Visualization widgets. For more information, see sections.... |