ZIP File

 A common Source Editor task is to process an RSS feed and configure the text and feature extractors for desired results.

...

13. Under the Source Templates dropdown on the left side, select "Infinit.e ZIP Archives/JSON Share Example" and click ‘Select’. TODO: verify that this is the correct template to select.

 

14. Under the ‘New Source’ template on the right side:

  • Enter a Title (e.g. PDF ZIP) and Description (e.g. PDF Zip file)

  • Enter desired Tags (separated by spaces, no commas)

  • Select a Community for your source (e.g. General News)

  • Paste the ‘Share ID’ from Step 9

  • Select ‘Save Source’ 
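
The form fields above can be sanity-checked before pasting them in. The following is a minimal sketch, not part of Infinit.e: the helper names are hypothetical, and the 24-character hexadecimal Share ID format is an assumption based only on the sample output shown on this page.

```python
import re

# Assumption: share IDs are 24-character hexadecimal strings, as in the
# sample output on this page (e.g. "543d6948e4b0d272bbe48c9c").
SHARE_ID_PATTERN = re.compile(r"[0-9a-fA-F]{24}")


def normalize_tags(raw_tags: str) -> str:
    """Return tags separated by single spaces, treating commas as separators."""
    return " ".join(raw_tags.replace(",", " ").split())


def looks_like_share_id(value: str) -> bool:
    """Return True if the value matches the assumed 24-hex-character format."""
    return SHARE_ID_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(normalize_tags("iran, nuclear,  report"))
    print(looks_like_share_id("543d6948e4b0d272bbe48c9c"))
```

A value that fails the check is not necessarily wrong, since the format is inferred from a single example; treat a failure as a prompt to re-copy the Share ID from Step 9.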



Testing the Source

15. Change the ‘Number of Documents’ (to the right of Test Parameters) from 10 to 2 and select ‘Test Source’.


16. Once you have provided the correct URL and saved the source, you can test it to verify that documents are returned. In this template, the default feature engine is set to return both entities and associations.

To test the source

  1. Click on Test Source.  The platform will perform data processing and should then return the documents.  
  2. A Source Test Output window opens, displaying either a success or an error message:

...


Info

If you have pop-up blocking on, you will need to allow pop-ups in order to receive the source test output.

...


Publishing the Source

17. If the test is successful, select ‘Publish Source’ and select ‘OK’ when the success message appears.

...

 

The code example below shows a representative set of documents returned.

{
    "communityId": ["4c927585d591d31d7b37097a"],
    "created": "Oct 14, 2014 10:11:16 PM UTC",
    "description": "",
    "mediaType": ["Report"],
    "metadata": {"_FILE_METADATA_": [{"metadata": {"Content-Type": ["application/octet-stream"]}}]},
    "modified": "Oct 14, 2014 06:19:54 PM UTC",
    "publishedDate": "Oct 14, 2014 06:19:54 PM UTC",
    "source": ["Iran Report 2"],
    "sourceKey": ["inf...share.543d6948e4b0d272bbe48c9c.miscDescription."],
    "tags": ["iran"],
    "title": "__MACOSX/._USIP_Template_5March2012-1.pdf",
    "url": "inf://share/543d6948e4b0d272bbe48c9c/miscDescription/__MACOSX/._USIP_Template_5March2012-1.pdf"
}
{
    "communityId": ["4c927585d591d31d7b37097a"],
    "created": "Oct 14, 2014 10:11:16 PM UTC",
    "description": "",
    "mediaType": ["Report"],
    "metadata": {"_FILE_METADATA_": [{"metadata": {"Content-Type": ["application/octet-stream"]}}]},
    "modified": "Oct 14, 2014 06:19:54 PM UTC",
    "publishedDate": "Oct 14, 2014 06:19:54 PM UTC",
    "source": ["Iran Report 2"],
    "sourceKey": ["inf...share.543d6948e4b0d272bbe48c9c.miscDescription."],
    "tags": ["iran"],
    "title": "__MACOSX/",
    "url": "inf://share/543d6948e4b0d272bbe48c9c/miscDescription/__MACOSX/"
}
{
    "associations": [
        {
            "assoc_type": "Summary",
            "entity1": "U.N. Security Council",
            "entity1_index": "u.n. security council/organization",
            "verb": "sanction",
            "verb_category": "generic relations"
        },
        {
            "assoc_type": "Summary",
            "entity1": "it",
            "entity2": "nuclear device",
            "entity2_index": "nuclear device/industryterm",
            "verb": "build",
            "verb_category": "generic relations"
        },
        {
            "assoc_type": "Summary",
            "entity2": "Qom",
            "entity2_index": "qom,qom province,iran/city",
            "geotag": {
                "lat": 34.6461111111,
                "lon": 50.8788888889
            },
            "verb": "ice storm",
            "verb_category": "natural disaster"
        },
        {
            "assoc_type": "Summary",
            "entity2": "Legal",
            "entity2_index": "legal/product",
            "verb": "known",
            "verb_category": "product recall"
        },
        {
            "assoc_type": "Summary",
            "entity1": "International Atomic Energy Agency",
            "entity1_index": "international atomic energy agency/organization",
            "verb": "report",
            "verb_category": "generic relations"
        },
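
The first two documents above have titles under `__MACOSX/`. macOS adds these entries when it creates an archive, and they are not report content. The following is a minimal sketch for summarizing test output that has been saved and parsed into Python dictionaries. The function names are hypothetical, and it assumes only the `title` and `associations` fields visible in the sample.

```python
from typing import Any, Dict, List


def is_macos_metadata(doc: Dict[str, Any]) -> bool:
    """Return True if the document title points into a __MACOSX folder."""
    return doc.get("title", "").startswith("__MACOSX/")


def summarize(docs: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count archive-metadata documents and documents carrying associations."""
    return {
        "total": len(docs),
        "macos_metadata": sum(1 for d in docs if is_macos_metadata(d)),
        "with_associations": sum(1 for d in docs if d.get("associations")),
    }


if __name__ == "__main__":
    # Trimmed-down stand-ins for the documents shown on this page.
    sample = [
        {"title": "__MACOSX/._USIP_Template_5March2012-1.pdf"},
        {"title": "__MACOSX/"},
        {"associations": [{"assoc_type": "Summary", "verb": "report"}]},
    ]
    print(summarize(sample))
```

If most returned documents are `__MACOSX/` entries, one option is to rebuild the ZIP without them before uploading it again.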

 

Editing the Source

You will likely want to edit the source to tailor the text extraction and feature extraction settings.

For more information concerning text extraction and feature extraction, see section Toolkit.

Info

It is assumed that you have obtained an OpenCalais or AlchemyAPI key and configured the Infinit.e properties file. If not, do that first.

 

Using Source Builder to Edit the Source

Source Builder provides an intuitive user interface to perform editing of sources.  You can use Source Builder to change the Text Extraction and Feature Extraction settings.

To edit the extraction settings

  1. From the Source Editor, click on SRC UI.  The Source Builder is displayed.
  2. Use "Source View" and "Form View" to change the enginename using the dropdown, as indicated in the screenshot below.

In this example, Automated Text Extraction has been set to alchemyapi, and Automated Entities has been set to opencalais.



TODO: screenshots from source builder

 

Publishing the Source

Once you are satisfied with the results, you can publish the source.

 

To publish the source

 

  1. Ensure that you have saved the source since your last modifications.
  2. Click on Publish Source.  The source is published and progress is available from Source Monitor.

Info

If a second test results in an error, double check all fields and test the ZIP File URL in a separate window to ensure it is accurate.

18a.  If a second test results in an error, copy the error message and send it to your Ikanow POC.

18b.  If successful, select ‘Publish Source’ to begin harvesting.
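
When double-checking a source after a failed test, it can also help to inspect the archive itself. The following is a minimal sketch, assuming you still have a local copy of the ZIP you uploaded. It uses Python's standard `zipfile` module and is not an Infinit.e tool; the function name `inspect_zip` is hypothetical.

```python
import zipfile
from typing import List, Tuple


def inspect_zip(path_or_file) -> Tuple[List[str], List[str]]:
    """Return (content entries, __MACOSX metadata entries) for a ZIP archive."""
    with zipfile.ZipFile(path_or_file) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad = archive.testzip()
        if bad is not None:
            raise ValueError(f"Corrupt entry in archive: {bad}")
        names = archive.namelist()
    metadata = [n for n in names if n.startswith("__MACOSX/")]
    content = [n for n in names if not n.startswith("__MACOSX/")]
    return content, metadata
```

Calling `inspect_zip("reports.zip")` (a hypothetical file name) returns the content entries and the macOS metadata entries as two separate lists, and raises an error if the file is not a readable ZIP.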

Info

This page does not cover Infinit.e Visualizations using the Infinit.e Visualization widgets.  For more information, see sections....