ExtraUrls

In the following feed example, the extraUrls parameter is applied to the feed. ExtraUrls is a complex type that enables URLs to be specified manually, overriding settings that would otherwise be provided by the RSS feed. Additionally, in this example, text extraction is performed using textEngine and featureEngine.

 

Code Block
{
    "description": "Article on Medical Issues",
    "harvestBadSource": false,
    "isApproved": true,
    "isPublic": true,
    "key": "http.www.mayoclinic.com.rss.blog.xml",
    "mediaType": "News",
    "modified": "Oct 19, 2010 11:31:59 AM",
    "tags": [
        "topic:healthcare",
        "industry:healthcare",
        "mayo clinic",
        "health"
    ],
    "title": "MayoClinic: General Topics",
    "processingPipeline": [
        {
            "feed": {
                "extraUrls": [
                    {
                        "url": "http://www.mayoclinic.com/rss/blog.xml"
                    }
                ]
            }
        },
    

 

...

        {
            "textEngine": {
                "engineName": "AlchemyAPI"
            }
        },
        {
            "featureEngine": {
                "engineName": "OpenCalais"
            }
        }
    ]
}


 

Refreshing URLs

In this example, the updateCycle_secs parameter is also used to specify the refresh rate of the harvested URLs.
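
As a rough sketch of how the refresh rate might be declared, the fragment below places updateCycle_secs alongside extraUrls inside the feed element. Both the placement of the parameter and the value 86400 (one day) are assumptions made for illustration and are not taken from the example on this page; the URL is reused from the ExtraUrls example above.

Code Block
{
    "feed": {
        "extraUrls": [
            {
                "url": "http://www.mayoclinic.com/rss/blog.xml"
            }
        ],
        "updateCycle_secs": 86400
    }
}

With a value like this, the harvested URLs would presumably be refreshed no more often than once every 86400 seconds.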

...