
Examples

This section describes the configuration details for the supported extractors, and provides examples where applicable.

Alchemy API

Parameter | Description
postproc | Possible values: "1", "2", "3". Default value is "3". "1" post-processes geographic entities (AlchemyAPI tends to prefer US results even when the context clearly indicates a non-US location). "2" post-processes person entities (AlchemyAPI tends to prefer famous people even when the context does not support that). "3" does both.
sentiment | Possible values: true or false. Default value is true. If enabled, a sentiment metric is attached to each extracted entity. Note that this uses an extra AlchemyAPI credit per document.
concepts | Possible values: true or false. Default value is false. If enabled, a metadata field called "concepts" is added to the document, containing Wiki titles that are related to the contents of the document. Note that this uses an extra AlchemyAPI credit per document.
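As a rough illustration of how the parameters above might be passed through engineConfig, the Python sketch below builds a pipeline element and prints it as JSON. The key prefix "app.alchemyapi." is an assumption modeled on the "app.alchemyapi-metadata." keys used elsewhere on this page, and the helper name alchemy_engine_config is hypothetical. Check the key names against your installation before relying on them.

```python
import json

# Assumption: key prefix modeled on the "app.alchemyapi-metadata.*" keys
# shown for the metadata engine; it is not confirmed by this page.
KEY_PREFIX = "app.alchemyapi."


def alchemy_engine_config(postproc="3", sentiment=True, concepts=False):
    """Hypothetical helper: build an engineConfig dict from the table's parameters."""
    if postproc not in {"1", "2", "3"}:
        raise ValueError('postproc must be "1", "2" or "3"')
    return {
        KEY_PREFIX + "postproc": postproc,
        # Booleans are written as lowercase strings, like "strict": "true".
        KEY_PREFIX + "sentiment": str(bool(sentiment)).lower(),
        KEY_PREFIX + "concepts": str(bool(concepts)).lower(),
    }


# Assumption: the element is a featureEngine entry in processingPipeline.
element = {
    "featureEngine": {
        "engineName": "AlchemyAPI",
        "engineConfig": alchemy_engine_config(postproc="1", sentiment=False),
    }
}
print(json.dumps(element, indent=4))
```

Calling alchemy_engine_config() with no arguments reproduces the defaults listed in the table, so the resulting engineConfig should behave the same as omitting it.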

  

Examples

The example below shows a sample source configuration that uses the Alchemy API to parse data from an RSS feed. The data can then be used to form entities and associations. In the example, OpenCalais is also used as the featureEngine.

 

{
    "description": "Article on Medical Issues",
    "harvestBadSource": false,
    "isApproved": true,
    "isPublic": true,
    "key": "http.www.mayoclinic.com.rss.blog.xml",
    "mediaType": "News",
    "modified": "Oct 19, 2010 11:31:59 AM",
    "tags": [
        "topic:healthcare",
        "industry:healthcare",
        "mayo clinic",
        "health"
    ],
    "title": "MayoClinic: General Topics",
    "processingPipeline": [
        {
            "feed": {
                "extraUrls": [
                    {
                        "url": "http://www.mayoclinic.com/rss/blog.xml"
                    }
                ]
            }
        },
        {
            "textEngine": {
                "engineName": "AlchemyAPI"
            }
        },
        {
            "featureEngine": {
                "engineName": "OpenCalais"
            }
        }
    ]
}

 

The Alchemy API then returns an array of entities based on its default configuration, since engineConfig was not used to specify any custom configuration parameters. For example (output truncated):

{
    "_id" : "4e1c8afa7d56bb818ed10f76",
    "created" : "1310493434159",
    "description" : "Clarify the role of carbohydrates in the Dr. Bernstein diet and find a healthy eating plan that works for you.",
    "entities" : [
    {
        "actual_name" : "certified diabetes",
        "dimension" : "What",
        "disambiguous_name" : "certified diabetes",
        "doccount" : NumberLong(38),
        "frequency" : 3,
        "gazateer_index" : "certified diabetes/medicalcondition",
        "relevance" : "0.711",
        "totalfrequency" : NumberLong(114),
        "type" : "MedicalCondition"
    },
    {
        "actual_name" : "Diabetes Unit",
        "dimension" : "Who",
        "disambiguous_name" : "Diabetes Unit",
        "doccount" : NumberLong(38),
        "frequency" : 1,
        "gazateer_index" : "diabetes unit/organization",
        "relevance" : "0.235",
        "totalfrequency" : NumberLong(38),
        "type" : "Organization"
    },
    {
        "actual_name" : "Mayo Clinic",
        "dimension" : "What",
        "disambiguous_name" : "Mayo Clinic",
        "doccount" : NumberLong(514),
        "frequency" : 2,
        "gazateer_index" : "mayo clinic/facility",
        "relevance" : "0.305",
        "totalfrequency" : NumberLong(1033),
        "type" : "Facility"
    },
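Once a document like the one above has been retrieved, the entities array can be post-processed by client code. The Python sketch below is illustrative only: it assumes the document has already been loaded into a dict, copies a subset of fields from the sample output, and uses a hypothetical helper name (entities_above). Note that "relevance" is stored as a string in the sample, so it is converted to a float before comparison.

```python
def entities_above(doc, min_relevance):
    """Return the disambiguous_name of each entity whose relevance meets the threshold."""
    return [
        entity["disambiguous_name"]
        for entity in doc.get("entities", [])
        # "relevance" is a string such as "0.711" in the sample output.
        if float(entity["relevance"]) >= min_relevance
    ]


# Subset of the sample document shown above.
doc = {
    "entities": [
        {"disambiguous_name": "certified diabetes", "type": "MedicalCondition", "relevance": "0.711"},
        {"disambiguous_name": "Diabetes Unit", "type": "Organization", "relevance": "0.235"},
        {"disambiguous_name": "Mayo Clinic", "type": "Facility", "relevance": "0.305"},
    ]
}

print(entities_above(doc, 0.3))  # ['certified diabetes', 'Mayo Clinic']
```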

 

Alchemy API metadata

The Alchemy API can also perform feature extraction, which is configured through the following metadata parameters.

Parameter | Description | Data Type
sentiment | Possible values: true or false. Default value is false. If enabled, a sentiment metric is attached to each extracted entity. Note that this uses an extra AlchemyAPI credit per document. |
concepts | Possible values: true or false. Default value is true. If enabled, a metadata field called "concepts" is added to the document, containing Wiki titles that are related to the contents of the document. Note that this uses an extra AlchemyAPI credit per document. |
batchSize | A string containing an integer; turned off by default. If turned on, the AlchemyAPI call goes out on a batch of documents (the specified number). This makes processing of small documents like tweets more economical, in return for a reduction in accuracy (e.g. the sentiment is calculated over the batch, not for each individual tweet). | string, integer
numKeywords | A string containing an integer; uses the AlchemyAPI default (currently 50) if not specified. If specified, controls the number of keywords returned. If batching is enabled, the requested number is multiplied by the batch size. | string, integer
strict | Possible values: true or false. Default value is false. If enabled, fewer, higher-quality keywords are extracted. |

 

 

Example

You can use the engineConfig object to pass configuration parameters along to the feature engine.

In this example, the Alchemy API is configured to act on a batch of documents (100) and to return a maximum of 5 keywords per document. The strict setting returns fewer keywords overall, but of higher quality.

        },
        {
            "featureEngine": {
                "engineName": "AlchemyAPI-metadata",
                "engineConfig": {
                    "app.alchemyapi-metadata.batchSize": 100,
                    "app.alchemyapi-metadata.numKeywords": 5,
                    "app.alchemyapi-metadata.strict": "true"
                }
            }
        },
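The parameter table states that, when batching is enabled, the requested number of keywords is multiplied by the batch size. The Python sketch below works through that arithmetic for the values in this example. It is a minimal sketch: the helper name is hypothetical, and it assumes the multiplication applies to each AlchemyAPI call and that the default of 50 keywords (described as the current default) is still in effect.

```python
ALCHEMY_DEFAULT_KEYWORDS = 50  # "currently 50" according to the parameter table


def keywords_requested_per_call(num_keywords=None, batch_size=None):
    """Hypothetical helper: keywords requested in one AlchemyAPI call."""
    # Both settings may arrive as strings containing integers.
    per_document = ALCHEMY_DEFAULT_KEYWORDS if num_keywords is None else int(num_keywords)
    documents = 1 if batch_size is None else int(batch_size)  # batching is off by default
    if per_document < 1 or documents < 1:
        raise ValueError("numKeywords and batchSize must be positive integers")
    return per_document * documents


print(keywords_requested_per_call(num_keywords="5", batch_size="100"))  # 500
print(keywords_requested_per_call())  # 50
```

With the configuration above, each batched call would therefore request up to 500 keywords for the 100 documents it covers.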