Overview
todo
Entity Types
The platform supports the following entity types.
The "Entities" view shows the different types of entity extracted either from the Datasift metadata or from the Salience NLP. Most entities will be one of the following types:
Entity Type | Example | Description | Source |
---|---|---|---|
Topic | Business, Sports, Politics, Health, War, Law, Crime, Automotive, Investing, Weather, "Software and Internet", Economics, Food, Science, Aviation, Education, "Video Games", Technology, Labor, Art, Travel, etc. | A high-level topic inferred from the content by Natural Language Processing (NLP). | Salience |
Gender | Male, Mostly_Male, Female, Mostly_Female, Unisex | Obtained from the Datasift "gender" augmentation, an estimate of the gender of the document's author. | Datasift |
FacebookUser | "Mark Zuckerberg", IKANOW, "Facebook Birdwatching Group" | For Facebook documents, any of the people/companies/groups with Facebook accounts mentioned in a post (including the author). | Datasift |
TwitterUser | twitterHandle (i.e. without the leading '@') | For tweets, any of the people/companies/groups with Twitter accounts mentioned in a post (including the author). | Datasift |
RedditUser | witty_handle_here | For Reddit posts, the author's account name. | Datasift |
Person | "John Stewart" | For any other document type (blogs, news, forums) the author is categorized as a Person. Note that the Person type is also used for names extracted from the content using NLP. | Datasift (or Salience) |
Hashtag | iwanttotrend (i.e. without the leading '#') | Hashtags in tweets. | Datasift |
City | "New York, NY, United States" | Locations can be obtained in one of two ways: the registered location of the author (from Datasift), or places mentioned in the content (extracted using NLP). If the place can be geolocated by Infinit.e to a city, then this type is used. | Datasift/Salience |
Region | "Maryland, United States" | Locations can be obtained in one of two ways: the registered location of the author (from Datasift), or places mentioned in the content (extracted using NLP). If the place can be geolocated by Infinit.e to a state or similar administrative partition, then this type is used. | Datasift/Salience |
Place | "White House", "Arizona", "US" | Locations can be obtained in one of two ways: the registered location of the author (from Datasift), or places mentioned in the content (extracted using NLP). If the place cannot be geolocated by Infinit.e then this "catch all" is used. | Datasift/Salience |
URL | http://www.ikanow.com/downloads | Links in Facebook posts and tweets. | Datasift |
Person | "Barack Obama" | A name extracted from the content by Salience and believed to be the name of a person. | Salience (or Datasift) |
Job Title | President, CEO | A job title extracted from the content using Natural Language Processing. | Salience |
Company/Organization | Microsoft, UN | A name extracted from the content by Salience and believed to be the name of a company or organization. | Salience |
Quote | "Ask not for whom the bell tolls" | An unattributed quote extracted from the content. | Salience |
Keyword | "american history", "domestic spying program" | A word or phrase from the content that is statistically significant to the meaning of the post. | Salience |
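
Two conventions in the table above lend themselves to a short illustration: TwitterUser and Hashtag values are stored without their leading '@' or '#', and a location becomes a City, Region, or Place depending on how precisely it can be geolocated. The following Python sketch is illustrative only. The function names and the `geo_level` values ("city", "region") are hypothetical and are not part of the platform's API; they simply stand in for whatever result the geolocation step produces.

```python
from typing import Optional


def normalize_social_name(raw: str) -> str:
    """Strip surrounding whitespace and one leading '@' or '#', if present.

    Hypothetical helper mirroring the table's convention that handles
    and hashtags are stored without their leading symbol.
    """
    raw = raw.strip()
    if raw[:1] in ("@", "#"):
        return raw[1:]
    return raw


def location_entity_type(geo_level: Optional[str]) -> str:
    """Pick the location entity type from a geolocation result.

    `geo_level` is an assumed value: "city" if the place resolved to a
    city, "region" if it resolved to a state or similar administrative
    partition, and anything else (including None) if it could not be
    geolocated.
    """
    if geo_level == "city":
        return "City"
    if geo_level == "region":
        return "Region"
    # The "catch all" type for places that could not be geolocated.
    return "Place"
```

For example, `normalize_social_name("#iwanttotrend")` yields the stored hashtag form, and `location_entity_type(None)` falls back to the Place type.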
Aliasing
TODO