Event Scoring Explanation

This page explains how scoring works in the nSight application. It describes the behavior as of 4/1/2015 and is subject to change as we tweak the algorithm.

Effects of scoring:

The important thing to understand is how a channel's scores affect the application. Scores are used for only one thing: determining which events are displayed in a channel's event feed. They are used when querying IKANOW for events that match keywords that have been scored up. These events are additionally weighted by published date, with events published closer to today scoring higher.

Scoring calculations:

Seeding a channel:

When seeding a channel, every entity in each document you add on the initial screen has its score incremented by 1. E.g. a document that you add may have the entities "Barrack Obama/person", "white house/place", and "cell phone/keyword"; after adding it, your channel's scores would be: [barrack obama/person:1, white house/place:1, cell phone/keyword:1]. When you then add a second document with the entities "white house/place" and "george bush/person", the scores go to: [barrack obama/person:1, white house/place:2, cell phone/keyword:1, george bush/person:1].
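
The seeding example above can be sketched in a few lines of Python. This is only an illustration, not the nSight code: the function name seed_channel is made up, and two details are assumptions (entity names are lowercased, as the example scores suggest, and an entity counts once per document even if it appears several times).

```python
from collections import Counter


def seed_channel(documents):
    """Return entity scores after seeding a channel.

    `documents` is a list of documents, each given as a list of
    entity strings such as "white house/place" (hypothetical format).
    """
    scores = Counter()
    for entities in documents:
        # Assumption: names are lowercased and each entity counts
        # once per document, giving +1 per document it appears in.
        scores.update({entity.lower() for entity in entities})
    return scores


first_doc = ["Barrack Obama/person", "white house/place", "cell phone/keyword"]
second_doc = ["white house/place", "george bush/person"]

scores = seed_channel([first_doc, second_doc])
print(dict(scores))
```

Only "white house/place" appears in both documents, so it is the only entity that ends up with a score of 2.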

Adding a location:

When seeding a channel you can add a location. This location is used when retrieving the events feed: the closer an event is to the location, the higher it scores.

Liking an event:

Liking an event acts just like seeding a document: every entity in the event has its score increased by 1. Note: events are usually a combination of documents, so they can have many entities, many of which may be unrelated to the event title but came from the internal documents nonetheless. E.g. an event might be called "Topic Event: taxes" and contain a collection of documents that talk about taxes, but one document might discuss White House policy, another is an H&R Block advertisement, and the third is a blog complaining about capital gains tax. All 3 documents would have a different subset of entities, these subsets are merged together in the event, and all of those entities are liked when you like the event. To offset the effect of adding a lot of unrelated terms, you can like/dislike more events. This bubbles the keywords you like higher, so their results show up more often.
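
As a rough sketch of the merge described above (not the actual implementation), liking an event can be modelled as taking the union of the entities of its documents and adding 1 to each. The function name like_event and the entity names in the example are made up for illustration; we also assume an entity shared by several documents in the event is only incremented once per like.

```python
def like_event(scores, event_documents):
    """Increase the score of every entity in the event by 1.

    `scores` maps entity -> score and is updated in place.
    `event_documents` is a list of entity lists, one per document
    in the event (hypothetical format).
    """
    merged = set()
    for entities in event_documents:
        merged.update(entities)  # the event carries the union of all entities
    for entity in merged:
        # Assumption: one like adds exactly +1 per entity, even if the
        # entity appears in several documents of the event.
        scores[entity] = scores.get(entity, 0) + 1
    return merged


# Hypothetical "Topic Event: taxes" made of three documents.
taxes_event = [
    ["taxes/keyword", "white house/place"],
    ["taxes/keyword", "h&r block/company"],
    ["taxes/keyword", "capital gains tax/keyword"],
]

scores = {"taxes/keyword": 3}
liked = like_event(scores, taxes_event)
print(scores)
```

Here a single like raises "taxes/keyword" to 4, but it also gives three collateral entities a score of 1, which is the spread the note above warns about.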

Disliking an event:

This is the opposite action of liking an event, but instead of just decreasing a score by 1 it attempts to be a little harsher on disliked terms. When you dislike an event, it takes all entities in the event (see the note above about the spread of entities), decreases each score by 1, then halves the result. E.g. if you dislike an event with the entity "barrack obama/person" and it currently has a score of 21 for your channel, the score is reduced to 20, then halved to 10. This eventually allows an entity to go negative much more rapidly than decreasing by 1 per dislike, while attempting to mitigate the effect of disliking collateral entities in events. E.g. you may have only wanted to dislike barrack obama/person, but the event also had white house/place, which you wanted to focus on; its score is only roughly halved, so liking other documents can bring it back up.
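
The dislike arithmetic can be sketched as follows. This is an illustration under assumptions, not the real code: the function names are made up, the page does not say how odd values are rounded when halved (fractions are kept here), and we assume an entity with no score yet starts from 0.

```python
def dislike_score(score):
    """Apply one dislike: subtract 1, then halve (no rounding assumed)."""
    return (score - 1) / 2


def dislike_event(scores, event_entities):
    """Apply one dislike to every entity in the event, in place."""
    for entity in set(event_entities):
        # Assumption: an entity that has never been scored starts at 0.
        scores[entity] = dislike_score(scores.get(entity, 0))


def dislikes_until_negative(score):
    """Count how many dislikes it takes for a score to drop below 0."""
    count = 0
    while score >= 0:
        score = dislike_score(score)
        count += 1
    return count


scores = {"barrack obama/person": 21, "white house/place": 4}
dislike_event(scores, ["barrack obama/person", "white house/place"])
print(scores)  # barrack obama/person: 21 -> 20 -> 10.0

print(dislikes_until_negative(21))  # 5 under these assumptions
```

Under these assumptions a score of 21 goes negative after 5 dislikes, whereas subtracting 1 per dislike would take 22.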

Viewing current scores:

Currently you can only view which keywords have positive scores and which have negative scores. In the future we hope to show the specific values of the keywords and allow them to be modified/removed/added. To view the entities that have positive/negative scores, click the edit channel button, then the info button. Items in the "liked" column have some positive score. Items in the "disliked" column have some negative score (these are currently unused in the score calculation, as we cannot score down search terms easily).
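
The liked/disliked split can be pictured with a small sketch. The function name split_by_sign and the sample scores are made up, and where an entity with a score of exactly 0 is shown is not stated on this page; here we assume it appears in neither column.

```python
def split_by_sign(scores):
    """Split entities into (liked, disliked) lists by the sign of their score."""
    liked = sorted(entity for entity, score in scores.items() if score > 0)
    disliked = sorted(entity for entity, score in scores.items() if score < 0)
    # Assumption: a score of exactly 0 lands in neither column.
    return liked, disliked


channel_scores = {
    "white house/place": 2,
    "cell phone/keyword": 1,
    "george bush/person": 0,
    "barrack obama/person": -0.5,
}

liked, disliked = split_by_sign(channel_scores)
print("liked:", liked)
print("disliked:", disliked)
```

Only the entities in the liked list would feed into the event query, since negative scores are currently unused.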