...

For this example, the "Custom Viewer - Map" is the obvious choice. The screenshot below shows the different options from the header.

TODO screenshot of the drop down, quick explanation

TODO returning to the plugin manager, talk about query and user arguments

Image Added

  • The first dropdown menu selects the plug-in from which to take the results.
    • (The selection will fail if the widget cannot detect any fields that look like lat/long. This is discussed below, under "Visualizing the output of plug-ins".)
  • The second menu selects the field that determines the color of the plotted points (from the palette of green/blue/orange/red).
    • (This job has generated two numeric fields: the aggregated sentiment, and the number of records containing sentiment.)
  • The third menu determines how the score field is converted into a color:
    • Linear scale: the lowest score is green, the highest score is red, and the buckets are distributed evenly from min to max.
    • Log scale: the lowest score is green, the highest score is red, and the buckets are distributed logarithmically from min to max.
    • Polarity: red is negative, green is positive, and blue is neutral (less than 10% of the max in either direction).
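The three conversions above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not the widget's actual code: the function names, the order of the two intermediate colors (blue, then orange), and the exact form of the logarithmic bucketing are all assumptions.

```javascript
// Assumed palette order: the text only states that the lowest score is
// green and the highest is red; blue and orange in between is a guess.
const PALETTE = ["green", "blue", "orange", "red"];

// Linear scale: split [min, max] into equal-width buckets.
function linearColor(score, min, max) {
  if (max === min) return PALETTE[0];
  const fraction = (score - min) / (max - min);
  const index = Math.min(PALETTE.length - 1, Math.floor(fraction * PALETTE.length));
  return PALETTE[index];
}

// Log scale: the same bucketing applied to the logarithm of the score.
// Scores are shifted so that the minimum maps to log(1) = 0 (an assumption).
function logColor(score, min, max) {
  if (max === min) return PALETTE[0];
  const fraction = Math.log(1 + score - min) / Math.log(1 + max - min);
  const index = Math.min(PALETTE.length - 1, Math.floor(fraction * PALETTE.length));
  return PALETTE[index];
}

// Polarity: blue if the magnitude is under 10% of the largest magnitude,
// otherwise red for negative scores and green for positive ones.
function polarityColor(score, maxAbs) {
  if (Math.abs(score) < 0.1 * maxAbs) return "blue";
  return score < 0 ? "red" : "green";
}
```

For scores between 0 and 100, a score of 10 lands in the first (green) bucket on the linear scale but in the third (orange) bucket on the log scale, which is why the log scale is useful when a few points have much higher scores than the rest.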

Returning to the plugin manager, there were two larger text fields:

  • "Query" field: together with the "Communities" list, this controls what data is processed.
  • "User arguments" field: in this case, this is the actual code that is run over the data, because this is a Javascript plug-in (see below under "Creating new Javascript plug-ins").
    • (Note that for Hadoop JARs this field provides generic configuration parameters instead; see below under "Creating new Hadoop plug-ins".)
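As a rough illustration of the kind of map/reduce logic such a field might hold, the sketch below aggregates sentiment per place in plain JavaScript and produces the two numeric fields mentioned earlier (aggregated sentiment, and the number of records containing sentiment). It is not the plug-in's actual code or API: the record shape (`place`, `sentiment`) and the function names `mapRecord`, `reduceValues` and `runJob` are hypothetical.

```javascript
// Hypothetical record shape: { place: string, sentiment?: number }.

// Map step: emit one (key, value) pair per record that carries sentiment.
function mapRecord(record) {
  if (typeof record.sentiment !== "number") return [];
  return [[record.place, { sentiment: record.sentiment, count: 1 }]];
}

// Reduce step: sum the sentiment and the record count for one key.
function reduceValues(values) {
  return values.reduce(
    (acc, v) => ({ sentiment: acc.sentiment + v.sentiment, count: acc.count + v.count }),
    { sentiment: 0, count: 0 }
  );
}

// Tiny in-memory driver standing in for the framework: group the mapped
// values by key, then reduce each group.
function runJob(records) {
  const groups = new Map();
  for (const record of records) {
    for (const [key, value] of mapRecord(record)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  const output = {};
  for (const [key, values] of groups) output[key] = reduceValues(values);
  return output;
}
```

Records without a sentiment value are skipped in the map step, so they contribute to neither of the two output fields.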

In this case we can see that the query is:

Code Block
languagejs
{"docGeo":{"$exists":true}}
//^^ (i.e. only process geo-tagged tweets, e.g. from cellphones)

There are a few points to note here:

  • The overall syntax of the query is that of MongoDB
    • There are some additional extensions starting with "$": these are documented here, and can be inserted either manually or by pressing the "Add Options" button next to the query.
  • The document fields that the query is applied against are described here.
    • You can view the JSON format of a given document from the "Document Browser" widget, as shown in the screenshot below.

Image Added
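To make the effect of the `{"docGeo":{"$exists":true}}` query concrete, the following sketch applies the same test to a couple of in-memory documents. It only mimics the `$exists` operator for illustration; the helper name `matchesExists`, the sample documents, and the `lat`/`lon` shape inside `docGeo` are assumptions, and the platform itself evaluates the query, not code like this.

```javascript
// Minimal matcher for queries of the form { field: { "$exists": true|false } }.
// Only this one operator is handled; real MongoDB queries support far more.
function matchesExists(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    return present === condition["$exists"];
  });
}

// Hypothetical documents: only the presence of "docGeo" matters here.
const docs = [
  { title: "tweet from a phone", docGeo: { lat: 40.7, lon: -74.0 } },
  { title: "tweet with no location" },
];

const query = { docGeo: { $exists: true } };
const selected = docs.filter((doc) => matchesExists(doc, query));
console.log(selected.map((doc) => doc.title));
```

Only the first document is selected, because it is the only one that carries a `docGeo` field.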

As an example, say you wanted to query only those records that datasift had tagged with gender "Male". There are two ways of doing this:

Code Block
//Option 1, simplest (see datasift documentation for their metadata format):
//TODO
//Option 2, most generic:
//TODO

The advantage of Option 2 is that if you later import other sources that have a "Gender" entity but are not from datasift (e.g. they use a different metadata format), you will not have to alter your queries.

 

TODO example - changing the query

TODO map/reduce code

Example 2 - Aggregate sentiment by gender

...