Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • The "query" is either a MongoDB or Infinit.e query that selects the desired documents (full query specification: here) ... often this will involve just "{}" and a list of communities, or just the "$srctags" field
  • The "user arguments" field is a JSON object with the following format:

...

  • "runProcessingLoop": This occurs every cycle (eg 500) docs, and reports some harvest statistics and error messages against the Hadoop taskID, object with a single string field "message"
  • "completeMapper": the aggregated statistics for a single Hadoop mapper (identified by taskID), object with a single string field "message"
  • "WARNING": warnings based on likely misconfigurations, object with a single string field "error"