...

Info

Updates a Hadoop map reduce job. Returns the output collection id in the data field of the response if the job is successfully queued to run. Changing the timeToRun reschedules the job.

 

Note

Submit the word "null" for any update field you do not want to change; the field keeps its previous value.

 A detailed guide to creating plugins.

A simple web-based utility is available for uploading JARs (and other files) and managing jobs.

Authentication

Required, see Auth - Login.

...

jarURL (required)
A URL to the location of the jar file to run for the job. The jar can be in our Shares table or hosted somewhere else on the web. Permission errors fail silently and the job will not complete.

timeToRun (required)
The earliest time the job should run, as a long (milliseconds since the Unix epoch). To run the job as soon as possible, submit 0. To run the job after January 1, 2015, submit 1420106400000.
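
As a rough illustration of the timeToRun value, the sketch below converts a date to a long, assuming the value is milliseconds since the Unix epoch (the 13-digit example above works out to 10:00 UTC on January 1, 2015 under that assumption). The helper name to_time_to_run is hypothetical and not part of the API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_time_to_run(when: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds.

    Assumption: timeToRun is expressed in milliseconds since the Unix epoch.
    """
    if when.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    # Integer division by a timedelta avoids floating-point rounding.
    return (when - EPOCH) // timedelta(milliseconds=1)

run_after = to_time_to_run(datetime(2015, 1, 1, 10, 0, tzinfo=timezone.utc))
print(run_after)
```

Passing the epoch itself yields 0, which is the value to submit for "run as soon as possible".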

frequencyToRun (required)
How often the job should be run, one of: NONE, HOURLY, DAILY, WEEKLY, MONTHLY. Any value other than NONE causes the job to be resubmitted after it runs; use NONE if you only want the job to run once.

...

query (required)
The mongo or JSON query used to get the job's data. {} is a blank query, or you can submit null. See Custom - Schedule Job or the Hadoop Plugin Guide for more information about this query's format.

inputcollection (required)
The mongo collection you want to use as input. Submit DOC_METADATA to use the document metadata, DOC_CONTENT to use the document contents, or the id or title of a previous map reduce job's results table in one of your communities (you must be a member of that community).

outputKey (required)
The classpath for the map reduce output format key, usually org.apache.hadoop.io.Text.

outputValue (required)
The classpath for the map reduce output format value, usually org.apache.hadoop.io.IntWritable.

json (optional)
Currently you can pass in a JSON object containing any custom arguments you want passed to the map reduce job at runtime. These arguments will be available in the config file in your mapper/reducer.
Format: { "arguments":"any string you want here" }
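
A minimal sketch of how the json body could be attached to the POST request, using only the Python standard library. The helper name build_update_request and the placeholder URL are illustrative assumptions; the request is only constructed here, not sent, and authentication (see Auth - Login) is not handled.

```python
import json
from urllib.request import Request

def build_update_request(url: str, arguments: str) -> Request:
    """Build (but do not send) a POST request carrying the custom arguments."""
    # The body follows the documented format: { "arguments": "<any string>" }
    body = json.dumps({"arguments": arguments}).encode("utf-8")
    return Request(url, data=body, method="POST")

# Placeholder URL for illustration only; substitute a real updatejob URL.
req = build_update_request(
    "http://example.invalid/api/custom/mapreduce/updatejob/JOB_ID",
    "Sterling Archer",
)
print(req.get_method(), req.data.decode("utf-8"))
```

Sending the request would be a call to urllib.request.urlopen(req) from a session that has already logged in.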

Example

http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable - This example uses the same settings as the schedule job example; it changes all of the job's fields to these new settings (or leaves them the same).
http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/null/null/null/null/0/null/null/null/null/null/null/null/null - This example just reschedules a job to run as soon as possible; every field except timeToRun is left unchanged.
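
The example URLs can be assembled programmatically. The following is a sketch only: it assumes the positional fields follow the segment order of the first example URL, percent-encodes each segment, and maps None to the literal word "null". The function name build_update_url is hypothetical.

```python
from urllib.parse import quote

BASE = "http://infinite.ikanow.com/api/custom/mapreduce/updatejob"

def build_update_url(job_id, fields, base=BASE):
    """Join the job id and the positional update fields into a request URL.

    Assumption: `fields` is given in the segment order of the first example
    URL. None becomes "null", meaning "leave this field unchanged".
    """
    segments = [job_id] + ["null" if f is None else str(f) for f in fields]
    # safe="" also escapes "/" so a share URL stays a single path segment.
    return base + "/" + "/".join(quote(s, safe="") for s in segments)

fields = [
    "TestJob", "Testing map reduce", "4e9c77ef17ef3523b657a890",
    "$infinite/share/get/4eafed58233558b98055c872",
    0, "NONE",
    "com.ikanow.infinit.e.core.mapreduce.examplejars.Test$TokenizerMapper",
    "com.ikanow.infinit.e.core.mapreduce.examplejars.Test$IntSumReducer",
    "com.ikanow.infinit.e.core.mapreduce.examplejars.Test$IntSumReducer",
    None, "doc_metadata",
    "org.apache.hadoop.io.Text", "org.apache.hadoop.io.IntWritable",
]
url = build_update_url("4f2007dd8196fe53a52c25a1", fields)
print(url)
```

Encoding each segment separately is what turns the share path into %24infinite%2Fshare%2Fget%2F... and the inner class separator $ into %24, as in the first example.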

cURL - Login, Get Map Reduce Jobs, Get Map Reduce Job Results, Logout
Java Example

Method.Post

Example using curl:

Code Block
curl -XPOST 'http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable' -d '{ "arguments":"Sterling Archer" }'


Example Response
Info
Code Block
{"response":{"action":"Update MapReduce Job","success":true,"message":"Job updated successfully, will run on: Wed Dec 31 19:00:00 EST 1969","time":246},"data":"4f2007dd8196fe53a52c25a1"}
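
A short sketch of reading the response shown above. It assumes that a failed update also carries a "response" object with "success" set to false and a "message"; only the success case is documented here, so treat the error branch as a guess. The function name output_collection_id is hypothetical.

```python
import json

def output_collection_id(raw: str) -> str:
    """Return the id from the "data" field of a successful update response."""
    payload = json.loads(raw)
    status = payload["response"]
    if not status.get("success"):
        # Assumed failure shape: success is false and message explains why.
        raise RuntimeError(status.get("message", "update failed"))
    return payload["data"]

raw = (
    '{"response":{"action":"Update MapReduce Job","success":true,'
    '"message":"Job updated successfully, will run on: Wed Dec 31 19:00:00 EST 1969",'
    '"time":246},"data":"4f2007dd8196fe53a52c25a1"}'
)
print(output_collection_id(raw))
```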

...