/custom/mapreduce/updatejob/{jobidortitle}/{jobtitle}/{jobdesc}/{communityIds}/{jarURL}/{timeToRun}/{frequencyToRun}/{mapperClass}/{reducerClass}/{combinerClass}/{query}/{inputcollection}/{outputKey}/{outputValue}
Updates a Hadoop MapReduce job. If the job is successfully queued to run, the output collection id is returned in the data field of the response. Changing timeToRun reschedules the job.
Pass the word "null" for any update field you do not want to change; that field keeps its previous value.
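The "null" convention lends itself to a small URL builder. The following is a minimal sketch, not part of the API: the helper name build_update_url and the FIELDS list are hypothetical, and the segment order is taken from the endpoint template at the top of this page. It assumes the server accepts percent-encoded path segments, as the examples on this page suggest.

```python
from urllib.parse import quote

# Path segments after {jobidortitle}, in the order given by the endpoint template.
FIELDS = [
    "jobtitle", "jobdesc", "communityIds", "jarURL", "timeToRun",
    "frequencyToRun", "mapperClass", "reducerClass", "combinerClass",
    "query", "inputcollection", "outputKey", "outputValue",
]

def build_update_url(base, job_id_or_title, **changes):
    """Build an updatejob URL; fields not named in `changes` are sent as "null"."""
    unknown = set(changes) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    segments = [job_id_or_title] + [str(changes.get(f, "null")) for f in FIELDS]
    # safe="" percent-encodes "/", "$" and spaces so each value stays in one segment.
    path = "/".join(quote(s, safe="") for s in segments)
    return f"{base}/custom/mapreduce/updatejob/{path}"

# Reschedule an existing job to run as soon as possible, changing nothing else.
url = build_update_url(
    "http://infinite.ikanow.com/api", "4f2007dd8196fe53a52c25a1", timeToRun=0
)
print(url)
```

Building the path from a fixed field list keeps each value in its documented position, which is easy to get wrong when most segments are "null".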
A simple web-based utility is available for uploading JARs and managing jobs.
Authentication
Required, see Auth - Login.
Arguments
jobidortitle (required)
The id or the title of the job you want to update
jobtitle (required)
A descriptive name of the job being submitted.
jobdesc (required)
A description of what the job being submitted is attempting to do.
communityIds (required)
Community ID, or IDs (comma-separated), that the map reduce job wants to run on. These will be appended to the mongo query.
jarURL (required)
A URL to the location of the jar file to run for the job. This can be in our Shares table or hosted somewhere else on the web. Any permission errors will fail silently and the job will not complete.
timeToRun (required)
The time after which the job should run, as a long value (milliseconds since the Unix epoch). To run the job as soon as possible, submit 0. To run it after January 1, 2015, submit 1420106400000.
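A timeToRun value can be derived from a calendar date. This is a minimal sketch that assumes the long value is milliseconds since the Unix epoch, which is consistent with the example value above; the helper name to_time_to_run is hypothetical.

```python
from datetime import datetime, timezone

def to_time_to_run(when):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time ambiguity")
    return int(when.timestamp() * 1000)

# 10:00 UTC on January 1, 2015 corresponds to the value quoted above.
print(to_time_to_run(datetime(2015, 1, 1, 10, 0, tzinfo=timezone.utc)))  # 1420106400000
```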
frequencyToRun (required)
How often the job should be run: NONE, DAILY, WEEKLY, or MONTHLY. The job is resubmitted after each run at this frequency; use NONE if you only want the job to run once.
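Because the four accepted values are fixed, a client can check them before calling the API. The sketch below is one possible client-side check; the Frequency enum and parse_frequency helper are hypothetical names, and upper-casing the input is a convenience of this sketch rather than documented server behavior.

```python
from enum import Enum

class Frequency(Enum):
    NONE = "NONE"
    DAILY = "DAILY"
    WEEKLY = "WEEKLY"
    MONTHLY = "MONTHLY"

def parse_frequency(text):
    """Return the Frequency for `text`, or raise ValueError listing valid values."""
    try:
        return Frequency[text.strip().upper()]
    except KeyError:
        valid = ", ".join(f.value for f in Frequency)
        raise ValueError(f"frequencyToRun must be one of: {valid}") from None

print(parse_frequency("weekly").value)  # WEEKLY
```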
mapperClass (required)
The Java classpath to the job's mapper, in the form package.file$class.
reducerClass (required)
The Java classpath to the job's reducer, in the form package.file$class.
combinerClass (required)
The Java classpath to the job's combiner, in the form package.file$class (use the reducer if you have not written a combiner, or submit null).
query (required)
The mongo query used to get the job's data. {} is a blank query, or you can submit null. You can also request post-processing by passing an array of the form [{mongodb query},{postproc}], where postproc is a JSON object of the form:
{ "limit":int, "sortField":"field.field.field", "sortDirection":-1|1, "limitAllData":true|false }
See the Hadoop Plugin Guide for more information.
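The query segment has to be valid JSON and also survive being placed in a URL path. A minimal sketch of building it follows; the helper name build_query_segment and the field names "title" and "publishedDate" are hypothetical, and it assumes the server decodes percent-encoded path segments.

```python
import json
from urllib.parse import quote

def build_query_segment(query=None, postproc=None):
    """Return the URL-encoded value for the {query} path segment."""
    if query is None and postproc is None:
        return "null"  # on update, "null" leaves the stored query unchanged
    query = query or {}
    # With post-processing the value is a two-element array: [query, postproc].
    payload = [query, postproc] if postproc is not None else query
    # Compact separators keep the URL short; safe="" also encodes "/" and "$".
    return quote(json.dumps(payload, separators=(",", ":")), safe="")

segment = build_query_segment(
    {"title": "example"},  # hypothetical field name
    {"limit": 100, "sortField": "publishedDate", "sortDirection": -1, "limitAllData": False},
)
print(segment)
```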
inputcollection (required)
The mongo collection you want to use as input. You can submit DOC_METADATA to get the documents table or grab a previous map reduce results table in your communities by submitting its id.
outputKey (required)
The classpath for the map reduce output format key, usually org.apache.hadoop.io.Text.
outputValue (required)
The classpath for the map reduce output format value, usually org.apache.hadoop.io.IntWritable.
json (optional)
Currently you can pass in a JSON object containing any custom arguments you want passed to the map reduce job at runtime. These arguments will be available in the config file in your mapper/reducer.
format: { "arguments":"any string you want here" }
Example
http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable - this example uses the same values as the schedule job example; it changes a job's fields to all these new settings (or the same settings)
http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/null/null/null/null/0/null/null/null/null/null/null/null/null - this example only reschedules a job to run as soon as possible (0 is in the timeToRun position; every other field is null)
cURL - Login, Get Map Reduce Jobs, Get Map Reduce Job Results, Logout
Java Example
Method.Post
Example using curl:
curl -XPOST 'http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable' -d '{ "arguments":"Sterling Archer" }'
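The curl call above can be approximated in Python with the standard library. This is a sketch only: the helper name make_update_request is hypothetical, it prepares the request without sending it, and it assumes authentication (see Auth - Login) is handled separately.

```python
import json
import urllib.request

def make_update_request(url, arguments=None):
    """Prepare (but do not send) a POST to an updatejob URL."""
    # The optional body carries custom arguments as { "arguments": "..." }.
    body = b"" if arguments is None else json.dumps({"arguments": arguments}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")

req = make_update_request(
    "http://infinite.ikanow.com/api/custom/mapreduce/updatejob/"
    "4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/"
    "%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/"
    "com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/"
    "com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/"
    "com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/"
    "null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable",
    arguments="Sterling Archer",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch has no side effects.
print(req.get_method(), req.data.decode("utf-8"))
```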
Example Response
{"response":{"action":"Update MapReduce Job","success":true,"message":"Job updated successfully, will run on: Wed Dec 31 19:00:00 EST 1969","time":246},"data":"4f2007dd8196fe53a52c25a1"}
Error Response
{"response":{"action":"Update MapReduce Job","success":false,"message":"You are not allowed to use the given input collection.","time":142}}
Other Error Messages:
- Must be owner/admin of job you are trying to update: You are not an admin or submitter of this job
- Bad job title/id: No jobs with this ID exist
- Bad parameters passed in for update or other general error: error scheduling job
- Bad parameter for frequencyToRun: No enum matching scheduled frequency, try NONE, DAILY, WEEKLY, MONTHLY
- Not a unique jobname: A job already matches that title, please choose another title
- Not allowed access to an input collection: You are not allowed to use the given input collection.
- Job is currently running: Job is currently running (or not yet marked as completed). Please wait until the job completes to update it.
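A client can tell the success and error responses apart by checking the success flag. The sketch below parses the two response shapes shown above; the names UpdateJobError and output_collection_id are hypothetical.

```python
import json

class UpdateJobError(Exception):
    """Raised when the API reports success=false."""

def output_collection_id(raw):
    """Return the id in the data field, or raise with the server's message."""
    payload = json.loads(raw)
    response = payload["response"]
    if not response.get("success"):
        raise UpdateJobError(response.get("message", "unknown error"))
    return payload["data"]

ok = ('{"response":{"action":"Update MapReduce Job","success":true,'
      '"message":"Job updated successfully, will run on: Wed Dec 31 19:00:00 EST 1969",'
      '"time":246},"data":"4f2007dd8196fe53a52c25a1"}')
print(output_collection_id(ok))  # 4f2007dd8196fe53a52c25a1
```

Raising on failure surfaces messages such as "No jobs with this ID exist" to the caller instead of returning a missing data field.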