Custom - Map Reduce - Update Job

/custom/mapreduce/updatejob/{jobidortitle}/{jobtitle}/{jobdesc}/{communityIds}/{jarURL}/{timeToRun}/{frequencyToRun}/{mapperClass}/{reducerClass}/{combinerClass}/{query}/{inputcollection}/{outputKey}/{outputValue}

Updates a Hadoop map reduce job. If the update is successfully queued to run, the output collection id is returned in the data field of the response. Changing timeToRun reschedules the job.

Submit the word "null" for any update field you do not want to change; that field keeps its previous value.
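The "null" placeholder convention can be sketched as a small URL builder. This is a minimal sketch, not part of the API: the names build_update_url and FIELDS are hypothetical, the base URL is the one used in the examples on this page, and the field order is taken from the URL template above. Each value is percent-encoded so that characters such as "$", "/" and spaces stay inside a single path segment.

```python
from urllib.parse import quote

# Path fields that follow the job id/title, in the order of the URL template.
FIELDS = (
    "jobtitle", "jobdesc", "communityIds", "jarURL", "timeToRun",
    "frequencyToRun", "mapperClass", "reducerClass", "combinerClass",
    "query", "inputcollection", "outputKey", "outputValue",
)


def build_update_url(job_id_or_title, base="http://infinite.ikanow.com/api", **changes):
    """Build an updatejob URL, sending "null" for every field left unchanged."""
    unknown = set(changes) - set(FIELDS)
    if unknown:
        raise ValueError("unknown update fields: " + ", ".join(sorted(unknown)))

    segments = [str(job_id_or_title)]
    for name in FIELDS:
        value = changes.get(name)
        # A missing (or None) value means "do not change this field".
        segments.append("null" if value is None else str(value))

    # safe="" makes quote() encode "/" and "$" as well as spaces.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return base + "/custom/mapreduce/updatejob/" + path
```

For example, build_update_url("4f2007dd8196fe53a52c25a1", timeToRun=0) produces a URL with 0 in the timeToRun position and "null" in every other update field.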

A detailed guide to creating plugins.

A simple web-based utility is available for uploading JARs and managing jobs.

Authentication

Required, see Auth - Login.

Arguments

jobidortitle (required)
The id or the title of the job you want to update

jobtitle (required)
A descriptive name of the job being submitted.

jobdesc (required)
A description of what the job being submitted is attempting to do.

communityIds (required)
Community ID, or IDs (comma-separated), that the map reduce job wants to run on. These will be appended to the mongo query.

jarURL (required)
A URL to the location of the jar file to run for the job. This can be in our Shares table or hosted somewhere else on the web. Permission errors fail silently, and the job will not complete.

timeToRun (required)
The time after which the job should run, as a long (milliseconds since the Unix epoch). For example, submit 0 to run the job as soon as possible. To run the job after January 1, 2015, submit 1420106400000.
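Computing a timeToRun value can be sketched as follows. This assumes the value is milliseconds since the Unix epoch, which is what the 13-digit example value suggests; the helper name to_time_to_run is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_time_to_run(when):
    """Return whole milliseconds since the Unix epoch for an aware datetime."""
    if when.tzinfo is None:
        # Naive datetimes are ambiguous: they could be local time or UTC.
        raise ValueError("pass a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (when - EPOCH) // timedelta(milliseconds=1)


# The example value 1420106400000 is 10:00 UTC on January 1, 2015.
print(to_time_to_run(datetime(2015, 1, 1, 10, 0, tzinfo=timezone.utc)))
```

Under the same assumption, a value of 0 is the epoch itself, so the job is already due and runs as soon as possible.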

frequencyToRun (required)
How often the job should be run: NONE, HOURLY, DAILY, WEEKLY, or MONTHLY. The job is resubmitted at this frequency after it runs; use NONE if you want the job to run only once.

mapperClass (required)
The Java classpath to the job's mapper, in the form package.file$class.

reducerClass (required)
The Java classpath to the job's reducer, in the form package.file$class.

combinerClass (required)
The Java classpath to the job's combiner, in the form package.file$class. If you have not written a combiner, use the reducer or submit null.
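The package.file$class form can be sketched as below. It appears to match the way Java names a nested class (Outer$Inner), and the example URLs on this page encode the "$" as %24. Both helper names are hypothetical.

```python
from urllib.parse import quote


def nested_class_name(package, outer_class, inner_class):
    """Join the parts into the package.file$class form."""
    return package + "." + outer_class + "$" + inner_class


def as_path_segment(value):
    """Percent-encode a value for use as one URL path segment ("$" -> %24)."""
    return quote(value, safe="")


reducer = nested_class_name(
    "com.ikanow.infinit.e.core.mapreduce.examplejars", "Test", "IntSumReducer"
)
print(reducer)
print(as_path_segment(reducer))
```

The same encoded value can be submitted for both reducerClass and combinerClass when no separate combiner exists.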

query (required)
The mongo or JSON query used to select the job's data. {} is a blank query, or you can submit null. See Custom - Schedule Job or the Hadoop Plugin Guide for more information about the query format.

inputcollection (required)
The mongo collection you want to use as input. Submit DOC_METADATA to use the document metadata, DOC_CONTENT to use the document contents, or the id or title of a previous map reduce job to use its results table (you must be a member of that job's community).

outputKey (required)
The classpath for the map reduce output format key, usually org.apache.hadoop.io.Text.

outputValue (required)
The classpath for the map reduce output format value, usually org.apache.hadoop.io.IntWritable.

json (optional)
Currently, you can pass in a JSON object containing any custom arguments you want passed to the map reduce job at runtime. These arguments will be available in the config file in your mapper/reducer.
Format: { "arguments":"any string you want here" }
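Attaching the optional JSON body can be sketched with the Python standard library. This only builds the request object and does not send it; sending also needs an authenticated session (see Auth - Login), which is not shown. The helper name make_update_request is hypothetical, and no Content-Type header is set explicitly because this page does not say which one the API expects.

```python
import json
import urllib.request


def make_update_request(url, arguments):
    """Build (but do not send) a POST request carrying the custom arguments."""
    body = json.dumps({"arguments": arguments}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


request = make_update_request(
    "http://infinite.ikanow.com/api/custom/mapreduce/updatejob/"
    "4f2007dd8196fe53a52c25a1/null/null/null/null/0/"
    "null/null/null/null/null/null/null/null",
    "any string you want here",
)
print(request.get_method(), request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would submit it.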

Example

http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable - this example uses the same values as the Custom - Schedule Job example; it sets all of the job's fields to these settings, whether or not they differ from the current ones
http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/null/null/null/null/0/null/null/null/null/null/null/null/null - this example just reschedules the job to run as soon as possible; every other field is left unchanged

cURL - Login, Get Map Reduce Jobs, Get Map Reduce Job Results, Logout
Java Example

Method: POST

Example using curl:

curl -XPOST 'http://infinite.ikanow.com/api/custom/mapreduce/updatejob/4f2007dd8196fe53a52c25a1/TestJob/Testing%20map%20reduce/4e9c77ef17ef3523b657a890/%24infinite%2Fshare%2Fget%2F4eafed58233558b98055c872/0/NONE/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24TokenizerMapper/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/com.ikanow.infinit.e.core.mapreduce.examplejars.Test%24IntSumReducer/null/doc_metadata/org.apache.hadoop.io.Text/org.apache.hadoop.io.IntWritable' -d '{ "arguments":"Sterling Archer" }'


Example Response
{"response":{"action":"Update MapReduce Job","success":true,"message":"Job updated successfully, will run on: Wed Dec 31 19:00:00 EST 1969","time":246},"data":"4f2007dd8196fe53a52c25a1"}
Error Response
{"response":{"action":"Update MapReduce Job","success":false,"message":"You are not allowed to use the given input collection.","time":142}} 

Other Error Messages:

  • Must be owner/admin of job you are trying to update: You are not an admin or submitter of this job
  • Bad job title/id: No jobs with this ID exist
  • Bad parameters passed in for update or other general error: error scheduling job
  • Bad parameter for frequencyToRun: No enum matching scheduled frequency, try NONE, DAILY, WEEKLY, MONTHLY 
  • Not a unique jobname: A job already matches that title, please choose another title
  • Not allowed access to an input collection: You are not allowed to use the given input collection.
  • Job is currently running: Job is currently running (or not yet marked as completed).  Please wait until the job completes to update it.
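
Handling the response can be sketched as follows, using only the response shape shown in the examples above. The names UpdateJobError and output_collection_id are hypothetical; the sketch assumes every failure carries success set to false and a message, as in the error response shown.

```python
import json


class UpdateJobError(Exception):
    """Raised when the API reports that the update did not succeed."""


def output_collection_id(response_text):
    """Return the "data" field on success, or raise with the API's message."""
    payload = json.loads(response_text)
    response = payload.get("response", {})
    if not response.get("success"):
        raise UpdateJobError(response.get("message", "unknown error"))
    return payload.get("data")


ok = (
    '{"response":{"action":"Update MapReduce Job","success":true,'
    '"message":"Job updated successfully","time":246},'
    '"data":"4f2007dd8196fe53a52c25a1"}'
)
print(output_collection_id(ok))
```

Raising an exception keeps the error message from the list above attached to the failure instead of silently returning an empty id.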