Plugin Manager
- Caleb Burch
- andrew johnston
- AlexI
Overview
The Plugin Manager provides a user interface for uploading new and updated MapReduce plugins and saved queries to the system, and for sharing them across the different communities.
If you want to use a MapReduce plugin for many jobs, use the File Uploader to upload the jar once and then use the Plugin Manager to schedule each job that uses the common jar.
A built-in "JavaScript Engine" plugin is provided that allows users to write JavaScript "scriptlets" and distribute them. For more information, see the section Javascript Engine for Plugin Manager.
The Plugin Manager GUI is not currently compatible with IE. It is compatible with Chrome, Firefox, and Safari.
Authorization Requirements:
Only Administrators can upload JARs to be used by the Plugin Manager.
Any user can run jobs based on JARs to which they have access.
Using the Plugin Manager
Logging In
The Plugin Manager shares its cookie with the main GUI, the File Uploader, the Source Builder, and the People Manager - logging into any of them will log into all of them.
To log in
- Provide your credentials when prompted by any of the above mentioned GUIs.
Root URL Access
The Plugin Manager can be accessed from the root URL using the following format:
<ROOT_URL>/manager/pluginManager.jsp
e.g. http://infinite.ikanow.com/manager/pluginManager.jsp
This brings up username and password fields and a login button.
The following URL parameters are supported:
- Use "?sudo" to view jobs you don't own but which are shared with you (or all jobs if you are an admin), e.g. http://infinite.ikanow.com/manager/pluginManager.jsp?sudo
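As a quick sketch, the page URL above can be assembled from any deployment's root URL (the root value below is just the example hostname used on this page):

```python
from urllib.parse import urljoin

# Example root URL taken from this page; substitute your own deployment's root.
root_url = "http://infinite.ikanow.com/"

page = urljoin(root_url, "manager/pluginManager.jsp")
print(page)            # http://infinite.ikanow.com/manager/pluginManager.jsp
print(page + "?sudo")  # same page, showing shared/all jobs as described above
```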
Uploading a MapReduce Plugin and Scheduling Jobs
You can use the Plugin Manager to upload MapReduce plugins (JARs) and to schedule jobs.
To upload and schedule jobs
- Provide a title, description and time/frequency settings as required.
- Fill in the remaining fields, as per your requirements. For more information, see section Plugin Manager Interface.
- Select the file from your hard drive/network drive, and click Submit or QuickRun, as described below.
You can select a single Community or multiple Communities (CTRL+click).
Authorization Requirements:
You can share plugins with any available community. If you upload to only your personal community, only you (and system administrators) will have access to the file.
Submit vs QuickRun
When you create a MapReduce job there are two options:
- Submit (button to the right of "Title")
- QuickRun (button to the right of "Frequency")
Submit:
Submit will save the task (or update it if it already exists). If the frequency is not "Never" and the "Next Scheduled Time" is now or in the past, then the job is immediately scheduled. The page refreshes immediately and the progress can be monitored.
QuickRun:
"QuickRun" will set the frequency to "Once Only" and the time to "ASAP" (as soon as possible) and then will do 2 things:
- Submit as above
- It will wait for the job to complete before refreshing the page (all the "action" buttons are disabled in the meantime). You can't see the progress in the meantime, so this is best used on smaller jobs.
Debugging New and Existing Tasks
You can use the Plugin Manager to debug new and existing tasks. The following behavior applies:
- It will only run on the number of records specified in the text box next to the button.
- It will always run the Hadoop JAR in "local mode" (i.e. it will not be distributed to the Hadoop cluster, if one exists).
To debug a new or existing task
- Fill in the fields of the Plugin Manager, as per your requirements. For more information, see section Plugin Manager Interface.
- Specify the number of records in the text box.
- Click on Save and Debug.
Any log messages output by the Hadoop JAR (or in the javascript if running the prototype engine) are collected and output to the status message.
When running in local mode, "QuickRun" will log error messages. However, in a typical cluster mode no logging is enabled, so debug mode becomes necessary. Another alternative is running and testing in Eclipse.
Stopping a Running Task (kill)
You can use the Plugin Manager to stop a running task.
To stop a running task
- Set the "Frequency" to "Don't Run"
- Click Submit. It may take up to a minute for the status to be returned.
Scheduling a New Saved Query
You can use the Plugin Manager to schedule a saved query instead of a MapReduce plugin.
To schedule a new saved query
- Place a checkmark next to "re-use existing JAR."
- For JAR file, select the "query only" option.
- Leave the Mapper, Combiner, and Reducer classes blank.
- Submit a valid query for the query field.
A good place to start is saving queries from the Community Edition (CE) GUI. These can be pasted into the query or "User arguments" field.
Unlike the normal map/reduce case, the query in this instance must be a CE query, not a MongoDB query.
You can also paste a saved workspace link into the query field (e.g. from the CE GUI) instead of typing out the JSON.
Using the Query Field
A query must use one of the following formats:
- A MongoDB query (use the /wiki/spaces/INF/pages/4358642, or content format). Only indexed fields should be used in the query; this is discussed further in the Hadoop Plugin Guide.
- (From March 2014) A Community Edition query JSON object. Note that this will usually be slower than an indexed MongoDB query, so it should be used with care.
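As a purely illustrative sketch of the two query styles (the field names below are hypothetical, not taken from this page; consult the Hadoop Plugin Guide and CE query documentation for the real schemas), both must be valid JSON before being pasted into the "Query" field:

```python
import json

# Hypothetical MongoDB-style query against an indexed metadata field
# (the field name "title" and the $regex operator usage are illustrative only).
mongo_query = {"title": {"$regex": "water"}}

# Hypothetical Community Edition query JSON object; the "qt"/"ftext"
# structure is an assumption, shown only to illustrate the JSON shape.
ce_query = {"qt": [{"ftext": "water"}]}

# Both styles must round-trip cleanly as JSON before use.
for q in (mongo_query, ce_query):
    s = json.dumps(q)
    assert json.loads(s) == q
    print(s)
```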
Adding Control Fields:
When using the query field, you can specify additional control fields.
To add control fields
- The most common control fields can be added with their default values by pressing the "Add Options" button.
Using JSON Objects:
If you are using JSON objects for queries, you can validate them before running them.
To validate JSON
- Press the "Check" button next to the "Query" field to validate the query JSON.
Using a Saved Workspace Link:
You can paste a saved workspace link into the query field (e.g. from the GUI) instead of typing out the JSON to generate a CE-style query.
To use a saved workspace
- Use the workspace UI to query the CE data set.
- Copy the URL from the browser and paste it into the query field of the Plugin Manager.
You can include JSON below that to add query qualifiers, as described here.
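For example, the query field then holds the pasted workspace URL on the first line with the qualifier JSON below it. In this sketch the URL is a placeholder and the qualifier field names are hypothetical (not taken from this page):

```python
import json

# Placeholder for whatever you copied from the browser address bar --
# not a real saved-workspace link.
workspace_url = "<PASTE_SAVED_WORKSPACE_URL_HERE>"

# Hypothetical qualifier object; the available qualifier fields are
# documented elsewhere, and "input"/"srcTags" are illustrative only.
qualifiers = {"input": {"srcTags": "news"}}

# Compose the query field contents: URL first, qualifier JSON below it.
query_field = workspace_url + "\n" + json.dumps(qualifiers)
print(query_field)
```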
Editing Existing Plugins/Saved Queries
You can use the Plugin Manager to edit existing plugins and saved queries.
Authorization Requirements:
After log-in, all plugins you own can be seen from the top drop-down menu (initially called "Upload New File").
You can also see all plugins in your community.
If you are an administrator, all plugins in the system can be seen. You can only edit plugins if you are an admin or a community moderator/owner, though you can see the results of any plugin in the drop-down.
Apart from HadoopJavascriptTemplate, by default only shares you own are visible, even as admin/moderator (this will be fixed in a future release).
To see all plugins you can access via the API, add "?sudo" to the URL (e.g. "http://localhost:8080/manager/pluginManager.jsp?sudo").
To edit an existing plugin/query
- Select the file from the top drop-down menu (initially called "Upload New File").
- Make the required changes to the fields.
- Click on Submit.
Copying an Existing Custom Task
You can use the Plugin Manager to copy an existing task.
To copy the task
- Select the task to be copied from the top drop down menu.
- Click Copy Current Plugin. You must then change the title.
Deleting files
You can use the Plugin Manager to delete files.
To delete the file
- Select the file to be deleted from the top drop down menu.
- Click on Delete.
Running an Existing Job
To run the job
- Set the "Frequency" parameter to "Once Only." The job will be scheduled as soon as possible and run only once. If you want to run the job on a recurring schedule, set the frequency option to one of the other settings.
Following a Job's Progress
Once a job has been scheduled you will be able to track its progress.
To track job progress
- Click the Refresh button next to "Run status." The current map and reduce completion status as well as any errors that may have occurred when running are displayed in an informational header.