This guide offers suggestions for setting up and configuring the Hadoop installation once you have completed the RPM installation instructions found here: Infinit.e-hadoop-installer.

This guide starts at the point where you navigate to http://server:7180 in your browser:

The default login is admin/admin.

Once logged in, skip the prompts to register and you should come to a screen for adding hosts.  Enter a comma-delimited list of the IP addresses or hostnames you want in your cluster and click Find Hosts.  You will be shown a list of the machines you can add to your Hadoop cluster.  Select each node you want and click Continue. (Your window will look slightly different from mine because I have already added these servers.)
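
If the node list is long, it can help to build the comma-delimited string programmatically before pasting it into the host search box. The following is a minimal sketch only; the addresses and hostname are hypothetical placeholders, and build_host_list is an illustrative helper, not part of Cloudera Manager or the Infinit.e installer.

```python
def build_host_list(hosts):
    """Return a comma-delimited string of unique, non-empty host entries."""
    seen = set()
    cleaned = []
    for host in hosts:
        entry = host.strip()  # drop stray whitespace around each entry
        if entry and entry not in seen:
            seen.add(entry)
            cleaned.append(entry)  # keep the first occurrence, preserve order
    return ",".join(cleaned)


# Hypothetical cluster members; replace with your own IP addresses or hostnames.
cluster_hosts = ["10.0.0.11", "10.0.0.12 ", "node3.example.com", "10.0.0.11"]
host_list = build_host_list(cluster_hosts)
print(host_list)
```

The printed string can be pasted directly into the host search field.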

On the next screen, select the basic install, which includes MapReduce, Hue, and HDFS, and follow the prompts. Once everything is installed, you should come to a screen that looks like this:

Ikanow recommends changing a few configuration settings before you use the API server.  These changes are optional (you can skip down to the Generate Client Configuration step if you want to get started with the default settings).

To change the MapReduce settings, click the mapreduce1 service and then click the Configuration button at the top. You should come to a screen like this:

We recommend changing these items:

  1. Number of tasks to run per JVM (mapred.job.reuse.jvm.num.tasks) to -1
  2. Use Compression on Map Outputs (mapred.compress.map.output) to false (uncheck the box)
  3. Maximum Number of Simultaneous Map Tasks to 2
  4. Maximum Number of Simultaneous Reduce Tasks to 1
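
The recommended values above can be checked against a Hadoop configuration file such as mapred-site.xml. The sketch below is illustrative only. The first two property names come from the list above; the last two (mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum) are the standard MapReduce v1 properties and are assumed here to be what the "Maximum Number of Simultaneous Map/Reduce Tasks" labels control. The sample XML is made up for the example, and a client-side mapred-site.xml may not contain every one of these settings.

```python
import xml.etree.ElementTree as ET

# Recommended values from the list above.
RECOMMENDED = {
    "mapred.job.reuse.jvm.num.tasks": "-1",
    "mapred.compress.map.output": "false",
    # Assumption: these MRv1 properties back the two "Maximum Number of
    # Simultaneous ... Tasks" settings shown in the configuration screen.
    "mapred.tasktracker.map.tasks.maximum": "2",
    "mapred.tasktracker.reduce.tasks.maximum": "1",
}


def read_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def find_mismatches(props, recommended=RECOMMENDED):
    """Return {name: (actual_or_None, expected)} for settings that differ."""
    return {
        name: (props.get(name), expected)
        for name, expected in recommended.items()
        if props.get(name) != expected
    }


# Made-up sample content; in practice, read the text of your own mapred-site.xml.
SAMPLE = """<configuration>
  <property><name>mapred.job.reuse.jvm.num.tasks</name><value>-1</value></property>
  <property><name>mapred.compress.map.output</name><value>true</value></property>
  <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
</configuration>"""

mismatches = find_mismatches(read_properties(SAMPLE))
for name, (actual, expected) in mismatches.items():
    print(f"{name}: found {actual!r}, recommended {expected!r}")
```

A property reported with a found value of None is simply absent from the file, which may be normal for settings that only apply on the server side.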

After you make these changes, go back to the main screen (click Services at the top or the Cloudera Manager header in the top left).

Here, click the Generate Client Configuration button, which downloads a zip file.  Extract the zip file and put the resulting folder somewhere local that you can find easily.  The files in this folder need to be copied to the configuration folder on every node.  By default, the configuration folder is /mnt/opt/hadoop-home/mapreduce/hadoop/
The files in the folder usually include: core-site.xml, hadoop-env.sh, hdfs-site.xml, log4j.properties, mapred-site.xml, README.txt, ssl-client.xml.example.  Transfer all of these files into the configuration folder.
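
One way to distribute the files is to copy them to each node with scp. The sketch below only builds and prints the commands (a dry run) so they can be reviewed before anything is executed. The node names, the local folder /tmp/hadoop-conf, and the root user are hypothetical; the destination folder and file names are the ones given above.

```python
import shlex

# Default configuration folder and usual file names, as listed in this guide.
CONFIG_DIR = "/mnt/opt/hadoop-home/mapreduce/hadoop/"
CONFIG_FILES = [
    "core-site.xml",
    "hadoop-env.sh",
    "hdfs-site.xml",
    "log4j.properties",
    "mapred-site.xml",
    "README.txt",
    "ssl-client.xml.example",
]


def build_copy_commands(local_dir, nodes, user="root"):
    """Build one scp command (as an argument list) per node, without running it."""
    base = local_dir.rstrip("/")
    sources = [f"{base}/{name}" for name in CONFIG_FILES]
    return [["scp", *sources, f"{user}@{node}:{CONFIG_DIR}"] for node in nodes]


# Hypothetical local folder and node names; replace with your own.
commands = build_copy_commands(
    "/tmp/hadoop-conf", ["node1.example.com", "node2.example.com"]
)
for command in commands:
    print(shlex.join(command))  # review, then run manually or via subprocess.run
```

Keeping the commands as argument lists means they can later be passed to subprocess.run unchanged if you decide to automate the copy.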

Once you have completed these steps, you should be able to schedule MapReduce jobs via the API call: custom/mapreduce/schedulejob

