Overview

...

On each node in the cluster (API and DB nodes, regardless of whether Hadoop/HDFS will actually run on them, but see the info box below), run the "/opt/hadoop-infinite/scripts/online_install.sh" script.

  • (eg 'sh infinite_run_script_el6.sh <CLUSTER> "sh /opt/hadoop-infinite/scripts/online_install.sh" <HOSTS>' for enterprise users)
Info

If the node is not going to be included in the cluster install, then run

Code Block
sh /opt/hadoop-infinite/scripts/online_install.sh partial

i.e. add the argument "partial" - otherwise the symbolic link "/opt/hadoop-infinite/lib" will point to the wrong place. If you did not do this, it can be fixed with

Code Block
rm -f /opt/hadoop-infinite/lib; ln -sf /opt/hadoop-infinite/standalone_lib /opt/hadoop-infinite/lib
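
The repair above can also be wrapped in a check so that the link is only re-created when it points to the wrong place. This is a minimal sketch: the function name "fix_lib_link" is hypothetical and not part of the hadoop-infinite scripts, and it assumes that "lib" is either a symbolic link or absent.

```shell
# Hypothetical helper, not part of the hadoop-infinite scripts.
# Usage: fix_lib_link /opt/hadoop-infinite
# Assumes "<base>/lib" is a symbolic link or does not exist.
fix_lib_link() {
    base="$1"
    want="$base/standalone_lib"
    # readlink prints the link target; the result is empty if "lib" is not a symlink
    current="$(readlink "$base/lib" 2>/dev/null || true)"
    if [ "$current" = "$want" ]; then
        echo "ok"
        return 0
    fi
    # Same repair as the one-liner above: drop the old link, create the new one
    rm -f "$base/lib"
    ln -s "$want" "$base/lib"
    echo "fixed"
}
```

Running it a second time on the same node should report "ok" and leave the link untouched.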

Finally, for the "command line phase", select a node on which to run the Cloudera Manager server, and run "sh /opt/hadoop-infinite/scripts/online_install.sh full" on that server in an interactive console. Select "<Next>/<Yes>/<OK>" whenever prompted - this console UI has no other options.

...

  • "Cloudera recommends setting /proc/sys/vm/swappiness to 0"
  • "There are mismatched versions across the system, which will cause failures. See below for details on which hosts are running what versions of components"
    • (this just refers to Java)
  • "Cloudera supports versions 1.6.0_31 and 1.7.0_55 of Oracle's JVM and later. OpenJDK is not supported, and gcj is known to not work. Check the component version table below to identify hosts with unsupported versions of Java."

...

Modify the following configuration settings, using the "Search" bar to find them:

  • Change "Number of Tasks to Run per JVM" to -1
  • Set "MapReduce Service Environment Advanced Configuration Snippet (Safety Valve)" to 
    • JAVA_HOME="/usr/java/default/jre/"
  • Find "MapReduce Child Java Opts Base" and append "-Djava.security.policy=/opt/infinite-home/config/security.policy" after the already present "-Djava.net.preferIPv4Stack=true" (with a space between them)
  • Search for "Simultaneous" and set (eg) "Maximum Number of Simultaneous Map Tasks" to 2 and "Maximum Number of Simultaneous Reduce Tasks" to 1
    • (on larger instances than the typical 15GB instances, for heavy batch analytics use, this can be increased)
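
To illustrate the "MapReduce Child Java Opts Base" change, the sketch below builds the expected field value from the existing one. It is only an illustration of the string that should end up in the Cloudera Manager field: the function name "append_policy_opt" is hypothetical, and nothing here talks to Cloudera Manager itself.

```shell
# Hypothetical helper: returns the value to paste into
# "MapReduce Child Java Opts Base", given the current value of the field.
POLICY_OPT="-Djava.security.policy=/opt/infinite-home/config/security.policy"

append_policy_opt() {
    current="$1"
    case " $current " in
        # Option already present as a separate word: leave the value unchanged
        *" $POLICY_OPT "*) printf '%s\n' "$current" ;;
        # Otherwise append it, separated by a single space (no leading space if empty)
        *) printf '%s\n' "${current:+$current }$POLICY_OPT" ;;
    esac
}

append_policy_opt "-Djava.net.preferIPv4Stack=true"
```

The final call prints the two options separated by one space, which is the form the field should have after the edit.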

Then select the "Save Changes" button. This brings up two "Stale Configuration" notifications in the top left:

...

  • Copy the contents of the downloaded zip (the files in its "hadoop-conf" directory) to the "/opt/hadoop-infinite/mapreduce/hadoop" directory of each API node
    • (eg "sh hadoop_config_deploy_el6.sh <CLUSTER-NAME> ~/Downloads/mapreduce-clientconfig.zip <API HOSTS>" for enterprise users)
  • Upgrade the API nodes to v0.5 (if not already done - restart the "tomcat6-interface-engine" nodes if v0.5 is already installed)
  • Remove the "/opt/infinite-home/bin/STOP_CUSTOM" file
    • (eg 'sh infinite_run_script_el6.sh <CLUSTER> "rm -f /opt/infinite-home/bin/STOP_CUSTOM" <HOSTS>' for enterprise users)
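
For a single API node without the enterprise scripts, the copy step in the list above might look like the following. This is a sketch under stated assumptions: the function name "deploy_hadoop_conf" is hypothetical, the zip is assumed to have been extracted already, and the upgrade/restart and STOP_CUSTOM steps are not covered.

```shell
# Hypothetical helper for one API node, not part of the IKANOW tooling.
# $1: the extracted "hadoop-conf" directory from the client configuration zip
# $2: the target directory (normally /opt/hadoop-infinite/mapreduce/hadoop)
deploy_hadoop_conf() {
    conf_src="$1"
    conf_dst="$2"
    # Refuse to continue if the extracted directory is missing
    if [ ! -d "$conf_src" ]; then
        echo "missing directory: $conf_src" >&2
        return 1
    fi
    mkdir -p "$conf_dst"
    # Copy the contents of hadoop-conf (not the directory itself) into the target
    cp -R "$conf_src"/. "$conf_dst"/
}
```

It would be called as, for example, 'deploy_hadoop_conf ./hadoop-conf /opt/hadoop-infinite/mapreduce/hadoop' from the directory where the zip was extracted.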

Installing CDH5 (YARN)

IKANOW is not yet integrated with CDH5 running YARN. Once it is, this section will explain how to:

...