Hadoop: How to add all JARs onto the Classpath

I’ve just begun with Apache Hadoop. Getting started isn’t as straightforward as one might wish. There is quite a bit of work to do after the initial installation (“brew install hadoop” in my case). Expect tweaks and workarounds. Fortunately, there is plenty of advice available, such as this excellent blog post.

After installing the latest stable version (2.7.1) and going through the initial setup, I was still unable to use the damn thing! Strange as it sounds, some of the key modules were not present on the classpath. See the failure I got when trying to format HDFS:

$ hdfs namenode -format
Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode

I tried altering hadoop-env.sh, but after a while I realised it was too much of a hassle. I ended up looking for a way to add ALL of the JARs in the Hadoop package to the classpath. It didn’t take long to find the answer on Stack Overflow.

So, here is my take on how to add all Hadoop JARs to the classpath.

Open your bash profile (~/.profile or ~/.bash_profile) for editing and add the following:

export HADOOP_HOME="/usr/local/Cellar/hadoop" # Replace with your own path
# Join all JAR paths with ':' (quoting and paste keep paths with spaces intact)
export HADOOP_CLASSPATH=$(find "$HADOOP_HOME" -name '*.jar' | paste -sd: -)

Save the changes and reload the file you edited:

source ~/.profile # or ~/.bash_profile
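
If HADOOP_HOME points at the wrong place, the find command quietly produces an empty classpath and you are back to square one. Here is a minimal sketch of the same idea wrapped in a function that fails loudly instead. The function name hadoop_jars_classpath is my own invention, not something Hadoop ships.

```shell
# Hypothetical helper: print every JAR under the directory given as $1
# as a single colon-separated list.
hadoop_jars_classpath() {
  # Fail early if the path is wrong, rather than exporting an empty classpath.
  if [ ! -d "$1" ]; then
    echo "not a directory: $1" >&2
    return 1
  fi
  # sort keeps the order stable between runs;
  # paste joins the lines with ':' and leaves no trailing separator.
  find "$1" -name '*.jar' | sort | paste -sd: -
}

# Possible use in the profile (assumes HADOOP_HOME is set as above):
#   export HADOOP_CLASSPATH=$(hadoop_jars_classpath "$HADOOP_HOME")
```

The sort step is optional. I added it only so that the resulting classpath does not change order from one shell session to the next.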

Finally, check the classpath. I’ve truncated the output, but you get the idea.

$ hadoop classpath
hadoop/yarn/hadoop-yarn-server-common-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.1.jar:...
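
The full output is one very long line, which is hard to scan by eye. A small sketch for checking whether a particular module made it onto the classpath follows. The helper name classpath_entries is hypothetical, and the hadoop-hdfs pattern is simply the JAR I would look for first, since that is where the NameNode class lives.

```shell
# Hypothetical helper: list the classpath entries ($1) that match a pattern ($2).
classpath_entries() {
  # Put each entry on its own line, then keep only the matching ones.
  printf '%s\n' "$1" | tr ':' '\n' | grep -- "$2"
}

# Possible use (assumes the hadoop command is on your PATH):
#   classpath_entries "$(hadoop classpath)" 'hadoop-hdfs'
```

The function returns a non-zero exit status when nothing matches, so it can also be used in an if statement.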

From this point on, all commands should work. Let’s get back to hdfs:

$ hdfs namenode -format
SLF4J: Class path contains multiple SLF4J bindings.
...
15/08/28 09:53:33 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = toms-macbook-pro.home/192.168.1.99
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.1
...
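
The SLF4J warning at the top of that output is most likely a side effect of adding every JAR: the same library can end up on the classpath more than once. A rough sketch for spotting such duplicates by file name is below. The helper name duplicate_jars is made up for this example, and it only compares file names, not JAR contents.

```shell
# Hypothetical helper: print JAR file names that occur more than once
# in the classpath passed as $1.
duplicate_jars() {
  # One entry per line, strip the directory part, then report repeated names.
  printf '%s\n' "$1" | tr ':' '\n' | sed 's|.*/||' | sort | uniq -d
}

# Possible use (assumes HADOOP_CLASSPATH was exported as described above):
#   duplicate_jars "$HADOOP_CLASSPATH"
```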

I would love to say that this is it, but you are likely to run into other issues. As usual, patience and Google are your best friends.
