Setup Hadoop on Ubuntu 11.04 64-bit

The Hadoop documentation already explains clearly how to set up Hadoop on Linux. In this entry, however, I want to make the same process simpler and shorter, tailored to 64-bit Ubuntu 11.04.

1. Install Sun JDK

The Sun JDK is not available in the official repositories behind the Ubuntu Software Center. What a shame! Let’s resort to an external PPA (Personal Package Archive). Launch the Terminal and run the following commands:

sudo add-apt-repository ppa:ferramroberto/java
sudo apt-get update
sudo apt-get install sun-java6-bin
sudo apt-get install sun-java6-jdk

Add the JAVA_HOME variable:

sudo gedit /etc/environment

Append a new line to the file:

export JAVA_HOME="/usr/lib/jvm/java-6-sun-"

Verify the installation in the Terminal:

java -version
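
Since java -version only reports whichever java comes first on the PATH, it may also help to confirm that JAVA_HOME points at a real JDK directory. Here is a minimal sketch; the function name check_java_home is hypothetical and not part of Java or Hadoop:

```shell
# check_java_home DIR: succeed only if DIR looks like a JDK install.
# Hypothetical helper for illustration.
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is empty" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $dir"
}
```

Call it as check_java_home "$JAVA_HOME". Because /etc/environment is read at login, the variable will normally be empty until you log out and back in.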

2. Check SSH Settings

ssh localhost

If it says “connection refused”, the SSH server is probably not installed or not running. Install the server and client:

sudo apt-get install openssh-server openssh-client

If you cannot ssh to localhost without a passphrase, generate a key pair with an empty passphrase and append its public key to your authorized keys:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
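
If ssh still asks for a password after this, overly permissive file modes are one common cause, because sshd with its default StrictModes setting ignores key files that other users can write to. The sketch below tightens the modes; the helper name tighten_ssh_perms is hypothetical:

```shell
# tighten_ssh_perms DIR: make an SSH directory (normally ~/.ssh) and its
# authorized_keys file accessible to the owner only.
# Hypothetical helper for illustration.
tighten_ssh_perms() {
    ssh_dir="$1"
    chmod 700 "$ssh_dir" || return 1
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys" || return 1
    fi
}
```

After running tighten_ssh_perms ~/.ssh, a non-interactive check such as ssh -o BatchMode=yes localhost true should succeed without prompting if key-based login works.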

3. Setup Hadoop

Download a recent stable release and unpack it. Edit conf/hadoop-env.sh to define JAVA_HOME as "/usr/lib/jvm/java-6-sun-":

# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun-

Pseudo-Distributed Operation:
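
For pseudo-distributed mode, three files under conf/ need one property each. The sketch below writes them from the shell. The property names, host and ports are an assumption taken from the single-node setup guide for Hadoop 0.20.x-era releases, so check them against the documentation of the release you downloaded. The function name write_pseudo_conf is hypothetical, and it overwrites any existing files of the same name.

```shell
# write_pseudo_conf CONF_DIR: write minimal single-node config files.
# Values assume the Hadoop 0.20.x single-node setup guide (see lead-in).
write_pseudo_conf() {
    conf_dir="$1"

    # Address of the NameNode, i.e. the default file system.
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

    # Only one DataNode exists, so keep a single copy of each block.
    cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

    # Address of the JobTracker.
    cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}
```

From the Hadoop root directory this would be invoked as write_pseudo_conf conf.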

Switch to the Hadoop root directory and format a new distributed file system:

bin/hadoop namenode -format

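You’ll get a message like “Storage directory /tmp/hadoop-jasper/dfs/name has been successfully formatted.” Strictly speaking, this path is the local directory where the NameNode stores its metadata, not a location inside HDFS; the commands in the next section simply reuse the same string as an HDFS directory name.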

Start and stop hadoop daemons:
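
In Hadoop 0.20.x-era releases the daemons are typically started with bin/start-all.sh and stopped with bin/stop-all.sh. The sketch below wraps the two scripts. It assumes that layout (later releases reorganized the scripts) and that HADOOP_HOME points at the unpacked release. The function names are hypothetical.

```shell
# Thin wrappers around the start/stop scripts assumed to live in bin/
# of the unpacked release. Falls back to the current directory when
# HADOOP_HOME is unset.
start_hadoop() {
    "${HADOOP_HOME:-.}/bin/start-all.sh"
}

stop_hadoop() {
    "${HADOOP_HOME:-.}/bin/stop-all.sh"
}
```

After start_hadoop, the JDK's jps tool lists the running Java processes. In this setup you would expect to see NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker.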


Web interfaces for the NameNode and the JobTracker:
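
With the default configuration of Hadoop 0.20.x-era releases, the NameNode page is usually served on port 50070 and the JobTracker page on port 50030. Both values are assumptions, and they change if the HTTP address properties are overridden. Here is a small sketch for probing them from the shell, assuming curl is installed; ui_status is a hypothetical helper:

```shell
# Default web UI addresses (assumed, see lead-in).
NAMENODE_UI="http://localhost:50070/"
JOBTRACKER_UI="http://localhost:50030/"

# ui_status URL: print "up" if the page answers within 5 seconds,
# "down" otherwise (including when curl itself is unavailable).
ui_status() {
    if curl --silent --fail --max-time 5 --output /dev/null "$1"; then
        echo "up"
    else
        echo "down"
    fi
}
```

For example, ui_status "$NAMENODE_UI" prints up once the NameNode has finished starting. You can also open either address directly in a browser.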

4. Deploy An Example Map-Reduce Job

Let’s run the WordCount example job, which ships with the Hadoop release. Put some text files in a local directory, e.g., “/home/jasper/mapreduce/wordcount/”. Then copy these files from the local directory to an HDFS directory and list them:

bin/hadoop dfs -copyFromLocal /home/jasper/mapreduce/wordcount /tmp/hadoop-jasper/dfs/name/wordcount

bin/hadoop dfs -ls /tmp/hadoop-jasper/dfs/name/wordcount

Run the job:

bin/hadoop jar hadoop*examples*.jar wordcount /tmp/hadoop-jasper/dfs/name/wordcount /tmp/hadoop-jasper/dfs/name/wordcount-output

If the job output reports no errors, merge the output files from HDFS into a single file in the local directory:

bin/hadoop dfs -getmerge /tmp/hadoop-jasper/dfs/name/wordcount-output /home/jasper/mapreduce/wordcount/

Now you can open the output file in your local directory to view the results.
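
WordCount writes one line per word, in the form word, tab, count, so the merged file can be ranked with standard tools. Here is a minimal sketch; top_words is a hypothetical helper name, and the file path passed to it is whatever getmerge produced on your machine:

```shell
# top_words FILE [N]: print the N most frequent words (default 10)
# from WordCount output, where each line is "word<TAB>count".
top_words() {
    # Sort numerically on the second tab-separated field, largest first.
    sort -t "$(printf '\t')" -k2,2nr "$1" | head -n "${2:-10}"
}
```

For example, top_words path/to/merged-output 20 would print the twenty most frequent words.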
