2. Untar/unzip Hadoop and Java using the following commands
$sudo tar -xzf java-7-oracle.tar.gz
$sudo tar -xzf hadoop-1.0.0.tar.gz
3. Rename the folders to some meaningful names. Here I renamed the
folders to java and hadoop.
4. Now it's time to export the environment variables. The .bashrc
file is a script that runs every time a user logs in, and it is
located under the home directory of the user. You can append the
following entries to the .bashrc file.
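For example, assuming the archives were extracted and renamed under the user's home directory (the actual paths depend on where you unpacked them in the earlier steps), the entries might look like this:

```shell
# Assumed locations -- adjust to wherever you extracted and renamed
# the java and hadoop folders in steps 2 and 3.
export JAVA_HOME=$HOME/java
export HADOOP_HOME=$HOME/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
```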
5. Now you can close the terminal and restart it so that the
.bashrc changes take effect. You can run the following commands to
verify the Java and Hadoop installations.
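For instance (the exact version strings printed will depend on your installation):

```shell
# Print the installed Java and Hadoop versions; if both commands
# succeed, the PATH entries from .bashrc are working.
java -version
hadoop version
```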
6. Once this is done, you can configure ssh. Configuring ssh is a
two-step process: first generate the keys, and second copy the
public key into the authorized_keys file. Here are the commands you
need to run to do so,
$ssh-keygen -t rsa -P ""
$cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
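You can then confirm that passwordless ssh works (a quick sanity check, not part of the original listing):

```shell
# Should run the remote command without prompting for a password.
ssh localhost echo ok
```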
7. Once done, the next step is to configure Hadoop. All Hadoop
configuration files are under the $HADOOP_HOME/conf folder. Hadoop
configuration requires changes to the following three files
1. hadoop-env.sh - In this file we need to set JAVA_HOME. The file already contains a placeholder for JAVA_HOME which is commented out, so you just need to search for it and uncomment the line.
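After uncommenting, the line might look like the following; the path shown is an assumption, so point it at the java folder you created in step 3:

```shell
# In $HADOOP_HOME/conf/hadoop-env.sh -- point JAVA_HOME at your JDK
# folder (this path is an example, not a required location).
export JAVA_HOME=$HOME/java
```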
2. core-site.xml - This file holds the HDFS configuration. Here we need to configure a minimum of two properties, viz. hadoop.tmp.dir and fs.default.name, as shown below
<description>A base for other temporary directories.</description>
<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
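Putting the two properties together, a minimal core-site.xml might look like the following; the directory and the URI host/port are assumptions and should match your own setup:

```xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- Assumed path; any directory the Hadoop user can write to works. -->
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <!-- Assumed host/port for a single-node setup. -->
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system.</description>
  </property>
</configuration>
```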
3. mapred-site.xml - This file is specific to Job Tracker
settings. We should set this file as follows.
<description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
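A minimal mapred-site.xml could look like this; the host/port value is an assumption for a single-node setup:

```xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- Assumed host/port for a single-node setup. -->
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker
    runs at. If "local", then jobs are run in-process as a single
    map and reduce task.</description>
  </property>
</configuration>
```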
8. Now whatever folder we configured for hadoop.tmp.dir in step
7.2, we should manually create that folder and give full rights on
it to the user who will run Hadoop.
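Assuming hadoop.tmp.dir was set to /app/hadoop/tmp (a hypothetical value -- substitute your own), and with hduser:hadoop as an equally assumed user and group, the commands would be:

```shell
# Create the directory configured as hadoop.tmp.dir and hand it over
# to the user that will run Hadoop (path, user and group here are
# assumptions -- use your own values).
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
sudo chmod 750 /app/hadoop/tmp
```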
9. Now format the NameNode so that it creates all the required
folder structure, as shown below.
$hadoop namenode -format
10. And the last step is to start all daemons as shown below.
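In Hadoop 1.x all the daemons can be started with the start-all.sh script that ships under $HADOOP_HOME/bin:

```shell
# Starts the NameNode, DataNode, SecondaryNameNode, JobTracker and
# TaskTracker daemons on the local machine.
start-all.sh
```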
11. You can verify that all daemons have started by running the following command.
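One common way to check (the original text does not name the command, so this is a suggestion) is the jps tool that ships with the JDK, which lists the running Java processes:

```shell
# Lists running JVMs; you should see NameNode, DataNode,
# SecondaryNameNode, JobTracker and TaskTracker among them.
jps
```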