Hadoop is everywhere and gaining attention like crazy. This is not an article that explains what it is or how it works, because there are plenty of good resources for that. Instead of repeating the same material, I’m going to help you go a step further and deploy a multi-node Hadoop cluster on Ubuntu. Pretty interesting, right? If you follow the steps below, you can get it done in about 15 minutes. Let’s start.

Prerequisites

All you need is:

  • Java 1.7 installed on every node.
  • 5 nodes. In my case they are 192.168.7.87, 192.168.7.88, 192.168.7.89, 192.168.7.90 and 192.168.7.91.

1. Configure Environment

  1. Let’s create a dedicated user for Hadoop named hduser. Do this on every node.
    useradd -m -d /home/hduser -s /bin/bash hduser
  2. Configure password-less SSH
    First, decide which node will be the master, which the secondary master, and which the slaves. Then make sure the master node can SSH without a password to all the slaves and the secondary master. If you don’t know how to set up password-less SSH, refer to this article.
  3. Edit /etc/hosts and add the entries below. Also comment out the IPv6 entries.
    192.168.7.87 master
    192.168.7.88 master2
    192.168.7.89 slave1
    192.168.7.90 slave2
    192.168.7.91 slave3
  4. Edit the hostname file
    On the master node, edit the hostname file as shown below.

    vim /etc/hostname

    Replace the content with master. Now follow the same steps to edit the hostname on the other nodes as well. The hostnames should be master2, slave1, slave2 and slave3 respectively.
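
The password-less SSH setup from step 2 can be sketched roughly as follows. This is only a minimal sketch and not necessarily what the linked article does. It assumes you run it as hduser on the master, that hduser already exists with a password on every node, and that the hostnames from /etc/hosts resolve. The run helper, the DRY_RUN variable, the key path and the function names are my own choices for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: give the master password-less SSH access to the other nodes.
# Assumption: run as hduser on the master node.

HADOOP_USER="hduser"
NODES="master2 slave1 slave2 slave3"   # hostnames from the /etc/hosts step
KEY_FILE="$HOME/.ssh/id_rsa"           # assumed key location
DRY_RUN="${DRY_RUN:-1}"                # 1 = print only, 0 = really execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_passwordless_ssh() {
  # Create a key pair without a passphrase unless one already exists.
  if [ ! -f "$KEY_FILE" ]; then
    run ssh-keygen -t rsa -b 4096 -N "" -f "$KEY_FILE"
  fi
  # Append the public key to authorized_keys on every other node.
  for node in $NODES; do
    run ssh-copy-id -i "$KEY_FILE.pub" "$HADOOP_USER@$node"
  done
}

verify_passwordless_ssh() {
  # BatchMode makes ssh fail instead of prompting for a password.
  for node in $NODES; do
    run ssh -o BatchMode=yes "$HADOOP_USER@$node" hostname
  done
}

setup_passwordless_ssh
verify_passwordless_ssh
```

Run it with DRY_RUN=0 to actually execute the commands. ssh-copy-id will ask for hduser’s password once per node, and after that the verification step should print master2, slave1, slave2 and slave3 without any prompt if the hostnames from step 4 are in place.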
