Hadoop is everywhere and gaining attention fast. This article does not explain what it is or how it works, because there are plenty of good resources for that. Instead of repeating them, I’m going to help you go a step further and deploy a multi-node Hadoop cluster on Ubuntu. Pretty interesting, right? If you follow the steps below, you can get it done in about 15 minutes. Let’s start.
All you need is:
- Java 1.7 installed.
- 5 nodes. In my case they are 192.168.7.87, 192.168.7.88, 192.168.7.89, 192.168.7.90 and 192.168.7.91.
1. Configure Environment
- Let’s create a dedicated user for Hadoop, named hduser.
useradd -m -d /home/hduser -s /bin/bash hduser
- Configure password-less SSH
First, decide which node is going to be the master, which the secondary master, and which the slaves. Then make sure the master node can SSH to the secondary master and to all the slaves without a password. If you don’t know how to set up password-less SSH, refer to this article.
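If you would rather script it, here is a minimal sketch of the usual ssh-keygen plus ssh-copy-id approach. It is not taken from the linked article. I’m assuming you run it as hduser on the master (192.168.7.87 in my case) and that hduser already exists on the other four nodes. The function name setup_passwordless_ssh and the NODES and KEY variables are just names I picked for illustration.

```shell
#!/usr/bin/env bash
# Assumption: 192.168.7.87 is the master; these are the remaining four nodes.
NODES="192.168.7.88 192.168.7.89 192.168.7.90 192.168.7.91"
KEY="$HOME/.ssh/id_rsa"

setup_passwordless_ssh() {
    # Create a key pair with an empty passphrase unless one already exists
    [ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY"
    for node in $NODES; do
        # Asks for hduser's password on that node once, then installs the public key
        ssh-copy-id -i "$KEY.pub" "hduser@$node"
    done
}
```

Call setup_passwordless_ssh once, then try something like ssh hduser@192.168.7.88 to confirm that no password prompt appears.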
- Edit /etc/hosts and add the entries below. Also comment out the IPv6 entries.
192.168.7.87 master
192.168.7.88 master2
192.168.7.89 slave1
192.168.7.90 slave2
192.168.7.91 slave3
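If you don’t want to edit the file by hand, the two helpers below are a rough sketch of how this step could be scripted and checked. Both function names are hypothetical, and I’m assuming that an IPv6 entry is any uncommented line containing "::", which matches the stock Ubuntu hosts file but may not match yours.

```shell
#!/usr/bin/env bash
comment_out_ipv6() {
    # $1: path to a hosts file.
    # Prefix every uncommented line that contains "::" with "#"; sed keeps a .bak copy.
    sed -i.bak '/^[^#]*::/s/^/#/' "$1"
}

missing_cluster_hosts() {
    # $1: path to a hosts file.
    # Print each cluster host name that has no uncommented entry in the file.
    for name in master master2 slave1 slave2 slave3; do
        grep -Eq "^[^#]*[[:space:]]$name([[:space:]]|\$)" "$1" || echo "$name"
    done
}
```

One cautious way to use them: copy /etc/hosts to a scratch file, run comment_out_ipv6 on the copy, add the five entries, check that missing_cluster_hosts prints nothing, and only then copy the file back with sudo.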
- Edit the hostname file
On the master node, open the hostname file (/etc/hostname on Ubuntu) and replace its content with master. Then do the same on the other nodes, setting the hostname to master2, slave1, slave2 and slave3 respectively.
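Once the hostnames are in place, a quick sanity check from the master can catch typos early. The sketch below assumes password-less SSH is already working for hduser and that the names from /etc/hosts resolve. verify_hostnames is a hypothetical helper name, not part of Hadoop.

```shell
#!/usr/bin/env bash
verify_hostnames() {
    rc=0
    for node in master2 slave1 slave2 slave3; do
        # BatchMode makes ssh fail instead of prompting if key login is not set up
        actual=$(ssh -o BatchMode=yes "hduser@$node" hostname) || actual="(unreachable)"
        if [ "$actual" = "$node" ]; then
            echo "ok: $node"
        else
            echo "mismatch: expected $node, got $actual"
            rc=1
        fi
    done
    return $rc
}
```

Run verify_hostnames on the master. A node that still reports its old name most likely needs a reboot, or a manual hostname command run as root, before the new name takes effect.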