Set Up Hadoop Multi-Node Cluster on CentOS 6
Topic Started: Dec 16 2013, 08:30 AM (469 Views)
Huynhnb8x · Dec 16 2013, 08:30 AM · Post #1
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Our earlier article described how to set up a single-node cluster. This article walks you through installing and configuring a Hadoop multi-node cluster on CentOS/RHEL 6, step by step.

Setup Details:
Hadoop Master: 192.168.1.15 ( hadoop-master )
Hadoop Slave : 192.168.1.16 ( hadoop-slave-1 )
Hadoop Slave : 192.168.1.17 ( hadoop-slave-2 )

Step 1. Install Java
Before installing Hadoop, make sure Java is installed on all of your systems. If it is not, use the following article to install it: Steps to install JAVA on CentOS 5/6 or RHEL 5/6.

Step 2. Create User Account
Create a system user account on both the master and slave systems to use for the Hadoop installation.

# useradd hadoop
# passwd hadoop
Changing password for user hadoop.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.

Step 3. Add FQDN Mapping
Edit the /etc/hosts file on all master and slave servers and add the following entries.

# vim /etc/hosts
192.168.1.15 hadoop-master
192.168.1.16 hadoop-slave-1
192.168.1.17 hadoop-slave-2

Step 4. Configure Key-Based Login
The hadoop user must be able to ssh between the cluster nodes without a password. Use the following commands to configure passwordless login between all Hadoop cluster servers.

# su - hadoop
$ ssh-keygen -t rsa
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-2
$ chmod 0600 ~/.ssh/authorized_keys
$ exit

Step 5. Download and Extract Hadoop Source
Download the latest available Hadoop version from its official site, on the hadoop-master server only.

# mkdir /opt/hadoop
# cd /opt/hadoop/
# wget http://apache.mesi.com.ar/hadoop/common/hadoop-1.2.0/hadoop-1.2.0.tar.gz
# tar -xzf hadoop-1.2.0.tar.gz
# mv hadoop-1.2.0 hadoop
# chown -R hadoop /opt/hadoop
# cd /opt/hadoop/hadoop/

Step 6. Configure Hadoop
Edit the Hadoop configuration files and make the following changes.

6.1 Edit core-site.xml

# vim conf/core-site.xml
# Add the following inside the configuration tag
<property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-master:9000/</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

6.2 Edit hdfs-site.xml

# vim conf/hdfs-site.xml
# Add the following inside the configuration tag
<property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop/hadoop/dfs/name/data</value>
    <final>true</final>
</property>
<property>
    <name>dfs.name.dir</name>
    <value>/opt/hadoop/hadoop/dfs/name</value>
    <final>true</final>
</property>
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

6.3 Edit mapred-site.xml

# vim conf/mapred-site.xml
# Add the following inside the configuration tag
<property>
    <name>mapred.job.tracker</name>
    <value>hadoop-master:9001</value>
</property>

6.4 Edit hadoop-env.sh

# vim conf/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.7.0_17
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_CONF_DIR=/opt/hadoop/hadoop/conf

Set the JAVA_HOME path according to your system's Java installation.
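Before copying files to the slaves in the next step, it can help to verify that the key-based login from Step 4 works and that the JAVA_HOME path used in hadoop-env.sh actually exists. These checks are not part of the original guide, just a small sketch assuming the same hostnames and paths used above; adjust /opt/jdk1.7.0_17 if your JDK lives elsewhere.

# Switch to the hadoop user on hadoop-master
# su - hadoop
# Each of these should print the slave's hostname without asking for a password (Step 4):
$ ssh hadoop-slave-1 hostname
$ ssh hadoop-slave-2 hostname
# The JAVA_HOME directory set in conf/hadoop-env.sh must exist on every node:
$ ls -d /opt/jdk1.7.0_17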
Step 7. Copy Hadoop Source to Slave Servers
After updating the configuration above, copy the Hadoop installation to all slave servers.

# su - hadoop
$ cd /opt/hadoop
$ scp -r hadoop hadoop-slave-1:/opt/hadoop
$ scp -r hadoop hadoop-slave-2:/opt/hadoop

Step 8. Configure Hadoop on the Master Server Only
Go to the Hadoop source folder on hadoop-master and make the following settings.

# su - hadoop
$ cd /opt/hadoop/hadoop

$ vim conf/masters
hadoop-master

$ vim conf/slaves
hadoop-slave-1
hadoop-slave-2

Format the name node, on the Hadoop master only.

# su - hadoop
$ cd /opt/hadoop/hadoop
$ bin/hadoop namenode -format

13/07/13 10:58:07 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop-master/192.168.1.15
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473; compiled by 'hortonfo' on Mon May 6 06:59:37 UTC 2013
STARTUP_MSG:   java = 1.7.0_25
************************************************************/
13/07/13 10:58:08 INFO util.GSet: Computing capacity for map BlocksMap
13/07/13 10:58:08 INFO util.GSet: VM type = 32-bit
13/07/13 10:58:08 INFO util.GSet: 2.0% max memory = 1013645312
13/07/13 10:58:08 INFO util.GSet: capacity = 2^22 = 4194304 entries
13/07/13 10:58:08 INFO util.GSet: recommended=4194304, actual=4194304
13/07/13 10:58:08 INFO namenode.FSNamesystem: fsOwner=hadoop
13/07/13 10:58:08 INFO namenode.FSNamesystem: supergroup=supergroup
13/07/13 10:58:08 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/07/13 10:58:08 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/07/13 10:58:08 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/07/13 10:58:08 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
13/07/13 10:58:08 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/07/13 10:58:08 INFO common.Storage: Image file of size 112 saved in 0 seconds.
13/07/13 10:58:08 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/opt/hadoop/hadoop/dfs/name/current/edits
13/07/13 10:58:08 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/opt/hadoop/hadoop/dfs/name/current/edits
13/07/13 10:58:08 INFO common.Storage: Storage directory /opt/hadoop/hadoop/dfs/name has been successfully formatted.
13/07/13 10:58:08 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop-master/192.168.1.15
************************************************************/

Step 9. Start Hadoop Services
Use the following command to start all Hadoop services on hadoop-master.

$ bin/start-all.sh
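The original steps stop at start-all.sh. A rough way to confirm the cluster actually came up, assuming the Hadoop 1.x defaults used throughout this guide, is sketched below; jps ships with the JDK and dfsadmin is part of the standard Hadoop 1.x command set.

# On hadoop-master, jps should list NameNode, SecondaryNameNode and JobTracker;
# on each slave it should list DataNode and TaskTracker.
$ jps

# HDFS cluster report: with both slaves running, "Datanodes available" should read 2.
$ bin/hadoop dfsadmin -report

# Web interfaces (Hadoop 1.x default ports):
#   NameNode:   http://hadoop-master:50070/
#   JobTracker: http://hadoop-master:50030/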
Knowledge crawling