Use the following instructions to manually add a DataNode or TaskTracker host:
- On each of the newly added slave nodes, add the HDP repository to yum, then verify that it is registered (see the sketch below this step):
  - wget -nv http://public-repo-1.hortonworks.com/HDP-1.2.0/repos/centos6/hdp.repo -O /etc/yum.repos.d/hdp.repo
  - yum clean all
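A quick way to confirm the repository was registered is to list the enabled repositories. This is a minimal sketch; the exact repository label reported by yum may differ in your environment:

```
# List enabled repositories and look for the HDP entry (label may vary).
yum repolist enabled | grep -i hdp
```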
- On each of the newly added slave nodes, install HDFS and MapReduce.
  - On RHEL and CentOS:
    - yum install hadoop hadoop-libhdfs hadoop-native
    - yum install hadoop-pipes hadoop-sbin openssl
  - On SLES:
    - zypper install hadoop hadoop-libhdfs hadoop-native
    - zypper install hadoop-pipes hadoop-sbin openssl
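As a sanity check after the packages install, you can confirm that the Hadoop packages landed and that the client runs. A minimal sketch:

```
# Confirm the Hadoop packages are installed and the client reports a version.
rpm -qa | grep -i hadoop
hadoop version
```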
 
- On each of the newly added slave nodes, install the Snappy compression/decompression library:
  - Check if Snappy is already installed:
    - rpm -qa | grep snappy
  - Install Snappy on the new nodes:
    - For RHEL/CentOS:
      - yum install snappy snappy-devel
    - For SLES:
      - zypper install snappy snappy-devel
      - ln -sf /usr/lib64/libsnappy.so /usr/lib/hadoop/lib/native/Linux-amd64-64/.
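To confirm the Snappy shared library is in place, you can look for it in the system library path and, on SLES after the symlink step above, in Hadoop's native library directory. This is a sketch only; on 64-bit systems the library typically lands under /usr/lib64:

```
# Check for the Snappy shared object in the system library path (64-bit systems).
ls -l /usr/lib64/ | grep -i snappy

# On SLES, after the symlink step above, it should also be visible here.
ls -l /usr/lib/hadoop/lib/native/Linux-amd64-64/ | grep -i snappy
```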
 
 
- Optional: Install the LZO compression library.
  - On RHEL and CentOS:
    - yum install lzo-devel hadoop-lzo-native
  - On SLES:
    - zypper install lzo-devel hadoop-lzo-native
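If you installed LZO, you can list what the hadoop-lzo-native package placed on disk to confirm the native libraries are present. A minimal sketch (rpm is available on both RHEL/CentOS and SLES):

```
# List the files installed by the hadoop-lzo-native package.
rpm -ql hadoop-lzo-native
```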
 
- Copy the Hadoop configurations to the newly added slave nodes and set appropriate permissions.
  - Option I: Copy Hadoop config files from an existing slave node.
    - On an existing slave node, make a copy of the current configurations:
      - tar zcvf hadoop_conf.tgz /etc/hadoop/conf
    - Copy this file to each of the new nodes (for one way to push the archive to several nodes at once, see the sketch after this step):
      - rm -rf /etc/hadoop/conf
      - cd /
      - tar zxvf $location_of_copied_conf_tar_file/hadoop_conf.tgz
      - chmod -R 755 /etc/hadoop/conf
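One way to distribute and unpack the archive on several new nodes in a single pass is a simple scp/ssh loop. This is a sketch only: the hostnames (slave03, slave04) and the use of the root account are assumptions to adapt to your environment. Because the archive was created from /etc/hadoop/conf, extracting it from / recreates that path.

```
# Hypothetical hostnames; replace with your new slave nodes.
for host in slave03.example.com slave04.example.com; do
  scp hadoop_conf.tgz root@${host}:/tmp/
  ssh root@${host} "rm -rf /etc/hadoop/conf && cd / && tar zxvf /tmp/hadoop_conf.tgz && chmod -R 755 /etc/hadoop/conf"
done
```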
 
  - Option II: Manually add Hadoop configuration files.
    - Download the core Hadoop configuration files from here and extract the files under the configuration_files/core_hadoop directory to a temporary location.
    - In the temporary directory, locate the following files and modify the properties based on your environment. Search for TODO in the files for the properties to replace.

      Table 6.1. core-site.xml

      | Property | Example | Description |
      |---|---|---|
      | fs.default.name | hdfs://{namenode.full.hostname}:8020 | Enter your NameNode hostname. |
      | fs.checkpoint.dir | /grid/hadoop/hdfs/snn | A comma-separated list of paths. Use the list of directories from $FS_CHECKPOINT_DIR. |

      Table 6.2. hdfs-site.xml

      | Property | Example | Description |
      |---|---|---|
      | dfs.name.dir | /grid/hadoop/hdfs/nn,/grid1/hadoop/hdfs/nn | Comma-separated list of paths. Use the list of directories from $DFS_NAME_DIR. |
      | dfs.data.dir | /grid/hadoop/hdfs/dn,/grid1/hadoop/hdfs/dn | Comma-separated list of paths. Use the list of directories from $DFS_DATA_DIR. |
      | dfs.http.address | {namenode.full.hostname}:50070 | Enter your NameNode hostname for HTTP access. |
      | dfs.secondary.http.address | {secondary.namenode.full.hostname}:50090 | Enter your Secondary NameNode hostname. |
      | dfs.https.address | {namenode.full.hostname}:50470 | Enter your NameNode hostname for HTTPS access. |

      Table 6.3. mapred-site.xml

      | Property | Example | Description |
      |---|---|---|
      | mapred.job.tracker | {jobtracker.full.hostname}:50300 | Enter your JobTracker hostname. |
      | mapred.job.tracker.http.address | {jobtracker.full.hostname}:50030 | Enter your JobTracker hostname. |
      | mapred.local.dir | /grid/hadoop/mapred,/grid1/hadoop/mapred | Comma-separated list of paths. Use the list of directories from $MAPREDUCE_LOCAL_DIR. |
      | mapreduce.tasktracker.group | hadoop | Enter your group. Use the value of $HADOOP_GROUP. |
      | mapreduce.history.server.http.address | {jobtracker.full.hostname}:51111 | Enter your JobTracker hostname. |

      Table 6.4. taskcontroller.cfg

      | Property | Example | Description |
      |---|---|---|
      | mapred.local.dir | /grid/hadoop/mapred,/grid1/hadoop/mapred | Comma-separated list of paths. Use the list of directories from $MAPREDUCE_LOCAL_DIR. |
    - Create the config directory on all hosts in your cluster, copy in all the configuration files, and set permissions (for one way to define $HADOOP_CONF_DIR, $HDFS_USER, and $HADOOP_GROUP, see the sketch after this step):
      - rm -r $HADOOP_CONF_DIR
      - mkdir -p $HADOOP_CONF_DIR
      - <copy all the config files to $HADOOP_CONF_DIR>
      - chmod a+x $HADOOP_CONF_DIR/
      - chown -R $HDFS_USER:$HADOOP_GROUP $HADOOP_CONF_DIR/../
      - chmod -R 755 $HADOOP_CONF_DIR/../
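The commands above, and the daemon start commands later in this section, reference $HADOOP_CONF_DIR, $HDFS_USER, and $HADOOP_GROUP. A minimal sketch of how these might be set, assuming the conventional HDP values of /etc/hadoop/conf, hdfs, and hadoop; substitute the values used in your cluster:

```
# Assumed values; adjust to match your environment.
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HDFS_USER=hdfs
export HADOOP_GROUP=hadoop
```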
 
 
- On each of the newly added slave nodes, start HDFS:
  - su - hdfs -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode"
- On each of the newly added slave nodes, start MapReduce (to confirm the daemons came up, see the sketch below):
  - su - mapred -c "/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start tasktracker"
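A quick way to confirm that the DataNode and TaskTracker processes started on a new slave node is to look for them in the process list. A minimal sketch:

```
# Look for the DataNode and TaskTracker JVMs on the new slave node.
ps -ef | egrep -i "datanode|tasktracker" | grep -v grep
```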
- Add new slave nodes (a combined sketch of editing the include files and refreshing the NameNode and JobTracker follows this step).
  - To add a new NameNode slave (DataNode):
    - On the NameNode host machine, edit the /etc/hadoop/conf/dfs.include file and add the list of slave nodes' hostnames (separated by a newline character).

      Important: Ensure that you create a new dfs.include file if the NameNode host machine does not have an existing copy of this file.
    - On the NameNode host machine, execute the following command:
      - su - hdfs -c "hadoop dfsadmin -refreshNodes"
 
  - To add a new JobTracker slave (TaskTracker):
    - On the JobTracker host machine, edit the /etc/hadoop/conf/mapred.include file and add the list of slave nodes' hostnames (separated by a newline character).

      Important: Ensure that you create a new mapred.include file if the JobTracker host machine does not have an existing copy of this file.
    - On the JobTracker host machine, execute the following command:
      - su - mapred -c "hadoop mradmin -refreshNodes"
 
 
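Putting the two sub-steps above together, the following is a hedged sketch of registering a new slave hostname and checking that the NameNode now reports it. The hostname is an assumption, and hadoop dfsadmin -report is used here only as a convenient way to list live DataNodes:

```
# Hypothetical hostname; append each new slave to the include files, one per line.
echo "slave03.example.com" >> /etc/hadoop/conf/dfs.include
echo "slave03.example.com" >> /etc/hadoop/conf/mapred.include

# Tell the NameNode and JobTracker to re-read their include files.
su - hdfs -c "hadoop dfsadmin -refreshNodes"
su - mapred -c "hadoop mradmin -refreshNodes"

# List DataNodes known to the NameNode; the new host should appear as a live node.
su - hdfs -c "hadoop dfsadmin -report"
```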
- Optional: Enable monitoring on the newly added slave nodes using the instructions provided here.
- Optional: Enable cluster alerting on the newly added slave nodes using the instructions provided here.


