Getting Ready to Upgrade
An HDP Stack upgrade involves upgrading from HDP 2.1 to HDP-2.5.3 and adding the new HDP-2.5.3 services. These instructions change your configurations.
| ![[Note]](../common/images/admon/note.png) | Note | 
|---|---|
| You must run `kinit` before executing the commands as any particular user. | 
Hardware recommendations
Although there is no single hardware requirement for installing HDP, there are some basic guidelines. The HDP packages for a complete installation of HDP-2.5.3 consume about 6.5 GB of disk space.
The first step is to make a backup copy of your HDP 2.1 configurations.
| ![[Note]](../common/images/admon/note.png) | Note | 
|---|---|
| The  | 
- Back up the HDP directories for any Hadoop components you have installed (an example backup command follows this list). The following is a list of all HDP configuration directories:
  - /etc/hadoop/conf
  - /etc/hbase/conf
  - /etc/hive-hcatalog/conf
  - /etc/hive-webhcat/conf
  - /etc/accumulo/conf
  - /etc/phoenix/conf
  - /etc/hive/conf
  - /etc/pig/conf
  - /etc/sqoop/conf
  - /etc/flume/conf
  - /etc/mahout/conf
  - /etc/oozie/conf
  - /etc/hue/conf
  - /etc/spark/conf
  - /etc/storm/conf
  - /etc/storm-slider/conf
  - /etc/zookeeper/conf
  - /etc/tez/conf
  - Optional: Back up your userlogs directories, `${mapred.local.dir}/userlogs`.
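The following is a minimal sketch of how these configuration directories might be archived. The backup location `/tmp/hdp21-config-backup` and the exact set of directories are assumptions; adjust both for your cluster.

```bash
# Minimal sketch: archive the HDP 2.1 configuration directories that exist on this host.
BACKUP_DIR=/tmp/hdp21-config-backup    # assumed backup location; adjust as needed
mkdir -p "$BACKUP_DIR"
for conf in /etc/hadoop/conf /etc/hbase/conf /etc/hive/conf /etc/oozie/conf \
            /etc/pig/conf /etc/sqoop/conf /etc/zookeeper/conf /etc/tez/conf; do
    # Only archive directories for components that are actually installed.
    [ -d "$conf" ] && tar -czf "$BACKUP_DIR/$(echo "$conf" | tr '/' '_').tar.gz" "$conf"
done
```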
 
- Navigate to the $HIVE_HOME/lib directory. Back up the JDBC JAR file for the type of Hive Metastore database you are using (PostgreSQL, MySQL, Oracle, and so on).
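For example, for a MySQL Metastore the copy might look like the following sketch. The JAR file name pattern and the backup location are assumptions that depend on your installation.

```bash
# Sketch: back up the Metastore JDBC driver (MySQL shown; substitute the JAR for your database).
cp $HIVE_HOME/lib/mysql-connector-java*.jar /tmp/hdp21-config-backup/
```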
- Run the `fsck` command as the HDFS Service user and fix any errors. (The resulting file contains a complete block map of the file system.)

  `su - hdfs -c "hdfs fsck / -files -blocks -locations > dfs-old-fsck-1.log"`
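As a quick sanity check, you might scan the resulting log for problem indicators before proceeding; the strings searched for below are assumptions based on typical fsck output.

```bash
# Sketch: scan the fsck report for problem indicators before proceeding.
grep -iE 'corrupt|missing|under replicated' dfs-old-fsck-1.log
# A clean report typically ends with: The filesystem under path '/' is HEALTHY
```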
- Use the following instructions to compare status before and after the upgrade. The following commands must be executed by the user running the HDFS service (by default, the user is hdfs).
  - Capture the complete namespace of the file system. (The following command does a recursive listing of the root file system.)

    ![[Important]](../common/images/admon/important.png) Important: Make sure the NameNode is started.

    `su - hdfs -c "hdfs dfs -ls -R / > dfs-old-lsr-1.log"`

    ![[Note]](../common/images/admon/note.png) Note: In secure mode you must have Kerberos credentials for the hdfs user.
  - Run the report command to create a list of DataNodes in the cluster.

    `su - hdfs -c "hdfs dfsadmin -report > dfs-old-report-1.log"`
  - Optional: You can copy all data, or only data that would otherwise be unrecoverable, from HDFS to a local file system or to a backup instance of HDFS.
  - Optional: You can also repeat steps 3 (a) through 3 (c) and compare the results with the previous run to verify that the state of the file system has not changed (a comparison sketch follows this step).
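A minimal sketch of such a comparison, assuming you repeat the listing and report commands after the upgrade into files named `dfs-new-lsr-1.log` and `dfs-new-report-1.log` (names chosen here for illustration):

```bash
# Sketch: compare the pre-upgrade listings with a post-upgrade run.
# The "new" file names are assumptions; use whatever names you chose for the second run.
diff dfs-old-lsr-1.log dfs-new-lsr-1.log        # namespace listing should be unchanged
diff dfs-old-report-1.log dfs-new-report-1.log  # DataNode count and capacity should match
```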
 
- Save the namespace by executing the following commands:

  `su - hdfs`

  `hdfs dfsadmin -safemode enter`

  `hdfs dfsadmin -saveNamespace`
- Back up your NameNode metadata. Copy the following checkpoint files into a backup directory. The NameNode metadata is stored in a directory specified in the hdfs-site.xml configuration file under the configuration value "dfs.namenode.name.dir". For example, if the configuration value is:

  `<property> <name>dfs.namenode.name.dir</name> <value>/hadoop/hdfs/namenode</value> </property>`

  then the NameNode metadata files are all housed inside the directory /hadoop/hdfs/namenode.
- Store the layoutVersion of the NameNode:

  `${dfs.namenode.name.dir}/current/VERSION`
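A minimal sketch of this backup, assuming a single metadata directory and the backup location used earlier (both assumptions):

```bash
# Sketch: copy the NameNode metadata directory (run as the HDFS service user).
# hdfs getconf resolves dfs.namenode.name.dir from the active configuration;
# it may return a comma-separated list if multiple directories are configured.
NN_DIR=$(hdfs getconf -confKey dfs.namenode.name.dir)
cp -r "$NN_DIR/current" /tmp/hdp21-config-backup/namenode-current
# Record the layoutVersion for later reference.
grep layoutVersion "$NN_DIR/current/VERSION"
```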
 
- Finalize any prior HDFS upgrade, if you have not done so already.

  `su - hdfs -c "hdfs dfsadmin -finalizeUpgrade"`
- If you have the Hive component installed, back up the Hive Metastore database. The following instructions are provided for your convenience. For the latest backup instructions, see your database documentation.

  Table 4.1. Hive Metastore Database Backup and Restore

  | Database Type | Backup | Restore |
  |---|---|---|
  | MySQL | `mysqldump $dbname > $outputfilename.sql` For example: `mysqldump hive > /tmp/mydir/backup_hive.sql` | `mysql $dbname < $inputfilename.sql` For example: `mysql hive < /tmp/mydir/backup_hive.sql` |
  | PostgreSQL | `sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump hive > /tmp/mydir/backup_hive.sql` | `sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql hive < /tmp/mydir/backup_hive.sql` |
  | Oracle | Export the database: `exp username/password@database full=yes file=output_file.dmp` | Import the database: `imp username/password@database file=input_file.dmp` |
- If you have the Oozie component installed, back up the Oozie metastore database. These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.

  Table 4.2. Oozie Metastore Database Backup and Restore

  | Database Type | Backup | Restore |
  |---|---|---|
  | MySQL | `mysqldump $dbname > $outputfilename.sql` For example: `mysqldump oozie > /tmp/mydir/backup_oozie.sql` | `mysql $dbname < $inputfilename.sql` For example: `mysql oozie < /tmp/mydir/backup_oozie.sql` |
  | PostgreSQL | `sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump oozie > /tmp/mydir/backup_oozie.sql` | `sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql oozie < /tmp/mydir/backup_oozie.sql` |
  | Oracle | Export the database: `exp username/password@database full=yes file=output_file.dmp` | Import the database: `imp username/password@database file=input_file.dmp` |
- Optional: Back up the Hue database. The following instructions are provided for your convenience. For the latest backup instructions, please see your database documentation. For database types that are not listed below, follow your vendor-specific instructions.

  Table 4.3. Hue Database Backup and Restore

  | Database Type | Backup | Restore |
  |---|---|---|
  | MySQL | `mysqldump $dbname > $outputfilename.sql` For example: `mysqldump hue > /tmp/mydir/backup_hue.sql` | `mysql $dbname < $inputfilename.sql` For example: `mysql hue < /tmp/mydir/backup_hue.sql` |
  | PostgreSQL | `sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump hue > /tmp/mydir/backup_hue.sql` | `sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql hue < /tmp/mydir/backup_hue.sql` |
  | Oracle | Connect to the Oracle database using sqlplus. Export the database. For example: `exp username/password@database full=yes file=output_file.dmp` | Import the database. For example: `imp username/password@database file=input_file.dmp` |
  | SQLite | `/etc/init.d/hue stop` `su $HUE_USER` `mkdir ~/hue_backup` `sqlite3 desktop.db .dump > ~/hue_backup/desktop.bak` `/etc/init.d/hue start` | `/etc/init.d/hue stop` `cd /var/lib/hue` `mv desktop.db desktop.db.old` `sqlite3 desktop.db < ~/hue_backup/desktop.bak` `/etc/init.d/hue start` |
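After taking the Hive, Oozie, and Hue dumps, you might confirm that each backup file exists and is non-empty before proceeding. The file names below simply match the examples in the tables above.

```bash
# Sketch: confirm the database dump files created above exist and are non-empty.
for f in /tmp/mydir/backup_hive.sql /tmp/mydir/backup_oozie.sql /tmp/mydir/backup_hue.sql; do
    [ -s "$f" ] && echo "OK: $f" || echo "MISSING OR EMPTY: $f"
done
```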
- Stop all services (including MapReduce) and client applications deployed on HDFS:
  - Knox: `cd /usr/lib/knox/` then `su knox -c "bin/gateway.sh stop"`
  - Oozie: `su $OOZIE_USER` then `/usr/lib/oozie/bin/oozied.sh stop`
  - WebHCat: `su - hcat -c "/usr/lib/hive-hcatalog/sbin/webhcat_server.sh stop"`
  - Hive: Run this command on the Hive Metastore and Hive Server2 host machine: `ps aux | awk '{print $1,$2}' | grep hive | awk '{print $2}' | xargs kill >/dev/null 2>&1`
  - HBase RegionServers: `su - hbase -c "/usr/lib/hbase/bin/hbase-daemon.sh --config /etc/hbase/conf stop regionserver"`
  - HBase Master host machine: `su - hbase -c "/usr/lib/hbase/bin/hbase-daemon.sh --config /etc/hbase/conf stop master"`
  - YARN:
    - Run this command on all NodeManagers: `su - yarn -c "export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config /etc/hadoop/conf stop nodemanager"`
    - Run this command on the History Server host machine: `su - mapred -c "export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-mapreduce/sbin/mr-jobhistory-daemon.sh --config /etc/hadoop/conf stop historyserver"`
    - Run this command on the ResourceManager host machine(s): `su - yarn -c "export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config /etc/hadoop/conf stop resourcemanager"`
    - Run this command on the YARN Timeline Server node: `su -l yarn -c "export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config /etc/hadoop/conf stop timelineserver"`
  - HDFS:
    - On all DataNodes: `su - hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop datanode"`
    - If you are not running a highly available HDFS cluster, stop the Secondary NameNode by executing this command on the Secondary NameNode host machine: `su - hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop secondarynamenode"`
    - On the NameNode host machine(s): `su - hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop namenode"`
    - If you are running NameNode HA, stop the ZooKeeper Failover Controllers (ZKFC) by executing this command on the NameNode host machine: `su - hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop zkfc"`
    - If you are running NameNode HA, stop the JournalNodes by executing these commands on the JournalNode host machines: `su - hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop journalnode"`
  - ZooKeeper host machines: `su - zookeeper -c "export ZOOCFGDIR=/etc/zookeeper/conf ; export ZOOCFG=zoo.cfg ; source /etc/zookeeper/conf/zookeeper-env.sh ; /usr/lib/zookeeper/bin/zkServer.sh stop"`
  - Ranger (XA Secure): `service xapolicymgr stop` then `service uxugsync stop`
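Before moving on, you may want to confirm that no Hadoop-related daemons are still running on each host. The process-name patterns below are illustrative assumptions covering the common HDP 2.1 daemons.

```bash
# Sketch: check a host for Hadoop-related daemons that are still running (patterns are illustrative).
ps aux | grep -iE 'namenode|datanode|resourcemanager|nodemanager|hbase|zookeeper|oozie|hiveserver2' | grep -v grep
# No output means the listed daemons on this host have stopped.
```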
- Verify that the edit logs in `${dfs.namenode.name.dir}/current/edits*` are empty. Run:

  `hdfs oev -i ${dfs.namenode.name.dir}/current/edits_inprogress_* -o edits.out`
- Verify the edits.out file. It should contain only the OP_START_LOG_SEGMENT transaction. For example:

  `<?xml version="1.0" encoding="UTF-8"?> <EDITS> <EDITS_VERSION>-56</EDITS_VERSION> <RECORD> <OPCODE>OP_START_LOG_SEGMENT</OPCODE> <DATA> <TXID>5749</TXID> </DATA> </RECORD>`
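A quick way to check this mechanically is sketched below; it simply lists any opcode other than OP_START_LOG_SEGMENT in the XML output and is an illustration, not part of the official procedure.

```bash
# Sketch: flag any opcode in edits.out other than OP_START_LOG_SEGMENT.
grep '<OPCODE>' edits.out | grep -v 'OP_START_LOG_SEGMENT'
# No output means the in-progress edit log carries no pending transactions.
```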
- If edits.out has transactions other than OP_START_LOG_SEGMENT, run the following steps and then verify that the edit logs are empty:
  - Start the existing version NameNode.
  - Ensure there is a new FS image file.
  - Shut the NameNode down:

    `hdfs dfsadmin -saveNamespace`
 
 
- Rename or delete any paths that are reserved in the new version of HDFS. When upgrading to a new version of HDFS, it is necessary to rename or delete any paths that are reserved in the new version of HDFS. If the NameNode encounters a reserved path during upgrade, it prints an error such as the following:

  `/.reserved is a reserved path and .snapshot is a reserved path component in this version of HDFS. Please rollback and delete or rename this path, or upgrade with the -renameReserved key-value pairs option to automatically rename these paths during upgrade.`

  Specifying `-upgrade -renameReserved` optional key-value pairs causes the NameNode to automatically rename any reserved paths found during startup.

  For example, to rename all paths named `.snapshot` to `.my-snapshot` and change paths named `.reserved` to `.my-reserved`, specify `-upgrade -renameReserved .snapshot=.my-snapshot,.reserved=.my-reserved`.

  If no key-value pairs are specified with `-renameReserved`, the NameNode then suffixes reserved paths with `.<LAYOUT-VERSION>.UPGRADE_RENAMED`, for example: `.snapshot.-51.UPGRADE_RENAMED`.

  ![[Note]](../common/images/admon/note.png) Note: We recommend that you perform a `-saveNamespace` before renaming paths (running `-saveNamespace` appears in a previous step in this procedure), because a data inconsistency can result if an edit log operation refers to the destination of an automatically renamed file. Also note that running `-renameReserved` renames all applicable existing files in the cluster. This may impact cluster applications.
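Before starting the upgrade, you might check whether any reserved path components already exist in the namespace, using the listing captured earlier. This check is an illustration and assumes the dfs-old-lsr-1.log file from the earlier step.

```bash
# Sketch: search the saved namespace listing for reserved path components (.reserved, .snapshot).
grep -E '/\.(reserved|snapshot)(/|$)' dfs-old-lsr-1.log
# Any matches should be renamed or deleted, or handled with -upgrade -renameReserved.
```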
- Upgrade the JDK on all nodes to JDK 7 or JDK 8 before upgrading HDP. 
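A simple way to confirm the installed JDK on each node, shown here only as an illustration:

```bash
# Sketch: confirm the JDK version on a node (expect 1.7.x or 1.8.x before upgrading HDP).
java -version 2>&1 | head -n 1
```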

