Getting Ready to Upgrade
An HDP Stack upgrade involves upgrading from HDP 2.0 to HDP 2.2 and adding the new HDP 2.2 services.
The first step is to ensure you keep a backup copy of your HDP 2.0 configurations.
| ![[Note]](../common/images/admon/note.png) | Note |
|---|---|
| In a secure (Kerberos-enabled) cluster, you must run kinit to obtain credentials before running the commands as a particular user. |
- Hardware recommendations - Although there is no single hardware requirement for installing HDP, there are some basic guidelines. The HDP packages for a complete installation of HDP 2.2 will take up about 2.5 GB of disk space. 
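  The 2.5 GB figure refers to free space on the filesystem that will hold the packages. The following is only an illustrative pre-check, not part of the official procedure; /usr is an assumed install location, so adjust the path for your environment.

  ```bash
  # Illustrative pre-check (not part of the official procedure): confirm there is
  # roughly 2.5 GB free on the filesystem that will hold the HDP 2.2 packages.
  # /usr is an assumed install location; adjust for your environment.
  avail_mb=$(df -Pm /usr | awk 'NR==2 {print $4}')
  if [ "$avail_mb" -lt 2560 ]; then
    echo "WARNING: only ${avail_mb} MB free on /usr; HDP 2.2 needs about 2.5 GB."
  fi
  ```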
- Back up the following HDP directories (a sample backup command follows this list): 
- /etc/hadoop/conf 
- /etc/hbase/conf 
- /etc/hive-hcatalog/conf 
- /etc/hive/conf 
- /etc/pig/conf 
- /etc/sqoop/conf 
- /etc/flume/conf 
- /etc/mahout/conf 
- /etc/oozie/conf 
- /etc/hue/conf 
- /etc/zookeeper/conf 
- Optional: Back up your userlogs directories, ${mapred.local.dir}/userlogs. 
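  As referenced above, the following sketch archives the existing configuration directories into timestamped tarballs before upgrading. The /root/hdp20-conf-backup destination is an example only, and directories for components that are not installed on a node are simply skipped.

  ```bash
  # Illustrative sketch: archive the existing configuration directories listed above.
  # The destination directory is an example; adjust paths for your environment.
  BACKUP_DIR=/root/hdp20-conf-backup
  mkdir -p "$BACKUP_DIR"
  for d in /etc/hadoop/conf /etc/hbase/conf /etc/hive-hcatalog/conf /etc/hive/conf \
           /etc/pig/conf /etc/sqoop/conf /etc/flume/conf /etc/mahout/conf \
           /etc/oozie/conf /etc/hue/conf /etc/zookeeper/conf; do
    # Skip components that are not installed on this node.
    [ -d "$d" ] && tar czf "$BACKUP_DIR/$(echo "$d" | tr / _)_$(date +%Y%m%d).tgz" "$d"
  done
  ```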
 
- Run the fsck command as the HDFS Service user and fix any errors. (The resulting file contains a complete block map of the file system.)

  su -l <HDFS_USER>
  hdfs fsck / -files -blocks -locations > dfs-old-fsck-1.log

  where <HDFS_USER> is the HDFS Service user. For example, hdfs.
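  One quick way to scan the capture for problems is sketched below; it is a convenience only and does not replace reviewing the full report.

  ```bash
  # Scan the fsck capture for the overall status and any corrupt or missing
  # blocks; resolve any problems before continuing with the upgrade.
  grep -E 'Status:|CORRUPT|MISSING' dfs-old-fsck-1.log
  ```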
- Use the following instructions to compare status before and after the upgrade. The following commands must be executed by the user running the HDFS service (by default, the user is hdfs). 
- Capture the complete namespace of the file system. (The second command does a recursive listing of the root file system.)

  su -l <HDFS_USER>
  hdfs dfs -ls -R / > dfs-old-lsr-1.log

  | ![[Note]](../common/images/admon/note.png) | Note |
  |---|---|
  | In secure mode you must have Kerberos credentials for the hdfs user. |
- Run the report command to create a list of DataNodes in the cluster.

  su -l <HDFS_USER>
  hdfs dfsadmin -report > dfs-old-report-1.log
- Optional: You can copy all, or only the unrecoverable, data stored in HDFS to a local file system or to a backup instance of HDFS. 
- Optional: You can also repeat steps 3 (a) through 3 (c) and compare the results with the previous run to ensure the state of the file system remained unchanged (see the comparison sketch below). 
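  If you do repeat the capture, a simple diff of the two runs is one way to confirm nothing changed. The *-2.log names below are assumptions for the second run; note that the dfsadmin report includes usage counters that change over time, so focus on the DataNode list rather than expecting an empty diff there.

  ```bash
  # Illustrative comparison of the first and second captures. The *-2.log names
  # assume a second run of the same commands; an empty diff of the namespace
  # listings means the file system state is unchanged.
  diff dfs-old-lsr-1.log dfs-old-lsr-2.log
  diff dfs-old-report-1.log dfs-old-report-2.log
  ```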
 
- As the HDFS user, save the namespace by executing the following commands:

  su -l <HDFS_USER>
  hdfs dfsadmin -safemode enter
  hdfs dfsadmin -saveNamespace
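  Before running saveNamespace, you can confirm that the NameNode actually entered safe mode; this check is a convenience, not part of the documented procedure.

  ```bash
  # Confirm the NameNode is in safe mode before saving the namespace;
  # the expected output is "Safe mode is ON".
  hdfs dfsadmin -safemode get
  ```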
- Back up your NameNode metadata (a combined backup sketch follows this list). 
- Copy the following checkpoint files into a backup directory: 
- dfs.namenode.name.dir/current/edits_*
- dfs.namenode.name.dir/current/fsimage
- dfs.namenode.name.dir/current/fsimage_*
- Store the layoutVersion of the NameNode: 
- ${dfs.namenode.name.dir}/current/VERSION
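  As referenced above, the following sketch copies the checkpoint files and the VERSION file (which records the layoutVersion) to a backup location. The NAME_DIR value is an assumption; it must match dfs.namenode.name.dir in hdfs-site.xml, and the backup destination is an example only.

  ```bash
  # Illustrative NameNode metadata backup. NAME_DIR must match the configured
  # dfs.namenode.name.dir; the backup destination is an example.
  NAME_DIR=/hadoop/hdfs/namenode          # assumed value of dfs.namenode.name.dir
  BACKUP_DIR=/root/nn-metadata-backup
  mkdir -p "$BACKUP_DIR"
  cp -p "$NAME_DIR"/current/edits_*  "$BACKUP_DIR"/
  cp -p "$NAME_DIR"/current/fsimage* "$BACKUP_DIR"/
  cp -p "$NAME_DIR"/current/VERSION  "$BACKUP_DIR"/   # records the layoutVersion
  ```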
 
- Finalize any PRIOR HDFS upgrade, if you have not done so already.

  su -l <HDFS_USER>
  hdfs dfsadmin -finalizeUpgrade
- Optional: Back up the Hive Metastore database.

  The following instructions are provided for your convenience. For the latest backup instructions, please see your database documentation.

  Table 2.1. Hive Metastore Database Backup and Restore

  | Database Type | Backup | Restore |
  |---|---|---|
  | MySQL | mysqldump $dbname > $outputfilename.sql For example: mysqldump hive > /tmp/mydir/backup_hive.sql | mysql $dbname < $inputfilename.sql For example: mysql hive < /tmp/mydir/backup_hive.sql |
  | Postgres | sudo -u $username pg_dump $databasename > $outputfilename.sql For example: sudo -u postgres pg_dump hive > /tmp/mydir/backup_hive.sql | sudo -u $username psql $databasename < $inputfilename.sql For example: sudo -u postgres psql hive < /tmp/mydir/backup_hive.sql |
  | Oracle | Connect to the Oracle database using sqlplus and export the database: exp username/password@database full=yes file=output_file.dmp | Import the database: imp username/password@database file=input_file.dmp |
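  Whichever database you use, it is worth confirming that the dump completed and is not empty before moving on. For the MySQL example above (assuming the /tmp/mydir/backup_hive.sql path), a minimal check looks like this.

  ```bash
  # Sanity check for the MySQL example above: the dump file should be non-empty
  # and, for mysqldump, typically ends with a "Dump completed" comment line.
  ls -lh /tmp/mydir/backup_hive.sql
  tail -n 1 /tmp/mydir/backup_hive.sql
  ```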
- Optional: Back up the Oozie metastore database.

  These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.

  Table 2.2. Oozie Metastore Database Backup and Restore

  | Database Type | Backup | Restore |
  |---|---|---|
  | MySQL | mysqldump $dbname > $outputfilename.sql For example: mysqldump oozie > /tmp/mydir/backup_oozie.sql | mysql $dbname < $inputfilename.sql For example: mysql oozie < /tmp/mydir/backup_oozie.sql |
  | Postgres | sudo -u $username pg_dump $databasename > $outputfilename.sql For example: sudo -u postgres pg_dump oozie > /tmp/mydir/backup_oozie.sql | sudo -u $username psql $databasename < $inputfilename.sql For example: sudo -u postgres psql oozie < /tmp/mydir/backup_oozie.sql |
- Optional: Back up the Hue database.

  The following instructions are provided for your convenience. For the latest backup instructions, please see your database documentation. For database types that are not listed below, follow your vendor-specific instructions.

  Table 2.3. Hue Database Backup and Restore

  | Database Type | Backup | Restore |
  |---|---|---|
  | MySQL | mysqldump $dbname > $outputfilename.sql For example: mysqldump hue > /tmp/mydir/backup_hue.sql | mysql $dbname < $inputfilename.sql For example: mysql hue < /tmp/mydir/backup_hue.sql |
  | Postgres | sudo -u $username pg_dump $databasename > $outputfilename.sql For example: sudo -u postgres pg_dump hue > /tmp/mydir/backup_hue.sql | sudo -u $username psql $databasename < $inputfilename.sql For example: sudo -u postgres psql hue < /tmp/mydir/backup_hue.sql |
  | Oracle | Connect to the Oracle database using sqlplus and export the database: exp username/password@database full=yes file=output_file.dmp | Import the database: imp username/password@database file=input_file.dmp |
  | SQLite | /etc/init.d/hue stop <br> su $HUE_USER <br> mkdir ~/hue_backup <br> sqlite3 desktop.db .dump > ~/hue_backup/desktop.bak <br> /etc/init.d/hue start | /etc/init.d/hue stop <br> cd /var/lib/hue <br> mv desktop.db desktop.db.old <br> sqlite3 desktop.db < ~/hue_backup/desktop.bak <br> /etc/init.d/hue start |
- Stop all services (including MapReduce) and client applications deployed on HDFS, using the instructions provided in Stopping HDP Services. 
- Verify that the edit logs in ${dfs.namenode.name.dir}/current/edits* are empty. 
- Run:

  hdfs oev -i ${dfs.namenode.name.dir}/current/edits_inprogress_* -o edits.out

- Verify the edits.out file. It should contain only the OP_START_LOG_SEGMENT transaction. For example:

  <?xml version="1.0" encoding="UTF-8"?>
  <EDITS>
    <EDITS_VERSION>-56</EDITS_VERSION>
    <RECORD>
      <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
      <DATA>
        <TXID>5749</TXID>
      </DATA>
    </RECORD>
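  A quick way to confirm this is to list the opcodes that appear in edits.out; this grep is a convenience sketch only.

  ```bash
  # List the opcodes recorded in edits.out; only OP_START_LOG_SEGMENT should appear.
  grep '<OPCODE>' edits.out | sort | uniq -c
  ```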
- If edits.out has transactions other than OP_START_LOG_SEGMENT, run the following steps and then verify that the edit logs are empty. 
- Start the existing version NameNode. 
- Ensure there is a new FS image file. 
- Shut the NameNode down:

  hdfs dfsadmin -saveNamespace
 
 
- Rename or delete any paths that are reserved in the new version of HDFS.

  When upgrading to a new version of HDFS, it is necessary to rename or delete any paths that are reserved in the new version of HDFS. If the NameNode encounters a reserved path during upgrade, it prints an error such as the following:

  /.reserved is a reserved path and .snapshot is a reserved path component in this version of HDFS. Please rollback and delete or rename this path, or upgrade with the -renameReserved key-value pairs option to automatically rename these paths during upgrade.

  Specifying -upgrade -renameReserved with optional key-value pairs causes the NameNode to automatically rename any reserved paths found during startup. For example, to rename all paths named .snapshot to .my-snapshot and change paths named .reserved to .my-reserved, specify:

  -upgrade -renameReserved .snapshot=.my-snapshot,.reserved=.my-reserved

  If no key-value pairs are specified with -renameReserved, the NameNode suffixes reserved paths with .<LAYOUT-VERSION>.UPGRADE_RENAMED. For example: .snapshot.-51.UPGRADE_RENAMED.

  | ![[Note]](../common/images/admon/note.png) | Note |
  |---|---|
  | We recommend that you perform a -saveNamespace before renaming paths (running -saveNamespace appears in a previous step in this procedure), because a data inconsistency can result if an edit log operation refers to the destination of an automatically renamed file. Also note that running -renameReserved renames all applicable existing files in the cluster, which may impact cluster applications. |
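  Before starting the upgrade, one way to look for reserved names is to search the namespace capture taken earlier in this procedure (dfs-old-lsr-1.log). This is only an illustrative check and does not replace the -renameReserved handling described above.

  ```bash
  # Illustrative pre-check using the namespace capture from an earlier step:
  # look for path components that are reserved in the new HDFS version.
  grep -E '/\.(reserved|snapshot)(/|$| )' dfs-old-lsr-1.log
  ```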
- Get the public GPG keys used to verify the HDP repository packages. 

