Security in Hadoop
With Hadoop’s new security features and its integration with Kerberos, it is possible to verify that users are who they claim to be and to ensure that they can access only the data and resources they are authorized to use. This lets corporations grant finer-grained access to information and reduce operational overhead by consolidating their separate clusters.
Secure Hadoop clusters provide the following protections:
- Prevent unauthorized access to HDFS and MapReduce communication 
- Prevent unauthorized access to the jobs submitted through Oozie 
- Prevent fraudulent servers from accessing your Hadoop cluster
- Prevent impersonation attacks 
- Prevent access to root accounts 
Deployment options for a secure Hadoop cluster
Depending on your environment, you can install a secure Hadoop cluster in one of two ways:
- OPTION I: Set up a new Kerberos Key Distribution Center - Use the auxiliary script setupKerberos.sh. This script performs the following tasks:
  - Sets up a new Key Distribution Center (KDC) on the host machine specified in the kdcserver file.
  - Creates service keytabs for all processes: NameNode, JobTracker, Secondary NameNode, DataNodes, TaskTrackers, HBase Master, HBase RegionServer, and Hive Metastore.
  - Places the service keytabs for each host under the /etc/security/keytabs directory.
  - Generates user keytabs for the HDFS and Smoke Test users and places these keytab files in the /tmp directory on all the nodes.
 
- OPTION II: Add an existing Kerberos Key Distribution Center - If your environment already has a Kerberos Key Distribution Center, you can use it for your Hadoop cluster instead of setting up a new one.
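To make the keytab steps in OPTION I more concrete, the following is a minimal sketch of the kind of MIT Kerberos commands that create a service principal and export its keytab. It is not the contents of setupKerberos.sh. The realm, the host names, the short service names (nn, dn), and the keytab file naming are all assumptions made for illustration. The script only prints the kadmin.local command lines and does not run them.

```python
from typing import Dict, List

# Assumed values for illustration only; they do not come from the product.
REALM = "EXAMPLE.COM"
KEYTAB_DIR = "/etc/security/keytabs"  # service keytab directory named in the text

# Hypothetical mapping of service short names to the hosts that run them.
SERVICE_HOSTS: Dict[str, List[str]] = {
    "nn": ["master1.example.com"],
    "dn": ["worker1.example.com", "worker2.example.com"],
}


def principal(service: str, host: str, realm: str = REALM) -> str:
    """Build a service principal of the form service/host@REALM."""
    return f"{service}/{host}@{realm}"


def keytab_commands(service_hosts: Dict[str, List[str]]) -> List[str]:
    """Return kadmin.local command lines as strings; nothing is executed."""
    commands: List[str] = []
    for service, hosts in service_hosts.items():
        for host in hosts:
            princ = principal(service, host)
            # Hypothetical keytab file name: one file per service and host.
            keytab = f"{KEYTAB_DIR}/{service}.{host}.keytab"
            # Create the principal with a random key, then export it to a keytab.
            commands.append(f'kadmin.local -q "addprinc -randkey {princ}"')
            commands.append(f'kadmin.local -q "ktadd -k {keytab} {princ}"')
    return commands


if __name__ == "__main__":
    for line in keytab_commands(SERVICE_HOSTS):
        print(line)
```

Generating the command lines as plain strings keeps the sketch safe to run on a machine without a KDC. On a real KDC host, an administrator would review the output before running any of it.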


