1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
BEST PRACTICES FOR ENTERPRISE USER MANAGEMENT IN HADOOP ENVIRONMENT
Sailaja Polavarapu, Sr. Software Engineer, Hortonworks
Don Bosco Durai, Cofounder & Chief Security Architect, Privacera
Dataworks Summit 2017 Munich
Don Bosco Durai
⬢ Cofounder and Chief Security Architect at Privacera
⬢ Committer in Apache Ranger and Apache Ambari
⬢ Contributor in most Apache projects for security
Sailaja Polavarapu
⬢ Apache Ranger contributor since 2015
⬢ Apache Ranger Committer
⬢ Contributed major improvements for Usersync module in Ranger
⬢ Currently working at Hortonworks Security Team
⬢ Contact: [email protected]
Agenda
◆ Authentication and Users in Hadoop
◆ Integrating Ranger with AD/LDAP
◆ Common Use cases
◆ LDAP connection check tool
◆ Best practices
◆ Demo
Most commonly asked question
If I have Ranger, do I need Kerberos?
Why Authenticate Users?
Authentication
Authorization
Auditing
Service Types
Infrastructure: HDFS, Oozie, Storm, YARN, Hive Server, HBase, Zookeeper, Kafka
Apps: Zeppelin, Ambari Views, Ambari Admin, Ranger, Atlas, LogSearch
Infrastructure - Kerberos
[Diagram: Users connect to a Master Node running the YARN Resource Manager, Hive Server, and HDFS Name Node. Node 1 and Node 2 each run a YARN Node Manager and an HDFS Data Node as Linux processes. The interactions are numbered 1 to 6.]
Apps - Username & Password
[Diagram components: Portals, Notebooks/Viewer, Hive Server2, Zeppelin, Ambari Views, HDFS, Ambari, Atlas, Ranger, BI Tools, Spark]
Knox - Gateway & SSO
Services: Ambari, WebHDFS (HDFS), Templeton (HCatalog), Stargate (HBase), Oozie, Hive/JDBC, Yarn RM, Storm
Services UIs: Name Node UI, Job History UI, Oozie UI, HBase UI, Yarn UI, Spark UI, Ambari UI, Ranger Admin Console
Authentication and User Source
Hive JDBC: LDAP/Kerberos
Web Apps (Zeppelin, Ranger, Ambari, Atlas): LDAP
CLI/API (HDFS, Hive Beeline, HBase, etc.): Kerberos
User/Group Synchronization in Ranger
[Diagram: Ranger UserSync syncs users and groups from AD/LDAP to Ranger Admin, which stores them in its database.]
User sources
⬢ AD/LDAP
– Syncs users and groups from LDAP Organizational Units (OU)
⬢ Unix Native Users
– Syncs users and groups from /etc/passwd and /etc/group files
⬢ File Sources
– Syncs users and groups from a file specified in the configuration
– Supports several file formats, such as CSV and JSON
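To make the Unix source concrete, here is a minimal Python sketch of how user-to-group memberships can be derived from passwd-style and group-style text. It is an illustration only, not Ranger's implementation. The function name parse_unix_sources and the sample entries are hypothetical.

```python
def parse_unix_sources(passwd_text, group_text):
    """Return {username: sorted group names} from passwd/group formatted text."""
    # passwd format: name:password:uid:gid:gecos:home:shell
    primary_gid = {}
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if line.startswith("#") or len(fields) < 4:
            continue  # skip comments and malformed lines
        primary_gid[fields[0]] = fields[3]

    memberships = {user: set() for user in primary_gid}
    # group format: name:password:gid:member1,member2
    for line in group_text.splitlines():
        fields = line.split(":")
        if line.startswith("#") or len(fields) < 4:
            continue
        name, gid, members = fields[0], fields[2], fields[3]
        # Explicit (supplementary) members listed on the group line.
        for user in filter(None, members.split(",")):
            memberships.setdefault(user, set()).add(name)
        # Users whose primary gid is this group.
        for user, user_gid in primary_gid.items():
            if user_gid == gid:
                memberships[user].add(name)
    return {user: sorted(groups) for user, groups in memberships.items()}


# Hypothetical sample data in /etc/passwd and /etc/group format.
sample_passwd = (
    "jdoe:x:1001:2001:John Doe:/home/jdoe:/bin/bash\n"
    "bhall:x:1002:2002:Bob Hall:/home/bhall:/bin/bash\n"
)
sample_group = "hdp_admin:x:2001:\ndev_ops:x:2002:jdoe\n"

result = parse_unix_sources(sample_passwd, sample_group)
print(result)
```

The sketch counts both the primary group (the gid field in passwd) and supplementary memberships (the member list in group), since a user can belong to a group through either.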
Integrating Ranger with AD/LDAP
⬢ Understanding your deployment
– What kind of directory server is it: Active Directory, OpenLDAP, etc.?
– Is the communication between the Hadoop cluster and the directory server secure or unsecured?
– Do you have at least a read-only LDAP user for binding?
– Are there any firewall restrictions on communication between Hadoop and the directory server?
– Is Centrify being used as an LDAP proxy?
– Does your AD have spaces or special characters in usernames?
Requirements for User Management
⬢ Gathering details of the directory server structure
– AD/LDAP URL and bind credentials
– Any specific OU(s) for Hadoop users and groups?
– How many users and groups are in the domain and/or in the OUs?
– What filters for user search and/or group search should be configured to limit the users and groups synced to Hadoop?
– Which attributes are available on the directory server for users and groups, such as uid, sAMAccountName, memberof, objectclass, etc.?
– Are authorization policies to be configured at the user level or the group level?
Sample Active Directory Server Structure
DC=ad01,DC=hadoop,DC=com
  OU=Hadoop Users
    sAMAccountName=jdoe cn=John Doe
    sAMAccountName=bhall cn=Bob Hall
    sAMAccountName=asmith cn=Andy Smith
    sAMAccountName=acaroll cn=Ashley Caroll
  OU=Hadoop Groups
    cn=hdp_testing
    cn=dev_ops
    cn=hdp_admin
(|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))
Use Case
⬢ Sync all users that belong to the groups “hdp_testing”, “hdp_admin”, or “dev_ops”
User based Search
⬢ Filter based on the “memberof” attribute of the user
(|
  (memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)
  (memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)
  (memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com)
)
User name attribute: sAMAccountName
User search filter: (|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))
User search base: OU=Hadoop Users,dc=hortonworks,dc=com
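When the list of groups changes often, a “memberof” filter like this one can be generated instead of typed by hand. The Python sketch below builds the OR filter from a list of group names and a group base DN. The helper names escape_filter_value and member_of_filter are hypothetical. The escaping follows the LDAP filter string rules (RFC 4515). The sketch assumes group names contain no characters that need DN escaping, such as commas.

```python
def escape_filter_value(value):
    """Escape the characters that are reserved inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def member_of_filter(group_names, group_base):
    """Build a filter that matches users who are members of any listed group."""
    if not group_names:
        raise ValueError("at least one group name is required")
    clauses = [
        f"(memberof=cn={escape_filter_value(name)},{group_base})"
        for name in group_names
    ]
    # A single clause needs no OR wrapper.
    if len(clauses) == 1:
        return clauses[0]
    return "(|" + "".join(clauses) + ")"


user_filter = member_of_filter(
    ["hdp_testing", "hdp_admin", "dev_ops"],
    "ou=Hadoop Groups,dc=hortonworks,dc=com",
)
print(user_filter)
```

For the three groups in the use case, this produces the same filter string as the one shown on the slide.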
Group based Search
⬢ Filter based on the group name or “cn” attribute of the group
(|(cn=hdp_*)(cn=dev_*))
Group name attribute: cn
Group search base: OU=Hadoop Groups,dc=hortonworks,dc=com
Group member attribute: member
Group search filter: (|(cn=dev_*)(cn=hdp_*))
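Before applying a wildcard group filter, it can help to preview which group names it would select. The Python sketch below emulates the cn wildcard patterns locally against a list of names. It is an approximation for planning purposes, not a substitute for querying the directory. The function names and the extra group "finance" are hypothetical, and case-insensitive matching is an assumption based on how cn is commonly compared.

```python
import re


def cn_pattern_to_regex(pattern):
    """Translate an LDAP substring pattern such as 'hdp_*' into a regex."""
    # Only '*' is a wildcard; everything else is matched literally.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE)


def preview_group_sync(group_names, patterns):
    """Return the group names that match at least one cn pattern."""
    regexes = [cn_pattern_to_regex(p) for p in patterns]
    return [
        name for name in group_names
        if any(regex.fullmatch(name) for regex in regexes)
    ]


groups = ["hdp_testing", "hdp_admin", "dev_ops", "finance"]
matched = preview_group_sync(groups, ["hdp_*", "dev_*"])
print(matched)
```

With the sample structure, the three Hadoop groups match and the unrelated group is left out.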
LDAP connection check tool
⬢ Command line tool
⬢ Used for
– Discovering various LDAP attributes
– Validating the LDAP settings in Ranger, Ambari, or HDFS LDAP Group Mapping
– Retrieving the total number of users and/or groups
⬢ Available as part of the Ranger installation
⬢ Requires basic information such as the LDAP URL, bind credentials, etc., supplied through either
– A command line interface
– A template properties file to update with the values specific to the setup
Tool usage
⬢ usage: run.sh
  -a          ignore authentication properties
  -d <arg>    {all|users|groups}
  -h          show help.
  -i <arg>    Input file name
  -o <arg>    Output directory
  -r <arg>    {all|users|groups}
⬢ All of the above parameters are optional
CLI option for the LDAP tool
⬢ A CLI is provided when an input file is not specified:
  Ldap url [ldap://ldap.example.com:389]:
  Bind DN [cn=admin,ou=users,dc=example,dc=com]:
  Bind Password:
  User Search Base [ou=users,dc=example,dc=com]:
  User Search Filter [cn=user1]:
  Sample Authentication User [user1]:
  Sample Authentication Password:
Demo
Best practices and Strategies
⬢ Use LDAP/AD for application service authentication
⬢ Use Ranger for authorization
⬢ Verify that the truststore certs are updated across the system when SSL is used
⬢ Use the LDAP connection check tool to
– discover LDAP configuration attributes
– verify the number of users and groups to be synced to Ranger
⬢ Verify that case conversion and special characters in user and group names are handled uniformly across the Hadoop environment
– Matching rules must be used in core-site.xml as well as in Ranger
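The last point can be checked ahead of time with a small script. The Python sketch below compares how two components would render the same directory names under their case conversion settings and lists the names that would end up different. The mode names "lower", "upper", and "none", the function names, and the mixed-case sample name "BHall" are assumptions for illustration. They are not taken from any specific configuration file.

```python
def convert_case(name, mode):
    """Apply a case conversion mode to a user or group name."""
    if mode == "lower":
        return name.lower()
    if mode == "upper":
        return name.upper()
    if mode == "none":
        return name
    raise ValueError(f"unknown case conversion mode: {mode}")


def find_mismatches(names, mode_a, mode_b):
    """Return names that two components would render differently."""
    return [
        name for name in names
        if convert_case(name, mode_a) != convert_case(name, mode_b)
    ]


# Hypothetical example: one component lowercases names, the other does not.
directory_names = ["jdoe", "BHall", "asmith"]
mismatched = find_mismatches(directory_names, "lower", "none")
print(mismatched)
```

Any name in the result would be seen as two different users by the two components, so a policy written for one form would not match the other.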