Best Practices for Enterprise User Management in Hadoop Environment

Page 1: Best Practices for Enterprise User Management in Hadoop Environment

1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

BEST PRACTICES FOR ENTERPRISE USER MANAGEMENT IN HADOOP ENVIRONMENT

Sailaja Polavarapu
Sr. Software Engineer, Hortonworks

Dataworks Summit 2017 Munich

Don Bosco Durai
Cofounder & Chief Security Architect, Privacera

Page 2: Best Practices for Enterprise User Management in Hadoop Environment

Don Bosco Durai

⬢ Cofounder and Chief Security Architect at Privacera

⬢ Committer on Apache Ranger and Apache Ambari

⬢ Contributor to security work in most Apache projects

Page 3: Best Practices for Enterprise User Management in Hadoop Environment

Sailaja Polavarapu

⬢ Apache Ranger contributor since 2015

⬢ Apache Ranger Committer

⬢ Contributed major improvements to the Usersync module in Ranger

⬢ Currently working on the Hortonworks Security Team

⬢ Contact: [email protected]

Page 4: Best Practices for Enterprise User Management in Hadoop Environment

Agenda

◆ Authentication and Users in Hadoop

◆ Integrating Ranger with AD/LDAP

◆ Common Use cases

◆ LDAP connection check tool

◆ Best practices

◆ Demo

Page 5: Best Practices for Enterprise User Management in Hadoop Environment

Most commonly asked question

If I have Ranger, do I need Kerberos?

Page 6: Best Practices for Enterprise User Management in Hadoop Environment

Why Authenticate Users?

Authentication

Authorization

Auditing

Page 7: Best Practices for Enterprise User Management in Hadoop Environment

Service Types

Infrastructure: HDFS, Oozie, Storm, YARN, Hive Server, HBase, Zookeeper, Kafka

Apps: Zeppelin, Ambari Views, Ambari Admin, Ranger, Atlas, LogSearch

Page 8: Best Practices for Enterprise User Management in Hadoop Environment

Infrastructure - Kerberos

[Diagram: Users; a Master Node running YARN Resource Manager, Hive Server, and HDFS Name Node; Node 1 and Node 2, each running a YARN Node Manager and an HDFS Data Node as Linux processes. Callouts numbered 1 through 6 mark steps in the diagram.]

Page 9: Best Practices for Enterprise User Management in Hadoop Environment

Apps - Username & Password

[Diagram components: Portals, Notebooks/Viewer, Hive Server2, Zeppelin, Ambari Views, HDFS, Ambari, Atlas, Ranger, BI Tools, Spark]

Page 10: Best Practices for Enterprise User Management in Hadoop Environment

Knox - Gateway & SSO

Services: Ambari, WebHDFS (HDFS), Templeton (HCatalog), Stargate (HBase), Oozie, Hive/JDBC, Yarn RM, Storm

UIs: Name Node UI, Job History UI, Oozie UI, HBase UI, Yarn UI, Spark UI, Ambari UI, Ranger Admin Console

Page 11: Best Practices for Enterprise User Management in Hadoop Environment

Authentication and User Source

Hive JDBC: LDAP/Kerberos

Web Apps (Zeppelin, Ranger, Ambari, Atlas): LDAP

CLI/API (HDFS, Hive Beeline, HBase, etc.): Kerberos

Page 12: Best Practices for Enterprise User Management in Hadoop Environment

User/Group Synchronization in Ranger

[Diagram components: AD/LDAP, Ranger UserSync, Ranger Admin, Database. Label: Sync Users/Groups.]

Page 13: Best Practices for Enterprise User Management in Hadoop Environment

User sources

⬢ AD/LDAP
– Syncs users and groups from LDAP Organizational Units (OU)

⬢ Unix Native Users
– Syncs users and groups from the /etc/passwd and /etc/group files

⬢ File Sources
– Syncs users and groups from a file specified in the configuration
– Supports file formats such as CSV and JSON

Page 14: Best Practices for Enterprise User Management in Hadoop Environment

Integrating Ranger with AD/LDAP

⬢ Understanding your deployment
– What kind of directory server is it: Active Directory, OpenLDAP server, etc.?
– Is the communication between the Hadoop cluster and the directory server secure or unsecured?
– Do you have at least a read-only LDAP user for binding?
– Are there any firewall restrictions on communication between Hadoop and the directory server?
– Is Centrify being used as an LDAP proxy?
– Does your AD have spaces or special characters in usernames?

Page 15: Best Practices for Enterprise User Management in Hadoop Environment

Requirements for User Management

⬢ Gathering details of the directory server structure
– AD/LDAP URL and bind credentials
– Are there specific OU(s) for Hadoop users and groups?
– How many users and groups are in the domain and/or in the OUs?
– What user search and/or group search filters should be configured to limit the users and groups synced to Hadoop?
– Which attributes are available on the directory server for users and groups, such as uid, sAMAccountName, memberof, objectclass, etc.?
– Will authorization policies be configured at the user level or the group level?

Page 16: Best Practices for Enterprise User Management in Hadoop Environment

Sample Active Directory Server Structure

DC=ad01,DC=hadoop,DC=com
  OU=Hadoop Users
    sAMAccountName=jdoe cn=John Doe
    sAMAccountName=bhall cn=Bob Hall
    sAMAccountName=asmith cn=Andy Smith
    sAMAccountName=acaroll cn=Ashley Caroll
  OU=Hadoop Groups
    cn=hdp_testing
    cn=dev_ops
    cn=hdp_admin

(|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))

sAMAccountName=jdoe cn=John Doe
sAMAccountName=bhall cn=Bob Hall
sAMAccountName=asmith cn=Andy Smith
sAMAccountName=acaroll cn=Ashley Caroll

Page 17: Best Practices for Enterprise User Management in Hadoop Environment

Use Case

⬢ Sync all users that belong to any of the groups “hdp_testing”, “hdp_admin”, or “dev_ops”

Page 18: Best Practices for Enterprise User Management in Hadoop Environment

Page 19: Best Practices for Enterprise User Management in Hadoop Environment

User based Search

⬢ Filter based on “memberof” attribute of the user

Page 20: Best Practices for Enterprise User Management in Hadoop Environment

(|
  (memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)
  (memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)
  (memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com)
)
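An OR filter like the one above is tedious to write by hand once the group list grows. The following is a minimal Python sketch of how it could be generated from a list of group names. The helper names `escape_filter_value` and `build_memberof_filter` are hypothetical and are not part of Ranger. The sketch escapes only the characters reserved in filter values and assumes group names contain no characters that need DN escaping (such as commas).

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that are reserved inside an LDAP filter value."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def build_memberof_filter(group_names, groups_base):
    """Return a filter matching users who belong to any of the named groups."""
    if not group_names:
        raise ValueError("at least one group name is required")
    clauses = [
        f"(memberof=cn={escape_filter_value(name)},{groups_base})"
        for name in group_names
    ]
    # A single clause needs no OR wrapper.
    if len(clauses) == 1:
        return clauses[0]
    return "(|" + "".join(clauses) + ")"


# Group names and base DN are taken from the sample directory in the slides.
user_filter = build_memberof_filter(
    ["hdp_testing", "hdp_admin", "dev_ops"],
    "ou=Hadoop Groups,dc=hortonworks,dc=com",
)
print(user_filter)
```

The generated string can be pasted into the user search filter setting. Generating it from one list keeps the filter and the intended group list from drifting apart.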

Page 21: Best Practices for Enterprise User Management in Hadoop Environment

Username attribute: sAMAccountName

User search filter: (|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))

User search base: OU=Hadoop Users,dc=hortonworks,dc=com

Page 22: Best Practices for Enterprise User Management in Hadoop Environment

Group based Search

⬢ Filter based on the group name or “cn” attribute of the group

(|(cn=hdp_*)(cn=dev_*))
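To preview which groups a wildcard filter like this would select, the patterns can be checked locally against a list of candidate group names before syncing. The Python sketch below only approximates the directory's behaviour: it treats `*` as "any run of characters" and compares case-insensitively, which is the usual matching for `cn` but should be confirmed against the actual server. The function names and the extra group names `finance` and `HDP_audit` are hypothetical.

```python
import re


def build_group_filter(cn_patterns):
    """Combine cn patterns into one OR filter string."""
    return "(|" + "".join(f"(cn={p})" for p in cn_patterns) + ")"


def substring_pattern(pattern):
    """Translate an LDAP-style wildcard pattern into a compiled regex."""
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(parts), re.IGNORECASE)


def matching_groups(group_names, cn_patterns):
    """Return the group names that at least one pattern matches in full."""
    compiled = [substring_pattern(p) for p in cn_patterns]
    return [g for g in group_names if any(rx.fullmatch(g) for rx in compiled)]


patterns = ["hdp_*", "dev_*"]
candidates = ["hdp_testing", "hdp_admin", "dev_ops", "finance", "HDP_audit"]

print(build_group_filter(patterns))
print(matching_groups(candidates, patterns))
```

A local preview of this kind is no substitute for running the filter against the directory, but it helps catch a pattern that is broader than intended.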

Page 23: Best Practices for Enterprise User Management in Hadoop Environment

Group name attribute: cn

Group search base: OU=Hadoop Groups,dc=hortonworks,dc=com

Group member attribute: member

Group search filter: (|(cn=dev_*)(cn=hdp_*))
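The group search values on this slide (`cn`, the group OU, `member`, and the wildcard filter) end up as usersync properties. The Python sketch below renders them as `key=value` lines and rejects a filter with unbalanced parentheses, which is an easy mistake to make when editing filters by hand. The property names are the ones commonly seen in Ranger's usersync configuration, but the mapping of slide values to those names is an assumption here and should be verified against your Ranger version. The helpers `parens_balanced` and `render_properties` are hypothetical.

```python
def parens_balanced(ldap_filter):
    """Return True when every '(' in the filter has a matching ')'."""
    depth = 0
    for ch in ldap_filter:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Assumed mapping of the slide's values to usersync property names.
GROUP_SEARCH = {
    "ranger.usersync.group.searchenabled": "true",
    "ranger.usersync.group.nameattribute": "cn",
    "ranger.usersync.group.memberattributename": "member",
    "ranger.usersync.group.searchbase": "OU=Hadoop Groups,dc=hortonworks,dc=com",
    "ranger.usersync.group.searchfilter": "(|(cn=dev_*)(cn=hdp_*))",
}


def render_properties(settings):
    """Validate filter values, then render the settings as key=value lines."""
    for key, value in settings.items():
        if key.endswith("searchfilter") and not parens_balanced(value):
            raise ValueError(f"unbalanced parentheses in {key}: {value}")
    return "\n".join(f"{key}={value}" for key, value in settings.items())


print(render_properties(GROUP_SEARCH))
```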

Page 24: Best Practices for Enterprise User Management in Hadoop Environment

LDAP connection check tool

⬢ Command line tool

⬢ Used for
– Discovering various LDAP attributes
– Validating the LDAP settings in Ranger, Ambari, or HDFS LDAP Group Mapping
– Retrieving the total number of users and/or groups

⬢ Available as part of the Ranger installation

⬢ Requires basic information such as the LDAP URL, bind credentials, etc.
– Command line interface
– A template properties file to update with the values specific to the setup

Page 25: Best Practices for Enterprise User Management in Hadoop Environment

Tool usage

⬢ usage: run.sh
  -a          ignore authentication properties
  -d <arg>    {all|users|groups}
  -h          show help.
  -i <arg>    Input file name
  -o <arg>    Output directory
  -r <arg>    {all|users|groups}

⬢ All of the above parameters are optional
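When the tool is driven from a script, it helps to assemble the argument list in one place and reject invalid scope values early. The Python sketch below builds such a list using only the flags listed above. The keyword names `discover` and `retrieve` reflect an assumed reading of `-d` and `-r` (the slide gives only the accepted values), and the script path `./run.sh`, the file name `input.properties`, and the directory `output` are placeholders.

```python
import shlex

SCOPES = {"all", "users", "groups"}


def build_command(input_file=None, output_dir=None, discover=None,
                  retrieve=None, skip_auth=False, script="./run.sh"):
    """Build an argument list for the tool; every option may be omitted."""
    for flag, scope in (("-d", discover), ("-r", retrieve)):
        if scope is not None and scope not in SCOPES:
            raise ValueError(f"{flag} expects one of {sorted(SCOPES)}, got {scope!r}")
    argv = [script]
    if skip_auth:
        argv.append("-a")
    if discover:
        argv += ["-d", discover]
    if input_file:
        argv += ["-i", input_file]
    if output_dir:
        argv += ["-o", output_dir]
    if retrieve:
        argv += ["-r", retrieve]
    return argv


argv = build_command(input_file="input.properties", output_dir="output", retrieve="users")
# Print the command for review; pass argv to subprocess.run to execute it.
print(shlex.join(argv))
```

Returning a list rather than a single string avoids shell quoting problems if a path contains spaces.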

Page 26: Best Practices for Enterprise User Management in Hadoop Environment

CLI option for the LDAP tool

⬢ The CLI prompts are shown when no input file is specified:

Ldap url [ldap://ldap.example.com:389]:
Bind DN [cn=admin,ou=users,dc=example,dc=com]:
Bind Password:
User Search Base [ou=users,dc=example,dc=com]:
User Search Filter [cn=user1]:
Sample Authentication User [user1]:
Sample Authentication Password:

Page 27: Best Practices for Enterprise User Management in Hadoop Environment

Demo

Page 28: Best Practices for Enterprise User Management in Hadoop Environment

Best practices and Strategies

⬢ Use LDAP/AD for application service authentication

⬢ Use Ranger for authorization

⬢ Verify that the truststore certs are updated across the system when SSL is used

⬢ Use the LDAP connection check tool to
– discover LDAP configuration attributes
– verify the number of users and groups to be synced to Ranger

⬢ Verify that case conversion and special characters in user and group names are handled uniformly across the Hadoop environment
– Matching rules must be used in core-site.xml as well as in Ranger
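The last point is easiest to see with a small example. If one component lowercases names and another does not, a policy written for `jdoe` will not apply to a user who arrives as `JDOE`. The Python sketch below illustrates applying a single normalization rule to every name. It is not how Ranger or Hadoop implement their mappings. The function `normalize_name`, its defaults (lowercase, whitespace replaced by `_`), and the sample names are all hypothetical.

```python
import re


def normalize_name(name, case="lower", pattern=r"\s+", replacement="_"):
    """Apply one agreed rule: substitute special characters, then convert case."""
    mapped = re.sub(pattern, replacement, name.strip())
    if case == "lower":
        return mapped.lower()
    if case == "upper":
        return mapped.upper()
    if case == "none":
        return mapped
    raise ValueError(f"unknown case conversion: {case!r}")


# Names as they might arrive from different sources for the same identities.
ldap_names = ["John Doe", "JDOE", "dev ops"]
print([normalize_name(n) for n in ldap_names])
```

Whatever rule is chosen, the same rule has to be configured in every place that maps names, otherwise policies and the identities they target will not line up.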

Page 29: Best Practices for Enterprise User Management in Hadoop Environment

[email protected]