APIs for Server Admins: REST, Extract, TSM Oh My! - Tableau … · 2020-01-06 · APIs for Server...

APIs for Server Admins: REST, Extract, TSM Oh My!

William Lang

Senior Software Engineer

Tableau

@willlang

#TC18

Tom O’Neil

Senior Software Engineer

Tableau

Automation

Extensions

Embedded Analytics

Data Connectivity

Data Science

Tableau Platform

Integrations

Enabling Integrations for Developers

Introduction to the Tableau Server REST API

User Provisioning

Automate User Provisioning

High turnover in user accounts

Conference: Provide accounts to each attendee

University: Must provision hundreds (or thousands) of new users each semester

Corporate merger: Add accounts for the new employees

We will use Python and Tableau Server Client

Sign In to Tableau Server

import tableauserverclient as TSC

tableau_auth = TSC.TableauAuth(username, password, site)
server_client = TSC.Server("https://us-west-2b.online.tableau.com")
server_client.auth.sign_in(tableau_auth)

Creating a User

user = UserItem(username, role)
user = server_client.users.add(user)

if user.id is not None:
    server_client.users.update(user, password)
    print("User added.")

# groups.add_user takes the group item and the user's ID
server_client.groups.add_user(group, user.id)

Creating Multiple Users

Pull user data from a file, database, identity management system, or other source

Iterate through list:

Generate username (on-prem)

Execute the user creation code from previous slide

E-mail the new user a welcome message (on-prem)
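The steps above can be sketched as a small loop. This is only an illustration, not the presenters' code: the CSV column names (`first_name`, `last_name`, `role`), the username scheme, and the `create_user` callback are all assumptions. In practice `create_user` would wrap the `server_client.users.add` call from the previous slide.

```python
import csv
import io


def generate_username(first_name, last_name):
    """Hypothetical username scheme: first initial plus last name, lowercased."""
    return (first_name[:1] + last_name).lower()


def provision_users(csv_text, create_user):
    """Create one account per CSV row and return the usernames created.

    `create_user(username, role)` is an injected callback (an assumption of
    this sketch) so the loop can be exercised without a live server.
    """
    created = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        username = generate_username(row["first_name"], row["last_name"])
        create_user(username, row["role"])
        created.append(username)
    return created
```

Injecting the callback keeps the data handling separate from the server calls, so the same loop works whether users come from a file, a database, or an identity management system.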

How Do I Get Tableau Server Client?

Installing Python 3

Linux: https://docs.python-guide.org/en/latest/starting/install3/linux/

MacOS: https://brew.sh/

brew install python3

Windows: https://www.python.org/downloads/release/python-362/

Installing tableauserverclient

pip3 install tableauserverclient

Datasource Refresh

Tasks are groupings of a datasource (or workbook) with a schedule

Schedules define when a task runs

But what happens when a task updates a datasource that depends on another datasource?

How do we ensure that our dependent task runs?

Tasks

Running a Task

if task_id:
    task = server.tasks.get_by_id(task_id)
    if dependency_id:
        dependency = server.tasks.get_by_id(dependency_id)
        if dependency.updated_at > task.updated_at:
            server.tasks.run(task)
    else:
        …

What is REST?

Representational State Transfer. Not a helpful title.

Defines a set of rules for messages to be sent from a client to a server and vice versa

What makes something RESTful?

• Stateless

• Lightweight

• Implementation of client and server are independent of one another

Migrating Workbooks

Migrating Workbooks: Projects

Imagine you have a staging and production site

Workbooks are stored in project folders

Need to ensure that there exists a project in both the staging and production server or site

How do we ensure we maintain ownership?

Impersonation

Sample Code (default site):

tableau_auth = TSC.TableauAuth(username, password)

server = TSC.Server('https://10ay.online.tableau.com')

server.auth.sign_in(tableau_auth)

Sample Code (specific site):

tableau_auth = TSC.TableauAuth(username, password, site_name)

server = TSC.Server('https://10ay.online.tableau.com')

server.auth.sign_in(tableau_auth)

Sample Code (impersonation):

tableau_auth = TSC.TableauAuth(username, password, site_id=site_name, user_id_to_impersonate=user_id)

server = TSC.Server('https://my-tableau-server.myorg')

server.auth.sign_in(tableau_auth)

Migrating Workbooks

source_workbooks, pagination = source_server.workbooks.get()

for workbook in source_workbooks:
    file_path = source_server.workbooks.download(workbook.id, temp_dir)

    # If project doesn't exist on destination, just upload to default project
    if workbook.project_id in project_map:
        workbook.project_id = project_map[workbook.project_id]
    else:
        workbook.project_id = project_map['default']

    dest_server.workbooks.publish(workbook, file_path, dest_server.PublishMode.Overwrite)

Embedded Credentials

Stripped Embedded User Credentials

After you republish the workbook, you can update the credentials using the Update Workbook Connection API

Get the connection ID using the Query Workbook Connections API

Subscriptions

Users can subscribe to Workbooks and receive daily emails with the updates to those workbooks

New server admin joins the team and needs same subscriptions

Use Tableau Server REST API to copy the subscriptions from one admin user to the new one

We’re going to do this the hard way, using Python without Tableau Server client

Copy Subscriptions

Copy Subscriptions: Get User ID

headers = {
    'X-Tableau-Auth': auth_token,
    'Content-Type': 'application/json',
    'Accept': 'application/json'
}
url = 'https://us-west-2b.online.tableau.com/api/3.1/sites/' + site_id + \
    '/users?filter=name:eq:' + uname
user_response = requests.get(url, headers=headers).json()

Tableau Server REST Sign In

REQUEST

POST https://us-west-2b.online.tableau.com/api/3.1/auth/signin

{
    "credentials": {
        "name": "admin",
        "password": "p@ssword",
        "site": {
            "contentUrl": "mysite"
        }
    }
}

RESPONSE

200 (OK)

{
    "credentials": {
        "site": {
            "id": "9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d",
            "contentUrl": "mysite"
        },
        "user": {
            "id": "9f9e9d9c-8b8a-8f8e-7d7c-7b7a6f6d6e6d"
        },
        "token": "12ab34cd56ef78ab90cd12ef34ab56cd"
    }
}

Tableau Server REST Auth Header

Include the token as an HTTP header in all further requests:

X-Tableau-Auth: 12ab34cd56ef78ab90cd12ef34ab56cd

By default, the token is good for 4 hours
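The sign-in exchange can be sketched as two small helpers: one builds the request body, the other turns the parsed response into headers for later calls. The function names are hypothetical; the JSON shapes follow the sign-in slide, and no network call is made here.

```python
def build_signin_payload(name, password, content_url):
    """Build the JSON body for POST /api/<version>/auth/signin."""
    return {
        "credentials": {
            "name": name,
            "password": password,
            "site": {"contentUrl": content_url},
        }
    }


def auth_headers(signin_response):
    """Turn a parsed sign-in response (a dict) into headers for later requests."""
    token = signin_response["credentials"]["token"]
    return {
        "X-Tableau-Auth": token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
```

With the `requests` library, the payload would typically be sent as `requests.post(url, json=build_signin_payload(...))` and the result of `.json()` passed to `auth_headers`.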

Copy Subscriptions: Get User ID

headers = {
    'X-Tableau-Auth': auth_token,
    'Content-Type': 'application/json',
    'Accept': 'application/json'
}
url = 'https://us-west-2b.online.tableau.com/api/3.1/sites/' + site_id + \
    '/users?filter=name:eq:' + uname
# Keep the response object so the status code can be checked before parsing
user_response = requests.get(url, headers=headers)
if user_response.status_code == requests.codes.ok:
    user_json = user_response.json()
    # continue

REST URLs

https://us-west-2b.online.tableau.com/api/3.1/sites/9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d/users?filter=name:eq:adminuser
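The URL above is made of the server address, the API version, the site ID, the resource, and a filter. A minimal sketch of assembling it follows; the helper name is hypothetical, and percent-encoding the username is an assumption of this sketch so that names such as email addresses are safe to place in a query string.

```python
from urllib.parse import quote


def users_by_name_url(server, api_version, site_id, username):
    """Build a users query URL with a name:eq filter."""
    return "{}/api/{}/sites/{}/users?filter=name:eq:{}".format(
        server.rstrip("/"),
        api_version,
        quote(site_id, safe=""),
        quote(username, safe=""),
    )
```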

Copy Subscriptions: Duplication

# Set headers: token and content type
response = requests.get('https://us-west-2b.online.tableau.com/api/3.1/sites/' + \
    site_id + '/subscriptions', headers=headers).json()

for subscription in response['subscriptions']['subscription']:
    # Check that this subscription is for this particular user
    if subscription['user']['id'] != ref_user_id:
        continue
    # Build request JSON

Copy Subscriptions: Example Request JSON

{
    …
    "subscription": {
        "subject": "Site Migration Dashboard",
        "content": {
            "id": "3cd4eed3-56c5-4253-974a-9e37660cef0b",
            "type": "View"
        },
        "schedule": {
            "id": "7eeb09c4-da4e-47b7-b41c-62c68aff03a0"
        },
        "user": {
            "id": "bdbe273e-c2d7-42db-a1f1-297ff7384e47"
        }
    }
    …
}

Copy Subscriptions: Build Request JSON

request_data = {
    'subscription': {
        'subject': subscription['subject'],
        'content': {
            'id': subscription['content']['id'],
            'type': subscription['content']['type']
        },
        'schedule': {'id': subscription['schedule']['id']},
        'user': {'id': new_user_id}
    }
}
request_json = json.dumps(request_data)

Copy Subscriptions: Duplication

response = requests.get('https://us-west-2b.online.tableau.com/api/3.1/sites/' + \
    site_id + '/subscriptions', headers=headers).json()

for subscription in response['subscriptions']['subscription']:
    # Check that this subscription is for this particular user
    if subscription['user']['id'] != ref_user_id:
        continue
    # Build request JSON
    # Set headers: token and content type
    post_response = requests.post('https://us-west-2b.online.tableau.com/api/3.1/sites/' + \
        site_id + '/subscriptions', headers=headers, data=request_json)

Copy Subscriptions: Duplication

# Request to get the subscriptions
response = requests.get(server + '/api/' + version + '/sites/' + \
    site_id + '/subscriptions', headers=headers).json()

# Request to create the new subscription
response = requests.post(server + '/api/' + version + '/sites/' + site_id + \
    '/subscriptions', headers=headers, data=request_json)
if response.status_code == 201:
    print('Subscription copied.')
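The filtering and payload-building steps from the slides above can be combined into one function that is independent of the HTTP calls. This is a sketch under the assumption that the subscriptions response has already been parsed into a dict; the function name is hypothetical.

```python
def build_copy_requests(subscriptions_response, ref_user_id, new_user_id):
    """Return one request body per subscription owned by `ref_user_id`.

    Each body recreates the subscription for `new_user_id`, keeping the
    subject, content, and schedule of the original.
    """
    bodies = []
    for subscription in subscriptions_response["subscriptions"]["subscription"]:
        # Skip subscriptions that belong to other users
        if subscription["user"]["id"] != ref_user_id:
            continue
        bodies.append({
            "subscription": {
                "subject": subscription["subject"],
                "content": {
                    "id": subscription["content"]["id"],
                    "type": subscription["content"]["type"],
                },
                "schedule": {"id": subscription["schedule"]["id"]},
                "user": {"id": new_user_id},
            }
        })
    return bodies
```

Each returned body can then be serialized with `json.dumps` and sent with the POST request shown above.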

Takeaways: HTTP

Headers: Used for authentication, content type, and/or API versioning

Content type: Typically JSON (can also be XML)

Path: Server URL, (optionally) API version, resource(s), filters or sorting

Verbs: e.g. GET, POST, PUT, DELETE

Response codes: e.g. OK, created, no content, bad request, not found

Webhooks!!!

Webhooks: Signin

// make the request
$response = $guzzle->post('/api/exp/auth/signin', [
    'headers' => [
        'Content-Type' => 'application/json',
        'Accept' => 'application/json'
    ],
    'json' => [
        'credentials' => [
            'name' => $username,
            'password' => $password,
            'site' => [
                'contentUrl' => $site
            ]
        ]
    ]
]);

Webhooks

{
    "webhook": {
        "name": "My Webhook!",
        "webhook-source": {
            "webhook-source-event-workbook-created": {}
        },
        "webhook-destination": {
            "webhook-destination-http": {
                "method": "POST",
                "url": "https://my-app.example.com/my-created-workbook-webhook"
            }
        }
    }
}

Webhooks: Payload

{
    "resource": "WORKBOOK",
    "event-type": "WorkbookCreated",
    "resource-name": "My Workbook",
    "site-id": "8b2a95d8-52b9-40a4-8712-cd6da771bd1b",
    "resource-id": "99"
}
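On the receiving side, the destination URL gets a POST with the payload above. A minimal sketch of dispatching on the `event-type` field follows; the handler registry and function name are assumptions of this sketch, and the web framework that would receive the request is left out.

```python
import json


def handle_webhook(body, handlers):
    """Parse a webhook payload and call the handler registered for its event type.

    `handlers` maps an event-type string to a callable that takes the parsed
    payload dict. Returns the handler's result, or None if nothing is registered.
    """
    payload = json.loads(body)
    handler = handlers.get(payload.get("event-type"))
    if handler is None:
        return None
    return handler(payload)
```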

Webhooks: Request

// create our webhook
$response = $guzzle->post(sprintf('/sites/%s/webhooks', $siteId), [
    'json' => [
        'webhook' => [
            'name' => $name,
            'webhook-source' => [
                $event => new \stdClass(),
            ],
            'webhook-destination' => [
                'webhook-destination-http' => [
                    'method' => 'POST',
                    'url' => $url
                ]
            ]
        ]
    ]
]);

Pruning

Pruning

Running out of disk space

Delete old, unused content

Don’t know Python? Don’t like Python? Don’t want to learn it?

We are going to use Java

Pruning

REST client and server implementations are completely independent

REST libraries exist for a variety of languages

We will use Spring’s RestTemplate Java library

Part of the spring-web package

Included in Spring Boot, which we will use to make our application runnable

Pruning

Remember exactly what that sign in JSON looks like? Probably not.

{
    "credentials": {
        "name": "admin",
        "password": "p@ssword",
        "site": {
            "contentUrl": "MySite"
        }
    }
}

Pruning: Credentials Class

@JsonIgnoreProperties(ignoreUnknown = true)
@JsonInclude(Include.NON_NULL)
@JsonTypeName(value = "credentials")
@JsonTypeInfo(include = As.WRAPPER_OBJECT, use = Id.NAME)
public class Credentials {
    private String name;
    private String password;
    private String token;
    private Site site;
    private User user;
    // getters and setters follow
}

Pruning: Login

// Get the credentials parameters
Credentials requestCredentials = new Credentials();
requestCredentials.setName(opts.getOptionValue("user"));
requestCredentials.setPassword(opts.getOptionValue("password"));
requestCredentials.setSite(new Site(opts.getOptionValue("siteUrl")));

RestTemplate restTemplate = new RestTemplate();

// Make the POST request
Credentials responseCredentials = restTemplate.postForObject(
    "https://us-west-2b.online.tableau.com/api/3.1/auth/signin",
    requestCredentials, Credentials.class);
String token = responseCredentials.getToken();

Pruning: Get Workbooks

int currentPage = 1;
boolean hasMorePages = true;
while (hasMorePages) {
    // Get the current page of workbooks
    WorkbooksResponse workbooks = restTemplate.getForObject(
        "https://us-west-2b.online.tableau.com/api/3.1/sites/" + getSiteId()
            + "/workbooks?pageNumber=" + currentPage + "&sort=updatedAt:asc",
        WorkbooksResponse.class);

    for (Workbook workbook : workbooks.getWorkbooks().getWorkbook()) {
        // 1000L forces long arithmetic so larger maxAge values do not overflow int
        if (((new Date()).getTime() - workbook.getUpdatedAt().getTime())
                > (1000L * 60 * 60 * 24 * maxAge)) {
            // Workbook is more than maxAge days old
            workbooksToDelete.add(workbook);
        } else {
            System.out.println("Newest date reached.");
            hasMorePages = false;
            break;
        }
    }
    // Pagination check follows (see Pruning: Paginate Workbooks)
}

Pruning: Pagination

Query responses are paginated and contain a pagination header:

"pagination": {
    "pageNumber": "1",
    "pageSize": "100",
    "totalAvailable": "276"
}

Pruning: Pagination

@JsonIgnoreProperties(ignoreUnknown = true)
@JsonInclude(Include.NON_NULL)
public class Pagination {
    private int pageNumber;
    private int pageSize;
    private int totalAvailable;
    // getters and setters follow
}

Pruning: Paginate Workbooks

while (hasMorePages) {
    // Get and process current workbook page
    if (workbooks.getPagination().getTotalAvailable() >
            (workbooks.getPagination().getPageSize() * currentPage)) {
        // Still more pages to view, so increment the current page
        currentPage++;
    } else {
        // No more pages to view, so stop making queries
        hasMorePages = false;
    }
}
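The same pagination rule (keep going while `totalAvailable` exceeds `pageSize * currentPage`) can be written once as a generator. This is a Python sketch rather than the Java used on the slides; `fetch_page` is a hypothetical callable that returns the items on a page together with a dict shaped like the pagination element shown earlier.

```python
def iter_pages(fetch_page):
    """Yield every item from a paginated query.

    `fetch_page(page_number)` is an injected callable (an assumption of this
    sketch) returning `(items, pagination)`, where `pagination` has the keys
    "pageSize" and "totalAvailable" as strings or ints.
    """
    page_number = 1
    while True:
        items, pagination = fetch_page(page_number)
        yield from items
        seen = int(pagination["pageSize"]) * page_number
        if seen >= int(pagination["totalAvailable"]):
            break  # no more pages to view
        page_number += 1
```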

Pruning: Delete Workbooks

for (Workbook workbook : workbooksToDelete) {
    System.out.println("Deleting workbook " + workbook.getName() + "...");
    restTemplate.delete("https://us-west-2b.online.tableau.com/api/3.1/sites/" +
        getSiteId() + "/workbooks/" + workbook.getId());
}

Pruning

Process to work around caching:

Retrieve all data

Process to find everything to delete

Delete entities

Why do we query all the results first before deleting?

Caching
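The retrieve, process, delete sequence can be sketched in a few lines. This is a Python illustration, not the Java from the slides: workbooks are represented as plain dicts with `id` and `updated_at` keys, and `delete` is a hypothetical callback standing in for the DELETE request.

```python
from datetime import timedelta


def select_stale(workbooks, max_age_days, now):
    """Return workbooks last updated more than `max_age_days` before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    return [wb for wb in workbooks if wb["updated_at"] < cutoff]


def prune(workbooks, max_age_days, now, delete):
    """Decide on every deletion first, then delete; returns the deleted IDs."""
    # Steps 1 and 2: materialize all results and pick what to delete
    stale = select_stale(list(workbooks), max_age_days, now)
    # Step 3: only now issue the deletes, so no query runs against changing data
    for wb in stale:
        delete(wb["id"])
    return [wb["id"] for wb in stale]
```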

Changing Content Owner

Changing Content Owner

Tableau Server users cannot be deleted if they still own content

Need to delete a user who owns hundreds of workbooks

Get all of that user's workbooks and transfer ownership to another user

Changing Content Owner: Get User

private function getUser(Client $guzzle, string $siteId, string $user) : array {
    $response = $guzzle->get(sprintf('/api/3.1/sites/%s/users?filter=name:eq:%s', $siteId, $user));

    // check to make sure we only found one
    $json = json_decode((string)$response->getBody(), true);
    if ($json['pagination']['totalAvailable'] != 1) {
        throw new \Exception("Expecting exactly 1 user for user. Found " . $json['pagination']['totalAvailable']);
    }

    // store it
    return $json['users']['user'][0];
}

Changing Content Owner: JSON Response

{
    "pagination": {
        "pageNumber": "1",
        "pageSize": "100",
        "totalAvailable": "1"
    },
    "users": {
        "user": [{
            "id": "bcdc5fe9-c59c-11e8-9bbc-f23c9116ad45",
            "name": "william@example.com",
            "siteRole": "SiteAdministratorCreator",
            "authSetting": "ServerDefault",
            "lastLogin": "2018-10-01T17:04:44Z",
            "externalAuthUserId": "...."
        }]
    }
}

Changing Content Owner: Workbook Request

// let's get all the workbooks owned by oldOwner and update them to be owned by newOwner
$response = $guzzle->get(sprintf('/api/3.1/sites/%s/users/%s/workbooks', $siteId, $oldOwner['id']));

// get the workbooks from the response
$json = json_decode((string)$response->getBody(), true);
foreach ($json['workbooks']['workbook'] as $workbook) {
    $id = $workbook['id'];

    // let's do the update now
    $response = $guzzle->put(sprintf('/api/3.1/sites/%s/workbooks/%s', $siteId, $id), [
        'json' => [
            'workbook' => [
                'owner' => [
                    'id' => $newOwner['id']
                ]
            ]
        ]
    ]);
}

Daily Digest

Automated Digest Using AWS Lambda

Automated daily e-mail that lists all workbooks with updates over prior 24 hours

Everyone is using “The Cloud” these days – we will too!

Use AWS Lambda to automate the process serverlessly

Lambdas run in the Amazon cloud on demand, or via schedule, without provisioning dedicated server resources

Will create a Lambda, written in Python using Tableau Server Client, that runs nightly

Automated Digest Using AWS Lambda

Create Lambda function:

Use AWS Web UI

Navigate to Lambda Service

Create function

Author from scratch

Python 3.6 runtime

Specify or create IAM role

Automated Digest Using AWS Lambda

def lambda_handler(event, context):
    req_options = TSC.RequestOptions()
    req_options.sort.add(TSC.Sort(TSC.RequestOptions.Field.UpdatedAt,
                                  TSC.RequestOptions.Direction.Desc))
    all_workbooks, pagination_item = server_client.workbooks.get(req_options)

    while True:
        for workbook in all_workbooks:
            workbook_age = now - workbook.updated_at
            if workbook_age <= datetime.timedelta(1):
                daily_digest_list.append([workbook.id, workbook.name, str(workbook.updated_at), ''])

        # Stop once the last page has been processed
        if pagination_item.page_number * pagination_item.page_size >= pagination_item.total_available:
            break

        # Retrieve next page of workbooks from API
        req_options.page_number(req_options.pagenumber + 1)
        all_workbooks, pagination_item = server_client.workbooks.get(req_options)
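Because the query sorts by `UpdatedAt` in descending order, the 24-hour filter could also stop at the first workbook that falls outside the window. The sketch below illustrates that idea on plain `(name, updated_at)` tuples; it is an optional refinement under that sorting assumption, not what the slide code does, and the function name is hypothetical.

```python
import datetime


def recent_workbooks(workbooks, now, window=datetime.timedelta(days=1)):
    """Collect (name, updated_at string) for workbooks updated within `window`.

    Assumes `workbooks` is sorted newest-first, so iteration can stop at the
    first entry that is older than the window.
    """
    digest = []
    for name, updated_at in workbooks:
        if now - updated_at > window:
            break  # everything after this is older still
        digest.append((name, str(updated_at)))
    return digest
```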

Automated Digest Using AWS Lambda

Create CloudWatch event rule:

AWS Web UI

CloudWatch Service

Create Event Rule

Specify schedule:

Fixed rate

Cron expression

Add Target

Select Lambda function

Tagging Workbooks with the Document API

What is it?

The Document API provides a means to easily extract information from a Tableau workbook file (twb)

Why is this awesome?

No more manually editing XML! No more accidental XML syntax mistakes!

What kind of information?

Dimensions, measures, workbook names, filters

What can I do with it?

Tag all workbooks that use a specific field

Report on workbooks that are using out of date data sources

Email owners of workbooks that use calculated fields that need to be updated

Tagging Workbooks: Downloading Workbooks

# Page through the server until every workbook has been collected
while total > len(all_workbooks):
    workbooks, pagination = server.workbooks.get(RequestOptions(pagenumber=page,
                                                                pagesize=page_size))
    total = pagination.total_available
    for workbook in workbooks:
        all_workbooks.append(workbook)
    page += 1

# Download each workbook without its extract
for workbook in all_workbooks:
    file_name = server.workbooks.download(workbook.id, '.cache/', no_extract=True)
    workbook.file_name = file_name

Tagging Workbooks: Datasources and Fields

from tableaudocumentapi import Workbook
from tableaudocumentapi import Field

wb = None
try:
    wb = Workbook(workbook.file_name)
except FileNotFoundError:
    continue

for datasource in wb.datasources:
    for count, ds_field in enumerate(datasource.fields.values()):
        if field == ds_field.name:
            taggable_workbooks.append([workbook.id, workbook.name])
            workbook.tags.update([tag])
            server.workbooks.update(workbook)

Checking Server Health

Sign In To TSM

// guzzle client
$guzzle = new Client([
    'base_uri' => $server,
    'headers' => [
        'Content-Type' => 'application/json',
        'Accept' => 'application/json'
    ],
    'cookies' => true
]);

// login to tsm
$response = $guzzle->post('/api/1.0/login', [
    'json' => [
        'authentication' => [
            'name' => $username,
            'password' => $password
        ]
    ]
]);

Get Nodes and Their Status

// get the nodes
$response = $guzzle->get('/api/1.0/nodes');
$json = json_decode((string)$response->getBody(), true);
$nodes = $json['clusterNodes'];

foreach ($nodes as $node) {
    // get more information on a node
    $response = $guzzle->get(sprintf('/api/1.0/status/nodes/%s', $node['id']));
    $json = json_decode((string)$response->getBody(), true);
    print_r($json);
}

Extract API

Extract API

Really easy to use

Lightweight, about 1 MB total!

Extract consists of tables and rows only

Generate a TDE or a hyper extract

Extract API: Create Your Extract

final Extract extract = new Extract("tmp/output.hyper");

Extract API: Table Definition

final TableDefinition employeeDef = new TableDefinition();
employeeDef.addColumn("id", Type.INTEGER);
employeeDef.addColumnWithCollation("name", Type.UNICODE_STRING, Collation.EN_US);
employeeDef.addColumnWithCollation("position", Type.UNICODE_STRING, Collation.EN_US);
employeeDef.addColumn("start_date", Type.DATE);
extract.addTable("employees", employeeDef);

Extract API: Inserting Rows

final Table table = extract.openTable("employees");

Row r = new Row(employeeDef);
r.setInteger(0, 1);
r.setString(1, "John");
r.setString(2, "Co-Founder");
r.setDate(3, 2018, 1, 1);
table.insert(r);

Resources

Useful Tools

Postman: https://www.getpostman.com/

VSCode REST client: https://github.com/Huachao/vscode-restclient/

Useful Libraries

Tableau Server Client (Python): https://github.com/tableau/server-client-python

Only officially supported client

Java (Spring RestTemplate): https://spring.io/guides/gs/consuming-rest/

Guzzle Client: https://github.com/guzzle/guzzle

Tableau REST API samples (Java and Python): https://github.com/tableau/rest-api-samples

Documentation

REST API: https://onlinehelp.tableau.com/current/api/rest_api/en-us/help.htm

Extract API: https://onlinehelp.tableau.com/current/api/sdk/en-us/help.htm#SDK/tableau_sdk.htm

TSM: https://onlinehelp.tableau.com/v0.0/api/tsm_api/en-us/docs/tsm-reference.htm

Document API: http://tableau.github.io/document-api-python/

AUTOMATION RELATED SESSIONS

Chalk Talk – Tableau Server Automation APIs: Oct-24 | 15:30 – 16:30

(Hands on Training) REST API: Oct-23 | 14:15 – 16:45, Oct-24 | 10:15 – 12:45

Big Easy Data Security | Scalable…Roux Level Security: Oct-23 | 16:00 – 17:00, Oct-24 | 10:15 – 11:15

Using Tableau Server Client and the REST API…: Oct-23 | 14:15 – 15:15, Oct-25 | 10:45 – 11:45

#DataDev Resources

TC18 Developer Track Content: http://tabsoft.co/tcdevtrack

Tableau Developer Program: http://tableau.com/developer

Free environment for development

Early access to info and APIs

Tableau on GitHub: http://github.com/tableau

Please complete the session survey from the My Evaluations menu in your TC18 app

Thank you!

#TC18
