Reinforcement learning for intensional data management
Pierre Senellart, École normale supérieure, Research University Paris
20 April 2018, SEQUEL Seminar



Page 1

Reinforcement learning for intensional data management

Pierre Senellart

École normale supérieure, Research University Paris

20 April 2018, SEQUEL Seminar

Page 2

Uncertain data is everywhere

Numerous sources of uncertain data:

• Measurement errors

• Data integration from contradicting sources

• Imprecise mappings between heterogeneous schemas

• Imprecise automatic processes (information extraction, natural language processing, etc.)

• Imperfect human judgment

• Lies, opinions, rumors

Page 3

Structured data is everywhere

Data is structured, not flat:

• Variety of representation formats of data in the wild:
  • relational tables
  • trees, semi-structured documents
  • graphs, e.g., social networks or semantic graphs
  • data streams
  • complex views aggregating individual information

• Heterogeneous schemas

• Additional structural constraints: keys, inclusion dependencies

Page 4

Intensional data is everywhere

Lots of data sources can be seen as intensional: accessing all the data in the source (in extension) is impossible or very costly, but it is possible to access the data through views, with some access constraints, associated with some access cost.

• Indexes over regular data sources
• Deep Web sources: Web forms, Web services
• The Web or social networks as partial graphs that can be expanded by crawling
• Outcome of complex automated processes: information extraction, natural language analysis, machine learning, ontology matching
• Crowd data: (very) partial views of the world
• Logical consequences of facts, costly to compute

Page 5

Interactions between uncertainty, structure, intensionality

• If the data has complex structure, uncertain models should represent possible worlds over these structures (e.g., probability distributions over graph completions of a known subgraph in Web crawling).

• If the data is intensional, we can use uncertainty to represent prior distributions about what may happen if we access the data. Sometimes this is good enough to reach a decision without having to make the access!

• If the data is an RDF graph accessed by semantic Web services, each intensional data access will not give a single data point, but a complex subgraph.

Page 6

Intensional Data Management

• Jointly deal with uncertainty, structure, and the fact that access to data is limited and has a cost, to solve a user's knowledge need

• Lazy evaluation whenever possible

• Evolving probabilistic, structured view of the current knowledge of the world

• Solve at each step the problem: what is the best access to do next, given my current knowledge of the world and the knowledge need?

• Knowledge acquisition plan (recursive, dynamic, adaptive) that minimizes access cost and provides probabilistic guarantees

Page 7

[Diagram: the intensional data management loop. A knowledge need is formulated, then optimized, using the current knowledge of the world and structured source profiles (priors), into a knowledge acquisition plan. The plan triggers intensional accesses; each uncertain access result feeds a knowledge update of the current knowledge of the world (modeling), which eventually yields an answer and explanation.]


Page 15

What this talk is about

• A personal perspective on how to approach intensional data management

• Various applications of intensional data management and how we solved them (my own research and my students')

• Main tool used: reinforcement learning (bandits, Markov decision processes)

• Focus on one such application: database tuning

Page 16

Outline

Intensional data management

Reinforcement learning

Applications

Focus: Database Tuning

Conclusion

Page 17

Reinforcement learning

• Deals with agents learning to interact with a partially unknown environment

• Agents can perform actions, resulting in rewards (or penalties) and in a possible state change

• Agents learn about the world by interacting with it

• Goal: find the right sequence of actions that maximizes overall reward, or minimizes overall penalty

• Classic tradeoff: exploration vs exploitation

Page 18

Multi-armed bandits

• Stateless model

• k > 1 different actions, with unknown rewards

• Often assumed that the rewards are drawn from a parametrized probability distribution (Bernoulli, Gaussian, exponential, etc.)
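As a concrete illustration (not from the talk), a minimal UCB1 strategy for Bernoulli-reward arms can be sketched as follows; the arm probabilities and horizon are invented for the example:

```python
import math
import random

def ucb1(arm_probs, horizon, seed=0):
    """UCB1 for Bernoulli arms: pull every arm once, then always pull the
    arm maximizing empirical mean + sqrt(2 ln t / pulls)."""
    rng = random.Random(seed)
    k = len(arm_probs)
    pulls = [0] * k           # number of pulls per arm
    sums = [0.0] * k          # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:            # initialization round: try each arm once
            a = t - 1
        else:                 # optimism in the face of uncertainty
            a = max(range(k), key=lambda i: sums[i] / pulls[i]
                    + math.sqrt(2.0 * math.log(t) / pulls[i]))
        reward = 1.0 if rng.random() < arm_probs[a] else 0.0  # Bernoulli draw
        pulls[a] += 1
        sums[a] += reward
        total += reward
    return total, pulls

# True success probabilities, unknown to the algorithm.
total, pulls = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

After enough pulls, the confidence bonus shrinks and the algorithm concentrates its pulls on the best arm.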

Page 19

Markov decision process (MDP)

• Finite set of states

• In each state, actions lead:
  • to a state change
  • to a reward

• Depending on cases:
  • state changes may be deterministic or probabilistic
  • state changes may be known or unknown (to be learned)
  • rewards may be known or unknown (to be learned)
  • the current state may even be unknown! (POMDP)

Page 20

Outline

Intensional data management

Reinforcement learning

Applications

Focus: Database Tuning

Conclusion

Page 21

Adaptive focused crawling (stateless) [Gouriten et al., 2014]

[Figure: a partially crawled graph; each node carries a score (here 0 to 5), which is revealed only once the node is crawled.]

• Problem: Efficiently crawl nodes in a graph such that the total score is high

• Challenge: The score of a node is unknown until it is crawled

• Methodology: Use various predictors of node scores, and adaptively select the best one so far with multi-armed bandits
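A toy sketch of the predictor-selection idea, with invented node scores and two hypothetical predictors; the actual algorithms of [Gouriten et al., 2014] are more elaborate:

```python
import random

def bandit_crawl(node_scores, predictors, steps, eps=0.1, seed=0):
    """Epsilon-greedy bandit over score predictors: at each step one
    predictor (arm) chooses which frontier node to crawl next; the node's
    true score, revealed only once crawled, is the arm's reward."""
    rng = random.Random(seed)
    frontier = set(node_scores)
    mean = [0.0] * len(predictors)        # empirical mean reward per arm
    pulls = [0] * len(predictors)
    total = 0.0
    for t in range(min(steps, len(node_scores))):
        if t < len(predictors):
            a = t                                            # try each predictor once
        elif rng.random() < eps:
            a = rng.randrange(len(predictors))               # explore
        else:
            a = max(range(len(predictors)), key=lambda i: mean[i])  # exploit
        node = predictors[a](frontier)    # the chosen predictor proposes a node
        frontier.discard(node)
        score = node_scores[node]         # only known after crawling
        pulls[a] += 1
        mean[a] += (score - mean[a]) / pulls[a]   # incremental average
        total += score
    return total

# Invented graph: 50 nodes with scores 0-4; one blind predictor, one
# well-informed predictor. The bandit should learn to trust the latter.
scores = {"n%d" % i: i % 5 for i in range(50)}
blind = lambda frontier: min(frontier)                     # alphabetical pick
informed = lambda frontier: max(frontier, key=scores.get)  # knows the scores
total = bandit_crawl(scores, [blind, informed], steps=30)
```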


Page 24

Adaptive focused crawling (stateful) [Han et al., 2018]


• Problem: Efficiently crawl nodes in a graph such that the total score is high, taking into account the currently crawled graph

• Challenge: Huge state space

• Methodology: MDP, clustering of the state space, together with linear approximation of value functions

Page 25

Optimizing crowd queries for data mining [Amsterdamer et al., 2013a,b]


• Problem: To find patterns in the crowd, what is the best question to ask the crowd next?

• Challenge: No a priori information on crowd data

• Methodology: Model all possible questions as actions in a multi-armed bandit setting, and find a trade-off between exploration and exploitation

Page 26

Online influence maximization [Lei et al., 2015]

[Figure: solution framework. An action log (user, action, date) over a social graph is turned into a weighted uncertain graph of influence probabilities. Each trial (1) samples a graph, (2) runs a campaign against the real world (or a simulator), and (3) uses the observed activation sequences to update the probability estimates.]

• Problem: Run influence campaigns in social networks, maximizing the number of influenced nodes

• Challenge: Influence probabilities are unknown

• Methodology: Build a model of influence probabilities and focus on influential nodes, with an exploration/exploitation trade-off

Page 27

Routing of Autonomous Taxis [Han et al., 2016]


• Problem: Route a taxi to maximize its profit

• Challenge: Real-world data, no a priori model of the world

• Methodology: MDP, with standard Q-learning and a customized exploration/exploitation strategy
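A minimal sketch of tabular Q-learning on an invented three-zone city (not the actual setting of [Han et al., 2016], which works on real-world trip data):

```python
import random

def q_learning(states, actions, step, episodes=200, alpha=0.1, gamma=0.9,
               eps=0.2, seed=0):
    """Tabular Q-learning: after each observed transition (s, a) -> (s2, r),
    apply Q(s,a) += alpha * (r + gamma * max_b Q(s2,b) - Q(s,a))."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = rng.choice(states)
        for _ in range(50):                       # bounded episode length
            if rng.random() < eps:                # explore
                a = rng.choice(actions)
            else:                                 # exploit current estimate
                a = max(actions, key=lambda b: Q[(s, b)])
            s2, r = step(s, a)
            Q[(s, a)] += alpha * (
                r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
            s = s2
    return Q

# Invented 3-zone city: fares are only earned when reaching zone 2.
def step(s, a):                                   # a is -1 or +1 (adjacent zone)
    s2 = min(2, max(0, s + a))
    return s2, (10.0 if s2 == 2 else 0.0)

Q = q_learning([0, 1, 2], [-1, 1], step)
```

The learned Q-values should rank "drive toward zone 2" above "drive away" in every zone.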

Page 28

Cost-Model-Free Database Tuning [Basu et al., 2015, 2016]


• Problem: Automatically find which indexes to create in a database for optimal performance

• Challenge: The workload and cost model are unknown

• Methodology: Model database tuning as a Markov decision process and use reinforcement learning techniques to iteratively learn a cost model and workload characteristics

Page 29

Outline

Intensional data management

Reinforcement learning

Applications

Focus: Database Tuning
• Motivation
• Problem Formulation
• Adaptive Tuning Algorithm
• COREIL: Index Tuner
• Performance Evaluation

Conclusion

Page 30

Motivation

• Current query optimizers depend on pre-determined cost models

• But cost models can be highly erroneous:

"… the cardinality model. In my experience, the cost model may introduce errors of at most 30% for a given cardinality, but the cardinality model can quite easily introduce errors of many orders of magnitude! I'll give a real-world example in a moment. With such errors, the wonder isn't 'Why did the optimizer pick a bad plan?' Rather, the wonder is 'Why would the optimizer ever pick a decent plan?'"

Guy Lohman, IBM Research, ACM SIGMOD Blog 2014

Page 31

Proposed Solution

• We propose and validate a tuning strategy that does without such a pre-defined model

• The process of database tuning is modeled as a Markov decision process (MDP)

• A reinforcement-learning-based algorithm is developed to learn the cost function

• COREIL removes the need for pre-defined knowledge of cost in index tuning

Page 32

Problem

Database schema: R

[Figure: a TPC-C-like scenario: a warehouse W serves several customers; queries (new order, delivery, stock) run against tables (history, stock, new orders, stocks).]

Set of all database configurations: S = {s}

Schedule of queries and updates: Q, e.g.:
  t = 201: customer 1, new order
  t = 202: stock
  t = 203: customer 2, delivery

Page 33

Transition

A query $\hat{q}_t$ arrives while the database is in configuration $s_{t-1}$. The tuner first performs a configuration update $s_{t-1} \to s_t$ (cost $\delta(s_{t-1}, s_t)$), then executes the query in the updated configuration (cost $\mathrm{cost}(s_t, \hat{q}_t)$). The per-stage cost combines the two:

$$C(s_{t-1}, s_t, \hat{q}_t) = \delta(s_{t-1}, s_t) + \mathrm{cost}(s_t, \hat{q}_t)$$

Page 34

Mapping to MDP

The same transition diagram, read as an MDP: the configurations $s_{t-1}, s_t$ are the states; the configuration update followed by the execution of query $\hat{q}_t$ is the action; and the per-stage cost

$$C(s_{t-1}, s_t, \hat{q}_t) = \delta(s_{t-1}, s_t) + \mathrm{cost}(s_t, \hat{q}_t)$$

is the penalty function.

Page 35

MDP Formulation

• State: database configurations $s \in S$

• Action: configuration change $s_{t-1} \to s_t$, along with the execution of query $\hat{q}_t$

• Penalty function: per-stage cost of the action, $C(s_{t-1}, s_t, \hat{q}_t)$

• Transition function: transitions from one state to another on an action are deterministic

• Policy: a sequence of configuration changes depending on the incoming queries

Page 36

Problem Statement

• For a policy $\pi$ and discount factor $0 < \gamma < 1$, the cumulative penalty function, or cost-to-go function, can be defined as

$$V^\pi(s) \triangleq \mathbb{E}\left[\sum_{t=1}^{\infty} \gamma^{t-1}\, C(s_{t-1}, s_t, \hat{q}_t)\right] \quad \text{subject to} \quad s_0 = s, \quad s_t = \pi(s_{t-1}, \hat{q}_t), \; t \geq 1$$

• Goal: find an optimal policy $\pi^*$ that minimizes the cumulative penalty, i.e., the cost-to-go function
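A sketch of what the cost-to-go function measures, on an invented two-configuration example: since the query distribution is unknown, the discounted cumulative cost of a fixed policy is estimated by Monte Carlo over a random query stream.

```python
import random

def cost_to_go(s0, policy, sample_query, cost, gamma=0.8, horizon=50,
               runs=200, seed=0):
    """Monte Carlo estimate of
    V^pi(s) = E[ sum_t gamma^(t-1) * C(s_{t-1}, s_t, q_t) ],
    under a fixed policy pi, truncated at a finite horizon."""
    rng = random.Random(seed)
    est = 0.0
    for _ in range(runs):
        s, v = s0, 0.0
        for t in range(1, horizon + 1):
            q = sample_query(rng)        # draw from the (unknown) workload
            s2 = policy(s, q)            # s_t = pi(s_{t-1}, q_t)
            v += gamma ** (t - 1) * cost(s, s2, q)
            s = s2
        est += v / runs
    return est

# Invented example: with an index, reads are cheap and writes slightly
# dearer; changing configuration costs 3.
def cost(s, s2, q):                      # C(s, s', q) = delta(s, s') + cost(s', q)
    change = 0.0 if s == s2 else 3.0
    exec_cost = {("index", "read"): 1.0, ("noindex", "read"): 5.0,
                 ("index", "write"): 2.0, ("noindex", "write"): 1.0}[(s2, q)]
    return change + exec_cost

lazy = lambda s, q: s                    # policy: never change configuration
eager = lambda s, q: "index"             # policy: always keep the index
sample = lambda rng: "read" if rng.random() < 0.9 else "write"

v_lazy = cost_to_go("noindex", lazy, sample, cost)
v_eager = cost_to_go("noindex", eager, sample, cost)
```

With a read-heavy workload, paying the one-off configuration change is worth it, so the eager policy has the lower cost-to-go.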

Page 37

Features of The Model

• The schedule is sequential

• The issue of concurrency control is orthogonal

• Query $\hat{q}_t$ is a random variable generated from an unknown stochastic process

• It is always cheaper to do a direct configuration change

• There is no free configuration change

Page 38

Policy Iteration

A dynamic programming approach to solving MDPs:

• Begin with an initial policy $\pi_0$ and initial configuration $s_0$

• Find an estimate $V^{\pi_0}(s_0)$ of the cost-to-go function

• Incrementally improve the policy using the current estimate of the cost-to-go function. Mathematically,

$$V^{\pi_t}(s) = \min_{s' \in S}\left(\delta(s, s') + \mathbb{E}\left[\mathrm{cost}(s', q)\right] + \gamma\, V^{\pi_{t-1}}(s')\right)$$

• Carry on the improvement until there is no (or less than $\epsilon$) change in the policy
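The scheme above can be sketched on a small, fully known MDP with deterministic transitions (the states, actions, and costs are invented for the example; the real tuning setting has an unknown cost model):

```python
def policy_iteration(states, actions, trans, cost, gamma=0.9):
    """Policy iteration on a small known MDP with deterministic
    transitions: alternate policy evaluation and greedy improvement
    until the policy stops changing."""
    policy = {s: actions[-1] for s in states}      # arbitrary initial policy
    while True:
        # Evaluation: iterate V(s) <- cost(s, pi(s)) + gamma * V(trans(s, pi(s)))
        V = {s: 0.0 for s in states}
        for _ in range(200):
            V = {s: cost(s, policy[s]) + gamma * V[trans(s, policy[s])]
                 for s in states}
        # Improvement: greedy one-step lookahead on the evaluated V
        new = {s: min(actions,
                      key=lambda a: cost(s, a) + gamma * V[trans(s, a)])
               for s in states}
        if new == policy:
            return policy, V
        policy = new

# Invented 3-state chain: being away from state 0 costs 1 per step,
# so the optimal policy always moves left.
def trans(s, a):
    return max(0, s - 1) if a == "left" else min(2, s + 1)

def cost(s, a):
    return 0.0 if s == 0 else 1.0

pol, V = policy_iteration([0, 1, 2], ["left", "right"], trans, cost)
```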

Page 39

Problems with Policy Iteration

• Problem 1: The curse of dimensionality makes direct computation of $V$ hard

• Problem 2: There may be no proper model available beforehand for the cost function $\mathrm{cost}(s, q)$

• Problem 3: The probability distribution of queries being unknown, it is impossible to compute the expected cost of query execution

Page 40

Solution: Reducing the Search Space

Proposition. Let $s$ be any configuration and $\hat{q}$ be any observed query. Let $\pi^*$ be an optimal policy. If $\pi^*(s, \hat{q}) = s'$, then $\mathrm{cost}(s, \hat{q}) - \mathrm{cost}(s', \hat{q}) \geq 0$. Furthermore, if $\delta(s, s') > 0$, i.e., if $s \neq s'$ and there is no free configuration change, then $\mathrm{cost}(s, \hat{q}) - \mathrm{cost}(s', \hat{q}) > 0$.

Thus, the reduced subspace of interest is

$$S_{s,\hat{q}} = \{ s' \in S \mid \mathrm{cost}(s, \hat{q}) > \mathrm{cost}(s', \hat{q}) \}$$

Page 41

Solution: Learning the Cost Model

• Changing the configuration from $s$ to $s'$ can be considered as executing a special query $q(s, s')$

• Then the cost model can be approximated as

$$\delta(s, s') = \mathrm{cost}(s, q(s, s')) \approx \zeta^T \phi(s, q(s, s'))$$

• This approximation can be improved recursively using the Recursive Least Squares Estimation (RLSE) algorithm

• A similar linear projection can be used to approximate the cost-to-go function $V^{\pi_t}(s)$
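The RLSE step can be sketched as a standard recursive least squares estimator that refines a linear cost model one observation at a time; the feature dimension and target weights below are invented for the example:

```python
import numpy as np

def make_rls(dim, lam=1.0):
    """Recursive least squares: maintain the weight vector zeta and the
    inverse Gram matrix P; each observation (phi, y) updates both in
    O(dim^2), with no need to re-solve the whole regression."""
    zeta = np.zeros(dim)
    P = np.eye(dim) / lam                   # P approximates (lam*I + Phi^T Phi)^-1

    def update(phi, y):
        nonlocal zeta, P
        Pphi = P @ phi
        k = Pphi / (1.0 + phi @ Pphi)       # gain vector
        zeta = zeta + k * (y - phi @ zeta)  # correct by the prediction residual
        P = P - np.outer(k, Pphi)
        return zeta

    return update

# Invented linear cost model y = 2*phi[0] + 3*phi[1], recovered online.
rng = np.random.default_rng(0)
update = make_rls(2)
for _ in range(200):
    phi = rng.standard_normal(2)
    zeta = update(phi, 2.0 * phi[0] + 3.0 * phi[1])
```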

Page 42

What is COREIL?

COREIL is an index tuner that:

• instantiates our reinforcement learning framework

• tunes the configurations differing in their secondary indexes

• handles the configuration changes corresponding to the creation and deletion of indexes

• inherently learns the cost model and solves an MDP for optimal index tuning

Page 43

COREIL: Reducing the State Space

• Let $I$ be the set of all possible indexes

• Each configuration $s \in S$ is an element of the power set $2^I$

• Let $r(\hat{q})$ be the set of recommended indexes for a query $\hat{q}$

• Let $d(\hat{q})$ be the set of indexes being modified (by an update, insertion, or deletion) by $\hat{q}$

• The reduced search space is

$$S_{s,\hat{q}} = \{ s' \in S \mid (s \setminus d(\hat{q})) \subseteq s' \subseteq (s \cup r(\hat{q})) \}$$

• For B+ trees, the prefix closure $\langle r(\hat{q}) \rangle$ replaces $r(\hat{q})$ for a better approximation
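The reduced search space is easy to enumerate explicitly; a sketch with invented index names:

```python
from itertools import combinations

def reduced_search_space(s, recommended, modified):
    """Enumerate S_{s,q} = { s' | (s - d(q)) subset-of s' subset-of (s | r(q)) }:
    candidate configurations keep every current index not touched by the
    query and may add only indexes recommended for it."""
    lower = frozenset(s) - frozenset(modified)       # must-keep indexes
    upper = frozenset(s) | frozenset(recommended)    # allowed indexes
    free = sorted(upper - lower)                     # optional indexes
    space = []
    for k in range(len(free) + 1):
        for extra in combinations(free, k):
            space.append(lower | frozenset(extra))
    return space

# Hypothetical indexes: current configuration {i1, i2}; the query
# recommends i3 and modifies i2.
space = reduced_search_space({"i1", "i2"}, recommended={"i3"}, modified={"i2"})
```

Here the space has $2^2 = 4$ candidate configurations instead of the full power set, since i1 is pinned and only i2 and i3 are optional.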

Page 44

COREIL: Feature Mapping Cost-to-go Function

• We can define, for all $s, s' \in S$:

$$\phi_{s'}(s) \triangleq \begin{cases} 1 & \text{if } s' \subseteq s \\ -1 & \text{otherwise} \end{cases}$$

Theorem. There exists a unique $\theta = (\theta_{s'})_{s' \in S}$ which approximates the value function as

$$V(s) = \sum_{s' \in S} \theta_{s'} \phi_{s'}(s) = \theta^T \phi(s)$$

Page 45

COREIL: Feature Mapping Per-stage Cost

• One feature vector captures the difference between the index set recommended by the database system and that of the current configuration

• Another feature takes value 1 or 0 depending on whether the query modifies any index in the current configuration

• Stacking the two gives the feature mapping used to approximate the functions $\delta$ and $\mathrm{cost}$

Page 46

Dataset and Workload

• The dataset and workload conform to the TPC-C specification

• They are generated by the OLTP-Bench tool

• Each of the 5 transaction types is associated with 3 to 5 SQL statements (queries/updates)

• The response time of each SQL statement is measured using IBM DB2

• The scale factor (SF) used here is 2

Page 47

Efficiency

[Plot: time (ms) per query vs. query number (0 to 3,000), for COREIL and WFIT.]

Page 48

Box-plot Analysis

[Box plot: distribution of per-query time (ms) for COREIL and WFIT.]

Page 49

Overhead Cost Analysis

[Plot: tuning overhead time (ms) per query vs. query number (0 to 3,000), for COREIL and WFIT.]

Page 50

Effectiveness

[Plot: query execution time (ms, log scale) vs. query number (0 to 3,000), for COREIL and WFIT.]

Page 51

Outline

Intensional data management

Reinforcement learning

Applications

Focus: Database Tuning

Conclusion

Page 52

In brief

• Intensional data management arises in a large variety of settings, whenever there is a cost to accessing data

• Reinforcement learning (bandits, MDPs) is a key tool in dealing with such data

• Data management settings bring various complications: huge state spaces, no a priori model for rewards/penalties, delayed rewards…

• Rich field of applications for RL research!

Page 53

Thank you.

Page 54

Bibliography I

Yael Amsterdamer, Yael Grossman, Tova Milo, and Pierre Senellart. Crowd mining. In Proc. SIGMOD, pages 241–252, New York, USA, June 2013a.

Yael Amsterdamer, Yael Grossman, Tova Milo, and Pierre Senellart. CrowdMiner: Mining association rules from the crowd. In Proc. VLDB, pages 241–252, Riva del Garda, Italy, August 2013b. Demonstration.

Debabrota Basu, Qian Lin, Weidong Chen, Hoang Tam Vo, Zihong Yuan, Pierre Senellart, and Stéphane Bressan. Cost-model oblivious database tuning with reinforcement learning. In Proc. DEXA, pages 253–268, Valencia, Spain, September 2015.

Page 55

Bibliography II

Debabrota Basu, Qian Lin, Weidong Chen, Hoang Tam Vo, Zihong Yuan, Pierre Senellart, and Stéphane Bressan. Regularized cost-model oblivious database tuning with reinforcement learning. Transactions on Large-Scale Data- and Knowledge-Centered Systems, 28:96–132, 2016.

Georges Gouriten, Silviu Maniu, and Pierre Senellart. Scalable, generic, and adaptive systems for focused crawling. In Proc. Hypertext, pages 35–45, Santiago, Chile, September 2014. Douglas Engelbart Best Paper Award.

Miyoung Han, Pierre Senellart, Stéphane Bressan, and Huayu Wu. Routing an autonomous taxi with reinforcement learning. In Proc. CIKM, Indianapolis, USA, October 2016. Industry track, short paper.

Miyoung Han, Pierre Senellart, and Pierre-Henri Wuillemin. Focused crawling through reinforcement learning. In Proc. ICWE, June 2018.

Page 56

Bibliography III

Siyu Lei, Silviu Maniu, Luyi Mo, Reynold Cheng, and Pierre Senellart. Online influence maximization. In Proc. KDD, pages 645–654, Sydney, Australia, August 2015.