Controller Synthesis and the Hydac Case Study fileJF Raskin U.L.B. Quasimodo Workshop - Oct. 24,...

Preview:

Citation preview

JF RaskinU.L.B.

Quasimodo Workshop - Oct. 24, 2010ES Week Tutorial

Controller Synthesis and the Hydac Case Study

Content

❶ Context and motivations

❷ Flavor of the underlying techniques

A. Untimed safety gamesB. Timed games - Controllable predecessorsC. Recent progresses - Imperfect info D. Recent progresses - Optimality

❸ Application: The Hydac case study

Context - Motivations

Embedded systems

System

Hybrid systems mix discrete and continuous components : non trivial interactions

☛ difficult to develop

They are often safety critical

☛ need for formal methods

VerificationAn introduction to hybrid automta 9

Fig. 4. Hybrid automata for the burner and the thermometer

3. Jump({l11, l21}, !, {l12, l

22}) = Jump((l11, !, l12)) ! Jump((l21 , !, l22)) if ! "

"1 # "2;Conditions 1 and 2 express that discrete changes that are local to oneautomaton have the enabling condition and the e!ect described by thejump predicate of that automaton and the variables which are not sharedremain unchanged. Condition 3 expresses that discrete changes sharedby the two automata have as enabling condition the conjunction of theenabling conditions of each discrete change. Their e!ect is the conjunctionof the e!ects of each discrete change.

10 J.-F. Raskin

In our example, we obtain the complete system by composing the threeautomata. It is easy to show that the product operation that we have definedis commutative and associative, so we can write Sys = Tank!Burner!Thermo.Fig. 5 shows the hybrid automaton obtained by composing the automaton forthe tank and the automaton for the thermometer. We have omitted transitionsthat are incompatible with the invariant of their starting location. That is,edges e = (l, !, l!) such that [[Jump(e) " Inv(l)]]= # are not depicted.

Fig. 5. Product of tank and thermometer

3 Properties of Hybrid Systems

Properties assign values to trajectories of hybrid systems. In this introduc-tion, we restrict ourselves to properties that classify trajectories as good orbad according to whether or not they stay or not in a given set of (good)states. Those properties are called safety properties [AS85], and, are the mostimportant class of properties when considering safety critical systems.

Let us go back to our running example. Now that we have a complete modelof our system, we would like to design a controller that enforces some desiredbehaviors. The controller will be an additional hybrid automaton that, whencomposed with the automata modeling our system, must enforce the followingproperties on the trajectories of the entire system:

satisfies ?!(low ! x ! high)

Math. model

Hybrid automata

System

SynthesisAn introduction to hybrid automta 9

Fig. 4. Hybrid automata for the burner and the thermometer

3. Jump({l11, l21}, !, {l12, l

22}) = Jump((l11, !, l12)) ! Jump((l21 , !, l22)) if ! "

"1 # "2;Conditions 1 and 2 express that discrete changes that are local to oneautomaton have the enabling condition and the e!ect described by thejump predicate of that automaton and the variables which are not sharedremain unchanged. Condition 3 expresses that discrete changes sharedby the two automata have as enabling condition the conjunction of theenabling conditions of each discrete change. Their e!ect is the conjunctionof the e!ects of each discrete change.

construct ? such that!(low ! x ! high)

?

?

is satisfied no matterhow Env behaves !

Hybrid automata

System

An introduction to hybrid automta 9

Fig. 4. Hybrid automata for the burner and the thermometer

3. Jump({l11, l21}, !, {l12, l

22}) = Jump((l11, !, l12)) ! Jump((l21 , !, l22)) if ! "

"1 # "2;Conditions 1 and 2 express that discrete changes that are local to oneautomaton have the enabling condition and the e!ect described by thejump predicate of that automaton and the variables which are not sharedremain unchanged. Condition 3 expresses that discrete changes sharedby the two automata have as enabling condition the conjunction of theenabling conditions of each discrete change. Their e!ect is the conjunctionof the e!ects of each discrete change.

?

?

Hybrid automata

Synthesis implies “correct by construction”

System

construct ? such that!(low ! x ! high)

is satisfied no matterhow Env behaves !

Synthesis

Technical highlight(Untimed) Safety Games

Two-player game structures

0000

0101

1010

0100

1000

1101

1110

1111

Rounded positions belong to Player I

(Controller)

Square positions belong to Player 2

(Environment)

A game is played as follows: in each round, the game is in a position, if the game is in a rounded position, Player I resolves the choice for the next state, if the game is in a square position, Player 2 resolves the choice. The game is played for an infinite number of rounds.

0000

0101

1010

0100

1000

1101

1110

1111

Rounded positions belong to Player ISquare positions belong to Player 2

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000 0100

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000 0100 0101

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000 0100 0101 1101

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000 0100 0101 1101 ...

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000 0100 0101 1101 ...

Who is winning ?

0000

0101

1010

0100

1000

1101

1110

1111

Play : 0000 0100 0101 1101 ...

Is this a good or a bad play for Player 1 ?

Who is winning ?

0000

0101

1010

0100

1000

1101

1110

1111

Who is winning ?

A winning condition (for Player 1) is a set of playsW ! (Q1 " Q2)

!

Game=

Two-player game structure+

Winning condition for Player 1

Strategies

Players are playing according to strategies.

λ1(0011 1001 1101 0011)=1110

prefix of play

Player I’sposition

Choice for the next position

λ1 : S*•S1 → S

Strategies

Players are playing according to strategies.

λ1(0011 1001 1101 0011)=1110

prefix of play

Player I’sposition

Choice for the next position

λ1 : S*•S1 → SSymmetrically for Player 2

Outcome of strategiesIf we fix a strategy for the two players and we let the two players apply their strategies, we get a play:

Outcome(λ1,λ2)=1100 0011 0001 0011 ...

If we fix a strategy only for Player I, we get a set of plays

Outcome(λ1)=∪λ2Outcome (λ1,λ2)

A strategy for Player I is winning for objective W iff

Outcome(λ1) ⊆ W

Outcome of strategiesA strategy for Player I is winning for objective W iff

Outcome(λ1) ⊆ W

That is, no matter how Player 2 resolves his choices, when player I plays according to λI, the resulting play belongs to W.

⟹ Player I can force the play to be in W.

Winning strategies

=

Controllers that enforce winning plays

Safety Games

0000

0101

1010

0100

1000

1101

1110

1111

A Safety Game

Does Player I (rounded positions) have a strategy (against any choices of Player II) to stay within the set of states

?Q \ {1111}

0000

0101

1010

0100

1000

1101

1110

1111

A Safety Game

W = { s0 s1 s2 .... | si ≠ 1111, for all i≥0 }

Symbolic algorithms to solve games

Symbolic algorithm for safety games

Bad

From where can Player I avoid Bad ?

Bad

From where can Player I avoid Bad ?

Symbolic algorithm for safety games

Bad

From where can Player I avoid Bad ?

Symbolic algorithm for safety games

1 step unsafe

1 step unsafe

Bad

From where can Player I avoid Bad ?

Iterate untilstabilization.

Symbolic algorithm for safety games

Bad

From where can Player I avoid Bad ?

Symbolic algorithm for safety games

2 steps unsafe

2 steps unsafe

Bad

From where can Player I avoid Bad ?

Symbolic algorithm for safety games

3 steps unsafe

3 steps unsafe

Bad

From where can Player I avoid Bad ?

Stabilization !

Symbolic algorithm for safety games

This is exactly the set of states where Player I has a strategy

to avoid the bad states.

Symbolic algorithm for safety games

0000

0101

1010

0100

1000

1101

1110

1111

Similar fixed point algorithms exist for any omega-regular objectives

This extends to rich winning conditions !

Technical highlight:Timed Game

Controllable Predecessors

• Timed games are not turn-based

• Objective: reach goal (and so avoid L4).

• Strategies:f1 : history → (delay,controllable action)

• Existence of memoryless strategies:

f1 : (l,v) → (delay,controllable action)

f1(L0,0)=(0.5,L0→L1)

Note that this action may not happen if Player 2 plays faster.

Timed Game Structures

• Timed games are not turn-based

• Objective: reach goal (and so avoid L4).

• Strategies:f1 : history → (delay,controllable action)

• Existence of memoryless strategies:

f1 : (l,v) → (delay,controllable action)

f1(L0,0)=(0.5,L0→L1)

Note that this action may not happen if Player 2 plays faster.

Timed Game Structures

Player 1 controls plain arcsPlayer 2 controls dashed arcs

⚠Player 2 can play (L0,L4) only if Player 1 does not play before clock x has reached value 1 !⚠

Delays in playing are essential here !

Timed Game Structures• Timed games are not turn-based

• Strategy for player 1:f1 : history → (delay,controllable action)

• Objective: reach goal (and so avoid L4).

• Existence of memoryless strategies:

f1 : (l,v) → (delay,controllable action)

f1(L0,0)=(0.5,L0→L1)

Note that this action may not happen if Player 2 plays faster.

Questions that we want to answer:

• Does there exist a winning strategy for Player 1 ?

• If yes, get one which is as simple as possible.

Timed Game Structures

Timed Games

Theorem[AMPS98][HK99]. ➢ Safety, reachability, (co)-Büchi are EXPTIME-C. ➢ Region based strategies are sufficient.

➣ Region are not practical !

In [CDFLL05] Larsen et al propose an efficient on-the-fly algorithms for the analysis of timed games.

This algorithm is implemented in UppAal-Tiga.

Timed Game Structures

L0

L1

L2

L3

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

Timed Game Structures

L0

L1

L2

L3

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

Timed Game Structures

L0

L1

L2

L3

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

Timed Game Structures

L0

L1

L2

L3

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

Timed Game Structures

L0

L1

L2

L3

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

Timed Game Structures

L0

L1

L2

L3

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

0 1 2 3 ...

x≤1

x≤1

Timed GamesTheorem[AMPS98][HK99]. ➢ Safety, reachability, (co)-Büchi are EXPTIME-C. ➢ Region based strategies are sufficient.

➣ Region are not practical !

In [CDFLL05] Larsen et al. propose an efficient on-the-fly algorithm.

OTFUR [CDFLL05]

Forward exploration Backward propagation

Unsafe

Init

Unsafe

∧∨× ××

Only reachable states are exploredNon trivial use of Difference Bound Matrices.

Pl.1 Pl.2

OTFUR [CDFLL05]

Unsafe

Implemented in UppAal-Tiga

Recent Progressesin Timed Games

Imperfect information

• Embedded controllers get information through sensors.

• Sensors = finite precision = imperfect information.

• Timed games with imperfect information are undecidable [BDMP03].

• But interesting sub-cases are decidable [CDLLR07]:

• restrict the class of strategies to be regular functions from observation sequences to controllable actions;

• if not sufficient for control then add new observations.

!"#$ %&"'(&) '# !*)) +,-."/&'(#0 10#2).34. 5&-.3 6*,-.' 7#0-'"*8'(#0

!"#$%&'(& )*+&' ,-.+&/ 0#"+/1-2/3#"+,-."/&'(#0-9

••••: ! " ;<

;< # ! " ;=;= # !

6.0-#"

$ # <

6.0-.3

$ # ><

%&(0'

$ " ><

%(-'#0

$ # ><

?03

+@@

$ % <

A9B<

$ % C

$ 9= <

$ % C

$ 9= <

$ % C

$ 9= <

D(8DE

FG.0 ;< H ! " ;=I 'G. )#8&'(#0 (- %(-'#0J

KLMKN<O P;QR><R;<<OS L($.3 7#0'"#) 2('G +,-."/&'(#0 5&-.3 &03 6'*''."(04 T0/&"(&0' 6'"&'.4(.- >= R ;>

!"#$ %&"'(&) '# !*)) +,-."/&'(#0 10#2).34. 5&-.3 6*,-.' 7#0-'"*8'(#0

!"#$%&'(& )*+&' ,-.+&/ 0#"+/1-2/3#"+,-."/&'(#0-9

••••: ! " ;<

;< # ! " ;=;= # !

6.0-#"

$ # <

6.0-.3

$ # ><

%&(0'

$ " ><

%(-'#0

$ # ><

?03

+@@

$ % <

A9B<

$ % C

$ 9= <

$ % C

$ 9= <

$ % C

$ 9= <

D(8DE

FG.0 ;< H ! " ;=I 'G. )#8&'(#0 (- %(-'#0J

KLMKN<O P;QR><R;<<OS L($.3 7#0'"#) 2('G +,-."/&'(#0 5&-.3 &03 6'*''."(04 T0/&"(&0' 6'"&'.4(.- >= R ;>

Imperfect information

Kick!

Regular observation based strategyOnly observation changes are observed !

Optimal strategies

Specifications for embedded controllers:

✓ Regular objectives (safety, liveness, etc.)

+✓ Optimality criteria

(ex: energy consumption, time to reachability, etc.)

Weighted Timed Automata

2

More recently, the ability to consider more general performance measures has been

given. Priced extensions of timed automata have been introduced where a cost c is asso-ciated with each location ! giving the cost of a unit of time spent in !. In [3] cost-boundreachability has been shown decidable. [8] and [5] independently solve the cost-optimal

reachability problem for priced timed automata. Efficient incorporation in UPPAAL is

provided by use of so-called priced zones as a main data structure [25]. In [29] the im-

plementation of cost-optimal reachability is improved considerably by exploiting the

duality with linear programming problems over zones (min-cost flow problems). More

recently [11], the problem of computing optimal infinite schedules (in terms of minimal

limit-ratios) is solved for the model of priced timed automata.

The Optimal Cost Control Problem for Timed Games. In this paper we combine the

notions of game and price and solve the problem of cost-optimal winning strategies

for priced timed game automata.The problem we consider is: “Given a timed game

automaton A, a goal location Goal, what is the optimal cost we can achieve to reachGoal inA?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider theexample of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per

time unit) in locations !0, !2 and !3 are 5, 10 and 1 respectively. In !1 the environmentmay choose to move to either !2 or !3 (dashed arrows are uncontrollable). However, dueto the invariant y = 0 this choice must be made instantaneous. Obviously, once !2 or !3has been reached the optimal strategy for the controller is to move to Goal immediately(however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial(and only remaining) question is how long the controller should wait in !0 before takingthe transition to !1. Obviously, in order for the controller to win this duration must beno more than two time units. However, what is the optimal choice for the duration in the

sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delayin !0. Then 5t + 10(2 ! t) + 1 is the minimal cost through !2 and 5t + (2 ! t) + 7 isthe minimal cost through !3. As the environment chooses between these two transitionsthe best choice for the controller is to delay t " 2 such that max(21 ! 5t, 9 + 4t) isminimum, which is t = 4

3 giving a minimal cost of 14 13 .

!0

cost(!0) = 5

!1

[y = 0]

!2

cost(!2) = 10

!3

cost(!3) = 1

Goalx ! 2; c1 ; y := 0

u

u

x " 2; c2; cost = 1

x " 2; c2 ; cost = 7

Fig. 1. A Reachability Priced Time Game Automaton A

Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and

the more general case of non-acyclic games have been recently considered in [1]. In [1],

the problem they consider is “compute the optimal cost within k steps” (we refer to this

Locations are annotated with a cost (weight) per time unit (derivative)Transitions are annotated with a cost (weight).

[Alur et al. & Larsen et al., 2001]

Example from [Bouyer et al, 2004]

!!! Costs are observers !!!

Weighted Timed Automata

2

More recently, the ability to consider more general performance measures has been

given. Priced extensions of timed automata have been introduced where a cost c is asso-ciated with each location ! giving the cost of a unit of time spent in !. In [3] cost-boundreachability has been shown decidable. [8] and [5] independently solve the cost-optimal

reachability problem for priced timed automata. Efficient incorporation in UPPAAL is

provided by use of so-called priced zones as a main data structure [25]. In [29] the im-

plementation of cost-optimal reachability is improved considerably by exploiting the

duality with linear programming problems over zones (min-cost flow problems). More

recently [11], the problem of computing optimal infinite schedules (in terms of minimal

limit-ratios) is solved for the model of priced timed automata.

The Optimal Cost Control Problem for Timed Games. In this paper we combine the

notions of game and price and solve the problem of cost-optimal winning strategies

for priced timed game automata.The problem we consider is: “Given a timed game

automaton A, a goal location Goal, what is the optimal cost we can achieve to reachGoal inA?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider theexample of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per

time unit) in locations !0, !2 and !3 are 5, 10 and 1 respectively. In !1 the environmentmay choose to move to either !2 or !3 (dashed arrows are uncontrollable). However, dueto the invariant y = 0 this choice must be made instantaneous. Obviously, once !2 or !3has been reached the optimal strategy for the controller is to move to Goal immediately(however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial(and only remaining) question is how long the controller should wait in !0 before takingthe transition to !1. Obviously, in order for the controller to win this duration must beno more than two time units. However, what is the optimal choice for the duration in the

sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delayin !0. Then 5t + 10(2 ! t) + 1 is the minimal cost through !2 and 5t + (2 ! t) + 7 isthe minimal cost through !3. As the environment chooses between these two transitionsthe best choice for the controller is to delay t " 2 such that max(21 ! 5t, 9 + 4t) isminimum, which is t = 4

3 giving a minimal cost of 14 13 .

!0

cost(!0) = 5

!1

[y = 0]

!2

cost(!2) = 10

!3

cost(!3) = 1

Goalx ! 2; c1 ; y := 0

u

u

x " 2; c2; cost = 1

x " 2; c2 ; cost = 7

Fig. 1. A Reachability Priced Time Game Automaton A

Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and

the more general case of non-acyclic games have been recently considered in [1]. In [1],

the problem they consider is “compute the optimal cost within k steps” (we refer to this

[Alur et al. & Larsen et al., 2001]

Example from [Bouyer et al, 2004]

(l0,(0,0),0)−0.8→(l0,(0.8,0.8),4)−c1→(l1,(0.8,0),4)−u− (l2,(0.8,0),4)−3.1→(l2,(3.9,3.1),35) −c2→(Goal,(3.9,3.1),36)

Costs are accumulated but are not “tested” along the run.

!!! Costs are observers !!!

2

More recently, the ability to consider more general performance measures has been

given. Priced extensions of timed automata have been introduced where a cost c is asso-ciated with each location ! giving the cost of a unit of time spent in !. In [3] cost-boundreachability has been shown decidable. [8] and [5] independently solve the cost-optimal

reachability problem for priced timed automata. Efficient incorporation in UPPAAL is

provided by use of so-called priced zones as a main data structure [25]. In [29] the im-

plementation of cost-optimal reachability is improved considerably by exploiting the

duality with linear programming problems over zones (min-cost flow problems). More

recently [11], the problem of computing optimal infinite schedules (in terms of minimal

limit-ratios) is solved for the model of priced timed automata.

The Optimal Cost Control Problem for Timed Games. In this paper we combine the

notions of game and price and solve the problem of cost-optimal winning strategies

for priced timed game automata.The problem we consider is: “Given a timed game

automaton A, a goal location Goal, what is the optimal cost we can achieve to reachGoal inA?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider theexample of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per

time unit) in locations !0, !2 and !3 are 5, 10 and 1 respectively. In !1 the environmentmay choose to move to either !2 or !3 (dashed arrows are uncontrollable). However, dueto the invariant y = 0 this choice must be made instantaneous. Obviously, once !2 or !3has been reached the optimal strategy for the controller is to move to Goal immediately(however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial(and only remaining) question is how long the controller should wait in !0 before takingthe transition to !1. Obviously, in order for the controller to win this duration must beno more than two time units. However, what is the optimal choice for the duration in the

sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delayin !0. Then 5t + 10(2 ! t) + 1 is the minimal cost through !2 and 5t + (2 ! t) + 7 isthe minimal cost through !3. As the environment chooses between these two transitionsthe best choice for the controller is to delay t " 2 such that max(21 ! 5t, 9 + 4t) isminimum, which is t = 4

3 giving a minimal cost of 14 13 .

!0

cost(!0) = 5

!1

[y = 0]

!2

cost(!2) = 10

!3

cost(!3) = 1

Goalx ! 2; c1 ; y := 0

u

u

x " 2; c2; cost = 1

x " 2; c2 ; cost = 7

Fig. 1. A Reachability Priced Time Game Automaton A

Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and

the more general case of non-acyclic games have been recently considered in [1]. In [1],

the problem they consider is “compute the optimal cost within k steps” (we refer to this

Optimal strategy problem

2

More recently, the ability to consider more general performance measures has been

given. Priced extensions of timed automata have been introduced where a cost c is asso-ciated with each location ! giving the cost of a unit of time spent in !. In [3] cost-boundreachability has been shown decidable. [8] and [5] independently solve the cost-optimal

reachability problem for priced timed automata. Efficient incorporation in UPPAAL is

provided by use of so-called priced zones as a main data structure [25]. In [29] the im-

plementation of cost-optimal reachability is improved considerably by exploiting the

duality with linear programming problems over zones (min-cost flow problems). More

recently [11], the problem of computing optimal infinite schedules (in terms of minimal

limit-ratios) is solved for the model of priced timed automata.

The Optimal Cost Control Problem for Timed Games. In this paper we combine the

notions of game and price and solve the problem of cost-optimal winning strategies

for priced timed game automata.The problem we consider is: “Given a timed game

automaton A, a goal location Goal, what is the optimal cost we can achieve to reachGoal inA?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider theexample of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per

time unit) in locations !0, !2 and !3 are 5, 10 and 1 respectively. In !1 the environmentmay choose to move to either !2 or !3 (dashed arrows are uncontrollable). However, dueto the invariant y = 0 this choice must be made instantaneous. Obviously, once !2 or !3has been reached the optimal strategy for the controller is to move to Goal immediately(however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial(and only remaining) question is how long the controller should wait in !0 before takingthe transition to !1. Obviously, in order for the controller to win this duration must beno more than two time units. However, what is the optimal choice for the duration in the

sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delayin !0. Then 5t + 10(2 ! t) + 1 is the minimal cost through !2 and 5t + (2 ! t) + 7 isthe minimal cost through !3. As the environment chooses between these two transitionsthe best choice for the controller is to delay t " 2 such that max(21 ! 5t, 9 + 4t) isminimum, which is t = 4

3 giving a minimal cost of 14 13 .

!0

cost(!0) = 5

!1

[y = 0]

!2

cost(!2) = 10

!3

cost(!3) = 1

Goalx ! 2; c1 ; y := 0

u

u

x " 2; c2; cost = 1

x " 2; c2 ; cost = 7

Fig. 1. A Reachability Priced Time Game Automaton A

Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and

the more general case of non-acyclic games have been recently considered in [1]. In [1],

the problem they consider is “compute the optimal cost within k steps” (we refer to this

Mint(Max(5t + 10(2 ! t) + 1, 5t + (2 ! t) + 7))

Player I’s choice

Player II’s choice

Optimal strategy problem

t = time to wait in l0.

Mint(Max(5t + 10(2 ! t) + 1, 5t + (2 ! t) + 7))

Which is when t =3

4and the cost is 14

3

4

Optimal strategy problem

2

More recently, the ability to consider more general performance measures has been

given. Priced extensions of timed automata have been introduced where a cost c is asso-ciated with each location ! giving the cost of a unit of time spent in !. In [3] cost-boundreachability has been shown decidable. [8] and [5] independently solve the cost-optimal

reachability problem for priced timed automata. Efficient incorporation in UPPAAL is

provided by use of so-called priced zones as a main data structure [25]. In [29] the im-

plementation of cost-optimal reachability is improved considerably by exploiting the

duality with linear programming problems over zones (min-cost flow problems). More

recently [11], the problem of computing optimal infinite schedules (in terms of minimal

limit-ratios) is solved for the model of priced timed automata.

The Optimal Cost Control Problem for Timed Games. In this paper we combine the

notions of game and price and solve the problem of cost-optimal winning strategies

for priced timed game automata.The problem we consider is: “Given a timed game

automaton A, a goal location Goal, what is the optimal cost we can achieve to reachGoal inA?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider theexample of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per

time unit) in locations !0, !2 and !3 are 5, 10 and 1 respectively. In !1 the environmentmay choose to move to either !2 or !3 (dashed arrows are uncontrollable). However, dueto the invariant y = 0 this choice must be made instantaneous. Obviously, once !2 or !3has been reached the optimal strategy for the controller is to move to Goal immediately(however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial(and only remaining) question is how long the controller should wait in !0 before takingthe transition to !1. Obviously, in order for the controller to win this duration must beno more than two time units. However, what is the optimal choice for the duration in the

sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delayin !0. Then 5t + 10(2 ! t) + 1 is the minimal cost through !2 and 5t + (2 ! t) + 7 isthe minimal cost through !3. As the environment chooses between these two transitionsthe best choice for the controller is to delay t " 2 such that max(21 ! 5t, 9 + 4t) isminimum, which is t = 4

3 giving a minimal cost of 14 13 .

!0

cost(!0) = 5

!1

[y = 0]

!2

cost(!2) = 10

!3

cost(!3) = 1

Goalx ! 2; c1 ; y := 0

u

u

x " 2; c2; cost = 1

x " 2; c2 ; cost = 7

Fig. 1. A Reachability Priced Time Game Automaton A

Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and

the more general case of non-acyclic games have been recently considered in [1]. In [1],

the problem they consider is “compute the optimal cost within k steps” (we refer to this

Optimal strategy problem

2

More recently, the ability to consider more general performance measures has been

given. Priced extensions of timed automata have been introduced where a cost c is asso-ciated with each location ! giving the cost of a unit of time spent in !. In [3] cost-boundreachability has been shown decidable. [8] and [5] independently solve the cost-optimal

reachability problem for priced timed automata. Efficient incorporation in UPPAAL is

provided by use of so-called priced zones as a main data structure [25]. In [29] the im-

plementation of cost-optimal reachability is improved considerably by exploiting the

duality with linear programming problems over zones (min-cost flow problems). More

recently [11], the problem of computing optimal infinite schedules (in terms of minimal

limit-ratios) is solved for the model of priced timed automata.

The Optimal Cost Control Problem for Timed Games. In this paper we combine the

notions of game and price and solve the problem of cost-optimal winning strategies

for priced timed game automata.The problem we consider is: “Given a timed game

automaton A, a goal location Goal, what is the optimal cost we can achieve to reachGoal inA?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider theexample of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per

time unit) in locations !0, !2 and !3 are 5, 10 and 1 respectively. In !1 the environmentmay choose to move to either !2 or !3 (dashed arrows are uncontrollable). However, dueto the invariant y = 0 this choice must be made instantaneous. Obviously, once !2 or !3has been reached the optimal strategy for the controller is to move to Goal immediately(however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial(and only remaining) question is how long the controller should wait in !0 before takingthe transition to !1. Obviously, in order for the controller to win this duration must beno more than two time units. However, what is the optimal choice for the duration in the

sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delayin !0. Then 5t + 10(2 ! t) + 1 is the minimal cost through !2 and 5t + (2 ! t) + 7 isthe minimal cost through !3. As the environment chooses between these two transitionsthe best choice for the controller is to delay t " 2 such that max(21 ! 5t, 9 + 4t) isminimum, which is t = 4

3 giving a minimal cost of 14 13 .

!0

cost(!0) = 5

!1

[y = 0]

!2

cost(!2) = 10

!3

cost(!3) = 1

Goalx ! 2; c1 ; y := 0

u

u

x " 2; c2; cost = 1

x " 2; c2 ; cost = 7

Fig. 1. A Reachability Priced Time Game Automaton A

Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and

the more general case of non-acyclic games have been recently considered in [1]. In [1],

the problem they consider is “compute the optimal cost within k steps” (we refer to this

Optimal moves on rational points (not necessarily integer points !)

Optimization problems - HighlightsTheorem[ALP01,BGHRV01,BBBR07]. Optimal reachability in timed automata is PSpaceC.

Theorem[BCFL04,ABM04]. Optimal reachability in timed game automata with strictly positive costs is decidable.

Theorem[BBR04]. Cost optimal branching time logic model-checking on timed automata is undecidable.

Theorem[BBR05]. Cost optimal reachability in timed game automata is undecidable.

Theorem[BLMR06]. ε-Cost optimal reachability for 1-clock timed automata is decidable.

Theorem[BLM07]. Cost optimal branching time logic model-checking on 1-clock timed automata is decidable.

And more recently: (multi-)energy games, generalized mean-payoff games, ...

The Hydac Case-Study

2/ 30

In a few words

Case study submitted by the Hydac Gmbh Company, within the EuropeanResearch Project Quasimodo.

Concerns a controller for a Plastic Injection Molding Machine

Robust and optimal control

Tool chainSynthesis: UppAal TiGaVerification: PHAVerPerformance: Simulink

The gain is ! 45% w.r.t.classical controllers

5/ 30

Problem Description

Pump

Reservoir

Accumulator

Machine/Consumer

Vmax

Vmin

+2.2 litres/second

The Machine consumes oil inthe Accumulator

The Machine returns oil backinto the Reservoir

The Pump can move oil fromthe Reservoir into theAccumulator

The total amount of oil in thesystem is constant

6/ 30

Synthesis Objectives

Pump

Reservoir

Accumulator

Machine/Consumer

Vmax

Vmin

+2.2 litres/second Company’s Objectives:

Safety:the volume must stay within asafe interval [Vmin, Vmax ]

Optimization:minimize average/overall oil

volumeR t=T

t=0v(t)dt/T

Realistic Controllers:

Simple design

Implementability (tolerance toimprecise measures)

7/ 30

The Machine and the Pump

The Machine:

Infinite cyclic consumption to be satisfied by our control strategy.One cycle: 20 seconds ! Deterministic behaviour

0 2 4 6 8 10 12 14 16 18 200.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

Time (second)

1.2 1.2

2.5

1.7

0.5

Mac

hin

e R

ate

(lit

re/s

eco

nd

)

Fluctuations in consumption of 0.1 liter/sec! Non Determinism!

The Pump:

Latency constraint: 2 seconds between state change of the pump

7/ 30

The Machine and the Pump

The Machine:

Infinite cyclic consumption to be satisfied by our control strategy.One cycle: 20 seconds ! Deterministic behaviour

0 2 4 6 8 10 12 14 16 18 200.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

2.4

2.6

2.8

3.0

Time (second)

1.2 1.2

2.5

1.7

0.5

Mac

hin

e R

ate

(lit

re/s

eco

nd

)

Fluctuations in consumption of 0.1 liter/sec! Non Determinism!

The Pump:

Latency constraint: 2 seconds between state change of the pump

Noise iscontrolled by the

adversary

8/ 30

Two existing controllers

Bang-Bang controller:Use two min/max bounds on the volume to

control the pump:

if V < Vminsafe , start the pump;

if V > Vmaxsafe , stop the pump;

0 20 40 60 80 100 120 140 160 180 2000

5

10

15

20

25

Time [s]

Vo

lum

e [

l],

pu

mp

[o

n/o

ff]

2!point controller

Volume

Pump on/off

Company’s controller:400 lines of C-code in Simulink

Roughly, use sampling (measures every10ms) of the volume to compute adequatedates of control.

0 20 40 60 80 100 120 140 160 180 2000

5

10

15

20

25

Time [s]

Vo

lum

e [

l],

pu

mp

[o

n/o

ff]

Hydac controller

Volume

Pump on/off

! this shows less energy used: in average ! 15%

10/ 30

An hybrid modelization

i1 i2 i3 i4 i5

i6i7i8i9

[t ! 2][t ! 4] [t ! 8] [t ! 10] [t ! 12]

[t ! 14][t ! 16][t ! 18][t ! 20]

mr := 0t := 0

t = 2mr :=1.2

t = 4mr :=0

t = 8mr :=1.2

t = 10mr :=2.5

t = 12mr :=0

t = 14mr :=1.7

t = 16mr :=0.5

t = 18mr :=0

t = 20t :=0

(a) The Machine

dvdt

" [pr # m+r (!); pr # m!

r (!)]

dVaccdt

= v

[true]

v := 10.0Vacc := 0

(b) The Accumulator

O! On

[true] [true]

z := 2pr := 0

z $ 2, switch on!

pr := 2.2, z := 0

z $ 2, switch o!!

pr := 0, z := 0

(c) The Pump

! synthesize a controller for the pump ensuring safety, optimality, with asimple design, and robust to imprecisions in measures of volume and time.

Undecidable problem!

Our approach“Very” abstract UppAal-Tiga model

Abstract control strategy

Automatic synthesis

PhaVer HA modelfor robustness

verification

Simulink modelfor peformances

evaluation

Embed Embed

Model of one cycle

✓The controller gets only level at the start✓The controller can decide to open/close the pump k times during the cycle

➡ simple controller✓The noise is controlled by the adversary

➡ noise is out of control of the system✓The noise is not observable during run

➡ explicit modeling of imperfect information✓Oil level is discretized

➡ model is simple enough (but nor an over-approximation nor under-approximation)

Additional automata

Pump: on/off+delays Scheduler models the interaction between cycle || controller || pump

Complete model for one cycle

But we want a solution for an

y number of cycles

Complete model for one cycle

But we want a solution for an

y number of cycles

Idea:formulate a control

objective that ensures that the strategy is inductive

Complete model for one cycle

Inductive strategiesO

il le

vel

② Wherever we start in should end in① Safe with Min-Max ③ ⊆

④ Find ( , ) that minimizes mean-level. +margin(robustness)

Time - activations

1 cycle

18/ 30

Results - An Example

Margin: m = 0.4 l

Granularity: D = 1

Best stable interval: I = [5.1, 10]

Representation of the strategy: (for any initial volume, it gives two periods ofactivation)

50

time (s)

initia

lvo

lum

e(l

)

5 10 15

6

7

8

9

10

PhaVer model

PhaVer HA modelfor robustness

verification

One of our automatically generated controller

☛ Robustness OK☛ Any number of cycles OK☛ Correctness w.r.t. over-approx. OK

Verification necessarybecause our game model

is “very” abstract !

Performace evaluation

Mathlab - Simulink

Bang-BangHydac solution

or Automatically generated controller

Simulation

Stochastic model of noise

27/ 30

Plots of execution

0 20 40 60 80 100 120 140 160 180 2000

5

10

15

20

25

Time [s]

Vo

lum

e [

l],

pu

mp

[o

n/o

ff]

2!point controller

Volume

Pump on/off

0 20 40 60 80 100 120 140 160 180 2000

5

10

15

20

25

Time [s]

Vo

lum

e [

l],

pu

mp

[o

n/o

ff]

Hydac controller

Volume

Pump on/off

0 20 40 60 80 100 120 140 160 180 2000

5

10

15

20

25

Time [s]

Vo

lum

e [

l],

pu

mp

[o

n/o

ff]

m4g1

Volume

Pump on/off

28/ 30

Numerical results

Performances obtained according to Simulink simulations:

Controller Acc. volume Mean volume Mean volume (Tiga)

Bang-Bang 2689 13.45 -Hydac 2232 11.16 -

G1M4 1511 7.56 8.45G1M3 1511 7.56 8.35G1M2 1518 7.59 8.25G1M1 1518 7.59 8, 2

G2M4 1527 7.64 8.05G2M3 1513 7.57 7.95G2M2 1500 7.5 7.95G2M1 1489 7.44 7.95

This shows a vaste reduction of energy used:! 45% w.r.t. Bang-bang controller! 33% w.r.t. Hydac’s controller

Conclusion

• Automatically generated controllers

• Correct and robust !

• More efficient than BB and Hydac solutions.

• Aggressive abstractions: need for verification.

• Explicit modeling of imperfect information (algorithms for imperfect information not yet available).

• Discretization of costs:

• we need further research efforts !

• new interesting theoretical questions !