Perugia and Roma “Tor Vergata” research units: "Uncertain production systems: optimal feedback control of the single site and extension to the multi-site case"


Workshop: Ottimizzazione e Controllo delle Supply Chain (Optimization and Control of Supply Chains)

Siena, Certosa di Pontignano, 23-25 October 2005

Francesco Martinelli, Fabio Piedimonte

Università di Roma “Tor Vergata”

Mauro Boccadoro, Paolo Valigi

Università di Perugia


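[Figure: single-site model. A machine feeds a buffer x(t) at rate α(t)u(t); the buffer is depleted by the demand d.]

x(t): backlog/inventory level at time t (fluid model), evolving as

$$\dot{x}(t) = \alpha(t)\,u(t) - d, \qquad u(t) \in [0, \mu]$$

α(t): state of the failure-prone machine, with α(t)=1 if the machine is up at time t and α(t)=0 if it is down. Failure and working times follow a deterministic or random law that depends on the production control.

d: demand rate; μ: maximum production rate.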


Two main objectives:

Single site. Several papers deal with single failure-prone machines. In the Markov case, the literature reports (mainly on numerical grounds) a marked difference between the case where the failure rate is a convex function of the production rate and the case where it is concave [Hu, Vakili, Yu, 1994; Liberopoulos, Caramanis, 1994]. The first objective is to explore this difference analytically, in both the Markovian and the non-Markovian (deterministic) case.

Multi site. Explore the case where the production of each site may be supplemented by the production of the other, at some penalty (modeling, for example, transportation costs).


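Minimize

$$J = \lim_{T\to\infty} \frac{1}{T}\, E\!\left[\int_0^T g(x(t))\,dt\right], \qquad 0 \le u(t) \le \mu$$

[Figure: cost function g(x) versus x, piecewise linear with slope cm on the backlog side (x<0) and slope cp on the inventory side (x>0).]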


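The single site case

The site is modeled as a failure-prone machine with a failure-repair process which can be:

Markov: a two-state chain with state 0 (machine down) and state 1 (machine up), repair rate qu (from 0 to 1) and failure rate qd(u) (from 1 to 0).

Deterministic: the machine has a deterioration level z(t) with deterioration rate

$$\dot{z}(t) = a\,u(t)^{\gamma} + b$$

The machine is stopped for a repair/maintenance operation when z(t)=1.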


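Single site, Markov: the homogeneous case (qd constant)

Optimal policy: hedging point policy (Kimemia and Gershwin, 1983; Bielecki and Kumar, 1988)

$$u^*(x) = \begin{cases} \mu & x < z^* \\ d & x = z^* \\ 0 & x > z^* \end{cases}$$

[Figure: sample trajectory x(t) versus t under the hedging point policy, with the level z marked.]

$$z^* = \begin{cases} 0 & \text{if } \dfrac{\mu\, q_d\,(c_p + c_m)}{c_p\,(\mu - d)(q_d + q_u)} \le 1 \\[2ex] \dfrac{1}{\lambda}\,\log \dfrac{\mu\, q_d\,(c_p + c_m)}{c_p\,(\mu - d)(q_d + q_u)} & \text{otherwise} \end{cases} \qquad \lambda = \frac{q_u}{d} - \frac{q_d}{\mu - d} = \frac{q_u(\mu - d) - q_d\, d}{d\,(\mu - d)}$$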
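The closed-form hedging point for a constant failure rate can be evaluated directly. The sketch below is illustrative only: it writes μ for the maximum production rate, assumes the piecewise-linear cost g(x) = cp·max(x,0) + cm·max(-x,0), and uses the Bielecki-Kumar expression together with the average cost obtained from the exponential steady-state density. The function names and the parameter set (borrowed from the two-site example later in the talk, applied to one site) are choices of this example.

```python
import math

def hedging_point(mu, d, qu, qd, cp, cm):
    """Closed-form hedging point z* for a constant failure rate qd."""
    lam = qu / d - qd / (mu - d)
    if lam <= 0:
        # equivalent to mu*qu/(qu+qd) <= d: average capacity below demand
        raise ValueError("unstable system: capacity cannot meet demand")
    ratio = mu * qd * (cp + cm) / (cp * (mu - d) * (qd + qu))
    return 0.0 if ratio <= 1.0 else math.log(ratio) / lam

def average_cost(z, mu, d, qu, qd, cp, cm):
    """Average cost J(z) of the hedging point policy, valid for z >= 0."""
    lam = qu / d - qd / (mu - d)
    # B/lam is the steady-state probability of being strictly below z
    B = lam * mu * qd / ((mu - d) * (qd + qu))
    return cp * z - cp * B / lam**2 + (cp + cm) * B * math.exp(-lam * z) / lam**2

params = dict(mu=5.0, d=4.0, qu=1.0, qd=0.01, cp=1.0, cm=50.0)
z_star = hedging_point(**params)
print(z_star, average_cost(z_star, **params))
```

Evaluating `average_cost` slightly above and below `z_star` is a quick way to confirm that the returned threshold is a minimum of J(z).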


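Single site, Markov: a non homogeneous case (qd=qd(u))

$$q_d(u) = \begin{cases} q_{d1} & u \in (U, \mu] \\ q_{d2} & u \in (0, U] \end{cases}$$

[Figure: two-level failure rate qd(u) versus u, equal to qd2 up to U and to qd1 above it; d is marked on the u axis.]

$$u^*(x) = \begin{cases} \mu & x < X^* \\ U & X^* \le x < Z^* \\ d & x = Z^* \\ 0 & x > Z^* \end{cases} \qquad \text{(OPT)}$$

[Figure: trajectory x(t) versus t, with the thresholds X and Z marked.]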
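As a minimal sketch, the two-threshold policy (OPT) can be written as a plain feedback function of the buffer level. The name `u_opt`, the argument order, and the numerical thresholds in the usage line are hypothetical; `mu` stands for the maximum production rate and `U` for the intermediate rate at which the failure rate switches.

```python
def u_opt(x, X, Z, mu, U, d):
    """Two-threshold feedback policy (OPT) for the machine in the up state."""
    if X > Z:
        raise ValueError("thresholds must satisfy X <= Z")
    if x < X:
        return mu    # below X: produce at the maximum rate
    if x < Z:
        return U     # between X and Z: produce at the intermediate rate U
    if x == Z:
        return d     # at Z: match the demand and hold the level
    return 0.0       # above Z: stop producing

# Illustrative thresholds only, not values taken from the talk
print([u_opt(x, X=5.0, Z=12.0, mu=30.0, U=22.0, d=20.0) for x in (0.0, 8.0, 12.0, 15.0)])
```

With X = Z the intermediate band disappears and the function reduces to a hedging point policy.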




Procedure followed for the proof and for the computation of the optimal thresholds X* and Z*

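Take X ≤ Z and apply policy (OPT). At steady state the buffer level is a random variable with pdf:

$$p(x) = \begin{cases} K_0(X,Z)\,\dfrac{\mu}{\mu - d}\, e^{-\lambda_2 (Z - X)}\, e^{-\lambda_1 (X - x)} & x < X \\[2ex] K_0(X,Z)\,\dfrac{U}{U - d}\, e^{-\lambda_2 (Z - x)} & X \le x < Z \end{cases}$$

where:

$$K_0(X,Z) = \left[ \frac{d}{q_{d2}} + \frac{U}{(U-d)\,\lambda_2}\left(1 - e^{-\lambda_2 (Z-X)}\right) + \frac{\mu}{(\mu-d)\,\lambda_1}\, e^{-\lambda_2 (Z-X)} \right]^{-1}$$

and

$$\lambda_1 = \frac{q_u(\mu - d) - q_{d1}\, d}{d\,(\mu - d)}, \qquad \lambda_2 = \frac{q_u(U - d) - q_{d2}\, d}{d\,(U - d)}$$

For the level x=Z, there is a point mass probability P(X,Z) := K0(X,Z) d/qd2.

X ≤ Z have to be properly selected to minimize:

$$J(X,Z) = \int_{-\infty}^{Z} g(x)\,p(x)\,dx + P(X,Z)\,g(Z)$$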
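The cost J(X,Z) can be evaluated in closed form from the steady-state density and then minimized by brute force. The sketch below is one possible implementation, not the procedure used by the authors. It assumes g(x) = cp·max(x,0) + cm·max(-x,0), a piecewise-exponential density with rates λ1 = qu/d - qd1/(μ-d) below X and λ2 = qu/d - qd2/(U-d) between X and Z, and a point mass K0·d/qd2 at Z. The helper names, the grid bounds, and the value qd2 = 0.01 are hypothetical (the talk's example does not list qd2 in the surviving text).

```python
import math

def _F(x, lam):
    """Antiderivative of x*exp(lam*x); its limit at -inf is 0 for lam > 0."""
    if x == -math.inf:
        return 0.0
    return math.exp(lam * x) * (x / lam - 1.0 / lam**2)

def _G(a, b, lam, cp, cm):
    """Integral over [a, b] of g(x)*exp(lam*x), g piecewise linear."""
    pos = _F(max(b, 0.0), lam) - _F(max(a, 0.0), lam)   # part with x > 0
    neg = _F(min(b, 0.0), lam) - _F(min(a, 0.0), lam)   # part with x < 0
    return cp * pos - cm * neg

def cost(X, Z, mu, U, d, qu, qd1, qd2, cp, cm):
    """Average cost J(X, Z) of policy (OPT); needs lam1 > 0 and lam2 != 0."""
    lam1 = qu / d - qd1 / (mu - d)
    lam2 = qu / d - qd2 / (U - d)
    e = math.exp(-lam2 * (Z - X))
    k0 = 1.0 / (d / qd2 + U / ((U - d) * lam2) * (1.0 - e)
                + mu / ((mu - d) * lam1) * e)
    low = mu / (mu - d) * e * math.exp(-lam1 * X) * _G(-math.inf, X, lam1, cp, cm)
    mid = U / (U - d) * math.exp(-lam2 * Z) * _G(X, Z, lam2, cp, cm)
    g_at_Z = cp * max(Z, 0.0) + cm * max(-Z, 0.0)
    return k0 * (low + mid + g_at_Z * d / qd2)

def grid_search(params, lo=0.0, hi=400.0, step=2.0):
    """Brute-force minimum of J over grid pairs X <= Z: returns (J, X, Z)."""
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    return min((cost(X, Z, **params), X, Z) for X in grid for Z in grid if X <= Z)

# qd2 is an assumed value for illustration
example = dict(mu=30.0, U=22.0, d=20.0, qu=0.5, qd1=0.06, qd2=0.01, cp=1.0, cm=100.0)
print(grid_search(example))
```

A useful sanity check is the degenerate case U = μ and qd1 = qd2, where the value no longer depends on X and should coincide with the single-threshold hedging point cost. If the minimizer lands on the edge of the grid, the bounds need widening before the result is trusted.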

Once X* and Z* have been found and the optimal J* has been derived, compute the cost-to-go functions by solving the HJB equations where the min operation has been replaced by the (supposed) optimal policy u*(x):

$$(u^*(x) - d)\,\frac{dV_1}{dx} + q_d(u^*(x))\,[V_0(x) - V_1(x)] = J^* - g(x)$$

$$-d\,\frac{dV_0}{dx} + q_u\,[V_1(x) - V_0(x)] = J^* - g(x)$$



Once the cost-to-go functions V0(x) and V1(x) have been computed, show that these functions, with the policy considered to compute them, satisfy the following HJB equations:

$$\min_{u \in [0,\mu]} \left\{ (u - d)\,\frac{dV_1}{dx} + q_d(u)\,[V_0(x) - V_1(x)] \right\} = J^* - g(x)$$

$$-d\,\frac{dV_0}{dx} + q_u\,[V_1(x) - V_0(x)] = J^* - g(x)$$

If these equations are satisfied and the cost-to-go functions are C1 and bounded by a quadratic function, then the considered policy is optimal.


Single site, Markov: a non homogeneous case (qd=qd(u)). Computation of X* and Z*

Example: μ=30; U=22; qd1=0.06; d=20; cm=100; cp=1; qu=0.5


Single site, Markov: a general heuristic approach for the non homogeneous case

In the general case we propose the following heuristic approach:

discretize qd(u), obtaining a multi-value failure rate function with production levels Ui and corresponding failure rates qdi;

apply the results of the two-level failure rate case to the multi-value case by considering each pair (Ui, Uj), with Ui > Uj, and the corresponding qdi and qdj: this gives thresholds X*ij and Z*ij such that

$$u(x) = \begin{cases} U_i & x < X^*_{ij} \\ U_j & X^*_{ij} \le x < Z^*_{ij} \\ 0 & x > Z^*_{ij} \end{cases}$$

select the longest sequence of all the X*ij computed.

[Figure: example of a selected sequence, with thresholds X*24 and X*45 on the x axis and production levels U2, U4 and U5.]


For multi-value failure rate functions (such as the ones obtained by discretizing qd(u) = a u^γ + b), Liberopoulos and Caramanis (IEEE TAC, 1994) found numerically that:

if γ ≤ 1, the optimal feedback policy operates the machine at maximum rate until a safety stock Z* is reached (i.e. it is a hedging point policy);

if γ > 1, the optimal feedback policy progressively reduces the production rate from its maximum value as the inventory level increases.

The heuristic proposed above confirms these findings.

[Figures: u*(x) versus x in the two cases, with the safety stock Z* marked on the x axis.]




Example: μ=50; d=1; cm=1000; cp=1; qu=0.5



Example: U1=50; U2=25; U3=5; qd1=0.02; qd3=0.002; d=1; cm=1000; cp=1; qu=0.5

For qd2=0.01 the points (Ui,qdi) lie on a line.
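The collinearity statement is simple arithmetic on the three points (Ui, qdi), and the same computation tells on which side of the chord a different qd2 would fall. The sketch below is illustrative; the function names and the "convex"/"concave" labels are conventions of this example, with "convex" meaning that the middle point lies below the chord joining the two outer points.

```python
def chord_value(U1, qd1, U3, qd3, U2):
    """Failure rate at U2 on the straight line through (U1, qd1) and (U3, qd3)."""
    return qd3 + (qd1 - qd3) * (U2 - U3) / (U1 - U3)

def shape(qd2, on_chord, tol=1e-12):
    """Classify the three-level failure rate by the position of the middle point."""
    if abs(qd2 - on_chord) <= tol:
        return "linear"
    return "convex" if qd2 < on_chord else "concave"

line = chord_value(U1=50.0, qd1=0.02, U3=5.0, qd3=0.002, U2=25.0)
print(line, shape(0.01, line), shape(0.005, line), shape(0.015, line))
```

The chord passes through 0.002 + 0.018·20/45 = 0.01 at U2 = 25, which is the value of qd2 quoted for the linear case.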


Remark.

The discussion above seems to conflict with the results of Hu, Vakili and Yu (IEEE TAC, 1994), where the hedging point policy is proved optimal iff the exponent γ of the failure rate function is 0 or 1.

This is not a conflict: if 0<γ<1 the optimal policy is probably a switched non-feedback policy, with the hedging point policy remaining optimal among feedback policies.

Single site, deterministic

To clarify this we have considered a deterministic system and approached it through the Maximum Principle.

To simplify the analysis we have considered a symmetric system and a quadratic cost function:

$$g(x) = c\,x^2$$

Deterioration rate:

$$\dot{z}(t) = a\,u(t)^{\gamma} + b$$

The machine is stopped when z(t)=1. After each repair z=0.

The system is stable if and only if there exists a constant production rate (not larger than μ) which is large enough to meet the demand.


Single site, deterministic

The analysis of this case confirms the heuristic and the numerical results of the Markov system (γ denotes the exponent of the production rate in the deterioration rate):

If γ=0 or γ=1 (affine case), the optimal policy is μ-d-μ (similar to the hedging point policy).

If 0<γ<1, the optimal policy looks macroscopically like μ-d-μ, but an infinite number of switches between 0 and μ is performed to obtain a production rate equal to d.

If γ>1, the optimal policy reduces the production rate around 0.

[Figures: trajectories x(t) around the level 0 in the three cases.]
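One way to read the role of the exponent is to compare the average deterioration of two ways of delivering a mean output d: producing d constantly, or switching quickly between 0 and the maximum rate. The sketch below is an illustration under stated assumptions, not the authors' Maximum Principle analysis. It takes the deterioration rate literally as a·u^γ + b (so deterioration continues at rate b while idle), writes `mu` for the maximum rate, and uses hypothetical values for a and b.

```python
def deterioration_rate(u, a, b, gamma):
    """Deterioration rate a*u**gamma + b at production rate u (gamma > 0)."""
    return a * u**gamma + b

def constant_vs_switched(mu, d, a, b, gamma):
    """Average deterioration: constant rate d versus fast 0/mu switching."""
    constant = deterioration_rate(d, a, b, gamma)
    duty = d / mu  # fraction of time spent at mu so that the mean output is d
    switched = (duty * deterioration_rate(mu, a, b, gamma)
                + (1.0 - duty) * deterioration_rate(0.0, a, b, gamma))
    return constant, switched

for gamma in (0.5, 1.0, 2.0):
    print(gamma, constant_vs_switched(mu=5.0, d=4.0, a=0.1, b=0.01, gamma=gamma))
```

For a concave rate (γ below 1) the switched schedule wears the machine less than the constant one, for a convex rate (γ above 1) it wears it more, and in the affine case the two coincide, which matches the three behaviors listed above.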


Multi site, Markov, homogeneous

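Each site is like the one considered in the classical paper of Bielecki and Kumar, for which the hedging point policy is optimal.

A two site system

[Figure: two sites, each with its own buffer x(t), machine state α(t), production u(t) and demand d; each site can also send part of its production to the other.]

A penalty cost (a) is incurred whenever a site receives items produced by the other site:

$$g(x,u) = \sum_{i=1}^{2} \left( c_p\, x_i^+ + c_m\, x_i^- \right) + a\,(u_{12} + u_{21})$$

with x_i^+ = max(x_i, 0), x_i^- = max(-x_i, 0), and u_12, u_21 the flows exchanged between the two sites.

$$J = \lim_{T\to\infty} \frac{1}{T}\, E\!\left[\int_0^T g(x(t),u(t))\,dt\right]$$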
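The two-site running cost is easy to state in code. This is a minimal sketch assuming the cost is the sum of the single-site inventory/backlog costs plus the penalty a times the total exchanged flow; the names `site_cost`, `running_cost`, `u12` and `u21` are labels chosen for this example.

```python
def site_cost(x, cp, cm):
    """Inventory cost cp*x for x > 0, backlog cost cm*|x| for x < 0."""
    return cp * max(x, 0.0) + cm * max(-x, 0.0)

def running_cost(x1, x2, u12, u21, cp, cm, a):
    """Instantaneous cost g(x, u) of the two-site system."""
    return site_cost(x1, cp, cm) + site_cost(x2, cp, cm) + a * (u12 + u21)

# Site 1 holds 2 units, site 2 is 1 unit short, 0.5 units/time are exchanged
print(running_cost(2.0, -1.0, 0.5, 0.0, cp=1.0, cm=50.0, a=10.0))
```

In this usage line the three contributions are 2 (inventory), 50 (backlog) and 5 (transfer penalty).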



Using a dynamic programming approach, in the operational state s=(1,1) one can expect the following regions, whose shape in the state space (x1,x2) is usually very complex to derive:

[Figure: control regions in the (x1,x2) plane, expressed in terms of v1 and v2.]

$$v_i = \frac{\partial V_{11}}{\partial x_i}$$

V11(x) being the cost-to-go function in the operational state (1,1).



Through a numerical integration of the HJB equations (for a finite inventory system with loss cost R, Δx=0.1), we have derived the following solutions, corresponding to the s=(1,1) state (arrows denote the production flow):

[Figure panels: a=10, a=50, a=∞]

System parameters: μ=5, d=4, qu=1, qd=0.01, cm=50, cp=1, R=2500



In the case the operational state is s=(0,1) and a=50:



Single site theoretical values: z*=3.8, J*=7.73

Hedging point and total cost as a function of the cost parameter a:



Numerical solution through Hamilton Jacobi Bellman (HJB) equations

Performance index to minimize: the average expected cost J. Optimal value: J*

Js(k)(x): the minimum average expected cost on a time horizon kΔt, starting in (s,x); hence it is 0 for k=0 for all s and x.

Js(k)(x) is updated through an iterative equation on the discretized state space, and

$$\lim_{k\to\infty} J_s^{(k)}(x) = J^*$$

This gives the optimal minimum cost J* but not the optimal policy.



Applying a stable stationary policy, let J=E[g(x,u)] at steady state.

Then define a differential cost Vs(x): the total (not average) expected cost in [0,T] from x(0)=x and s(0)=s can be written as J T + Vs(x). For the optimal policy, J=J* and the differential cost is V*s(x).

Vs(k)(x): the minimum expected differential cost on a time horizon kΔt, starting in (s,x); hence it is 0 for k=0 for all s and x.

Vs(k)(x) is updated through an iterative equation on the discretized state space, and

$$\lim_{k\to\infty} V_s^{(k)}(x) = V^*_s(x)$$

From V*s(x) it is straightforward to get the optimal policy.
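The iteration on a discretized state space can be illustrated on a single site with a constant failure rate. The sketch below is one possible discretization, not the scheme used in the talk: time step, grid step, buffer truncation (clamping at the ends, with no loss cost R) and the restriction of the control to the three values 0, d and the maximum rate are all assumptions of this example. It iterates a total-cost Bellman backup, estimates the average cost from the increment between the last two iterates, and reads the greedy policy off the last backup.

```python
MU, D, QU, QD = 5.0, 4.0, 1.0, 0.01   # rates of the two-site example, one site only
CP, CM = 1.0, 50.0                     # inventory and backlog cost rates
DT = 0.25                              # time step (assumption of this sketch)
DX = (MU - D) * DT                     # grid step: one cell per step at rate MU
DOWN = round(D * DT / DX)              # cells lost per step with no production
X_MIN, X_MAX = -20.0, 12.0             # truncated buffer range (assumption)
N = round((X_MAX - X_MIN) / DX) + 1    # number of grid points
SHIFT = {0.0: -DOWN, D: 0, MU: 1}      # cell displacement for u in {0, d, mu}

def g(x):
    return CP * max(x, 0.0) + CM * max(-x, 0.0)

def backup(W0, W1):
    """One Bellman step; W0/W1 are the costs with the machine down/up."""
    new0, new1, policy = [], [], []
    for i in range(N):
        stage = g(X_MIN + i * DX) * DT
        j = max(i - DOWN, 0)           # machine down: the buffer only drains
        new0.append(stage + QU * DT * W1[j] + (1 - QU * DT) * W0[j])
        best_val, best_u = None, None
        for u, s in SHIFT.items():     # machine up: pick the cheapest control
            j = min(max(i + s, 0), N - 1)
            val = stage + QD * DT * W0[j] + (1 - QD * DT) * W1[j]
            if best_val is None or val < best_val:
                best_val, best_u = val, u
        new1.append(best_val)
        policy.append(best_u)
    return new0, new1, policy

def solve(iters=1500):
    """Returns the average-cost estimate and the greedy up-state policy."""
    W0, W1 = [0.0] * N, [0.0] * N
    for _ in range(iters):
        old1 = W1
        W0, W1, policy = backup(W0, W1)
    i0 = round(-X_MIN / DX)            # grid index of x = 0
    J = (W1[i0] - old1[i0]) / DT       # cost increment per unit time
    return J, policy

J_est, policy = solve()
print(J_est, policy[0], policy[-1])
```

The returned policy lists the chosen production rate for each grid point in the up state, so the hedging point can be located as the level where the choice switches from the maximum rate to d. The estimate depends on the step sizes and on the truncation, so it should be compared with the continuous-time values only approximately.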




A single site and a multi site system have been considered in this research.

As for the single site problem:

The optimal analytical solution for a non homogeneous Markov failure-prone system has been completely derived.

This solution has been used to investigate (through a heuristic approach) the property observed in the literature that a major difference arises depending on whether the failure rate of the machine is a concave or a convex function of the production rate.

A similar behavior has been observed in a deterministic scenario where the machine is characterized by a deterioration rate which is a deterministic function of the production rate.

As for the multi site problem, an HJB approach has been used to analyze a Markov, homogeneous, two site system, and the optimal solution has been completely derived numerically for some examples.


As for the single site problem:

The general Markov non homogeneous case could be better analyzed, improving the heuristic and studying its validity.

The deterministic case should be generalized and possibly approached through a numerical algorithm to solve the maximum principle equations.

As for the multi site problem:

More general models to describe some typical dynamical phenomena of supply chains are under investigation.
