T. Messelis, S. Haspeslagh, P. De Causmaecker, B. Bilgin, G. Vanden Berghe

Hardness Studies for NRP


Introduction
Method
Our work
Conclusions
Future work

Overview


predict performance
◦ of one or more algorithms
◦ on a specific problem instance

to be able to
◦ know in advance how well an algorithm will do
◦ choose the ‘best’ algorithm out of a portfolio
◦ choose the ‘best’ parameter setting

Introduction
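A minimal sketch of the portfolio idea: given one performance model per algorithm, pick the algorithm with the best predicted performance on the instance at hand. The solver names and the two linear models are hypothetical and only illustrate the selection step; lower predicted cost is assumed to be better.

```python
from typing import Callable, Dict, Sequence

# A model maps an instance's feature vector to a predicted cost (lower is better).
Model = Callable[[Sequence[float]], float]


def select_algorithm(models: Dict[str, Model], features: Sequence[float]) -> str:
    """Return the name of the algorithm with the lowest predicted cost."""
    predictions = {name: model(features) for name, model in models.items()}
    return min(predictions, key=predictions.get)


# Hypothetical performance models for two hypothetical solvers.
portfolio: Dict[str, Model] = {
    "solver_a": lambda f: 2.0 + 0.5 * f[0],
    "solver_b": lambda f: 5.0 + 0.1 * f[0],
}

print(select_algorithm(portfolio, [1.0]))   # predicted costs 2.5 vs 5.1
print(select_algorithm(portfolio, [20.0]))  # predicted costs 12.0 vs 7.0
```

The same selection step could be applied to parameter settings by treating each setting as a separate entry in the portfolio.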


Build empirical hardness models
◦ empirical: performance of some algorithm
◦ hardness: measured by some performance criteria
    time spent by an algorithm searching for a solution
    quality of an (optimal) solution
    gap between found and optimal solution

Model hardness as a function of features
◦ computationally inexpensive ‘properties’
    e.g. clauses-to-variables ratio (SAT)
    e.g. maximum consecutive working days (NRP)

Method
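As an illustration of a computationally inexpensive feature, the clauses-to-variables ratio of a SAT instance can be computed in a single pass over the formula. The sketch below assumes clauses are given as lists of DIMACS-style integer literals; the example formula is made up.

```python
from typing import List

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means NOT x3


def clauses_to_variables_ratio(clauses: List[Clause]) -> float:
    """Number of clauses divided by the number of distinct variables."""
    variables = {abs(literal) for clause in clauses for literal in clause}
    if not variables:
        raise ValueError("formula has no variables")
    return len(clauses) / len(variables)


# Made-up formula with 6 clauses over 4 variables.
formula = [
    [1, -2, 3], [-1, 2, 4], [2, -3, -4],
    [1, 3, 4], [-1, -2, -3], [2, 3, -4],
]
ratio = clauses_to_variables_ratio(formula)
print(ratio)  # 6 / 4 = 1.5
```

An NRP feature such as the maximum number of consecutive working days is cheaper still, since it can be read directly from the instance definition.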


Introduced by K. Leyton-Brown et al.
1. Select problem instance distribution
2. Select one or more algorithms
3. Create a set of features
4. Generate an instance set, calculate features and determine the algorithm performances
5. Eliminate redundant or uninformative features
6. Use machine learning techniques to select functions of the features that approximate the algorithm’s performances

K. Leyton-Brown, E. Nudelman, Y. Shoham. Learning the empirical hardness of optimisation problems: The case of combinatorial auctions. In LNCS, 2002

General procedure
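The six steps can be sketched end to end with ordinary least squares as the learning technique. This is a toy illustration under stated assumptions, not the authors' setup: the instance distribution, the three features and the `measured_performance` stand-in are all hypothetical, and a real study would record actual algorithm runs in step 4.

```python
import random
from typing import List, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # Step 5 matters here: redundant features make the system singular.
            raise ValueError("singular system: features may be redundant")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_least_squares(rows: List[List[float]], targets: List[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, w1, w2, ...].
    """
    x = [[1.0] + row for row in rows]  # prepend the intercept column
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


rng = random.Random(0)

# Steps 1 and 4: draw instances from a hypothetical two-parameter distribution.
instances = [(rng.randint(6, 10), rng.randint(2, 6)) for _ in range(200)]


# Step 3: features are the raw parameters plus their ratio.
def features(instance: Tuple[int, int]) -> List[float]:
    a, b = instance
    return [float(a), float(b), a / b]


# Steps 2 and 4: stand-in for running an algorithm and measuring its performance.
def measured_performance(instance: Tuple[int, int]) -> float:
    a, b = instance
    return 60.0 - 6.0 * a + 3.0 * b + rng.gauss(0.0, 1.0)


X = [features(i) for i in instances]
y = [measured_performance(i) for i in instances]

# Step 6: fit a linear model of the features to the measured performance.
weights = fit_least_squares(X, y)
print(weights)
```

Linear regression is only one possible choice for step 6; it is used here because it matches the regression tables reported later in the slides.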


This strategy has been successful in different areas:
◦ combinatorial auction: winner determination problem
◦ uniform random 3-SAT
    accurate algorithm performance prediction
    algorithm portfolio approach (SATzilla)
        won several gold medals in SAT competitions

Apply it to Nurse Rostering!

Our motivation


problem of assigning nurses to shifts, given a set of hard and soft constraints

Performance:
◦ time spent by a complete search algorithm to find the optimal roster
◦ quality of this optimal roster
◦ quality of a roster obtained by a heuristic algorithm, run for some fixed period of time
◦ quality gap between both solutions

Nurse Rostering Problem
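To make the quality and gap measures concrete, here is a small sketch with a single soft constraint. The penalty scheme (one point per day that a nurse's longest working streak exceeds the limit) and both rosters are hypothetical; real NRP objectives combine many weighted soft constraints.

```python
from typing import List

Roster = List[List[bool]]  # one row per nurse, True means the nurse works that day


def longest_work_streak(days: List[bool]) -> int:
    """Length of the longest run of consecutive working days."""
    longest = current = 0
    for works in days:
        current = current + 1 if works else 0
        longest = max(longest, current)
    return longest


def soft_penalty(roster: Roster, max_cons_working_days: int) -> int:
    """Hypothetical objective: days over the limit in each nurse's longest streak."""
    return sum(
        max(0, longest_work_streak(days) - max_cons_working_days)
        for days in roster
    )


def quality_gap(heuristic_objective: int, optimal_objective: int) -> int:
    """Gap between the heuristic roster and the optimal roster (0 means equal)."""
    return heuristic_objective - optimal_objective


# Made-up one-week rosters for two nurses.
heuristic_roster = [
    [True, True, True, True, True, True, False],
    [True, True, False, True, True, False, False],
]
optimal_roster = [
    [True, True, True, True, False, True, True],
    [True, True, False, True, True, False, False],
]

gap = quality_gap(soft_penalty(heuristic_roster, 4), soft_penalty(optimal_roster, 4))
print(gap)  # heuristic penalty 2, optimal penalty 0
```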


Translate NRP instances to SAT instances and use existing SAT features to build models

translation based on numberings

solve instances to optimum using CPLEX

run a metaheuristic for 10 seconds

NRP: first approach


Regression results on predicting CPLEX objective

NRP: first approach

Regression Statistics
Multiple R          0,994111737
R Square            0,988258146
Adjusted R Square   0,986086846
Standard Error      3,055746115
Observations        500

ANOVA
              df     SS             MS             F              Significance F
Regression    7      387449,5709    55349,9387     5927,650752    0
Residual      493    4603,429069    9,337584318
Total         500    392053

                    Coefficients   Standard Error t Stat         P-value        Lower 95%      Upper 95%
Intercept           0              -              -              -              -              -
VCG CN variation    549,4532437    26,55991161    20,68731446    6,6305E-69     497,2686628    601,6378245
VCG CN max          16,14493411    0,510688309    31,61406642    1,1547E-120    15,14154014    17,14832809
VCG VN min          -27,68374404   0,867307099    -31,91919458   4,7809E-122    -29,38781814   -25,97966995
CG mean             -126,0651055   3,028079093    -41,63203852   1,6608E-163    -132,0146372   -120,1155737
BAL PL/C variation  14819,45307    331,7582798    44,66942944    1,9828E-175    14167,61857    15471,28757
BAL 1               -20743,13734   555,6276197    -37,33280457   8,8022E-146    -21834,82751   -19651,44717
BAL 3               -1719,072905   46,94112572    -36,62189346   9,3813E-143    -1811,30224    -1626,843571
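The slides do not define the SAT feature names. Assuming that "VCG CN" and "VCG VN" refer to clause-node and variable-node degrees in the variable-clause graph (one node per variable and per clause, with an edge when the variable occurs in the clause), statistics such as "variation", "min" and "max" could be computed as sketched below. The interpretation of the names, the use of the coefficient of variation and the absence of any normalisation are assumptions.

```python
from statistics import mean, pstdev
from typing import Dict, List

Clause = List[int]  # DIMACS-style literals


def summarise(degrees: List[int]) -> Dict[str, float]:
    """Mean, coefficient of variation, min and max of a degree sequence."""
    m = mean(degrees)
    variation = pstdev(degrees) / m if m else 0.0
    return {"mean": m, "variation": variation, "min": min(degrees), "max": max(degrees)}


def vcg_degree_stats(clauses: List[Clause]) -> Dict[str, Dict[str, float]]:
    """Degree statistics for clause nodes and variable nodes of the graph."""
    # Clause-node degree: number of distinct variables in the clause.
    clause_degrees = [len({abs(lit) for lit in clause}) for clause in clauses]
    # Variable-node degree: number of clauses in which the variable occurs.
    variable_degrees: Dict[int, int] = {}
    for clause in clauses:
        for variable in {abs(lit) for lit in clause}:
            variable_degrees[variable] = variable_degrees.get(variable, 0) + 1
    return {
        "clause_nodes": summarise(clause_degrees),
        "variable_nodes": summarise(list(variable_degrees.values())),
    }


# Made-up formula: clause degrees are 2, 3, 2, 4 and variable degrees are 3, 3, 2, 3.
stats = vcg_degree_stats([[1, -2], [2, 3, -4], [-1, 4], [1, 2, 3, 4]])
print(stats["clause_nodes"]["max"], stats["variable_nodes"]["min"])
```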


Come up with a feature set specifically for NRP and build models from these features

on the same set of NRP instances for the same performance indicators

NRP: second approach


very simple set of features:
◦ some problem parameters
    max & min number of assignments
    max & min number of consecutive working days
    max & min number of consecutive free days
◦ and ratios of those parameters
    max cons working days / min cons working days
    max num assignments / min cons working days
    availability / coverage requirements (tightness)
    ...
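A sketch of how such a feature vector could be assembled from the instance parameters. The `NrpParameters` container and its field names are hypothetical, as is measuring availability and coverage as total nurse-shift counts; the slides only name the ratios.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NrpParameters:
    """Hypothetical container for the parameters of one NRP instance."""
    max_assignments: int
    min_assignments: int
    max_cons_working_days: int
    min_cons_working_days: int
    max_cons_free_days: int
    min_cons_free_days: int
    availability: float  # assumed: total nurse-shifts that can be staffed
    coverage: float      # assumed: total nurse-shifts that must be staffed


def nrp_features(p: NrpParameters) -> Dict[str, float]:
    """Raw parameters plus ratios; assumes all denominators are non-zero."""
    return {
        "max_assignments": float(p.max_assignments),
        "min_assignments": float(p.min_assignments),
        "max_cons_working_days": float(p.max_cons_working_days),
        "min_cons_working_days": float(p.min_cons_working_days),
        "max_cons_free_days": float(p.max_cons_free_days),
        "min_cons_free_days": float(p.min_cons_free_days),
        "max_cwd/min_cwd": p.max_cons_working_days / p.min_cons_working_days,
        "max_assignments/min_cwd": p.max_assignments / p.min_cons_working_days,
        "tightness": p.availability / p.coverage,
    }


# Made-up instance parameters.
example = NrpParameters(
    max_assignments=8, min_assignments=4,
    max_cons_working_days=6, min_cons_working_days=3,
    max_cons_free_days=3, min_cons_free_days=1,
    availability=120.0, coverage=100.0,
)
example_features = nrp_features(example)
print(example_features["tightness"])
```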



Regression results on predicting CPLEX objective


Regression Statistics
Multiple R          0,970029662
R Square            0,940957544
Adjusted R Square   0,940117509
Standard Error      2,589352208
Observations        500

ANOVA
              df     SS             MS             F              Significance F
Regression    7      52571,81553    7510,259362    1120,140964    1,2709E-297
Residual      492    3298,734469    6,704744855
Total         499    55870,55

                                          Coefficients   Standard Error t Stat         P-value        Lower 95%      Upper 95%
Intercept                                 66,15482266    1,55195585     42,62674268    2,6721E-167    63,10554404    69,20410128
max num assignments (6 - 10)              -6,522149568   0,134955581    -48,32812042   5,9072E-189    -6,787309925   -6,256989211
min cons working days (2 - 6)             2,783980539    0,249717561    11,14851728    6,86075E-26    2,293336157    3,274624921
max cons working days (3 - 8)             -2,616555252   0,078539247    -33,31525785   3,2938E-128    -2,770868949   -2,462241554
min cons free days (1 - 3)                7,971520213    0,658767454    12,10065884    1,09346E-29    6,677175717    9,26586471
max cons free days (2 - 3)                -2,153326636   0,589449923    -3,653112085   0,000286891    -3,311476237   -0,995177036
max num assignments / min cons work days  1,589055655    0,337448761    4,709027968    3,24188E-06    0,926037249    2,252074061
max cons FD / min cons FD                 2,646476238    0,638865523    4,142462133    4,0444E-05     1,391235001    3,901717476
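The summary statistics of this regression follow from the ANOVA sums of squares, and the fitted model is a plain linear function of the features. The sketch below recomputes the statistics and applies the coefficients to one instance. The instance values are made up (chosen inside the parameter ranges shown in the table), and reading "FD" as "free days" is an assumption.

```python
import math

# Values copied from the regression table (decimal commas written as points).
n_observations = 500
n_features = 7
ss_regression = 52571.81553
ss_residual = 3298.734469
ss_total = 55870.55

df_residual = n_observations - n_features - 1  # 492: the model has an intercept
r_square = 1.0 - ss_residual / ss_total
adjusted_r_square = 1.0 - (1.0 - r_square) * (n_observations - 1) / df_residual
ms_regression = ss_regression / n_features
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)

intercept = 66.15482266
coefficients = {
    "max_num_assignments": -6.522149568,
    "min_cons_working_days": 2.783980539,
    "max_cons_working_days": -2.616555252,
    "min_cons_free_days": 7.971520213,
    "max_cons_free_days": -2.153326636,
    "max_num_assignments/min_cons_working_days": 1.589055655,
    "max_cons_free_days/min_cons_free_days": 2.646476238,
}


def predict_objective(features: dict) -> float:
    """Predicted CPLEX objective: intercept plus the weighted sum of features."""
    return intercept + sum(
        coefficient * features[name] for name, coefficient in coefficients.items()
    )


# Made-up instance, with each parameter inside the range listed in the table.
instance = {
    "max_num_assignments": 8,
    "min_cons_working_days": 3,
    "max_cons_working_days": 5,
    "min_cons_free_days": 2,
    "max_cons_free_days": 3,
}
instance["max_num_assignments/min_cons_working_days"] = 8 / 3
instance["max_cons_free_days/min_cons_free_days"] = 3 / 2

predicted = predict_objective(instance)
print(r_square, adjusted_r_square, f_statistic, standard_error, predicted)
```

The recomputed R Square, Adjusted R Square, F and Standard Error agree closely with the values reported in the table, which also confirms how the fused ANOVA cells split.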


We can build accurate models to predict algorithm performance, based on very basic properties of NRP instances.
◦ objective values
    models for the objective values of both CPLEX and the metaheuristic are fairly accurate
◦ gap
    less accurate predictions, however with a standard error of 0.95
◦ CPLEX running time
    not very accurate, due to very high variability in the running time

Conclusions


building models on a larger scale
◦ now only a very limited dataset

more (sophisticated) features for NRP instances
◦ now only a very basic set with some aggregate functions of it

combining both SAT features and NRP features

Future work


Questions?
