A New Model for Automated Table Layout

Mihai Bilauca, Patrick Healy

Department of Computer Science and Information Systems, University of Limerick, Ireland

DocEng2010, September 21–24, 2010, Manchester, United Kingdom

Supported by Science Foundation Ireland under the research programme 01/P1.2/C009, Mathematical Foundations, Practical Notations, and Tools for Reliable Flexible Software.


In this presentation we consider the table layout problem. We present a combinatorial optimization modeling method for the table layout optimization problem, the problem of minimizing a table’s height subject to it fitting on a given page (width). We present two models of the problem and report on their evaluation. http://doi.acm.org/10.1145/1860559.1860594


Page 2: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout


Overview

• An exact combinatorial optimization modelling method for tables that do not contain spanning cells, with examples in OPL:
  – Mixed Integer Programming model - MIP
  – Constraint Programming model - CP

• Experimental results for tables with a size of up to 40x40 (1,600 cells and 9,000 text configurations)

• Conclusions

Page 3: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout


Why a new method?

• For applications where finding the layout with the minimum height for a given width is important.

• Because it is exact (not based on heuristics).

• Priority should be given to user constraints imposed by space limitations or other aesthetic criteria.

• To the best of our knowledge this is the first attempt to report on run time for large table sizes and specific (rather than heuristic) cell configurations.

Page 4: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

The Table Layout Problem

Find a layout $\lambda$ of a table $\Im$ with minimum $height(\lambda)$

such that $width(\lambda) < W$,

where $W$ is a given page width.

Page 5: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Definitions (Anderson and Sobti)

$\Im$ is an m x n table with m rows and n columns; $\lambda$ is a layout of $\Im$.

$C_{i,j} : \{(w_{ij}^{k}, h_{ij}^{k}) \mid 1 \le k \le K_{ij}\}$ is the set of configurations for cell i,j, with $k_{i,j}$, $1 \le k_{i,j} \le K_{ij}$, the index of the configuration selected from $C_{i,j}$.

$height(\lambda) = \sum_{i=1}^{m} h_i$, $width(\lambda) = \sum_{j=1}^{n} w_j$

where $h_i = \max_j(h_{ij})$ for each row i, $w_j = \max_i(w_{ij})$ for each column j, and $h_{ij} = h_{ij}^{k_{i,j}}$, $w_{ij} = w_{ij}^{k_{i,j}}$.
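The definitions above can be read as a small evaluation routine. The sketch below is illustrative only and is not taken from the presentation: the nested-list representation `configs[i][j]`, the 0-based `selection` indices, the function name `layout_size` and the sample numbers are all assumptions.

```python
def layout_size(configs, selection):
    """Return (width, height) of the layout picked by `selection`.

    configs[i][j]   -- list of (w, h) pairs, i.e. the set C_ij (assumed layout)
    selection[i][j] -- 0-based index k_ij of the configuration chosen for cell i,j
    """
    m, n = len(configs), len(configs[0])
    chosen = [[configs[i][j][selection[i][j]] for j in range(n)] for i in range(m)]
    # h_i is the tallest chosen cell in row i; w_j is the widest in column j.
    row_h = [max(chosen[i][j][1] for j in range(n)) for i in range(m)]
    col_w = [max(chosen[i][j][0] for i in range(m)) for j in range(n)]
    return sum(col_w), sum(row_h)


# Hypothetical 2x2 table: each cell lists its (w, h) alternatives.
table = [
    [[(127, 10), (92, 20)], [(75, 20)]],
    [[(65, 10)], [(40, 10), (20, 20)]],
]
print(layout_size(table, [[0, 0], [0, 0]]))  # wide first cell
print(layout_size(table, [[1, 0], [0, 1]]))  # narrow alternatives
```

Choosing a narrower configuration for one cell only helps if it was the widest in its column, which is what makes the selection problem combinatorial.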

Page 6: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Cell configurations

Example:

Cell configurations for cell i,j

this blue sky

this blue grey

this blue sky

$C_{i,j} : \{(w_{ij}^{k}, h_{ij}^{k}) \mid 1 \le k \le K_{ij}\}$
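One way to picture how a configuration set might be produced is to re-wrap the cell text at every possible line width. The sketch below is an assumption made for illustration, not the authors' paragraphing step: it measures width in characters and height in lines, uses Python's greedy `textwrap`, and keeps only pairs that are not dominated by a pair that is both no wider and no taller.

```python
import textwrap


def cell_configurations(text):
    """Return the non-dominated (width, height) pairs for one cell's text.

    Width is counted in characters and height in lines; a real system
    would use font metrics instead (assumption for illustration).
    """
    text = " ".join(text.split())
    longest_word = max(len(word) for word in text.split())
    found = set()
    for limit in range(longest_word, len(text) + 1):
        lines = textwrap.wrap(text, width=limit)
        found.add((max(len(line) for line in lines), len(lines)))
    # Drop a pair when another pair is no wider and no taller.
    keep = [c for c in found
            if not any(o != c and o[0] <= c[0] and o[1] <= c[1] for o in found)]
    return sorted(keep)


print(cell_configurations("this blue sky"))
```

For the three-word text this yields three alternatives: one word per line, two lines, and a single line.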

Page 7: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Integer Programming definition

minimize $\sum_{i=1}^{m} \max_{j=1}^{n} h_{i,j,k} \cdot x_{i,j,k}$

subject to

1) $\sum_{k} x_{i,j,k} = 1$

2) $\sum_{j=1}^{n} \max_{i=1}^{m} w_{i,j,k} \cdot x_{i,j,k} \le W$

where $x \in \{0,1\}$ and $1 \le k \le K_{i,j}$
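For very small tables the integer program can be checked by plain enumeration. The following sketch is not the MIP solved in the presentation; it simply tries every assignment that satisfies constraint 1 (exactly one configuration per cell), discards those that violate constraint 2, and keeps the smallest height. The data layout, the name `solve_exact` and the sample table are assumptions.

```python
from itertools import product


def solve_exact(configs, page_w):
    """Brute-force the layout problem; exponential, so tiny tables only.

    configs[i][j] is the list of (w, h) pairs for cell i,j (assumed layout).
    Returns (height, width, {cell: chosen index}) or None if nothing fits.
    """
    m, n = len(configs), len(configs[0])
    cells = [(i, j) for i in range(m) for j in range(n)]
    best = None
    # Constraint 1: one index k per cell, enumerated over all cells.
    for choice in product(*(range(len(configs[i][j])) for i, j in cells)):
        picked = {cell: configs[cell[0]][cell[1]][k] for cell, k in zip(cells, choice)}
        width = sum(max(picked[i, j][0] for i in range(m)) for j in range(n))
        height = sum(max(picked[i, j][1] for j in range(n)) for i in range(m))
        # Constraint 2: the table must fit the page width.
        if width <= page_w and (best is None or height < best[0]):
            best = (height, width, dict(zip(cells, choice)))
    return best


table = [
    [[(127, 10), (92, 20)], [(75, 20)]],
    [[(65, 10)], [(40, 10), (20, 20)]],
]
print(solve_exact(table, 170))
print(solve_exact(table, 100))
```

The number of assignments is the product of all $K_{i,j}$, which is why the presentation turns to MIP and CP engines for realistic sizes.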

Page 8: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Table Layout problem is NP-complete

Demonstrated by:

• 1996 Wang - demonstration using large integers;

• 1999 Anderson and Sobti - using reductions of the clique problem to the table layout, on simple tables.


Page 9: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

OPL

OPL – Optimization Programming Language designed for solving combinatorial optimization problems.

• support for:
  – MIP and constraint programming, including search specification;
  – logical and higher order constraints;
  – scheduling and resource allocation applications;

• shares structure and syntax features with mathematical programming languages such as AMPL or GAMS

• problems can be formulated in a language similar to their algebraic notation

Page 10: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

OPL Keywords

dvar – decision variable. The purpose of an OPL model is to find values for the decision variables such that all constraints are satisfied

dexpr – decision expression. A named expression over decision variables that lets the model be written in a more compact way

{dataType} - set of type dataType

<x,y> - represents a tuple value;


Page 11: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Data types

tuple Conf {int w; int h;}
tuple CellConf {int i; int j; Conf c;}

int pageW;           // page width
{CellConf} configs;  // set of cell configurations

sample data:
configs = {<0,0, <127,10>>, <0,0, <92,20>>,
           <0,1, <75,20>>, <0,2, <65,10>>, …}

{Cell} cells = {<i,j> | <i,j,k> in configs};
{int} rows = {i | <i,j> in cells};
{int} cols = {j | <i,j> in cells};

Page 12: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Basic MIP model - BMIP

dvar int cellSel[configs] in 0..1;

minimize tableH;

constraints {
  ct1: tableW <= pageW;

  ct2: // select only one cell configuration
    forall(i in rows, j in cols)
      sum(<i,j,k> in configs) cellSel[<i,j,k>] == 1;
}

Page 13: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

BMIP – Table width/height

// cell width

dexpr int cellW[<i,j> in cells] =

sum(<i,j,k> in configs) cellSel[<i,j,k>] * k.w

// column width

dexpr int colW[j in cols] =

max(i in rows) cellW[<i,j>]

// table width

dexpr int tableW = sum(j in cols) colW[j]

• Table height is computed in a similar manner

Page 14: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

MIP model

The MIP model is based on two observations:

a) the minimum column width minW[j] is the maximum of the minimum cell widths in column j;

b) the width of each column is one of the values in the union of its cells' configuration widths, colWset <j, k.w>.

Page 15: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

MIP model

// compute minW[j]

int minW[j in columns] = max(i in rows) min(<i,j,k> in configs) k.w

int minH[i in rows] = … //and similarly minH[i]

// for each column j, the set of possible widths

{ColW} colWset = {<j, k.w> | <i,j,k> in configs:

k.w >= minW[j] &&

k.h >= minH[i]}

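The reduction above can be mirrored in a few lines of Python to see which widths survive the filter. This is a sketch under assumptions: configurations are flattened into a hypothetical `CellConf(i, j, w, h)` record, every cell is assumed to have at least one configuration, and the sample data is invented for illustration.

```python
from collections import namedtuple

# Flattened stand-in for the OPL tuple <i, j, <w, h>> (assumed representation).
CellConf = namedtuple("CellConf", "i j w h")


def reduce_widths(configs):
    """Compute minW, minH and the filtered set of candidate column widths."""
    rows = sorted({c.i for c in configs})
    cols = sorted({c.j for c in configs})
    # minW[j]: the largest of the per-cell minimum widths in column j.
    min_w = {j: max(min(c.w for c in configs if (c.i, c.j) == (i, j)) for i in rows)
             for j in cols}
    # minH[i]: the largest of the per-cell minimum heights in row i.
    min_h = {i: max(min(c.h for c in configs if (c.i, c.j) == (i, j)) for j in cols)
             for i in rows}
    # Same filter as on the slide: keep <j, w> only if w >= minW[j] and h >= minH[i].
    col_w_set = sorted({(c.j, c.w) for c in configs
                        if c.w >= min_w[c.j] and c.h >= min_h[c.i]})
    return min_w, min_h, col_w_set


sample = [
    CellConf(0, 0, 127, 10), CellConf(0, 0, 92, 20),
    CellConf(0, 1, 75, 20),
    CellConf(1, 0, 65, 10),
    CellConf(1, 1, 40, 10), CellConf(1, 1, 20, 20),
]
min_w, min_h, col_w_set = reduce_widths(sample)
print(min_w, min_h, col_w_set)
```

On this sample the six configurations collapse to one candidate width per column, which shows how much the filter can shrink the domain of colSel before the solver starts.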

Page 16: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

MIP model

Two decision variables:

// column width selector

dvar int colSel[colWset] in 0..1

// cell configuration selector

dvar int cellSel[configs] in 0..1

// objective function

minimize tableH


Page 17: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

MIP model

constraints {
  ct1: … // table width < page width

  ct2: // only one configuration per column
    forall(j in columns)
      sum(<j,k> in colWset) colSel[<j,k>] == 1;

  ct3: // cell width <= column width
    forall(j in columns, i in rows)
      cellW[<i,j>] <= colW[j];

  ct4: … // select only one cell configuration
}

Page 18: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Constraint Programming model - CP

The advantage of CP comes from its support for two main activities of combinatorial optimization algorithms:
– constraint reasoning for domain reduction
– search strategy specification for improved performance

The binary decision variables colSel and cellSel of the MIP model become integer decision variables colW[j] and rowH[i]

Page 19: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Constraint Programming model - CP

Strategy: find colW[j] and rowH[i] from the sets of possible widths and heights, colWset and rowHset, such that

• the height of the table layout is minimized;
• table width is less than page width;
• for each cell i,j there is at least one configuration k with a width less than or equal to colW[j] and a height less than or equal to rowH[i]:

$w_{i,j,k} \le colW[j]$, $h_{i,j,k} \le rowH[i]$

Page 20: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

CP model - definitions

tuple Dim {int index; int value;}

Reduce the set of possible widths and heights:

{Dim} colWset = {<j, k.w> | <i,j,k> in configs : k.w >= minW[j] && k.h >= minH[i]}

{Dim} rowHset = {<i, k.h> | <i,j,k> in configs : k.w >= minW[j] && k.h >= minH[i]}

Page 21: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

CP model - definitions

// two decision variables
dvar int colW[j in columns] in min(<j,w> in colWset) w .. max(<j,w> in colWset) w;
dvar int rowH[i in rows] in min(<i,h> in rowHset) h .. max(<i,h> in rowHset) h;

// define table width/height
dexpr int tableW = sum(j in columns) colW[j];
dexpr int tableH = sum(i in rows) rowH[i];

// objective function
minimize tableH;

Page 22: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

CP model - constraints

constraints {
  ct1: … // table width < page width

  ct2: forall(j in columns)
         sum(<j,w> in colWset) (colW[j] == w) == 1;

  ct3: forall(i in rows)
         sum(<i,h> in rowHset) (rowH[i] == h) == 1;

  ct4: forall(i in rows, j in columns)
         sum(<i,j,k> in fConfigs) (k.w <= colW[j] && k.h <= rowH[i]) >= 1;
}
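The shift from selecting configurations to selecting column widths and row heights can be illustrated without a CP engine. The sketch below is plain enumeration, not constraint propagation, and is an assumption-laden illustration: it tries every combination of candidate column widths, applies the page-width constraint, and uses the ct4 condition to derive the smallest row heights for which every cell still has a fitting configuration. The names `min_row_heights`, `best_layout` and the sample table are hypothetical.

```python
from itertools import product


def min_row_heights(configs, col_w):
    """Smallest row heights compatible with the given column widths, or None."""
    row_h = []
    for row in configs:
        heights = []
        for cell, width in zip(row, col_w):
            # ct4: the cell needs at least one configuration that fits the column.
            fitting = [h for (w, h) in cell if w <= width]
            if not fitting:
                return None
            heights.append(min(fitting))
        row_h.append(max(heights))
    return row_h


def best_layout(configs, page_w):
    """Enumerate column widths; return (height, colW, rowH) or None."""
    n = len(configs[0])
    # Candidate widths per column: every width that occurs in that column.
    candidates = [sorted({w for row in configs for (w, _) in row[j]}) for j in range(n)]
    best = None
    for col_w in product(*candidates):
        if sum(col_w) > page_w:  # ct1: table must fit the page
            continue
        row_h = min_row_heights(configs, col_w)
        if row_h is not None and (best is None or sum(row_h) < best[0]):
            best = (sum(row_h), col_w, tuple(row_h))
    return best


table = [
    [[(127, 10), (92, 20)], [(75, 20)]],
    [[(65, 10)], [(40, 10), (20, 20)]],
]
print(best_layout(table, 170))
```

Once the column widths are fixed the row heights follow directly, so the search space here is the product of the column domains rather than the product of all cell configuration counts.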

Page 23: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

CP model – search strategy

OPL 5.5 has no search language, but the default search can be tuned by specifying search phases.

Maximal regret heuristic: the regret of a column is the difference between its first and second choice of possible width value; the column with the largest regret is chosen first.
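A small sketch may make the regret idea concrete. It assumes that the "first choice" of a column is the largest width left in its domain and the "second choice" the next largest, and that a column with a single remaining value has regret 0; both are assumptions for illustration, and the function names are hypothetical rather than part of any OPL or CP Optimizer API.

```python
def regret_on_max(domain):
    """Difference between the largest and second largest value of a domain."""
    values = sorted(set(domain), reverse=True)
    # A fixed variable has no second choice; treat its regret as 0 (assumption).
    return values[0] - values[1] if len(values) > 1 else 0


def pick_column(domains):
    """Choose the column whose regret is largest (maximal regret heuristic)."""
    return max(domains, key=lambda j: regret_on_max(domains[j]))


# Hypothetical remaining width domains for two columns.
domains = {0: [65, 92, 127], 1: [20, 40, 95]}
print({j: regret_on_max(d) for j, d in domains.items()})
print(pick_column(domains))
```

Branching first on the column that loses most by missing its preferred value tends to expose costly decisions early in the search.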

Page 24: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

CP model – setting search strategy

var f = cp.factory

var phase0 = f.searchPhase(colW, f.selectLargest(f.regretOnMax()), f.selectLargest(f.value()));

var phase1 = f.searchPhase(rowH, f.selectLargest(f.regretOnMin()), f.selectSmallest(f.value()));

cp.setSearchPhases(phase0, phase1);


Page 25: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Adding user imposed constraints

Adding the user constraint that all columns must be of equal width is easy:

constraints{

ct6:

forall(ordered j1, j2 in columns )

colW[j1] == colW[j2] ;

…}

the forall expression is equivalent to:

forall(j1,j2 in 1..n : j1 < j2).

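The pairs visited by an ordered forall correspond to the 2-element combinations of the column indices. The Python sketch below illustrates that equivalence and the equal-width check; the function names are hypothetical and the code is not part of the OPL model.

```python
from itertools import combinations


def ordered_pairs(columns):
    """All pairs (j1, j2) with j1 < j2, as in forall(ordered j1, j2 in columns)."""
    return list(combinations(sorted(columns), 2))


def all_equal_width(col_w):
    """Check the user constraint colW[j1] == colW[j2] for every ordered pair."""
    return all(col_w[j1] == col_w[j2]
               for j1, j2 in ordered_pairs(range(len(col_w))))


print(ordered_pairs([1, 2, 3]))
print(all_equal_width([50, 50, 50]), all_equal_width([50, 60, 50]))
```

Using ordered pairs posts each equality once instead of twice, which keeps the number of constraints at n(n-1)/2.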

Page 26: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Experimental results

What are the factors that impact on computational time?

We tested the proposed models on two types of tables:
• tables with a small number of cells but a large number of cell configurations
• large tables (size up to 40x40) with cells that contain text with up to 6 words.

Page 27: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Experimental results

Tables with large amounts of text

Running time for a 3x3 table and up to 200 words per cell.

MIP model running time for a 3x3 table; word-count per cell varies from 200 to 600.



Page 29: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Experimental results

Tables with increasing size and configurations

Running time for a 10x10 table and up to 6 words per cell.

MIP model running time for 10x10 tables; word count per cell varies from 6 to 100.


Page 30: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Experimental results

Tables with fixed number of cell configurations

Table size    wSet
2x3x200       1,160
4x3x100       1,113
10x10x12      844
20x20x3       356

Page 31: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Conclusions

1. OPL provides viable solutions. The MIP model, which uses the CPLEX optimization engine, generally produces results faster than the CP model;

2. The CP model can find a feasible solution faster than the MIP model, but it continues to explore the solution space until it establishes whether the solution found is optimal;

3. Performance depends highly on the input data. The closer the page width is to the minimum table width, the harder the problem is to solve.

Page 32: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Conclusions

4. The table layout problem is harder to solve when the table has a large number of rows, columns and column configurations than when it has a large number of cell configurations;

5. Hardware is essential: the CP engine reported that 1.76 GB of memory was used to solve a table with 10,000 cells;

Page 33: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Conclusions

6. The constraint programming model may be slower than the MIP model because of the version of the CP engine used in testing. ILOG stopped supporting the CP engine in 2005 but reintroduced it in 2007, and not all search features are available;

7. Using a modelling language allows user-imposed constraints to be added to the model easily;

8. Paragraphing is an essential step, as it affects both layout quality and computational time.

Page 34: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Future work

To develop a model for tables with
• spanning cells
• inner tables
• other user constraints, such as constraints on groups of rows/columns

Note: We already developed a model for tables with spanning cells ☺

Page 35: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Acknowledgements

We would like to express our gratitude to

Prof. David Parnas

for motivating much of this work


Page 36: DocEng2010 Bilauca Healy - A New Model for Automated Table Layout

Acknowledgements

Questions ?

www.tabularlayout.org
