1 Combinatorial Optimization for Text Layout Richard Anderson University of Washington Microsoft Research, Beijing, September 6, 2000 http://www.cs.washington.edu/homes/anderson/msrcn.ppt


Combinatorial Optimization for Text Layout

Richard Anderson

University of Washington

Microsoft Research, Beijing, September 6, 2000

http://www.cs.washington.edu/homes/anderson/msrcn.ppt


Biography

Background
  Education: PhD Stanford (1985), Post Doc MSRI, Berkeley
  Experience: University of Washington, since 1986. Associate Chair for outreach. Visiting prof., IISc, Bangalore, 1993-1994
Professional Interests
  Algorithms: Parallel algorithms, N-Body Simulation, Model Checking for Software, Text Layout
  Distance Learning: Tutored Video Instruction, Professional Master’s Program


Optimization for Text Layout

Express text placement as a geometric optimization problem.
Why???
  Generate best layouts
  Body of algorithmic research to build on, as well as high performance hardware
  Problem specification and formalization
  Flexibility via parameterization


TeX [Knuth]

Typography as optimization
  Optimal paragraphing via dynamic programming algorithm
Flexibility
  Tradeoff between uneven lines and hyphenation frequency
  Penalty: weighted sum of whitespace and hyphenation penalties
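
The dynamic programming idea can be illustrated with a minimal Python sketch. It is not TeX's actual algorithm: it assumes monospaced text, ignores hyphenation and stretchable glue, and uses squared trailing whitespace as a stand-in penalty. The function name `break_lines` and the cost function are assumptions made for illustration.

```python
def break_lines(words, width):
    """Return (cost, lines) minimizing the sum of squared trailing space.

    Simplified stand-in for a typographic penalty: no hyphenation,
    single spaces between words, and the last line costs nothing.
    """
    n = len(words)
    INF = float("inf")
    # best[i] = minimal cost of setting words[i:]; nxt[i] = index where the next line starts.
    best = [INF] * (n + 1)
    nxt = [n] * (n + 1)
    best[n] = 0.0
    for i in range(n - 1, -1, -1):
        length = -1  # length of words[i:j] joined by single spaces
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width:
                break
            slack = width - length
            cost = 0.0 if j == n else float(slack ** 2)
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    if best[0] == INF:
        raise ValueError("some word is wider than the line width")
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return best[0], lines


cost, lines = break_lines("aaa bb cc ddddd".split(), 6)
print(cost, lines)
```

On this input a first-fit strategy would set "aaa bb" on the first line and leave "cc" alone on the second; the dynamic program instead accepts a slightly short first line to even out the paragraph.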


Outline

Survey of problems studied
  1) Generating all paragraphs of text
  2) Picture layout with anchors to text
  3) Optimal table layout
  4) Customized content compression


Paragraphing problem

Given geometric constraints, find line breaks
Fixed width, find minimum height
  Greedy Algorithm
Fixed height, find minimum width
  Only need to consider n^2 widths: O(n^3) algorithm.
  Most practical approach: binary search on width. O(n log W) algorithm
  Theoretical O(n) algorithm
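
A minimal sketch of the two practical approaches, assuming integer word widths and a fixed inter-word space. The function names `greedy_height` and `min_width_for_height` are hypothetical. The binary search relies on the fact that the greedy line count never increases as the width grows.

```python
def greedy_height(word_widths, width, space=1):
    """Lines used by first-fit line breaking, or None if some word does not fit."""
    if not word_widths:
        return 0
    if max(word_widths) > width:
        return None
    lines, used = 1, word_widths[0]
    for w in word_widths[1:]:
        if used + space + w <= width:
            used += space + w          # word fits on the current line
        else:
            lines, used = lines + 1, w  # start a new line
    return lines


def min_width_for_height(word_widths, height, space=1):
    """Smallest integer width whose greedy layout uses at most `height` (>= 1) lines."""
    if not word_widths:
        return 0
    lo = max(word_widths)  # any narrower and some word cannot be placed
    hi = sum(word_widths) + space * (len(word_widths) - 1)  # everything on one line
    while lo < hi:
        mid = (lo + hi) // 2
        if greedy_height(word_widths, mid, space) <= height:
            hi = mid
        else:
            lo = mid + 1
    return lo


print(min_width_for_height([3, 2, 2, 5], 2))
```

Each probe costs O(n) and the search makes O(log W) probes, where W is the total width of the text, which matches the O(n log W) bound quoted above.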


All minimal paragraph sizes

Find minimum width paragraph for a given height.
Solve for each height: best known: O(n^(3/2))

[Figure: the same sentence set in three paragraph shapes of different width and height]

Malfoy couldn’t believe his eyes when he saw that Harry and Ron were still at Hogwarts the next day, looking tired but perfectly cheerful.


All minimal paragraph sizes

Motivation
  Placement of floating text
  Formatting tables with text entries
Basic approach
  Break into segments of roughly n^(1/2) words each
  Compute possibilities for these, and then combine

Much work still to do on this problem


Placement of text and pictures

Given text with embedded pictures and tables

Place pictures close to their references (anchors)

This is a major headache when using LaTeX!
Further complications
  Multi-column layouts
  Partial column width pictures
  Typographic considerations for text and headings
  Other graphical layout considerations


Placement of text and pictures

Given text and pictures, where each picture has a location in the text, find a layout which minimizes the sum of the text-anchor distances
Single page and multi page problems
Horizontal placement of pictures fixed with respect to column boundaries
May require that picture order is consistent with text order


Results

2-d bin packing problem: do the pictures fit on the page?
May not be the problem of interest. Simpler cases: pictures fit in columns, align with text rows, fixed horizontal position in columns.
Easy for one column.
NP-complete for three or more columns.
NP-complete even if picture area is very small.


Fixed horizontal bin packing

Two-d bin packing, except that rectangles have fixed horizontal positions
Motivated by picture placement
Best known result: 3-approximation algorithm
Problem arises in memory allocation


Practical results

The number of pictures and columns is small (columns <= 5, pictures <= 10).
Enumeration works well for pictures <= 3.
Branch and bound works well for pictures <= 6.
Heuristics + B&B work well for given range.
Prototypes developed, including typography and aesthetic considerations.
Very interesting layouts generated


Tables

General Problem
  Given a set of configurations for each cell, find the maximum value table that satisfies size constraints
Special Cases
  Layout Problem
    No values, minimize table height for fixed width
  Compression Problem
    Configurations for a cell satisfy nesting property
    Value decreases with size


Layout Problem (with S. Sobti)

NP complete
  Restricted instances: {(1,2), (2,1)}, {(1,1)}

[Figure: the same four-cell table shown in two layouts]

Divination. Sybill Trelawney

Defense against dark arts. R. J. Lupin

Potions. Severus Snape

Care of magical creatures. Rubeus Hagrid


Layout Problem: results

Fixed W, minimize H, NP complete

Minimize W+H solvable with mincut algorithm

Compute convex hull of feasible table configurations

Heuristic algorithm


Table compression problem

Display a table in less than the required area, with a penalty for shrinking cells

Divination. Sybill Trelawney

Defense against dark arts. R. J. Lupin

Potions. Severus Snape

Care of magical creatures. Rubeus Hagrid

Divin. Sybill T.

Defense against dark arts. Lupin

Potions. Severus Snape

Care of magical creatures. Hagrid

Divin. Sybill T.

Def. dark arts. Lupin

Potions. Severus Snape

Care of magical critters. Hagrid

Divin. Sybill T.

Def. dark arts. Lupin

Potions. S. Snape

Care of creatures. Hagrid

Divin. Sybill T.

Dark arts. Lupin

Potions. S. Snape

Critr care. Hagrid

Div

D. arts. Lupin

Pot

Critters.Hagrid


Compression Problem

NP complete for simple case
  Choice cells: 1 x 1 (value 1), 0 x 0 (value 0)
  Dummy cells: 0 x 0 (value 0)
  Maximize number of full size choice cells when an n x n table is compressed to n/2 x n/2.
Reduction from clique problem
  Incidence matrix reduction


Attacking the 0-1 problem

[Figure: bipartite graph with vertices numbered 1 to 4 on each side]

Choose n/2 vertices from each side to maximize the number of edges between chosen vertices

Equivalent problem: maximum density (n/2,n/2)-subgraph of a (n,n)-bipartite graph


Greedy Algorithm

Find MDS (maximum density subgraph) of G=(X,Y,E)

Choose X’, the set of n/2 vertices of highest degree w.r.t. Y

Choose Y’, the set of n/2 vertices of highest degree w.r.t. X’

Claim: (X’,Y’) is a 1/2 approximation of the MDS

Proof: (X’,Y) has at least as many edges as the MDS.

(X’,Y’) has at least half as many edges as (X’,Y)
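
The two greedy steps above can be sketched in a few lines of Python. The function name `greedy_mds` and the edge-list input format are assumptions for illustration; vertices on each side are numbered 0 to n-1 and n is assumed even.

```python
def greedy_mds(n, edges):
    """Greedy choice of n/2 vertices per side in an (n, n)-bipartite graph.

    `edges` is an iterable of (x, y) pairs with 0 <= x, y < n.
    Returns (X', Y', number of edges between X' and Y').
    """
    edges = set(edges)
    k = n // 2
    deg_x = [0] * n
    for x, _ in edges:
        deg_x[x] += 1
    # X': the k left vertices of highest degree with respect to all of Y.
    x_sel = set(sorted(range(n), key=lambda x: -deg_x[x])[:k])
    deg_y = [0] * n
    for x, y in edges:
        if x in x_sel:
            deg_y[y] += 1
    # Y': the k right vertices of highest degree with respect to X'.
    y_sel = set(sorted(range(n), key=lambda y: -deg_y[y])[:k])
    count = sum(1 for x, y in edges if x in x_sel and y in y_sel)
    return x_sel, y_sel, count


edges = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (3, 3)]
print(greedy_mds(4, edges))
```

The second step keeps the better half of Y as seen from X’, which is why at least half of the edges leaving X’ survive.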


Greedy Algorithms

Non-bipartite graphs
  Add vertices of maximum degree starting with empty graph
  Remove vertices of minimum degree, starting with full graph
  4/9 approximation algorithm (Asahiro et al.)

Open problem: generalize and analyze greedy algorithms for tables
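
As an illustration of the removal variant for non-bipartite graphs, here is a minimal sketch that peels minimum-degree vertices until k remain. The function name `peel_to_k` and the adjacency-set input are assumptions; the graph is assumed undirected, with symmetric adjacency sets and no self-loops. The sketch only shows the procedure and does not reproduce the approximation analysis.

```python
def peel_to_k(adj, k):
    """Repeatedly delete a minimum-degree vertex until k vertices remain.

    `adj` maps each vertex to the set of its neighbours.
    Returns the surviving vertex set and the number of edges among them.
    """
    live = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    while len(live) > k:
        v = min(live, key=lambda u: len(live[u]))  # current minimum degree
        for u in live.pop(v):
            live[u].discard(v)
    edge_count = sum(len(ns) for ns in live.values()) // 2
    return set(live), edge_count


# A triangle on 0, 1, 2 with a pendant vertex 3 attached to 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(peel_to_k(graph, 3))
```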


Semidefinite programming

Maxcut problem: divide vertices of a graph into two sets to maximize number of edges between the sets.
Goemans-Williamson SDP result:
  Improved approximation bound from 0.5 to 0.878
  Introduced new technique to the field
  Idea: solve the problem on an n-dimensional sphere, use a random projection to divide vertices.

MDS problem can also be attacked with SDP. Technical problems with bipartiteness and equal division lead to a weak result.
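
For reference, the standard Goemans-Williamson relaxation for Maxcut can be written as follows. The symbols v_i (unit vectors, one per vertex) and theta are notation introduced here, not taken from the slides. The first line is the vector program solved on the sphere; the second gives the probability that a random hyperplane through the origin separates two vectors, and the resulting worst-case ratio.

```latex
\max_{v_1,\dots,v_n \in \mathbb{R}^n} \; \sum_{(i,j) \in E} \frac{1 - v_i \cdot v_j}{2}
\quad \text{subject to } \|v_i\| = 1 \text{ for all } i

\Pr[(i,j) \text{ is cut}] = \frac{\arccos(v_i \cdot v_j)}{\pi},
\qquad
\alpha = \min_{0 < \theta \le \pi} \frac{2}{\pi} \cdot \frac{\theta}{1 - \cos\theta} \approx 0.878
```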


Research directions

Can semidefinite programming beat the greedy algorithm on the 0-1 problem?

Develop greedy algorithms for the general case.
Linear programming: fractional solution to table problems has a natural interpretation.
  Results on rounding?
  Combinatorial algorithms for the fractional problem.

Develop/analyze fast heuristic algorithms


Content Choice

If information does not fit, allow substitutions

The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts Academic Press, Hogsmeade, 1999, 2nd Edition, 238 pages, Albus Dumbledore editor.

The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts, Hogsmeade, 1999, 2nd Ed., 238 pp.

The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts Ac. Press, Hogsmeade, 1999, 2nd Edition, 238 pages

The Dark Forces: A Guide to Self-Protection, Quenton Trimble, Hogwarts Ac. Press, Hogsmeade, 1999, 2nd Ed., 238 pp, Albus Dumbledore ed.


The Dark Forces: A Guide to Self-Protection, Q. Trimble, HAP, Hogs., `99, 2nd, 238 pp.

The Dark Forces, Q. Trimble, HAP, Hogs., 1999, 2nd, 238 pp.

The Dark Forces: Self-Protection, Q. Trimble, HAP, 1999, 2nd, 238 pp.

The Dark Forces Q. Trimble, HAP, `99, 2nd, 238 pp.

Dark Forces, Q. Trimble, HAP, `99, 2nd.

Dark Forces, Q. Trimble, HAP, 1999.

Dk. Forces, Q. Trimble, HAP, 1999.

Dark Forces, Trimble.


Source representation

<text>
  <choice>
    <fragment val=90> The Dark Forces: A Guide to Self-Protection </fragment>
    <fragment val=50> The Dark Forces: Self-Protection </fragment>
    <fragment val=30> The Dark Forces</fragment>
    <fragment val=20> Dark Forces</fragment>
    <fragment val=10> Dk. Forces</fragment>
  </choice>
  <choice>
    <fragment val=30> Hogwarts Academic Press </fragment>
    <fragment val=20> Hogwarts Ac. Press </fragment>
    <fragment val=15> Hogwarts </fragment>
    <fragment val=10> HAP </fragment>
    <fragment val=0> </fragment>
  </choice>
  . . .
</text>


Typography with content choice

Problem 1: Given a fixed area for the text, find the optimal choice of content
Problem 2: Find the set of all maximal configurations
Problem 3: Find a good approximation to the set of all maximal configurations


Content Choice

Algorithmic choice: rectangles with values. Place one rectangle from each set to maximize value.

[Figure: rectangles labeled with values 40, 40, 25, 20, 15]


Warm up problem: Lists

Optimally display the list for a fixed height
Set of configurations for each list item: (height, value)

Solvable with knapsack dynamic programming algorithm
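
A minimal sketch of such a dynamic program, in the multiple-choice knapsack style: exactly one configuration is picked per list item and the total height may not exceed the budget. The function name `best_list_display` and the example configurations are hypothetical; heights are assumed to be non-negative integers.

```python
def best_list_display(items, height):
    """Pick one (height, value) configuration per item to maximize total value.

    `items` is a list of lists of (height, value) pairs.
    Returns (total value, chosen configuration index per item),
    or None if no selection fits within `height`.
    """
    NEG = float("-inf")
    dp = [0] * (height + 1)  # dp[h]: best value of the items so far within height h
    picks = []               # picks[i][h]: configuration chosen for item i at budget h
    for configs in items:
        new = [NEG] * (height + 1)
        pick = [None] * (height + 1)
        for h in range(height + 1):
            for c, (ch, cv) in enumerate(configs):
                if ch <= h and dp[h - ch] + cv > new[h]:
                    new[h] = dp[h - ch] + cv
                    pick[h] = c
        dp = new
        picks.append(pick)
    if dp[height] == NEG:
        return None
    # Walk backwards through the items to recover the chosen configurations.
    chosen, h = [], height
    for i in range(len(items) - 1, -1, -1):
        c = picks[i][h]
        chosen.append(c)
        h -= items[i][c][0]
    return dp[height], chosen[::-1]


items = [
    [(3, 90), (1, 30)],          # a title: full (3 lines) or short (1 line)
    [(2, 30), (1, 10), (0, 0)],  # a publisher field: full, abbreviated, or omitted
]
print(best_list_display(items, 3))
```

The running time is proportional to the height budget times the total number of configurations, so recomputing after a change in the available height is cheap for short lists.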


List compression

Harry Potter and the Prisoner of Azkaban ~ J. K. Rowling / Hardcover / Published 1999 Our Price: $9.98
Harry Potter and the Sorcerer's Stone J. K. Rowling / Hardcover / Published 1998 Our Price: $8.98
Harry Potter and the Chamber of Secrets J. K. Rowling / Hardcover / Published 1999 Our Price: $8.98

Harry Potter and the Prisoner of Azkaban ~ Usually ships in 24 hours J. K. Rowling / Hardcover / Published 1999 Our Price: $9.98 ~ You Save: $9.97 (50%)
Harry Potter and the Sorcerer's Stone ~ Usually ships in 24 hours J. K. Rowling / Hardcover / Published 1998 Our Price: $8.98 ~ You Save: $8.97 (50%)
Harry Potter and the Chamber of Secrets J. K. Rowling / Hardcover / Published 1999 Our Price: $8.98 ~ You Save: $8.97 (50%)

Harry Potter and the Prisoner of Azkaban ~ J. K. Rowling / HC / Publ 1999 Our Price: $9.98
Harry Potter and the Sorcerer's Stone J. K. Rowling / HC / 1998 $8.98
Harry Potter and the Chamber of Secrets J. K. Rowling / HC / 1999 $8.98

Harry Potter and the Prisoner of Azkaban J. K. Rowling $9.98
Harry Potter and the Sorcerer's Stone Rowling
HP : Chamber of Secrets


Implementation goal

Real time resizing of lists
  Maintain optimal display as window size changes.
  Recompute at refresh rate
  Knapsack/dynamic programming algorithm
http://www.cs.washington.edu/homes/anderson/demo2/Page1.htm


Customization

Choice-content generation
  Generate choices for fields
    Automatic abbreviations
    Dictionary lookup
  Assign weights
    Based on compression and component
    Based on user profile


Browsing applications

Browsing book lists
  User sets degree of compression
  Issues query
  Source gives default weights
    Value of field
    Strength of match
    Value of item
  Weights modified based on user profile
  Optimal list display done for given compression factor


Display of 2-d time tables

Show most likely routes and times at highest precision

Based on user profile and travel data

Memory of user interactions (expanding items)


Summary

Graphical layout as geometric optimization
Theoretical background
  Basic algorithms for rectangle placement
Algorithm implementation
  Performance requirements are significant
Application
  Do these techniques work for universal, customized display?