Building Table Formatting Tools. Mihai Bilauca, Patrick Healy. DocEng2011, September 19–22, 2011, Mountain View, California. Department of Computer Science and Information Systems, University of Limerick, Ireland. Supported by Science Foundation Ireland under the research programme 01/P1.2/C009, Mathematical Foundations, Practical Notations, and Tools for Reliable Flexible Software.

DESCRIPTION

We present an overview of the challenges to overcome when developing table authoring tools, including a review of logical table models, typographical issues and automated table layout optimization. We present a Table Drawing Tool prototype which implements an automated solution for the table layout optimization problem for tables with spanning cells using a mathematical modelling method. We report on the performance improvements of this new optimization method compared to previous solutions. http://doi.acm.org/10.1145/2034691.2034696

Citation preview

  • 1. Building Table Formatting Tools. Mihai Bilauca, Patrick Healy. Department of Computer Science and Information Systems, University of Limerick, Ireland. DocEng2011, September 19–22, 2011, Mountain View, California. Supported by Science Foundation Ireland under the research programme 01/P1.2/C009, Mathematical Foundations, Practical Notations, and Tools for Reliable Flexible Software.
  • 2. Why this paper?
      - Tables are widely used for presenting logical relationships between data items.
      - Widespread WYSIWYG tools only control presentation parameters.
      - Existing tools have limitations.
      - Authoring tables is hard, time-consuming and error-prone.
  • 3. Overview
      A) A review of logical table models
      B) Table Drawing Tool prototype: typographical issues
      C) Automated table layout optimization for tables with spanning cells:
          - experimental results for large tables
          - comparison with previous models
          - a simplified solution for nested tables
      D) Conclusions
  • 4. Logical table models
  • 5. Logical table models
      - 1985, Beach:
          - An for Tioga DS that formats itself based on given dimensions
          - a grid with table topology and table geometry
      - 1989, Cameron: identifies 3 mental processes: structure, content and visual editing
      - 1989, Vanoirbeek:
          - a multidimensional collection of logically connected data items
          - each dimension is divided into rubrics to classify data items
          - merged items, where an item is connected to multiple rubrics
          - defines tables in a language based on attribute grammars
  • 6. Logical table models
      - 1996, Wang:
          - layout structure (topology and typographic style)
          - logical structure: a mathematical model based on unordered sets
          - identified operations at table, category and label level
      - 2010, Parnas:
          - meaningful expressions which can be evaluated and validated
          - indexation separates table semantics from appearance
          - an indexed set of grids < GS, I, x >, where each grid is an indexed set of expressions
          - a grid < SetExp, J, y > is a set of expressions SetExp, an indexed set J and a mapping y between SetExp and J
  • 7. Table Drawing Tool prototype: typographical issues
  • 8. Table Drawing Tool (TDT)
      - Part of a larger project developed at the University of Limerick, Ireland: a tabular expressions tool made up of the Table Input Tool, the Kernel, the Evaluation module and the Table Drawing Tool.
      - TDT is concerned only with drawing tables on a rendering device, depending on the table layout, using a layout optimizer.
      - The current version supports rectangular tables, which can be rendered in PDF using a layout optimizer based on IBM ILOG OPL.
  • 9. TDT: Circular layout? Table 1 is represented with both a rectangular and a circular layout.
  • 10. TDT: Device
      - The device is an abstract object that must provide functions to:
          - open and close the device;
          - set colours, fill colours, line styles;
          - set fonts (font family, style, size);
          - set vertical and horizontal alignment;
          - set text spacing (word, character and line spacing);
          - draw curves (Bézier);
          - support images;
          - support paging of the content.
      - The device must return measurement units in the TDT coordinate system.
  • 11. TDT: Hierarchical model
      - A table layout tool needs to be as powerful as a page layout tool.
      - TDT implements a hierarchical structure of containers (similar to CSS).
      - It is each container's responsibility to format itself.
      - Each container must implement the computeLayout and render-on-device functions.
      - For drawing tables TDT implements a box model.
      - Support for:
          - hierarchical propagation of typographical parameters;
          - layered control of typographical parameters.
  • 12. TDT: The box model. For drawing tables TDT implements a box model (similar to CSS).
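The container hierarchy and box model described on slides 11 and 12 can be sketched roughly as below. This is an illustrative sketch only, not TDT's code: the class `Container`, its fields, the vertical stacking of children, the default font size of 10 and the list used as a stand-in "device" are all assumptions. Adjacent margins are simply added here; the slides leave open whether they should be added or merged.

```python
from typing import List, Optional, Tuple


class Container:
    """Hypothetical container that formats and renders itself (not TDT's API)."""

    def __init__(self, name: str, content_w: float = 0.0, content_h: float = 0.0,
                 margin: float = 0.0, border: float = 0.0, padding: float = 0.0,
                 font_size: Optional[float] = None) -> None:
        self.name = name
        self.content_w, self.content_h = content_w, content_h
        self.margin, self.border, self.padding = margin, border, padding
        self.font_size = font_size            # None means "inherit from parent"
        self.parent: Optional["Container"] = None
        self.children: List["Container"] = []
        self.width = self.height = 0.0

    def add(self, child: "Container") -> "Container":
        child.parent = self
        self.children.append(child)
        return child

    def effective_font_size(self) -> float:
        # Hierarchical propagation: walk up until some ancestor sets a value.
        node = self
        while node is not None:
            if node.font_size is not None:
                return node.font_size
            node = node.parent
        return 10.0                            # assumed default

    def frame(self) -> float:
        # Box model: margin + border + padding on each side.
        return self.margin + self.border + self.padding

    def compute_layout(self) -> None:
        # Children are laid out first, then stacked vertically (an assumption).
        for child in self.children:
            child.compute_layout()
        inner_w = max([self.content_w] + [c.width for c in self.children])
        inner_h = self.content_h + sum(c.height for c in self.children)
        self.width = inner_w + 2 * self.frame()
        self.height = inner_h + 2 * self.frame()

    def render(self, device: List[Tuple[str, float, float, float, float]],
               x: float = 0.0, y: float = 0.0) -> None:
        # The "device" is just a list recording (name, x, y, width, height).
        device.append((self.name, x, y, self.width, self.height))
        child_y = y + self.frame() + self.content_h
        for child in self.children:
            child.render(device, x + self.frame(), child_y)
            child_y += child.height


table = Container("table", margin=4, border=1, font_size=9)
cell = table.add(Container("cell", content_w=50, content_h=12, border=1, padding=2))
table.compute_layout()
device: List[Tuple[str, float, float, float, float]] = []
table.render(device)
```

The cell does not set a font size, so `cell.effective_font_size()` falls back to the value set on the table, which is the kind of parameter propagation the slide refers to.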
  • 13. TDT: Typographical issues
      - table layout: a caption and a layout (rectangular, circular, custom)
      - caption: label, number and caption text; global numbering styles must be defined; example: "Table A.12 Test data"
      - laying out containers: fixed and floating positioning; floating (non-fixed) containers are laid on lines; complexity increases with fixed-position containers
      - lines: styles and joins: pattern, line cap, line join and miter limit; the pattern is defined as a dash array and a phase
      - margins: add or merge adjacent margins?
  • 14. TDT: Typographical issues
      - coordinate space: TDT uses a coordinate system with points (1/72 inch)
      - text direction: right-to-left, top-to-bottom
      - equations: TDT uses WebEQ to render MathML expressions
      - images: conversion from pixels to the units of the coordinate system
      - borders: without a clear definition of the line styles, dealing with borders can be a major cause of frustration: border space; border alignment; border priority
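The pixel-to-point conversion mentioned for images on slide 14 can be illustrated with a short sketch. It assumes the image resolution is known in pixels per inch; the function name `pixels_to_points` and the `dpi` parameter are hypothetical and not taken from TDT.

```python
POINTS_PER_INCH = 72  # one point is 1/72 inch, as in the TDT coordinate system


def pixels_to_points(pixels: float, dpi: float) -> float:
    """Convert a length in pixels to points for a given resolution.

    `dpi` (pixels per inch) is an assumed input; the slides do not say
    how TDT obtains the resolution of an image.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels * POINTS_PER_INCH / dpi


# A 300-pixel-wide image at 150 dpi is 2 inches wide, i.e. 144 points.
width_pt = pixels_to_points(300, 150)
```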
  • 15. Border conflicts in Open Office
  • 16. Automated table layout optimization
  • 17. Table layout optimization
      - For applications where finding the layout with the minimum height for a given width is important.
      - Because it is exact (not based on heuristics).
      - Priority should be given to user constraints imposed by space limitations or other aesthetic criteria.
  • 18. The Table Layout Problem. Find a layout of a table with minimum $\mathrm{height}(\text{layout})$ such that $\mathrm{width}(\text{layout}) < W$, where $W$ is a given page width.
  • 19. Definitions (Anderson and Sobti)
      - The table is an $m \times n$ table with $m$ rows and $n$ columns, and a layout of the table is described by the quantities below.
      - $C_{i,j} = \{(w_{ij}^{k}, h_{ij}^{k}) \mid 1 \le k \le K_{ij}\}$ is the set of configurations for cell $i,j$.
      - $k_{i,j}$, with $1 \le k_{i,j} \le K_{ij}$, is the index of the configuration selected from $C_{i,j}$.
      - $h_{i,j} = h_{i,j}^{k_{i,j}}$ and $w_{i,j} = w_{i,j}^{k_{i,j}}$.
      - $h_i = \max_j h_{i,j}$ for each row $i$, and $w_j = \max_i w_{i,j}$ for each column $j$.
      - $\mathrm{height}(\text{layout}) = \sum_{i=1}^{m} h_i$ and $\mathrm{width}(\text{layout}) = \sum_{j=1}^{n} w_j$.
  • 20. Cell configurations. The cell configurations for cell $i,j$ are $C_{i,j} = \{(h_{ij}^{k}, w_{ij}^{k}) \mid 1 \le k \le K_{ij}\}$. Example (figure): the text "this blue sky" shown in three different configurations.
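The definitions on slides 19 and 20 can be made concrete with a small sketch that computes the width and height of a layout from the configuration chosen for each cell. The function name `layout_size`, the dictionary representation and the sizes used for "this blue sky" (characters wide, lines high) are illustrative assumptions; configurations are written as (width, height) pairs, following slide 19.

```python
from typing import Dict, List, Tuple

Config = Tuple[int, int]   # (width, height) of one cell configuration
Cell = Tuple[int, int]     # (row index i, column index j)


def layout_size(configs: Dict[Cell, List[Config]],
                selected: Dict[Cell, int]) -> Tuple[int, int]:
    """Return (width, height) of the layout for the selected configurations."""
    row_h: Dict[int, int] = {}
    col_w: Dict[int, int] = {}
    for (i, j), k in selected.items():
        w, h = configs[(i, j)][k]
        row_h[i] = max(row_h.get(i, 0), h)   # h_i = max over the cells of row i
        col_w[j] = max(col_w.get(j, 0), w)   # w_j = max over the cells of column j
    return sum(col_w.values()), sum(row_h.values())


# "this blue sky" on one, two or three lines (assumed unit: characters x lines).
configs = {
    (1, 1): [(13, 1), (9, 2), (4, 3)],
    (1, 2): [(5, 1)],
}
size = layout_size(configs, {(1, 1): 1, (1, 2): 0})   # two-line configuration
```

With the two-line configuration selected for cell 1,1, the single row takes the height of its tallest cell and the table width is the sum of the two column widths.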
  • 21. IP definition: simple tables
      minimize $\sum_{i=1}^{m} \max_{j=1}^{n} \sum_{k} h_{i,j,k}\, x_{i,j,k}$
      subject to
        1) $\sum_{j=1}^{n} \max_{i=1}^{m} \sum_{k} w_{i,j,k}\, x_{i,j,k} \le W$
        2) $\sum_{k} x_{i,j,k} = 1$
      where $x_{i,j,k} \in \{0,1\}$ and $1 \le k \le K_{i,j}$
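For very small simple tables, the problem stated on slide 21 can be checked by exhaustive enumeration. The sketch below is a brute-force stand-in for illustration only, not the IP/OPL approach used in the talk; the function name `min_height_layout` and the sample configuration sizes are assumptions.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Config = Tuple[int, int]   # (width, height)
Cell = Tuple[int, int]     # (row i, column j)


def min_height_layout(configs: Dict[Cell, List[Config]],
                      page_width: int) -> Optional[Tuple[int, Dict[Cell, int]]]:
    """Try every combination of cell configurations (exponential time).

    Returns (height, selection) for a minimum-height layout whose width
    does not exceed page_width, or None if no combination fits.
    """
    cells = sorted(configs)
    best: Optional[Tuple[int, Dict[Cell, int]]] = None
    for choice in product(*(range(len(configs[c])) for c in cells)):
        row_h: Dict[int, int] = {}
        col_w: Dict[int, int] = {}
        for (i, j), k in zip(cells, choice):
            w, h = configs[(i, j)][k]
            row_h[i] = max(row_h.get(i, 0), h)
            col_w[j] = max(col_w.get(j, 0), w)
        width, height = sum(col_w.values()), sum(row_h.values())
        if width <= page_width and (best is None or height < best[0]):
            best = (height, dict(zip(cells, choice)))
    return best


# A 2 x 2 table with assumed configuration sizes.
configs = {
    (1, 1): [(13, 1), (9, 2), (4, 3)],
    (1, 2): [(6, 1), (3, 2)],
    (2, 1): [(8, 1), (4, 2)],
    (2, 2): [(5, 1)],
}
result = min_height_layout(configs, page_width=14)
```

The number of combinations grows as the product of the configuration counts of all cells, which is why enumeration is only usable on toy inputs and a solver-based formulation is needed for large tables.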
  • 22. IP: tables with spanning cells. The height and width of a spanning cell $i,j$ are
        1) $h_{i,j} = \sum_{r=i}^{i+S^{h}_{i,j}} h_r$
        2) $w_{i,j} = \sum_{c=j}^{j+S^{w}_{i,j}} w_c$
      where $S^{h}_{i,j}$ is the number of additional rows spanned by cell $i,j$ and $S^{w}_{i,j}$ is the number of additional columns spanned by cell $i,j$.
  • 23. Tables with spanning cells. Two cases (figure): A) cell 1,1 spans wide columns; B) cell 1,1 spans narrow columns.
  • 24. Tables with spanning cells. A cell spanning narrow columns requires glue. Glue is a concept similar to the glue introduced by Knuth:
      - it stretches and shrinks as required;
      - its value can be further constrained.
  • 25. Tables with spanning cells. For a spanning cell $i,j$ the equations 1) and 2) become:
        1) $h_{i,j} = \sum_{r=i}^{i+S^{h}_{i,j}} (h_r + g^{h}_{r})$
        2) $w_{i,j} = \sum_{c=j}^{j+S^{w}_{i,j}} (w_c + g^{w}_{c})$
      where $g^{h}_{r}$ is the vertical glue for row $r$ and $g^{w}_{c}$ is the horizontal glue for column $c$.
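The role of horizontal glue for a cell spanning narrow columns can be illustrated with a small sketch. It is not the optimization model from the talk: in the model the glue values are decision variables chosen by the solver, whereas here the shortfall is simply split evenly across the spanned columns, which is an assumption made for illustration. The function name `horizontal_glue` and the 0-based column index are also assumptions.

```python
from typing import List


def horizontal_glue(col_widths: List[float], j: int, span: int,
                    cell_width: float) -> List[float]:
    """Glue for each column spanned by a cell starting at column j.

    `span` is the number of ADDITIONAL columns spanned, as on the slides.
    If the spanned columns are already wide enough (case A), no glue is
    needed; otherwise (case B) the shortfall is split evenly, so that
    cell_width == sum(w_c + g_c) over the spanned columns.
    """
    spanned = col_widths[j:j + span + 1]
    if j < 0 or len(spanned) != span + 1:
        raise ValueError("span runs past the last column")
    shortfall = max(0.0, cell_width - sum(spanned))
    return [shortfall / len(spanned)] * len(spanned)


# Case B: a 26-unit-wide cell spans two columns that are 10 units wide each.
glue_b = horizontal_glue([10, 10, 30], j=0, span=1, cell_width=26)
# Case A: the same columns are wide enough for a 15-unit-wide cell.
glue_a = horizontal_glue([10, 10, 30], j=0, span=1, cell_width=15)
```

An even split is only one possible policy; the slide notes that the value of the glue can be further constrained.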
  • 26. Tables with spanning cells
        a) $h_{i,j} = \sum_{k} h_{i,j,k}\, x_{i,j,k}$
        b) $w_{i,j} = \sum_{k} w_{i,j,k}\, x_{i,j,k}$
        c) $h_i = \sum_{r=1}^{H_i} h_i^{r}\, y_i^{r}$
        d) $w_j = \sum_{c=1}^{W_j} w_j^{c}\, z_j^{c}$
      where $x, y, z \in \{0,1\}$, $1 \le k \le C_{i,j}$, $1 \le c \le W_j$, $1 \le r \le H_i$
  • 27. IP definition simple tables
      minimize $\sum_{i=1}^{m} \left( \sum_{r} h_i^{r}\, y_i^{r} + g^{h}_{i} \right)$
      subject to
        1) $\sum_{j=1}^{n} \left( \sum_{c} w_j^{c}\, z_j^{c} + g^{w}_{j} \right) \le W$
        2) $h_{i,j} \le \sum_{r=i}^{i+S^{h}_{i,j}} (h_r + g^{h}_{r})$
        3) $w_{i,j} \le \sum_{c=j}^{j+S^{w}_{i,j}} (w_c + g^{w}_{c})$
        4)* $\sum_{k} x_{i,j,k} = 1, \ \forall i,j$
        5) $\sum_{r} y_i^{r} = 1$
        6) $\sum_{c} z_j^{c} = 1$
      * dummy cells (0,0) are inserted for spanned cells
  • 28. The table layout problem is NP-complete. Demonstrated by:
      - 1996, Wang: demonstration using large integers;
      - 1999, Anderson and Sobti: using reductions of the clique problem to the table layout problem, on simple tables.
  • 29. OPL (Optimization Programming Language)
      - designed for solving combinatorial optimization problems
      - support for:
          - MIP and constraint programming, including search specification;
          - logical and higher-order constraints;
          - scheduling and resource allocation applications
      - shares structure and syntax features with mathematical programming languages such as AMPL or GAMS
      - problems can be formulated in a language similar to their algebraic notation
  • 30. OPL keywords
      - dvar: decision variable. The purpose of an OPL model is to find values for the decision variables such that all constraints are satisfied.
      - dexpr: to express decision variables in a more compact way
      - {dataType}: set of type dataType
      - <...>: represents a tuple value
  • 31. TSC model: data types
        tuple Conf {int w; int h;}
        tuple CellConf {int i; int j; Conf c;}
        int pageW;            // page width
        {CellConf} configs;   // set of cell configurations
        // sample data:
        configs = {<...>, <...>, <...>, <...>, <...>, <...>}
        {Cell} cells = {<i,j> | <i,j,k> in configs}
        {int} rows = {i | <i,j> in cells}
        {int} cols = {j | <i,j> in cells}
  • 32. TSC model: additional data
        // for each column j, the set of possible widths
        {Pair} colWset = {<j, k.w> | <i,j,k> in configs: Sw[<i,j>]==0};
        // for each row i, the set of possible heights
        {Pair} rowHset = {<i, k.h> | <i,j,k> in configs: Sh[<i,j>]==0};
  • 33. TSC model: decision variables x, y, z
        // cell configuration selector
        dvar int x[configs] in 0..1;
        // row height / column width selectors
        dvar int y[rowHset] in 0..1;
        dvar int z[colWset] in 0..1;
        // glue
        dvar int Gh[rows] in 0..maxint;
        dvar int Gw[cols] in 0..maxint;
  • 34. TSC model: expressions
        // cell width
        dexpr int cellW[<i,j> in cells] = sum(<i,j,k> in configs) x[<i,j,k>] * k.w;
        // row height
        dexpr int rowH[i in rows] = sum(<i,h> in rowHset) y[<i,h>] * h;
        // column width
        dexpr int colW[j in cols] = sum(<j,w> in colWset) z[<j,w>] * w;
        // table height/width
        dexpr int tableH = sum(i in rows) rowH[i];
        dexpr int tableW = sum(j in cols) colW[j];
  • 35. TSC Model - Constraints minimize tableH; constraints { ct1: tableW 0) cellH[]