shawn-villaron
Components of WordprocessingML
• Main Document
• Paragraphs & Rich Formatting
– Runs
– Run Content
• Tables
• Custom Markup
• Sections
• Styles
– Paragraph
– Character
– Numbering
– Table
– Document Defaults
• Fonts
• Numbering
• Headers/Footers
• Footnotes/Endnotes
• Glossary Document
• Annotations
– Comments
– Revisions
– Bookmarks
• Mail Merge
• Document Settings
– Web Settings
– Compatibility Settings
• Fields & Hyperlinks
• Odds & Ends (Textboxes, Subdocuments, Extensibility)
Ecma/TC45/2006/011 (Rev.)
Paragraphs
• The most basic unit of a WordprocessingML document
• Analogous to the HTML <p> tag
• A paragraph contains three pieces of information:
– Paragraph properties
– Inline content
– (optionally) a set of revision IDs used for document merge and compare
Paragraph Example
• A basic paragraph with three different text formats:
[Figure: example paragraph markup, with callouts labeled "Paragraph properties" and "Paragraph contents"]
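The original slide's example markup did not survive extraction. As a stand-in, here is a minimal Python sketch that builds a paragraph with a properties element followed by three runs in different formats. The helper `make_run`, the sample text, and the centered alignment are illustrative assumptions, not taken from the slide. The namespace URI is the one used in the published ECMA-376 standard and may differ from the draft this deck describes.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the published ECMA-376 standard.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def w(tag):
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W}}}{tag}"


def make_run(text, *toggles):
    """Build a run: optional rPr with on/off properties, then a text element."""
    r = ET.Element(w("r"))
    if toggles:
        rpr = ET.SubElement(r, w("rPr"))
        for name in toggles:
            ET.SubElement(rpr, w(name))
    t = ET.SubElement(r, w("t"))
    t.set(XML_SPACE, "preserve")  # keep leading/trailing spaces in the text
    t.text = text
    return r


# Paragraph = paragraph properties (pPr) + inline content (runs).
p = ET.Element(w("p"))
ppr = ET.SubElement(p, w("pPr"))
ET.SubElement(ppr, w("jc"), {w("val"): "center"})  # paragraph alignment
p.append(make_run("Plain, "))
p.append(make_run("bold, ", "b"))
p.append(make_run("italic.", "i"))

xml_text = ET.tostring(p, encoding="unicode")
print(xml_text)
```

The properties element comes first and the runs follow it, which mirrors the "Paragraph properties" and "Paragraph contents" callouts.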
Paragraph Properties
• The paragraph properties are stored on the pPr element
• This contains all information on the formatting applied at the paragraph level, as well as to the paragraph mark character
Paragraph Properties
• Paragraph Style
• Keep on same page with previous/next paragraph
• Page break before
• Text frame
– Text frame properties
• Widow/Orphan control
– Prevents a single line of a paragraph from being left alone at the top or bottom of a page
• Numbering properties
• Paragraph borders
Paragraph Properties (cont'd)
• Suppress line numbering
• Paragraph shading
• Tab stops
• Override hyphenation
• RTL vs. LTR
• East Asian typography settings
• Line spacing
• Document grid settings
– Adjust text to grid
– Snap margins to grid
• Paragraph alignment
Paragraph Properties (cont'd)
• Indentation
– Mirror indents?
• Text orientation (vertical vs. horizontal)
• Outline level
• HTML <div> references
• Conditional formatting properties (in tables)
• Formatting properties for the paragraph mark character
• Section properties
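To make the property list concrete, the sketch below reads a few of these settings back out of a pPr element. The sample markup and the function name `paragraph_properties` are assumptions for illustration. The element names (`pStyle`, `keepNext`, `pageBreakBefore`, `ind`, `jc`, `outlineLvl`) and the namespace URI follow the published ECMA-376 standard, which may differ in detail from the draft these slides describe.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the published ECMA-376 standard.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical sample paragraph; indentation is in twentieths of a point.
SAMPLE = f"""
<w:p xmlns:w="{W}">
  <w:pPr>
    <w:pStyle w:val="Heading1"/>
    <w:keepNext/>
    <w:pageBreakBefore/>
    <w:ind w:left="720"/>
    <w:jc w:val="center"/>
    <w:outlineLvl w:val="0"/>
  </w:pPr>
  <w:r><w:t>Title</w:t></w:r>
</w:p>
""".strip()


def local(name):
    """Strip the '{namespace}' prefix from a qualified name."""
    return name.split("}", 1)[1]


def paragraph_properties(p):
    """Collect the direct children of pPr into a plain dictionary."""
    props = {}
    ppr = p.find("w:pPr", NS)
    if ppr is None:
        return props
    for child in ppr:
        val = child.get(f"{{{W}}}val")
        if val is not None:
            props[local(child.tag)] = val          # single-valued property
        elif child.attrib:
            props[local(child.tag)] = {local(k): v for k, v in child.attrib.items()}
        else:
            props[local(child.tag)] = True         # bare element: property is on
    return props


props = paragraph_properties(ET.fromstring(SAMPLE))
print(props)
```

This only flattens one level; nested properties such as borders, tab stops, or section properties would need their own handling.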
Runs
• A run is a region of text with a common set of properties
• All text in a word processing document is contained within runs
• A run contains three pieces of information:
– Run properties
– Run content (e.g. text)
– (optionally) a set of revision IDs for document comparison
Runs
• All runs must be contained within a paragraph
• Producers may break runs whenever they choose, as long as the net property set for each run is correct
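One way to see why run boundaries are a producer's choice: if a consumer merges adjacent runs whose property sets match, differently segmented paragraphs reduce to the same formatted text. The sketch below assumes the ECMA-376 namespace URI and compares only the direct children of rPr, which is a simplification. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the published ECMA-376 standard.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def property_set(run):
    """Order-independent summary of a run's direct formatting (simplified)."""
    rpr = run.find("w:rPr", NS)
    if rpr is None:
        return frozenset()
    return frozenset((c.tag, tuple(sorted(c.attrib.items()))) for c in rpr)


def formatted_spans(p):
    """Merge adjacent runs with equal property sets into (text, props) spans."""
    spans = []
    for run in p.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        props = property_set(run)
        if spans and spans[-1][1] == props:
            spans[-1] = (spans[-1][0] + text, props)
        else:
            spans.append((text, props))
    return spans


# The same bold word, written as one run and as two runs.
ONE_RUN = ET.fromstring(
    f'<w:p xmlns:w="{W}"><w:r><w:rPr><w:b/></w:rPr><w:t>Hello</w:t></w:r></w:p>'
)
TWO_RUNS = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>Hel</w:t></w:r>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>lo</w:t></w:r>"
    "</w:p>"
)

print(formatted_spans(ONE_RUN) == formatted_spans(TWO_RUNS))
```

Both paragraphs yield a single bold span containing "Hello", so a consumer should not attach meaning to where a run happens to start or end.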
Run Properties
• The run properties are stored on the rPr element
• This contains all information on the formatting applied to the characters in this run
Run Properties
• Character style
• Font face
• Font size
• Bold
• Italic
• ALL CAPS
• Small caps
• Strikethrough
• Double Strikethrough
• Outline
• Shadow
• Emboss
• Engrave
• Hidden text
Run Properties (cont'd)
• Run property revisions
• Fit text (for East Asian typography)
• Vertical alignment
• RTL vs. LTR
• Complex script flag
• Emphasis mark
• Language ID of text
• Horizontal in vertical
• Two lines in one
• Math
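Many of the run properties above (bold, italic, caps, strikethrough, and so on) are on/off settings. As a hedged sketch based on the published ECMA-376 standard rather than on these slides: a bare element such as `<w:b/>` turns the property on, a `w:val` of `0`, `false`, or `off` turns it off, and an absent element leaves the value to be inherited from styles. Font size (`w:sz`) is stored in half-points. The helper names and the sample run are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the published ECMA-376 standard.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical run: character style, bold on, italic explicitly off, 12 pt.
RUN = ET.fromstring(
    f'<w:r xmlns:w="{W}">'
    "<w:rPr>"
    '<w:rStyle w:val="Emphasis"/>'
    "<w:b/>"
    '<w:i w:val="0"/>'
    "<w:caps/>"
    '<w:sz w:val="24"/>'
    "</w:rPr>"
    "<w:t>Sample</w:t>"
    "</w:r>"
)


def toggle(run, name):
    """Return True/False for an on/off property, or None if not set on the run."""
    rpr = run.find("w:rPr", NS)
    el = None if rpr is None else rpr.find(f"w:{name}", NS)
    if el is None:
        return None  # not specified here; value comes from the style hierarchy
    val = el.get(f"{{{W}}}val")
    return val is None or val not in ("0", "false", "off")


def font_size_points(run):
    """Return the font size in points, or None if the run does not set it."""
    sz = run.find("w:rPr/w:sz", NS)
    if sz is None:
        return None
    return int(sz.get(f"{{{W}}}val")) / 2  # stored in half-points


for prop in ("b", "i", "caps", "strike"):
    print(prop, toggle(RUN, prop))
print("size", font_size_points(RUN))
```

This reads direct formatting only. Resolving the effective value would also require walking the character style, paragraph style, and document defaults.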
Run Content
• Runs may contain 'run content':
– Text
– Deleted text
– Soft line breaks
– Field codes
– Deleted field codes
– Footnote/endnote reference marks
– Fields
Run Content
• Runs may contain 'run content' (cont'd):
– Page numbers
– Tabs
– Ruby text
– DrawingML content
– Embedded objects
– Pictures
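The two run content lists can be sketched as a lookup from child element names to the kinds of content named on the slides. The mapping below uses element names from the published ECMA-376 standard and is an assumption about how the slide labels correspond to them (for example, treating "Fields" as `fldChar` and "Soft line breaks" as `br`). The function name `describe_run` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the published ECMA-376 standard.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumed mapping from run child elements to the slide's content kinds.
RUN_CONTENT = {
    "t": "text",
    "delText": "deleted text",
    "br": "break",
    "instrText": "field code",
    "delInstrText": "deleted field code",
    "footnoteReference": "footnote reference",
    "endnoteReference": "endnote reference",
    "fldChar": "field character",
    "pgNum": "page number",
    "tab": "tab",
    "ruby": "ruby text",
    "drawing": "DrawingML content",
    "object": "embedded object",
    "pict": "picture",
}


def describe_run(run):
    """List the kinds of content in a run, skipping its rPr."""
    kinds = []
    for child in run:
        name = child.tag.split("}", 1)[-1]
        if name == "rPr":
            continue  # properties, not content
        kinds.append(RUN_CONTENT.get(name, f"other ({name})"))
    return kinds


# Hypothetical run mixing several kinds of content.
RUN = ET.fromstring(
    f'<w:r xmlns:w="{W}">'
    "<w:rPr><w:b/></w:rPr>"
    "<w:t>Page</w:t><w:tab/><w:pgNum/><w:br/>"
    "</w:r>"
)
kinds = describe_run(RUN)
print(kinds)
```

Elements not in the table are reported as "other" rather than dropped, since the lists on the slides are not necessarily exhaustive.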
Text
• Text elements are the only elements in the main story that can contain a text node(!)
– All other text is in an attribute value
• There are four types of text in WordprocessingML:
– Text
– Deleted text
– Field codes
– Deleted field codes
Text
• Why do we use a different element for deleted text?
– Good question!
• This allows simple consumers to get the text of the document easily by just grabbing the contents of the t node (text)
• They don't need to check where revisions start and end, etc. to extract the visible contents
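The point about simple consumers can be illustrated with a short sketch: collecting only `t` elements yields the visible text, with no revision bookkeeping. The sample markup, the function names, and the use of the ECMA-376 namespace URI and `ins`/`del`/`delText` element names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the published ECMA-376 standard.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical paragraph with one tracked deletion and one tracked insertion.
DOC = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t xml:space="preserve">The </w:t></w:r>'
    '<w:del w:id="1" w:author="A">'
    '<w:r><w:delText xml:space="preserve">quick </w:delText></w:r>'
    "</w:del>"
    '<w:ins w:id="2" w:author="A">'
    '<w:r><w:t xml:space="preserve">slow </w:t></w:r>'
    "</w:ins>"
    "<w:r><w:t>fox</w:t></w:r>"
    "</w:p>"
)


def visible_text(root):
    """Concatenate every t element; deleted text never appears in one."""
    return "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))


def deleted_text(root):
    """Concatenate every delText element, for consumers that want it."""
    return "".join(t.text or "" for t in root.iter(f"{{{W}}}delText"))


print(visible_text(DOC))
print(repr(deleted_text(DOC)))
```

The extractor never inspects the `ins` or `del` wrappers. Because deleted text lives in its own element, a search for `t` skips it automatically.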
Disclaimer
This presentation is for informational purposes only, and should not be relied upon as a substitute or replacement for Microsoft formal file format documentation, which is available at the following website: https://msdn.microsoft.com/en-us/library/cc313118(v=office.12).aspx. Any views or opinions presented in this material are solely those of the author and do not necessarily represent those of Microsoft. Microsoft disclaims all liability for mistakes or inaccuracies in this presentation.