Word Object Model / OpenXML
Brett Clouser

OpenXML: What is it?
- XML-based file format which describes documents, presentations, spreadsheets, etc.
- Replacement for binary file formats used in previous versions of Office

Why use OpenXML?
- Readable: plaintext representation
- Smaller: compressed as a ZIP archive
- Straightforward: images are represented within <pic> tags
- All the benefits of regular XML!

Docx Structure (Containers)
- Paragraph <w:p>
  - Most basic unit
  - One for each paragraph break in the document
  - Container element
- Run <w:r>
  - Region of content with a common set of properties
  - All runs must be contained within a paragraph

Docx Structure (Root Elements)
- Text <w:t>
  - Basic block of text
  - Normal formatting can be applied through formatting tags (e.g. <w:b> for bold)
  - Must be contained within a run
- Images <w:pic>
  - Pictures, clip art, SmartArt, shapes, charts, etc.
  - Additional transformations can be applied to the base image (rotation, reflection, etc.)

Docx Structure (example)

Rendered text: "This is bold text." (with the word "bold" in bold)

<w:p>
  <w:r>
    <w:t xml:space="preserve">This is </w:t>
  </w:r>
  <w:r>
    <w:rPr><w:b /></w:rPr>
    <w:t xml:space="preserve">bold </w:t>
  </w:r>
  <w:r>
    <w:t>text.</w:t>
  </w:r>
</w:p>
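
To make the paragraph / run / text nesting concrete, here is a minimal sketch that walks the example paragraph with Python's standard library and reports which runs are bold. The function name describe_runs is hypothetical, and the namespace declaration is added so that the fragment parses on its own. It assumes that the mere presence of <w:b> means bold; a fuller reader would also honor an explicit w:val attribute on that element.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix used in the slides).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# The example paragraph, with the namespace declared so it parses standalone.
PARAGRAPH_XML = f"""
<w:p xmlns:w="{W_NS}">
  <w:r><w:t xml:space="preserve">This is </w:t></w:r>
  <w:r><w:rPr><w:b/></w:rPr><w:t xml:space="preserve">bold </w:t></w:r>
  <w:r><w:t>text.</w:t></w:r>
</w:p>
"""


def describe_runs(paragraph_xml):
    """Return a list of (text, is_bold) tuples, one per run in the paragraph."""
    paragraph = ET.fromstring(paragraph_xml)
    runs = []
    for run in paragraph.findall("w:r", NS):
        # A run may hold several <w:t> elements; join their text.
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        # Assumption: presence of <w:b> in the run properties means bold.
        is_bold = run.find("w:rPr/w:b", NS) is not None
        runs.append((text, is_bold))
    return runs


runs = describe_runs(PARAGRAPH_XML)
plain_text = "".join(text for text, _ in runs)
for text, is_bold in runs:
    print(repr(text), "bold" if is_bold else "plain")
```

Concatenating the run texts gives back the sentence as the reader sees it; the formatting lives only in the run properties.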

Dissecting a Word 2007 Document

Demo
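
The live demo is not captured in these slides. As a rough stand-in, the sketch below builds a very small .docx-style package in memory and then takes it apart again, listing its parts and pulling the text out of word/document.xml. The part contents are a hypothetical minimum written for this example; a document saved by Word contains many more parts (styles, settings, themes and so on). The helper names build_sample_docx and dissect are also made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical minimal parts; a document saved by Word contains many more.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:r><w:t>Hello, OpenXML.</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def build_sample_docx():
    """Return the bytes of a small .docx-style ZIP package built in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", DOCUMENT)
    return buffer.getvalue()


def dissect(docx_bytes):
    """List the parts in the package and extract the text of the main part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        part_names = package.namelist()
        root = ET.fromstring(package.read("word/document.xml"))
    texts = [t.text or "" for t in root.iter(f"{{{W_NS}}}t")]
    return part_names, "".join(texts)


part_names, body_text = dissect(build_sample_docx())
for name in part_names:
    print(name)
print(body_text)
```

To inspect a real file instead, pass its path to zipfile.ZipFile in place of the in-memory buffer.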

Working with OpenXML documents
- Microsoft SDK for OpenXML
  - Provides strongly typed bindings for accessing document parts
  - Allows developer to create or change documents without having Word open
- Word Object Model
  - Coming up next…
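
The point about changing documents without having Word open can be illustrated without the SDK at all, since the package is just ZIP plus XML. The sketch below is not the Microsoft SDK (which is a .NET library); it uses Python's standard library as a stand-in, and the helper names replace_text, make_package and read_text are hypothetical. It assumes the text to replace sits inside a single <w:t> element, and the sample package holds only word/document.xml, so it is a test fixture rather than a file Word would open. Real documents declare many namespaces, and ElementTree may rename their prefixes on output, so treat this as an illustration rather than a safe editing tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the familiar "w:" prefix on output


def replace_text(docx_bytes, old, new):
    """Return a new package with `old` replaced by `new` in every <w:t>.

    Simplification: text that is split across several runs is not matched.
    """
    source = zipfile.ZipFile(io.BytesIO(docx_bytes))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "word/document.xml":
                root = ET.fromstring(data)
                for node in root.iter(f"{{{W_NS}}}t"):
                    if node.text:
                        node.text = node.text.replace(old, new)
                data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
            # Every other part is copied through unchanged.
            target.writestr(item.filename, data)
    return output.getvalue()


def make_package(text):
    """Build a hypothetical one-paragraph package (main document part only)."""
    document = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        f'<w:p><w:r><w:t>{text}</w:t></w:r></w:p>'
        '</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


def read_text(docx_bytes):
    """Concatenate the text of all <w:t> elements in the main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))


original = make_package("Draft report")
updated = replace_text(original, "Draft", "Final")
print(read_text(updated))
```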

Office Plugins
- Visual Studio Tools for Office (VSTO)
  - Add-on for Visual Studio 2005
  - Develop Office add-ins just like any other application
  - Use WYSIWYG editor to create GUI
  - Access the document through the Word object model

Word Object Model
- InlineShapes
  - Collection of references to all inline images in the document
- Paragraphs
  - Directly correspond to OpenXML <w:p> tags
- Ranges
  - Contiguous area in document
  - Can access actual text of document through Text property

Creating a plugin demo
Visual Studio Tools for Office Demo

How we’re using it…
- OpenXML SDK to parse the document/presentation for accessibility errors
- VSTO SE to create an add-in that checks accessibility
- Word Object Model to highlight regions of text and manipulate the document
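
The slides do not say which accessibility errors are checked, so the following is only an assumed example of what such a parsing pass might look like: it flags drawings that have no alternative text, which WordprocessingML stores in the descr attribute of <wp:docPr>. It uses Python's standard library rather than the OpenXML SDK, the sample document body is hypothetical and heavily trimmed, and the function name images_missing_alt_text is made up for this sketch.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"

# Hypothetical, trimmed document body: two drawings, only the first has alt text.
SAMPLE = f"""
<w:document xmlns:w="{W_NS}" xmlns:wp="{WP_NS}">
  <w:body>
    <w:p><w:r><w:drawing><wp:inline>
      <wp:docPr id="1" name="Picture 1" descr="Company logo"/>
    </wp:inline></w:drawing></w:r></w:p>
    <w:p><w:r><w:drawing><wp:inline>
      <wp:docPr id="2" name="Picture 2"/>
    </wp:inline></w:drawing></w:r></w:p>
  </w:body>
</w:document>
"""


def images_missing_alt_text(document_xml):
    """Return the names of drawings whose docPr lacks a non-empty descr."""
    root = ET.fromstring(document_xml)
    missing = []
    for props in root.iter(f"{{{WP_NS}}}docPr"):
        # Alternative text lives in the "descr" attribute; treat blank as missing.
        if not (props.get("descr") or "").strip():
            missing.append(props.get("name", "(unnamed)"))
    return missing


missing = images_missing_alt_text(SAMPLE)
print(missing)
```

An add-in could then use the Word object model to highlight the flagged items in the open document.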

Conclusions
Any questions?