Deep Dive into Documents with the OpenXML SDK
Andrew Coates
@coatsy

#DPP407

Once upon a time …

It’s a ZIP file
There are parts and relationships
Parts are (often) XML

Open Packaging Convention (OPC)
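To make "parts and relationships" concrete, here is a minimal sketch that assembles a tiny WordprocessingML package by hand using only `System.IO.Compression`. The class name `HandBuiltPackage`, the output file name, and the sample text are illustrative assumptions, not part of the talk. The three entries are the content-types list, the package-level relationship that points at the main document part, and the part itself.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class HandBuiltPackage
{
    // Declares the content type of each part in the package.
    const string ContentTypes =
        "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>" +
        "<Types xmlns='http://schemas.openxmlformats.org/package/2006/content-types'>" +
        "<Default Extension='rels' ContentType='application/vnd.openxmlformats-package.relationships+xml'/>" +
        "<Default Extension='xml' ContentType='application/xml'/>" +
        "<Override PartName='/word/document.xml' ContentType='application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml'/>" +
        "</Types>";

    // Package-level relationship: "the main document lives at word/document.xml".
    const string PackageRels =
        "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>" +
        "<Relationships xmlns='http://schemas.openxmlformats.org/package/2006/relationships'>" +
        "<Relationship Id='rId1' Type='http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument' Target='word/document.xml'/>" +
        "</Relationships>";

    // The main document part (sample text is a placeholder).
    const string DocumentXml =
        "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>" +
        "<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
        "<w:body><w:p><w:r><w:t>Hello from a hand-built package</w:t></w:r></w:p></w:body>" +
        "</w:document>";

    static void AddEntry(ZipArchive zip, string name, string xml)
    {
        ZipArchiveEntry entry = zip.CreateEntry(name);
        // Write UTF-8 without a byte-order mark; dispose before adding the next entry.
        using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
        {
            writer.Write(xml);
        }
    }

    // Returns the bytes of a ZIP container holding the three entries above.
    public static byte[] Build()
    {
        using (var buffer = new MemoryStream())
        {
            using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            {
                AddEntry(zip, "[Content_Types].xml", ContentTypes);
                AddEntry(zip, "_rels/.rels", PackageRels);
                AddEntry(zip, "word/document.xml", DocumentXml);
            }
            return buffer.ToArray();
        }
    }

    public static void Main()
    {
        // "HandBuilt.docx" is a hypothetical output path.
        File.WriteAllBytes("HandBuilt.docx", Build());
    }
}
```

This is a sketch of the container layout only; real documents normally carry more parts (styles, settings, theme and so on).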

I told you it was a ZIP file
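One way to check the claim without renaming the file to .zip is to open the package with the ordinary ZIP classes and list its entries. The following is a minimal sketch; the class name `PackageLister` and the default file name `Test.docx` are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class PackageLister
{
    // Returns the full name of every entry in the ZIP container.
    // The stream must be seekable (a FileStream or MemoryStream is fine).
    public static List<string> ListEntries(Stream package)
    {
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            return zip.Entries.Select(e => e.FullName).ToList();
        }
    }

    public static void Main(string[] args)
    {
        // "Test.docx" is a placeholder; pass any .docx, .xlsx or .pptx path instead.
        string fileName = args.Length > 0 ? args[0] : "Test.docx";
        using (FileStream stream = File.OpenRead(fileName))
        {
            foreach (string name in ListEntries(stream))
            {
                Console.WriteLine(name);
            }
        }
    }
}
```

For a Word document the listing typically includes `[Content_Types].xml`, `_rels/.rels` and `word/document.xml`.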

Documents
Workbooks
Presentations

Lots of Package Types

Documents

Need:
document: The root element for a WordprocessingML's main document part, which defines the main document story.
body: The container for the collection of block-level structures that comprise the main story.
p: A paragraph.
r: A run.
t: A range of text.

http://msdn.microsoft.com/EN-US/library/office/gg278308(v=office.15).aspx
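The five elements above nest in exactly that order. As a minimal sketch, the main document part's XML can be produced with LINQ to XML; the class name `MinimalDocumentXml` and the sample text are assumptions for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class MinimalDocumentXml
{
    // The WordprocessingML main namespace, conventionally bound to the "w" prefix.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds document > body > p > r > t around the supplied text.
    public static XDocument Build(string text)
    {
        return new XDocument(
            new XDeclaration("1.0", "UTF-8", "yes"),
            new XElement(W + "document",
                new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
                new XElement(W + "body",
                    new XElement(W + "p",
                        new XElement(W + "r",
                            new XElement(W + "t", text))))));
    }

    public static void Main()
    {
        // Prints the element tree (ToString() omits the XML declaration).
        Console.WriteLine(Build("Hello, Open XML"));
    }
}
```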

Need:
A single sheet
A sheet ID
A relationship Id pointing to the location of the sheet definition

Spreadsheets

http://msdn.microsoft.com/en-us/library/office/gg278316(v=office.15).aspx
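Those three requirements map onto a single `sheet` element in the workbook part. The sketch below builds that XML with LINQ to XML; the class name `MinimalWorkbookXml` and the values `Sheet1`, `1` and `rId1` are illustrative assumptions. The sheet also carries a `name` attribute, and the relationship Id must match an entry in the workbook part's relationships that targets the worksheet part (not shown here).

```csharp
using System;
using System.Xml.Linq;

public static class MinimalWorkbookXml
{
    // SpreadsheetML main namespace (serialized as the default namespace).
    static readonly XNamespace S =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Relationships namespace, conventionally bound to the "r" prefix.
    static readonly XNamespace R =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    public static XDocument Build(string sheetName, uint sheetId, string relationshipId)
    {
        return new XDocument(
            new XElement(S + "workbook",
                new XAttribute(XNamespace.Xmlns + "r", R.NamespaceName),
                new XElement(S + "sheets",
                    new XElement(S + "sheet",
                        new XAttribute("name", sheetName),
                        new XAttribute("sheetId", sheetId),
                        new XAttribute(R + "id", relationshipId)))));
    }

    public static void Main()
    {
        // Sample values only.
        Console.WriteLine(Build("Sheet1", 1, "rId1"));
    }
}
```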

Need:
Presentation part, represented by the file presentation.xml
Presentation properties part (presProps.xml)
Slide master part (slideMaster.xml)
Slide layout part (slideLayout.xml)
Theme part (theme.xml)
[One or more slide parts (slide.xml)]

Presentations

“The packaging structure of a presentation document contains several references between the parts, including some circular references. For example, slide layouts reference slide masters, and slide masters reference slide layouts”
http://msdn.microsoft.com/en-us/library/office/gg278335(v=office.15).aspx

Enter the OpenXML SDK

v1.0 (2007): OPC only (XML up to you)
v2.0 (2008): Strongly typed, LINQ, validation
v2.5 (2013): Office 2013, read ISO Strict
Open sourced (2014)

OOXMLSDK v1.0

static void Main(string[] args)
{
    const string fileName = "Test.docx";

    const string documentRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
    const string stylesRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles";

    XDocument xDoc = null;
    XDocument styleDoc = null;

    using (Package wdPackage = Package.Open(fileName, FileMode.Open, FileAccess.Read))
    {

http://blogs.msdn.com/b/ericwhite/archive/2007/12/20/what-is-the-difference-between-the-system-io-packaging-and-microsoft-office-documentformat-openxml-packaging-namespaces.aspx

OOXMLSDK v1.0 (cont)

        PackageRelationship docPackageRelationship = wdPackage
            .GetRelationshipsByType(documentRelationshipType)
            .FirstOrDefault();
        if (docPackageRelationship != null)
        {
            Uri documentUri = PackUriHelper.ResolvePartUri(
                new Uri("/", UriKind.Relative),
                docPackageRelationship.TargetUri);
            PackagePart documentPart = wdPackage.GetPart(documentUri);

            // Load the document XML in the part into an XDocument instance.
            xDoc = XDocument.Load(XmlReader.Create(documentPart.GetStream()));

OOXMLSDK v1.0 (cont (cont))

            // Find the styles part. There will only be one.
            PackageRelationship styleRelation = documentPart
                .GetRelationshipsByType(stylesRelationshipType)
                .FirstOrDefault();
            if (styleRelation != null)
            {
                Uri styleUri = PackUriHelper.ResolvePartUri(documentUri, styleRelation.TargetUri);
                PackagePart stylePart = wdPackage.GetPart(styleUri);

                // Load the style XML in the part into an XDocument instance.
                styleDoc = XDocument.Load(XmlReader.Create(stylePart.GetStream()));
            }
        }
    }

OOXMLSDK v1.0 (cont (cont (cont)))

    Console.WriteLine("The main document part has {0} nodes.",
        xDoc.DescendantNodes().Count());
    Console.WriteLine("The style part has {0} nodes.",
        styleDoc.DescendantNodes().Count());
    Console.ReadKey();
}

OOXMLSDK v2.0

static void Main(string[] args)
{
    const string fileName = "Test.docx";
    using (WordprocessingDocument doc = WordprocessingDocument.Open(fileName, false))
    {
        MainDocumentPart mainPart = doc.MainDocumentPart;
        StyleDefinitionsPart stylePart = mainPart.StyleDefinitionsPart;
        Console.WriteLine("The main document part has {0} descendents.",
            mainPart.RootElement.Descendants().Count());
        Console.WriteLine("The style part has {0} descendents.",
            stylePart.RootElement.Descendants().Count());
        Console.ReadKey();
    }
}

All documents (packages) have the same base structure
You can build a document from scratch (it’s just a ZIP file and XML)
It’s MUCH easier to use the OpenXML SDK

OpenXML
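To illustrate the "much easier" point, here is a minimal sketch of creating a one-paragraph document with the strongly typed SDK classes. It assumes the `DocumentFormat.OpenXml` library is referenced; the class name `HelloWordprocessing`, the output path and the sample text are illustrative assumptions. The SDK takes care of the content types and relationships that otherwise have to be written by hand.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class HelloWordprocessing
{
    // Creates a .docx at 'path' containing a single paragraph with 'text'.
    public static void Create(string path, string text)
    {
        using (WordprocessingDocument doc =
            WordprocessingDocument.Create(path, WordprocessingDocumentType.Document))
        {
            // Adds the main document part and its package relationship.
            MainDocumentPart mainPart = doc.AddMainDocumentPart();

            // document > body > p > r > t, expressed as typed objects.
            mainPart.Document = new Document(
                new Body(
                    new Paragraph(
                        new Run(
                            new Text(text)))));
            mainPart.Document.Save();
        }
    }

    public static void Main()
    {
        // "Hello.docx" is a hypothetical output path.
        Create("Hello.docx", "Hello, Open XML SDK");
    }
}
```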

Building Simple OOXML Documents

Deconstructing

Manual deconstruction

Programmatic Deconstruction(& reconstruction)

Constructing

Complex Document

Templated Document

Spreadsheets: Shared String Table, Calculation Chain, Formulas, PivotTables

Presentations: Animations, Handouts/Notes

Documents: Styles, Themes

What I Haven’t Covered

Open XML SDK on GitHub - https://github.com/OfficeDev/Open-Xml-Sdk
Brian Jones' blog announcing open sourcing the SDK - http://blogs.office.com/2014/06/25/open-xml-sdk-goes-open-source/
Open XML SDK 2.5 Docs - http://msdn.microsoft.com/en-us/library/office/bb448854(v=office.15).aspx
Open XML Developer Site - http://openxmldeveloper.org/
About the Open XML SDK 2.5 for Office - http://msdn.microsoft.com/en-us/library/office/bb456487(v=office.15).aspx
Working with WordprocessingML documents (Open XML SDK) - http://msdn.microsoft.com/en-us/library/office/gg278327(v=office.15).aspx
Working with PresentationML documents (Open XML SDK) - http://msdn.microsoft.com/en-us/library/office/gg278318(v=office.15).aspx
Working with SpreadsheetML documents (Open XML SDK) - http://msdn.microsoft.com/en-us/library/office/gg278328(v=office.15).aspx

Resources

ISO/IEC 29500-1:2012 - http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=61750
Open XML SDK .NET Foundation Page - http://www.dotnetfoundation.org/prjopenxml.aspx
Open XML Package Editor Released for VS2012 and VS2013 - http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2014/06/07/open-xml-package-editor-released-for-vs2012-and-vs2013.aspx

http://visualstudiogallery.msdn.microsoft.com/450a00e3-5a7d-4776-be2c-8aa8cec2a75b
http://blogs.msdn.com/b/acoat/archive/2010/06/19/document-creation-and-conversion-with-the-openxml-sdk-and-sharepoint-2010-word-automation-services.aspx
http://blogs.msdn.com/b/acoat/archive/2011/04/06/document-creation-and-conversion-with-the-openxml-sdk-and-sharepoint-2010-word-automation-services-part-2.aspx

Resources (cont)

Build Awesome Documents

Fast

Thanks! Don’t forget to complete your evaluations

aka.ms/mytechedmel
