pdf.js Julian Viereck @jviereck

2011 09-pdfjs


DESCRIPTION

Lightning talk abo


Page 1: 2011 09-pdfjs

pdf.js

Julian Viereck

@jviereck

Page 2: 2011 09-pdfjs

Overview

• What is pdf.js

• How PDF is structured

• Processing in pdf.js

• Images & Fonts

• Problems

• Todo

• Demo

Page 3: 2011 09-pdfjs

What is pdf.js

• building a faithful & efficient PDF renderer

• HTML5 technology experiment

• no native code

• secure (web sandbox)

• Mozilla Labs Project - Open Source

Page 4: 2011 09-pdfjs

How PDF is structured

PDF file:

• Header: PDF version

• Body [Objects]: sequence of objects (fonts, drawing cmds, images, words, bookmarks, form fields)

• xRef Table: mapping objID ⇔ byte offset

• Trailer: root objID, xRef byte offset (root obj = ref to pages catalog)
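As a rough illustration of that layout, the sketch below pulls the PDF version out of the header and the xRef byte offset and root objID out of the trailer. The function name `readPdfSkeleton` and the tiny hand-built sample file are assumptions made for this example, not pdf.js code, and the sketch ignores files that store the xRef table as a compressed stream.

```javascript
// Hypothetical helper (not pdf.js code): read the header and trailer of a PDF.
function readPdfSkeleton(bytes) {
  // Map each byte to one character so string indices equal byte offsets.
  const text = Array.from(bytes, (b) => String.fromCharCode(b)).join('');

  // Header: "%PDF-<major>.<minor>" at the very start of the file.
  const header = /^%PDF-(\d+\.\d+)/.exec(text);
  if (!header) throw new Error('missing %PDF- header');

  // The last "startxref" keyword is followed by the byte offset of the xRef table.
  const marker = text.lastIndexOf('startxref');
  const offset = marker < 0 ? null : /startxref\s+(\d+)/.exec(text.slice(marker));
  if (!offset) throw new Error('missing startxref offset');

  // The trailer dictionary names the root object as "/Root <objID> <gen> R".
  const trailerAt = text.lastIndexOf('trailer');
  const root = trailerAt < 0 ? null : /\/Root\s+(\d+)\s+\d+\s+R/.exec(text.slice(trailerAt));

  return {
    version: header[1],
    xrefOffset: Number(offset[1]),
    rootObjId: root ? Number(root[1]) : null,
  };
}

// Minimal hand-written sample: header, one object, xRef table, trailer.
const body = '%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n';
const sample =
  body +
  'xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n' +
  'trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n' + body.length + '\n%%EOF\n';

const skeleton = readPdfSkeleton(new TextEncoder().encode(sample));
console.log(skeleton.version, skeleton.rootObjId, skeleton.xrefOffset);
```

Reading from the end of the file is the point of the trailer: a reader can find the xRef table first and then jump straight to any object by byte offset.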

Page 5: 2011 09-pdfjs

Processing in pdf.js

• get plain Uint8Array via XHR2, build Stream

• new PDFDoc(stream): read xRef, root object

• page = PDFDoc.getPage(N)

• page.startRendering(graphics)

• PartialEvaluator: read & convert all PDF cmds ➟ IR (internal representation)

• load required objects (fonts, images)

• CanvasGraphics: graphics.executeIR(IR)
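The last step, executing the IR on a graphics object, can be pictured as a simple dispatch loop. This is a minimal sketch under assumptions: the `{ fn, args }` entry shape and the `RecordingGraphics` class are invented for illustration, and the class records calls instead of drawing on a canvas so that it runs anywhere.

```javascript
// Hypothetical stand-in for a graphics backend: it logs calls instead of drawing.
class RecordingGraphics {
  constructor() {
    this.log = [];
  }
  beginText() { this.log.push('beginText'); }
  moveText(x, y) { this.log.push(`moveText ${x} ${y}`); }
  showText(text) { this.log.push(`showText ${text}`); }
  endText() { this.log.push('endText'); }
  moveTo(x, y) { this.log.push(`moveTo ${x} ${y}`); }
  lineTo(x, y) { this.log.push(`lineTo ${x} ${y}`); }
  stroke() { this.log.push('stroke'); }

  // Run each IR entry by calling the method named in entry.fn with entry.args.
  executeIR(ir) {
    for (const { fn, args } of ir) {
      if (typeof this[fn] !== 'function') {
        throw new Error(`unsupported IR op: ${fn}`);
      }
      this[fn](...args);
    }
  }
}

// Assumed IR shape: an ordered list of { fn, args } entries.
const pageIR = [
  { fn: 'beginText', args: [] },
  { fn: 'moveText', args: [100, 700] },
  { fn: 'showText', args: ['Hello World!'] },
  { fn: 'endText', args: [] },
  { fn: 'moveTo', args: [50, 600] },
  { fn: 'lineTo', args: [400, 600] },
  { fn: 'stroke', args: [] },
];

const graphics = new RecordingGraphics();
graphics.executeIR(pageIR);
console.log(graphics.log.join('\n'));
```

Because the IR is plain data, it can be produced in one place and replayed in another, which is what makes swapping the backend possible.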

Page 6: 2011 09-pdfjs

3 0 obj
<<
  /Type /Page
  /MediaBox [0 0 612 792]
  /Resources 4 0 R
  /Parent 2 0 R
  /Contents 5 0 R
>>
endobj

5 0 obj
<< /Length 8 0 R >>
stream
/GS1 gs
/F0 12 Tf
BT
100 700 Td
(Hello World!) Tj
ET
50 600 m
400 600 l
S
endstream
endobj

1. page=PDFDoc.getPage(2) ➟ obj#3

2. page.startRendering(...) ➟ obj#4, obj#5

The stream may be encoded!

Page 7: 2011 09-pdfjs

IR Form

content stream of obj 5 ➟ PartialEvaluator (+ xRef, catalog, resources) ➟ IR ➟ CanvasGraphics

setGState: [ LW: 10 ]
dependency: [ font0 ]
setFont: font0, 12
beginText
moveText: 100, 700
showText: “Hello World!”
endText
moveTo: 50, 600
lineTo: 400, 600
stroke
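To make the conversion from PDF commands to IR concrete, here is a minimal sketch of the idea: operands are collected until an operator keyword arrives, and the operator is then renamed to its IR function. The `toIR` function, the `{ fn, args }` entry shape, and the small operator table are assumptions for illustration. The sketch does not resolve names such as /GS1 or /F0 through the page resources and emits no dependency entries, both of which the real evaluator has to do.

```javascript
// Operator table limited to the operators that appear in the example stream.
const OPERATORS = new Map([
  ['gs', 'setGState'],
  ['Tf', 'setFont'],
  ['BT', 'beginText'],
  ['Td', 'moveText'],
  ['Tj', 'showText'],
  ['ET', 'endText'],
  ['m', 'moveTo'],
  ['l', 'lineTo'],
  ['S', 'stroke'],
]);

// Hypothetical converter: PDF content stream text to a list of { fn, args }.
function toIR(content) {
  // Tokens: (string literal) | /Name | number | operator keyword.
  // Nested parentheses, arrays and dictionaries are not handled in this sketch.
  const tokenRe = /\(((?:\\.|[^\\()])*)\)|\/([^\s\/()<>\[\]]+)|([+-]?\d*\.?\d+)|([A-Za-z*'"]+)/g;
  const ir = [];
  let operands = [];
  for (const match of content.matchAll(tokenRe)) {
    if (match[1] !== undefined) {
      operands.push(match[1]); // string literal
    } else if (match[2] !== undefined) {
      operands.push(match[2]); // name, left unresolved
    } else if (match[3] !== undefined) {
      operands.push(Number(match[3])); // number
    } else {
      // PDF uses postfix notation: the operator closes the pending operands.
      const fn = OPERATORS.get(match[4]);
      if (!fn) throw new Error(`unsupported operator: ${match[4]}`);
      ir.push({ fn, args: operands });
      operands = [];
    }
  }
  return ir;
}

const contentStream =
  '/GS1 gs /F0 12 Tf BT 100 700 Td (Hello World!) Tj ET 50 600 m 400 600 l S';
const convertedIR = toIR(contentStream);
console.log(convertedIR.map((e) => `${e.fn}: ${e.args.join(', ')}`).join('\n'));
```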

Page 8: 2011 09-pdfjs

Images

• JPEG streams:

• DOMImg.src = 'data:image/jpeg;base64,' + window.btoa(bytesToString(bytes));

• If not JPEG stream:

• read bytes, convert to colorspace

• imgData = canvas.getImageData()

• fillWithPixelData(bytes, imgData)

• canvas.putImageData(imgData)
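Both image paths can be sketched without a browser. The example below assumes the simplest non-JPEG case, 8-bit grayscale samples, and uses a plain object with a `data` array in place of a real ImageData. The names `fillWithGrayPixelData` and `jpegDataUrl` are invented for this sketch, and `bytesToString` is written out here as one plausible version of the helper named on the slide.

```javascript
// Expand 8-bit grayscale samples into the RGBA layout used by ImageData.data.
function fillWithGrayPixelData(bytes, imgData) {
  const out = imgData.data;
  if (out.length !== bytes.length * 4) {
    throw new Error('pixel buffer size does not match sample count');
  }
  for (let i = 0, j = 0; i < bytes.length; i++, j += 4) {
    out[j] = out[j + 1] = out[j + 2] = bytes[i]; // R = G = B = gray value
    out[j + 3] = 255; // fully opaque
  }
}

// Turn raw bytes into a binary string, one character per byte, as btoa expects.
function bytesToString(bytes) {
  let s = '';
  for (const b of bytes) s += String.fromCharCode(b);
  return s;
}

// JPEG path: hand the untouched stream bytes to the browser as a data URL.
function jpegDataUrl(bytes) {
  return 'data:image/jpeg;base64,' + btoa(bytesToString(bytes));
}

// Plain object standing in for canvas ImageData (2 x 1 pixels).
const imgData = { width: 2, height: 1, data: new Uint8ClampedArray(8) };
fillWithGrayPixelData(Uint8Array.of(0, 200), imgData);

// The first three bytes of a JPEG file are FF D8 FF.
const jpegUrl = jpegDataUrl(Uint8Array.of(0xff, 0xd8, 0xff));
console.log(Array.from(imgData.data), jpegUrl);
```

The JPEG path is cheap because the browser does the decoding; every other format pays for a per-pixel loop in script.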

Page 9: 2011 09-pdfjs

Fonts

• There are lots of different font formats!

• fonts are converted to OpenType

• use CSS: @font-face { font-family: 'font0'; src: url(data:font/opentype;base64, ...); }

• some fonts can’t be converted :(

• use drawing commands?
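The CSS step might look like the following sketch, which builds the @font-face rule text from already converted OpenType bytes. The function name `fontFaceRule` is an assumption, and the four sample bytes are only the "OTTO" signature that starts a CFF-flavoured OpenType file, not a usable font. In a browser the returned text would then be added to a style sheet.

```javascript
// Hypothetical helper: wrap converted OpenType bytes in an @font-face rule.
function fontFaceRule(fontName, otfBytes) {
  // btoa expects a binary string with one character per byte.
  let binary = '';
  for (const b of otfBytes) binary += String.fromCharCode(b);
  const url = 'data:font/opentype;base64,' + btoa(binary);
  return `@font-face { font-family: '${fontName}'; src: url(${url}); }`;
}

// Placeholder bytes: just the "OTTO" signature, not a complete font.
const rule = fontFaceRule('font0', Uint8Array.of(0x4f, 0x54, 0x54, 0x4f));
console.log(rule);
```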

Page 10: 2011 09-pdfjs

Problems

• No way to detect when a font is loaded (hacks)

• Font width (wrong on some platforms)

• Subpixel font size depending on platform

• Text selection

• Printing

• Speed

• use workers (objects sent via postMessage lose their shape)

• partial rendering

platform = browser + OS
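One reading of the worker problem is that postMessage copies data with the structured clone algorithm, which keeps an object's own properties but drops its prototype, so class instances arrive as plain objects without methods. The sketch below demonstrates this with the global `structuredClone` function, assuming a runtime that provides it. The `Cmd` class is invented for the example.

```javascript
// Invented example class: a command object with a method on its prototype.
class Cmd {
  constructor(name) {
    this.name = name;
  }
  describe() {
    return `Cmd(${this.name})`;
  }
}

const sent = new Cmd('BT');

// structuredClone copies own data properties but not the prototype chain.
const received = structuredClone(sent);
const keptData = received.name === 'BT';
const keptShape = received instanceof Cmd;

// One workaround: rebuild the instance on the receiving side.
const revived = Object.assign(Object.create(Cmd.prototype), received);

console.log(keptData, keptShape, revived.describe());
```

This is why data handed to a worker tends to be designed as plain arrays and objects rather than as instances with behaviour.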

Page 11: 2011 09-pdfjs

Todo

• more font work, printing, speed

• support more of the rendering spec

• explore using SVG

• PDF forms, “advanced PDF features”

• infrastructure: automated testing, requireJS

• test more PDFs (need your help!)

Page 12: 2011 09-pdfjs

Demo

Page 13: 2011 09-pdfjs

Contact

GitHub: https://github.com/andreasgal/pdf.js

Mailing list: https://groups.google.com/group/mozilla.dev.pdf-js/topics

IRC: irc.mozilla.org #pdfjs