View
219
Download
2
Embed Size (px)
Citation preview
SMIL: Synchronized Multimedia Integration Language 2.0
Nabil LAYAÏDA
INRIA Rhône-Alpes – SYMM WG/W3C, Monbonnot
March 2002
• Web evolution– Diversity of formats and platforms (no
common exchange format, …)– Multimedia and the web were developed in
distinct parallel worlds !
• Multimedia on the web: an integration problem at two levels
1. Between media objects (mp3, video, text, ..)
2. With the Web (web technologies)
• Context of this work : W3C– SYMM Working Group (SMIL 1.0 et 2.0)
Introduction
Plan
• Introduction
• Goal and design principles
• Organization of a SMIL document (.smi)– Spatial et temporal aspects– Animations and transitions– Hypermedia time-based Links– Scalability framework
• Conclusions and future work
Goal of the project : version 2
• Document Text Format for the integration of media items (the html of multimedia).
• Use Web technologies where applicable for multimedia : XML, Namespaces, Schemas, ... DOM.
• Promotion of the notion of time-based and Synchronized documents at the scale of the web.
• Remain neutral to access protocols and to media formats RTP, RTSP, Mpeg,...
• Bring together major players in multimedia around an open format (the impossible challenge !).
Involved Companies and Organizations
• Application developers – Oratrix, Real Networks, Microsoft, IBM,
Macromedia, Intel, Philips, Panasonic, Nokia Products
– Public Institutions : INRIA, CWI, NIST, WGBH …Experimental Syst.
• Strengths of SMIL– Version 1.0 is a success …given the no of ??
– Very simple to learn and to use.
– Growing integration with other web standards.
SMIL 2.0 : Design principlesMeta-language which allows the description of multimedia documents
ranging from the simplest to the very complex.
XMLDOM 1-2
SMIL DOM
SVGAnimationTransition ….
1 application profile
Functional spaceSynchronization
Languages space
Syntactic and compositional space, programming APIs, …
Vector Animations
Namespaces
SMIL 2.0 : functional spaces
The functional aspects covered in SMIL 2.0 are :
• Layout -- positioning on the screen and audio channels
• Content Control -- content selection, adaptation and optimization
• Structure – the glue between the different modules
• Metainformation -- metadata
• Timing and Synchronization – the heart of the language
• Linking -- hypermedia time-based navigation
• Media object – basic media objects
• Time manipulations – accelerate / decelerate time
• Transition effects – fade-ins, visual effects …
SMIL 2.0 : languages spaceA profile :
• A language corresponds to a particular application (DTD, Schema)
• A composition of the functional space (modules)
• Integration with extra-SMIL modules (Animation SVG)
SMIL 2.0 Language Profile (SMIL Profile) :
• Successor of SMIL 1.0 (backward compatible)
• XML Language syntax + a semantics
• Composition of most of the SMIL 2.0 modules
SMIL 2.0 Basic Language Profile :
• Language for 3G phones and PDAs ...
• A scalability framework to handle heterogeneous devices
XHTML + SMIL
• Basic medias are XHTML elements
• Fusion (thanks to namespaces) of two XML languages
Typical SMIL Documents
• A set of “components” accessible via urls, the content is not included in a SMIL file
• These components may have different media types: audio, video, text, image, etc.
• Synchronization : intra- inter-objects et lip-sync• User Interactions : TAC (Global and VCR like)
and spatial and temporal links, dynamic changes to the course of a presentation (events)
Plan
• Introduction
• Goal and design principles
• Organization of a SMIL document (.smi)– Spatial et temporal aspects– Animations and transitions– Hypermedia time-based Links– Scalability framework
• Conclusions and future work
Organization of a SMIL document
Two parts :• Head : contains information of document level
• Body : contains the temporal scenario, animations, transitions and the media objects
Structure of a document
toto.smi
head body
seq switchparLayout
Region 1
Media
Audio Channel
Media
Transition
Animation
Transition
<smil xmlns="http://www/w3.org/2000/SMIL20/Language"> <head> <layout type="text/smil-basic"> <region id="left-video" left="20" top="50" z-index="1"/> <region id="left-text" left="20" top="120" z-index="1"/> <region id="right-text" left="150" top="120" z-index="1"/> </layout> </head> <body> <par>
<seq> <img src="graph" region="left-video" dur="45s"/> <text src="graph-text" region="left-text"/> </seq> <par> <a href="http://www.w3.org/People/Berners-Lee"> <video src="tim-video" region="left-
video"/> </a> <text src="tim-text" region="right-text"/> </par> <seq> <audio src="joe-audio"/> <video id="jv" src="joe-video" region="right-video"/> </seq> </par> </body> </smil>
Head
Body = scenario
Document Head
• META element: description of the document properties and metadata (RDF)– Title, author, expiration date, keywords,
summary, …– … the MPEG 7 of SMIL !
Metadata Example<!-- Metadata about the SMIL presentation --> <rdf:Description about="http://www.example.com/meta.smi" dc:Title="An
Introduction to the Resource Description Framework" dc:Description="The Resource Description Framework (RDF) enables the encoding, exchange and reuse of structured metadata" dc:Publisher="W3C" dc:Date="1999-10-12" dc:Rights="Copyright 1999 John Smith" dc:Format="text/smil" >
<dc:Creator> <rdf:Seq ID="CreatorsAlphabeticalBySurname">
<rdf:li>Mary Andrew</rdf:li> <rdf:li>Jacky Crystal</rdf:li>
</rdf:Seq> </dc:Creator><smilmetadata:ListOfVideoUsed>
<rdf:Seq ID="VideoAlphabeticalByFormatname"> <rdf:li Resource="http://www.example.com/videos/meta-1999.mpg"/> <rdf:li Resource="http://www.example.com/videos/meta2-1999.mpg"/>
</rdf:Seq> </smilmetadata:ListOfVideoUsed> <smilmetadata:Access LevelAccessibilityGuidelines="AAA"/> </rdf:Description>
Spatial Aspect
• Layout element– Hierarchical Model with hotspots (regPoints)– Layers instead of text flow (text flow for text !)– Simple Positioning close to the CSS
Model 2+1/2 D (x, y, z-index, fit)
Spatial Aspect
Region 1 Region 2
Region 3
b
c
a
Time flow
Regions and sub-regions for the spatial placement of media objects
Document Body: Synchronization
Contains the temporal scenario of the document
• A scenario is defined recursively : Schedule elements
• Schedule = Parallel | Seq | Excl | Media object
| anchors (starting/arriving)| Switch
| priorityClass | Prefetch
Basic Media Objects
Media Objects marked-up with:Audio, Video, Text, Img, Textstream,
Animation, Ref, Param, et … Prefetch
Attributes :• Src : identifies the basic media object file (URL)
rtsp://rtsp.example.org/video.mpg
• Type : mime type (eg. video/mpeg)
• Region : identifies the drawing surface
• Dur : duration of the media object
Synchronization Attributes
The Dur (duration) attribute:• “intrinsic”: the duration corresponds to the duration of the
external file.
• “explicit”: the duration is specified in the document(dur= “15 s”)
The repeat attribute:RepeatCount=“3” repeats the simple duration of the media.
RepeatDur=“12 s” :
Synchronization Attributes
The begin, end attributes: • Value (begin= “13 s”) : offset relative to the parent
element.
• Reference to another clock : (begin= “e2.end + 5 s ”)
• Reference to the absolute time reference: (begin= “wallclock(2001-01-01Z)”
• Reference to an asynchronous event (interactivity):(begin= “button.click”)
Media Clipping
• Spatial Clipping using regions and sub-regions• Temporal Clipping using clip-begin et clip-end
attributes (media objects are external files)
<video id="a" src=“video.mpg" clip-begin=“smpte=00:01:45" clip-end=“smpte=00:01:55" …/>
Media Slice
• Semantics : play in sequence a set of media objects
• Attributes – Fill : used to make the object « persist » on the screen
• Remove : removes the object at end time
• Freeze : keeps the last frame at end time
The sequential element : seq
<seq> <image id="a" regionName=“x” src="wait.gif“ fill=“freeze”/> <video id="b" regionName=“x” src="video.au dur="20 s" /></seq>
The parallel element : par (1)
• Semantics : – Play in parallel a set of media objects– End time : maximum duration of child objects
• Attributes :– endSync : Last (Rendez-vous)– Dur : reference clock of the par(Wall clock)– Begin/End : Synchronization Arc
Synchronization Arcs and EventsAllows the description of graph structures:
a
cb
d
...<par> <audio id="a" src="audio.au" begin="id(b)" /> <video id="b" src="video.au end="id(c)" /> <text id="c" src="text" begin="id(d)"
end=id(a)(end) /> <image id="d" src="image.gif" begin="id(b)(end)/></par> ...
Triggering of objects on events :
...<par> <img id="a" src=“image" /> <video id="b" src="video.mpg” begin=“a.activateEvent" end=“a.activateEvent /> <text id=“c" src=“text” end=“b.focusInEvent" /></par> ...
a
cb
syncBehavior and syncTolerance• syncBehavior
– canSlip : the synchro is loose, child elements can slip from the parent clock
– locked : the Synchronization is hard (lipsync), amount of tolerated slipping (syncTolerance).
– Independent : synchro completely independent
• syncTolerance =“amount of jitter”
• syncMaster=“true” clock ticker of the par element
excl and priorityClass elements
• Semantics : – Play a set of media objects one at a time– End time: same as par/seq with the addition of
<excl dur="indefinite“ endSync=“all”> <priorityClass id="ads" peers="defer">
<video id=“pub1" .../> <video id=“pub2" .../>
</priorityClass> <priorityClass id="program" peers="stop" higher="pause">
<video id="program1" .../> <video id="program2" .../> <video id="program3" .../> <video id="program4" .../>
</priorityClass> </excl>
Switch element• An element to choose from a set of content
equivalent objects• Choice is based on attribute values
– language, screen size, depth, bitrate, systemRequired
– …and user preferences
...<par> <text .../> <switch> <par bitrate="40000"> ... </par> <par bitrate="24000"> ... </par> ........ </switch> </par> ...
...<switch> <audio src="joe-audio-better-quality" language="fr"/> <audio src="joe-audio" language="en"/> </switch> ...
Plan
• Introduction
• Goal and design principles
• Organization of a SMIL document (.smi)– Spatial et temporal aspects– Animations and transitions– Hypermedia time-based Links– Scalability framework
• Conclusions and future work
Animations
Definition :
• A set of attributes are target of the animation
• A function (calc mode) makes these attributes evolve
• A control on the instants where the changes are applied
Syntax– animateMotion : graphical movements of elements– animate : generic animation applied to element
attributes from/to/by/calcMode– set : discrete change of an attribute value at a given
instant– animateColor : animation in the color space
Animations<img top="3" ...> <animate begin= "5s" dur="10s" attributeName="top"
by="100" repeatCount="2.5" fill="freeze" calcMode="linear"/> </img>
Calc Mode : discrete, list of values with linear, log interpolation
TransitionsElement : transition
– Type and Subtype (transition repository + variant)– transIn and transOut attributes
... <transition id="wipe1" type=“zigZagWipWipe" subtype="leftToRight" dur="1s"/><transition id="wipe2" type=“veeWipe" subtype="leftToRight" dur="1s"/> ... <seq> <img src="butterfly.jpg" dur="5s"... /> <img src="eagle.jpg" dur="5s" fill="transition" transIn="wipe1" ... /> <img src="wolf.jpg" dur="5s" fill="transition" transIn="wipe2" transOut=“wipe1” ... /> </seq>
Transition Transition Transition
Plan
• Introduction
• Goal and design principles
• Organization of a SMIL document (.smi)– Spatial et temporal aspects– Animations and transitions– Hypermedia time-based Links– Scalability framework
• Conclusions and future work
Hypermedia time-based links
• Compatible with (Xlink/Xpointer)• Extension to the semantics of URLs
– http://foo.com/path.smil#ancre(begin(id(anchor))
– Two types (a: whole object, area: part of it)
– Jump over time and space !!
• Attribute show– Replace (default value)
– New (fork)
– Pause (procedure call)
Links with spatial and temporal anchors
Anchor on a sub-surface of an object
Anchor on a sub-duration of an object
<video src=“rtsp://www.w3.org/video.mpg”> <area href=“http://www.w3.org/AudioVideo” coords=“0%, 0%, 50%, 50%”/> <area href=“http://www.w3.org/Style” coords=“50%, 50%, 100 %, 100%”/></video>
<video src=“rtsp://www.w3.org/video.mpg”> <a href=“http://www.w3.org/AudioVideo” begin=“0 s” end=“5 s” /> <a href=“http://www.w3.org/AudioVideo” begin=“10 s” end=“15 s”/></video>
time
Combination of both …<video src=“rtsp://www.w3.org/video.mpg”> <anchor href=“http://www.w3.org/Lion”
begin=“0 s” end=“5 s” coords=“0%, 0%, 100%, 50%”/> <anchor href=“http://www.w3.org/Tortue”
begin=“10 s” end=“15 s” coords=“0%, 50%, 100 %, 100%”/></video>
time
With the animation of coords
time
A word on scalable profilesBased on CC/PP and static content negotiation : user agent and
serverCorrespondence between the namespace prefix and module
names (URIs)Uses the systemRequired attribute and the switch element
ScalabilityAn example of capabilities description
<smil xmlns="http://www.w3.org/2000/SMIL20/CR/" xmlns:time="http://www.w3.org/2000/SMIL20/CR/BasicInlineTiming" xmlns:contain="http://www.w3.org/2000/SMIL20/CR/BasicTimeContainers" xmlns:media="http://www.w3.org/2000/SMIL20/CR/BasicMedia" systemRequired="time+contain+media" >
... </smil>
<smil xmlns="http://www.w3.org/2000/SMIL20/CR/" xmlns:smil20="http://www.w3.org/2000/SMIL20/CR/" systemRequired="smil20" >
...</smil>
The user agent must support SMIL 2.0 entirely
The user agent must support time+contain+media modules
An example
Prefetch an image so it is made available for display immediately after the video:
<smilmlns="http://www.w3.org/2001/SMIL20/CR/Language"> <body> <seq> <par> <prefetch id="endimage" src="http://www.example.org/logo.gif"/> <text id="interlude" src=“http://www.example.org/pleasewait.html” fill="freeze"/> </par> <video id="main-event" src="rtsp://www.example.org/video.mpg"/> <img src="http://www.example.org/logo.gif" dur="5s"/> </seq> </body> </smil>
The prefetch element
• The prefetch element gives the authors the control to enhance network transfers.
• SMIL documents must be playable even if prefetch elements are ignored.
• If a prefetch element is ignored,its Synchronization must be enforced, e.g. if a prefetch element has a dur="5s", depending elements must behave accordingly.
The prefetch elementThe prefetch element supports the following attributes: mediaSize values: bytes-value | percent-value
• Defines how much of the resource to fetch as a function of the file size of the resource. To fetch the entire resource without knowing its size, specify 100%. The default is 100%.
mediaTime values: clock-value | percent-value • Defines how much of the resource to fetch as a function of the
duration of the resource. To fetch the entire resource without knowing its duration, specify 100%. The default is 100%.
• For discrete media (non-time based media like text/html or image/png) using this attribute causes the entire resource to be fetched.
bandwidth values: bitrate-value | percent-value • Defines how much network bandwidth the user agent should use
when prefetching. To use all that is available, specify 100%. The default is 100%.
Plan
• Introduction
• Goal and design principles
• Organization of a SMIL document (.smi)– Spatial et temporal aspects– Animations and transitions– Hypermedia time-based Links– Scalability framework
• Conclusions and future work
Conclusion• More visible impact on industry : HTML
browsers (IE) , ++ browsers(RealOne), ++ authoring tools, ++ smil servers.
• Declarative Markup and specification very appreciated.
• 3GPP adopted SMIL Basic for MMS.• XMT – Part of Mpeg 4 uses SMIL syntax.• SVG+Animation Profile (Adobe, …).