I gave this tutorial at the Web3D 2012 conference. It presents MPEG's position on AR, the technologies currently used, as well as explanations on how to set up AR applications.
MPEG Augmented Reality Tutorial
Web3D Conference, August 4-5, Los Angeles, CA
Marius Preda, MPEG 3DG Chair, Institut Mines-Télécom
http://www.slideshare.net/MariusPreda/mpeg-augmented-reality-tutorial
Topics of the day
MPEG-A Part 14 Augmented Reality Reference Model
MPEG-A Part 13 Augmented Reality Application Format
MPEG offer in the Augmented Reality field
What is MPEG?
MPEG Augmented Reality Tutorial
Topics of the day
MPEG-A Part 14 Augmented Reality Reference Model
MPEG-A Part 13 Augmented Reality Application Format
MPEG offer in the Augmented Reality field
What is MPEG?
MPEG Augmented Reality Tutorial
What is MPEG?
Coding/compression of elementary media:
– Audio (MPEG-1, 2 and 4)
– Video (MPEG-1, 2 and 4)
– 2D/3D graphics (MPEG-4)
Storage and transport:
– MPEG-2 Transport
– File Format (MPEG-4)
– Dynamic Adaptive Streaming over HTTP (DASH)
Hybrid (natural & synthetic) scene description, user interaction (MPEG-4)
Metadata (MPEG-7)
Media management and protection (MPEG-21)
Sensors and actuators, virtual worlds (MPEG-V)
Advanced user interaction (MPEG-U)
Media-oriented middleware (MPEG-M)
More ISO/IEC standards under development for:
– 3D Video, 3D Audio
– Coding and Delivery in Heterogeneous Environments
– …
A suite of ~130 ISO/IEC standards
A standardization activity continuing for 24 years
– Supported by several hundred companies/organisations from ~25 countries
– ~500 experts participating in quarterly meetings
– More than 2,300 active contributors
– Many thousands of experts working in companies
A proven way of organizing the work to deliver useful and widely used standards
– Developing standards by integrating individual technologies
– Well-defined procedures
– Subgroups with clear objectives
– Ad hoc groups continuing coordinated work between meetings
MPEG standards are widely referenced by industry
– 3GPP, ARIB, ATSC, DVB, DVD Forum, BDA, ETSI, SCTE, TIA, DLNA, DECE, OIPF, …
Billions of software and hardware devices built on MPEG technologies
– MP3 players, cameras, mobile handsets, PCs, DVD/Blu-ray players, STBs, TVs, …
What is MPEG? Involvement, approach, deployment
Topics of the day
MPEG-A Part 14 Augmented Reality Reference Model
MPEG-A Part 13 Augmented Reality Application Format
MPEG offer in the Augmented Reality field
What is MPEG?
MPEG Augmented Reality Tutorial
MPEG technologies related to AR
1992/4 – MPEG-1/2 (AV content)
1997 – VRML
1998 – MPEG-4 v.1
– Part 11 BIFS: binarisation of VRML, extensions for streaming, extensions for server commands, extensions for 2D graphics, real-time augmentation with audio & video
– Part 2 Visual: 3D mesh compression, face animation
1999 – MPEG-4 v.2
– Part 2 Visual: body animation
First form of broadcast signal augmentation
MPEG technologies related to AR
2003 – MPEG-4 Part 16 AFX: a rich set of 3D graphics tools; compression of geometry, appearance, animation
2004 – MPEG-4 Part 16: X3D Interactive Profile
2005 – AFX 2nd Edition: animation by morphing, multi-texturing
2007 – AFX 3rd Edition: WSS for terrain and cities, frame-based animation
2009 – MPEG-4 Part 25: compression of third-party XML (X3D, COLLADA)
2011 – AFX 4th Edition: scalable complexity mesh coding
MPEG-4: a rich set of 3D graphics representation and compression tools
MPEG technologies related to AR
MPEG-V – Media Context and Control: a rich set of sensors and actuators
– 1st Edition (2011): sensors and actuators, interoperability between virtual worlds
– 2nd Edition (201x): GPS, biosensors, 3D camera
MPEG-U – Advanced User Interface (2012)
CDVS (201x): feature-point based descriptors for image recognition
MPEG-H (201x): 3D Video (compression of video + depth), 3D Audio
All AR-related data is available from MPEG standards
Real-time composition of synthetic and natural objects
Access to
– Remotely/locally stored BIFS/compressed 2D/3D mesh objects
– Streamed real-time BIFS/compressed 2D/3D mesh objects
Inherent object scalability (e.g. for streaming)
User interaction & server-generated scene changes
Physical context
– Captured by a broad range of standard sensors
– Affected by a broad range of standard actuators
Main features of MPEG AR technologies
MPEG vision on AR, the MPEG AR Browser
Point to a URL – no need to download new applications for each context. The browser
– Retrieves the scenario from the Internet
– Starts video acquisition
– Tracks objects
– Recognizes objects from visual signatures
– Recovers camera pose
– Gets streamed 3D graphics
– Composes new scenes
– Gets inputs from various sensors
– Offers an optimal AR experience by constantly adapting interaction possibilities and objects from a remote server
Industry
– Maximizes the number of customers through MPEG-compliant authoring tools and browsers
– No need to develop a new application for each use case and device platform
MPEG vision on AR
Architecture (diagram): an Authoring Tool produces and compresses an AR file or stream based on MPEG-4/MPEG-7/MPEG-21/MPEG-U/MPEG-V, which is downloaded by the MPEG Player. The AR Player interacts with Media Servers, Service Servers, the User, local and remote Sensors & Actuators, and the local and remote Real-World Environments.
ISO/IEC 23000-14 Augmented Reality Reference Model
– WD stage, collaborating with SC24/WG9, AR Standards, OGC, Khronos, Web3D
ISO/IEC 23000-13 Augmented Reality Application Format
– CD stage, based on MPEG standards
MPEG ongoing work on AR
Topics of the day
MPEG-A Part 14 Augmented Reality Reference Model
MPEG-A Part 13 Augmented Reality Application Format
MPEG offer in the Augmented Reality field
What is MPEG?
MPEG Augmented Reality Tutorial
Augmented Reality Reference Model – WD2.0 content
– Glossary
– Use cases: Guide, Create, Play
– Viewpoints: Enterprise (community objectives), Information, Computational, Engineering, Technology – spanning abstract/design to implementation/development
Diagram: AR Player, Media Servers, Service Servers, User, Local/Remote Context, AR Document
Enterprise viewpoint: global architecture and actors
Augmented Reality Reference Model
Actors: AR Tools Creator (ARTC), AR Experience Creator (AREC), Assets Creator (AC), Assets Aggregator (AA), Device Manufacturer (DM), Middleware/Component Provider (MCP), Online Middleware/Component Provider (OMCP), AR Service Provider (ARSP), Telecommunication Operator (TO), End-User (EU)
Information viewpoint
Augmented Reality Reference Model
Diagram entities: AR Player, Media Servers, Service Servers, User, Local/Remote Context, AR Document
Scene/Real World: raw image, sensed data, virtual camera view, detected features, area of interest/anchors
Tracking objects: markers, marker-less
Device Context: device capabilities; location of device (location, orientation)
Spatial Models: coordinate reference systems, (geo)location, projections, coordinate conversion
Presentation: augmentation, registration, styling/complexity, spatial filtering (e.g. range)
User Input: query, manipulation of presentation, topics of interest, preferences
Digital Assets: presentation data, trigger/event rules, accuracy based
Diagram: numbered interaction flows (1–5) between the AR Player, Media Servers, Service Servers, the User, the Local/Remote Context and the AR Document (two slides with different flow orderings).
Computational viewpoint
Augmented Reality Reference Model
Engineering viewpoint
Augmented Reality Reference Model
Diagram: the AR Player (Application Engine, Rendering Engine, Display (A/V/H)) connects to local sensors (Camera, Mic, Accelerometer, Compass, GPS, …), Media Servers, Service Servers, the User, the Local/Remote Context and the AR Document.
Glossary
Augmented Reality Reference Model
Use cases
Augmented Reality Reference Model
How to contribute?
Augmented Reality Reference Model
Use Trac! http://wg11.sc29.org/trac/augmentedreality/
Topics of the day
MPEG-A Part 14 Augmented Reality Reference Model
MPEG-A Part 13 Augmented Reality Application Format
MPEG offer in the Augmented Reality field
What is MPEG?
MPEG Augmented Reality Tutorial
A set of scene graph nodes/PROTOs as defined in MPEG-4 Part 11
– Existing nodes: audio, image, video, graphics, programming, communication, user interactivity, animation
– New standard PROTOs: Map, MapMarker, Overlay, ReferenceSignal, ReferenceSignalLocation, CameraCalibration, AugmentationRegion
Connection to sensors as defined in MPEG-V
– Orientation, Position, Angular Velocity, Acceleration, GPS, Geomagnetic, Altitude
– Local camera sensor
Compressed media
3 components: scene, sensors/actuators, media
MPEG-A Part 13 ARAF
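To make the three components concrete, here is a purely illustrative scene fragment in the XMT-like XML used throughout this deck. The node layout, resource names and the ProtoInstance/fieldValue notation are assumptions made for the example (only the hw://camera/back device URL appears elsewhere in the deck), not the normative ARAF serialization.

<OrderedGroup>
  <!-- Scene: the live camera shown as the 2D background (device URL taken from a later slide) -->
  <Background2D url="hw://camera/back"/>
  <!-- Sensors: marker detection through the ReferenceSignal PROTO described below -->
  <ProtoInstance name="ReferenceSignal" DEF="DETECTOR">
    <fieldValue name="source" stringArrayValue="hw://camera/back"/>
    <fieldValue name="referenceResources" stringArrayValue="marker.jpg"/>
    <fieldValue name="enabled" booleanValue="true"/>
  </ProtoInstance>
  <!-- Media: the augmentation itself, here a movie texture on a rectangle -->
  <Shape>
    <appearance><Appearance><texture><MovieTexture url="augmentation.mp4"/></texture></Appearance></appearance>
    <geometry><Rectangle size="1 1"/></geometry>
  </Shape>
</OrderedGroup>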
Node, PROTO / element names in MPEG-4 BIFS / XMT, by category and sub-category:
Elementary media
– Audio: AudioSource, Sound, Sound2D
– Image and video: ImageTexture, MovieTexture
– Textual information: FontStyle, Text
Graphics
– Appearance, Color, LineProperties, LinearGradient, Material, Material2D, Rectangle, Shape, SBVCAnimationV2, SBBone, SBSegment, SBSkinnedModel, MorphShape, Coordinate, TextureCoordinate, Normal, IndexedFaceSet, IndexedLineSet
Programming
– Script
User interactivity
– InputSensor, SphereSensor, TimeSensor, TouchSensor, MediaSensor, PlaneSensor
Scene-related information (spatial and temporal relationships)
– AugmentationRegion, Background, Background2D, CameraCalibration, Group, Inline, Layer2D, Layer3D, Layout, NavigationInfo, OrderedGroup, ReferenceSignal, ReferenceSignalLocation, Switch, Transform, Transform2D, Viewpoint, Viewport, Form
Dynamic and animated scene
– OrientationInterpolator, ScalarInterpolator, CoordinateInterpolator, ColorInterpolator, PositionInterpolator, Valuator
Communication and compression
– BitWrapper, MediaControl
Maps
– Map, MapOverlay, MapMarker
Terminal
– TermCap
Scene: 63 XML elements
MPEG-A Part 13 ARAF
Scene: the distance between ARAF and X3D is 32 (XML Elements)
MPEG-A Part 13 ARAF
Example (figure): a reference image (a player card – Name: Park Chu-Young, Position: FW, Team: Arsenal FC) serves as the marker; a 3D graphic is synchronized with the movement of the marker image (marker tracking).
Scene:: Reference Signal
MPEG-A Part 13 ARAF
Scene:: Reference Signal
MPEG-A Part 13 ARAF
<ProtoDeclare name="ReferenceSignal” locations="org:mpeg:referencesignal"> <field name="source" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="referenceResources" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="enabled" ="Boolean" vrml97Hint="exposedField" booleanValue="false"/> <field name="detectionHints" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="onInputDetected" ="Integer" vrml97Hint="eventOut"/> <field name="onError" ="Integer" vrml97Hint="eventOut"/></ProtoDeclare>
Scene:: Reference Signal Location
MPEG-A Part 13 ARAF
<ProtoDeclare name="ReferenceSignalLocation" locations="org:mpeg:referencesignallocation"> <field name="source" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="referenceResources" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="enabled" ="Boolean" vrml97Hint="exposedField" booleanValue="false"/> <field name="detectionHints" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/>
<field name="translation" ="Vector3Array" vrml97Hint="exposedField" Vector3ArrayValue=""/> <field name="rotation" ="Rotations" vrml97Hint="exposedField" rotationArrayValue=""/>
<field name="onInputDetected" ="Integer" vrml97Hint="eventOut"/> <field name="onTranslationChanged" ="Integer" vrml97Hint="eventOut"/> <field name="onRotationChanged" ="Integer" vrml97Hint="eventOut"/> <field name="onError" ="Integer" vrml97Hint="eventOut"/></ProtoDeclare>
Augmentation Region (figure): the Broadcaster's content is augmented by AR service provider A for User A and by AR service provider B for User B.
Scene:: Augmentation Region
MPEG-A Part 13 ARAF
Scene:: Augmentation Region
MPEG-A Part 13 ARAF
<ProtoDeclare name="AugmentationRegion" locations="org:mpeg:augmentationregion"> <field name="source" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="2DRegion" ="Vector2Array" vrml97Hint="exposedField" vector2ArrayValue=""/> <field name="arProvider" ="Strings" vrml97Hint="exposedField" stringArrayValue=""/> <field name="enabled" ="Boolean" vrml97Hint="exposedField" booleanValue="false"/> <field name="translation" ="Vector3Array" vrml97Hint="exposedField" Vector3ArrayValue=""/> <field name="rotation" ="Rotations" vrml97Hint="exposedField" rotationArrayValue=""/> <field name="onTranslationChanged" ="Integer" vrml97Hint="eventOut"/> <field name="onRotationChanged" ="Integer" vrml97Hint="eventOut"/> <field name="onARProviderChanged" ="Boolean" vrml97Hint="eventOut"/> <field name="onError" ="Integer" vrml97Hint="eventOut"/></ProtoDeclare>
Scene:: Map, MapMarkers and Overlay
MPEG-A Part 13 ARAF
Scene:: Map, MapMarkers and Overlay
MPEG-A Part 13 ARAF
<ProtoDeclare name="Map" protoID="1" locations="org:mpeg:map"> <field name="addChildren" ="Nodes" vrml97Hint="eventIn"/> <field name="removeChildren" ="Nodes" vrml97Hint="eventIn"/> <field name="addOverlays" ="Nodes" vrml97Hint="eventIn"/> <field name="removeOverlays" ="Nodes" vrml97Hint="eventIn"/> <field name="translate" ="Vector2" vrml97Hint="eventIn"/> <field name="zoom_in" ="Boolean" vrml97Hint="eventIn"/> <field name="zoom_out" ="Boolean" vrml97Hint="eventIn"/> <field name="gpscenter_changed" ="Vector2" vrml97Hint="eventOut"/> <field name="children" ="Nodes" vrml97Hint="exposedField"> <nodes></nodes> </field> <field name="overlays" ="Nodes" vrml97Hint="exposedField"> <nodes></nodes> </field><field name="gpsCenter" ="Vector2" vrml97Hint="exposedField" vector2Value="0 0"/> <field name="mode" ="Strings" vrml97Hint="exposedField" stringArrayValue="ROADMAP"/> <field name="provider" ="Strings" vrml97Hint="exposedField" stringArrayValue="ANY"/> <field name="size" ="Vector2" vrml97Hint="exposedField" vector2Value="0 0"/> <field name="mapWidth" ="Float" vrml97Hint="exposedField" floatValue="0"/> <field name="zoomLevel" ="Integer" vrml97Hint="exposedField" integerValue="0"/></ProtoDeclare>
Scene:: Map, MapMarkers and Overlay
MPEG-A Part 13 ARAF
<ProtoDeclare name="MapOverlay" locations="org:mpeg:mapoverlay"> <field name="addChildren" ="Nodes" vrml97Hint="eventIn"/> <field name="removeChildren" ="Nodes" vrml97Hint="eventIn"/> <field name="children" ="Nodes" vrml97Hint="exposedField"> <field name="keywords" ="Strings" vrml97Hint="exposedField stringArrayValue=""/></ProtoDeclare>
<ProtoDeclare name="MapMarker" locations="org:mpeg:mapmarker"> <field name="addChildren" ="Nodes" vrml97Hint="eventIn"/> <field name="removeChildren" ="Nodes" vrml97Hint="eventIn"/> <field name="gpsPosition" ="Vector2" vrml97Hint="exposedField" vector2Value="0 0"/> <field name="children" ="Nodes" vrml97Hint="exposedField"> <nodes></nodes> </field> <field name="keywords" ="Strings" vrml97Hint="exposedField stringArrayValue=""/></ProtoDeclare>
Sensors/Actuators
MPEG-A Part 13 ARAF
Figure: inside the MPEG-4 Player, MPEG-V Sensor 1, 2 and 3 feed the MPEG-4 scene through InputSensor 1, 2 and 3; the Compositor renders the result to the screen.
Figure: the local camera (addressed as hw://camera/back) delivers a raw stream through the camera input and decoder to the Compositor; the captured data can be mapped either at scene level or at compositor level.
Sensors: Acceleration Sensor, Orientation Sensor, Angular Velocity, Global Position Sensor, Altitude Sensor, Camera Sensor
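A minimal sketch of the connection shown above, assuming a device URL in the style of hw://camera/back (hw://orientation is hypothetical) and the standard BIFS InputSensor fields:

<InputSensor DEF="ORIENTATION_IN" enabled="true" url="hw://orientation">
  <!-- the buffer field (omitted here) carries the BIFS update commands that map the
       sensed orientation values onto scene fields, e.g. a Transform's rotation -->
</InputSensor>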
Sensors/Actuators:: MPEG-V
MPEG-A Part 13 ARAF
MPEG-V architecture (figure): an adaptation engine sits between the Real World and the Virtual World.
– V→R Adaptation converts Sensory Effects (Part 3) from the Virtual World into Device Commands (Part 5) applied to Real-World sensory devices, taking into account Sensory Device Capabilities and the User's Sensory Effect Preferences (Part 2).
– R→V Adaptation converts Sensed Information (Part 5) from Real-World sensor devices into Virtual World Object Characteristics (Part 4) or Sensed Information applied to the Virtual World, taking into account Sensor Device Capabilities and Sensor Adaptation Preferences (Part 2).
The numbers in parentheses on the diagram refer to MPEG-V parts.
Sensors/Actuators:: MPEG-V
MPEG-A Part 13 ARAF
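To give a feel for the R→V path, the sketch below shows sensed information travelling from a real-world orientation sensor towards the virtual world. Element and attribute names are illustrative placeholders only; they do not reproduce the normative MPEG-V (ISO/IEC 23005) schema.

<!-- illustrative structure only, not the normative MPEG-V syntax -->
<SensedInfo type="OrientationSensorType" sensorIdRef="orientationSensor01" timestamp="0">
  <Orientation x="0.0" y="45.0" z="0.0"/>
</SensedInfo>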
Sensors/Actuators:: MPEG-V types
MPEG-A Part 13 ARAF
Actuators: Light, Flash, Heating, Cooling, Wind, Vibration, Sprayer, Scent, Fog, Color correction, Initialize color correction parameter, Rigid body motion, Tactile, Kinesthetic, Global position command
Sensors: Light, Ambient noise, Temperature, Humidity, Distance, Atmospheric pressure, Position, Velocity, Acceleration, Orientation, Angular velocity, Angular acceleration, Force, Torque, Pressure, Motion, Intelligent camera, Multi interaction point, Gaze tracking, Wind, Dust, Body height, Body weight, Body temperature, Body fat, Blood type, Blood pressure, Blood sugar, Blood oxygen, Heart rate, Electrograph (EEG, ECG, EMG, EOG, GSR), Weather, Facial expression, Facial morphology, Facial expression characteristics, Geomagnetic, Global position, Altitude, Bend, Gas
Compression
MPEG-A Part 13 ARAF
Media | Compression tool | Reference standard
Image | JPEG | ISO/IEC 10918
Image | JPEG 2000 | ISO/IEC 15444
Video | Visual | ISO/IEC 14496-2
Video | Advanced Video Coding | ISO/IEC 14496-10
Audio | MP3 | ISO/IEC 11172-3
Audio | Advanced Audio Coding | ISO/IEC 14496-3
3D Graphics | Scalable Complexity Mesh Coding | ISO/IEC 14496-16
3D Graphics | Bone-based Animation | ISO/IEC 14496-16
Scenes | BIFS | ISO/IEC 14496-11
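As an example of how compressed media plugs into the scene, the sketch below wraps an IndexedFaceSet in a BitWrapper node so that its geometry can be delivered as a Scalable Complexity Mesh Coding stream; the stream URL is hypothetical and the exact XMT serialization of the node field is assumed.

<BitWrapper url="od://meshStream1">
  <node>
    <IndexedFaceSet DEF="COMPRESSED_MESH"/>
    <!-- placeholder geometry; the decoded mesh produced by the SC3DMC decoder replaces it -->
  </node>
</BitWrapper>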
Exercises
MPEG-A Part 13 ARAF
AR Quiz and Augmented Book
Videos: http://youtu.be/LXZUbAFPP-Y and http://youtu.be/la-Oez0aaHE
AR Quiz setup: preparing the media
MPEG-A Part 13 ARAF
Images, videos, audio, 2D/3D assets, GPS location
AR Quiz XML inspection
MPEG-A Part 13 ARAF
http://tiny.cc/MPEGARQuiz
AR Quiz Authoring Tool
MPEG-A Part 13 ARAF
www.MyMultimediaWorld.com – go to Create / Augmented Reality
Augmented Book setup: preparing the media
MPEG-A Part 13 ARAF
Images, audio
Augmented Book XML inspection
MPEG-A Part 13 ARAF
http://tiny.cc/MPEGAugBook
Augmented Book Authoring Tool
MPEG-A Part 13 ARAF
www.MyMultimediaWorld.com – go to Create / Augmented Books
Next Steps
MPEG-A Part 13 ARAF
Support for metadata at scene and object level
Support for usage rights at scene and object level
Collisions between real and virtual objects, partial rendering
ARAF distance to X3D
On the scene graph – 32 elements
– including 2D graphics, humanoid animation, generic input, media control, and pure AR PROTOs
On sensors/actuators – 6 elements
On compression – MPEG-4 Part 25 already compresses X3D
• Joint development of the AR Reference Model
– The community at large is invited to react/contribute so that the model becomes a reference
– http://wg11.sc29.org/trac/augmentedreality
• MPEG promoted a first version of an integrated and consistent solution for representing content in AR applications and services
– Continue synchronized/harmonized development of technical specifications with X3D, COLLADA and OGC content models
Conclusions