Overview of MPEG-4
Lihang Ying
Department of Computing Science
University of Alberta, Edmonton, Canada
These slides are available online: www.cs.ualberta.ca/~lihang/Share/mpeg4
Outline
• MPEG-4 Demos and Overview
  • Demos
  • Overview
• How to Organize MPEG-4 Contents – Scene/Object Description
  • Examples Study
• Synthetic and Natural Hybrid Coding (SNHC) – Visual Part
  • 2D Mesh Coding
  • 3D Mesh Coding
Characteristics (1)
• MPEG-4 vs MPEG-1/2
  • Not merely video and audio
  • Interactive
  • Object-based
  • Scalability
Characteristics (2)
• Why MPEG-4?
  • Interoperability:
    • Runs on all kinds of platforms and devices
    • Reuse multimedia contents
    • Create once, use everywhere
  • Multi-network Delivery
    • Internet/Mobile/Broadcast networks
    • Different bandwidths
  • Scalability
    • Different capabilities (e.g. display resolution) of different devices
MPEG-J
• API:
  • org.iso.mpeg.mpegj
  • org.iso.mpeg.mpegj.scene
  • org.iso.mpeg.mpegj.resource
  • org.iso.mpeg.mpegj.decoder
  • org.iso.mpeg.mpegj.net
• Implement an MPEG-4 coder/decoder conveniently with the MPEG-J API
• Create a coder/decoder once, run it on all kinds of devices and platforms
Profile/Level
• Different implementations:
  • Profile
    • Divides functionality into different subsets
  • Level
    • Constraints on parameters (bitrate, frames/sec, …)
• Example: EnvivioTV
  • Video: Advanced Simple profile at levels 0 - 5
  • Audio: High-quality profile at levels 1 - 2
  • Graphics: Advanced profile
• Interactive
• Multi-network Delivery
• Coder/Decoder: Using MPEG-J
• Scalability: Different Capacity
• Profile/Level
• Not merely audio/video
• Object-based
• Interoperability
How to Organize Contents
• Scene Descriptor
  • Assembles objects into an audiovisual scene
  • Scene description format: binary format for MPEG-4 scenes (BIFS)
• Object Descriptor
  • Describes objects
[Figure: object descriptor framework. Labels recoverable from the diagram: an initial object descriptor with ES_Descriptors; a scene descriptor stream carrying scene descriptions and BIFS updates (replace scene), with Video Source and Audio Source nodes; an object descriptor stream carrying object descriptors and object descriptor updates, each with ES_Descriptors; and elementary streams identified by ES_ID: a visual stream (base layer), a visual stream (e.g. temporal enhancement layer), and an audio stream.]
Scene Description - BIFS
• Represented by XMT-A format:
  • Similar to XML
  • Expresses bitstream syntax in a document
  • Enables easy generation of a bitstream parser
• BIFS Examples: …
BIFS Example (1) – Trivial Scene (MPEG-2/DVD)
• Scene Tree
  Layer2D
    Sound2D
      AudioSource
    Shape
      Bitmap
      Appearance
        MovieTexture
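As a rough illustration of how such a scene tree assembles objects, the sketch below holds the node names from this slide in a nested Python structure and walks it depth-first. The parent/child nesting is an assumption based on the usual roles of these node types (a sound node wrapping an audio source, a shape with geometry and appearance); it is not BIFS or XMT-A syntax.

```python
# Each node is (name, children). The nesting is assumed, not taken from a real BIFS file.
scene = ("Layer2D", [
    ("Sound2D", [("AudioSource", [])]),
    ("Shape", [
        ("Bitmap", []),
        ("Appearance", [("MovieTexture", [])]),
    ]),
])


def walk(node, depth=0):
    """Return the tree as indented lines, visiting nodes depth-first."""
    name, children = node
    lines = ["  " * depth + name]
    for child in children:
        lines.extend(walk(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(walk(scene)))
```

The two leaves that reference media, AudioSource and MovieTexture, are the ones that would point at elementary streams through object descriptors.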
Object Description
• Syntactic Description Language (SDL)
  • Expresses bitstream syntax in a document
  • Enables easy generation of a bitstream parser
• SDL Example: …
Object Description - SDL
• ObjectDescriptor

class ObjectDescriptor extends ObjectDescriptorBase : bit(8) tag=ObjectDescrTag {
    bit(10) ObjectDescriptorID;
    bit(1) URL_Flag;
    const bit(5) reserved=0b1111.1;
    if (URL_Flag) {
        bit(8) URLlength;
        bit(8) URLstring(URLlength);
    } else {
        ES_Descriptor esDescr[1..255];
        OCI_Descriptor ociDescr[0..255];
        IPMP_DescriptorPointer ipmpDescrPtr[0..255];
    }
    ExtensionDescriptor extDescr[0..255];
}
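To show how an SDL description maps onto a parser, here is a minimal Python sketch that decodes the fixed-width fields at the start of the ObjectDescriptor body. It assumes the input begins right after the tag and size fields (not handled here), assumes the URL bytes are UTF-8, and skips the nested descriptors. The function name is hypothetical.

```python
def parse_object_descriptor_header(payload: bytes) -> dict:
    """Decode ObjectDescriptorID, URL_Flag, reserved and an optional URL.

    Assumption: `payload` starts at ObjectDescriptorID; the tag and size
    fields that precede it in a real stream are not parsed here.
    """
    if len(payload) < 2:
        raise ValueError("need at least 2 bytes")
    word = (payload[0] << 8) | payload[1]      # first 16 bits, big-endian
    result = {
        "ObjectDescriptorID": word >> 6,        # top 10 bits
        "URL_Flag": bool((word >> 5) & 0x1),    # next 1 bit
        "reserved_ok": (word & 0x1F) == 0b11111,  # low 5 bits, all ones
    }
    if result["URL_Flag"]:
        if len(payload) < 3:
            raise ValueError("missing URLlength")
        url_length = payload[2]                 # bit(8) URLlength
        url_bytes = payload[3:3 + url_length]
        if len(url_bytes) != url_length:
            raise ValueError("truncated URLstring")
        # Assumption: the URL is UTF-8 text.
        result["URLstring"] = url_bytes.decode("utf-8", errors="replace")
    return result
```

For example, the two bytes `00 5F` decode to ObjectDescriptorID 1 with URL_Flag clear, so the descriptor body would continue with ES_Descriptors rather than a URL.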
Object Descriptor Summary
• ObjectDescriptor
  • ObjectDescriptorID
  • URL_Flag
  • ES_Descriptor // Elementary Stream
    ES_ID, streamDependenceFlag, URL_Flag, OCRstreamFlag, streamPriority, DecoderConfigDescriptor, SLConfigDescriptor, IPI_DescrPointer, IP_IdentificationDataSet, IPMP_DescriptorPointer, LanguageDescriptor, QoS_Descriptor, ...
  • OCI_Descriptor // Object Content Information
    ContentClassificationDescriptor, KeywordDescriptor, RatingDescriptor, LanguageDescriptor, ShortTextualDescriptor, ExpandedTextualDescriptor, ContentCreatorNameDescriptor, ContentCreationDateDescriptor, OCICreatorNameDescriptor, OCICreationDateDescriptor, SmpteCameraPositionDescriptor, MediaTimeDescriptor, ...
  • IPMP_DescriptorPointer // Intellectual Property Management and Protection
• Applications of OCI/IPMP – eDonkey’s problems
MPEG-4 Objects and Tools
• Audio
  • Natural Audio
  • Synthetic and Natural Hybrid Coding (SNHC)
• Visual
  • Natural Video
    • Object-based/Scalability
  • SNHC
    • 2D/3D Mesh Object/Face and Body Animation
  • Image
  • Text …
[2D Mesh Coding]
• Natural Video Coding
  • Block-based texture and motion coding
  • Shape information coding
• 2D Mesh Coding
  • Designed for video manipulation
  • 2D mesh: a 2D planar graph made of triangles
  • Natural images and video are mapped onto 2D meshes
  • Applications: object tracking, content-based video retrieval (e.g. motion-based queries), 2D animation, augmented reality, …
Example
• (a) Original frame
• (b) Mesh generated
• (c) Text overlaid on video: the text moves along with the fish’s mesh
2D Mesh Object
• Also called 2D Dynamic Mesh
• Supports video coding by moving the vertices of the mesh
• Topology of the mesh does not change within one session
• Mesh data includes:
  • Connectivity: how vertices are connected
  • Geometry: 2D coordinates of vertices
  • Motion: temporal difference of vertices’ positions
I-MOP and P-MOP
• I-MOP: Intra-Mesh Object Plane
  • For a given session, connectivity and geometry information needs to be transmitted only once
• P-MOP: Inter-Mesh Object Plane
  • The deformation of the given mesh over time can be described as the temporal difference of the geometry, or geometry motion
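The split between the two plane types can be sketched as follows: the I-MOP carries connectivity and geometry once, and each P-MOP carries only per-vertex displacements that are added to the previous geometry. The class and function names are hypothetical, and real P-MOPs code the displacements predictively rather than storing them raw.

```python
from dataclasses import dataclass


@dataclass
class IMop:
    triangles: list  # connectivity: triples of vertex indices
    vertices: list   # geometry: one (x, y) per vertex


@dataclass
class PMop:
    motion: list     # one (dx, dy) per vertex, in IMop.vertices order


def apply_p_mop(vertices, p_mop):
    """Move every vertex by its displacement; connectivity is untouched."""
    if len(vertices) != len(p_mop.motion):
        raise ValueError("one motion vector per vertex is required")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(vertices, p_mop.motion)]


def decode_session(i_mop, p_mops):
    """Return the vertex positions for the I-MOP and each following P-MOP."""
    frames = [list(i_mop.vertices)]
    for p_mop in p_mops:
        frames.append(apply_p_mop(frames[-1], p_mop))
    return frames
```

Because the triangle list never changes within the session, only the `motion` arrays need to be sent after the first plane.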
Mesh Data - Connectivity
• Uniform Triangulation:
  • Suited for rectangular video objects
  • Vertices located on x and y grids
  • Specify the length of the grid intervals
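A uniform mesh is fully determined by a handful of numbers, which is why it is cheap to transmit. The sketch below builds one from a grid size and the two grid intervals. The parameter names and the fixed diagonal direction (top-left to bottom-right in every cell) are assumptions made for illustration; the standard defines its own parameters and allows several triangulation patterns.

```python
def uniform_mesh(cols, rows, dx, dy):
    """Build a regular triangle mesh on a cols x rows grid of vertices.

    dx, dy are the grid intervals. Every grid cell is split into two
    triangles along the top-left to bottom-right diagonal (assumed).
    """
    vertices = [(c * dx, r * dy) for r in range(rows) for c in range(cols)]
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            tl = r * cols + c       # top-left corner of the cell
            tr = tl + 1             # top-right
            bl = tl + cols          # bottom-left
            br = bl + 1             # bottom-right
            triangles.append((tl, bl, br))
            triangles.append((tl, br, tr))
    return vertices, triangles
```

Calling `uniform_mesh(3, 2, 16, 16)` yields 6 vertices and 4 triangles from just four input values.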
Mesh Data - Connectivity
• Delaunay Triangulation:
  • Suited for arbitrarily shaped video objects
  • Guarantees:
    • Close to equilateral: maximizes the minimum angle
    • Unique: unique triangulation for given vertices
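The "largest minimal angle" property can be seen on the smallest possible case: a convex quadrilateral has two candidate diagonals, and the Delaunay choice is the one whose two triangles have the larger smallest angle. The sketch below compares both splits. It is only an illustration of the property, not a triangulation algorithm, and the function names are hypothetical.

```python
import math


def _angle_at(p, q, r):
    """Angle in degrees at vertex p of triangle (p, q, r)."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))  # clamp rounding noise


def triangle_angles(a, b, c):
    return [_angle_at(a, b, c), _angle_at(b, c, a), _angle_at(c, a, b)]


def best_diagonal(p0, p1, p2, p3):
    """Pick the diagonal of a convex quad (vertices in boundary order)
    whose two triangles have the larger minimum angle."""
    min_02 = min(triangle_angles(p0, p1, p2) + triangle_angles(p0, p2, p3))
    min_13 = min(triangle_angles(p0, p1, p3) + triangle_angles(p1, p2, p3))
    return "02" if min_02 >= min_13 else "13"
```

For the flat kite (0,0), (2,-1), (4,0), (2,1), splitting along the short diagonal avoids the two sliver triangles that the long diagonal would create.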
Coding of Connectivity Data
• Uniform Triangulation:
• Delaunay Triangulation:
  • Differential coding:
    x_n = x_{n-1} + dx_n,  y_n = y_{n-1} + dy_n
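The differential rule above can be sketched in a few lines of Python: the encoder sends each vertex as an offset from the previous one, and the decoder accumulates the offsets. Coding the first vertex relative to (0, 0) is an assumption of this sketch, and the entropy coding of the offsets is left out.

```python
def encode_deltas(points):
    """Turn absolute (x, y) positions into (dx, dy) offsets from the previous point."""
    prev_x, prev_y = 0, 0  # assumption: the first point is coded relative to the origin
    deltas = []
    for x, y in points:
        deltas.append((x - prev_x, y - prev_y))
        prev_x, prev_y = x, y
    return deltas


def decode_deltas(deltas):
    """Rebuild positions with x_n = x_{n-1} + dx_n and y_n = y_{n-1} + dy_n."""
    x, y = 0, 0
    points = []
    for dx, dy in deltas:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points
```

Neighbouring mesh vertices tend to be close together, so the offsets are small numbers that compress better than the absolute coordinates.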
Coding Order of Delaunay Triangulation
• 1) Boundary vertices
  • Start from the top-left-most vertex
  • Proceed counterclockwise
• 2) Interior vertices
  • Choose the closest (by distance) vertex next
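The second step can be read as a greedy nearest-neighbour walk. The sketch below orders the interior vertices that way, assuming that "closest" means closest to the most recently coded vertex and that the walk starts from the last boundary vertex. Both are assumptions about the slide's wording, and the function name is hypothetical.

```python
import math


def order_interior_vertices(last_boundary_vertex, interior):
    """Greedy ordering: repeatedly take the remaining vertex nearest
    to the vertex coded just before it."""
    remaining = list(interior)
    ordered = []
    current = last_boundary_vertex
    while remaining:
        nearest = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nearest)
        ordered.append(nearest)
        current = nearest
    return ordered
```

Keeping consecutive vertices close keeps the differential offsets small, which suits the differential coding of the coordinates.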
Coding of Mesh Motion
• Motion: temporal difference of vertices’ positions
• Mesh Traversal:
  • 1) Start from top-left, breadth-first
  • 2) Right (next counterclockwise)
  • 3) Left
  • This order remains unchanged until the next I-MOP is decoded
• Mesh Motion Coding
  • Encoded based on the two previously encoded neighboring vertices, e.g. in ∆abc, (a, b) → c
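A minimal sketch of the prediction step for a triangle abc: the motion vector of c is predicted from the already coded motion vectors of a and b, and only the residual is sent. Using the plain average as the predictor is an assumption of this sketch, and rounding and entropy coding are omitted.

```python
def predict_motion(mv_a, mv_b):
    """Predict a motion vector as the average of two neighbours (assumed predictor)."""
    return ((mv_a[0] + mv_b[0]) / 2, (mv_a[1] + mv_b[1]) / 2)


def encode_residual(mv_c, mv_a, mv_b):
    """Residual that would be coded for vertex c."""
    px, py = predict_motion(mv_a, mv_b)
    return (mv_c[0] - px, mv_c[1] - py)


def decode_motion(residual, mv_a, mv_b):
    """Recover the motion vector of c from its residual and the two neighbours."""
    px, py = predict_motion(mv_a, mv_b)
    return (px + residual[0], py + residual[1])
```

When neighbouring vertices move together, the residual stays near zero, which is what makes the fixed traversal order worthwhile.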
[3D Mesh Coding]
• 2D Mesh Coding:
  • Supports mapping natural images and video onto 2D meshes
• 3D Mesh Coding:
  • Represents and compresses 3D objects onto which images and videos may be mapped
  • Compresses static 3D models, not their animation
Functionalities
• High compression
  • 2%-4% of the size of a VRML ASCII file
• Incremental rendering
  • Building the model from a partial bitstream
• Error resilience
  • Suffers less from network errors
• Hierarchical buildup
  • Scalable bitstream with different resolutions, depending on viewing distance
Data of 3D Mesh Object
• Connectivity:
  • How vertices are connected
• Geometry:
  • 3D coordinates of vertices
• Photometry
  • Colors
  • Normals
  • Texture
Bitstream of 3D Mesh Coding
• Connectivity Data
  • Vertex graph
  • Triangle tree
• Triangle Data
  • Contains: geometry coordinates, colors, normals, texture coordinates
  • Largest part of the bitstream
Bitstream of 3D Mesh Coding
• Connectivity Data is packed separately and before the Triangle Data.
• Benefits:
  • Incremental rendering:
    • Can decode Triangle Data incrementally, since the full Connectivity (topology) Data is already available
    • Shortens the latency
  • Error resilience:
    • Can form the 3D structure even with some missing Triangle Data
Coding of Geometry and Photometry Data
• 1) Quantization
• 2) Differential Coding
  • No prediction
  • Parallelogram prediction
  • Tree prediction
• 3) Adaptive Arithmetic Entropy Coding
  • Codes the differential values
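The first two steps can be sketched as follows, under simplifying assumptions: uniform quantization of each coordinate to a fixed number of bits over a known range, and parallelogram prediction in its common form, where the new vertex across edge (a, b) from the known vertex c is predicted as a + b - c. The function names are hypothetical, and the arithmetic coder of step 3 is not shown.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer index using `bits` bits."""
    levels = (1 << bits) - 1
    return round((value - lo) / (hi - lo) * levels)


def dequantize(index, lo, hi, bits):
    """Approximate inverse of quantize()."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)


def parallelogram_predict(a, b, c):
    """Predict the vertex that completes the parallelogram a, b, c: a + b - c."""
    return tuple(ai + bi - ci for ai, bi, ci in zip(a, b, c))


def prediction_residual(actual, predicted):
    """Differential value that would be passed to the entropy coder."""
    return tuple(x - p for x, p in zip(actual, predicted))
```

With quantized vertices a = (10, 0, 0), b = (0, 10, 0), c = (0, 0, 0), the prediction is (10, 10, 0); an actual vertex at (11, 9, 1) leaves the small residual (1, -1, 1) to encode.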
3D Mesh Coding Modes
• Error-Resilience Mode
  • To minimize the impact of errors, divide the data into partitions or packets
  • Render partitions independently
• Progressive Transmission Mode
  • Scalable coding
    • One base layer
    • One or more enhancement layers
  • Provides Forest Split operations
    • Contains face forest, triangle tree, triangle data
Forest Split Operation
(a) Cut through the edges of vertex tree
(b) Open the dotted line
(c) Triangulate the opening to form a triangle tree
(d) Refined mesh
References
• Books:
  • Major reference: Fernando Pereira, Touradj Ebrahimi, The MPEG-4 Book, Prentice Hall PTR, 2002
  • Natural video coding technology: Joan L. Mitchell et al., MPEG Video Compression Standard, Chapman & Hall, 1996
• MPEG official websites:
  • Overview: http://mpeg.telecomitalialab.com/standards.htm
  • Resources: http://www.m4if.org/resources.php
• Demos:
  • http://www.envivio.com/products/etv/content/technical.jsp
  • http://www.ivast.com/aboutmpeg4/index.html
• MPEG-4 Series Slides, Course Presentation of C640/2003 Winter, U. of Alberta:
  • http://www.cs.ualberta.ca/~anup/Courses/604/604_3D.htm
DecoderConfigDescriptor

class DecoderConfigDescriptor extends BaseDescriptor : bit(8) tag=DecoderConfigDescrTag {
    bit(8) objectTypeIndication;
    bit(6) streamType;
    bit(1) upStream;
    const bit(1) reserved=1;
    bit(24) bufferSizeDB;
    bit(32) maxBitrate;
    bit(32) avgBitrate;
    DecoderSpecificInfo decSpecificInfo[0..1];
    profileLevelIndicationIndexDescriptor profileLevelIndicationIndexDescr[0..255];
}
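The fixed part of this descriptor is 8 + 6 + 1 + 1 + 24 + 32 + 32 = 104 bits, i.e. 13 bytes, so it can be unpacked in one step. The sketch below assumes big-endian byte order and an input that starts at objectTypeIndication (tag and size already consumed); the trailing DecoderSpecificInfo and profile/level descriptors are not parsed. The function name is hypothetical.

```python
import struct


def parse_decoder_config(body: bytes) -> dict:
    """Unpack the 13 fixed bytes of a DecoderConfigDescriptor body.

    Assumption: `body` begins at objectTypeIndication.
    """
    if len(body) < 13:
        raise ValueError("need at least 13 bytes")
    # B: objectTypeIndication, B: streamType/upStream/reserved,
    # B + H: 24-bit bufferSizeDB, I: maxBitrate, I: avgBitrate
    obj_type, packed, buf_hi, buf_lo, max_rate, avg_rate = struct.unpack(
        ">BBBHII", body[:13]
    )
    return {
        "objectTypeIndication": obj_type,
        "streamType": packed >> 2,              # top 6 bits
        "upStream": bool((packed >> 1) & 0x1),  # next 1 bit
        "reserved_ok": (packed & 0x1) == 1,     # last bit, constant 1
        "bufferSizeDB": (buf_hi << 16) | buf_lo,
        "maxBitrate": max_rate,
        "avgBitrate": avg_rate,
    }
```

Any bytes after the first 13 belong to the optional nested descriptors and would need their own tag/size handling.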