
[IEEE 2008 15th IEEE International Conference on Image Processing - San Diego, CA, USA (2008.10.12-2008.10.15)] 2008 15th IEEE International Conference on Image Processing - 3D model

3D MODEL COMPRESSION IN MPEG

Eun-Young Chang1,2, Namho Hur2, Euee S. Jang1

1Digital Media Lab., Hanyang University, Seoul, Korea 2Broadcasting & Telecommunications Media Research Division, ETRI, Daejeon, Korea

ABSTRACT

3D polygonal meshes are the most commonly used representation for 3D graphic models. Compact, compressed representation of 3D meshes is the main function of the MPEG 3D mesh compression technologies. 3D mesh coding (3DMC) in MPEG was first standardized in 2000 and achieves a compression ratio of more than 40:1 with reasonable quality. To extend 3DMC with additional functionalities (e.g., efficient texture coordinate compression and support for animation/editing), the 3DMC extension (3DMC-X) has been proposed and standardized. In this paper, we review 3D model compression in MPEG in detail.

Index Terms— 3D, graphics, mesh, coding, MPEG

1. INTRODUCTION

The 3D polygonal mesh (with geometry, color, normal vector, and texture coordinate information) is a common surface representation that is now heavily used in multimedia applications such as computer games, animation, and simulation. To maintain a convincing level of realism, many applications require highly detailed mesh models. However, such complex models demand high network bandwidth for transmission and large storage capacity. To address these problems, many 3D mesh compression algorithms have been proposed to reduce the size of the 3D model representation [1] [2].

As one of the well-known conventional algorithms, 3D mesh coding (3DMC) introduced in MPEG-4 Visual [3] can typically compress a 3D mesh model represented by IndexedFaceSet in Virtual Reality Modeling Language (VRML) [4] 40 to 50 times without noticeable visual degradation. In addition to compression efficiency, functionalities such as incremental rendering, error resilience, support of non-manifold models, and hierarchical buildup are supported by 3DMC [5] [6].

Efforts to extend the current 3DMC with useful and necessary features have progressed within MPEG in order to accommodate recent advances in 3D graphics technologies. As a result, additional functionalities (e.g., efficient texture coordinate compression and support for animation/editing) were proposed and integrated into the 3DMC extension (3DMC-X), standardized in MPEG-4 Part 16 AMD1 [7]. In this paper, we review 3D mesh compression in MPEG in detail, focusing on 3DMC and 3DMC-X.

The remainder of this paper is organized as follows. In Section 2, we define the 3D mesh model based on the IndexedFaceSet representation in VRML, as background for 3DMC and 3DMC-X. Sections 3 and 4 introduce MPEG 3DMC and 3DMC-X, respectively. Comparative results on various 3D VRML models are given in Section 5. In Section 6, we summarize this paper with concluding remarks.

2. DEFINITION OF 3D MESH MODEL

The 3D polygonal mesh representation in MPEG-4 stems from the IndexedFaceSet in VRML [6]. In VRML, a simple 3D model can easily be built from a set of primitive geometries such as boxes, cones, cylinders, and spheres. To construct more complex 3D models, VRML provides a flexible set of representations such as PointSet, IndexedLineSet, and IndexedFaceSet, which use sets of points, lines, and faces, respectively. For example, line plots or grids can be drawn as a set of lines using the IndexedLineSet representation. Complex faceted surfaces, such as the smooth curves of an apple, can be constructed by arranging many adjacent flat faces such as triangles and squares using the IndexedFaceSet representation [4].

The major components of an IndexedFaceSet are geometry, connectivity, and property information: (1) geometry information specifies the locations of the points (or vertices) in the 3D world, each given by three coordinate values (X, Y, and Z); (2) connectivity information specifies how a set of polygons is formed from the given vertices and connected to construct the 3D model. Connectivity is described as a list of integer coordinate indices tracing the perimeter of each polygon. A basic 3D model can be generated from connectivity and geometry information alone, so these two are the most important components of a 3D mesh representation; (3) property information is composed of sets of attributes: colors, normals, and texture coordinates. RGB colors are a list of three floating-point values, one each for the red (R), green (G), and blue (B) components, applied per vertex, per corner, or per face. Texture coordinates are a list of 2D coordinates that specify locations in the texture coordinate system spanning the texture image and control how the texture image is mapped onto the 3D model. Normals are a list of three floating-point values (Nx, Ny, and Nz) indicating surface orientation, used to control shading. The appearance of the 3D model is designed by applying the desired properties on top of the model representation [4] [5].
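The three components described above can be pictured as a small data structure. The following is a minimal Python sketch, not part of VRML or MPEG-4: the class and field names are illustrative assumptions, and the only VRML convention relied on is that a face in the index list is terminated by -1.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedFaceSet:
    """Illustrative stand-in for a VRML IndexedFaceSet (names are assumptions)."""
    coord: list                                     # geometry: (x, y, z) per vertex
    coord_index: list                               # connectivity: indices, -1 ends a face
    color: list = field(default_factory=list)       # property: (r, g, b)
    normal: list = field(default_factory=list)      # property: (nx, ny, nz)
    tex_coord: list = field(default_factory=list)   # property: (s, t)

    def faces(self):
        """Split the flat index list into one vertex-index list per polygon."""
        faces, current = [], []
        for index in self.coord_index:
            if index == -1:
                if current:
                    faces.append(current)
                current = []
            else:
                current.append(index)
        if current:  # tolerate a missing trailing -1
            faces.append(current)
        return faces


# A tetrahedron: four vertices, four triangular faces, one color per vertex.
tetrahedron = IndexedFaceSet(
    coord=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    coord_index=[0, 1, 2, -1, 0, 3, 1, -1, 1, 3, 2, -1, 2, 3, 0, -1],
    color=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
)
print(tetrahedron.faces())
```

Geometry and connectivity alone (coord and coord_index) are enough to draw the wireframe; the property lists are optional, which matches the description above.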

Figure 1. 3D mesh model: (a) wireframe, (b) with colors, (c) with texture maps, (d) with normals.

3. 3DMC

As an international standard, MPEG-4 3DMC is a representation and compression tool for the IndexedFaceSet nodes of 3D mesh models. The core concept of 3DMC is topological surgery [8], which decomposes a 3D mesh model into a 2D mesh structure described by a dual pair of graphs: the vertex graph (VG) and the simple polygon (SP). The simple polygon is the 2D mesh obtained by cutting the 3D mesh open. The vertex graph is the connectivity information needed to stitch the 2D mesh back together and restore the original 3D mesh model [6]. Using topological surgery, 3DMC compresses the connectivity of a 3D mesh model losslessly at a cost of around 2 bits per triangle.
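To make the dual-graph idea concrete, the sketch below cuts a closed triangle mesh along a spanning tree of its vertices and checks that the remaining (uncut) edges connect the triangles into a tree. It is only an illustration under stated assumptions: the mesh is a closed manifold with sphere topology, the spanning tree is chosen by a plain breadth-first search, and all function names are hypothetical. It does not reproduce the 3DMC bitstream or its actual tree construction.

```python
from collections import deque


def mesh_edges(faces):
    """Map each undirected edge to the list of faces that share it."""
    edge_faces = {}
    for f, face in enumerate(faces):
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            edge_faces.setdefault(frozenset((a, b)), []).append(f)
    return edge_faces


def vertex_spanning_tree(num_vertices, edge_faces, root=0):
    """Breadth-first spanning tree over the vertices; its edges are the cut edges."""
    neighbours = {v: [] for v in range(num_vertices)}
    for edge in edge_faces:
        a, b = tuple(edge)
        neighbours[a].append(b)
        neighbours[b].append(a)
    seen, tree, queue = {root}, set(), deque([root])
    while queue:
        v = queue.popleft()
        for w in neighbours[v]:
            if w not in seen:
                seen.add(w)
                tree.add(frozenset((v, w)))
                queue.append(w)
    return tree


def uncut_dual_edges(edge_faces, cut_edges):
    """Face-to-face links across every edge that was not cut."""
    return [tuple(fs) for e, fs in edge_faces.items()
            if e not in cut_edges and len(fs) == 2]


def is_spanning_tree(num_nodes, edges):
    """True if `edges` connect all nodes using exactly num_nodes - 1 links."""
    if len(edges) != num_nodes - 1:
        return False
    adjacency = {n: [] for n in range(num_nodes)}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    seen, queue = {0}, deque([0])
    while queue:
        n = queue.popleft()
        for m in adjacency[n]:
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return len(seen) == num_nodes


faces = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]  # tetrahedron
edge_faces = mesh_edges(faces)
cut_edges = vertex_spanning_tree(4, edge_faces)        # plays the role of the vertex graph
dual_edges = uncut_dual_edges(edge_faces, cut_edges)   # triangles of the simple polygon
print(len(cut_edges), len(dual_edges), is_spanning_tree(len(faces), dual_edges))
```

For a sphere-like mesh with V vertices, E edges, and F faces, Euler's formula V - E + F = 2 implies that cutting V - 1 edges leaves exactly F - 1 edges, which is why the uncut edges can link all triangles into a single tree that unfolds into one simple polygon.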

Besides the connectivity information, 3DMC compresses all the geometry and property information of the 3D mesh model. However, the compression of geometry and property information may be lossy. A high-level block diagram of a general 3DMC encoder is shown in Figure 2. 3DMC comprises three major coding blocks: topological surgery (data transformation); differential quantization of connectivity, geometry, and property information (quantization); and entropy coding [5].
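The quantization stage is where the loss in geometry arises. The sketch below shows the general idea under simplifying assumptions: coordinates are quantized uniformly to bpv bits inside the bounding box, and each quantized vertex is then coded as a difference from the previous one. The previous-vertex predictor and all function names are stand-ins chosen for brevity; the standard's own predictor follows the mesh traversal and is not reproduced here.

```python
def quantize_geometry(vertices, bpv):
    """Uniformly quantize each coordinate to `bpv` bits inside the bounding box."""
    levels = (1 << bpv) - 1
    mins = [min(v[axis] for v in vertices) for axis in range(3)]
    ranges = [(max(v[axis] for v in vertices) - mins[axis]) or 1.0
              for axis in range(3)]
    quantized = [
        tuple(round((v[axis] - mins[axis]) / ranges[axis] * levels)
              for axis in range(3))
        for v in vertices
    ]
    return quantized, mins, ranges


def dequantize_geometry(quantized, mins, ranges, bpv):
    """Map integer coordinates back to floating-point positions."""
    levels = (1 << bpv) - 1
    return [tuple(mins[axis] + q[axis] / levels * ranges[axis] for axis in range(3))
            for q in quantized]


def prediction_residuals(quantized):
    """Differential step: code each vertex relative to the previous one (assumed predictor)."""
    previous, residuals = (0, 0, 0), []
    for q in quantized:
        residuals.append(tuple(a - b for a, b in zip(q, previous)))
        previous = q
    return residuals


def undo_prediction(residuals):
    """Inverse of prediction_residuals; this step is lossless."""
    previous, quantized = (0, 0, 0), []
    for r in residuals:
        previous = tuple(a + b for a, b in zip(previous, r))
        quantized.append(previous)
    return quantized


vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.25, 0.75), (0.1, 0.9, 0.3)]
bpv = 10
quantized, mins, ranges = quantize_geometry(vertices, bpv)
residuals = prediction_residuals(quantized)
reconstructed = dequantize_geometry(undo_prediction(residuals), mins, ranges, bpv)
```

Only the rounding in quantize_geometry loses information; the residuals are what an entropy coder would then compress, and they are recovered exactly by the decoder.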

3DMC based on topological surgery can compress a VRML ASCII file down to 2% to 4% of its original size without noticeable visual degradation. However, compression is not the only advantage of using 3DMC. It also supports the following functionalities: (1) computational graceful degradation control; (2) non-manifold and non-orientable models, through a dedicated operation called stitching; (3) error resilience; (4) quality scalability via hierarchical transmission of levels of detail, based on the forest split operation [9]; and (5) incremental rendering [3].

Figure 2. Block diagram of 3DMC encoder.

4. 3DMC-X

The need to extend 3DMC has arisen from the demands of emerging applications such as animation and computer games. Therefore, more features (e.g., efficient texture coordinate compression, support for animation/editing, and a new stitching operation) have been added to 3DMC. The extended version of 3DMC is called 3DMC-X and is standardized in MPEG-4 Part 16 AMD1 [7]. In this section, efficient texture coordinate compression and support for animation/editing are described in detail.

4.1. Efficient Texture Coordinate Compression

The more polygons are used, the better the quality of the rendered 3D model, but the higher the rendering complexity. To accelerate rendering while maintaining reasonable quality, recent applications commonly map a high-quality texture onto a model with a small number of polygons. Texture coordinates (TCs) play an important role in mapping a texture onto the 3D geometry of a model. With 3DMC, TCs may be coded lossily, which means that the original values may not be preserved after compression. These precision errors can cause serious texture mapping distortions [10]. Moreover, lossless compression of 3D meshes is clearly identified as a mandatory functionality in MPEG [11].

To address these issues, 3DMC-X supports efficient TC compression, which enables lossless compression of TC values, possibly with better compression efficiency, by adopting an adaptive quantization scheme. The scheme chooses the quantization step size (QSS) adaptively by exploiting two factors: 1) the inverse of the texture image size and 2) the interval of the TCs. In 3DMC, the QSS is determined by the bits per texture coordinate (bpt) given as encoder input. The QSS for the adaptive quantization scheme can therefore take one of the values shown in Equation (1).

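$$QTC_i = \frac{TC_i}{\mathit{step\_size}}, \qquad \mathit{step\_size} = \begin{cases} 2^{-bpt} & \text{(1-a) 3DMC} \\ \mathit{texture\_size}^{-1} & \text{(1-b) QSS1} \\ \mathit{regular\_interval} & \text{(1-c) QSS2} \end{cases} \tag{1}$$

where $\mathit{step\_size} \ge 2^{-bpt}$ and $0 \le QTC_i \le 2^{bpt} - 1$ [10].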

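The major reason for the further reduction in TC bits is that the QSS value in 3DMC is normally smaller than QSS1 or QSS2, so the adaptive scheme needs fewer quantization levels. If QSS1 or QSS2 were smaller than the 3DMC step size, the quantized values would no longer fit in bpt bits and lossless texture compression could not be guaranteed.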
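The effect of the step size on losslessness can be checked with a small experiment. The sketch below is an illustration under assumptions, not the 3DMC-X implementation: it uses exact rational arithmetic, rounds to the nearest level, assumes TCs lie in [0, 1), and takes the "regular interval" of QSS2 to be the smallest gap between distinct sorted TC values. All names are hypothetical.

```python
from fractions import Fraction


def quantize(tcs, step_size):
    """QTC_i = TC_i / step_size, rounded to the nearest integer level."""
    return [round(tc / step_size) for tc in tcs]


def dequantize(qtcs, step_size):
    """Reconstruct TC values from their quantized levels."""
    return [q * step_size for q in qtcs]


def lossless_count(tcs, step_size):
    """Number of TCs that are reconstructed exactly."""
    reconstructed = dequantize(quantize(tcs, step_size), step_size)
    return sum(1 for a, b in zip(tcs, reconstructed) if a == b)


def regular_interval(tcs):
    """Assumed QSS2: smallest gap between distinct, sorted TC values."""
    ordered = sorted(set(tcs))
    return min(b - a for a, b in zip(ordered, ordered[1:]))


bpt = 10
texture_size = 1000                       # texels along one axis (example value)
step_3dmc = Fraction(1, 2 ** bpt)         # (1-a): fixed by bpt
step_qss1 = Fraction(1, texture_size)     # (1-b): inverse of the texture size

# TCs that address texel positions of a 1000-texel-wide image.
texel_tcs = [Fraction(k, texture_size) for k in (0, 125, 126, 500, 999)]
print(lossless_count(texel_tcs, step_3dmc), lossless_count(texel_tcs, step_qss1))

# TCs that lie on a coarse regular grid.
grid_tcs = [Fraction(k, 5) for k in range(5)]
step_qss2 = regular_interval(grid_tcs)    # (1-c): interval of the ordered TCs
print(step_qss2, max(quantize(grid_tcs, step_qss2)))
```

With the step size matched to the texel grid or to the TC interval, every value is an exact multiple of the step and survives quantization, while the quantized levels stay within the bpt-bit range. The grid example also suggests why QSS2 can be very compact for regularly spaced TCs and much less so when the smallest gap is tiny.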

4.2. Support for Animation, Editing, and Updating

To achieve substantial compression gain, the decomposition process of topological surgery inherently destroys the original vertex and face order defined by the author (through authoring tools). The change of vertex and face order may not be a problem when the 3D mesh model is used in a static environment, where no further animation or editing is applied to the model. However, it is troublesome in object animation and in on-line/instant object updating and editing [12].

To resolve this problem, 3DMC-X supports animation, editing, and updating by additionally transmitting the vertex and face order information. However, this additional transmission increases the overall bitstream size. Thus, 3DMC-X introduces vertex and face order compression, which enables efficient animation support with minimal side information. The first key idea of the vertex and face order compression is the adoption of an adaptive probability model of symbol occurrences, which can allocate a codeword one bit shorter to each vertex and face order in every distinguishable unit as encoding proceeds. The coding gain of the adaptive probability model over the fixed probability model is given in Equation (2).

$$\mathit{coding\_gain} = 2^{bpi_0} - 2 \tag{2}$$

where $bpi_0 = \log_2 N$ and $N$ is the total number of vertices or faces [12].
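The saving from shortening the codeword as encoding proceeds can be counted directly. The sketch below assumes a simple schedule that is only a stand-in for the adaptive probability model: each order index is coded with just enough bits to distinguish the candidates not yet used, with a minimum of one bit per codeword. Function names are hypothetical. Under these assumptions, for N = 2^k the total saving over fixed-length coding is 2^k - 2 bits.

```python
def bits_for(count):
    """Codeword width needed to pick one of `count` candidates (at least 1 bit)."""
    return max(1, (count - 1).bit_length())


def fixed_cost(n):
    """Every one of the n order indices uses the initial width bpi_0."""
    return n * bits_for(n)


def adaptive_cost(n):
    """The width shrinks as fewer unused indices remain."""
    return sum(bits_for(remaining) for remaining in range(n, 0, -1))


for k in (3, 4, 10):
    n = 2 ** k
    print(n, fixed_cost(n), adaptive_cost(n), fixed_cost(n) - adaptive_cost(n))
```

The width drops by one bit each time the number of remaining candidates halves, which corresponds to the "distinguishable units" mentioned above: the first half of the indices use bpi_0 bits, the next quarter use one bit fewer, and so on.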

The other key idea is the introduction of the connected component representation. Instead of treating a 3D mesh as a set of indexed faces (or polygons), as in the current VRML IndexedFaceSet representation, we can consider it as a set of connected components, i.e., groups of connected polygons. Exploiting the connected component concept that already exists in 3DMC, we can further improve coding efficiency without any loss compared with the IndexedFaceSet representation [12].
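Grouping faces into connected components is a standard graph operation. The sketch below uses a union-find structure over vertex indices and treats two faces as connected when they share a vertex; that criterion and the function name are assumptions for illustration. It shows only the grouping, not how 3DMC-X codes the order information per component.

```python
def connected_components(faces):
    """Group face indices into components of faces that share vertices."""
    parent = {}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for face in faces:
        for v in face:
            parent.setdefault(v, v)
        for v in face[1:]:
            union(face[0], v)

    groups = {}
    for f, face in enumerate(faces):
        groups.setdefault(find(face[0]), []).append(f)
    return sorted(groups.values())


# Two triangles sharing an edge, plus one isolated triangle.
faces = [(0, 1, 2), (2, 1, 3), (4, 5, 6)]
print(connected_components(faces))
```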

5. PERFORMANCE EVALUATION

In order to evaluate the lossless coding performance on texture coordinates of 3DMC and 3DMC-X, we have tested three models (batteryD, earth, and nefert131b) with texture coordinate information using three methods: 1) 3DMC, 2) QSS1, and 3) QSS2. The objective assessment is provided in Table 1.

From Table 1, we can see that 3DMC efficiently compresses the VRML ASCII files to between 1.8% and 8.4% of their original size. However, it cannot guarantee lossless encoding, resulting in a mismatch of about 0.9 pixel on average between the original and reconstructed TC values. In contrast, 3DMC-X can support lossless compression with better compression efficiency: with QSS1, coding is lossless on all three models and the TC bits are about 5% to 10% fewer than those of 3DMC [10].

Table 1. Objective evaluation with VRML.

Model       Method  Total bit-rate (%)  MSE     Lossless count (%)  TC bits (%)
batteryD    3DMC    1.76                0.8073  26.97               100.00
            QSS1    1.63                0.0000  100.00              89.83
            QSS2    3.70                0.0000  100.00              253.83
Earth       3DMC    2.39                0.9937  14.69               100.00
            QSS1    2.31                0.0000  100.00              94.48
            QSS2    1.67                0.0000  100.00              59.55
nefert131b  3DMC    8.32                0.8420  26.52               100.00
            QSS1    8.00                0.0000  100.00              90.48
            QSS2    8.01                0.8253  27.00               90.40

In addition to the objective measures, we have compared the test results subjectively and attached some of the results in Figure 3 [10].

Figure 3. Subjective quality (nefert131b): (a) 3DMC, (b) QSS1, (c) QSS2.

For QSS2, which uses the regular interval of the ordered TCs as the QSS, the compression efficiency varies with the model. To achieve a consistent coding gain with QSS2, we may apply a filtering process to the TC values before encoding. The filter avoids multiple different real values that indicate the same integer texture coordinate, and so guarantees a one-to-one mapping in the conversion from real to integer even after compression [10].

To evaluate the performance of 3DMC-X with vertex and face order information against 3DMC, we tested the nefert131b model under two conditions: (1) all coding parameters set to their default values (CASE Ⅰ) and (2) coding parameters set high enough to achieve near-lossless compression (CASE Ⅱ). Table 2 lists the total compressed bit-rate, including geometry, connectivity, and other property information [12].

Table 2. Overall bit-rate for the nefert131b model (FO: face order, VO: vertex order).

Case              Codec    Method           Bit-rate [bytes]  %
                           Original VRML    2,389,478         100
CASE Ⅰ (bpv=10)   3DMC     3DMC             198,690           8.32
                  3DMC-X   3DMC+FO          214,173           8.96
                  3DMC-X   3DMC+VO          250,938           10.50
                  3DMC-X   3DMC+FO+VO       266,420           11.15
CASE Ⅱ (bpv=20)   3DMC     3DMC             320,043           13.39
                  3DMC-X   3DMC+FO          335,526           14.04
                  3DMC-X   3DMC+VO          372,291           15.58
                  3DMC-X   3DMC+FO+VO       387,773           16.23
CASE Ⅲ                     ZIP compression  630,506           26.39

From Table 2, we can see that 3DMC efficiently compresses the VRML ASCII file to about 8% of its original size. Moreover, by adding only about 3% of the original file size (in both CASE Ⅰ and CASE Ⅱ), we can preserve the vertex and face order, which enables animation and on-line/instant editing operations. If the vertex and face order is not provided, BIFS encoding or another lossless encoding method must be used instead to preserve the order. To evaluate lossless performance, we used the ZIP compression tool (CASE Ⅲ) for comparison. From the results for CASE Ⅱ and CASE Ⅲ, we conclude that this 3% of additional information is small compared with the size of the ZIP-compressed bitstream, which achieves the same effect [12].
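The "about 3%" figure can be re-derived from the byte counts in Table 2. The short script below only repeats that arithmetic; the variable names are illustrative and the case labels are written in ASCII.

```python
ORIGINAL_VRML = 2_389_478   # bytes, from Table 2
ZIP_SIZE = 630_506          # bytes, CASE III

sizes = {
    "CASE I": {"3DMC": 198_690, "3DMC+FO": 214_173,
               "3DMC+VO": 250_938, "3DMC+FO+VO": 266_420},
    "CASE II": {"3DMC": 320_043, "3DMC+FO": 335_526,
                "3DMC+VO": 372_291, "3DMC+FO+VO": 387_773},
}


def percent(num_bytes):
    """Size relative to the original VRML ASCII file, in percent."""
    return 100.0 * num_bytes / ORIGINAL_VRML


# Extra bytes needed to carry both face order and vertex order.
overhead = {case: row["3DMC+FO+VO"] - row["3DMC"] for case, row in sizes.items()}

for case, row in sizes.items():
    print(case,
          f"3DMC {percent(row['3DMC']):.2f}%",
          f"order overhead {overhead[case]} bytes ({percent(overhead[case]):.2f}%)")
print(f"ZIP {percent(ZIP_SIZE):.2f}%")
```

The order information costs the same number of bytes in both cases, since it depends on the number of vertices and faces rather than on the quantization precision.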

6. CONCLUSIONS

In this paper, we reviewed 3D mesh compression in MPEG in detail. 3DMC is the representation and compression tool for IndexedFaceSet nodes of 3D mesh models, and achieves a compression ratio of up to 50:1 over the VRML ASCII file. To extend 3DMC to meet the demands of emerging applications, 3DMC-X has been proposed and standardized. It adds useful and necessary features such as efficient texture coordinate compression and support for animation/editing. The efficient texture coordinate compression, based on the adaptive quantization scheme, can guarantee lossless compression with better compression efficiency. The vertex and face order compression, using the adaptive probability model and the connected component representation, supports animation by preserving the vertex and face order with little additional information compared with other lossless encoding methods.

Recently, there have been expectations for bringing 3DMC to the mobile game market in Korea. Accordingly, standardization of a mobile 3D compression profile [13] and a mobile 3D compression stream multiplexer [14] is in progress under the Mobile 3D Standardization Forum.

7. REFERENCES

[1] P. Alliez and C. Gotsman, “Recent Advances in Compression of 3D Meshes,” Symposium on Multiresolution in Geometric Modeling, 2003.
[2] Jingliang Peng, Chang-Su Kim, and C.-C. Jay Kuo, “Technologies for 3D triangular mesh compression: a survey,” Journal of Visual Communication and Image Representation, Vol. 16, No. 6, pp. 688-733, Dec. 2005.
[3] “Information Technology - Coding of Audio-Visual Objects - Part 2: Visual,” ISO/IEC JTC1/SC29/WG11 N5546, Pattaya, Thailand, March 2003.
[4] A. Nadeau, VRML 2.0 Sourcebook, Wiley, 1997.
[5] Fernando Pereira and Touradj Ebrahimi, The MPEG-4 Book, Prentice Hall, 2002.
[6] Aaron E. Walsh and Mikael Bourges-Sevenier, MPEG-4 Jump-Start, Prentice Hall, 2002.
[7] “Information Technology - Coding of Audio-Visual Objects - Part 16: Animation Framework eXtension (AFX) AMENDMENT 1: Geometry and shadow,” ISO/IEC 14496-16:2006/Amd.1:2007, 2007.
[8] G. Taubin and J. Rossignac, “Geometric Compression through Topological Surgery,” ACM Transactions on Graphics, 1998.
[9] G. Taubin, “3D geometry compression and progressive transmission,” EUROGRAPHICS State of the Art Report, 1999.
[10] Sunyoung Lee, Byeongwook Min, Daiyong Kim, Eun-Young Chang, Namho Hur, Soo In Lee, and Euee S. Jang, “An Adaptive Quantization Scheme for Efficient Texture Coordinate Compression in MPEG 3DMC,” Lecture Notes in Computer Science (LNCS), Vol. 3767, pp. 73-83, 2005.
[11] “MPEG-4 requirements, version 18,” ISO/IEC JTC1/SC29/WG11 N5866, Trondheim, Norway, July 2003.
[12] Eun-Young Chang, Daiyong Kim, Byeongwook Min, Sunyoung Lee, Namho Hur, Soo In Lee, and Euee S. Jang, “Vertex and Face Permutation Order Compression for Efficient Animation Support,” Electronic Imaging 2006, 2006.
[13] “Mobile 3D compression profile,” Mobile 3D Standardization Forum (M3DSF) M3DSF_04.0001/R1, 2006.
[14] “Mobile 3D compression stream multiplexer,” Mobile 3D Standardization Forum (M3DSF) M3DSF_04.0002/R1, 2006.
