Interactive 3D Online Video Requirements

Nomad3D White Paper

Date: 29.07.2012

White Paper: Mobile Interactive Online 3D Video Requirements

Dr. Alain Fogel, CEO Nomad3D

www.nomad3D.com

Contact: Dr. Alain Fogel, CEO Nomad3D

Email: alain.fogel@nomad3D

1 Nomad3D Executive and Technical Summary

Nomad3D has developed a revolutionary 3D video codec that runs as an efficient

extension on existing 2D video codecs. This codec, referred to as Nomad3D 3D+F, or

3D+F for short, is able to provide interactive 3D video capabilities on top of existing 2D

video infrastructure, where traditional 3D extensions of 2D video codecs fail. Nomad3D

3D+F provides efficient, low-latency coding and decoding of stereoscopic 3D

video as a thin software layer on top of existing 2D video codec infrastructure. It is well

positioned for use in interactive 3D video gaming, telemedicine, 3D teleconferencing,

3D military communication and mobile 3D environments.

3D+F uses a representation of stereoscopic left/right views as a fused single 2D view

plus additional fusion metadata. The fused 2D view is coded using a traditional 2D

video codec (H.264, VP8), whereas the fusion metadata (FusionData) is efficiently

encoded using Nomad3D’s fusion codec. Nomad3D IP protects the fusion and de-fusion

technology, as well as technical aspects of the fusion codec.

Traditional stereoscopic 3D video codecs essentially operate by separately encoding left

and right views. Compared to a 2D video codec, this approach increases complexity,

required resources and bandwidth by a factor of two. The 3D+F codec

resolves this complexity issue by transforming a stereoscopic pair into (1) a single fused

(Cyclops) 2D view, and (2) FusionData. The FusionData encodes technical elements of

the fusion process and assist in recreating left-right views at decoding time. The

complexity of fusion and de-fusion, as well as of the fusion codec, is very small.

Moreover, the bandwidth required for FusionData is small. The overall complexity of

the 3D+F codec is therefore mainly determined by the complexity of the 2D video

codec in the lower branch of Figure 1. The 2D video codec in 3D+F is a free parameter,

and may be chosen from a number of well-performing video codecs, such as

H.264/AVC (Wikipedia, 2012) and VP8 (WebM, 2012).

For a given video quality target, 3D+F may gain close to a factor of two compared to a

standard 3D video codec in coding complexity, required resources and bandwidth.

3D+F is therefore very well suited to be deployed in demanding and/or resource-

constrained ecosystems.
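The data flow of Figure 1 can be sketched in a few lines of Python. This is only an illustration of the two-branch architecture, under loud assumptions: `toy_fuse` and `toy_defuse` use a per-pixel average and half-difference as a stand-in, which is not the Nomad3D fusion method and does not reproduce the compactness of real FusionData; `identity_codec` is a placeholder for both the standard 2D codec and the fusion codec. All names are hypothetical.

```python
from typing import Callable, List, Tuple

View = List[float]  # one row of pixel intensities; a real view is a 2D image
Codec = Callable[[View], View]


def toy_fuse(left: View, right: View) -> Tuple[View, View]:
    """Toy stand-in for fusion (NOT the Nomad3D algorithm).

    Returns a stand-in Cyclops view (per-pixel average) and
    stand-in FusionData (per-pixel half-difference).
    """
    cyclops = [(l + r) / 2 for l, r in zip(left, right)]
    fusion_data = [(l - r) / 2 for l, r in zip(left, right)]
    return cyclops, fusion_data


def toy_defuse(cyclops: View, fusion_data: View) -> Tuple[View, View]:
    """Toy stand-in for de-fusion: inverts toy_fuse exactly."""
    left = [c + f for c, f in zip(cyclops, fusion_data)]
    right = [c - f for c, f in zip(cyclops, fusion_data)]
    return left, right


def identity_codec(data: View) -> View:
    """Placeholder for an encoder or decoder; passes data through unchanged."""
    return list(data)


def encode_3d(left: View, right: View,
              video_encode: Codec = identity_codec,
              fusion_encode: Codec = identity_codec) -> Tuple[View, View]:
    """Lower branch: Cyclops view -> 2D codec. Upper branch: FusionData -> fusion codec."""
    cyclops, fusion_data = toy_fuse(left, right)
    return video_encode(cyclops), fusion_encode(fusion_data)


def decode_3d(video_stream: View, fusion_stream: View,
              video_decode: Codec = identity_codec,
              fusion_decode: Codec = identity_codec) -> Tuple[View, View]:
    """Decode both branches, then de-fuse to recreate the left-right views."""
    return toy_defuse(video_decode(video_stream), fusion_decode(fusion_stream))


left_view = [10.0, 20.0, 30.0]
right_view = [12.0, 18.0, 30.0]
video_stream, fusion_stream = encode_3d(left_view, right_view)
decoded_left, decoded_right = decode_3d(video_stream, fusion_stream)
```

The point of the sketch is structural: the 2D codec is passed in as a parameter and only ever sees a single 2D view, which is why it can be swapped (for example H.264 or VP8) without touching the fusion layer.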

2 Interactive Video Overview

Interactive video involves two or more parties interacting in real-time via graphical,

video, textual or audio interfaces. Interactive video is relevant to video

gaming, telemedicine, and military communications, among others. The respective

markets are growing rapidly due to the increasing availability of powerful

mobile devices and increasingly fast connectivity. As stereoscopic 3D (S3D)

becomes more available due to maturing technologies such as auto-stereoscopic

smartphones and tablets (glasses-free), interactive video will migrate to S3D devices. The

use of interactive 3D video is believed to lead to a richer user experience and/or a better

understanding of context and situation.

2.1 Interactive Video Gaming

Interactive video gaming over the Internet is a large and growing market, including the

developing Online Cloud Gaming (OCG) market. According to PwC (PwC, 2012), the

online market will overtake the traditional gaming market, as shown in Figure 2, a

trend supported by the deployment of mobile devices. Social Gaming

companies such as Zynga and Playfish and Game on Demand companies such as

OnLive, Microsoft, and Sony Computer Entertainment form the bulk of the

OCG ecosystem.

2.2 Telemedicine

Telemedicine is a developing area, benefitting from the ubiquity of smartphones and

tablets. The technology leaders in this market (e.g. Philips, Cisco, Alcatel-Lucent, and

Honeywell) are cooperating with leading carriers (e.g. Orange and Vodafone), leading

device manufacturers (e.g. Apple, Samsung, and LG) and providers of video

conferencing systems (e.g. Cisco, Polycom, and Vidyo).

[Figure 1 - 3D+F Overview. Upper branch: the Fusion step produces FusionData, which is encoded and decoded by the software fusion codec. Lower branch: the Fusion step produces the Cyclops (2D) view, which is encoded and decoded by a standard 2D codec. The De-Fusion step combines both to recreate the left-right views.]

2.3 Military communications

Military communications have specific requirements for interactive video, in particular

in the area of airborne surveillance (e.g. drones), target identification, tracking and

engagement. The use of interactive 3D video is critical to a better understanding of context

and situation.

Figure 2 – Online/Wireless Gaming Market vs. Console/PC Market

3 Interactive Online Mobile Video – The Pain and Requirements

In the following, we present the pain points and requirements of interactive online mobile

video. We point out the differences between 2D and 3D, the resulting differences in

requirements, and the consequences for video codecs.

3.1 3D Cursor

3D interactive applications very often require a 3D cursor, i.e. a cursor that is

perceived at the depth of the object it is pointing to. The correspondence between the

3D cursor and the actual pixel pointers in the left and right view is hard to implement

without the help of a depth or disparity map. This depth/disparity map must therefore be

provided in addition to the H.264 baseline streams, requiring significant additional

computational, power and bandwidth resources.
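To illustrate why a disparity map makes the 3D cursor straightforward, the following Python sketch maps a cursor position to its pixel positions in the left and right views. The conventions are assumptions for illustration, not taken from any codec specification: the disparity map is registered to a central reference view, disparity is defined as x_left minus x_right in pixels, it is split symmetrically between the two views, and the views are rectified so that the row is unchanged. The function name `cursor_positions` is hypothetical.

```python
from typing import List, Tuple

Position = Tuple[float, int]  # (x in pixels, row index)


def cursor_positions(x: int, y: int,
                     disparity_map: List[List[float]]) -> Tuple[Position, Position]:
    """Return the cursor position in the left and right views.

    Assumes disparity_map[y][x] = x_left - x_right for the scene point
    seen at (x, y) in the central reference view, split symmetrically.
    """
    d = disparity_map[y][x]      # disparity of the object under the cursor
    left = (x + d / 2, y)        # shift half the disparity one way...
    right = (x - d / 2, y)       # ...and half the other way
    return left, right


# Tiny 2-row, 8-column disparity map; the object at column 5 of row 1
# has a disparity of 4 pixels, everything else lies at zero disparity.
disparity = [[0.0] * 8 for _ in range(2)]
disparity[1][5] = 4.0

on_object = cursor_positions(5, 1, disparity)
on_background = cursor_positions(2, 0, disparity)
```

Drawing the cursor at these two positions makes it appear at the depth of the object it points to. Without the map, the per-pixel disparity would have to be estimated from the two views, which is the costly step the paragraph above refers to.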

3.2 Latency

Latency is the time between capturing and displaying a video frame. Latency is a

primary issue in interactive video and should preferably be below 100 ms end-to-end. The

standard codec H.264 is not able to achieve such latency in its highest profiles, where

latency is often above 1 second. Hence, the standard codec is typically used in its

baseline profile. Latency gets worse for 3D video content with separate or multi-view

encoding of left and right views.

3.3 Bit Rate

The required bit rate (or bandwidth) for video transmission depends on the resolution,

frame rate, quality requirement and video codec capability to compress and decompress

within the limitations of the communication channel bandwidth (network capability).

For 3D content, the required bit rate for a given quality may increase by as much as a

factor of two.

3.3.1 Cloud gaming

For OCG the required bit rate for good video quality with H.264 exceeds 3 Mbps. In the

US, available bit rates on the public Internet are usually between 1 and 7 Mbps, making

3D gaming with H.264 problematic. In addition, 3D OCG very often needs

a depth or disparity map to implement a 3D cursor, which requires significant

additional transmission bandwidth.
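The arithmetic behind this statement can be made explicit with a short Python sketch. It uses only the figures quoted above (3 Mbps for 2D, a factor of up to two for 3D, 1 to 7 Mbps available); the 4 Mbps sample point and the function names are assumptions added for illustration, and the extra bandwidth for a depth or disparity map is not included.

```python
def stereo_bit_rate_mbps(rate_2d_mbps: float, stereo_factor: float = 2.0) -> float:
    """Upper-bound bit rate when left and right views are coded separately."""
    return rate_2d_mbps * stereo_factor


def fits_channel(required_mbps: float, available_mbps: float) -> bool:
    """True if the channel can carry the required bit rate."""
    return required_mbps <= available_mbps


OCG_2D_MBPS = 3.0                      # lower bound for good 2D quality with H.264
US_AVAILABLE_MBPS = (1.0, 4.0, 7.0)    # low end, hypothetical mid point, high end

required = stereo_bit_rate_mbps(OCG_2D_MBPS)
results = [fits_channel(required, available) for available in US_AVAILABLE_MBPS]

for available, ok in zip(US_AVAILABLE_MBPS, results):
    verdict = "fits" if ok else "does not fit"
    print(f"{required:.1f} Mbps over a {available:.1f} Mbps link: {verdict}")
```

Under these assumptions, only links near the top of the quoted range can carry the stream, and that is before any depth or disparity map is added.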

3.3.2 Telemedicine and military communication

The required bit rate for interactive telemedicine and military applications is on the order

of 10 Mbps. This rate is difficult to sustain on public US networks. In Europe and Asia

the situation is slightly better, but still at the limit of what is possible. For 3D

interactive video with the requirement of a 3D cursor using a standard H.264 codec,

network performance is insufficient to sustain acceptable 3D quality.

3.4 Computing Power

3.4.1 Cloud gaming and military

Strong computing capabilities are required to decode a flow of H.264-coded 2D video.

3D video decoding using a standard H.264 solution requires doubling of this compute

power. On a 4.3” smartphone, this implies a battery life reduction of up to 40%, and on

a tablet, of up to 20%.

3.4.2 Telemedicine and military

In addition to decoding computing capabilities, telemedicine and military

communication also require HD encoding capabilities. The H.264 encoder is much

more power-hungry than the decoder (by at least a factor of two), and therefore the impact on

computing power and battery life is severe.

3.5 Conclusion

The requirements for interactive 3D video are at or over the limit of current capabilities

of public networks and terminals using standard H.264 coding techniques.

4 The Solution: NOMAD3D 3D+F Codec

4.1 Features

Nomad3D has developed an innovative 3D codec with the following features:

• 3D+F is compatible with 2D codecs (e.g. H.264, VP8).

• 3D+F is a layer on top of an underlying 2D codec, which can be reused for 3D as is.

• 3D+F is high-performance and power-efficient.

• 3D+F requires no change in the hardware of 2D application processors.

• 3D+F is enabled by pre-installed or downloadable software.

4.2 Compliance of 3D+F with Interactive 3D Online Video

The Nomad3D 3D+F codec is a solution that meets the requirements of

3D interactive online video. Specifically, compared to 2D:

• 3D+F adds at most one video frame of latency (about 33 ms at 30 fps).

• 3D+F requires at most a 30% bit rate increase (H.264: 130%).

• 3D+F does not need an additional depth map to implement a 3D cursor.

• 3D+F increases decoder power consumption by less than 15%.
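The bounds in this list translate into concrete numbers with a few lines of Python. The sketch below uses only figures stated in this paper (one added frame, at most 30% more bit rate, a 100 ms end-to-end latency target, and 2D bit rates of 3 Mbps for cloud gaming and 10 Mbps for telemedicine and military use). The function names are hypothetical, and the results are upper bounds implied by the claims, not measurements.

```python
def added_latency_ms(frames: float, fps: float) -> float:
    """Latency added by buffering the given number of frames at a frame rate."""
    return frames * 1000.0 / fps


def bit_rate_with_overhead_mbps(rate_2d_mbps: float, overhead_fraction: float) -> float:
    """Bit rate after applying a fractional increase over the 2D rate."""
    return rate_2d_mbps * (1.0 + overhead_fraction)


LATENCY_BUDGET_MS = 100.0   # preferred end-to-end latency (Section 3.2)
MAX_OVERHEAD = 0.30         # at most 30% bit rate increase

extra_ms = added_latency_ms(frames=1, fps=30)     # one frame period at 30 fps
remaining_ms = LATENCY_BUDGET_MS - extra_ms       # left for the 2D codec and network

gaming_mbps = bit_rate_with_overhead_mbps(3.0, MAX_OVERHEAD)
telemedicine_mbps = bit_rate_with_overhead_mbps(10.0, MAX_OVERHEAD)

print(f"added latency: {extra_ms:.1f} ms, remaining budget: {remaining_ms:.1f} ms")
print(f"cloud gaming: {gaming_mbps:.1f} Mbps, telemedicine/military: {telemedicine_mbps:.1f} Mbps")
```

One frame period at 30 fps is 1000/30 ms, so the added latency uses roughly a third of the 100 ms budget and leaves the rest for the underlying 2D codec and the network.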

The advantages of 3D+F vs. H.264 for interactive online 3D video are summarized in

Table 1 below:

                                     3D H.264            NOMAD3D
                                     Baseline Profile    3D+F

Latency                              Low                 Low

Decoder Computing Power              High                Very Low

Encoder Computing Power              Very High           Low

Software Implementation              No                  Yes

Hardware Impact                      High                None

Required Bit Rate                    High                Low

3D cursor complexity                 High                Low

Additional Depth or Disparity Map    Yes                 No

Table 1 – Comparison of H.264 and 3D+F for Interactive Online 3D Video

REFERENCES

[1] WebM. (2012, July 25). Welcome to the WebM Project. Retrieved July 25, 2012, from WebM: http://www.webmproject.org/

[2] Wikipedia. (2012, July 25). H.264/MPEG-4 AVC. Retrieved July 25, 2012, from Wikipedia: http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC

[3] PwC. Internet source: http://www.pwc.com/gx/en/global-entertainment-media-outlook/segment-insights/video-games.jhtml