HDLogix 3D IQ Overview 001


ImageIQ3D: 3D Video from 2D Video in Real-Time

Utilizing many of the same patented and patent-pending technologies as HDlogix's ImageIQ, ImageIQ3D literally creates a new dimension for video. Drawing on HDlogix's expertise with superresolution that converts SD video to full-bandwidth HD video, and on ImageIQ's GPU-based, real-time optical flow and image structure analysis capabilities, it is now possible to convert 2D video into 3D video in real-time with no intervention. Unlike previous pseudo-3D gimmicks, ImageIQ3D reconstructs the geometry of the video scene from regular video in order to create true 3D stereo video for any 3D video display, even autostereoscopic 3D displays that do not require any glasses.

    What is ImageIQ3D?

ImageIQ3D constructs a geometric representation and model of objects in a video scene, in real-time, with no user or operator intervention. This information is used to convert regular video into left/right stereo views and/or color plus depthmap views suitable for display on any existing 3D display. Additionally, ImageIQ3D can generate full-color 3D stereo content from anaglyph (e.g. red/cyan), also in real-time.

• Any 2D, live broadcast can be generated as 3D, on-the-fly, without having to shoot with stereoscopic cameras and equipment
• Any and all user-generated content can be converted to 3D
• Anaglyph/colored-glasses programs can be converted to full-color 3D stereo, without the original full-color version
• All of the 10 different flavors of existing 3D content can be converted to and from each other, a first in the industry, in real-time
• Any 2D and 3D content can be converted to play back on any 3D stereoscopic display, another industry first
• Uncalibrated, unaligned camera pairs can be used for stereo video and image capture
• Inexpensively transform telepresence solutions from HD to 3D-HD
• Transform a webcam video chat into 3D telepresence without a calibrated stereo camera
• Problems that cause disorientation, such as rapid depth changes, are automatically modeled and eliminated

3D Has Come a Long Way From the 1980s... or Has It?

Almost everyone has had some experience with color anaglyph 3D. Examples include the gimmicky horror films of the 1980s; another is the recent blue/yellow glasses many people experienced with the 2009 Super Bowl.

HDlogix, Inc. | 26 Mayfield Ave | Edison, NJ 08837 | (732) 623-2067 | www.hdlogix.com



As we all know, this method of 3D is prone to inducing headaches, and the experience is less than ideal because both eyes do not receive a full-color image. There are, of course, more elegant, full-color 3D solutions: one is to use shutter glasses with a 3D-Ready display. Another is to use polarized glasses with a polarized projection or active display. Other recent developments include autostereoscopic video displays that use clever optics to allow 3D viewing without any glasses at all. The hardware required for these has either been cheap and clunky (and headache-inducing) or, if executed well, relegated to expensive niche markets like signage, CAD/CAM visualization for mechanical modeling and medical imaging, or to venue-based cinema like IMAX 3D. (Despite best intentions, some of these have also induced headaches despite being expensive and well-executed.) Recently, full-color stereo-3D-capable displays have come onto the mass market; in fact, if you have bought an HDTV recently, it is entirely possible that it is capable of displaying 3D video with an inexpensive add-on and shutter glasses, without headaches or compromises in color.

    Welcome to the 3D Zoo

Millions of customers worldwide already have 3D-Ready displays, but they don't even know it.

There are thousands of hours of high-quality 3D content in movie libraries, and almost every one of the major movie studios has committed billions of dollars to 3D movie production for 2009 and 2010, and not just animated productions. Why doesn't everyone know about 3D? The answer is that there is no ready content for these displays. If there's so much 3D display technology in end-users' homes and so much 3D content, how is this possible?

The reason is that there are many animals in the 3D Zoo, and they don't communicate: there are no less than 4 different technologies for shooting and producing 3D films and videos, yet more ways to store and transmit them, and literally dozens of different technologies and products to display them as 3D video, and none of these will talk to each other. If you have a red/cyan anaglyph source, you can't display it in full color on a shutter-glasses stereoscopic display. Likewise, if you have a full-color stereo source, you can't display it on a multichannel autostereoscopic display. If the content has been converted for a multichannel autostereoscopic display (at $25,000 per minute of footage), it no longer can be displayed with red/cyan glasses OR on a shutter-glasses display without a new transmission medium, and this barely scratches the surface of the problem.

    Figure 1. Lots of existing 3D video technologies are talking, but not a lot of them are listening to each other.



Until ImageIQ3D, there was no way to make all of these formats, technologies, and display technologies compatible with each other without expending tens of thousands of dollars per minute of footage offline, and with a slow turnaround. Most importantly, there has been no way to make 3D actually work for the end-user without a lot of excuses, apologies, and headaches, until now.

Figure 2. ImageIQ3D is the Universal Translator for 3D and 2D video. With ImageIQ3D, everyone's talking 3D video to each other, even if the original video is only 2D, and regardless of origination methods, archive formats, and distribution standards/transmission methods.

Further, for 2D content that was created with only one camera, the only way to convert to 3D was manually intensive, requiring expensive intervention by a veritable army of 3D modelers, stereoscopic specialists, 3D rendering artists, and video engineers to identify precise scene cuts, painstakingly edit geometry and depthmaps, and correct their errors and problems, until now.

Today's Video Architecture: The GPU/3D Accelerator

Some of the algorithms that ImageIQ3D uses have existed for years, but have only recently become possible to perform in real-time, with the advent of programmable graphics hardware: GPUs. GPUs allow for highly parallel, memory-intensive processes, much more so than equivalent CPUs that are ten times more expensive. Additionally, ImageIQ3D's approach is very similar to the 2D superresolution technology in ImageIQ, which is particularly well-suited for GPU implementation. As a result, ImageIQ3D can run in real-time on very modest GPU hardware. Top-of-the-line video card hardware is not required; in fact, laptops with several-year-old video chipset hardware can run ImageIQ3D without breaking a sweat.



    Overview of the ImageIQ3D Process

Like ImageIQ, ImageIQ3D performs sophisticated motion analysis called optical flow. Much information about the 3D scene geometry can be gleaned from the relative motion of objects in the video and how they occlude (reveal and hide) pixels in other objects as they move, as long as the motion estimation is precise and accurate. Second, the straight lines of buildings, the horizon, and other objects give clues about the vanishing points, which also help to solve the puzzle. A Generalised Hough/Radon Transform helps identify these lines and other useful features. Finally, in most photography there is a tendency for objects that are very near to or very far from the camera's focal plane to be blurred by an amount proportional to their distance from it. A Blind Point-Spread-Function Estimator is used to estimate the out-of-focus character of each pixel, completing the information needed to estimate the depth of the video. Some of this information is always available; sometimes not all of it is (for example, when nothing is moving in the video). ImageIQ3D uses a superresolution-based statistical approach to achieve robust and consistent results even when there is very little or only partial information available.

Figure 3. The ImageIQ3D process: a simplified view.

Ultimately, the goal is to produce an accurate depth map for each video frame: a representation of the distance of each pixel in the video from the camera. Once an accurate depth map is calculated, it is possible to easily convert to and from any 2D or 3D format!
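To make the depth-map-to-stereo step concrete, the sketch below shows a deliberately naive depth-image-based rendering pass: each pixel is shifted horizontally by a disparity proportional to its depth, in opposite directions for each eye. This is an illustrative toy (a forward warp with no hole-filling), not HDlogix's actual renderer; the function name and the depth convention are assumptions for the example.

```python
import numpy as np

def depth_to_stereo(frame, depth, max_disparity=8):
    """Render a synthetic left/right pair from a 2D frame plus a depth map.

    frame: (H, W, 3) uint8 image; depth: (H, W) floats in [0, 1],
    where 1.0 is nearest to the camera. Nearer pixels get the largest
    horizontal shift, which the eyes fuse as 'closer'.
    """
    h, w = depth.shape
    # Disparity in pixels: nearer pixels shift more.
    disp = (depth * max_disparity).astype(int)
    cols = np.arange(w)
    left = np.zeros_like(frame)
    right = np.zeros_like(frame)
    for y in range(h):
        # Shift each row's pixels in opposite directions for each eye.
        lx = np.clip(cols + disp[y], 0, w - 1)
        rx = np.clip(cols - disp[y], 0, w - 1)
        left[y, lx] = frame[y, cols]
        right[y, rx] = frame[y, cols]
    return left, right
```

A real renderer must also fill the disocclusion holes the warp leaves behind; that is precisely where an accurate depth map (and occlusion model) earns its keep.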

    ImageIQ3D: Depth-from-Motion via Optical Flow

Critical information about the objects and background making up a scene, and their relative distances to the camera, can be calculated if these objects ever move, and if there is an accurate and precise estimation of the true motion. ImageIQ3D uses the same optical flow engine as ImageIQ's superresolution process.

The ImageIQ3D optical flow computation system achieves real-time, per-pixel dense motion estimation with a wide and precise spatial dynamic range of 0.01 to 500.00 pixels. A motion vector is calculated for every pixel, in every image; the motion vector tells how much the pixel has moved. One way to view a motion-vector field is to let hue represent the direction, and brightness represent the magnitude, as shown in Figure 4.
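The hue/brightness convention is easy to reproduce. The helper below (an illustrative utility, not HDlogix's code) maps a dense (H, W, 2) flow field to an RGB image, with hue encoding direction and brightness encoding magnitude:

```python
import numpy as np
import colorsys

def flow_to_hsv_image(flow):
    """Visualize a dense motion-vector field (H, W, 2) as an RGB image:
    hue encodes direction, brightness encodes magnitude (as in Figure 4)."""
    u, v = flow[..., 0], flow[..., 1]
    angle = (np.arctan2(v, u) + np.pi) / (2 * np.pi)   # direction -> hue in [0, 1]
    mag = np.sqrt(u * u + v * v)
    value = mag / (mag.max() + 1e-9)                   # magnitude -> brightness
    h, w = u.shape
    rgb = np.zeros((h, w, 3))
    for y in range(h):
        for x in range(w):
            # Full saturation; stationary pixels come out black.
            rgb[y, x] = colorsys.hsv_to_rgb(angle[y, x], 1.0, value[y, x])
    return rgb
```

Stationary regions render as black and fast-moving regions as saturated color, which makes motion structure (and motion-estimation errors) visible at a glance.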



    Figure 4. Original frame (left), Hue-Saturation-Value representation of the optical flow field (right).

Not just the motion itself is important: how objects hide and reveal pixels from other objects and the background behind them gives significant depth information. ImageIQ3D computes occlusions in addition to optical flow. Figure 5 demonstrates the ImageIQ3D Depth-from-Motion process in action:

Figure 5. Using occlusion and motion to generate a depth map. Top left: reveal and hide occlusions are marked in red and yellow. Top right: optical flow. Bottom: depth map generated from motion and occlusions.



Figure 6. The generated depthmap has been used to generate a synthetic left/right image pair. (One can get the 3D effect by crossing one's eyes to fuse the left and right sides.)

Figure 7. The depthmap was used to generate a synthetic red/cyan anaglyph. (With a cheap pair of red/cyan glasses, you can view the effect.)



    ImageIQ3D: Scene Change Detection

What happens if the camera is panning, and then suddenly stops? Previous algorithms would lose all depth information and the 3D video would flatten out. The solutions to this problem are conceptually simple: somehow accumulate depth for each pixel as you go along. Of course, pixels move, and this requires very accurate motion compensation to do properly. Another problem is that accumulating depth values as a history can significantly corrupt the depth map for the current frame unless the system knows when the shot has cut to a new scene, or even when all of the relevant pixels have panned off the screen. Carrying over depth from a previous shot can result in serious distortions and, in some cases, a violent motion-sickness response in some viewers.

Clearly, a shot-change detection method is required, and this is a well-traversed area of study and practice; but for the 2D-to-3D case, it's not enough to know if the editor cut away to an entirely different scene. One must know, reliably, when each individual pixel has moved offscreen and no longer has any history, and when new pixels appear, one has to know that too. Of course, if the current shot cuts away, all depth assumptions have to be reset as well.

ImageIQ3D has a very robust scene-change detection engine that provides exactly this capability, for every pixel, individually. When everything has panned or zoomed offscreen, or the current scene has cut or faded away, ImageIQ3D knows how to reset its assumptions, a very important part of making the 2D-to-3D process seamless and requiring no user intervention. This is also very important to ensure that these changes don't cause viewers' eyes to cross (or cause them to throw up) when errors due to scene changes cause left/right disparity issues.
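The accumulate-and-reset idea can be sketched in a few lines. This is a minimal, hypothetical update rule (the function, the exponential weighting, and `alpha` are assumptions for illustration, not the actual ImageIQ3D engine): blend the depth history with the fresh per-frame estimate, except where a per-pixel reset mask says the history is invalid.

```python
import numpy as np

def update_depth_history(prev_depth, new_depth, reset_mask, alpha=0.8):
    """Temporally accumulate per-pixel depth, but reset wherever history
    is invalid (scene cut, fade, or the pixel is newly revealed/on-screen).

    prev_depth, new_depth: (H, W) depth estimates for the previous history
    and the current frame (assumed already motion-compensated).
    reset_mask: (H, W) bool, True where the history must be discarded.
    """
    # Exponential history: keep `alpha` of the accumulated depth.
    blended = alpha * prev_depth + (1 - alpha) * new_depth
    # Scene cut / newly-revealed pixels: drop the history outright.
    return np.where(reset_mask, new_depth, blended)
```

Note that `prev_depth` must be warped by the motion field before blending; without that motion compensation, the history would smear depth across object boundaries.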

    ImageIQ3D: Depth-from-Vanishing Points via Radon Transform

Video does not always include motion. Sometimes, other cues are necessary to obtain depth. One solution is to use geometric clues in the images themselves to assist: if one knows where the predominant straight edges are, and has some information about the faces of objects in a scene, some information about the depth of foreground objects and the background can be obtained. Like MRI machines, ImageIQ3D performs a Hough/Radon transform to correlate image edges and structure, except ImageIQ3D does it in real-time:

Figure 8. Not all images have good geometric depth cues. Original frame, marked up with predominant straight edges in red (left); Generalised Hough/Radon (right). Crossings of the curves and bright yellow/white dots indicate the position and slope of significant straight lines. This frame is ambiguous, so other information (like motion/occlusion) is needed to infer depth.
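For readers unfamiliar with the transform, a minimal Hough line accumulator (a standard textbook construction, not HDlogix's GPU implementation) looks like this: each edge pixel votes for every (theta, rho) line that could pass through it, and peaks in the accumulator, the bright crossings in Figures 8 and 9, mark the image's dominant straight edges, whose intersections yield vanishing-point candidates.

```python
import numpy as np

def hough_lines(edge_points, shape, n_theta=180):
    """Minimal Hough transform: map edge pixels into (rho, theta) space.

    edge_points: iterable of (y, x) edge-pixel coordinates.
    shape: (H, W) of the source image.
    Returns the vote accumulator and the sampled theta values.
    """
    h, w = shape
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    max_rho = int(np.hypot(h, w))
    acc = np.zeros((2 * max_rho, n_theta), dtype=int)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    for y, x in edge_points:
        # Each edge pixel votes once per theta for the line rho = x cos + y sin.
        rho = np.round(x * cos_t + y * sin_t).astype(int) + max_rho
        acc[rho, np.arange(n_theta)] += 1
    return acc, thetas
```

Ten collinear points along the row y = 5, for example, produce an accumulator peak of 10 votes at theta = pi/2, rho = 5, recovering the line directly from the votes.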

Depth from 2D is a specialized case of superresolution: using multiple pieces of information to fill in an incomplete (or sometimes, overcomplete) estimation. In the classic superresolution case, one is trying to enlarge an image and fill in missing pixels with information from previous frames, with motion giving the critical clues. In this case, ImageIQ3D fills in depth information from previous frame motion, plus multiple other sources like geometric cues.

Figure 9. Other images have excellent geometric depth cues. Original frame marked up with straight edges in red (left); Generalised Hough/Radon (right). This frame has several clearly distinguishable straight edges, indicated by the convergence of crossing curves in the transform on the right. Vanishing points can be clearly detected, and are used to constrain the depth map estimation.

    Figure 10. Depth map obtained from geometric depth cues plus motion.

    ImageIQ3D: Depth-from-Focus via PSF Estimation

Another way to increase the robustness of depth estimation is to include information about how much different objects in the scene are blurred, relative to each other. In combination with motion, occlusions, and geometric cues, a robust depth estimation can be obtained by performing Point Spread Function (PSF) estimation. This process estimates the focus and motion blur for each pixel in a scene.

The information from the Radon/Hough transform is not only used to estimate geometric features, but also to find relevant edges which can be used to estimate the blur of objects in the scene. In combination with the structure analysis performed by ImageIQ's optical flow analysis, the blur of each pixel (if it is near an edge feature) can be obtained. A robust regularization function is used to propagate values to adjacent non-edge pixels. These per-pixel focus cues, like the geometric cues, are incorporated into the overall model that builds the final depth map.
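One simple way to see how blur can be read off an edge (a toy stand-in for blind PSF estimation, not the estimator the paper describes): measure the 10%-90% transition width of the intensity profile across the edge. A sharp, in-focus edge transitions in about one pixel; a defocused edge spreads over many pixels, roughly in proportion to the PSF radius.

```python
import numpy as np

def edge_blur_width(row, edge_x, lo=0.1, hi=0.9):
    """Estimate local blur at a step edge as the 10%-90% transition width.

    row: 1-D intensity profile crossing the edge; edge_x: edge location.
    Returns the transition width in pixels (>= 1).
    """
    window = np.asarray(row, dtype=float)[max(0, edge_x - 10): edge_x + 10]
    # Normalize the local profile to [0, 1].
    window = (window - window.min()) / (window.max() - window.min() + 1e-9)
    x_lo = int(np.argmax(window >= lo))   # first crossing of the low threshold
    x_hi = int(np.argmax(window >= hi))   # first crossing of the high threshold
    return max(1, x_hi - x_lo)
```

Applied at the edge locations the Hough/Radon stage found, and regularized into the non-edge regions, this kind of per-pixel blur measure becomes the focus cue fed into the depth model.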



    ImageIQ3D: Putting it all Together

A great deal of the magic of ImageIQ3D lies not just in performing optical flow, Hough/Radon, and intelligent, adaptive operations, but in intelligently applying brute force. Like ImageIQ, ImageIQ3D treats 2D-to-3D as a superresolution problem: instead of creating pixels in the X and Y directions by using X, Y, and Time information, ImageIQ3D creates new pixels in the Z direction using X, Y, and Time information. Most of the intelligence is embedded in how all of this information is combined, and how it constrains the final solution, that solution being a consistent, reliable depth map that can be used to translate any 2D or 3D video into any other suitable 3D video format.
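Schematically, the per-pixel combination of cues can be viewed as a confidence-weighted average that degrades gracefully when a cue is absent. The weighting below is an illustrative assumption, not the actual ImageIQ3D model, which constrains the solution statistically rather than averaging:

```python
import numpy as np

def fuse_depth_cues(cues, weights):
    """Combine per-pixel depth estimates from independent cues (motion,
    vanishing points, focus) into one map, weighting each cue by its
    per-pixel confidence. Where no cue is informative, fall back to a
    neutral mid-depth of 0.5."""
    num = np.zeros_like(cues[0])
    den = np.zeros_like(cues[0])
    for depth, conf in zip(cues, weights):
        num += conf * depth
        den += conf
    # Confidence-weighted average; neutral depth where total confidence is 0.
    return np.where(den > 0, num / np.maximum(den, 1e-9), 0.5)
```

The key property this sketch shares with the text is robustness to partial information: a static frame (no motion cue) or a cue-poor frame like Figure 8 still yields a usable, if less constrained, depth map.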

ImageIQ3D: There's One More Animal in the 3D Zoo to Tame

A different set of problems is presented when converting anaglyph (colored-glasses) video to full-color stereo, but the toolset that ImageIQ3D uses lends itself extremely well to this circumstance as well. Consider a green/magenta anaglyph video:

    Figure 11. A frame from a green/magenta anaglyph 3D film.

In this case, the left eye is coded into the green channel of the RGB image, and the right eye is coded into the red and blue channels. The full-color stereo version can be reconstructed, as long as there is a robust method of optical flow that knows about occlusions -- this sounds familiar! Conceptually, it's simple: estimate the optical flow between the green and the red/blue, motion compensate the green toward the red/blue, and add them together; this becomes the full-color right-eye image. Next, estimate the optical flow between the red/blue and the green, motion compensate the red/blue toward the green, and add them; this becomes the full-color left-eye image.
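Under the simplifying assumption of purely horizontal, integer-pixel flow, the steps above can be sketched as a backward warp per channel. This toy (the function and the flow convention are assumptions for the example; the real system uses a structure-based matcher and sub-pixel flow) completes each eye's missing channels from the other eye:

```python
import numpy as np

def anaglyph_to_stereo(frame, flow_g_to_m, flow_m_to_g):
    """Reconstruct a full-color stereo pair from a green/magenta anaglyph.

    frame: (H, W, 3) RGB anaglyph; green channel = left eye,
    red+blue channels = right eye. Flow fields are per-pixel horizontal
    displacements (in pixels) from one eye's channels toward the other's.
    """
    g = frame[..., 1]
    h, w = g.shape
    cols = np.arange(w)
    left = frame.astype(float)
    right = frame.astype(float)
    for y in range(h):
        # Warp green toward red/blue to complete the right eye...
        src = np.clip(cols - flow_g_to_m[y].astype(int), 0, w - 1)
        right[y, :, 1] = g[y, src]
        # ...and warp red/blue toward green to complete the left eye.
        src = np.clip(cols - flow_m_to_g[y].astype(int), 0, w - 1)
        left[y, :, 0] = frame[y, src, 0]
        left[y, :, 2] = frame[y, src, 2]
    return left.astype(np.uint8), right.astype(np.uint8)
```

The hard part, as the next paragraph explains, is estimating those flow fields at all when the two eyes share no color channels.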



    Figure 12. ImageIQ3D color anaglyph to full-color stereo conversion process.

More properly stated, this is actually a problem of disparity estimation (not optical flow) between the right and left images -- but either way, the right and left images are using different colors. This makes the problem very difficult to solve, because the green color for the left eye and the magenta for the right eye cannot easily be compared: their colors and brightness (and pixel values) are different. However, the optical flow engine in ImageIQ3D does not use block matching or colors, but uses actual object structure, per-pixel, to determine motion and optical flow, so in this case it's perfectly suited to the problem at hand.

Figure 13. The same green/magenta anaglyph 3D film, reconstructed as a full-color 3D stereo pair. The original full-color movie was NOT used to construct this stereo pair.

In short, this means that a player incorporating ImageIQ3D can not only convert from 2D to 3D, but can take any legacy 3D format (including color anaglyph) and convert it to full-color, full-stereo 3D, in real-time, with no operator or user intervention or tuning, on commodity, off-the-shelf, inexpensive GPU hardware.



ImageIQ3D: Many Deployment Options

ImageIQ3D has been developed as a consumer DVD player application for Windows and MacOS, and as a batch-mode processor running on Linux, and is ready for low-BOM and parts-count-sensitive consumer electronics applications. To find out how you can leverage ImageIQ3D in your network, workflow, or consumer electronics solutions, contact HDlogix at [email protected].

ImageIQ is a registered trademark of HDlogix, Inc. ImageIQ3D is a trademark of HDlogix, Inc. © 2009 HDlogix, Inc.
