Extend User Experience of WebRTC with Unique Sensor Devices
Masashi Ganeko INFOCOM CORPORATION Nov. 2, 2017
About myself
• Masashi Ganeko / @massie_g
– Manager of a research team
– INFOCOM CORPORATION (Tokyo, Japan)
– http://infocom.co.jp/english/index.html
• One of the organizers of WebRTC Meetup Tokyo
– https://atnd.org/groups/webrtc
• English presentations on WebRTC (2013-2017)
– https://speakerdeck.com/mganeko
• Japanese presentations on WebRTC (2013-2017)
– http://www.slideshare.net/mganeko
What is WebRTC
• Web Real-Time Communication for
– Video
– Audio
– Data
• Open standard
– W3C WebRTC Working Group … API
– IETF RTCWEB Working Group … Protocol
– Core library is open source software
• Designed for Web browsers and other web-connected devices
• Easy to combine with other Web technologies
What I want to talk about today
• WebRTC is a very useful tool for building your own communication application
• WebRTC + sensor devices → a more interesting and exciting user experience
• Two experimental projects are introduced to show “the Power of WebRTC”
– Shotoku-Tamago
– Virtual Teleport
Shotoku-Tamago
• First Prize in the RICOH THETA x IoT Developers Contest 2016
– http://contest.theta360.com/index-en.html
– http://award.contest.theta360.com/prize1-e.html
• 347 entries from 33 countries, 54 projects submitted
Problem in Web Meeting
• Web meetings are very common
– They work pretty well for 1 to 1
– They work for 3 or 4 distributed members
• But the experience is poor for a meeting between a group and 1 remote member
– It is hard for the remote member to understand who is speaking
Current Solution
• Wide Camera
– Too small faces
• Swing Camera
– Not automatic
– Expensive ($1000)
Purpose of Shotoku-Tamago
• Improve the experience of the remote member
– in a meeting between a group and 1 person
• Make it easy to understand:
– who is speaking
– their expressions, such as smiling, angry, happy, disappointed, …
• With inexpensive devices
• With a fixed camera, without manual operation
Cool sensor devices in Shotoku-Tamago
• RICOH THETA S (360° camera)
– Dual fisheye lenses
– Captures the whole meeting room at once, without swinging or moving
• SYSTEM IN FRONTIER TAMAGO-03
– Egg-shaped microphone array
– Automatically locates and tracks who is speaking
– http://www.sifi.co.jp/system/modules/pico/index.php?content_id=39&ml_lang=en
Origin of Name: Shotoku Taishi / 聖徳太子
• Legendary prince of Japan, around AD 600
• Many episodes
• Some might be true, some might not be
• One of the most famous episodes:
• When 10 people were talking to him at the same time, he could understand what each one said.
• So he is known as the “prince with multiple ears”.
• Shotoku-Tamago
– “Tamago” means “egg” in Japanese
Usual Web Meeting vs. Shotoku-Tamago
[Figure: a usual web meeting compared with Shotoku-Tamago]
Whole architecture of Shotoku-Tamago
[Diagram: Web browsers exchange 360° video / mono audio and video / audio media over WebRTC, share the direction of the speaking member over WebSocket, and render with WebGL]
1. Detecting who is speaking
• HARK - Robot Audition Software
– http://www.hark.jp/
– By Honda Research Institute Japan with Kyoto University
– Royalty free for research use
• Microphone array
– Consists of 8 small microphones
– Works with HARK
• Use the “source tracker” of the HARK tool to locate and track speaking members
2. Connecting HARK tool and Web Browser
• The HARK tool is a standalone command-line native app.
• It cannot send data to a Web browser directly.
• So a pipe tool was written in Go, acting as a WebSocket server.
[Diagram: USB input → HARK tool (standalone native app) → stdout / stdin → convert tool as WebSocket server → WebSocket → Web browsers]
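The conversion step of such a pipe tool can be sketched as follows. The slides say the real tool was written in Go; this Python version only illustrates turning line-oriented stdout into JSON text messages. The input format (`source_id azimuth_deg` per line), the field names, and the `broadcast` callback (standing in for "send to every connected WebSocket client") are all assumptions, not HARK's actual output or the author's protocol.

```python
import json
from typing import Callable, Iterable, Optional


def parse_source_line(line: str) -> Optional[dict]:
    """Parse one hypothetical 'source_id azimuth_deg' line; None if malformed."""
    parts = line.split()
    if len(parts) != 2:
        return None
    try:
        return {"id": int(parts[0]), "azimuth": float(parts[1])}
    except ValueError:
        return None


def pump(lines: Iterable[str], broadcast: Callable[[str], None]) -> int:
    """Forward every well-formed line as a JSON message; return the count sent."""
    sent = 0
    for line in lines:
        message = parse_source_line(line)
        if message is not None:
            broadcast(json.dumps(message))
            sent += 1
    return sent
```

In a real tool, `pump(sys.stdin, broadcast)` would run with the tracker's stdout piped into it, and `broadcast` would write to the open WebSocket connections.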
3. Sending the direction of the speaking member
[Diagram: the direction of the speaking member is delivered to the Web browsers over WebSocket]
4. Capturing 360° video
• Capture in the Web browser with mediaDevices.getUserMedia()
• The captured video is in dual-fisheye format
5. Sending 360° video with WebRTC
• The dual-fisheye format video is sent between Web browsers over WebRTC, with mono audio
6. Rendering 360° video with WebGL
• Dual-fisheye format video
– Map to a sphere, with UV mapping
– Render with WebGL (three.js)
• RICOH sample
– https://github.com/ricohapi/video-streaming-sample-app/tree/master/samples/oneway-watch
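The idea of UV-mapping a dual-fisheye frame onto a sphere can be sketched as below. This is not the RICOH sample's code. It assumes an idealized camera: two equidistant fisheye lenses with exactly 180° field of view each, the front lens looking along +z and the back lens along -z, and the two image circles filling the left and right halves of the frame. Real cameras need calibration (lens offsets, rotation, overlap) that this sketch ignores.

```python
import math
from typing import Tuple


def direction_to_dual_fisheye_uv(x: float, y: float, z: float) -> Tuple[float, float]:
    """Map a unit direction on the sphere to (u, v) in an idealized dual-fisheye frame."""
    front = z >= 0.0
    # Angle between the direction and the optical axis of the chosen lens.
    axis_component = z if front else -z
    theta = math.acos(max(-1.0, min(1.0, axis_component)))
    # Equidistant projection: radius grows linearly with theta, 1.0 at 90 degrees.
    r = theta / (math.pi / 2.0)
    # Angle around the optical axis; the back lens sees x mirrored.
    phi = math.atan2(y, x if front else -x)
    center_u = 0.25 if front else 0.75  # assumed circle centers (left / right half)
    u = center_u + 0.25 * r * math.cos(phi)
    v = 0.5 + 0.5 * r * math.sin(phi)
    return u, v
```

With a library such as three.js, the same formula would typically be evaluated once per vertex of the sphere geometry to fill its UV attribute.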
7. Cropping the faces of members who are speaking
• The areas to crop are decided by the sound directions located with HARK
• Sphere of 360° video
– Up to 5 WebGL cameras
– Up to 5 canvas elements
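Turning sound directions into crop views can be sketched as follows: each speaking member's direction becomes a look-at vector for one camera placed at the center of the video sphere, capped at the 5 views mentioned above. The axis convention (y up, azimuth 0° along +z, positive azimuth toward +x) and the function names are assumptions for illustration.

```python
import math
from typing import List, Tuple

MAX_VIEWS = 5  # the slides mention up to 5 WebGL cameras / canvas elements

Vector = Tuple[float, float, float]


def look_at_target(azimuth_deg: float, elevation_deg: float = 0.0) -> Vector:
    """Unit vector that a camera at the sphere center should look along."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.sin(az), math.sin(el), math.cos(el) * math.cos(az))


def crop_targets(azimuths_deg: List[float]) -> List[Vector]:
    """One look-at vector per speaking member, capped at MAX_VIEWS."""
    return [look_at_target(a) for a in azimuths_deg[:MAX_VIEWS]]
```

Each returned vector would drive one camera, and each camera would render into its own canvas element.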
Power of WebRTC in Shotoku-Tamago
• Easy to handle 360 video with WebGL
– Use VR technology in a web browser
• Easy to use real-time data from sensor devices with WebSocket
– data from a sensor device, such as a microphone array
– data processed by signal-processing software, such as HARK
• Combining all of these technologies makes web meetings much more vivid
Virtual Teleport
• Real-time communication tool with
– Forward: real-time 3D-scanned hologram
– Backward: 360° video
• Demonstrated at the AppsJapan exhibition of Interop Tokyo, June 2017
– More than 800 guests enjoyed the new experience of holographic communication
• Covered by Japanese web media
– http://www.watch.impress.co.jp/headline/docs/extra/vr/1064673.html
Challenge in Virtual Teleport
• Communication with web meetings today
– Only 2D videos of faces are transferred
• Try future communication with Virtual Teleport
– Transfer your presence to a remote place
– Show your whole body as a 3D hologram, as in “STAR WARS”
Cool devices in Virtual Teleport
• Real-time 3D scan device
– Intel RealSense R200: https://www.intel.com/content/www/us/en/support/emerging-technologies/intel-realsense-technology/000016214.html
– Depth camera (IR laser projector, dual IR cameras) and RGB camera
• Holographic display devices
– Dreamoc HD3: https://www.realfiction.com/solutions/dreamoc-hd3
– Microsoft HoloLens: https://www.microsoft.com/en-us/hololens
Forward: Show you in 3D Hologram
[Figure: real-time 3D scan, shown in a holographic display]
Architecture of Virtual Teleport: Forward
[Diagram labels: DataChannel, WebSocket, HDMI]
Backward: Watch remote 360° video
[Figure: capture 360° video, render 360° video]
Architecture of Virtual Teleport: Backward
[Diagram labels: MediaStream, HDMI]
Show 3D Hologram / Watch 360° Video
[Figure: real-time 3D scan shown in a holographic display; 360° video captured and rendered]
Whole architecture of Virtual Teleport
[Diagram labels: MediaStream, DataChannel, WebSocket, HDMI. Forward: show you in 3D hologram. Backward: watch remote 360° video]
1. Capturing 3D in point cloud data
• Capture with RealSense from 4 directions
– The data is called a “point cloud”, a set of 3D points
• Merge the 4 point clouds from the 4 directions
– Shown in 4 different colors in the figure
[Figure: merging 4 directions]
• Using multiple depth cameras is not easy
– Each camera projects an IR laser pattern
– Multiple patterns usually interfere with each other
• Intel RealSense R200 can avoid this interference
– With libRealsense
– https://github.com/IntelRealSense/librealsense
Processing point cloud data
• Reduce points
– to support HoloLens (not so powerful)
– to control network bitrate (< 100 Mbps)
• Remove noise
– remove stray points
• Make a 3D mesh object
– find triangles for polygons
– connect polygons to make a mesh
– repair holes in the mesh
– reduce the polygons of the mesh
• Example: 1.7 M points → 15 K points, 26 K polygons → 5.5 K polygons
Point Cloud Library
• A standalone, large-scale, open project for 2D/3D image and point cloud processing
– http://pointclouds.org/
• Development has been active since Kinect V1 was released
• Used with libRealsense
– https://github.com/lebronzhang/pcl
– https://github.com/lebronzhang/pcl/blob/master/visualization/tools/real_sense_viewer.cpp
Using PCL for point cloud processing
• Reduce points
– pick up the center area … PCL PassThrough
– choose 1 from dense points … PCL VoxelGrid
• Remove noise
– remove stray points … PCL OutlierRemoval
– smooth points … PCL OutlierRemoval
• Make a 3D mesh object
– find triangles for polygons … GreedyProjectionTriangulation
– connect polygons to make a mesh
– repair holes in the mesh … VTK: https://www.vtk.org/Wiki/VTK/Examples/Cxx/Meshes/FillHoles
– reduce the polygons of the mesh … Reduction Polygon: https://github.com/PointCloudLibrary/pcl/issues/967
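The two point-reduction steps above can be illustrated in plain Python. This is not PCL's API, only a sketch of the ideas behind a pass-through filter (keep points inside a range on one axis) and a voxel grid filter (replace all points that fall in the same cube with their centroid). The function names, the `leaf` parameter, and the sample values are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def pass_through(points: List[Point], axis: int, lo: float, hi: float) -> List[Point]:
    """Keep only the points whose coordinate on `axis` lies within [lo, hi]."""
    return [p for p in points if lo <= p[axis] <= hi]


def voxel_grid(points: List[Point], leaf: float) -> List[Point]:
    """Replace the points inside each cube of edge `leaf` with their centroid."""
    buckets: Dict[Tuple[int, int, int], List[Point]] = defaultdict(list)
    for p in points:
        key = (math.floor(p[0] / leaf), math.floor(p[1] / leaf), math.floor(p[2] / leaf))
        buckets[key].append(p)
    reduced = []
    for members in buckets.values():
        n = len(members)
        reduced.append((
            sum(m[0] for m in members) / n,
            sum(m[1] for m in members) / n,
            sum(m[2] for m in members) / n,
        ))
    return reduced
```

A larger `leaf` gives fewer output points, which is one way to trade detail for bitrate and rendering cost.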
2. Sending 3D Data
• Send the same data to
– Dreamoc HD3 over WebRTC DataChannel
– HoloLens over WebSocket
• Data
– 3D mesh data, built from the point clouds of the 4 IR cameras
– Texture from the RGB cameras: 4 JPEG images, 640 x 480
– UV map: how to map the texture onto the mesh
• Convert between different coordinate systems
– PCL … right-handed coordinate system
– Unity … left-handed coordinate system
• 2 – 3 frames / sec
– 1 MB / frame
– about 20 Mbit / sec
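Converting a mesh between a right-handed and a left-handed coordinate system can be sketched as below. Mirroring one axis changes the handedness, and because a mirror also flips which side of each triangle faces outward, the vertex order of every triangle is reversed as well. Which axis to negate depends on how the two systems are aligned; negating z here is an assumption, as is the function name.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[int, int, int]


def flip_handedness(
    vertices: List[Vertex], triangles: List[Triangle]
) -> Tuple[List[Vertex], List[Triangle]]:
    """Mirror the z axis and reverse triangle winding so faces keep pointing outward."""
    mirrored = [(x, y, -z) for (x, y, z) in vertices]
    rewound = [(a, c, b) for (a, b, c) in triangles]
    return mirrored, rewound
```

Applying the function twice returns the original vertex positions, which is a quick sanity check for the conversion.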
Inside of 3D data
4. Displaying on Dreamoc HD3
Building Dreamoc HD3 App
• Built with Unity (C#) as a Windows app
• Use 3 cameras and 3 images for the 3 mirrors
– Front view, left view, right view
• Unity Asset: WebRTC Network
– https://www.assetstore.unity3d.com/en/#!/content/47846
Demo Video 1. Hologram on Dreamoc HD3
• https://lab.infocom.co.jp/2017/06/appsjapan2017.html
– https://lab.infocom.co.jp/o2/20170601/sample.gif
5. Displaying on HoloLens
Building HoloLens App
• Use Unity for 3D programming with C#
– use MixedRealityToolkit-Unity (a.k.a. HoloToolKit-Unity) for position detection and gesture detection
– export a Visual Studio project
• Use Visual Studio 2017 to build a Windows 10 UWP app
– UWP: Universal Windows Platform (Store App)
Demo Video 2. Hologram on HoloLens
• https://lab.infocom.co.jp/2017/06/appsjapan2017.html
– https://lab.infocom.co.jp/2017/06/12/VirtualTeleport_Mixed_640.mp4
Real-time Hologram is not perfect yet
• Many holes and bumps in the 3D mesh object
– The algorithms for real-time point reduction and polygon detection are not mature yet
– But they will be improved soon with machine learning
• Motion is not smooth
– CPU power is not enough to handle a high frame rate (but CPUs / GPUs improve year by year)
– Bitrate becomes too high when many frames per second are transferred (but 3D data compression methods, such as Draco, are coming)
• I believe that it will be improved in 2 – 3 years.
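The bitrate concern can be checked with simple arithmetic, using the figures given earlier in the deck (about 1 MB per frame, 2 – 3 frames per second, a budget below 100 Mbps). The sketch assumes 1 MB = 10^6 bytes and uncompressed transfer; the 30 frames per second case is a hypothetical target, not a number from the slides.

```python
def bitrate_mbps(frame_megabytes: float, frames_per_second: float) -> float:
    """Bitrate in Mbit/s for frames of the given size (1 MB = 8 Mbit)."""
    return frame_megabytes * 8.0 * frames_per_second


def max_fps(frame_megabytes: float, budget_mbps: float) -> float:
    """Highest frame rate that stays within the bitrate budget."""
    return budget_mbps / (frame_megabytes * 8.0)


if __name__ == "__main__":
    for fps in (2, 3, 30):
        print(fps, "fps:", bitrate_mbps(1.0, fps), "Mbit/s")
    print("max fps under 100 Mbit/s:", max_fps(1.0, 100.0))
```

At 2 and 3 frames per second this gives 16 and 24 Mbit/s, consistent with the "about 20 Mbit / sec" on the earlier slide, while 30 frames per second would need 240 Mbit/s without compression.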
6. Backward: Watch the remote scene with 360° video
Backward: 360° video with multiple displays
• 360° camera to capture video
• Multiple displays to cover a 180° view
– 3 (or more) displays scroll in sync, as 1 large screen
– No VR headset, so that the face is not hidden
• WebRTC (MediaStream) for video / audio
• WebSocket to synchronize the viewing direction
• WebGL / three.js for rendering
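Treating several displays as one large screen can be sketched as follows: one shared center direction is distributed (for example over WebSocket), and each display derives its own viewing direction from it. The sketch assumes every display covers an equal share of the total field of view and that directions are yaw angles in degrees; the function and parameter names are hypothetical.

```python
from typing import List


def display_yaws(
    center_yaw_deg: float, display_count: int = 3, total_fov_deg: float = 180.0
) -> List[float]:
    """Yaw of each display's view center, left to right, normalized to [0, 360)."""
    per_display = total_fov_deg / display_count
    leftmost = center_yaw_deg - total_fov_deg / 2.0 + per_display / 2.0
    return [(leftmost + i * per_display) % 360.0 for i in range(display_count)]
```

With the defaults, each display shows a 60° slice, so a shared center of 0° yields view centers at 300°, 0°, and 60°. When the shared center changes, all displays scroll together.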
Demo Video 3. Wide video with multiple displays
• https://lab.infocom.co.jp/2017/06/appsjapan2017.html
– https://lab.infocom.co.jp/2017/06/01/02948838aa4562da5ebc936e8e11550f097db0b0.gif
Power of WebRTC in Virtual Teleport
• Makes “holographic communication” possible today
– Transfer 3D data in real time over WebRTC DataChannel
– With a 3D scan camera, such as RealSense
– With holographic devices, such as Dreamoc HD3 and HoloLens
• Holographic communication is a very attractive experience
– even with a rough model and motion that is not smooth
• Real-time 3D scanning with depth cameras is evolving rapidly
– There is a lot of useful open source software
– Machine learning may improve 3D scanning dramatically
• WebRTC works on many platforms, not only in Web browsers
– Linux C++ app
– Windows Unity C# app
Conclusion
• WebRTC is very powerful, because it is easy to combine with
– many Web technologies, such as WebSocket and WebGL
– much open source software, such as three.js and PCL
• It is possible to create exciting user experiences
– with many cool sensor devices and display devices
• I hope you will build your own great application with WebRTC!
Any questions?
Thank you!