A DISTRIBUTED FRAMEWORK FOR RELAYING STEREO VISION FOR TELEROBOTICS

M. Al-Mouhamed, O. Toker, A. Iqbal, and M. Nazeeruddin


Contents

• Introduction
• Background
• Status Of The Problem
• Literature Review
• Thesis Objectives
• Video Client-Server Framework
• Distributed Telerobotic Framework
• Augmented Reality
• Conclusions
• Thesis Contributions
• Future Research Directions

Introduction

• Telerobotics lets humans extend their manipulative skills over a distance and extend eye-hand motion coordination.
• Telerobotic applications:
    • Scaled-down: nano-scale, micro-surgery, clean-room
    • Hazardous: nuclear decommissioning and inspection, fire fighting, disposal of dangerous objects, minefield clearance, operation in harsh environments (unmanned, underwater, ice, desert, space)
    • Safety: rescue
    • Security: surveillance, reconnaissance
    • Unmanned: oil platform inspection, repair
    • Teaching, training, and entertainment

Introduction (cont.)

• Minefield clearance, unmanned underwater inspection, and search and rescue.
• Applications where humans adversely affect the environment, such as medical applications and clean-room operations.
• Applications in which humans cannot be present, such as deep space and nanorobotics.

Introduction (cont.)

• Extending eye-hand motion coordination using telerobotics:
    • In natural eye-hand motion coordination, the operator sees his hand and reacts accordingly.
    • In telerobotics:
        • The operator holds a master arm to dictate his hand motion.
        • The motion is transmitted to a remote slave arm and reproduced (replica).
        • The operator wears a head-mounted display (HMD) to see in 3D the effects of his motion on the remote tool.
        • The operator sees neither his hand (because of the HMD) nor the master arm; his hand is logically mapped to the remote tool.
        • The operator logically acts on the remote tool seen through the HMD.
• Stereo vision provides 3D views of the slave scene and a metric to calculate the 3D positions and orientations of objects.

Background (cont.)

• A two-way logical communication link transfers commands from the client to the server through a computer network and conveys different kinds of feedback, e.g., video and force, back to the client site.

[Figure labels: Human Operator; Network (LAN); Master Arm, HMD, Glove, Force Feedback; Sensors, Video, Sound, Force Feedback; Actuators, Hands, Arms, Feet.]

Background (cont.)

• A telepresence system displays high-quality information from the remote world, visual or otherwise, in such a natural way that the operator feels physically present at the remote site.
• Virtual Reality (VR) is the interactive simulation of a real or imagined environment that can be experienced visually or otherwise in the three dimensions of width, height, and depth.

Video Client-Server Framework

• Providing stereo video on the client side imposes severe bandwidth requirements, because a real-time stream of video data must be transferred in a telerobotic environment.
• It requires technologies such as DirectX and Windows Sockets to capture and relay video data over a LAN.
• Commercially available software such as Microsoft NetMeeting is optimized for low-bandwidth networks like the Internet, so its display resolution is too poor for stereo vision in a telerobotic setup.

Video Client-Server Framework

• Development of a highly optimized client-server framework for grabbing and relaying a stereo video stream.
• Server tasks:
    • Capture (grab) stereo images from two cameras.
    • Establish a reliable client-server connection.
    • Upon request from the client, send the stereo frame, comprising two pictures, to the client through Windows Sockets.

Video Client-Server Framework

• Client tasks:
    • Detect the server and establish the connection with it.
    • Set up a highly optimized, fast graphic display system to show the pictures received from the server.
    • Display the pictures received from the server and continue in a loop, each time requesting a new stereo frame from the server.
    • Allow the viewer to adjust the alignment of the pictures on the HMD to compensate for the misalignment and non-linearity present in the camera at the server.
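The request-reply exchange in the server and client task lists can be sketched as follows. This is a minimal Python illustration, not the authors' C# and Windows Sockets code: the `REQ` request token, the two 4-byte length fields, and all function names are assumptions made for the example.

```python
import socket
import struct

REQUEST = b"REQ"  # hypothetical request token; the slides do not specify one


def recv_exact(sock, count):
    """Read exactly `count` bytes from a blocking (synchronous) socket."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def serve_one_request(sock, left, right):
    """Server side: wait for a request, then send both pictures as one stereo frame."""
    if recv_exact(sock, len(REQUEST)) != REQUEST:
        raise ValueError("unexpected request")
    # Assumed framing: two 4-byte big-endian lengths, then left and right pictures.
    header = struct.pack("!II", len(left), len(right))
    sock.sendall(header + left + right)


def request_frame(sock):
    """Client side: ask the server for a new stereo frame."""
    sock.sendall(REQUEST)


def read_frame(sock):
    """Client side: receive one stereo frame and split it into its two pictures."""
    left_len, right_len = struct.unpack("!II", recv_exact(sock, 8))
    left = recv_exact(sock, left_len)
    right = recv_exact(sock, right_len)
    return left, right
```

In a client loop, `request_frame` and `read_frame` would be called once per displayed stereo frame, which is the "ask for a new stereo frame each time" behaviour listed above.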

Video Client-Server Framework

• The proposed client-server framework is based on Microsoft Visual C# and Microsoft DirectX.
• Microsoft DirectX provides COM-based interfaces for various graphics-related functions. DirectShow is one of these services; it provides efficient interfaces for capturing and playing back video data.

Video Client-Server Framework

• Network services are accessed through Windows Sockets, which are used to send and receive data over the network. The stereo video setup uses synchronous Windows Sockets as the interface between the vision server and the client.
• Two schemes were implemented to transfer the video data. They differ in the use of multiple threads on the server side and in some optimization steps that reduce the network traffic needed to transfer the video data.
• A general overview of the image grabbing and display system is given before the detailed description of these schemes.

Video Client-Server Framework

• A DirectShow component named SampleGrabber is used to capture video frames arriving as a stream from the stereo camera setup.

[Figure: block diagram of the server-side scheme used to grab stereo frames.]

Video Client-Server Framework

• To display the pictures received from the server, the client uses GDI (Graphics Device Interface).

[Figure: block diagram of the client-side scheme used to display the video.]

Video Client-Server Framework – Single Buffer, Serialized Transfer

Video Client-Server Framework – Double Buffer, De-Serialized Transfer

• This scheme optimizes the transfer of video data over the LAN through thread manipulation on the server.
• The capture thread and the sending thread are overlapped by using double buffers on the server side.
• This ensures that the thread responsible for sending the video data over the LAN does not have to wait after receiving a picture request from the client.
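The overlap between capture and sending can be illustrated with a small double-buffer class. This is a hedged Python sketch of the general technique, not the server code from the presentation: the class name, method names, and locking strategy are assumptions.

```python
import threading


class DoubleBuffer:
    """Two frame slots: the capture thread fills one while the sender reads the other."""

    def __init__(self):
        self._slots = [None, None]
        self._ready = None  # index of the slot holding the newest complete frame
        self._lock = threading.Lock()

    def write(self, frame):
        """Capture thread: fill the slot that is not published, then publish it."""
        with self._lock:
            target = 0 if self._ready != 0 else 1
        # The (potentially slow) copy happens outside the lock,
        # so the sender is never blocked by an ongoing capture.
        self._slots[target] = frame
        with self._lock:
            self._ready = target

    def latest(self):
        """Sending thread: return the newest complete frame without waiting."""
        with self._lock:
            if self._ready is None:
                return None
            return self._slots[self._ready]
```

The sender only holds the lock long enough to read an index, so a picture request from the client can be answered immediately with the most recent complete stereo frame.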

Video Client-Server Framework – Double Buffer, De-Serialized Transfer

• This approach makes it possible to send a higher number of stereo frames over the same LAN and hardware.
• The only overhead is the allocation of an extra buffer in the server DRAM, which is not a real problem on available systems with large memory.

Video Client-Server Framework – 3D Visualization

• Once stereo images of the remote scene are available, there are different methods to produce 3D effects on the client side.
• Similarly, different hardware devices, such as eye-shuttering glasses and an HMD (Head Mounted Display), are used to show the images to the user.
• The following two methods were used for stereo image production on the client side:
    • Sync-Doubling
    • Page Flipping

Video Client-Server Framework – Sync-Doubling

• Left and right eye images are arranged one above the other on the computer screen.
• A sync-doubler sits between the display output of the PC and the monitor and inserts an additional frame v-sync between the left and right frames (i.e. the top and bottom frames).
• This allows the left and right eye images to appear in an interlaced pattern on the screen.
• Using the frame v-sync as the shutter alternating sync allows the left and right frames to be delivered synchronously to the respective eyes, thus creating a three-dimensional image.

Video Client-Server Framework – Page Flipping

• Page flipping means alternately showing the left and right eye images on the screen.
• Combining 3D shuttering glasses with this type of 3D presentation requires using the frame v-sync as the shutter alternating sync to create a 3D image.
• An HMD can also be used, with the two images sent to the two LCD screens of the HMD. The user sees a different image with each eye and thus perceives the depth of the scene. DirectX can be used to flip both images simultaneously.

Video Client-Server Framework – Performance Evaluation

• Experiments were conducted to test the visual quality of the client-server setup and to measure the time delays and other characteristics of the video data.
• The specifications of the stereo frame are:
    • Height of each picture = 288 pixels
    • Width of each picture = 360 pixels
    • Size = 304 KB (311040 bytes) per picture = 608 KB (622080 bytes) per stereo frame
• Each stereo frame is about 0.6 MB, i.e. about 5 Mbit, so every stereo frame per second consumes about 5 Mbps of LAN bandwidth. A 100 Mbps LAN is therefore limited to about 20 stereo fps at the highest possible transfer rate.
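The frame-size and bandwidth arithmetic can be checked with a few lines of Python. The 3 bytes per pixel is an assumption (24-bit colour); it is the value that reproduces the 311040 bytes per picture quoted on the slide.

```python
WIDTH, HEIGHT = 360, 288
BYTES_PER_PIXEL = 3                 # assumption: 24-bit colour
LAN_BITS_PER_SECOND = 100_000_000   # 100 Mbps LAN

picture_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
stereo_frame_bytes = 2 * picture_bytes           # left + right picture
stereo_frame_bits = 8 * stereo_frame_bytes
max_fps = LAN_BITS_PER_SECOND / stereo_frame_bits

print(picture_bytes)       # 311040
print(stereo_frame_bytes)  # 622080
print(stereo_frame_bits)   # 4976640, i.e. about 5 Mbit per stereo frame
print(round(max_fps, 2))   # 20.09
```

This upper bound ignores protocol overhead, so the achievable rate on a real LAN is somewhat lower.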

Video Client-Server Framework – Performance Evaluation

Copying from SampleGrabber to DRAM

• Case 1: Copy times on server – Single Force Thread
    • 300 stereo frames
    • Mean value = 24.025 ms
    • 95% CI between 23.29 ms and 24.75 ms

Video Client-Server Framework – Performance Evaluation

Copying from SampleGrabber to DRAM

• Case 2: Copy times on server – Two Threads
    • 300 stereo frames
    • Mean value = 60.48 ms
    • 95% CI between 8 ms and 150 ms

Video Client-Server Framework – Performance Evaluation

Copying from SampleGrabber to DRAM

• Case 3: Copy times on server with Force transfer over LAN
    • 300 stereo frames
    • Mean value = 33.46 ms
    • 9.43 ms additional for adding the network transport thread

Video Client-Server Framework – Performance Evaluation

Transferring over the LAN

• Case 1: Single Buffer, Serialized Transfer
    • 300 stereo frames
    • Mean value = 86.1 ms
    • 11.61 stereo frames/second

Video Client-Server Framework – Performance Evaluation

Transferring over the LAN

• Case 2: Double Buffer, De-Serialized Transfer
    • 60,000 stereo frames
    • Mean value = 58.94 ms
    • 17 stereo frames/second
    • 90% CI between 56.0 and 64.8 ms

Video Client-Server Framework – Results Summary

Scheme                         Cameras to Server DRAM (ms)   Server to Client (ms)   Frames Per Second
Single Buffer, Serialized      24.025                        86.1                    11.61
Double Buffer, De-serialized   24.025                        58.94                   17

• Housheng et al. [2001] reported a transfer rate of 9-12 fps for a compressed single image of 200×150 pixels over a LAN, whereas our scheme transfers 17-18 uncompressed stereo fps with pictures of 360×288 pixels each.
• The network bandwidth is nearly saturated at 18 fps.
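The frame rates in the table follow directly from the mean server-to-client times, and the saturation remark can be checked against the 100 Mbps link. The sketch below is illustrative Python; the function names are made up for the example, and the 622080-byte stereo frame is taken from the frame specification given earlier.

```python
STEREO_FRAME_BITS = 622080 * 8  # uncompressed stereo frame, from the frame specification


def frames_per_second(mean_transfer_ms):
    """Frame rate implied by a mean per-frame transfer time in milliseconds."""
    return 1000.0 / mean_transfer_ms


def link_utilisation(fps, link_bits_per_second=100_000_000):
    """Fraction of the LAN bandwidth consumed by the raw stereo video payload."""
    return fps * STEREO_FRAME_BITS / link_bits_per_second


print(round(frames_per_second(86.1), 2))   # 11.61 (single buffer, serialized)
print(round(frames_per_second(58.94), 2))  # 16.97 (double buffer, de-serialized)
print(round(link_utilisation(18), 3))      # 0.896
```

At 18 stereo fps the payload alone takes roughly 90% of a 100 Mbps link, which is consistent with the statement that the network is nearly saturated.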

A Multi-threaded Distributed Telerobotic Framework

• Distributed application programming is one of several schemes for establishing a reliable connection between the master and slave arms.
• The different items are realized as software components, which then communicate with each other using the distributed components paradigm.
• Object-oriented approach:
    • Software reusability
    • Easy extensibility
    • One-time debugging
    • Multi-user environment
    • Data encapsulation

A Multi-threaded Distributed Telerobotic Framework

• With distributed programming, network protocol issues can be avoided: the distributed framework itself takes care of all network resources and of binary data transfer over the network.
• DCOM (Distributed Component Object Model) based components have previously been used in telerobotics by Yeuk et al.
• .NET components are more advanced than COM-based components and offer complete support for the .NET Framework, including .NET Remoting and SOAP technologies.
• Several components were developed on the server as well as the client side; they are explained briefly.

A Multi-threaded Distributed Telerobotic Framework – MasterArm Component

• Local force feedback uses a second-order model to minimize the force applied by the operator.
• To estimate the force, the component keeps a record of the force data read over a certain number of samples (the history), together with the system time of each sample.
• It then evaluates the velocity and acceleration of the master arm at each sampling instant and stores them in a circular buffer.
• This information is used to calculate a force proportional to the one the operator is applying, which is then fed back to the master arm.
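One way to picture the history-based estimate is with finite differences over timestamped samples held in a circular buffer. The following Python sketch rests on several assumptions that the slides do not confirm: it uses a single position coordinate as the sampled quantity, backward differences for velocity and acceleration, and a mass-damper form for the second-order model. The class name, function name, and the `mass` and `damping` parameters are hypothetical.

```python
from collections import deque


class MotionHistory:
    """Circular buffer of (time, position, velocity, acceleration) samples."""

    def __init__(self, size=8):
        self._samples = deque(maxlen=size)  # oldest samples are dropped automatically

    def add(self, t, position):
        """Store a new sample and return its estimated velocity and acceleration."""
        if self._samples:
            t_prev, p_prev, v_prev, _ = self._samples[-1]
            dt = t - t_prev
            if dt <= 0:
                raise ValueError("timestamps must be strictly increasing")
            velocity = (position - p_prev) / dt      # backward difference
            acceleration = (velocity - v_prev) / dt  # difference of velocities
        else:
            velocity = acceleration = 0.0
        self._samples.append((t, position, velocity, acceleration))
        return velocity, acceleration


def second_order_force(velocity, acceleration, mass, damping):
    """Assumed second-order model: F = mass * acceleration + damping * velocity."""
    return mass * acceleration + damping * velocity
```

For example, samples at t = 0, 1, 2 with positions 0, 2, 6 give a velocity of 4 and an acceleration of 2 at the last instant, and the assumed model then yields the force that would be fed back to the master arm.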

A Multi-threaded Distributed Telerobotic System – Server

A Multi-threaded Distributed Telerobotic System – Client

Client GUI

A Multi-threaded Distributed Telerobotic System – Performance Evaluation

Force and video streams

• 3000 force packets
    • Mean inter-arrival time = 1.08 ms (an addition of 0.4 ms)
    • 90% CI between 0.5 and 3.9 ms
    • Worst-case inter-arrival time = 789.74 ms
• During the transfer of video data: 3710 force packets
    • Mean inter-arrival time = 3.9 ms
    • 90% CI between 0.5 and 13 ms
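Statistics of this kind can be computed from packet arrival timestamps as sketched below. This Python example is illustrative only: it reads the quoted 90% interval as the central range covering 90% of the observed inter-arrival times, which is an assumption about how the figures were obtained, and the function names are hypothetical.

```python
def inter_arrival_times(timestamps_ms):
    """Differences between consecutive packet arrival times."""
    return [later - earlier
            for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def summarise(gaps, coverage_percent=90):
    """Mean, central coverage interval, and worst case of the inter-arrival times."""
    ordered = sorted(gaps)
    n = len(ordered)
    if n == 0:
        raise ValueError("no inter-arrival times to summarise")
    tail_percent = (100 - coverage_percent) / 2
    low_index = int(tail_percent * (n - 1) // 100)  # rank of the lower bound
    high_index = (n - 1) - low_index                # symmetric upper bound
    return {
        "mean": sum(ordered) / n,
        "low": ordered[low_index],
        "high": ordered[high_index],
        "worst": ordered[-1],
    }
```

Feeding it the arrival times of the force packets with and without a concurrent video stream would give the two sets of figures that the slide compares.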

A Multi-threaded Distributed Telerobotic System – Performance Evaluation

A magnified plot of inter-arrival times in the presence of force, video and command streams.

A Multi-threaded Distributed Telerobotic System – A Comparison

• Teresa [1999] developed a Java and VRML based telerobotic system and reported an image acquisition time of 1 s for a single frame of 16-bit depth. Our DirectShow-based system achieves a stereo image acquisition time of 24 ms in a telerobotic system.
• The client-server framework implemented by Al-Harthy [2001] takes around 50 ms to transfer a command signal (48 bytes) from the client to the robot. In our case a similar packet (48 bytes) takes 0.7 to 1.1 ms, owing to the efficient use of raw network resources by .NET Remoting.

Augmented Reality

• The basic idea of an AR (augmented reality) system is to mix real and virtual information in order to provide an augmented view of the remote scene that carries more information than a simple video could offer.
• AR can be used as an effective way to overcome the effects of time delays in a telerobotic environment.
• The information added locally must fit seamlessly into the remote real data so as not to confuse the teleoperator.

Augmented Reality – Work Strategy

• Introduce non-existent objects so that they appear to be part of the video scene.
• Show a small red ball in the most recent stereo video frame at the position of the gripper, calculated locally from the command data of the master arm.
• Overlaying requires a one-to-one mapping between the remote and virtual world coordinate spaces using a camera model.
• We use the weak-perspective camera model.
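In a weak-perspective model every scene point is scaled by one common factor (focal length divided by the mean depth of the scene) instead of by its own depth. The Python sketch below illustrates that idea in general terms; the parameter names and the numeric values in the usage note are assumptions, not values from the presentation.

```python
def weak_perspective(point, focal_length, mean_depth, centre=(0.0, 0.0)):
    """Project a camera-frame point (x, y, z) to image coordinates.

    All points share one scale, focal_length / mean_depth, so the
    individual depth z of the point does not enter the projection.
    """
    scale = focal_length / mean_depth
    x, y, _z = point
    return (centre[0] + scale * x, centre[1] + scale * y)
```

For instance, with a focal length of 500 pixels, a mean depth of 2.0, and the image centre at (180, 144), the point (0.5, -0.25, 2.3) projects to (305.0, 81.5), and moving it to a depth of 1.7 leaves the projection unchanged. That depth independence is what distinguishes weak perspective from full perspective.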

Augmented Reality – Camera Identification

• Using a camera model requires identifying its projection matrix.
• For a stereo projection, two projection matrices are needed, one each for the left and right images.
• A 3D frame of reference serves as an affine basis for all other points in the scene.
• This affine relationship between the frame of reference and the other points remains invariant in the projected points.
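The invariance property means that a point expressed as affine coordinates (a, b, c) relative to the reference frame projects to the same combination of the frame's image points. The Python sketch below shows this reprojection for one camera; the function name and the pixel coordinates in the usage note are illustrative assumptions, and a stereo setup would simply use one image basis per camera.

```python
def affine_coordinates_to_image(coords, image_basis):
    """Reproject a scene point from its affine coordinates.

    coords:      (a, b, c), so that X = P0 + a(P1-P0) + b(P2-P0) + c(P3-P0)
                 with P0..P3 the four points of the 3D reference frame.
    image_basis: the observed image positions (q0, q1, q2, q3) of P0..P3.
    Under an affine camera the same combination holds in the image.
    """
    a, b, c = coords
    q0, q1, q2, q3 = image_basis
    return tuple(
        q0[k] + a * (q1[k] - q0[k]) + b * (q2[k] - q0[k]) + c * (q3[k] - q0[k])
        for k in range(2)
    )
```

With reference-frame image points (100, 100), (150, 100), (100, 60) and (120, 90), the scene point with affine coordinates (0.5, 0.5, 0) lands at pixel (125.0, 80.0). Identifying a camera then amounts to locating the four reference points in its image.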

Augmented Reality – Camera Identification

• The IdentifyCamera component is designed to help identify both cameras at system initialization as well as whenever required.

[Figure label: Reference Frame]

Augmented Reality – Surfaces, HAL, Page Flipping

• Microsoft DirectX is a set of highly optimized application programming interfaces (APIs) for developing high-performance 2D and 3D graphics (or multimedia) applications.
• A DirectX surface can be thought of as a piece of paper that you can draw on. It provides access to pixel data.
• The HAL (Hardware Abstraction Layer) provides a common set of graphics functions on all hardware devices.
• The primary surface is the current video buffer. The next frame is written to an off-screen secondary surface. In one instruction, the graphics device swaps the addresses of the two surfaces, so that the off-screen surface becomes the output surface. This is page flipping.
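The primary and secondary surface roles can be modelled in a few lines. This is a plain Python model of the page-flipping idea, not the DirectX API: the class and member names are hypothetical, and a real flip is performed by the graphics device rather than by swapping an index.

```python
class FlipChain:
    """Two pixel buffers whose roles (primary / off-screen) are swapped on flip."""

    def __init__(self, width, height):
        self._surfaces = [bytearray(width * height), bytearray(width * height)]
        self._front = 0  # index of the surface currently being displayed

    @property
    def primary(self):
        """The surface currently shown on screen."""
        return self._surfaces[self._front]

    @property
    def back(self):
        """The off-screen surface that receives the next frame."""
        return self._surfaces[1 - self._front]

    def flip(self):
        """Swap the roles of the two surfaces; no pixel data is copied."""
        self._front = 1 - self._front
```

Drawing always targets `back`, and a single `flip()` makes the finished frame visible, so the viewer never sees a partially drawn picture.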

Augmented Reality – Component Framework

• On the server side, no new component is added for the AR application. However, the server side requires setting up the cameras, placing and removing the reference frame, etc.
• The client side has the following components:
    • StereoSocketClient component
    • IdentifyCamera component
    • RobotModel component
    • DXInterface component

Augmented Reality – StereoSocketClient Component

• A multi-threaded component initialized by the client AR application to:
    • provide the necessary non-blocking socket interface to the vision server on the remote side by connecting and receiving data through a dedicated thread;
    • extract single as well as stereo images from the binary video data stream sent by the vision server;
    • synchronize left and right images while providing stereo frames.
• It invokes an event when a new stereo frame is received from the server.
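The extraction and event steps can be sketched as a small stream assembler. This Python example is an illustration under stated assumptions, not the StereoSocketClient implementation: it assumes fixed-size pictures sent left first and then right, and the class name and callback parameter are hypothetical.

```python
class StereoFrameAssembler:
    """Cuts a byte stream into stereo frames and reports each complete pair."""

    def __init__(self, picture_bytes, on_stereo_frame):
        self._picture_bytes = picture_bytes
        self._on_stereo_frame = on_stereo_frame  # callback(left, right), the "event"
        self._pending = bytearray()

    def feed(self, data):
        """Add newly received bytes; fire the callback for every complete frame."""
        self._pending.extend(data)
        frame_bytes = 2 * self._picture_bytes
        while len(self._pending) >= frame_bytes:
            left = bytes(self._pending[:self._picture_bytes])
            right = bytes(self._pending[self._picture_bytes:frame_bytes])
            del self._pending[:frame_bytes]
            # Left and right pictures are always delivered together.
            self._on_stereo_frame(left, right)
```

A dedicated receiving thread would call `feed` with whatever the socket returns, and the display code would subscribe through the callback, so partial reads never produce a mismatched left/right pair.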

Augmented Reality – RobotModel Component

• Acts as a passive proxy of the PUMA robot on the client side.
• Provides updated gripper and joint positions in Cartesian space through the PUMA direct and inverse geometric models, G and G^{-1} respectively.
• IDecisionServer cannot be used because it is an active proxy of the PUMA, which does not allow the positions of the robot joints to be manipulated independently of the PUMA.

Augmented Reality – DXInterface Component

• Central component of the AR framework.
• Runs AR and visualization logic in separate threads.
• Handles several tasks, such as:
    • Synchronization of real and virtual data
    • Projection on the video surface
    • Augmentation of the real video
    • Page flipping for HMD stereo visualization


Augmented Reality – Complete System

Augmented Reality – Augmenting Video

[Figure label: Augmented Ball]

Conclusions

• Real-time control of telerobots in the presence of time delays and data loss is a dynamic research area.
• Efficient teleoperation requires force and visual feedback, which over a LAN can only be attained by multi-streaming the real-time data.
• This work uses .NET-based distributed components to develop a reliable telerobotic scheme that multi-streams the real-time data through extremely fast network connections in a multithreaded environment.

Contributions

• A highly optimized stereo video client-server framework was designed and developed using the Visual C++ and Visual C#.NET programming languages.
• With this framework, a video transfer rate of 18 stereo frames per second was achieved over the KFUPM LAN.
• Different output techniques for stereo video, such as eye-shuttering glasses and HMD page flipping, were implemented and their performance evaluated.

Contributions

• A component-based, multi-threaded distributed framework for telerobotics was designed, implemented, and evaluated to study the effects of multi-threading on real-time telerobotics.
• This scheme significantly reduced the network delays in a given telerobotic scenario while providing a very reliable connection between the client and server sides.

Contributions

• Different geometric working frames are provided for the operator to enhance his maneuverability in the remote environment.
• Force feedback is deployed on the client side as a means to enhance the telepresence of the operator tele-manipulating the slave arm.
• Computer vision techniques are explored to create AR (augmented reality) on the client side by merging virtual data with the real video stream from the remote side.
• The use of AR has helped decrease the network delays by reducing the requirement for fresh video data.

Future Research Directions

• Implementing hierarchical supervisory control in the developed telerobotic framework. This will allow repeatability of simple tasks using impedance control.
• Incorporating complex geometrical shapes in the real video to provide even richer information to the client side.
• Studying the effects of hyper-threading on a multi-threaded telerobotic framework.
• Comparing the projection accuracies of different camera models while augmenting the real data.
• Analysis and design of a 6 d.o.f. (3 d.o.f. force feedback) master arm being developed at KFUPM in the COE department.
