
Interaction with virtual environment using verbal/non-verbal communication

Tomoaki Ozaki, Kazuaki Tanaka & Norihiro Abe
Faculty of Computer Science and Systems Engineering,
Kyushu Institute of Technology
Iizuka-shi, 820-8502 Japan
[email protected]

1 Introduction

Many expert systems have been built to date. Because such a system cannot correctly detect a human's behavior, dialogs with the system sometimes fail. A virtual reality system is epoch-making in that it can solve this problem by bringing the human being into the space of the computer: in a virtual reality system, a human's non-verbal behavior can be input into the computer using sensors and data gloves.

Unlike a traditional interface using a mouse and keyboard, such a system can watch the behavior of a user and teach the right method when that behavior is wrong. But the system could not know the intention of the user definitely, because no voice interaction facility was provided; from non-verbal behavior alone, a system cannot completely detect human intention and hesitation [2].

In communication between human beings, spoken language is an important element besides non-verbal behavior. In this research, therefore, we use speech recognition technology to add voice interaction to a virtual reality system, and propose verbal/non-verbal communication between a human being and a computer system. The interface accepts spoken language and non-verbal behavior at the same time in order to realize verbal/non-verbal communication, which brings the virtual reality system closer to a real environment. For example, a user can ask a question or issue an order about an object in spoken language while pointing at the object with a data glove.

We selected assembly/disassembly of a virtual machine as the field of application of verbal/non-verbal communication. We built an assembly training system that has a user acquire the right assembly/disassembly method by letting the user simulate assembly operations and issue inquiries or commands to the system, using a data glove and spoken language, in a virtual reality space in which three-dimensional object models of mechanical parts are arranged.

2 System configuration


2.1 Hardware organization
The system consists of a computer on which the virtual reality system is built, a microphone with which a user performs voice input, and 3-dimensional position sensors and data gloves with which a user inputs non-verbal behavior. The general organization is shown in Figure 1.


Figure 1. Hardware organization

2.2 Continuous speech recognition parser (JULIAN)
As the speech recognition software, this research uses JULIAN, developed by Prof. Doshita's research laboratory at Kyoto University. JULIAN is a recognition parser that performs continuous speech recognition on the basis of a finite state grammar (DFA). For voice input from the microphone (continuous speech delimited by pauses), it searches for the most plausible word sequence according to the given DFA and outputs it as a character string. The DFA is generated from the vocabulary and the syntax rules that a user registers; an illustrative registration is sketched below.
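For illustration, a registration might look like the following. This is a hedged sketch in the style of the dictionary format shown later in Figures 8 and 9 (and of JULIAN's successor Julius); the exact file format of the 1999 JULIAN release may differ, and all entries here are hypothetical.

    # Syntax rules: category sequences accepted by the DFA
    S    : NS_B SENT NS_E        # sentence between leading/trailing silence
    SENT : OBJ WO OPV-A AUX-A    # "<object> wo <verb stem> <imperative>"

    # Word dictionary: words grouped under % categories, with phoneme strings
    % OBJ
    HOLDER      h o r u d a a
    % WO
    WO          w o
    % OPV-A
    INSTALL     t o r i t u k e
    % AUX-A
    IMPERATIVE  r o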


2.3 OpenInventor
To build the virtual reality system, three-dimensional surface models are used, constructed with OpenInventor [4], a three-dimensional graphics library from SGI.

3 Assembly training system

3.1 System configuration
This system consists of three parts: the assembly of a virtual machine, a spoken language processing unit, and a non-verbal behavior analysis unit. The main facility of each part is summarized in Figure 2; each function is described in detail later.

[Figure 2 diagram: the assembly part handles definitions of the operating procedure, definitions of mechanical parts, display of the virtual machine, and responses to a user's questions or commands; the spoken language processing unit performs speech recognition of commands and questions; the non-verbal behavior analysis unit analyzes hand position.]

Figure 2. System configuration

3.2 Verbal/non-verbal interface
In this system we use an interface with which a user can input spoken language and non-verbal behavior simultaneously. Consequently, a user has only to speak toward a microphone to issue voice input, without any keyboard action. The operation methods peculiar to this interface are the following; a sketch of how demonstratives are bound to pointed parts follows the list.

* With a traditional interface, a unique name must be used to distinguish an object from others. But when there are many identical objects, as with mechanical parts, it is difficult to designate one of them by name. With this interface, designation is easily realized simply by pointing at or grasping the object: a user can say "Install this part on that." while pointing at the two objects with a data glove.

* A user need not memorize the identifiers of object parts, because directives (demonstratives) are admitted. A user can order the system to perform an assembly operation; if the user wants to interrupt the operation while the system is executing it, the user can have the system suspend it by uttering a phrase or sentence that means suspension.
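The following C++ fragment sketches how such deictic references can be resolved; it is our illustration under assumed names (Part, indicated, and resolveDeictics are not from the paper), not the authors' implementation.

    // Demonstratives recognized in the utterance are bound, in order, to the
    // parts most recently indicated with the data glove.
    #include <cstddef>
    #include <deque>
    #include <string>
    #include <vector>

    struct Part { std::string name; int number; };

    // Parts the glove has pointed at or grasped, in the order they occurred.
    static std::deque<Part*> indicated;

    // Replace each demonstrative slot ("this", "that") in the parsed command
    // with a concrete part, consuming the pointing history front to back.
    std::vector<Part*> resolveDeictics(const std::vector<std::string>& slots) {
        std::vector<Part*> objects;
        std::size_t next = 0;
        for (const std::string& s : slots)
            if ((s == "this" || s == "that") && next < indicated.size())
                objects.push_back(indicated[next++]);  // bind to pointed part
        return objects;  // e.g. {bearing, shaft} for "Install this part on that."
    }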

    3.3 Model of mechanical partThe model of mechanical part at the initial state of assemblyis shown in Figure 3.

    support lineFigure 3. Initial status of assembly

3.4 Definitions of operating procedure
In this system the operating procedure (AND/OR procedure) is defined with an AND/OR graph, as shown in Figure 4. Hereafter we call the component to be moved after a user has selected it, with a data glove or voice input, the mvobject, and the partner part into which the mvobject is installed the basic part. Each node in the AND/OR graph shown in Figure 4 (for example START, END, and the points 1 to 8) expresses an assembly state of the given assembly. An assembly operation along an arc of the graph (an operation) is necessary to change the state of the assembly; each operation describes an operating instruction and the object parts (mvobject, basic part). All nodes of the AND/OR procedure shown in Figure 4 are OR nodes; in other words, a user has only to follow the graph sequentially from the top toward the bottom.


For example, [START - 1 - 5 - END] and [START - 3 - 8 - END] each show one assembly procedure. A data-structure sketch follows Figure 4.

    Figure 4. Operating procedure based on AND/OR graph
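The AND/OR procedure can be pictured with a data structure such as the following; this is a minimal sketch reconstructed from the description above (the type and field names are ours, not the paper's).

    #include <string>
    #include <vector>

    // An arc of the AND/OR graph: one assembly operation.
    struct Operation {
        std::string instruction;  // operating instruction, e.g. "install"
        int mvobject;             // part number of the component being moved
        int basicPart;            // part number it is installed into
        int nextState;            // index of the state the operation leads to
    };

    // A node of the graph: one assembly state ("START", "1".."8", "END").
    // All nodes are OR nodes, so any outgoing arc is a valid next step.
    struct State {
        std::string label;
        std::vector<Operation> arcs;
    };

    // A procedure such as [START - 1 - 5 - END] is one path through the graph.
    using AndOrProcedure = std::vector<State>;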

3.5 Definitions of mechanical part
In this system mechanical parts are defined with a scene graph [4], as shown in Figure 5. MyParts is a data node in which a part name and a part number are described. The part name corresponds to the voice input from a user; the part number is used for describing the object part in the AND/OR procedure. A code sketch follows Figure 5.

[Figure 5 scene graph: a Separator groups the MyParts data node (name: key shaft, number: 2) with Transform, Material, Coordinate, and FaceSet nodes.]
MyParts: definition of a part name and a part number
Transform: definition of a part position
Coordinate, FaceSet: definition of a part shape

Figure 5. Definition of mechanical part
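A part's subgraph could be assembled with the standard Open Inventor node classes roughly as follows. This is a sketch under assumptions: the paper's MyParts is a custom data node, which we approximate with an SoInfo carrying "name:number", and the helper makePart is ours.

    #include <string>
    #include <Inventor/nodes/SoSeparator.h>
    #include <Inventor/nodes/SoTransform.h>
    #include <Inventor/nodes/SoMaterial.h>
    #include <Inventor/nodes/SoCoordinate3.h>
    #include <Inventor/nodes/SoFaceSet.h>
    #include <Inventor/nodes/SoInfo.h>

    // Build the subgraph of Figure 5 for one mechanical part.
    // (SoDB::init() must have been called before any node is created.)
    SoSeparator* makePart(const char* name, int number) {
        SoSeparator* part = new SoSeparator;    // groups the whole part

        SoInfo* myParts = new SoInfo;           // stand-in for the MyParts node
        std::string data = std::string(name) + ":" + std::to_string(number);
        myParts->string.setValue(data.c_str()); // part name and part number
        part->addChild(myParts);

        part->addChild(new SoTransform);        // part position
        part->addChild(new SoMaterial);         // color, changed on selection
        part->addChild(new SoCoordinate3);      // vertices of the part shape
        part->addChild(new SoFaceSet);          // faces over those vertices
        return part;
    }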

3.6 Selection/operation of parts
To select parts, the system provides the method of orally inputting a part name, and the method of grasping or pointing at the object with a data glove in the virtual reality system.

In teaching assembly, it is important to have the user simulate the assembly operation using the data glove. But if the system shows the user how to move the specified object from the initial state to the completed state of the assembly operation, we think that is helpful enough for the user to realize the same operation, so we also prepare a method that allows a user to have the system operate the specified object.

3.7 Selection of parts with data glove
- When the data glove takes the attitude of pointing at an object, as shown in Figure 6, the system decides that the part has been selected that is nearest to the hand and included in the neighborhood of a straight line extended from the forefinger, and changes its color.

- When the palm of the data glove is being closed to grasp a part, as shown in Figure 7, and the bounding box of the forefinger and the bounding box of the part interfere with each other, the system decides that the user has grasped the part and again changes its color. A sketch of the two tests follows Figure 7.

    Figure 6. Selection of part by pointing action

[Figure 7 labels: bounding box of a data glove; interference of bounding boxes.]
Figure 7. Selection of parts by grasping operation
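The two selection tests reduce to simple geometry; the sketch below is our reconstruction with axis-aligned boxes and a normalized pointing direction (the helper names are hypothetical, and the paper's actual bounding boxes come from Open Inventor).

    #include <cfloat>
    #include <cmath>

    struct Vec3 { float x, y, z; };
    struct Aabb { Vec3 min, max; };   // axis-aligned bounding box

    // Pointing: choose the part nearest the hand whose center lies within
    // `tol` of the straight line extended from the forefinger.
    // `dir` is the unit pointing direction, `tip` the fingertip position.
    int pickByPointing(const Vec3& tip, const Vec3& dir,
                       const Vec3* centers, int n, float tol) {
        int best = -1;
        float bestT = FLT_MAX;
        for (int i = 0; i < n; ++i) {
            float vx = centers[i].x - tip.x, vy = centers[i].y - tip.y,
                  vz = centers[i].z - tip.z;
            float t = vx * dir.x + vy * dir.y + vz * dir.z; // distance along ray
            if (t < 0.0f) continue;                          // behind the hand
            float px = vx - t * dir.x, py = vy - t * dir.y, pz = vz - t * dir.z;
            float off = std::sqrt(px * px + py * py + pz * pz); // off-axis dist
            if (off < tol && t < bestT) { bestT = t; best = i; }
        }
        return best;  // index of the selected part, or -1
    }

    // Grasping: the part counts as grasped when the forefinger's bounding box
    // and the part's bounding box interfere (overlap on all three axes).
    bool interferes(const Aabb& a, const Aabb& b) {
        return a.min.x <= b.max.x && b.min.x <= a.max.x &&
               a.min.y <= b.max.y && b.min.y <= a.max.y &&
               a.min.z <= b.max.z && b.min.z <= a.max.z;
    }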


4 Spoken language processing
4.1 Natural language processing
A spoken language input from a user (in Japanese) is converted into a character string by JULIAN. A natural language processing program then analyzes the character string from the speech recognition, and the semantics of the utterance is extracted. A user must register with JULIAN the words and syntax used in the speech recognition, as described in Section 2.2; the dictionary made at that time is also available to the language processing. Examples of the word dictionary and the syntax dictionary are shown in Figures 8 and 9.

[Figure 8 excerpt, word dictionary; the Japanese entries are garbled in the source. Each category is introduced by a "%" line, its semantics given in a "##" comment, followed by the words to be recognized:]

    ## An object of operation
    %OBJ
    ## Method of operation
    %OPV-A
    ## The end of a verb (imperative)
    %AUX-A
    ## Particle (wo)
    %WO

Figure 8. A part of the word dictionary

The syntax rules are registered taking the categories registered in the word dictionary as non-terminal symbols:

    OBJ WO OPV-A AUX-A
    OPV-A AUX-A OBJ WO

Figure 9. A part of the syntax dictionary

Semantic analysis is done in top-down fashion. When a sentence "Assemble the hexagon head bolt." is input, the input sentence is matched to the syntax [OBJ WO OPV-A AUX-A] in the syntax dictionary, and the categories shown in Figure 10 are obtained. Because a category is registered corresponding to the function of a word, the semantics of each word can be determined. At this time the system judges whether the content of the sentence can be handled or not. To increase the number of sentences that can be understood, one has only to add words belonging to a category, or categories and syntax rules. Inversion expressions and multiple expressions of a sentence are also accepted: the second rule in the syntax rules shown in Figure 9 is the inversion form of the first. The flow of the process is shown in Figure 10; a small code sketch of the matching follows its caption.

[Figure 10 diagram: the recognition result is matched against a syntax rule, yielding for each word the semantics of its category, e.g. OBJ (an object of operation), OPV-A (method of operation), AUX-A (the end of a verb, imperative).]

Figure 10. A result of language processing
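As a rough illustration of the top-down matching (our reconstruction, with English stand-ins for the Japanese words that are garbled in the source; none of these names are from the paper):

    #include <cstddef>
    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    // word -> category, from the same dictionary registered with JULIAN
    const std::map<std::string, std::string> category = {
        {"hexagon head bolt", "OBJ"}, {"wo", "WO"},
        {"assemble", "OPV-A"}, {"<imperative>", "AUX-A"},
    };

    // A sentence matches a syntax rule when its words' categories equal the
    // rule's non-terminals in order; the matched pairs carry the semantics.
    bool matchRule(const std::vector<std::string>& words,
                   const std::vector<std::string>& rule,
                   std::vector<std::pair<std::string, std::string>>& out) {
        if (words.size() != rule.size()) return false;
        out.clear();
        for (std::size_t i = 0; i < words.size(); ++i) {
            auto it = category.find(words[i]);
            if (it == category.end() || it->second != rule[i]) return false;
            out.emplace_back(rule[i], words[i]); // e.g. ("OBJ", "hexagon head bolt")
        }
        return true;
    }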

4.2 Constructing the contents of dialog
The analysis result provided by the natural language processing is combined with the knowledge of assembly and stored in a list called a contents list. When "Assemble the hexagon head bolt." is the recognized character string, a contents list as shown in Figure 11 is made.

[Figure 11 contents list: order = operation; operation = assemble; object1 = hexagon head bolt; object2 = nothing.]

    Figure 11. Contents list

A key word corresponding to the contents of the sentence is stored in the first line of the contents list; the system distinguishes the contents of the utterance by this key word. If some assembly operation is necessary, a method realizing the operation is put in the second line, and the objects are entered in the 3rd and 4th lines. Because mainly two parts are selected as the objects of one operation, the 3rd and 4th lines are prepared. A struct sketch follows.
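In code, the four-line contents list is essentially a small record; the struct below is our sketch (the field names are ours).

    #include <string>

    struct ContentsList {
        std::string order;      // 1st line: key word for the kind of utterance
        std::string operation;  // 2nd line: method realizing the operation
        std::string object1;    // 3rd line: first object part
        std::string object2;    // 4th line: second object part, or "nothing"
    };

    // For "Assemble the hexagon head bolt.":
    //   { "operation", "assemble", "hexagon head bolt", "nothing" }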


4.3 Estimation of a contents list
The system makes an appropriate response by matching a contents list against information about the assembly at hand. As this process depends on the contents of the utterance, we explain it in detail with typical examples in Section 5.

4.4 Flow of process
The flow of spoken language processing is as follows. First, a user issues an inquiry or command to the system in spoken language. Natural language processing is then applied to it; if the system can accept the contents, a contents list is made, otherwise the user must repeat the voice input. When a contents list is made, the system matches the contents against the information about the current assembly and makes a response. A loop sketch is given below.
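The whole flow can be condensed into a loop like the one below, reusing the ContentsList sketch above; recognize, analyze, and matchWithAssembly are hypothetical stand-ins for JULIAN and the processing of Sections 4.1-4.3, declared here without bodies.

    #include <iostream>
    #include <optional>
    #include <string>

    std::string recognize();                           // JULIAN: speech -> text
    std::optional<ContentsList> analyze(const std::string& s); // empty if the
                                                       // contents are rejected
    std::string matchWithAssembly(const ContentsList& c);      // Section 4.3

    void dialogLoop() {
        for (;;) {
            auto contents = analyze(recognize());
            if (!contents)
                continue;                // the user must repeat the voice input
            std::cout << matchWithAssembly(*contents) << std::endl;  // respond
        }
    }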

5 Examples of interaction
Two interaction examples are shown; they typify verbal/non-verbal communication between a human being and a computer system.

5.1 Example in which a user asks the system a part name
A user can ask the system for a part name while pointing at a part with the data glove. If a part is selected with the data glove, the system gets the MyParts node (Figure 5) of the selected part and tells the user the part name described in it. A typical interaction is the following: a user points at a bearing (holder) with the data glove while saying "What is this?"; as a result the response "This is a holder." is generated. Figure 12 shows the contents list made then, and Figure 13 shows the circumstance in which the holder is pointed at with the data glove.

[Figure 12 contents list, rows: order; operation (Japanese entry garbled in the source); object1 = nothing; object2 = nothing.]

Figure 12. The contents list of the dialog

Figure 13. The circumstance in which the key shaft is pointed at

5.2 Example in which a user has the system operate a part
If an operating instruction and an object are given to the system, it performs the assembly. When an operation command is given by a user, the system matches the operations described in the AND/OR procedure of Section 3.4 against the contents list of the given utterance, and a response is made. If the operation command fits an operation in the AND/OR graph, the operation is performed. When the operating instruction is wrong, the response "The given operation is wrong because it is impossible to carry it out." is generated. When the operating instruction is right but the part to be operated is wrong, the response "The part to be operated is wrong." is generated. If there are several parts with the same name as the object designated by voice input, the system executes the given instruction on finding an instance of the operational part. A matching sketch follows.
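The match in this subsection can be sketched as follows, reusing the Operation and ContentsList sketches above (the function and its return convention are ours; the response strings paraphrase the ones quoted in the text).

    #include <string>
    #include <vector>

    // Match a command's contents list against the operations leaving the
    // current assembly state; mvPartNumber is the part selected by the user.
    std::string matchOperation(const ContentsList& c,
                               const std::vector<Operation>& arcs,
                               int mvPartNumber) {
        bool instructionFits = false;
        for (const Operation& op : arcs) {
            if (op.instruction != c.operation) continue;
            instructionFits = true;                   // right instruction...
            if (op.mvobject == mvPartNumber)
                return "performed";                   // ...and right part
        }
        return instructionFits
            ? "The part to be operated is wrong."
            : "The given operation is wrong because it is impossible to carry it out.";
    }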

A typical interaction is the following. In the assembly shown in Figure 15, assume that the user grasps a bearing (holder) with the data glove while saying "Install this part." Figure 14 shows the contents list made then, Figure 15 shows the bearing being grasped with the data glove, and Figure 16 shows the situation after the bearing has been completely installed.


[Figure 14 contents list, rows: order; operation (Japanese entry garbled in the source); object1; object2 = nothing.]

Figure 15. Grasp of a part with a data glove

Figure 16. A situation in which the bearing is completely installed

As the given operation command fits an operation described in the AND/OR procedure, the system performed the installation operation of the bearing.

6 Conclusion
In this research, verbal/non-verbal communication between a human and a computer system in virtual space is proposed, and applied to the concrete example of the mechanical assembly domain. The system accepts questions and commands from a user in spoken language, and succeeds in correctly interpreting the user's intention and maintaining the communication between the user and the system. Using non-verbal operations such as a pointing action, a user was able to point out an object simply. Because there is no depth cue in the virtual space used in the system, however, it is often difficult for a user to grasp or point at an object using the data glove.

This paper has mainly reported the way a user instructs the system to assemble a virtual machine. We have already reported the way a system watches a user's behavior while he/she is assembling a virtual machine; but that system simply pointed at the part operated erroneously, and no instruction was given to the user. An avatar should give this instruction, with a spoken explanation, while manipulating the mechanical parts; in the same way, for questions from a user, the avatar should answer with gestures and spoken language. If such bi-directional verbal/non-verbal communication can be realized, a user has only to imitate the avatar's actions, which means that mutual comprehension will be promoted.

7 References

[1] Norihiro Abe and Saburo Tsuji, "A consulting system which detects and undoes erroneous operations by novices," Proc. of SPIE, pp. 352-358, Oct. 1986.
[2] Norihiro Abe, Tomohiro Amano, Kazuaki Tanaka, J. Y. Zheng, Shoujie He, and Hirokazu Taki, "A Training System for Detecting Novices' Erroneous Operation in Repairing Virtual Machines," International Conference on Virtual Reality and Tele-Existence (ICAT), pp. 224-229, 1997.
[3] Norihiro Abe, J. Y. Zheng, Kazuaki Tanaka, and Hirokazu Taki, "A Training System using Virtual Machines for Teaching Assembling/Disassembling Operations to Novices," International Conference on Systems, Man and Cybernetics, pp. 2096-2101, 1996.
[4] J. Wernecke, The Inventor Mentor, Addison-Wesley Publishing Company, 1994.
[5] Norihiro Abe, Atsushi Wada, Kazuaki Tanaka, J. Y. Zheng, Shoujie He, and Hirokazu Taki, "Verification of Assemblability of Mechanical Parts and Visualization of Machinery of Assembly in Virtual Space," International Conference on Virtual Reality and Tele-Existence (ICAT), pp. 208-215, 1997.
