Augmented Reality Face Recognition For Mobile Devices

Gerry Hendrickx
Faculty of Engineering, Department of Computer Science
Katholieke Universiteit Leuven
[email protected]

Abstract

This paper covers the current status and past work of the creation of an iPhone application. The application under development uses face recognition provided by Face.com [1] to recognize faces seen through the camera of the iPhone. Via augmented reality, the recognized faces are named and give the user access to information from different social networks. This paper describes the research, the related work, the user interface prototyping and the current state of the implementation.

1. INTRODUCTION

Over the years, a lot of research has been done on augmented reality (AR). AR is a technology that augments a user's view of the world. This can be done using head-mounted displays (HMDs) or smartphones. The technology can add extra information to the screen, based on geolocation or pattern detection. Geolocation apps have seen a big rise on smartphones in recent years, with Layar as one of the leading apps [2]. Pattern detection apps are not yet as common, so they still offer a lot of unexplored possibilities. This paper describes the process of creating a pattern detection app for the iPhone. It will use face recognition to offer the user extra information about the persons seen through the camera. This extra information will come from various social networks, the most important being Facebook, Twitter and Google+. The goal is to offer users fast access to data and information that is publicly available on the internet. This can be used in a private context, to enhance conversations and find common ground with a discussion partner, or in an academic context, to easily find information, such as slides or publications, of the speaker at an event you are attending. The app is created for iOS because the SDK offers a built-in face detection mechanism since iOS 5 [3]. This feature drastically simplifies the tracking of faces. Android, the other option, does not have this built-in feature, so the choice was not hard to make.

2. OBJECTIVE

A brainstorm and a poll on Facebook resulted in a list of desired requirements for the face recognition app. First of all, the application should work fast. Holding your smartphone up and pointed at another person is quite cumbersome, so the faster the recognition works, the smaller this problem becomes. Another requirement is privacy. A second Facebook poll revealed that out of 34 voters, 14 did not like the idea of being recognized on the street. There was a strong need for privacy, so after looking at the Face.com API, the following policy was decided upon. Face.com allows a user access to all of his Facebook friends. The app could be limited to this, but the need to recognize your Facebook friends is lower than the need to recognize strangers. Face.com also allows private namespaces per app, which means an app can create its own domain in which all users of the app are stored. If a namespace is used, users of the app can recognize other users. The general policy will therefore be: if you use the app, you can be recognized. An eye for an eye. The functional requirements from the poll and brainstorm, and thus the goals to achieve, are the following:

• Detection and recognition of all faces in the field of view.

• Once recognized, the name and other options (like available social networks) will appear on screen with the face.

• Contact info will be fetched from the networks and can be saved in the contacts app.

• Quick access to all social networks will be available, along with some basic information such as the last status update/tweet.

The information available will differ from person to person. To add an extra layer of privacy settings, a user will be able to link his networks and choose to enable or disable them. When the user gets recognized, only his enabled networks will show up.
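As a minimal sketch of this privacy model, the per-user settings could be represented as follows. All type and property names here are hypothetical, chosen for illustration rather than taken from the actual application:

```swift
// Illustrative model of the link/enable policy described above.
enum SocialNetwork: String, CaseIterable {
    case facebook, twitter, googlePlus
}

struct UserProfile {
    let faceComUID: String                        // ID used by the recognition service
    var linkedNetworks: Set<SocialNetwork> = []   // networks the user has connected
    var enabledNetworks: Set<SocialNetwork> = []  // networks the user allows others to see

    /// Only networks that are both linked and enabled are shown
    /// when this user is recognized.
    var visibleNetworks: Set<SocialNetwork> {
        linkedNetworks.intersection(enabledNetworks)
    }
}
```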

3. RELATED WORK AND RESEARCH

3.1. Related Work

For this master's thesis we searched for related mobile applications. No application was found that does exactly the same as this project. However, some similar applications were found.

• Viewdle [5]: Viewdle is a company focusing on face recognition. They have several ongoing projects and have already created an application for Android, called Viewdle Social Camera, which recognizes faces in images based on Facebook. Viewdle has a face recognition iOS SDK available, but after contacting the company multiple times without receiving a reply, we moved on to Face.com.

• Animetrics [6]: Animetrics creates apps to help government and law enforcement agencies. It has multiple products, like FaceR MobileID, which can be used to get the name and match percentage of any random person, and FaceR CredentialME, which can be used for authentication on your own smartphone: it recognizes your face and, if it matches, unlocks your data. Animetrics also focuses on face recognition for home security. However, they do not seem to have an API, which is a pity because their technology seems promising.

• TAT Augmented ID [7]: TAT Augmented ID is a concept app. It is essentially the idea described in this paper: it recognizes faces in real time and uses augmented reality to display icons around the face. The concept is the same, but the resulting user interface is different.

Another, non-commercial related work is a 2010 master's thesis at the Katholieke Universiteit Leuven [10]. The author created an HMD-based application to recognize faces and retrieve information. From his work we learned that HMDs are not an ideal practical setup (his app required a backpack with a laptop and heavy headgear) and that the technology used (OpenGL) is a cumbersome way to develop. Using iOS simplifies these aspects. The author used Face.com as the face recognition API and was very satisfied with it.

3.2. Face Recognition APIs

A couple of promising face recognition APIs were found and compared in order to find the one most suited to our goals. A quick summary of their positive and negative points:

• Viewdle: As mentioned above, we tried to contact Viewdle to get more information about the API. Sadly, they did not respond, so Viewdle is not an option.

• Face.com: Face.com offers a REST API and is well documented. It offers Facebook and Twitter integration and a private namespace that will help with the privacy concerns. There is a rate limit on the free version of Face.com.

• Betaface [8]: The only API that works with both images and video. However, it is Windows-only, has not been used with iOS yet and is not free.

• PittPatt [9]: PittPatt was a very promising service, but sadly it was acquired by Google. The service cannot be used at this time.

Face.com thus turned out to be the only remaining option, and luckily the best one found. It has an iOS SDK and social media integration, which are both very useful in the context of this application.
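Because Face.com exposes a REST API, a recognition request boils down to a single HTTP call. The sketch below, in modern Swift, shows roughly what such a call could look like; the endpoint and parameter names follow the Face.com documentation of the time (the service has since shut down), so they should be treated as assumptions rather than a working integration:

```swift
import Foundation

// Rough sketch of a Face.com REST recognition call.
// Endpoint and parameter names are assumptions based on the docs of the time.
func recognizeFaces(imageURL: String, apiKey: String, apiSecret: String,
                    completion: @escaping (Data?) -> Void) {
    var components = URLComponents(string: "http://api.face.com/faces/recognize.json")!
    components.queryItems = [
        URLQueryItem(name: "api_key", value: apiKey),
        URLQueryItem(name: "api_secret", value: apiSecret),
        URLQueryItem(name: "urls", value: imageURL),
        // "all" would search every user in the app's private namespace.
        URLQueryItem(name: "uids", value: "all@myAppNamespace")
    ]
    URLSession.shared.dataTask(with: components.url!) { data, _, _ in
        completion(data)   // JSON with photos -> tags -> candidate uids
    }.resume()
}
```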

4. PAPER PROTOTYPING

Paper prototyping is the process of designing the user interface based on quick drawings of all the different parts of the UI. By using paper parts, it becomes easy to quickly evaluate and adapt the interface. The prototyping phase consisted of three iterations: the interface decision, the interface evaluation and the expert opinion.

4.1. Phase one: interface decision

The first phase of the paper prototyping was to decide which of three possible interfaces would be used. The interfaces were:

• Interface 1: A box of information attached to the head of the recognized person. This makes the best use of augmented reality, but you have to keep your phone pointed at the person in order to read the information.

• Interface 2: A sidebar with information, taking up about a quarter of the screen. This way, users can lower their phone when a person is selected, but they can still use the camera if they want.

• Interface 3: A full-screen information screen. This makes minimal use of augmented reality but offers a practical way to view the information. Users see the name of the recognized person in the camera view and, once it is tapped, are referred to the full-screen information window.

These interfaces were evaluated with 11 test subjects, aged 18 to 23, with mixed smartphone experience. The tests were done using the think-aloud technique, which means the subjects have to say what they think will happen when they tap a button, while the interviewer plays computer and changes the screens. The same simple scenario was given for all interfaces, in which the test subject needed to recognize a person and find information about him. After the test, a small questionnaire was given to poll the interface preference. None of the users picked interface 1 as their favorite: the fact that you have to keep your phone pointed at the person in order to read and browse through the information proved to be a big disadvantage. The choice between interfaces 2 and 3 was harder. People liked that the camera remains visible in interface 2, but they also realized that, if interface 1 taught us that you would not keep your camera pointed at the crowd, you would not do so in interface 2 either, so the camera would be showing your table or pants. This reduces the use of the camera feed. The smartphone owners also pointed out that using only part of the screen would leave too little room to display readable information. In the end, 27% chose interface 2 and 73% chose interface 3. Interface 3 was therefore chosen and elaborated for the second phase of paper prototyping.

4.2. Phase two: interface evaluation

For the second phase, 10 different test subjects were used. An extended scenario was created that explored the full scope of functionality in the application. The testers needed to recognize people, adjust settings, link social networks to their profile, indicate false positives and manage the recognition history. Think-aloud was applied once again. At the end of each prototype test, the tester filled in a USE questionnaire [4]. This questionnaire consists of 30 questions, divided into four categories that poll different aspects of the evaluated application: usefulness, ease of use, ease of learning and satisfaction.

4.2.1. Usefulness. The overall results were good. People seemed to understand why the app is useful, and it does what they would expect. The scores on whether it meets the needs of the users were divided, because some users (especially those with privacy concerns or without a smartphone) did not see the usefulness of the application. However, the target audience (smartphone users) did see its usefulness, resulting in higher scores in this section.

4.2.2. Ease of use. From the ease of use questions, the need to add or rearrange buttons became clear. Users complained about the number of screen transitions it took to get from one screen to another in the application. This calls for more buttons to ease the navigation. For instance, a home button leading to the main screen will be added on several other screens, instead of forcing users to navigate back through all previous screens. Using the application was effortless for all iPhone users, because the user interface was built using the standard iOS interface parts.

4.2.3. Ease of learning. None of the ease of learning questions scored below 5 on the 7-point Likert rating scale. This is also due to the standard iOS interface, which is designed by Apple to be easy to work with and easy to learn.

4.2.4. Satisfaction. Most people were satisfied with the functionality offered by the application and how it was presented in the user interface. Especially the iPhone users were enthusiastic, calling it an innovative, futuristic application. Non-smartphone users were more skeptical and did not see the need for such an application. Aside from this, the application was fun to use and the target audience was satisfied.

4.2.5. Positive and negative aspects. The users were asked to list the positive and negative aspects of the application. The positive aspects were the iOS style of working and the functionality and concept of the application. The negative aspects were more user interface related, such as too few home buttons and the cumbersome way of indicating a false positive. That button was placed on the full-screen information window of a person, and everybody agreed that this was too late, because all the data of the wrongly tagged person would already be displayed. The incorrect-tag button should therefore be placed on the camera view. Some useful ideas were also suggested, such as enabling the user to follow a recognized person on Twitter.

4.3. Phase three: expert opinions

For this phase, the promoter and the tutors of the thesis served as test subjects. They all have extensive experience in the field of human-computer interaction and can thus be seen as experts. They took the tests and gave their opinions on several aspects of the program. A small summary:

• There were concerns about the image quality of the different iPhones. Tests should be done to determine from what distance a person can be recognized.

• The app should be modular enough. In a rapidly evolving Web 2.0 world, social networks may need to be added to or removed from the application. If all the networks are implemented as modules, this becomes a simpler task (see the sketch after this list).

• The incorrect-tag button could be implemented in the same way the iPhoto app asks the user whether an image is tagged correctly.

• The social network information should not just be static info. The user should be able to interact directly from within the application. If this is not possible, it would be better to refer the user directly to the Facebook or Twitter app.

• More info could be displayed in the full-screen information window. Instead of showing links to all networks, the general information about the person could already be displayed there.
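To make the modularity suggestion concrete, the sketch below shows one way the networks could be hidden behind a common protocol. All names are illustrative and not part of the current implementation:

```swift
// One protocol per network capability: adding or removing a network
// means adding or removing one conforming type, nothing else.
protocol SocialNetworkModule {
    var name: String { get }
    /// Fetches the basic public info shown on the person screen,
    /// e.g. the latest status update or tweet.
    func fetchBasicInfo(forUserID id: String,
                        completion: @escaping (String?) -> Void)
}

struct TwitterModule: SocialNetworkModule {
    let name = "Twitter"
    func fetchBasicInfo(forUserID id: String,
                        completion: @escaping (String?) -> Void) {
        // The actual Twitter API call is omitted in this sketch.
        completion(nil)
    }
}

// The app would hold an array of modules and iterate over it, so
// adding Google+ or Mendeley later is one new conforming type.
let modules: [SocialNetworkModule] = [TwitterModule()]
```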

When asked which social networks they would like to see in the application, nearly everybody said Facebook, Twitter and Google+. In an academic context, they would also like to see Mendeley and SlideShare.

5. IMPLEMENTATION

With the results of all paper prototyping tests being positive, the next step is the implementation. The app is currently in development, and a small base is working. The main focus so far has been on the crucial functionality, the face recognition. It is important to get this part up and running as fast as possible, because the entire application depends on it. So far the application is able to track faces using the iOS 5 face detection. A temporary box frames each face and follows it as the camera or person moves. This functionality could be used to test the quality of the face detection API. As can be seen in figure 1, the face detection algorithm of iOS 5 can detect multiple faces at once, and at such depth that the smallest recognized face is barely bigger than a button. These are great results: the algorithm appeared to be fast, reliable and detailed enough for our purpose. Detection of even smaller faces is not necessary, because the boxes would become harder to tap if they were smaller than a button.

Figure 1. Face tracking results.
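The detection itself relies on Core Image's CIDetector, the iOS 5 API cited in [3]. A minimal sketch in modern Swift (the exact option set used by the app is an assumption):

```swift
import CoreImage

// Face detection as described above: CIDetector with tracking enabled,
// so detected faces keep a stable identity while the camera or person moves.
let detector = CIDetector(ofType: CIDetectorTypeFace,
                          context: nil,
                          options: [CIDetectorAccuracy: CIDetectorAccuracyHigh,
                                    CIDetectorTracking: true])

// Returns the bounding box of every detected face in image coordinates;
// the app draws its temporary boxes from these rectangles.
func faceRects(in image: CIImage) -> [CGRect] {
    let features = detector?.features(in: image) ?? []
    return features.compactMap { ($0 as? CIFaceFeature)?.bounds }
}
```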

The next step was the face recognition, which is handled by Face.com. An iOS SDK is available on the website [11]. This SDK contains the functionality to send images to the Face.com servers and receive a JSON response with the recognized faces. It also covers the necessary Facebook login, as Face.com requires the user to log in with his Facebook account; this login is only needed once. One problem was that Face.com only accepts images, not video. To get the face recognition working as fast as possible, a recognize button was added to the camera view. Once it is tapped, a snapshot is taken with the camera. This snapshot is sent to the servers of Face.com and analyzed for faces. The JSON response is parsed, and the match percentage and the list of best matches can be fetched from it. At the moment, only one face can be recognized at a time, because there is no way yet to check which part of the response should be matched to which face on the camera. This is temporarily solved by limiting the program to one face at a time. Figure 2 shows the current status of the application: a face is recognized and its Facebook ID is printed above it.
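A sketch of how such a JSON response can be parsed is shown below. The key names (photos, tags, uids, confidence) follow the Face.com response format as documented at the time and should be treated as assumptions:

```swift
import Foundation

struct RecognitionMatch {
    let uid: String        // e.g. a Facebook ID in the app's namespace
    let confidence: Int    // match percentage reported by the service
}

// Walks the assumed photos -> tags -> uids structure and collects
// every candidate match with its confidence percentage.
func parseMatches(from data: Data) -> [RecognitionMatch] {
    guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
          let photos = root["photos"] as? [[String: Any]] else { return [] }
    var matches: [RecognitionMatch] = []
    for photo in photos {
        for tag in photo["tags"] as? [[String: Any]] ?? [] {
            // Each tag lists candidate uids, best match first.
            for uid in tag["uids"] as? [[String: Any]] ?? [] {
                if let id = uid["uid"] as? String,
                   let conf = uid["confidence"] as? Int {
                    matches.append(RecognitionMatch(uid: id, confidence: conf))
                }
            }
        }
    }
    return matches
}
```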

Figure 2. Face recognized and matched with the correct Facebook ID.

6. NEXT STEPS AND FUTURE PROBLEMS

The next step in development is the further creation of the user interface. Now that a basic implementation of the main functionality exists, it is important to finish the outlines of the application. This way, a dummy implementation of several screens can be used to test the interface in several iterations with the digital prototype. While these tests take place, the underlying functionality can be extended and implemented in the background, so the two tasks can be combined. If the full underlying functionality were implemented first, the user tests would start too late; in case big UI problems arise, it is better to catch them early in the process.

Several big problems need to be solved. The biggest is matching the faces detected by the iOS 5 face detection with the faces recognized by Face.com. Because Face.com recognizes faces in still images, a way needs to be found to match these results to the faces on screen. If the user moves the camera to other people after pressing the recognize button, the results from Face.com will not match the faces on screen. The solution in mind is an algorithm that matches the Face.com results with the face detection based on proportions. If we succeed in finding a correlation between, for instance, the eyes and the nose of a person in both services, it should be possible to determine which detected face matches which Face.com result (see the sketch below).

Another, smaller problem is the use of the Face.com SDK. It has a limited Facebook Graph API built into it, but this API cannot be used to fetch the name of an ID or to get status updates. Therefore the real Facebook iOS SDK should be used. To prevent the app from working with two separate APIs, the Face.com SDK needs to be adapted so that it uses the real Facebook iOS SDK instead of the limited Graph API.
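As an illustration of the proportion-based matching idea, the sketch below pairs each Face.com result with the detected face whose eye midpoint lies closest in normalized image coordinates. The RecognizedTag structure, and the assumption that the service reports eye coordinates per tag, are hypothetical details introduced for this sketch:

```swift
import CoreImage

struct RecognizedTag {
    let uid: String
    let eyeMidpoint: CGPoint   // normalized (0...1) image coordinates
}

// Pairs each recognized tag with the nearest detected face by comparing
// eye midpoints, both converted to normalized image coordinates.
func match(tags: [RecognizedTag], faces: [CIFaceFeature],
           imageSize: CGSize) -> [(uid: String, face: CIFaceFeature)] {
    var pairs: [(uid: String, face: CIFaceFeature)] = []
    for tag in tags {
        let best = faces.min { a, b in
            distance(eyeMidpoint(of: a, in: imageSize), tag.eyeMidpoint) <
            distance(eyeMidpoint(of: b, in: imageSize), tag.eyeMidpoint)
        }
        if let face = best { pairs.append((uid: tag.uid, face: face)) }
    }
    return pairs
}

private func eyeMidpoint(of face: CIFaceFeature, in size: CGSize) -> CGPoint {
    let mid = CGPoint(x: (face.leftEyePosition.x + face.rightEyePosition.x) / 2,
                      y: (face.leftEyePosition.y + face.rightEyePosition.y) / 2)
    return CGPoint(x: mid.x / size.width, y: mid.y / size.height)
}

private func distance(_ a: CGPoint, _ b: CGPoint) -> CGFloat {
    hypot(a.x - b.x, a.y - b.y)
}
```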

7. CONCLUSION

This master's thesis is still a work in progress. We already have good results from the paper prototyping, and the core of the application has been implemented. In the following months, some problems will have to be solved, and a lot of user testing is still required to make the application match its goal: a fast, new way to discover people using the newest technologies and networks.

References

[1] http://www.face.com
[2] http://www.layar.com
[3] https://developer.apple.com/library/mac/documentation/CoreImage/Reference/CIDetector_Ref/Reference/Reference.html
[4] Arnold M. Lund, Measuring Usability with the USE Questionnaire, in Usability and User Experience, vol. 8, no. 2, October 2001, http://www.stcsig.org/usability/newsletter/0110_measuring_with_use.html
[5] http://www.viewdle.com/
[6] http://www.animetrics.com/Products/FACER.php
[7] http://www.youtube.com/watch?v=tb0pMeg1UN0
[8] http://www.betaface.com/
[9] http://www.pittpatt.com/
[10] Niels Buekers, Social Annotations in Augmented Reality, master's thesis, Katholieke Universiteit Leuven, 2010-2011
[11] Sergiomtz Losa, FaceWrapper for iPhone, https://github.com/sergiomtzlosa/faceWrapper-iphone