BSc Informatica

Camera pose estimation with circular markers

Joris Stork

August 13, 2012

Supervisor: Rein van den Boomgaard

Signed:


Estimating the position of a fiducial marker relative to a robot up to a distance of 10 to 20 metres requires only a single camera in the one to two megapixel range, and a pattern produced on an A4 sheet of paper with a consumer-grade inkjet printer. Pose estimation of this type can lower the overall cost of robot navigation systems. A review of related work shows that circular fiducial markers offer superior pose estimation characteristics compared to square fiducial markers, but that no up-to-date implementations based on circular markers are freely available in the public domain. This project implements a technique for estimating the pose of a circular marker. In 2004, Chen et al. described the technique in theoretical terms. Pagani et al. reported in 2011, in general terms, that they had implemented the technique with promising results. That implementation is, however, not freely available. Here, a new implementation of the technique is presented, and its performance is evaluated in detail. This thesis confirms that the technique proposed by Chen et al. does produce estimates that systematically approximate the position of a circular marker within certain parameters. With further development in mind, the system presented here promises to have useful applications in robotic navigation and task execution.


Contents

1 Introduction 5
  1.1 Motivation 5
  1.2 Objective 6
  1.3 Research question 6

2 Background 7
  2.1 Related work 7
  2.2 Choice of technique 10

3 Theory 14
  3.1 Oblique elliptical cone Q 14
  3.2 Oblique circular cone Qc 15
  3.3 Rotation R1 16
  3.4 Rotation R and translation t 16
  3.5 Centre C and normal N of circle 17

4 Implementation 19
  4.1 Framework 19
  4.2 Pose estimation pipeline 24
  4.3 Notes from debugging 29

5 Experiments 34
  5.1 Goal 34
  5.2 Method 34
  5.3 First experiment: real camera 35
  5.4 Second experiment: simulated 39
  5.5 Third experiment: simulated 42

6 Conclusion 56
  6.1 Assessment 56
  6.2 Further work 57
  6.3 Summary 58
  6.4 Acknowledgements 59


CHAPTER 1

Introduction

“The nature of God is a circle of which the center is everywhere andthe circumference is nowhere.” - Empedocles

The scientific literature describes practicable techniques to determine the poses of readily observable instances of ellipses. As used in this document, “pose” is a concept from the field of computer vision, denoting the position and orientation of an object relative to a given coordinate system.

This thesis examines the feasibility of implementing one technique for the estimation of a circle’s pose from its image on a camera sensor.

The remainder of this chapter explains the motivation and aims behind this project, and formulates the research question. Chapter 2 places the research question in a broader context and justifies the choice of pose estimation technique. Chapter 3 explains the technique in theoretical terms, and chapter 4 presents an implementation of the technique. Three series of experiments to evaluate the implementation are detailed in chapter 5. By way of a conclusion, chapter 6 reviews and interprets the experimental findings, before suggesting areas for further work.

1.1 Motivation

The motivation for this project lies both in the general advantage of gaining a greater familiarity with the field of computer vision, and in the specific advantage of obtaining a potentially useful marker-based, single-camera pose estimation system. As shown in section 2.1, the present landscape of such systems offers a limited range of free - as in both “free beer” and “free speech” - implementations with the characteristics needed for indoor and outdoor robotic navigation.

A robotic navigation system centred on a single, simple digital camera, possibly in combination with rudimentary obstacle avoidance sensors, such as ultrasound range finders, promises the advantages of low cost and low weight. In addition, location-dependent task information may be embedded in the data payload of markers deployed for a fiducial pose estimation system. A well designed combination of the pose and task components of such a system could represent a low-cost robotic task execution framework that would be easy to deploy in a wide range of applications and environments. Given the active review of regulations governing the commercial use of robots in various jurisdictions, a very high level, easy to use yet flexible framework to design and deploy robotic task execution algorithms may soon become desirable.

Should the process of implementing a promising but non-proprietary visual pose estimation technique turn out to be successful, it would contribute to the construction of a robotic navigation and task execution system that the author has envisaged (c.f. section 2.2.1). Whether or not this project results in a useful implementation, it will have taught me a useful thing or two, namely about: concepts and techniques in computer vision; conducting research; implementing a mathematically formulated technique; building and conducting experiments; and writing a reasonably scientific report.

Last but not least, this thesis could partly satisfy the curiosity of anyone who wonders whether the technique in [6] is valid and whether it was indeed successfully implemented in [18] (refer to chapter 2 for an explanation of this question).

1.2 Objective

This thesis is constructed to achieve four successive goals. First, it should demonstrate a reasonable understanding of the chosen pose estimation technique: in other words, why should it work? Second, it should offer a reasonably functional and maintainable implementation of the technique. Third, insight into the performance of the implementation should be achieved through relevant experimentation. Finally, the performance data should be analysed to identify prospects for future improvements and applications.

1.3 Research question

Is it possible to implement the technique for visual pose estimation described in Chen et al.’s 2004 paper, [6], as a system with potential applications in robotic navigation?


CHAPTER 2

Background

The field of computer vision has produced dozens of pose estimation systems since the 1980s. The types of marker-based systems that are most useful for robotic navigation rely on so-called fiducial markers. Table 2.1 presents an overview of the better known fiducial marker based systems, collated in preparation for this thesis during a review of the relevant literature. Fiducial markers serve primarily to reveal information about the poses of objects in a scene, in contrast to “barcode style” visual markers, which are designed to work as general purpose data media in a scene. Section 2.1.1 describes some notable examples of the barcode-style marker.

2.1 Related work

2.1.1 Barcode-style markers

A few examples of the barcode type of marker will help the reader to distinguish these from fiducial markers. While some of these markers allow for a degree of pose estimation, their primary design goal is robust marker identification and marker data registration, improving on the one-dimensional barcode. The best known system is the QR Code, a proprietary design intended for direct part marking (DPM), which its creators, Denso Wave, released into the public domain, allowing its free use. The QR Code has found a wide range of applications, notably in non-industrial applications. The ECC200 Data Matrix is another system developed for DPM, specifically along assembly lines.[9] An example of a large, non-DPM system is the US Postal Service’s deployment of Maxicode[5] to track parcels in its distribution network.

2.1.2 Fiducial markers

Fiducial marker based systems have applications in various fields that require reliable identification and relatively precise pose estimation.


These fields include Augmented Reality (AR), photogrammetry, medical imaging, and robotics. Table 2.1 lists notable fiducial marker based systems, along with, where possible, details of their respective geometries, source code availability, licensing, and functional characteristics. The best known pose estimation system that employs fiducial markers is ARToolkit.[12] ARToolkit has been used widely for research and commercial applications since the late 1990s due to its relative ease of use and the availability of its source code. However, ARToolkit has been superseded in terms of performance by more recent systems.

One can classify fiducial markers in terms of their geometry, which may consist of square or generally rectilinear patterns, circular patterns, or dots. Different materials may be specified, too. For example: polychrome inks to increase information density; infrared reflective coatings to make the markers invisible to the naked eye; retro-reflective materials to enhance marker visibility under illumination from the camera’s position; and three-dimensional substrates to enlarge the field within which the camera can register the marker.

Most fiducial markers, including ARToolkit’s, carry little more data than is required to distinguish marker instances. Several marker specifications allow a choice of data size to provide some flexibility in trading off information density against error correction and effective image size.

System | Geometry | Notes
AprilTag (2010) | square | open source / “fully open”; developed at U. of Michigan; similar to ARTag; claimed: good documentation; employs line detection.[17]
ARTag (2005) | square | GPL only; claimed: enhanced data encoding robustness due to implementation of checksums and “forward error correction (FEC)”; up to 2002 unique IDs; no need to store patterns; resilience against partial occlusion by “estimating border segments”.[9]
ARToolkit (2000) | square | GPLv2 and commercial license.[12]
ARToolkit+ (2007) | square | GPLv3; claimed improvements over ARToolkit: easier C++ API; reduced jitter; compensation for vignetting; auto thresholding.[28]
Aruco (2011) | square | BSD license; developed at U. of Cordoba; C++ library; claimed: only dependency is OpenCV; up to 1024 distinct markers.[8]
Cantag (2006) | n/a | GPLv2; developed at Cambridge U.; modular framework for comparing and developing fiducial system designs, including a simulation module; written in C++; no official release; source code not maintained since 2009.[20]
CyberCode (2000) | square | commercial; designed for consumer AR applications involving mobile phone cameras.[19]
Cho et al. (1998) | circular | unknown licensing / availability; multi-ringed, colour markers; theoretical research, implementation unavailable.[7]
Fourier Tag (2007) | circular | unknown licensing / availability; developed at McGill and Montreal U.s; uses Fourier transform and periodic patterns of circles; claimed: graceful degradation; designed for “human robot interaction”.[23]
HOM (2001) | square | Hoffman Marker; reportedly used by Siemens Corporate Research and Framatome ANP.[1, 29]
Homest (2010) | n/a | GPL; homography estimation library written in C and C++; does not find point correspondences; depends on the levmar library.[15]
Intersense IS-1200 | circular | commercial, patented; MS Windows only; closed source; claimed: 5 mm position error; 1° angular error; 150 Hz update rate; 6 DOF; “unlimited” markers.[11]
IGD marker (1998) | square | commercial; developed by the Fraunhofer IGD institute; performed well in AR competitions, pre-ARToolkit.
isotropic (2003) | circular | theoretical, implementation unavailable, licensing unknown; developed at U. of Taiwan.[25]
MFD-5 (2008) | square | licensing / source code status unknown; developed by Mark Fiala; deployed at ISMAR 2008.[13]
Nakazato et al. (2005) | square | unknown source, licensing status; retro-reflective material to make markers invisible except using an infrared flash; designed to be affixed on ceilings for detection by “wearable” AR systems.[16]
Photomodeler | circular | commercial; Eos Systems’ Photomodeler software package; circular markers named “coded targets”; photogrammetry applications.[10]
Ptrack (2006) | dots | proprietary, commercial; developed by the Fraunhofer IGD institute; infrared reflective markers.[22]
Runetag (2011) | circular | open source; license unknown; in development; claimed: strong occlusion resilience.[2]
SCR marker (2000) | square | unknown source code, license status; developed by Siemens Corporate Research.[30]
Studierstube (2008) | square | unknown source, licensing status; developed by Graz U. of Tech.; designed for low-power platforms and mobile phones; successor to ARToolkitPlus, replacing “dated” techniques; modular with six marker types, two pose estimators, three thresholding algorithms.[24]
SwisTrack (2011) | n/a | APL; modular (marker/markerless) tracking framework; requires OpenCV; written in C++.[14]
Uchiyama et al. (2011) | dots | [27]

Table 2.1: Fiducial marker systems

2.2 Choice of technique

2.2.1 Requirements

The motivation for this project (discussed in section 1.1) includes the prospect of building a system to enhance robot navigation and task execution. A marker based pose estimation mechanism with certain characteristics would be central to the envisaged system. Here follows a review of the characteristics in question (a few of these criteria are inspired by [9]), and, where appropriate, the specifications that we can identify at this stage.


Pose

In this document, pose is taken to signify the coordinates of the marker centre and of the marker normal vector, in that order of decreasing priority, with respect to the camera’s pre-determined coordinate system (c.f. chapter 4).

Number of markers

A system that requires a single marker per pose estimate is preferred, as this would be beneficial in terms of system complexity, visual footprint, ease of deployment, and cost.

Marker image size

This requirement combines factors such as sensor size, field of view, marker size, system range, and marker payload size. Taking the A4 format (210 × 297 mm) for its ubiquity in consumer printers, a square marker of 210 × 210 mm offers the largest marker size - minimising the restrictions on the combination of camera, payload and range - that remains practical to produce and use. Then, to minimise cost, the chosen sensor size is taken to be a relatively modest 1280 × 960, or just under 1.3 megapixels. Note that a larger sensor increases the cost of the image processing hardware as well as of the camera. A horizontal field of view of around 43° (resulting in a vertical field of view of around 33°) reflects a fair compromise between marker depth range and the system’s ability to spot markers at a greater angle from the optical axis, keeping in mind the kind of pre-positioning afforded by GPS navigation and odometry. With these restrictions in place, a detection range of one to 20 m and a registration and pose estimation range of one to 10 m impose minimum bounds on the marker image height of 17 pixels and 34 pixels respectively for those categories.
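The 17- and 34-pixel bounds follow from simple pinhole-projection arithmetic. The following is a minimal sketch of that calculation (not part of the thesis code; the function names are mine), using the sensor, field of view, and marker values given above:

```python
import math

# Pinhole-model check of the marker image heights quoted above.
# Sensor, field of view, and marker values are taken from the text.
SENSOR_WIDTH_PX = 1280      # horizontal sensor resolution
HFOV_DEG = 43.0             # horizontal field of view, degrees
MARKER_HEIGHT_M = 0.210     # 210 mm square marker

def focal_length_px() -> float:
    """Focal length in pixels implied by the sensor width and field of view."""
    return (SENSOR_WIDTH_PX / 2.0) / math.tan(math.radians(HFOV_DEG) / 2.0)

def marker_image_height_px(distance_m: float) -> float:
    """Projected marker height at the given camera-marker distance."""
    return MARKER_HEIGHT_M * focal_length_px() / distance_m

print(round(marker_image_height_px(20.0)))  # detection limit -> 17
print(round(marker_image_height_px(10.0)))  # pose estimation limit -> 34
```

At 20 m the 210 mm marker subtends roughly 17 pixels, and at 10 m roughly 34 pixels, matching the bounds stated in the text.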

Occlusion

In a non-laboratory environment, objects are liable to partly or completely occlude the marker in the camera image. In outdoor scenarios, rain and snow could also cause a degree of marker occlusion. The pose estimation system should provide some resilience against partial occlusion.

Lighting and colour

Non-laboratory conditions include variability in environmental light intensity, direction and colour. To minimise the impact of these factors, and to otherwise reduce the cost and complexity of the markers, the implementation will involve bi-tonal, namely black and white, patterns rather than polychromatic patterns. The use of fewer colours implies a reduced data payload. This will have little impact on the intended system, as its data requirements extend to fewer than a hundred unique IDs plus error correction overhead. High-contrast black and white patterns additionally reduce the requirements relating to camera sensitivity and signal linearisation.[9]

False positive/negative rates

False positive and - to a lesser extent - false negative marker identification and registration could significantly impair the performance of a navigation and task execution system.[9] Unique characteristics of the marker patterns, such as ratios of the sizes of key pattern elements, and conscious data payload design should reduce the risk of false marker identification and registration to negligible levels.

Substrate shape

While a three-dimensional marker, such as the conic markers described in [26], offers a greater “working volume” in which the camera is able to register the marker, two-dimensional markers of the format already specified are better suited to the intended application’s requirements for cost and ease of use.

Marker viewing angle

A maximum effective marker viewing angle of 60° from the marker normal is envisaged.

Data payload

As discussed above, the data payload requirement is for the number of identifiers envisaged - fewer than a hundred - in addition to the error correction overhead.

Speed and jitter

To allow for vehicle vibration and velocity, the sensor should be capable of taking sufficiently exposed images within a shutter time of one millisecond or less. The pose estimation pipeline should additionally provide updates within the kinds of bounds on throughput and jitter that enable reasonably responsive and consistent navigation.

2.2.2 Chen et al., 2004

A review of related work (c.f. section 2.1) reveals that, as Rice et al. conclude in [20],


“There are numerous digital marker systems and numerous tag designs but most are basically either a square or a circular shape. [. . . ] Our results demonstrate that square tags are beneficial for large data capacity and that circular tags are beneficial for location recovery due to the behaviour of a circle under perspective distortion.”,

and as Pagani et al. point out in [18]:

“[circular markers] are easier to detect and provide a pose estimate that is more robust to noise.”,

circular markers generally offer better pose estimation characteristics than square ones. However, the literature review has also shown that there are no tried and tested circular marker systems which do not require multiple markers and which are free to use and open source.

These insights dovetailed with a review of [6], which describes a method for camera calibration using the images of circles. The method was originally intended to facilitate eye tracking. Given two or more coplanar circles of unknown size, the method can derive the camera coordinates of the centres and normal vectors of the circles, as well as the camera’s focal length. If the camera’s intrinsic parameters are known, the pose of a single arbitrary circle of known radius can be recovered from its image. This last capability, and the performance of one implementation, as described in [18], indicates that the technique in [6] offers the best prospect for satisfying the requirements outlined in section 2.2.1. The technique additionally does not in principle contradict any of those requirements. As with other circular marker based systems, it is possible with current libraries to accurately fit an ellipse even if only a fraction of its length is visible, meaning that reasonably accurate pose estimates are possible for partially occluded marker images.
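To illustrate the claim about partial ellipses, here is a minimal sketch (my own, not the thesis pipeline) of a least-squares conic fit to points covering only a quarter of an ellipse’s circumference; the centre is still recovered. The helper names and the example ellipse are assumptions for illustration:

```python
import numpy as np

# Fit a general conic Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F = 0 to points on a
# *partial* elliptical arc by least squares, then recover the ellipse centre.
def fit_conic(pts: np.ndarray) -> np.ndarray:
    x, y = pts[:, 0], pts[:, 1]
    # Design matrix for the parameter vector [A, 2B, C, 2D, 2E, F]
    M = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    # Null-space direction of M: the right singular vector with the
    # smallest singular value
    _, _, vt = np.linalg.svd(M)
    a, b2, c, d2, e2, f = vt[-1]
    return np.array([a, b2 / 2, c, d2 / 2, e2 / 2, f])  # (A, B, C, D, E, F)

def conic_centre(p: np.ndarray) -> np.ndarray:
    # Centre is where the conic's gradient vanishes: [[A,B],[B,C]] c = (-D,-E)
    a, b, c, d, e, _ = p
    return np.linalg.solve([[a, b], [b, c]], [-d, -e])

# Points on only a quarter of an ellipse centred at (3, -2)
t = np.linspace(0.1, np.pi / 2, 50)
pts = np.column_stack([3 + 5 * np.cos(t), -2 + 2 * np.sin(t)])
print(conic_centre(fit_conic(pts)))  # close to [3, -2]
```

In practice, a library routine such as OpenCV’s ellipse fitting would be used instead of this hand-rolled version, but the sketch shows why a quarter arc already determines the conic.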

Enquiries with the authors of [6] and [18] to obtain a working implementation were unsuccessful. The former no longer possess useful source code related to that work. The latter, who built their implementation at the German Research Centre for Artificial Intelligence (DFKI), did not make their source code available. As a result, this thesis is directed at answering the question formulated in section 1.3.


CHAPTER 3

Theory

This chapter attempts to explain why the pose estimation method proposed in [6] should, in theory, work. We examine the geometry of a circle and its image in a camera, and how this can be used to derive equations for the rotation R and translation t relating the marker coordinate system and the camera coordinate system.

3.1 Oblique elliptical cone Q

Let us consider the oblique cone with the camera focal point at its apex and the elliptical image of the marker as its intersection with the image plane. In the camera coordinate system (CCS), we can express the ellipse as a general quadratic curve, with equation 3.1.

\[ A x_e^2 + 2B x_e y_e + C y_e^2 + 2D x_e + 2E y_e + F = 0. \qquad (3.1) \]

Describing a point on the curve with the vector \((x_e, y_e, 1)^T\), we can encode the parameters of the quadratic curve in matrix form, as:

\[ \begin{pmatrix} x_e & y_e & 1 \end{pmatrix} \begin{pmatrix} A & B & D \\ B & C & E \\ D & E & F \end{pmatrix} \begin{pmatrix} x_e \\ y_e \\ 1 \end{pmatrix} = 0. \qquad (3.2) \]

In the CCS, the image plane lies at z = −f, where f is the camera’s focal length. Then, by multiplying the vector of a point on the ellipse by a scale factor k directly related to the distance of the point to the cone’s apex, we can encode all points on the cone with equation 3.3.

\[ P = k \begin{pmatrix} x_e \\ y_e \\ -f \end{pmatrix}. \qquad (3.3) \]


Now, to substitute the points on the ellipse, \((x_e, y_e, 1)^T\), with the points on the oblique cone, P, in equation 3.2, we must alter the matrix of parameters so that the resulting equation holds. The matrix becomes that in equation 3.4. Note that Q is a symmetric matrix: this will enable the eigendecomposition of Q in section 3.5.

\[ Q = \begin{pmatrix} A & B & -\frac{D}{f} \\ B & C & -\frac{E}{f} \\ -\frac{D}{f} & -\frac{E}{f} & \frac{F}{f^2} \end{pmatrix}. \qquad (3.4) \]

Equation 3.5 then holds, and encodes the oblique cone under consideration.

\[ P^T Q P = 0. \qquad (3.5) \]
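The lifting of the image ellipse into the cone matrix Q can be checked numerically. The following sketch (illustrative coefficients, not thesis code) builds Q per equation 3.4 and verifies that any point along a ray through an ellipse point satisfies equation 3.5:

```python
import numpy as np

# Build the cone matrix Q of eq. 3.4 from ellipse coefficients (A..F) and
# focal length f, then check eq. 3.5 for lifted ellipse points.
def cone_matrix(A, B, C, D, E, F, f):
    return np.array([[A, B, -D / f],
                     [B, C, -E / f],
                     [-D / f, -E / f, F / f ** 2]])

# An axis-aligned example ellipse x^2/4 + y^2 = 1: A=1/4, C=1, F=-1, B=D=E=0
A, B, C, D, E, F = 0.25, 0.0, 1.0, 0.0, 0.0, -1.0
f = 2.0                                  # arbitrary focal length for the check
Q = cone_matrix(A, B, C, D, E, F, f)

x_e, y_e = 2 * np.cos(0.7), np.sin(0.7)  # a point on the ellipse
for k in (0.5, 1.0, 3.0):                # any scale along the ray (eq. 3.3)
    P = k * np.array([x_e, y_e, -f])
    assert abs(P @ Q @ P) < 1e-12        # eq. 3.5 holds on the whole cone
print("all cone points satisfy P^T Q P = 0")
```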

3.2 Oblique circular cone Qc

Let us now consider the oblique cone with the camera focal point at its apex and the marker circle as its intersection with the marker’s supporting plane. Let us call the orthonormal coordinate system whose Z-axis is the unit normal vector originating in the centre \((x_0, y_0, z_0)^T\) of the marker circle the marker coordinate system (MCS). Then, taking r as the radius of the circle of points \((x, y, z)^T\), equation 3.6 encodes the marker circle in the MCS. Note that in the MCS, points on the supporting plane satisfy z = z_0.

\[ (x - x_0)^2 + (y - y_0)^2 = r^2. \qquad (3.6) \]

Note that equation 3.6 encodes the circle under consideration in this section in the MCS, just as equation 3.1 encodes the ellipse under consideration in section 3.1 in the CCS. We can now encode the points \(P_c = (x, y, z)^T\) on the oblique cone under consideration in this section analogously to equation 3.2, by substituting P_c for P and substituting a matrix Q_c for Q so that equation 3.6 holds for the oblique cone under consideration. We then have the matrix Q_c in equation 3.7.

\[ Q_c = \begin{pmatrix} 1 & 0 & -\frac{x_0}{z_0} \\ 0 & 1 & -\frac{y_0}{z_0} \\ -\frac{x_0}{z_0} & -\frac{y_0}{z_0} & \frac{x_0^2 + y_0^2 - r^2}{z_0^2} \end{pmatrix}. \qquad (3.7) \]


Equation 3.8, describing the oblique cone under consideration in this section, then holds.

\[ P_c^T Q_c P_c = 0. \qquad (3.8) \]

3.3 Rotation R1

The cones described in sections 3.1 and 3.2 are in fact the same cone, described with different orthonormal bases. The result is that a rotational transformation of the one cone (or, seen another way, of its basis) places all its points on the other. So a rotation - let us call it R_1 - relates the points P and P_c. Encoding R_1 as a matrix, we obtain equation 3.9.

\[ P = R_1 P_c. \qquad (3.9) \]

3.4 Rotation R and translation t

Now, determining the unit normal vector and centre of the circle amounts to determining R_1 and Q_c. Let us isolate those terms, as follows.

From equations 3.5 and 3.8 we obtain equation 3.10.

\[ P^T Q P = P_c^T Q_c P_c. \qquad (3.10) \]

Then, substituting equation 3.9 in equation 3.10, we obtain (3.11).

\[ \begin{aligned} (R_1 P_c)^T Q R_1 P_c &= P_c^T Q_c P_c \\ \Leftrightarrow\quad P_c^T R_1^T Q R_1 P_c &= P_c^T Q_c P_c \\ \Leftrightarrow\quad P_c^T (R_1^T Q R_1 - Q_c) P_c &= 0. \end{aligned} \qquad (3.11) \]

Given that, for a symmetric n × n matrix A, if \(v^T A v = 0\) for every n-dimensional vector v then A = 0, and noting that \(R_1^T Q R_1 - Q_c\) is symmetric, we can derive equation 3.12 from equation 3.11.

\[ R_1^T Q R_1 = Q_c. \qquad (3.12) \]

Since for any scalar k_c, \(k_c Q_c\) encodes the same cone as Q_c, equation 3.12 can become equation 3.13.

\[ k_c R_1^T Q R_1 = Q_c. \qquad (3.13) \]

Now, given that Q is a symmetric matrix, we can factorize it by eigendecomposition, into the form in (3.14). Λ is the 3 × 3 diagonal matrix of eigenvalues \(\{\lambda_1, \lambda_2, \lambda_3\}\). V is the 3 × 3 matrix whose column vectors are Q’s eigenvectors.

\[ Q = V \Lambda V^T = \begin{pmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \begin{pmatrix} - & v_1^T & - \\ - & v_2^T & - \\ - & v_3^T & - \end{pmatrix}. \qquad (3.14) \]
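The factorisation in (3.14) is a standard operation on symmetric matrices. As a quick numerical illustration (the example matrix is arbitrary, standing in for a cone matrix):

```python
import numpy as np

# Illustration of equation 3.14: eigendecomposition of a symmetric matrix.
# numpy's eigh returns the diagonal of Lambda and the columns of V.
Q = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric example matrix
eigvals, V = np.linalg.eigh(Q)
Lam = np.diag(eigvals)
assert np.allclose(V @ Lam @ V.T, Q)     # Q = V Lambda V^T
assert np.allclose(V.T @ V, np.eye(3))   # eigenvectors are orthonormal
print("eigendecomposition verified")
```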


Substituting equation 3.14 into equation 3.13, we obtain equation 3.15.

\[ k_c R_1^T V \Lambda V^T R_1 = Q_c. \qquad (3.15) \]

Let us take \(R = V^T R_1 \Leftrightarrow R^T = R_1^T V\) for convenience, and substitute into equation 3.15 to obtain equation 3.16.

\[ k_c R^T \Lambda R = Q_c. \qquad (3.16) \]

At this stage, Chen et al. appear to combine the derivations explained so far in this section with the general form of a three-dimensional rotation (left handed, clockwise, with Euler angles φ, θ and ψ),

\[ \begin{pmatrix} \cos\theta\cos\psi & -\cos\phi\sin\psi + \sin\phi\sin\theta\cos\psi & \sin\phi\sin\psi + \cos\phi\sin\theta\cos\psi \\ \cos\theta\sin\psi & \cos\phi\cos\psi + \sin\phi\sin\theta\sin\psi & -\sin\phi\cos\psi + \cos\phi\sin\theta\sin\psi \\ -\sin\theta & \sin\phi\cos\theta & \cos\phi\cos\theta \end{pmatrix}, \]

to arrive at the definition for the rotation R shown in equation 3.17 and the definition for the translation t, between the MCS and CCS origins, shown in equation 3.18. Note that these definitions are ambiguous with respect to the undetermined signs \(\{S_1, S_2, S_3\}\) and the as yet unknown rotation angle α.

\[ R = V \begin{pmatrix} \sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}\cos\alpha & S_1\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}\sin\alpha & S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\ \sin\alpha & -S_1\cos\alpha & 0 \\ S_1 S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}}\cos\alpha & S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}}\sin\alpha & -S_1\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}} \end{pmatrix}, \qquad (3.17) \]

\[ t = \begin{pmatrix} -S_2 S_3\, r\cos\alpha\, \sqrt{\frac{(\lambda_1-\lambda_2)(\lambda_2-\lambda_3)}{-\lambda_1\lambda_3}} \\[4pt] -S_1 S_2 S_3\, r\sin\alpha\, \sqrt{\frac{(\lambda_1-\lambda_2)(\lambda_2-\lambda_3)}{-\lambda_1\lambda_3}} \\[4pt] \frac{S_3\,\lambda_2\, r}{\sqrt{-\lambda_1\lambda_3}} \end{pmatrix}, \qquad (3.18) \]

where

\[ \lambda_1\lambda_2 > 0, \qquad \lambda_1\lambda_3 < 0, \qquad |\lambda_1| \ge |\lambda_2|. \qquad (3.19) \]

3.5 Centre C and normal N of circle

Chen et al. derive equation 3.20 for the circle’s centre, which is the key equation for our pose estimation system, and equation 3.21 for the circle’s normal, in the CCS:

\[ C = z_0 V \begin{pmatrix} S_2 \frac{\lambda_3}{\lambda_2} \sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\[4pt] 0 \\[4pt] -S_1 \frac{\lambda_1}{\lambda_2} \sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}} \end{pmatrix}, \qquad (3.20) \]


\[ N = V \begin{pmatrix} S_2 \sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\[4pt] 0 \\[4pt] -S_1 \sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}} \end{pmatrix}, \qquad (3.21) \]

where

\[ z_0 = \frac{S_3\,\lambda_2\, r}{\sqrt{-\lambda_1\lambda_3}}, \qquad (3.22) \]

and where λ_1, λ_2 and λ_3 are ordered according to 3.19.
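Equations 3.19-3.22 can be sanity-checked numerically. The sketch below is my own (not the thesis implementation): it builds the cone of a known circle directly from its centre and normal, eigendecomposes it, applies the closed-form expressions above for all eight sign combinations, and checks that one candidate recovers the true centre:

```python
import numpy as np
from itertools import product

def circle_cone(c, n, r):
    """Cone through the camera origin and a 3-D circle (centre c, unit
    normal n, radius r): points P with P^T Q P = 0."""
    c, n = np.asarray(c, float), np.asarray(n, float)
    d = n @ c
    return d * d * np.eye(3) - d * (np.outer(n, c) + np.outer(c, n)) \
        + (c @ c - r * r) * np.outer(n, n)

def centre_candidates(Q, r):
    """All centre/normal pairs allowed by equations 3.19-3.22."""
    w, V = np.linalg.eigh(Q)
    if np.sum(w > 0) == 1:          # normalise: two positive eigenvalues, one negative
        w = -w
    neg = int(np.argmin(w))         # the negative eigenvalue is lambda_3
    pos = sorted((i for i in range(3) if i != neg), key=lambda i: -abs(w[i]))
    idx = [pos[0], pos[1], neg]     # ordering (3.19): l1*l2 > 0, l1*l3 < 0, |l1| >= |l2|
    l1, l2, l3 = w[idx]
    V = V[:, idx]
    g = np.sqrt((l1 - l2) / (l1 - l3))
    h = np.sqrt((l2 - l3) / (l1 - l3))
    out = []
    for S1, S2, S3 in product((1.0, -1.0), repeat=3):
        z0 = S3 * l2 * r / np.sqrt(-l1 * l3)                               # eq. 3.22
        C = z0 * V @ np.array([S2 * l3 / l2 * g, 0.0, -S1 * l1 / l2 * h])  # eq. 3.20
        N = V @ np.array([S2 * g, 0.0, -S1 * h])                           # eq. 3.21
        out.append((C, N))
    return out

# Ground truth: a circle of 10.5 cm radius, 4 m in front of the camera, tilted.
c_true = np.array([0.5, -0.3, 4.0])
n_true = np.array([0.2, 0.1, -1.0]); n_true /= np.linalg.norm(n_true)
cands = centre_candidates(circle_cone(c_true, n_true, 0.105), 0.105)
err = min(np.linalg.norm(C - c_true) for C, _ in cands)
print("best centre error:", err)
```

The sign enumeration also absorbs the column-sign ambiguity of the eigenvectors; in a full pipeline the spurious candidates are discarded with additional constraints (e.g. that the marker lies in front of the camera).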


CHAPTER 4

Implementation

4.1 Framework

4.1.1 Marker

In its current form the application recognises instances of the marker model shown in figure 4.1. The thick black circular band against a white background provides two sharp circular contours that are separated widely enough to be discernible at a distance of up to 20 m, given the marker and camera parameters envisaged (c.f. section 2.2.1), yet leave a large enough enclosed white space to include a data payload pattern.

Figure 4.1: marker model

4.1.2 Language

The pose estimation system presented here is a command-line applica-tion written in Python v.2.7. Choosing a language is partly a matterof taste. Yet Python does offer some concrete benefits, including ma-ture implementations of essential libraries such as OpenCV for imageprocessing, Numpy for numerical work, and PyOpenGL for generating3D graphics - although the latter could do with better documenta-tion. Python is interpreter driven, but CPU-intensive libraries such as

19

Page 20: BSc Informatica - UvA/FNWI (Science) Education …1 BSc Informatica Camera pose estimation with circular markers Joris Stork August 13, 2012 Supervisor: Rein van den Boomgaard Signed:

the ones just mentioned internally consist of highly optimised C andC++ code. Development time is reduced due to the sparse syntax anduntyped variable names; a large and friendly scientific programmingcommunity; and the interactive interpreters, such as iPython, whichfacilitate tests on new snippets of code before these are included intothe application. Finally, the Python language, or rather its interpreter,is cross-platform, although unfortunately it is not natively supportedon any major mobile operating systems.

As shown in section 5, the use of an interpreted language slows the application down compared to, say, an implementation in C++. Still, the speedier development process and the consideration against premature optimisation compensate for sluggish performance in early iterations. Python makes it easy to profile an application's performance at a later stage, and thereafter to incorporate bindings of the most CPU-intensive functions if these are re-implemented in C or C++.

4.1.3 Concurrency

In simulation mode the application structure invites a degree of concurrency, as it comprises a relatively separate simulator component that generates and then passes images to a pose estimation pipeline component. This structure motivates the separation of the simulation and pipeline components into separate processes handled by Python's multiprocessing package. Unlike threading, the multiprocessing library spawns separate processes to circumvent Python's Global Interpreter Lock, thus enabling parallel execution across multiple cores or machines. The result is a two-process application in simulation mode (cf. figure 4.2). Synchronisation is achieved with the q2pipe and q2sim queues, which use pickle internally to transport objects, such as simulated camera parameters, Numpy arrays of image data, and synchronisation strings.
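The two-process structure can be sketched as follows. Only the queue names q2pipe and q2sim are taken from the application; the function bodies and the dictionary payload are illustrative stand-ins for the real simulator and pipeline code:

```python
import multiprocessing as mp

def simulator(q2pipe, q2sim):
    """Stand-in for the simulator process: produce one frame, then stop."""
    # A pickled dict stands in for camera parameters plus image data.
    q2pipe.put({"camera": {"f": 6.0}, "frame": [0] * 16})
    assert q2sim.get() == "ready"   # wait for the pipeline's acknowledgement
    q2pipe.put("done")

def pipeline(q2pipe, q2sim):
    """Stand-in for the pipeline process: consume frames until 'done'."""
    item = q2pipe.get()             # blocks until the simulator sends a frame
    assert "frame" in item          # contour/ellipse/pose stages would go here
    q2sim.put("ready")              # synchronisation string back to simulator
    assert q2pipe.get() == "done"

def run():
    q2pipe, q2sim = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=simulator, args=(q2pipe, q2sim)),
             mp.Process(target=pipeline, args=(q2pipe, q2sim))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]

if __name__ == "__main__":
    print(run())
```

Because the queues pickle their payloads, any picklable object, including Numpy arrays, can cross the process boundary.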

4.1.4 A brief manual

The application is designed to be easy to install, use, debug, and maintain. This section provides basic instructions to get it up and running.

Installation

Please contact the author if you received this report without the accompanying source code. The code is packaged in a .tar.bz2 archive. The application's main file, main.py, is located in the root of the unpacked directory structure. In addition to the included modules, the application requires the following Python libraries (in some cases lower or higher version numbers will work):

• Python 2.7 (interpreter and standard library)

[Figure 4.2: Application concurrency in simulation mode. (a) Real camera: a single process (process 1/1) runs the pydc1394 image source, the pipeline with its modules, and the outputs.printer component, passing images, then contours, ellipses and estimates, then plots and printouts. (b) Simulation: process 1/2 runs the simulator; process 2/2 runs the pipeline and outputs.printer.]

• OpenCV 2.3.1 (with Python bindings)

• Matplotlib 1.1.1

• NumPy 1.6.2

• Python Imaging Library (PIL) 1.1.7

• PyOpenGL 3.0.1

Note that the syntax for udev rules, which are required on a range of Linux configurations in order to interface the camera without root privileges, changed in version .17x, making the udev rules currently provided on many forums obsolete. The correct udev rule for recent systems is included in the application's root directory.

Interface

On a typical POSIX compliant machine, the programme is executed from the application root directory with the command form python main.py [options]. The bash script in run provides some common use cases, and running main.py with -h or --help lists all the available command line options. These include: image source type; pipeline module selection; toggling module output windows; camera parameters; and execution time. For example, the command


python main.py -n 3 -s -1 -t 30

runs the application with the first three pipeline modules (the contour finder, ellipse fitter, and first pose estimator), in simulation mode with markers generated randomly within view, for 30 seconds. The last execution log is located in log, and error-level log entries are additionally written to /dev/stdout. All functions are documented with a single docstring.

To exit the application, send ctrl+c from the shell or press q in the simulation window (if present). One can safely ignore OpenGL buffer swap warnings at this stage.

4.1.5 Application modules

Overall initialisation is performed in main.py, whereas the pipeline module drives the pipeline process and contains the main programme loop. The CameraVals object is defined in the camera values module. In addition, the application contains the following package directories:

admin modules

The loginit and argparse modules handle logging and command line options, respectively.

calibration

In order to derive pose estimates, the pipeline requires two values in addition to the image: the radius of the outer marker circle, and the camera's focal length. In simulation mode the focal length x and y components are chosen by editing the relevant values in the camera values module source code. When using a real camera, the focal length is obtained by calibrating the camera, a process that recovers the camera's intrinsic matrix. The calibrate module facilitates the calibration step.

calibrate discovers the camera's intrinsic matrix by searching for a best-fit solution to the homography that relates the points on a "calibration rig", often a flat black-and-white chessboard pattern, to their correspondences in a set of images of the rig taken with the camera in question. The function used to find the point correspondences is OpenCV's findChessboardCorners. The resulting 20 sets of corresponding points are then passed to another OpenCV function, calibrateCamera, which returns estimates for the camera's intrinsic matrix, the extrinsic matrix, and the camera distortion. The calibration function performs best when provided with guesses for the camera's principal point coordinates and its focal length components. calibrate stores the intrinsic matrix containing the focal length in a Numpy binary file, intrinsic.npy. Whenever a pose estimation object is created, it is initialised with the focal length stored in that file.

Note that OpenCV's findChessboardCorners function requires the CALIB_CB_FAST_CHECK flag to prevent it from searching for too long in images that do not feature the entire calibration pattern. A test with timeit shows a 14:1 ratio between the execution time without and with this optimisation. Another - easily overlooked - requirement for the corner finding function is that the chessboard dimensions should be given in the order (height, width), as the alternative order results in incorrect calibration results.

marker

The markers module defines a marker class. Bitmap and vector graphic representations of the marker model are included, e.g. for printing out markers. The dimensions text file contains the respective dimensions of the marker circles.

output

A wrapper class for all data that needs to be transported between the various application components is defined in pipeline output. This facilitates communication between the simulation, pipeline and printing/plotting components. The wrapper includes methods for extracting data, such as arrays of coordinates for plots. Writing these wrapper objects to disk via pickle allows data from previous sessions to be re-used for plotting or for processing by additional pipeline modules.

The printer module combines various methods for printing out overviews of pipeline data or for producing plots with matplotlib.

pipeline modules

The pipeline modules respectively define the contour finder, ellipse fitter and pose estimator classes. The payload decoding and marker identification modules are included as stub files for later implementation.

pydc1394

This package by Holger Rapp and others (see the readme file in the pydc1394 package directory) is a Python wrapper for Damien Douxchamps' libdc1394 library to interface IEEE 1394 cameras. The package has not been maintained for some time and has been modified for this project to make it compatible with the application and with the Point Grey Chameleon camera used in the experiments.


simulator

The simulator module uses OpenGL to generate images of the 3D scene containing the marker model, as shown in figure 4.3.

The pose generator module produces the marker centre and vertex camera coordinates for each image frame. Note that the marker texture and drawing functions are defined in the markers module mentioned above.

Figure 4.3: Simulation frame

4.2 Pose estimation pipeline

This section describes the principal algorithms at each stage of the pose estimation pipeline. Chapter 3 places the equations referenced in this section into a theoretical context with regard to the technique described in [6]. Figure 4.6 shows an overview of the equations used throughout the pipeline.

4.2.1 Contours

The first processing stage normalises the brightness and increases the contrast in the image using OpenCV's equalizeHist function. Then, the image is blurred with OpenCV's GaussianBlur function using a 7×7 Gaussian kernel and a standard deviation of 1.4. These two stages combine to reduce noise and improve the likelihood that the images of circles both have a steep gradient and are uninterrupted. Now the module applies a Canny edge filter in the form of OpenCV's Canny function, which finds edges in the image as follows:

• the edge detection operator returns the first derivative of the intensity function in the horizontal and the vertical directions: respectively, Gx and Gy;

[Figure 4.4: Core pipeline functions. Image source (pydc1394/simulator) → contrast stretch → blur → edge detection → contour detection (contour) → ellipse fitting (ellipse) → first pose estimation (posea), supported by camera calibration (calibrate, supplying the fx, fy estimates) and print/plot output (printer); final identification (identify, decode, poseb) and a navigation/task application are future work. Intermediate data: image, contrast enhanced image, blurred image, binary image (edge, non-edge), arrays of contour points, inscribing rectangles, and pose estimates with marker images: Ci, Ni, Ri, ti, αi, data(i), i ∈ [0, nr. markers).]

• the edge gradient is then given by G = √(Gx² + Gy²), which is the norm of the gradient vector, whose components are Gx and Gy;

• the angle of the edge gradient is given by Θ = arctan(Gy/Gx);

• each gradient angle is rounded to one of four values;

• non-maximum suppression (NMS) is carried out: the local maximum gradients are found for each of the four angles / directions. A maximum gradient is one that is greater than those on either side of the potential edge (source: OpenCV 2.4 documentation).


The OpenCV edge finding implementation employs hysteresis thresholding, which selects initial edge segments using a higher threshold value, and then links edges using a lower threshold value. John Canny, the inventor of the technique, recommended in [4] that the threshold values be set at a ratio between 2:1 and 3:1. The pipeline uses a lower threshold of 50 and a higher threshold of 150.

Finally an array of contours is obtained from the binary image via OpenCV's findContours function. The pipeline receives each contour as an array of all the pixel coordinates that belong to the contour.

4.2.2 Ellipses

The ellipse module passes the set of contours produced by contour to OpenCV's fitEllipse function, which in turn produces the corresponding best-fit ellipses, represented in the form:

((x0, y0), (xmajor, yminor), α), (4.1)

where: (x0, y0) is the intersection of the ellipse's major and minor axes; xmajor and yminor are the lengths of the semimajor and semiminor axes of the ellipse respectively; and α is the rotation of the semimajor axis from the x-axis. ellipse then filters the ellipses (see description below) before converting the representation from image coordinates to the pipeline's camera coordinate convention: z-axis along the optical axis, positive in the direction that the camera faces; y-axis positive in the upward direction when the camera is upright; and x-axis positive in the right-hand direction when the camera is upright.

Ellipse filter

The contour module produces hundreds of contours per frame during typical operation with a real camera (cf. section 5.3 for the series of experiments with a real camera source). As shown in the screenshots in figure 4.5, OpenCV's fitEllipse function fits an ellipse to every contour it is given. This does not cause a significant performance penalty within that function, but does significantly slow down the pose estimation in posea, since that algorithm contains a number of CPU-intensive loops written in Python.

To reduce the data flow to the pose estimation bottleneck, ellipse contains two filters. First, contours containing fewer than ten pixels are discarded before the ellipse fitting stage. After the fitting stage, a second filter retains only those ellipses that meet five criteria, namely: that the ellipse must be the larger of a pair with approximately the same centres and inclinations; that both ellipses have aspect ratios that are not too large; and that the ratio between the sizes of the two ellipses should correspond closely to that in the model.

The filters result in a noticeable pipeline speedup and are likely to reduce the probability of false positives in real-world use.
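A minimal sketch of such a pair-based filter follows. The threshold values and the 0.65 model size ratio are invented for illustration, and the inclination test is omitted since near-circular ellipses make it unreliable (cf. section 4.3.9):

```python
import math

def filter_ellipses(ellipses, size_ratio=0.65, ratio_tol=0.15,
                    centre_tol=5.0, max_aspect=4.0):
    """Keep only ellipses that look like outer marker circles. Each ellipse
    is ((x0, y0), (major, minor), alpha); all thresholds are illustrative."""
    keep = []
    for outer in ellipses:
        (ox, oy), (oma, omi), _ = outer
        if omi == 0 or oma / omi > max_aspect:
            continue                      # aspect ratio too large
        for inner in ellipses:
            if inner is outer:
                continue
            (ix, iy), (ima, imi), _ = inner
            close = math.hypot(ox - ix, oy - iy) < centre_tol
            # the size ratio must correspond closely to the marker model
            ratio_ok = ima < oma and abs(ima / oma - size_ratio) < ratio_tol
            if close and ratio_ok:
                keep.append(outer)        # keep the larger of the pair
                break
    return keep
```

Because each retained ellipse must be corroborated by a concentric smaller one, isolated elongated contours are rejected cheaply before the expensive pose estimation loops.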


(a) Without ellipse filter

(b) With ellipse filter

Figure 4.5: Fitting the right ellipses

4.2.3 Cones

To obtain the estimator for Q in the form shown in equation 3.4, the posea module first converts the ellipse representation that fitEllipse returns, which is in the form shown in equation 4.1, into the general quadratic form shown in equation 3.1. The quadratic form then yields the estimator for the elliptical cone, Q, provided the focal length f is also known. Note that in simulation mode, the focal length is taken from the value in the GLCameraVals class. The values and units chosen for the focal length in simulation mode, given a field of view, determine the scale of the pose estimates. The following is an explanation of the conversion algorithm implemented in ellipse. The algorithm is inspired by [3].

The return values from fitEllipse directly parametrise the following equation for the ellipse:

X = [x y]ᵀ = X0 + Re [a cos θ  b sin θ]ᵀ, (4.2)

where Re encodes a counter-clockwise rotation by α:

Re = [cos α  −sin α; sin α  cos α], (4.3)

and X0 is the centre of the ellipse:

X0 = [x0 y0]ᵀ. (4.4)

To obtain the quadratic form, θ needs to be eliminated. We rearrange equation 4.2 into equation 4.5, so that, given that uᵀu = 1, θ is eliminated to obtain equation 4.6:

[1/a  0; 0  1/b] Reᵀ (X − X0) = [cos θ  sin θ]ᵀ = u, (4.5)

(X − X0)ᵀ Re [1/a²  0; 0  1/b²] Reᵀ (X − X0) = 1. (4.6)

Now, taking M (symmetric):

M = [A  B; B  C] = Re [1/a²  0; 0  1/b²] Reᵀ, (4.7)

we have:

(X − X0)ᵀ M (X − X0) = 1
⇔ Xᵀ M X − 2X0ᵀ M X + X0ᵀ M X0 − 1 = 0
⇔ Ax² + 2Bxy + Cy² + 2Dx + 2Ey + F = 0, (4.8)

where:

D = −(Ax0 + By0),
E = −(Bx0 + Cy0),
F = X0ᵀ [A  B; B  C] X0 − 1. (4.9)
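The derivation above translates directly into code. This sketch (the function name and conventions are this sketch's own, with alpha in radians and a, b the semi-axes) returns the coefficients of equation 4.8, with the linear terms carrying the minus sign from the expansion of −2X0ᵀMX:

```python
import numpy as np

def ellipse_to_quadratic(x0, y0, a, b, alpha):
    """Convert fitEllipse-style parameters (centre, semi-axes a >= b,
    rotation alpha) to the coefficients (A, B, C, D, E, F) of
    A x^2 + 2B xy + C y^2 + 2D x + 2E y + F = 0."""
    c, s = np.cos(alpha), np.sin(alpha)
    Re = np.array([[c, -s], [s, c]])             # rotation matrix, eq. 4.3
    M = Re @ np.diag([1.0 / a**2, 1.0 / b**2]) @ Re.T   # eq. 4.7
    A, B, C = M[0, 0], M[0, 1], M[1, 1]
    X0 = np.array([x0, y0])
    D, E = -M @ X0                               # linear terms of eq. 4.8
    F = X0 @ M @ X0 - 1.0
    return A, B, C, D, E, F
```

Every point generated by equation 4.2 then satisfies the quadratic form to machine precision, which makes the conversion easy to unit-test.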

4.2.4 Poses

At this stage, an estimate for the matrix Q, as defined in equation 3.4, is available, and the posea module can proceed to compute the centre of the marker circle, C, using equation 3.20, and the normal of the marker's supporting plane, N, using equation 3.21. The function get_chen_eigend orders the eigenvalues and eigenvectors to satisfy the system of equations 3.19, before posea calculates the estimates for C and N.

Since C and N are estimated as a function of three undetermined signs, {S1, S2, S3}, there are 2³ = 8 estimates of each vector per ellipse. Note that the marker's rotation, α, is not needed to establish C and N. Finding α would form part of future work on an identify module, as discussed in sections 4.2.5 and 6.2.

In a final step, posea removes pose estimates that are impossible a priori. According to [18], four out of the six estimates represent markers that are either behind the camera's x-y plane, i.e. where Cz < 0, or which face away from the camera, i.e. where Nz > 0.
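Under the assumption that equation 3.19 orders the eigenvalues as λ1 ≥ λ2 > 0 > λ3, the closed forms of figure 4.6 can be sketched as follows; the function name and the enumeration over sign triples are this sketch's own:

```python
import itertools
import numpy as np

def centre_normal_candidates(Q, r):
    """Candidate marker centres and normals from the cone matrix Q and the
    marker radius r, following the closed forms in figure 4.6. Assumes the
    eigenvalues can be ordered l1 >= l2 > 0 > l3 (cf. equation 3.19)."""
    lam, V = np.linalg.eigh(Q)             # Q is symmetric
    order = np.argsort(lam)[::-1]          # descending: l1 >= l2 > 0 > l3
    l1, l2, l3 = lam[order]
    V = V[:, order]
    g = np.sqrt((l1 - l2) / (l1 - l3))
    h = np.sqrt((l2 - l3) / (l1 - l3))
    out = []
    for S1, S2, S3 in itertools.product((1.0, -1.0), repeat=3):
        C = S3 * (l2 * r / np.sqrt(-l1 * l3)) * (
            V @ np.array([S2 * (l3 / l2) * g, 0.0, -S1 * (l1 / l2) * h]))
        N = V @ np.array([S2 * g, 0.0, -S1 * h])
        if C[2] > 0 and N[2] < 0:          # a priori: in front, facing camera
            out.append((C, N))
    return out
```

The a priori filter at the end encodes the requirement cz > 0 ∧ nz < 0 from figure 4.6, which prunes the sign combinations down to the final ambiguous pair.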

4.2.5 Future modules

In order to obtain a single pose estimate per marker, and to register the markers' data payloads, further work would be necessary to implement the identify module, described below, as well as a marker payload design.

identify

In a first stage the identify module would un-distort the sub-images corresponding to the candidate poses for each ellipse. It would then cross-correlate these images with images in a marker model database for a range of possible rotations α around the marker normal, possibly using a technique similar to that in [18]. The highest correlation score would then yield a single, complete pose estimate for each ellipse together with an identifier from the marker database. The module would discard candidate ellipses that did not achieve a threshold score value. Finally, the module would query the database for the data payloads corresponding to the identified markers and append these to the relevant marker objects. The resulting list of marker objects could be made available to a robot's task execution module, to provide the instruction codes and relative poses of markers in the robot's environment.

4.3 Notes from debugging

Developing algorithms that closely follow the kinds of mathematical formulations explained in chapter 3, and shown in figure 4.6, presents a software engineering challenge, albeit a modest one. This section describes the nature of the challenge and the methods used to overcome it. The changes described in this section contributed to the improvements seen in the results of the third experiment, as described in section 5.5.


[Figure 4.6: Pose estimation pipeline, key formulae.
ellipse: bounding box (x0, y0), a, b, α, with X = X0 + Re [a cos θ  b sin θ]ᵀ ⇒ Ax² + 2Bxy + Cy² + 2Dx + 2Ey + F = 0
→ ellipse: quadratic form E = [A B D; B C E; D E F]
→ oblique elliptical cone Q = [A B −D/f; B C −E/f; −D/f −E/f F/f²]
→ eigendecomposition Q = VΛV⁻¹, with l = (λ1, λ2, λ3) and V = [v1 v2 v3] ordered according to 3.19
→ six centres and normals (c1, n1), . . . , (c6, n6)
→ two centres and normals (C1, N1), (C2, N2), requiring cz > 0 ∧ nz < 0,
where C = S3 (λ2 r / √(−λ1λ3)) V [S2 (λ3/λ2)√((λ1−λ2)/(λ1−λ3)),  0,  −S1 (λ1/λ2)√((λ2−λ3)/(λ1−λ3))]ᵀ,
N = V [S2 √((λ1−λ2)/(λ1−λ3)),  0,  −S1 √((λ2−λ3)/(λ1−λ3))]ᵀ, and Si ∈ {+, −}.]

4.3.1 Systematic yet opaque

The results of the first experiment (section 5.3) and the second experiment (section 5.4) indicated the existence of systematic errors in the pose estimation code. They also seemed to promise that the application could be debugged to create a working pose estimation pipeline, a positive answer to the research question (section 1.3). The mathematical nature of the application, however, poses a challenge to the developer, since key sections of code can be semantically opaque and therefore difficult to debug.


4.3.2 Names of variables

The first approach to facilitate debugging was to refactor the mathematical functions in posea to more closely reflect the naming conventions used in the mathematical formulae shown in chapter 3. Single-character and upper-case variable names contradict a programming style that normally favours clear, descriptive and unambiguous variable naming conventions. In mathematical code the priority shifts towards naming schemes that make it easier to spot errors by comparing code with the corresponding formulae. After refactoring it was discovered that the conversion between ellipse representations contained discrepancies with respect to the formulae.

4.3.3 Binary search

A debugging approach used in various engineering disciplines, where the mechanism in question effectively works as a series of "black boxes", is to perform a binary-search style series of tests for inconsistencies in the flow of inputs and outputs. This approach was adopted and helped to find the bugs that were subsequently fixed.

4.3.4 Skew

The source of the skew shown in figure 5.6 was traced to the conversion in posea of the ellipse parameters from ellipse to the quadratic form. This was done by plotting the quadratic forms being produced, which showed that the ellipses were off-centre. The shift represented a missing conversion from a bottom-left x-y origin in image coordinates to a principal (middle) point x-y origin in camera coordinates.
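The missing step amounts to a simple shift. This helper (its name, and the assumption that the principal point sits at the exact image centre, are this sketch's own) moves a pixel coordinate to an origin at the principal point:

```python
def image_to_camera_xy(x_img, y_img, width, height):
    """Shift an image coordinate so that the origin lies at the principal
    point, assumed here to be the exact image centre."""
    return x_img - width / 2.0, y_img - height / 2.0
```

Omitting this shift leaves every fitted ellipse displaced by half the image size, which is precisely the kind of systematic off-centre error the plots revealed.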

4.3.5 Scale

Refactoring the camera and marker classes ironed out a bug involving differing units of measure, which explained the scaling effect on the estimated z coordinates, as shown in figure 4.7.

4.3.6 Sign

Side-by-side printouts of actual and estimated coordinates showed that some coordinate elements occasionally had the wrong sign (note that posea produces two estimates per ellipse). By drawing up the table of the most frequent discrepancies of signs, as shown in figure 4.8a, it became apparent that the coordinates were being systematically reflected around the x-z plane, and other causes were ruled out. This was confirmed with single-line plots such as that in figure 4.8, revealing the pitfall of only plotting a single line of poses along the optical axis, or symmetrical lines of poses as shown in figure 4.7, since both these types are invariant under x-z plane symmetry. The error was traced to the fact that the image arrays obtained from OpenGL's buffers have their origin in the bottom-left corner, whereas the pipeline assumes a top-left image origin.

[Figure 4.7: z-scale bug. (a) "Flattened" estimates (units: mm); (b) actual and estimated depths (mm)]

actual x   actual y   estimated x   estimated y
+          +          +             −
+          −          +             +
−          +          −             −
−          −          −             +

(a) Pattern of signs; (b) Symmetry (units: mm)

Figure 4.8: y sign bug

4.3.7 Radii

Interpreting r as the length of the circle diameter instead of the circle radius significantly reduces the errors in the estimates for C. This may indicate an inconsistency in the notation used in [6].

4.3.8 Focal length

The camera incorrectly calculated its focal length as 8.211 instead of 6.158, because the image height and width values were reversed at one point in the calculation.


4.3.9 Ellipse filter

The ellipseFitter function encodes near-circular ellipses with highly variable values for the bounding box angle of inclination α. In other words, the rotation of the bounding box becomes almost arbitrary for near-circular ellipses. As a result the inner and outer circles appear to have non-matching inclinations, and the ellipse filter discarded many ellipses with a small aspect ratio. This partly explains the small number of estimates in the first and second experiments. Restricting the relative inclination test to ellipses with a larger aspect ratio fixed the issue.

4.3.10 Duplicate duplicates

Duplicate ellipses per marker circle gave around two estimates per marker pose, a coincidence - given the pose estimation ambiguity - that concealed a bug in the pipeline, which caused only one of the two estimates per ellipse to be available downstream of the posea module.


CHAPTER 5

Experiments

This chapter describes the goals, methods and results of three series of experiments involving the implementation described in chapter 4. The three experiments are presented in chronological order. The first two experiments informed major improvements to the pose estimation pipeline, through the debugging work explained in section 4.3. The changes are reflected in the results of the third experiment. Other improvements seen in the third experiment include more detailed output from the printer module, most notably the histograms shown in section 5.5.3.

5.1 Goal

The overarching goal of these experiments is to help answer the research question formulated in section 1.3. The experiments, then, need to show whether the implementation can estimate the pose of a marker as designed, and what its potential accuracy might be. With the right experiments, it may be possible to identify the sources of pose estimation errors. These in turn could help future iterations of the pipeline to approach the limits of this pose estimation technique's accuracy.

5.2 Method

This section describes the details of the experimental method that are common to the three experiments. Each experiment also has its own method section for details specific to that experiment.

5.2.1 Output

The results of all three experiments in this chapter are produced from the application's execution log and from the output of the printer module. The execution log records general execution data such as the pipeline framerate, and the numbers of frames, contours, pre-filter ellipses and post-filter ellipses processed. The printer module writes pose estimation statistics to /dev/stdout and displays matplotlib-based point clouds, 2D plots and histograms, which can be saved as vector graphics files. Refer to the source code documentation for further details.

5.2.2 Materials

All experiments were conducted with the same PC. Table 5.1 details its configuration.

Chipset: Intel i7 2620M
Architecture: x86-64
GPU: Intel HD 3000
CPU clock speed (GHz): 2.7-3.4
Cores: 2
Threads ("hyperthreading"): 4
L1 cache (KB): 128
L2 cache (KB): 512
L3 cache (KB): 4096
Main memory (MB): 3849
OS: Linux 3.2.0

Table 5.1: PC configuration

5.2.3 Notation

The notation in this chapter follows that of the application code where possible: C and eC are the marker centre and estimated marker centre respectively; N and eN are the marker normal and estimated marker normal respectively; and µ, σ and Σ denote a mean, a standard deviation and a covariance matrix, respectively. A subscript x, y, or z represents the relevant element of the subscripted vector. O is the origin of the camera coordinate system. All coordinates are rounded to the nearest millimetre.

5.3 First experiment: real camera

5.3.1 Goal

The goal of the first experiment is to discover patterns in the pose estimation data, at a stage where large and highly variable errors characterise the pose estimation results.


5.3.2 Method

Figure 5.1: First experiment, setup

Figure 5.1 shows a portion of the experimental setup. The experiment consists of collating pose estimates from a series of marker centre positions (C), approximated by positioning the marker and camera by hand. In the first instance, separate pose estimates were made for marker positions approximately along the camera's optical axis, at distances respectively of 1, 2, . . . , 5 m in front of the camera. The pipeline processed several hundred frames per position. Then, the pipeline processed images of the marker while the latter was gradually slid by hand from a distance of 1 m to a distance of 5 m. This sliding position is denoted in the results as [0 0 zslide]ᵀ.

The marker positions were located using measuring tape and marked at 1 m intervals with sections of sticking tape on the floor. The camera was fixed to a miniature tripod, in a raised position on a flat surface at approximately the same height as the fiducial marker's centre. To increase accuracy, the marker's centre was then manually aligned with the image centre by rotating the camera before each experimental test, and the marker was rotated so that its normal appeared to the naked eye to approximate [0 0 −1]ᵀ.


Materials

camera: Point Grey Chameleon CMLN-13S2M
lens: Fujinon DF6HA-1B
lens focal length (mm): 6
calibration focal length (mm): 6.1152
hor. field of view: 42.5828°
vert. field of view: 32.5855°
sensor: Sony 1/3 inch CCD monochrome
sensor size: 1296 × 964
pixel size (µm): 3.75 × 3.75
framerate (fps): 15
bus: USB 2
image size: 1280 × 960

Table 5.2: First experiment, camera configuration

The experimental camera is described in table 5.2. The marker consists of a black-and-white inkjet printout of the marker model shown in figure 4.1, taped to a cardboard file to keep it more or less flat.

5.3.3 Results

The average pipeline throughput for the six tests in the first experiment series was 7.39 fps. The tests involved between 348 and 1129 image frames, with an average of 593 frames across the tests. The average values of the pose estimates for each marker position are listed in table 5.3, alongside the corresponding covariance matrices for the elements of the pose estimate coordinates. "ellipses" corresponds to the average per-image number of ellipses the pipeline considers outer marker circles for a given position. Figures 5.2 and 5.3 show point clouds of pose estimates for the various fixed positions and for the 1-5 m slide, respectively. Each point corresponds to a single estimate.

5.3.4 Interpretation

Point cloud patterns

Figures 5.2 and 5.3 clearly show the ambiguity in every pose estimate from the posea module: the points appear in pairs, varying mostly in the z direction. In addition, the pose estimates in figures 5.2c through 5.2h appear in two main clusters per position. The fact, seen in the ellipse fitting window during pipeline execution, that the pipeline frequently mistakes the inner circle for the outer circle, probably explains these double clusters. The fact that these pairs of clusters do not vary principally in the z direction, as should be the case between ellipses along the optical axis that vary only with respect to diameter


C (m)          ellipses   µeC (mm)             ΣeC (mm)
(0 0 1)        1.0        (−368, −280, 4)      [0.4 0.2 −0.0; 0.2 0.2 −0.0; −0.0 −0.0 0.5]
(0 0 2)        1.0        (−743, −603, 8)      [627.0 509.2 −6.0; 509.2 413.6 −4.9; −6.0 −4.9 0.1]
(0 0 3)        1.2        (−1123, −913, 11)    [20560 16820 −200; 16820 13760 −164; −200 −164 2.0]
(0 0 4)        1.4        (−1557, −1073, 14)   [19500 20110 −214; 20110 175700 −1075; −214 −1075 7.1]
(0 0 5)        0.9        (−1923, −1554, 19)   [21710 17540 −211; 17540 14170 −170; −211 −170 2.1]
(0 0 zslide)   1.1

Table 5.3: First experiment, pose estimates (zslide ∈ [1000, 5000])

size, is likely due to the same type of pose estimation error that causes the line of estimates shown in figure 5.3 to be skewed away from the actual line of marker poses.

Statistical patterns

The number of ellipses in table 5.3 increases with distance, before dropping again at 5m. This may reflect some combination of a decrease in false negatives and an increase in false positives from 1 to 4m. Since the camera does not move significantly between marker positions, this could reflect an influence of the changing shape of the marker and its supporting structure on the numbers of ellipses detected, as a function of distance. The pipeline stage windows showed that the ellipse module fits a large ellipse to the supporting structure at 3 and 4m. The size and aspect ratio of this larger, false positive ellipse seems to explain the stripe of estimates away from the two main clusters in figures 5.2d through 5.2g, and may contribute to the increased variances at those distances. The pattern of variances of the y coordinate suggests it may
have successively increased from 1 to 5m had there not been the peak in false positives at 3 and 4m. A corresponding peak in variances is notably evident for the y and z coordinates at 3 and 4m.

After factoring out the effect of false positives, it appears that the variances of all three coordinates increase with distance, much as the errors in the x and y coordinate estimates increase with distance. Interestingly, the z coordinate estimates increase roughly in step with the actual z coordinate values, but at a much smaller scale.

Overall pattern of errors

Overall, the pattern of clusters in figure 5.2 and the well defined, almost straight line in figure 5.3 correspond to the pattern of actual poses - a point and a near-straight line - in each case. This, together with the skew in both the line and the pairs of inner/outer circle estimates (discussed above), suggests that the pose estimates may suffer from systematic errors, indicating a bug in the implementation at this stage. The fact that the z coordinates increase in step with the actual depth, as shown in table 5.3, reinforces this hypothesis.

5.4 Second experiment: simulated

5.4.1 Goal

The second experiment is designed to validate and verify the application's simulation module; to confirm the interpretation of the previous experiment; and to gain further insights, if possible, into the pattern of pose estimation errors.

5.4.2 Method

simulation unit size (µm): 3.75
pixel size (µm): 3.75 × 3.75
focal length (mm): 6
vert. field of view: 32.5855◦
image size: 1280 × 960
near clipping depth (mm): 1
far clipping depth (m): 100

Table 5.4: Second experiment, simulation parameters

The second experiment replaces the camera and scene of the first experiment with an OpenGL-based 3D simulation. Figure 4.3 shows an example of an image generated in this way. The simulator and markers
modules simulate markers with OpenGL polygons bound to a texture generated from a vector graphic of the marker model. Only one marker is contained in the scene at a time.
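The geometry the simulator renders can be sketched without OpenGL: a circular marker, seen through a pinhole camera, projects to an (approximate) ellipse in the image plane. The following NumPy sketch samples a 3D circle and projects it; the marker position, tilt and radius are illustrative, not the experiment's values.

```python
import numpy as np

def project_circle(center, normal, radius, f, n=360):
    """Sample a 3D circle and project it through a pinhole camera at the
    origin looking down -z (OpenGL convention); returns (n, 2) image points."""
    normal = normal / np.linalg.norm(normal)
    # Build an orthonormal basis (u, v) spanning the circle's plane.
    u = np.cross(normal, [0.0, 1.0, 0.0])
    if np.linalg.norm(u) < 1e-9:      # normal parallel to the y axis
        u = np.cross(normal, [1.0, 0.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts3d = center + radius * (np.outer(np.cos(t), u) + np.outer(np.sin(t), v))
    # Perspective projection: x' = f * x / (-z), y' = f * y / (-z).
    return f * pts3d[:, :2] / -pts3d[:, 2:3]

# Hypothetical marker 2 m in front of the camera, tilted 30 degrees.
theta = np.radians(30.0)
img = project_circle(np.array([0.0, 0.0, -2000.0]),
                     np.array([np.sin(theta), 0.0, np.cos(theta)]),
                     radius=100.0, f=6.1152)
```

The resulting point set traces the ellipse that the pipeline's edge detection and ellipse fitting stages would recover from the rendered image.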

Simulations were executed with varying sample sizes (n = 200, n = 1500 or n = 2000) combined with one of two marker position patterns (five lines or single line). The marker normal throughout this experiment was [0 0 −1]^T. No attempt is made in this experiment to measure the estimation error for N, due to the errors evident in the estimation of C, which is a greater priority. The third experiment, described in section 5.5, tackles the accuracy of eN.

5.4.3 Results

n (frames): 2000
throughput (fps): 2.17
pre-filter ellipses p.f.: 1.51
post-filter ellipses p.f.: 0.24
max(||C − eC||) (mm): 10870.6
min(||C − eC||) (mm): 459.8
µ(||C − eC||) (mm): 3945.6
σ(||C − eC||) (mm): 2241.4
µ(C − eC) (mm): (−1320.1, −956.3, 3289.8)

Table 5.5: Second experiment, statistics

Table 5.5 shows application and pose estimation statistics for a 2000-frame simulation. Figure 5.4 shows the actual positions C at which an ellipse is detected. The plot in figure 5.5 shows how depths and corresponding estimated depths are related in a 1500-frame simulation. Figure 5.6 shows point clouds of the actual positions (C) and estimated positions (eC) of the marker in various simulations in "straight line" mode (cf. section 5.4.2).
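Statistics of the kind reported in table 5.5 can be computed directly from arrays of actual and estimated marker centres. A sketch with toy data (not the thesis's results):

```python
import numpy as np

def pose_error_stats(C, eC):
    """Summary statistics of the position error, as in table 5.5.
    C, eC: (n, 3) arrays of actual and estimated marker centres (mm)."""
    diff = C - eC
    dist = np.linalg.norm(diff, axis=1)   # ||C - eC|| per frame
    return {
        "max": dist.max(),
        "min": dist.min(),
        "mean": dist.mean(),
        "std": dist.std(),
        "mean_vec": diff.mean(axis=0),    # per-axis mean error, mu(C - eC)
    }

# Toy data: two frames with known errors of magnitude 5 and 10 mm.
stats = pose_error_stats(np.array([[0.0, 0.0, 1000.0], [0.0, 0.0, 2000.0]]),
                         np.array([[3.0, 4.0, 1000.0], [0.0, 0.0, 1990.0]]))
# stats["max"] == 10.0, stats["min"] == 5.0
```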

5.4.4 Interpretation

Application speed

The pipeline throughput has slowed by a factor of 3.4 compared to the first experiment. A test run without pipeline modules or windows shows that the application does not speed up significantly with only the simulator running. This may be evidence that a bottleneck exists
in the simulator rather than in the pipeline or multiple process synchronisation, although synchronisation overhead should not be ruled out as a cause. A performance profile as part of further work - as discussed in section 6.2 - should help to identify the source of the reduced performance in simulation mode.

Detection

As shown in figure 5.4 and table 5.5, the marker detection rates are low, and fall to nil beyond a depth of around 10m. The cause may be a combination of: the ellipse filter conditions in ellipse, since on average around 84 percent of ellipses are discarded; and artifacts relating to contrast, lighting, smoothness and other factors, arising from the OpenGL configuration for this experiment.

Statistics

The average estimation error for each element of C confirms the equivalent results from the first experiment: the averages in each direction roughly correspond to those found there.

Depth and scale

Figures 5.5 and 5.6 reveal that the estimated z coordinates increase in step with the actual z coordinates, but at a smaller scale. This confirms the interpretation of the first experiment, and indicates that a proportional relation does exist between the estimated z and actual z values. Tests place the discrepancy of scale at around a factor of 200, so that figures 5.6c to 5.6f result when the estimated z coordinates are scaled up by that factor.
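A scale discrepancy of this kind can be quantified with a least-squares proportionality fit between actual and estimated depths. The sketch below uses synthetic data in which the factor 200 is built in (taken from the text above); the fitting function itself is generic.

```python
import numpy as np

def depth_scale(z_actual, z_estimated):
    """Least-squares scale k minimising ||z_actual - k * z_estimated||,
    i.e. the factor by which the estimated depths are too small."""
    z_actual = np.asarray(z_actual, dtype=float)
    z_estimated = np.asarray(z_estimated, dtype=float)
    return float(z_estimated @ z_actual / (z_estimated @ z_estimated))

# Synthetic data: estimates exactly proportional to truth at 1/200 scale
# (real estimates would also carry noise).
z_true = np.linspace(1000.0, 5000.0, 50)
z_est = z_true / 200.0
k = depth_scale(z_true, z_est)   # recovers 200
```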

Point cloud patterns

Figure 5.6 shows that straight lines of actual poses result in straight lines of corresponding estimates that are apparently skewed to one side of the original lines. This result confirms the interpretation of the point cloud patterns in the first experiment, and again points to a systematic error in an implementation that would otherwise produce meaningful estimates for C.

The pattern of ambiguous estimates is also confirmed, as shown in the pairs of estimates along the central line in figure 5.6c and in the pairs of estimates in the detail shown in figure 5.6f. Figure 5.6f also suggests that the pairs of estimates converge as the marker-camera distance increases.

5.5 Third experiment: simulated

5.5.1 Goal

The first two experiments hinted at various kinds of systematic error in the implementation of [6]. These results led to a debugging effort, which is described in section 4.3. The goal of this experiment is, first, to confirm the improvements to the pipeline, and, second, to provide a detailed picture of the system's performance that will inform the conclusions in chapter 6.

5.5.2 Method

This experiment follows the method of the second experiment, as described in section 5.4.2, with a few additions that are described here.

The simulator now has two modes of operation. The first simulation mode, which resembles the method used in the second experiment, is enabled with the flag -s -2 and generates sequences of images corresponding to straight lines of marker poses with the marker normal set to [0 0 −1]^T. The second simulation mode - enabled with -s -1 - generates images of random poses, including random rotations.

Several additional parameters are varied in this experiment, including the minimum and maximum camera-marker distance; the execution time in seconds (flag -t); and the maximum value of θ = ∠(−C)ON, amongst other options. Refer to the source code documentation for more details regarding simulator options.

The experiment consists of two simulations in the first mode, involving five straight lines in the camera-marker distance range [360, 22000] (mm), and five simulations in the second - random - mode with varying marker distance and angle parameters. The results are collated and presented as: tables of statistics; two dimensional plots; point clouds; and histograms. The random simulations dominate the results and analysis, since various combinations of variables and ranges of values may usefully be analysed from the random samples.

5.5.3 Results

Statistical data for five random-mode simulations are shown in table 5.6. Point clouds for two straight-line mode simulations and one random mode simulation are shown in figure 5.7. Figure 5.8 contains two dimensional plots relating estimated depth to actual depth. The remaining figures each show five plots or histograms corresponding to five combinations of simulation parameter values: maximum camera distance either 10m or 18m; and maximum θ either 30◦, 60◦ or 90◦. Apart from figures 5.7a and 5.7b, all results are taken from random-mode simulations.

Notation

To the notation defined in section 5.2.3 are added the angle between N and eN, denoted ∠NOeN, and the angle at which the marker is viewed, θ, which equals ∠(−C)ON.
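Both quantities are angles between vectors at the origin O, and can be computed from the arccosine of the normalised dot product. A minimal sketch with illustrative vectors:

```python
import numpy as np

def angle_deg(a, b):
    """Angle between two vectors meeting at the origin O, in degrees."""
    cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))

# Normal-estimate error, the angle NOeN:
err = angle_deg(np.array([0.0, 0.0, -1.0]), np.array([0.0, 1.0, -1.0]))  # 45 deg

# Viewing angle theta = angle (-C)ON, zero for a marker straight ahead
# facing the camera:
C = np.array([0.0, 0.0, 3000.0])
theta = angle_deg(-C, np.array([0.0, 0.0, -1.0]))                        # 0 deg
```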

n (frames)                 7900           7813           7665           7947           7439
z range (cm)               [36, 1836]     [36, 1036]     [36, 1836]     [36, 1036]     [36, 1036]
θ range (deg)              [0, 90]        [0, 90]        [0, 60]        [0, 60]        [0, 30]
throughput (fps)           2.15           2.17           2.13           2.21           2.07
pre-filter ellipses p.f.   2.96           3.47           3.83           4.00           4.00
post-filter ellipses p.f.  1.35           1.55           1.91           1.96           1.95
min. ||C|| (mm)            727            720            727            750            720
max. ||C|| (mm)            19705          11145          19952          11228          11225
max. θ (deg)               89.8           89.8           60.0           60.0           30.0
min. ||C − eC|| (mm)       1              1              1              1              1
max. ||C − eC|| (mm)       25331          24873          9568           2480           804
µ||C−eC|| (mm)             439            318            287            175            137
σ||C−eC|| (mm)             985            810            509            136            94
µC−eC (mm)                 (3, −4, −313)  (−1, −3, −304) (3, −2, −113)  (1, −2, −163)  (2, −2, −121)
µ∠NOeN (deg)               63.5           66.3           49.0           49.3           26.3
σ∠NOeN (deg)               40.5           42.1           30.2           30.6           15.4

Table 5.6: Statistics: third experiment

5.5.4 Interpretation

Detection

The pipeline detects almost all markers for θ < 60◦ and ||C|| < 18 m. Detection rates decrease markedly beyond θ = 75◦ (cf. figure 5.12e) and beyond ||C|| = 15 m (cf. figure 5.11d).

Overall the detection rate has increased dramatically compared to the previous experiments. This is a consequence of changes to the ellipse filtering algorithm in ellipse. Note that the actual detection rate will be slightly lower after factoring out false positives, though the effect is likely to be negligible.

Depth and scale

The z scale bug appears to have been corrected, with the caveat that r should be interpreted as the length of the circle diameter and not as that of the circle radius (cf. also section 4.3 regarding this issue). The estimates for Cz are promising, though they show rapidly increasing mean errors and deviations after a camera-marker distance of 10m or marker angles (θ) greater than 60◦ (cf. figure 5.8).

Statistics: eC

The estimates of marker positions are promising, and potentially more so when one takes the ambiguity of the pose estimates at this stage into account. There also exists the possibility that the ambiguous estimates give a false impression of greater accuracy, if the means of the ambiguous estimates are generally closer to the actual positions than either position of the estimate pairs.

As expected, the estimates for C are more accurate in the x and y directions than in the z direction (cf. table 5.6): intuitively, changes in depth lead to smaller changes in the image than do changes in the x and y directions.

Overall the error in the eCx and eCy estimates is low, between one and four millimetres either way on average across all simulation scenarios shown. The statistics show a "sweet spot" for estimates in the ||C|| ∈ [1, 10] metre and θ ∈ [0, 30] degree ranges, where the distance between actual poses and their corresponding estimated poses averages 137 mm despite the pose estimation ambiguity inherent in posea.

Statistics: eN

The estimates for the marker normal show high levels of error and variability (cf. table 5.6): even within the "sweet spot" for eC already mentioned, the estimates for N are on average 26.3◦ off-target, with a standard deviation of 15.4◦.

Point cloud patterns

The point clouds in figure 5.7 show that the skew effect seen in the previous experiments has been corrected. The two non-random plots in that figure however show errors towards the tips of the lines of poses, which could correspond to the inner circle being taken for the outer circle: estimates are placed along the same line, but further away from the camera. The plots for a random simulation provide some verification of the simulation itself, since a volume of approximately the expected size and shape is filled.

Application speed

The application remains slow, at just over two frames per second, which is unsurprising since no optimisations were made with respect to speed. Prospects for further work in this area are discussed in chapter 6.

Figure 5.2: Experiment 1, estimates for fixed positions. Panels: (a), (b): (0 0 1000)^T; (c): (0 0 2000)^T; (d), (e): (0 0 3000)^T; (f), (g): (0 0 4000)^T; (h): (0 0 5000)^T.

Figure 5.3: Experiment 1, estimates for slide along z-axis (1-5m)

Figure 5.4: Experiment 2, ellipse detection (units: mm)

Figure 5.5: Experiment 2, eCz vs. Cz (1500 samples, units: mm)

Figure 5.6: Experiment 2, point clouds (green: C; blue: eC; units: mm). Panels: (a), (b): five lines, 200 samples; (c): five lines, 200 samples, eCz scaled; (d): five lines, 2000 samples, eCz scaled; (e): single line, 200 samples, eCz scaled; (f): single line, 200 samples, eCz scaled, detail.

Figure 5.7: Experiment 3, point clouds (units: mm). Panels: (a): five lines, ||C||max = 22m; (b): five lines, ||C||max = 18m; (c), (d): random, θmax = 60◦, ||C||max = 18m.

Figure 5.8: Experiment 3, random mode, eCz vs. Cz (units: mm). Panels: (a): θmax = 30◦, ||C||max = 10m; (b): θmax = 60◦, ||C||max = 10m; (c): θmax = 90◦, ||C||max = 10m; (d): θmax = 60◦, ||C||max = 18m; (e): θmax = 90◦, ||C||max = 18m.

Figure 5.9: Experiment 3, random mode, µ||C − eC|| vs. ||C|| (units: mm). Panels: (a): θmax = 30◦, ||C||max = 10m; (b): θmax = 60◦, ||C||max = 10m; (c): θmax = 90◦, ||C||max = 10m; (d): θmax = 60◦, ||C||max = 18m; (e): θmax = 90◦, ||C||max = 18m.

Figure 5.10: Experiment 3, random mode, µ||C − eC|| vs. θ (units: mm). Panels: (a): θmax = 30◦, ||C||max = 10m; (b): θmax = 60◦, ||C||max = 10m; (c): θmax = 90◦, ||C||max = 10m; (d): θmax = 60◦, ||C||max = 18m; (e): θmax = 90◦, ||C||max = 18m.

Figure 5.11: Experiment 3, random mode, detection rate vs. ||C|| (units: mm). Panels: (a): θmax = 30◦, ||C||max = 10m; (b): θmax = 60◦, ||C||max = 10m; (c): θmax = 90◦, ||C||max = 10m; (d): θmax = 60◦, ||C||max = 18m; (e): θmax = 90◦, ||C||max = 18m.

Figure 5.12: Experiment 3, random mode, detection rate vs. θ. Panels: (a): θmax = 30◦, ||C||max = 10m; (b): θmax = 60◦, ||C||max = 10m; (c): θmax = 90◦, ||C||max = 10m; (d): θmax = 60◦, ||C||max = 18m; (e): θmax = 90◦, ||C||max = 18m.

CHAPTER 6

Conclusion

6.1 Assessment

6.1.1 Validation of [6]

The experiments partly validate the theoretical technique in [6] with respect to the estimation of the marker centre C, but not with respect to that of the marker normal N. Development on the pipeline so far has prioritised the estimates for C. Bugs in the current iteration may cause the errors in the marker normal estimates. Further work on the system may prove to completely validate [6].

6.1.2 Speed

The framerate is subjectively slow when the PC configuration is taken into account (cf. table 5.1), and needs to be addressed if the application is to run on embedded systems.

With a functional prototype complete, it would now be appropriate to optimise the code. To ensure that the optimisation effort is effective, the code should be profiled for speed bottlenecks with a library such as cProfile. Any significant bottlenecks - these would most likely be located in the posea module - could be examined for optimisation, for example with fewer nested loops or more effective use of existing libraries such as NumPy. If this approach proves to be insufficient, the offending sections of code could be re-implemented in C++ or C and re-incorporated into the application with a binding.
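The profiling step described above can be sketched with the standard library alone. Here run_pipeline_once is a hypothetical stand-in for one pipeline iteration (the real entry point would be the posea module's estimation routine):

```python
import cProfile
import io
import pstats

def run_pipeline_once():
    """Hypothetical stand-in for one iteration of the pose pipeline."""
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
run_pipeline_once()
profiler.disable()

# Report the ten most expensive functions by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(10)
report = buf.getvalue()
```

Sorting by cumulative time surfaces the call paths worth optimising first; the same report would guide any later decision to move a hot loop into C or C++.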

6.1.3 Accuracy

The estimates for the marker centre are currently accurate enough for robot navigation in laboratory conditions, notably within the "sweet spot" described in section 5.5.4. The estimates for N would be of little use, even in laboratory conditions. With further work on the
implementation, the estimates for C could provide sufficient accuracy for the intended application, which is described in chapters 1 and 2.

6.2 Further work

The results of the third experiment are promising enough to motivate further work on several aspects of the implementation. This section describes some areas that offer room for improvement.

6.2.1 Source code

The emphasis on building a rapid working prototype of a pose estimation pipeline came at the cost of unit tests, which are currently not implemented. To streamline any further work and to minimise future bugs, unit tests should be implemented as a matter of priority.
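A unit test for this pipeline could look like the following sketch, built on the standard unittest module. The function under test, marker_depth_mm, is a hypothetical helper (not part of the actual codebase) that recovers depth from a fronto-parallel marker's apparent size by similar triangles:

```python
import unittest

def marker_depth_mm(apparent_diameter_px, marker_diameter_mm, focal_px):
    """Hypothetical helper: depth of a fronto-parallel marker from its
    apparent image size, by similar triangles (z = f * D / d)."""
    return focal_px * marker_diameter_mm / apparent_diameter_px

class MarkerDepthTest(unittest.TestCase):
    def test_known_geometry(self):
        # A 100 mm marker imaged at 163 px with f = 1630 px lies at 1000 mm.
        self.assertAlmostEqual(marker_depth_mm(163.0, 100.0, 1630.0), 1000.0)

    def test_farther_marker_appears_smaller(self):
        self.assertLess(marker_depth_mm(200.0, 100.0, 1630.0),
                        marker_depth_mm(100.0, 100.0, 1630.0))

suite = unittest.TestLoader().loadTestsFromTestCase(MarkerDepthTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests of this shape, one small geometric invariant per case, would pin down the unit and scale conventions whose ambiguity caused the z scale bug discussed in section 4.3.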

Debugging

Several potential bugs are mentioned in this report, and should be tackled during future work on the pipeline. Details to check include the camera calibration values, and the units for all geometric values.

6.2.2 Image processing

As seen during the transition from the second to the third experiment (cf. section 4.3), various image processing steps are open to improvement. Currently used functions may yield better results with different parameters, and libraries outside OpenCV may offer better performance for certain tasks. The ellipse module returns two ellipses per circle in simulation mode. Changes to any of the image smoothing, edge detection, contour finding and ellipse fitting functions may correct this, although this could vary with lighting conditions. The ellipse filter would benefit from an ellipse fitting error value of some kind to discern the quality of the fit for each ellipse. Alternative algorithms are available for most of the image processing functions. Further research and experimentation would undoubtedly lead to an improved pose estimation design.
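One way to obtain the fit-quality value suggested above is to fit the general conic to the contour points by least squares and use the RMS algebraic residual as a filter score. This is a sketch in plain NumPy, not the OpenCV fitEllipse algorithm the pipeline uses:

```python
import numpy as np

def fit_conic(x, y):
    """Least-squares conic fit ax^2 + bxy + cy^2 + dx + ey + f = 0 under
    ||coeffs|| = 1. Returns the coefficients and the RMS algebraic
    residual, a crude measure of fit quality."""
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    # The smallest right singular vector minimises ||D w|| with ||w|| = 1.
    _, _, vt = np.linalg.svd(D)
    w = vt[-1]
    residual = float(np.sqrt(np.mean((D @ w) ** 2)))
    return w, residual

# Points on an exact ellipse fit almost perfectly...
t = np.linspace(0.0, 2.0 * np.pi, 100)
x, y = 3.0 * np.cos(t), 2.0 * np.sin(t)
_, good = fit_conic(x, y)

# ...while noisy points yield a larger residual, usable as a filter score.
rng = np.random.default_rng(1)
_, bad = fit_conic(x + rng.normal(0.0, 0.2, t.size),
                   y + rng.normal(0.0, 0.2, t.size))
```

Thresholding such a residual would let the ellipse filter reject poor fits, such as the large false-positive ellipse fitted to the marker's supporting structure in the first experiment.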

6.2.3 Experiments

Further experimentation would reveal currently untested performance characteristics, such as the system's resilience against occlusion, and its performance in various real world conditions. The results of future experiments, as well as those of the third experiment, should be compared, against similar metrics, to results for popular fiducial systems in the literature.

6.2.4 Marker design

The literature reviewed in chapter 2 includes many insights into marker design considerations. Further work on improving the marker design should begin with a review of the factors described in [21], regarding, for instance, the optimum ratio of circle sizes.

6.2.5 Identification and decoding

To complete the pipeline for incorporation into a robotic task execution system, further work is required to implement the modules described in section 4.2.5.

6.2.6 Simulator

The simulator has proven to be a cost and time effective means of evaluating the performance of the pose estimation pipeline. Further experiments with a real camera and scene would serve to verify and improve the simulations.

Further work on the simulator should at least include the addition of an interactively controlled model-view matrix, and the ability to include multiple markers in a simulated environment. This would allow the user to dynamically control the camera and perform more extensive experiments that realistically simulate a robot navigation task. The simulator could also provide a useful platform for system demonstrations.

6.3 Summary

The pose estimation system implemented and described here validates the technique for estimating the position of a circle's centre described, in theoretical terms, in [6]. The system currently performs best at estimating the camera coordinates of the centre of a marker within a "sweet spot" in the camera's field of view, located between a near depth of 1 metre and a far depth of 10 metres. Provided the marker viewing angle is below 60◦, the system estimates C with an average error of 1 to 4 millimetres in the x and y directions and of 137 millimetres in the z (depth) direction. The mean error and variance in the estimates gradually increase - notably in the z direction - beyond 10m. Combined with odometry and basic obstacle avoidance sensors, the system could currently support robot navigation in laboratory conditions.

Further work as described in section 6.2 promises to significantly improve the system's performance in future iterations, in terms both of speed and of pose estimation accuracy. The simulation framework implemented for this project provides a useful testbed for future development of the pipeline.

6.4 Acknowledgements

Significant input from friends and other generous acquaintances benefited this thesis, including: Robert Belleman's technical advice on materials and on building a simulation; Rein van den Boomgaard's extensive guidance, insight and encouragement in all aspects of the project; Leo Dorst's assistance with the linear algebra in [6]; Coen Stork and Paul Mendel's support, financial and otherwise; and Jeroen Zuiddam's enthusiastic interest and insight.

Bibliography

[1] M. Appel and N. Navab. "Registration of technical drawings and calibrated images for industrial augmented reality". In: Machine Vision and Applications 13.3 (2002), pp. 111–118.

[2] F. Bergamasco et al. "RUNE-Tag: A high accuracy fiducial marker with strong occlusion resilience". In: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE. 2011, pp. 113–120.

[3] R. Brown. Ellipse implicit equation coefficients. url: www.mathworks.ch/matlabcentral/answers/37124-ellipse-implicit-equation-coefficients.

[4] J. Canny. "A computational approach to edge detection". In: Pattern Analysis and Machine Intelligence, IEEE Transactions on 6 (1986), pp. 679–698.

[5] Chandler et al. Hexagonal, information encoding article, process and system. US Patent 4,874,936. 1989.

[6] Q. Chen, H. Wu, and T. Wada. "Camera calibration with two arbitrary coplanar circles". In: Computer Vision - ECCV 2004 (2004), pp. 521–532.

[7] Y. Cho and U. Neumann. "Multi-ring color fiducial systems for scalable fiducial tracking augmented reality". In: Proc. of IEEE VRAIS. Citeseer. 1998, p. 212.

[8] University of Cordoba. Aruco. url: http://www.uco.es/investiga/grupos/ava/node/26.

[9] M. Fiala. "Artag, a fiducial marker system using digital techniques". In: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 2. IEEE. 2005, pp. 590–596.

[10] Eos Systems Inc. Photomodeler automation, coded targets and photogrammetry targets. url: http://www.photomodeler.com/products/pm-auto.htm.

[11] Intersense. IS-1200 System. url: http://www.intersense.com/pages/21/13.

[12] I.P.H. Kato, M. Billinghurst, and I. Poupyrev. "Artoolkit user manual, version 2.33". In: Human Interface Technology Lab, University of Washington (2000).

[13] S. Lieberknecht et al. "Evolution of a Tracking System". In: Handbook of Augmented Reality (2011), pp. 355–377.

[14] T. Lochmatter et al. "Swistrack - a flexible open source tracking software for multi-agent systems". In: Intelligent Robots and Systems, 2008. IROS 2008. IEEE/RSJ International Conference on. IEEE. 2008, pp. 4004–4010.

[15] M. Lourakis. homest: A C/C++ Library for Robust, Non-linear Homography Estimation. 2010.

[16] Y. Nakazato, M. Kanbara, and N. Yokoya. "Localization of wearable users using invisible retro-reflective markers and an IR camera". In: Proc. SPIE Electronic Imaging. Vol. 5664. 2005, pp. 1234–1242.

[17] Edwin Olson. "AprilTag: A robust and flexible visual fiducial system". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2011.

[18] A. Pagani et al. "Circular markers for camera pose estimation". In: (2011).

[19] J. Rekimoto and Y. Ayatsuka. "CyberCode: designing augmented reality environments with visual tags". In: Proceedings of DARE 2000 on Designing augmented reality environments. ACM. 2000, pp. 1–10.

[20] A.C. Rice, A.R. Beresford, and R.K. Harle. "Cantag: an open source software toolkit for designing and deploying marker-based vision systems". In: Pervasive Computing and Communications, 2006. PerCom 2006. Fourth Annual IEEE International Conference on. IEEE. 2006, 10–pp.

[21] A.C. Rice, R.K. Harle, and A.R. Beresford. "Analysing fundamental properties of marker-based vision system designs". In: Pervasive and Mobile Computing 2.4 (2006), pp. 453–471.

[22] P. Santos et al. "Ptrack: introducing a novel iterative geometric pose estimation for a marker-based single camera tracking system". In: Virtual Reality Conference, 2006. IEEE. 2006, pp. 143–150.

[23] J. Sattar et al. "Fourier tags: Smoothly degradable fiducial markers for use in human-robot interaction". In: Computer and Robot Vision, 2007. CRV'07. Fourth Canadian Conference on. IEEE. 2007, pp. 165–174.

[24] D. Schmalstieg and D. Wagner. "Experiences with handheld augmented reality". In: Mixed and Augmented Reality, 2007. ISMAR 2007. 6th IEEE and ACM International Symposium on. IEEE. 2007, pp. 3–18.

[25] S.W. Shih and T.Y. Yu. "On designing an isotropic fiducial mark". In: Image Processing, IEEE Transactions on 12.9 (2003), pp. 1054–1066.

[26] J. Steinbis, W. Hoff, and T.L. Vincent. "3D fiducials for scalable AR visual tracking". In: Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. IEEE Computer Society. 2008, pp. 183–184.

[27] H. Uchiyama and E. Marchand. "Deformable random dot markers". In: Mixed and Augmented Reality (ISMAR), 2011 10th IEEE International Symposium on. IEEE. 2011, pp. 237–238.

[28] D. Wagner and D. Schmalstieg. "Artoolkitplus for pose tracking on mobile devices". In: Proceedings of 12th Computer Vision Winter Workshop (CVWW'07). 2007, pp. 139–146.

[29] X. Zhang, S. Fronz, and N. Navab. "Visual marker detection and decoding in AR systems: A comparative study". In: Proceedings of the 1st International Symposium on Mixed and Augmented Reality. IEEE Computer Society. 2002, p. 97.

[30] X. Zhang, Y. Genc, and N. Navab. "Taking AR into large scale industrial environments: Navigation and information access with mobile computers". In: Augmented Reality, 2001. Proceedings. IEEE and ACM International Symposium on. IEEE. 2001, pp. 179–180.
