
Chapter 2

Reference Frames and Roto-translations

Before we start the study of the kinematics and dynamics of rigid bodies and multi-body systems, it is appropriate to recall some geometrical concepts used to describe the basic quantities and the associated transformations that characterize the motion of a rigid body in space.

We will start with a formal definition of reference frames, and then we will introduce the translation, rotation and roto-translation operators, which are essential for the study of the motion of rigid bodies.

2.1 Tridimensional space

For simplicity, from now on we will assume that we are confined to a three-dimensional world, except when we study two-dimensional problems, as in planar motion or in robotic computer vision; consequently, vectors will be described as elements of the 3D space $\mathbb{R}^3$, or $\mathbb{E}^3$ if the Euclidean norm is implicit.

With no intention to raise philosophical questions, we can assume that the physical world around us, including the geometric entities we perceive, exists independently of any reference frame. For modelling purposes, however, it is very often necessary to express vectors with respect to one or more reference frames; we can say that fixing a coordinate system and the related reference frame “gives substance” to vectors: these can now be compared and measured relative to a common ruler. Moreover, suitable operations acting on vectors allow us to determine, represent and measure geometric entities such as angles, distances, orthogonality and projections, or physical quantities such as fields, powers, angular and linear velocities, etc.

In principle, there is a difference between a reference frame and a coordinate system: the latter indicates the abstract structure used to define vectors, while the former


10 Basilio Bona - Dynamic Modelling

indicates a specific way to define the parameters that characterize the vectors; examples of coordinate systems are the cylindrical, the spherical or the rectangular (cartesian) coordinate system, while examples of (cartesian) reference frames are the Earth-centered inertial frame, the International Celestial Reference Frame (ICRF) and others [25]. In the following we will use the two terms interchangeably to indicate a cartesian reference coordinate system, as specified in the following paragraphs.

2.2 Reference frames

A reference frame is defined by the symbol $\mathcal{R}(O, \mathbf{i}, \mathbf{j}, \mathbf{k})$, or $\mathcal{R}$ for short, where $O$ is a particular point in space, called the origin, and $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are three unit vectors, defining the metric properties of the space.

Since it is circular to define vectors in relation to a reference frame that is itself built on vectors, it is necessary to understand how to construct $\mathcal{R}$ without making use of vectors and operations not yet defined.

The only things we need in order to build a reference frame are the concept of right angles, i.e., orthogonality between lines, and three geometric directed segments (or directed numbers) $\overrightarrow{A'A}$, $\overrightarrow{B'B}$ and $\overrightarrow{C'C}$ of equal length, which play the role of rulers of unit length in the 3D space. The directed segment is a basic concept described in any introductory physics textbook, and is briefly recalled in Appendix A.3.

Having fixed the origin in $O$, we place the directed segments so that the three starting points $A'$, $B'$ and $C'$ coincide with $O$. Furthermore, the three segments must not be aligned with each other, nor lie all on the same plane.

Usually these conditions are sufficient to characterize the reference frame, but, in order to preserve orthogonal angles and orthogonal projections when using vector products, we orient the three segments at angles of $\pi/2$ with respect to each other. These constraints allow only two possibilities, which are illustrated in Figure 2.1a) and b). The reference frame in Figure 2.1a) is called a right-handed reference frame, while that of Figure 2.1b) is a left-handed reference frame. These names derive from the right-hand rule or the left-hand rule, illustrated in Figure ??.

The most commonly used reference frame is the right-handed one (from now on simply called RHF), and we will tacitly adopt this convention in the rest of the book.

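In Figure 2.1, the three directed segments have been identified with the symbols

$$\mathbf{i} = \overrightarrow{OA}, \qquad \mathbf{j} = \overrightarrow{OB}, \qquad \mathbf{k} = \overrightarrow{OC} \tag{2.1}$$

The set $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is called the base vector set, and $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are called base or basis vectors. We also say that $\mathbf{i}, \mathbf{j}, \mathbf{k}$ form a right or left orthonormal basis in $\mathbb{R}^3$.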

We are now able to use R(O, i , j , k) to characterize any other vector in R3.

Figure 2.1: a) right orthonormal reference frame; b) left orthonormal reference frame.

Given a reference frame $\mathcal{R}$, a vector $\mathbf{v} \in \mathbb{R}^3$ is represented by three components, each one being the orthogonal projection of the directed segment on one of the three elements $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$. This can be written as a linear combination of the base elements, and gives rise to the well-known algebraic definition of vectors:

v = v1i + v2j + v3k (2.2)

where vi ∈ R, i = 1, 2, 3, are the three vector components with respect to R.

In order to be able to define the values of $v_i$, it is necessary to introduce a binary operation between two vectors $\mathbf{a}$ and $\mathbf{b}$, called the scalar product or dot product, and defined as:

a · b = a1b1 + a2b2 + a3b3 = b · a (2.3)

We have defined this product and its properties in Appendix A.4.1; for the moment, relation (2.3) is sufficient for our aims. The three components $v_i$ are defined as:

$$v_1 = \mathbf{i} \cdot \mathbf{v}, \qquad v_2 = \mathbf{j} \cdot \mathbf{v}, \qquad v_3 = \mathbf{k} \cdot \mathbf{v} \tag{2.4}$$


As i , j , k are vectors themselves, each one can be expressed according to (2.2):

i = i1i + i2j + i3k

j = j1i + j2j + j3k (2.5)

k = k1i + k2j + k3k

Since $\mathbf{i}$ is orthogonal to $\mathbf{j}$ and $\mathbf{k}$, and has unit length, its first component must be equal to one, while the other two must be zero; the same argument holds for $\mathbf{j}$ and $\mathbf{k}$ as well.

Hence, if we define

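$$\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad \mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \tag{2.6}$$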

it follows that

i = e1 j = e2 k = e3 (2.7)

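Therefore the representation of the reference frame $\mathcal{R}$ in itself is given by the identity matrix

$$\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \end{bmatrix} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \end{bmatrix} = \mathbf{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{2.8}$$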

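According to (2.2), each vector can be represented by a column of real components that we call a column vector:

$$\mathbf{v} = \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \mathbf{I} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \tag{2.9}$$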

In (2.9) we have implicitly assumed that the reader knows the matrix notation and the row-by-column product rule.

Although notation (2.8) may appear pedantic and superfluous, we will see in Section ?? that it is possible to represent a reference frame $\mathcal{R}_2$ with respect to another reference frame $\mathcal{R}_1$ by replacing the identity matrix $\mathbf{I}$ with a square orthogonal matrix $\mathbf{R}$, having the specific geometric properties described in Section ??.

Relation (2.9) is the simplest form of a more general relation $\mathbf{v}^1 = \mathbf{R}\mathbf{v}^2$ that provides the representation in $\mathcal{R}_1$ of a vector $\mathbf{v}^2$ with components given in $\mathcal{R}_2$.

To indicate the single components of a vector $\mathbf{v}$ one can adopt numerical indexes $k = 1, 2, 3$ or literal indexes $k \in \{x, y, z\}$. While the literal indexes have an immediate meaning, they are more difficult to use when computer algorithms or mathematical formulas are considered; for example, the squared norm of a vector can be written as

$$\|\mathbf{v}\|^2 = \sum_{k \in \{x, y, z\}} v_k^2$$

or

$$\|\mathbf{v}\|^2 = \sum_{k=1}^{3} v_k^2$$

The second form is much more immediate and, except in some particular cases, the numerical indexes will be adopted throughout these notes.

Often we omit the origin of a reference frame, which is then simply indicated as $\mathcal{R}(\mathbf{i}, \mathbf{j}, \mathbf{k})$. Other times we use an index to specify a particular reference frame; for example, we use the symbol $\mathcal{R}_m(\mathbf{i}_m, \mathbf{j}_m, \mathbf{k}_m)$ to indicate a “local” frame and $\mathcal{R}_0(\mathbf{i}_0, \mathbf{j}_0, \mathbf{k}_0)$ to specify an “inertial” frame (see footnote 1).

Given a specific reference frame $\mathcal{R}_m$ and a geometrical point $P$, the latter is described in $\mathcal{R}_m$ by the (geometrical) vector $\mathbf{v}^m_p$ that represents the oriented segment $\overrightarrow{OP}$:

$$\mathbf{v}^m_p = v^m_1\, \mathbf{i}^m_m + v^m_2\, \mathbf{j}^m_m + v^m_3\, \mathbf{k}^m_m \tag{2.10}$$

This notation highlights the reference frame $m$ that is used to represent $P$. The index $m$ is omitted whenever there are no ambiguities.

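Given two reference frames $\mathcal{R}_1$ and $\mathcal{R}_2$, the same geometrical point $P$ has two representations:

$$\mathbf{v}^1_p = \begin{bmatrix} v^1_1 & v^1_2 & v^1_3 \end{bmatrix}^T \quad \text{in } \mathcal{R}_1 \tag{2.11}$$

$$\mathbf{v}^2_p = \begin{bmatrix} v^2_1 & v^2_2 & v^2_3 \end{bmatrix}^T \quad \text{in } \mathcal{R}_2 \tag{2.12}$$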

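Alternatively, we can use one of the following symbols to indicate the representation of $\mathbf{v}$ or $\overrightarrow{UV}$ in $\mathcal{R}_m$:

$$\left[\overrightarrow{UV}\right]_{\mathcal{R}_m} \qquad [\mathbf{v}]_{\mathcal{R}_m} \qquad [\mathbf{v}]^{\mathcal{R}_m} \qquad \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}_{\mathcal{R}_m} \tag{2.13}$$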

2.3 Vector types

We represent vectors with a graphical icon: the most used icon is an arrow, as in Figure 2.2. Unfortunately this icon is sometimes misleading, as we will see, considering that there are two types of vectors with different properties: polar vectors and axial vectors, also called pseudovectors (for details see also [2]). Examples of polar vectors are those used for geometrical points, linear velocities and accelerations, forces, gradients, normals, unit vectors, etc., while examples of axial vectors are angular velocities, torques, moments, cross products, etc.

Footnote 1: We use the term inertial for pseudo-inertial reference frames, such as those fixed in the environment.

Figure 2.2: The vector icon.

To make things clear one should associate a different icon with each type of vector: the arrow with polar vectors, and a different icon, e.g., a segment with a small curl, with axial vectors, as sketched in Figure 2.3.

Figure 2.3: Two different icons: an arrow for polar vectors, a segment with a curl for axial vectors.

Unfortunately this is not the case, and we have the same symbol for both; when we want to interpret the arrow icon for axial vectors we should perform the implicit transformation depicted in Figure 2.4.

A formal characterization of the two types of vectors is possible considering the property of invariance with respect to reflections. Although reflections will be mathematically defined only in Sections ?? and 2.8, we can for the moment rely on their intuitive meaning.



Figure 2.4: To interpret the arrow icon for an axial vector one shall perform the transformation depicted here, i.e., align the right-hand thumb with the arrow axis to obtain the curl from the other fingers.

2.3.1 Polar vectors

Consider Figure 2.5, where a vector $\mathbf{v}_1$ is reflected with respect to two orthogonal planes: the plane $\pi_1$ parallel to $\mathbf{v}_1$, and the plane $\pi_2$ orthogonal to $\mathbf{v}_1$. The resulting vectors are $\mathbf{v}_2$ and $\mathbf{v}_3$: since $\mathbf{v}_2 = \mathbf{v}_1$, the properties of the two vectors are the same, while, given that $\mathbf{v}_3 = -\mathbf{v}_1$, the reflection reverses the direction of the vector, and consequently its physical significance.

Examples of physical polar vectors are displacements, linear velocities and forces, among others.

2.3.2 Axial vectors

Axial vectors are vectors that represent directed quantities that are antisymmetrical with respect to a reflection through a parallel plane $\pi_1$, and symmetrical with respect to a reflection through a perpendicular plane $\pi_2$, as shown in Figure 2.6.

In this case the reflections have completely different effects: considering the vector $\mathbf{v}_3$ obtained by a reflection with respect to $\pi_2$, the curl icon does not change direction, so we can say that $\mathbf{v}_3 = \mathbf{v}_1$. Had we used the arrow icon, our vector would have changed direction and therefore its physical meaning. A reflection with respect to $\pi_1$ changes the curl direction and, using the right-hand rule, this means a reversed effect, i.e., $\mathbf{v}_2 = -\mathbf{v}_1$.

Examples of axial vectors are angular velocities and torques.

The distinction between axial and polar vectors is often omitted in mechanics textbooks but, among other applications, it will become important when using quaternions to represent rotations, as presented in Sections 2.12.6 and 2.12.7.


Figure 2.5: The symmetry and anti-symmetry of polar vectors.

It is important to add that the cross product of two polar vectors produces an axial vector

$$\mathbf{a}\ (\text{polar}) \times \mathbf{b}\ (\text{polar}) = \mathbf{c}\ (\text{axial})$$

The interested reader can find further details in [2] and [51].
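The axial character of a cross product can be checked numerically. The following minimal sketch is not taken from the references; the helper names `cross` and `reflect` and the sample vectors are illustrative assumptions. It mirrors two polar vectors through the plane $z = 0$ and compares the cross product of the mirrored vectors with the mirror image that a polar vector would have.

```python
def cross(a, b):
    """Cross product of two 3-component lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def reflect(v, n):
    """Mirror v through the plane with unit normal n: v - 2 (v . n) n."""
    d = sum(vi * ni for vi, ni in zip(v, n))
    return [vi - 2.0 * d * ni for vi, ni in zip(v, n)]

a = [1.0, 2.0, 0.0]          # polar vector (illustrative values)
b = [0.0, 1.0, 3.0]          # polar vector (illustrative values)
n = [0.0, 0.0, 1.0]          # unit normal of the mirror plane z = 0

c = cross(a, b)
c_from_mirrored = cross(reflect(a, n), reflect(b, n))  # what the mirrored world computes
c_mirrored_as_polar = reflect(c, n)                    # what a polar vector would do

print(c_from_mirrored)
print(c_mirrored_as_polar)
```

The two printed vectors are opposite to each other: the component of $\mathbf{c}$ parallel to the mirror plane is reversed and the perpendicular one is preserved, which is the behaviour described for axial vectors in Section 2.3.2.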

2.4 Geometrical and Physical vectors

In these notes vectors will be used to represent both geometrical points, such as the set of vertices of a rigid object, the barycenter of a plate, the location of a fixture, the position of a joint, etc., and signed quantities having a physical significance, such as linear or angular velocities and accelerations, forces, moments, torques, etc.

We will indicate the former as geometrical vectors, while the latter are known as physical vectors.

2.4.1 Geometrical vectors

A geometric point $P \in \mathbb{R}^3$ is described by the associated vector $\mathbf{p}$ that contains the representation of the directed segment $\overrightarrow{OP}$ in $\mathcal{R}(O, \mathbf{i}, \mathbf{j}, \mathbf{k})$.


Figure 2.6: Axial vector symmetry.

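Given two different reference frames, $\mathcal{R}_1(O_1, \mathbf{i}_1, \mathbf{j}_1, \mathbf{k}_1)$ and $\mathcal{R}_2(O_2, \mathbf{i}_2, \mathbf{j}_2, \mathbf{k}_2)$, the same geometric point $P$ is associated with two different vectors $\mathbf{p}_1$ and $\mathbf{p}_2$, respectively the representation of $P$ in $\mathcal{R}_1$ and in $\mathcal{R}_2$.

As an example, a velocity defined in the plane by the vector $\mathbf{v}$ in Figure 2.7 has a representation $\mathbf{v}_a = (v_{a1}, v_{a2})$ in the cartesian reference frame $\mathcal{R}_a$ that is different from the representation $\mathbf{v}_b = (v_{b1}, v_{b2})$ in $\mathcal{R}_b$.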

The relation between p1 and p2 will be discussed in Section 2.9.

2.4.2 Physical vectors

A physical vector $\overrightarrow{QP}$ is an oriented segment, also known as directed segment or directed quantity, that represents a physical quantity, such as linear or angular velocity, gravitational acceleration, force, torque, etc.

We cite some lines from [24]:

(1) A physical vector is a quantity with some physical origin. However general this may be, it already expresses some other interpretation of a vector, since the mathematical vector space axioms make no requirement as to the origin or characteristic of the vectors.

Figure 2.7: A vector as an abstract entity and its representation in two different reference frames. The vector $\mathbf{v}$ is the same, but the two representations have different components in $\mathcal{R}_a$ and $\mathcal{R}_b$.

(2) A physical vector has a magnitude, which is not part of the initial description of a mathematical vector. If, however, one introduces magnitudes by means of the additional structure of an inner product, these are real numbers, and not, as in physics, dimension-laden physical scalars.

(3) Finally, a physical vector has a direction in (physical) space, because the physical vector spaces described above have a close relation to the position vector space. There is no correspondence here with the mathematical concept of a vector, since the axioms make no mention of a physical space.

A physical vector has an application point $Q$, which can be free or constrained, a direction, and a magnitude, as in Figure 2.2.

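It is usual to represent the vector $\overrightarrow{QP}$ as the difference between two geometrical vectors, as in Figure 2.8:

$$\mathbf{v} = \mathbf{v}_{QP} = \mathbf{v}_P - \mathbf{v}_Q = \begin{bmatrix} p_1 - q_1 \\ p_2 - q_2 \\ p_3 - q_3 \end{bmatrix} \equiv \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \in \mathbb{R}^3$$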

It should be noted that these vectors can be of two types: free vectors, and applied or point vectors. With free vectors, the application point $Q$ has no physical or geometric significance; one can translate $\mathbf{v}$ parallel to itself without producing a variation in any effect it could generate. On the contrary, for applied vectors, the application point $Q$ cannot be changed without affecting in some way the physical significance of the effects it represents. As an example, a force acting on a rigid body is an applied vector, as one cannot change its application point without, in general, affecting the torque about the mass center of the body itself; the linear velocity of a rigid object, on the contrary, can be translated to any other point of the object without causing much trouble.

Figure 2.8: The oriented segment is represented by the difference of two geometrical points: $\mathbf{v}_{QP} = \mathbf{v}_P - \mathbf{v}_Q$.

2.4.3 Vector units

When it is necessary to assign physical units to vector components, two approaches are possible: 1) assign the units to the axes; 2) assign the units to each component.

The first approach is more elegant, but often we have vector components with different units; for instance, the state vector of a particle moving on the plane has two components, namely its position and its velocity. In this case it is more convenient to have dimensionless axes and assign units to the components; however, this approach produces some interpretation problems when we compute a scalar product or the norm of the vector, which results in the square root of space squared plus velocity squared. Nevertheless, it is common practice to apply the second approach, which will also be used in the present notes.

2.5 Rigid Bodies and Their Displacements

With the term rigid body we define any three-dimensional object for which the distance between any pair of its points remains constant in time, independently of any motion or any force or torque applied to it. Similarly, we call rigid displacement or rigid motion the motion of a rigid body in space.

A rigid body is an abstraction, since in nature (or at least in the macro-world where mechatronic modelling is used) perfectly rigid bodies do not exist. Every object is distorted, warped or deformed under the effects of static or dynamic forces or torques; similarly, the rigid motion is an abstraction. Consider a steel plate that, under the effect of its mass and the gravity acceleration (a static force), bends into different shapes according to its position with respect to the vertical. Even neglecting gravity, if we take a steel plate and accelerate it, it will flex, and return to its original shape only after the resulting vibrations are damped.

Nevertheless, those abstractions make the characterization of the geometry of three-dimensional space simpler and rich in theoretical and practical developments. So, as pointed out in the introduction, we will make this approximating assumption and study the models of rigid displacements.

Chasles demonstrated (1830) that any rigid displacement in space can always be decomposed into a translation and a rotation; more precisely, the Chasles theorem states that the most general displacement of a rigid body in $\mathbb{R}^3$ is the composition of a rigid translation along a line and a rotation around an axis parallel to the same line [3, 4].

Given the invariance of the distances between any two points in a rigid body, it is possible to describe a rigid body $B$ with respect to a reference frame $\mathcal{R}_B(\mathbf{i}_B, \mathbf{j}_B, \mathbf{k}_B)$, since once the origin and the orientation of the body reference frame are known, all the points of $B$ are also known or can be determined.

2.6 Translations

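Given a vector $\mathbf{v} = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}^T$, a rigid translation is the operator $\mathrm{Trasl}(\mathbf{v}, \mathbf{t})$ that displaces $\mathbf{v}$ parallel to itself by a given vector $\mathbf{t}$:

$$\mathrm{Trasl}(\mathbf{v}, \mathbf{t}) \equiv \mathbf{v}^t \stackrel{\text{def}}{=} \begin{bmatrix} v_1 + t_1 \\ v_2 + t_2 \\ v_3 + t_3 \end{bmatrix} = \mathbf{v} + \mathbf{t} \tag{2.14}$$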

Since the oriented segment $\overrightarrow{AB}$ is the difference between $\overrightarrow{OB}$ and $\overrightarrow{OA}$

$$\mathbf{v}_{AB} = \mathbf{v}_B - \mathbf{v}_A \tag{2.15}$$

the translation of an oriented segment is simply

$$\mathbf{v}^t_{AB} = \mathbf{v}^t_B - \mathbf{v}^t_A = \mathbf{v}_B + \mathbf{t} - (\mathbf{v}_A + \mathbf{t}) = \mathbf{v}_B - \mathbf{v}_A = \mathbf{v}_{AB} \tag{2.16}$$

from which we deduce that the representation of an oriented segment is invariant with respect to translations. Since oriented segments usually represent physical quantities, such as forces, torques, velocities, etc., we say that the representation of these quantities is invariant to translations.

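Now assume we have two different reference frames $\mathcal{R}_k$ and $\mathcal{R}_m$, having the same origin and the same basis $\{\mathbf{i}_k, \mathbf{j}_k, \mathbf{k}_k\} = \{\mathbf{i}_m, \mathbf{j}_m, \mathbf{k}_m\} = \{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$, and assume we translate $\mathcal{R}_m$ with respect to $\mathcal{R}_k$ so that its new origin is now $O'$. We call $\mathbf{t}^k_m$ the oriented segment $\overrightarrow{OO'}$, which represents the translation from reference frame $k$ to reference frame $m$.

A geometrical point $P$, represented in $\mathcal{R}_m$ by $\overrightarrow{O'P} = \mathbf{v}^m_P$, will be represented in $\mathcal{R}_k$ by

$$\overrightarrow{OP} = \mathbf{v}^k_P = \mathbf{v}^m_P + \mathbf{t}^k_m \tag{2.17}$$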

and so we can conclude that, while the representation of physical vectors is unaffected by rigid translations as in (2.16), the representation of a geometrical point in a translated reference frame adds the translation vector, as in (2.17).
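The difference between (2.16) and (2.17) can be illustrated with a few lines of code. This is only a sketch with made-up numbers: the names `t_km`, `A_m`, `B_m` and the helper functions are hypothetical and simply mirror the symbols $\mathbf{t}^k_m$, $\mathbf{v}^m_A$ and $\mathbf{v}^m_B$ used in the text.

```python
def add(a, b):
    """Component-wise sum of two 3-component lists."""
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    """Component-wise difference of two 3-component lists."""
    return [x - y for x, y in zip(a, b)]

t_km = [1.0, 2.0, 0.0]   # origin O' of R_m as seen from R_k (illustrative values)
A_m = [0.0, 0.0, 1.0]    # point A represented in R_m
B_m = [2.0, 1.0, 1.0]    # point B represented in R_m

# Geometrical points: the translation vector is added, as in (2.17)
A_k = add(A_m, t_km)
B_k = add(B_m, t_km)

# Oriented segment AB: the translation cancels out, as in (2.16)
AB_m = sub(B_m, A_m)
AB_k = sub(B_k, A_k)

print(A_k, B_k)      # points change representation
print(AB_m, AB_k)    # the segment does not
```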

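The inverse operator of the translation is defined as

$$\mathrm{Trasl}(\mathbf{v}, \mathbf{t})^{-1} = \mathrm{Trasl}(\mathbf{v}, -\mathbf{t})$$

so that, from (2.17),

$$\mathbf{v}^m_P = \mathbf{v}^k_P - \mathbf{t}^k_m = \mathbf{v}^k_P + \mathbf{t}^m_k$$

and

$$-\mathbf{t}^k_m = \mathbf{t}^m_k$$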

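Since the translation operator is represented by a vector sum, the composition of translations (indicated here as a product) is commutative

$$\mathrm{Trasl}(\mathbf{v}, \mathbf{a})\,\mathrm{Trasl}(\mathbf{v}, \mathbf{b}) = \mathrm{Trasl}(\mathbf{v}, \mathbf{b})\,\mathrm{Trasl}(\mathbf{v}, \mathbf{a}) = \mathrm{Trasl}(\mathbf{v}, \mathbf{a} + \mathbf{b})$$

i.e., given $n$ translations $\mathbf{t}_i$, $i = 1, \dots, n$, the total translation is

$$\mathbf{t} = \sum_{i=1}^{n} \mathbf{t}_i$$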

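Strange as it may appear, the translation by a quantity $\mathbf{t}$ is not a linear operator, since it does not obey the axioms listed in (A.1).

Indeed, by definition,

$$\mathrm{Trasl}(\lambda_1 \mathbf{a} + \lambda_2 \mathbf{b}, \mathbf{t}) = \lambda_1 \mathbf{a} + \lambda_2 \mathbf{b} + \mathbf{t}$$

whereas linearity would require this to be equal to

$$\lambda_1 \mathrm{Trasl}(\mathbf{a}, \mathbf{t}) + \lambda_2 \mathrm{Trasl}(\mathbf{b}, \mathbf{t}) = \lambda_1 (\mathbf{a} + \mathbf{t}) + \lambda_2 (\mathbf{b} + \mathbf{t}) = \lambda_1 \mathbf{a} + \lambda_2 \mathbf{b} + (\lambda_1 + \lambda_2)\,\mathbf{t}.$$

The two relations coincide only if $\lambda_1 + \lambda_2 = 1$; this fact characterizes an affine space rather than a linear space, as described in [14].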
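The argument above can be reproduced numerically. The sketch below is illustrative only: `trasl`, `lincomb` and `linearity_gap` are hypothetical helper names, and the vectors are arbitrary. It evaluates the difference between the two expressions, which should equal $(\lambda_1 + \lambda_2 - 1)\,\mathbf{t}$.

```python
def trasl(v, t):
    """Translation operator of (2.14): component-wise sum v + t."""
    return [vi + ti for vi, ti in zip(v, t)]

def lincomb(l1, a, l2, b):
    """Linear combination l1*a + l2*b of two 3-component lists."""
    return [l1 * ai + l2 * bi for ai, bi in zip(a, b)]

a = [1.0, 0.0, 2.0]   # illustrative values
b = [0.0, 3.0, 1.0]
t = [1.0, 1.0, 1.0]

def linearity_gap(l1, l2):
    """l1*Trasl(a,t) + l2*Trasl(b,t) minus Trasl(l1*a + l2*b, t)."""
    required = lincomb(l1, trasl(a, t), l2, trasl(b, t))
    actual = trasl(lincomb(l1, a, l2, b), t)
    return [r - x for r, x in zip(required, actual)]

print(linearity_gap(2.0, 3.0))    # l1 + l2 = 5: gap is 4 * t
print(linearity_gap(0.25, 0.75))  # l1 + l2 = 1: the gap vanishes
```

The second call corresponds to an affine combination, the only case in which the translation behaves as a linear map would.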


2.7 Rotations

A well-known theorem by Euler, published in 1776, states that in three-dimensional space any displacement of a rigid body such that a point on the rigid body remains fixed is equivalent to a single rotation by a given angle about some axis that contains the fixed point.

In modern symbols we indicate the rotation axis with a unit vector $\mathbf{u}$ and the angle with $\theta$, so that the rotation is characterized by the vector $\mathbf{v} = \mathbf{u}\theta$; unfortunately, as we will see, the composition of two or more rotations cannot be reduced to the sum of the related vectors, and a more complex characterization is necessary.

We can describe the rotation by assuming a rigid body with its reference frame and considering a rigid displacement that leaves the reference frame origin $O$ fixed, while the basis unit vectors are changed by the rotation. A rotation is therefore characterized by the mathematical relation between these two reference frames.

Figure 2.9: An example of rigid rotation of a frame with respect to the origin O.

In order to represent a rotation we must consider two reference frames $\mathcal{R}_m(O, \mathbf{i}_m, \mathbf{j}_m, \mathbf{k}_m)$ and $\mathcal{R}_n(O, \mathbf{i}_n, \mathbf{j}_n, \mathbf{k}_n)$, with common origin $O$; initially they are coincident, i.e., each unit vector of the first basis $\mathbf{i}_m, \mathbf{j}_m, \mathbf{k}_m$ coincides with the corresponding unit vector of the second basis $\mathbf{i}_n, \mathbf{j}_n, \mathbf{k}_n$.

Now we take one frame, for instance $\mathcal{R}_n$, and rotate it around the common fixed origin $O$ by an arbitrary angle $\theta$. The rotation axis is itself arbitrary, and we call $\mathbf{u}$ the unit vector that represents it; the only condition is that the origin $O$ must lie on the rotation axis.

At the end of the rotation $\mathcal{R}_n$ has taken a different “orientation” with respect to $\mathcal{R}_m$, and the two bases will no longer be completely coincident, although it is always possible that some of the unit vectors remain coincident. We can now define the representation of each basis unit vector of $\mathcal{R}_n$ in $\mathcal{R}_m$, as follows

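$$\mathbf{i}^m_n \stackrel{\text{def}}{=} i^m_{n1}\,\mathbf{i}_m + i^m_{n2}\,\mathbf{j}_m + i^m_{n3}\,\mathbf{k}_m = \begin{bmatrix} i^m_{n1} \\ i^m_{n2} \\ i^m_{n3} \end{bmatrix} \tag{2.18}$$

$$\mathbf{j}^m_n \stackrel{\text{def}}{=} j^m_{n1}\,\mathbf{i}_m + j^m_{n2}\,\mathbf{j}_m + j^m_{n3}\,\mathbf{k}_m = \begin{bmatrix} j^m_{n1} \\ j^m_{n2} \\ j^m_{n3} \end{bmatrix} \tag{2.19}$$

$$\mathbf{k}^m_n \stackrel{\text{def}}{=} k^m_{n1}\,\mathbf{i}_m + k^m_{n2}\,\mathbf{j}_m + k^m_{n3}\,\mathbf{k}_m = \begin{bmatrix} k^m_{n1} \\ k^m_{n2} \\ k^m_{n3} \end{bmatrix} \tag{2.20}$$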

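Considering the above relations and recalling that the components of a generic vector

$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$$

are given by

$$v_1 = \mathbf{v} \cdot \mathbf{i} = \mathbf{v}^T \mathbf{i} = \mathbf{i}^T \mathbf{v}$$

$$v_2 = \mathbf{v} \cdot \mathbf{j} = \mathbf{v}^T \mathbf{j} = \mathbf{j}^T \mathbf{v}$$

$$v_3 = \mathbf{v} \cdot \mathbf{k} = \mathbf{v}^T \mathbf{k} = \mathbf{k}^T \mathbf{v}$$

it is possible to introduce a matrix $\mathbf{R}^m_n$, whose columns are the representations of the basis unit vectors of $\mathcal{R}_n$ in $\mathcal{R}_m$:

$$\mathbf{R}^m_n = \begin{bmatrix} \mathbf{i}^m_n & \mathbf{j}^m_n & \mathbf{k}^m_n \end{bmatrix} = \begin{bmatrix} i^m_{n1} & j^m_{n1} & k^m_{n1} \\ i^m_{n2} & j^m_{n2} & k^m_{n2} \\ i^m_{n3} & j^m_{n3} & k^m_{n3} \end{bmatrix} \tag{2.21}$$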

Therefore the matrix $\mathbf{R}^m_n$ represents the transformation that describes the rotated frame $\mathcal{R}_n$ with respect to the fixed frame $\mathcal{R}_m$, the two frames having a common origin.

According to (2.21), Rmn has the following structure:


• the first column is the representation of the first unit basis vector of $\mathcal{R}_n$ in $\mathcal{R}_m$,

• the second column is the representation of the second unit basis vector of $\mathcal{R}_n$ in $\mathcal{R}_m$,

• the third column is the representation of the third unit basis vector of $\mathcal{R}_n$ in $\mathcal{R}_m$.

It follows that the matrix $\mathbf{R}^m_n$ can be interpreted as the representation of $\mathcal{R}_n$ in $\mathcal{R}_m$.

By a convention adopted in many textbooks, the lower index $n$ of $\mathbf{R}$ denotes the represented reference frame (the rotated one), while the upper index $m$ denotes the reference frame where we represent it (the fixed one).

Notice that the terms “rotated” and “fixed” are only used to give a meaning to a relative displacement and do not always imply a “real” rotation; we can just as well imagine leaving $\mathcal{R}_n$ “fixed” and rotating $\mathcal{R}_m$.

We can build $\mathbf{R}^n_m$, instead of $\mathbf{R}^m_n$, using similar arguments.

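In conclusion, we have

$$\mathbf{R}^m_n = \begin{bmatrix} \mathbf{i}_m^T \mathbf{i}_n & \mathbf{i}_m^T \mathbf{j}_n & \mathbf{i}_m^T \mathbf{k}_n \\ \mathbf{j}_m^T \mathbf{i}_n & \mathbf{j}_m^T \mathbf{j}_n & \mathbf{j}_m^T \mathbf{k}_n \\ \mathbf{k}_m^T \mathbf{i}_n & \mathbf{k}_m^T \mathbf{j}_n & \mathbf{k}_m^T \mathbf{k}_n \end{bmatrix}$$

and

$$\mathbf{R}^n_m = \begin{bmatrix} \mathbf{i}_n^T \mathbf{i}_m & \mathbf{i}_n^T \mathbf{j}_m & \mathbf{i}_n^T \mathbf{k}_m \\ \mathbf{j}_n^T \mathbf{i}_m & \mathbf{j}_n^T \mathbf{j}_m & \mathbf{j}_n^T \mathbf{k}_m \\ \mathbf{k}_n^T \mathbf{i}_m & \mathbf{k}_n^T \mathbf{j}_m & \mathbf{k}_n^T \mathbf{k}_m \end{bmatrix}$$

where each column of $\mathbf{R}^m_n$ collects the components (2.18)-(2.20) of one basis vector of $\mathcal{R}_n$, and, by inspection,

$$\mathbf{R}^m_n = (\mathbf{R}^n_m)^T \tag{2.22}$$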

Since the matrix $\mathbf{R}^m_n$ describes a rigid rotation around an axis that goes through the common origin of two reference frames, it is called a rotation matrix.

Equation (2.22) shows that the inverse rotation from $n$ to $m$ is represented by the transpose of the matrix that represents the rotation from $m$ to $n$. A matrix whose inverse is equal to its transpose is called orthonormal (or, as in some textbooks, orthogonal). All rotation matrices are orthonormal matrices, whose properties are described in Appendix B.7.1.
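As a quick numerical illustration of (2.21) and (2.22), the following sketch builds both matrices from dot products of basis vectors. It assumes, purely for illustration, that $\mathcal{R}_n$ is obtained from $\mathcal{R}_m$ by a rotation of 30 degrees about $\mathbf{k}_m$; the variable and function names are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product (2.3) of two 3-component lists."""
    return sum(x * y for x, y in zip(a, b))

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

g = math.radians(30.0)            # assumed rotation angle about k_m
cg, sg = math.cos(g), math.sin(g)

# Basis vectors of both frames, all expressed in R_m
basis_m = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
basis_n = [[cg, sg, 0.0], [-sg, cg, 0.0], [0.0, 0.0, 1.0]]

# Entry (row, col) of R^m_n: (row-th basis vector of R_m) . (col-th basis vector of R_n)
R_mn = [[dot(bm, bn) for bn in basis_n] for bm in basis_m]
# Entry (row, col) of R^n_m: (row-th basis vector of R_n) . (col-th basis vector of R_m)
R_nm = [[dot(bn, bm) for bm in basis_m] for bn in basis_n]

print(R_mn == transpose(R_nm))    # relation (2.22)
print(matmul(R_nm, R_mn))         # close to the identity matrix
```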

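Now we consider two vectors: the first one, indicated as $\mathbf{v}^n_P$, is the representation of a geometrical point $P$ in $\mathcal{R}_n$, while $\mathbf{v}^n_{AB} = \mathbf{v}^n_B - \mathbf{v}^n_A$ is the representation of an oriented segment $\overrightarrow{AB}$ in $\mathcal{R}_n$.

We want to know how to represent both in the reference frame $\mathcal{R}_m$. It turns out that the formula is:

$$\mathbf{v}^m_P = \mathbf{R}^m_n \mathbf{v}^n_P \qquad \text{and} \qquad \mathbf{v}^m_{AB} = \mathbf{R}^m_n \mathbf{v}^n_{AB}$$

i.e., when the reference frame origins are the same, both types of vectors transform in the same way

$$\mathbf{v}^m = \mathbf{R}^m_n \mathbf{v}^n \tag{2.23}$$

and conversely

$$\mathbf{v}^n = \mathbf{R}^n_m \mathbf{v}^m = (\mathbf{R}^m_n)^T \mathbf{v}^m \tag{2.24}$$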

We conclude that a rotation is represented by a square matrix and the rotationoperator applied to a vector is

Rot(v ,R) = Rv

In conclusion, when a generic (orthonormal) rotation matrix $\mathbf{R}^m_n$ is considered, all of the three following characterizations are true:

1. $\mathbf{R}^m_n$ represents a geometrical rotation $\mathrm{Rot}(\mathbf{u}, \theta)$ by an angle $\theta$ around an axis whose unit vector is $\mathbf{u}$ (the positive sense of $\theta$ is given by the right-hand rule), which brings $\mathcal{R}_m$ to overlap with $\mathcal{R}_n$ (from $m$ “fixed” to $n$ “rotated” or “mobile”). The value of the angle $\theta$ and the components of $\mathbf{u}$ do not appear immediately from the matrix elements, but we can compute them, as shown in Section 2.8.1;

2. $\mathbf{R}^m_n$ characterizes the description of the unit basis of $\mathcal{R}_n$ in $\mathcal{R}_m$ (frame $n$ “rotated” or “mobile” in the frame $m$ “fixed”);

3. $\mathbf{R}^m_n$ is the representation of the linear operator that transforms a vector from the reference frame $\mathcal{R}_n$ (“rotated” or “mobile”) into the reference frame $\mathcal{R}_m$ (“fixed”).

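The rotation operator is linear, since it is represented by a matrix and therefore obeys the following property

$$\mathrm{Rot}(\lambda_1 \mathbf{v}^n_1 + \lambda_2 \mathbf{v}^n_2, \mathbf{R}^m_n) = \lambda_1 \mathbf{R}^m_n \mathbf{v}^n_1 + \lambda_2 \mathbf{R}^m_n \mathbf{v}^n_2 = \lambda_1 \mathrm{Rot}(\mathbf{v}^n_1, \mathbf{R}^m_n) + \lambda_2 \mathrm{Rot}(\mathbf{v}^n_2, \mathbf{R}^m_n)$$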

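If we need to perform more than one rotation, each one represented by a matrix $\mathrm{Rot}(\mathbf{u}_i, \theta_i) = \mathbf{R}^{i-1}_i$ with respect to the reference frame obtained from the previous rotation, the total rotation is given by the ordered product of the rotation operators, as

$$\mathrm{Rot}(\mathbf{u}_1, \theta_1)\,\mathrm{Rot}(\mathbf{u}_2, \theta_2) \cdots \mathrm{Rot}(\mathbf{u}_N, \theta_N) = \mathrm{Rot}(\mathbf{u}, \theta) \tag{2.25}$$

It is very important to note that the total angle $\theta$ is not the sum of the single angles

$$\theta \neq \theta_1 + \theta_2 + \cdots + \theta_N$$

and the same applies to the unit vector $\mathbf{u}$

$$\mathbf{u} \neq \mathbf{u}_1 + \mathbf{u}_2 + \cdots + \mathbf{u}_N$$

The matrix representing the global rotation is the ordered product of the single matrices

$$\mathrm{Rot}(\mathbf{u}, \theta) \Leftrightarrow \mathbf{R}^0_N \stackrel{\text{def}}{=} \mathbf{R}^0_1 \mathbf{R}^1_2 \cdots \mathbf{R}^{N-1}_N = \prod_{k=1}^{N} \mathbf{R}^{k-1}_k \tag{2.26}$$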

Since the matrix product does not commute (apart from some particular cases that we will see later), the order of the factors is important. The upper index of a rotation matrix must be equal to the lower index of the preceding matrix (on its left). We will see in Section 2.10.2 that the order of the product is connected to a precise geometrical concept, and we will provide a simple mnemonic rule to remember it.

2.7.1 Elementary Rotations

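We call elementary rotations or basic rotations the rotations that take place around some particular axis; the first one is the matrix that represents a rotation by an angle $\theta$ around a generic axis, defined by the unit-norm vector $\mathbf{u} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}^T$ with $\|\mathbf{u}\| = 1$:

$$\mathrm{Rot}(\mathbf{u}, \theta) \equiv \mathbf{R}(\mathbf{u}, \theta) \equiv \mathbf{R}_{\mathbf{u},\theta} \stackrel{\text{def}}{=} \begin{bmatrix} u_1^2 (1 - c_\theta) + c_\theta & u_1 u_2 (1 - c_\theta) - u_3 s_\theta & u_1 u_3 (1 - c_\theta) + u_2 s_\theta \\ u_1 u_2 (1 - c_\theta) + u_3 s_\theta & u_2^2 (1 - c_\theta) + c_\theta & u_2 u_3 (1 - c_\theta) - u_1 s_\theta \\ u_1 u_3 (1 - c_\theta) - u_2 s_\theta & u_2 u_3 (1 - c_\theta) + u_1 s_\theta & u_3^2 (1 - c_\theta) + c_\theta \end{bmatrix} \tag{2.27}$$

where we have adopted the following conventions

$$s_\theta \stackrel{\text{def}}{=} \sin\theta \qquad c_\theta \stackrel{\text{def}}{=} \cos\theta$$

The determinant is

$$\det \mathbf{R}(\mathbf{u}, \theta) = 1$$

while the trace is

$$\operatorname{tr} \mathbf{R}(\mathbf{u}, \theta) = (1 - c_\theta)(u_1^2 + u_2^2 + u_3^2) + 3 c_\theta \tag{2.28}$$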

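Other basic rotations are those around the $x$, $y$ and $z$ axes; in these cases $\mathbf{u}$ is given by the basis unit vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$, as follows:

• Rotation around the $x$ axis by an angle $\alpha$:

$$\mathrm{Rot}(x, \alpha) \equiv \mathrm{Rot}(\mathbf{i}, \alpha) \equiv \mathbf{R}(\mathbf{i}, \alpha) \equiv \mathbf{R}_{\mathbf{i},\alpha} \stackrel{\text{def}}{=} \begin{bmatrix} 1 & 0 & 0 \\ 0 & c_\alpha & -s_\alpha \\ 0 & s_\alpha & c_\alpha \end{bmatrix} \tag{2.29}$$

• Rotation around the $y$ axis by an angle $\beta$:

$$\mathrm{Rot}(y, \beta) \equiv \mathrm{Rot}(\mathbf{j}, \beta) \equiv \mathbf{R}(\mathbf{j}, \beta) \equiv \mathbf{R}_{\mathbf{j},\beta} \stackrel{\text{def}}{=} \begin{bmatrix} c_\beta & 0 & s_\beta \\ 0 & 1 & 0 \\ -s_\beta & 0 & c_\beta \end{bmatrix} \tag{2.30}$$

• Rotation around the $z$ axis by an angle $\gamma$:

$$\mathrm{Rot}(z, \gamma) \equiv \mathrm{Rot}(\mathbf{k}, \gamma) \equiv \mathbf{R}(\mathbf{k}, \gamma) \equiv \mathbf{R}_{\mathbf{k},\gamma} \stackrel{\text{def}}{=} \begin{bmatrix} c_\gamma & -s_\gamma & 0 \\ s_\gamma & c_\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2.31}$$

where

$$s_\alpha \stackrel{\text{def}}{=} \sin\alpha; \quad c_\alpha \stackrel{\text{def}}{=} \cos\alpha; \quad s_\beta \stackrel{\text{def}}{=} \sin\beta; \quad c_\beta \stackrel{\text{def}}{=} \cos\beta; \quad s_\gamma \stackrel{\text{def}}{=} \sin\gamma; \quad c_\gamma \stackrel{\text{def}}{=} \cos\gamma$$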
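The matrices (2.29)-(2.31) are special cases of (2.27). The sketch below transcribes (2.27) into a function, here called `rot` (a name chosen for illustration, like `rot_x`, `rot_y` and `rot_z`), and compares it with the three elementary matrices for an arbitrary test angle. It uses plain nested lists rather than any specific linear algebra library.

```python
import math

def rot(u, theta):
    """Rotation matrix R(u, theta) of (2.27); u is assumed to have unit norm."""
    u1, u2, u3 = u
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [u1 * u1 * v + c,      u1 * u2 * v - u3 * s, u1 * u3 * v + u2 * s],
        [u1 * u2 * v + u3 * s, u2 * u2 * v + c,      u2 * u3 * v - u1 * s],
        [u1 * u3 * v - u2 * s, u2 * u3 * v + u1 * s, u3 * u3 * v + c],
    ]

def rot_x(a):
    """Elementary rotation (2.29)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(b):
    """Elementary rotation (2.30)."""
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(g):
    """Elementary rotation (2.31)."""
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def close(A, B, tol=1e-12):
    """True when two 3x3 matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

angle = 0.3  # arbitrary test angle in radians
print(close(rot([1.0, 0.0, 0.0], angle), rot_x(angle)))
print(close(rot([0.0, 1.0, 0.0], angle), rot_y(angle)))
print(close(rot([0.0, 0.0, 1.0], angle), rot_z(angle)))
```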

Example 2.7.1

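We want to compute the matrix representing the basic rotation by an angle $\theta = \pi/2$ around an axis given by the vector

$$\mathbf{v} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$

The corresponding unit vector is

$$\mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \begin{bmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}$$

Now, using (2.27) we have:

$$\mathbf{R}(\mathbf{u}, \pi/2) = \begin{bmatrix} 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{\sqrt{2}} & \frac{1}{2} & \frac{1}{2} \end{bmatrix} \tag{2.32}$$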


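We can check that the determinant of $\mathbf{R}$ is equal to $+1$ by expanding along the first row, whose first element is zero:

$$\det \mathbf{R}(\mathbf{u}, \pi/2) = (-1)^{1+2} \left(-\frac{1}{\sqrt{2}}\right) \left(\frac{1}{2\sqrt{2}} - \left(-\frac{1}{2\sqrt{2}}\right)\right) + (-1)^{1+3} \left(\frac{1}{\sqrt{2}}\right) \left(\frac{1}{2\sqrt{2}} - \left(-\frac{1}{2\sqrt{2}}\right)\right) = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = 1 \tag{2.33}$$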

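and that

$$\operatorname{tr} \mathbf{R} = 0 + \frac{1}{2} + \frac{1}{2} = 1$$

Considering relation (2.28) with $\|\mathbf{u}\| = 1$, we have

$$\operatorname{tr} \mathbf{R} = 1 + 2\cos\theta$$

from which

$$\cos\theta = 0 \;\rightarrow\; \theta = \pm\pi/2$$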
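The checks of this example can also be carried out numerically. The sketch below hard-codes the matrix (2.32) and recovers the determinant, the trace and the rotation angle from $\operatorname{tr}\mathbf{R} = 1 + 2\cos\theta$; the function name `det3` is an illustrative choice. Note that the arccosine only returns the principal value in $[0, \pi]$, so the sign ambiguity $\theta = \pm\pi/2$ is not resolved by the trace alone.

```python
import math

r = 1.0 / math.sqrt(2.0)

# Matrix R(u, pi/2) as written in (2.32)
R = [[0.0, -r,  r],
     [r,   0.5, 0.5],
     [-r,  0.5, 0.5]]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

trace = R[0][0] + R[1][1] + R[2][2]
theta = math.acos((trace - 1.0) / 2.0)   # principal value in [0, pi]

print(det3(R))               # close to 1, up to rounding
print(trace)
print(math.degrees(theta))   # rotation angle in degrees
```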

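From a simple inspection of the matrix in (2.27) we notice that it obeys the following relations:

$$\begin{aligned} \mathbf{R}(\mathbf{u}, \theta) &= \mathbf{R}(-\mathbf{u}, -\theta) \\ \mathbf{R}(\mathbf{u}, -\theta) &= \mathbf{R}(\mathbf{u}, 2\pi - \theta) = \mathbf{R}(-\mathbf{u}, \theta) \\ \mathbf{R}(\mathbf{u}, \theta) &= [\mathbf{R}(\mathbf{u}, -\theta)]^T \\ \mathbf{R}(-\mathbf{u}, -\theta) &= [\mathbf{R}(-\mathbf{u}, \theta)]^T \\ \mathbf{R}(\mathbf{u}, \theta_1)\,\mathbf{R}(\mathbf{u}, \theta_2) &= \mathbf{R}(\mathbf{u}, \theta_1 + \theta_2) \end{aligned} \tag{2.34}$$

We recall again that the sign of the rotation angle is given by the right-hand rule. When we change the direction of the vector $\mathbf{u}$, if we want to represent the same physical rotation we also have to change the sign of the angle $\theta$; otherwise we obtain the inverse rotation, represented by the transpose matrix.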

Example 2.7.2

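We want to compose two different rotations, represented by the following matrices (angles in degrees), in the stated order

$$\mathbf{R}(\mathbf{j}, 45^\circ) = \begin{bmatrix} 0.7071 & 0 & 0.7071 \\ 0 & 1.0000 & 0 \\ -0.7071 & 0 & 0.7071 \end{bmatrix}$$

and

$$\mathbf{R}(\mathbf{k}, 60^\circ) = \begin{bmatrix} 0.5000 & -0.8660 & 0 \\ 0.8660 & 0.5000 & 0 \\ 0 & 0 & 1.0000 \end{bmatrix}$$

We have

$$\mathbf{R}_a = \mathbf{R}(\mathbf{j}, 45^\circ)\,\mathbf{R}(\mathbf{k}, 60^\circ) = \begin{bmatrix} 0.3536 & -0.6124 & 0.7071 \\ 0.8660 & 0.5000 & 0 \\ -0.3536 & 0.6124 & 0.7071 \end{bmatrix} \tag{2.35}$$

while, exchanging the order, we have

$$\mathbf{R}_b = \mathbf{R}(\mathbf{k}, 60^\circ)\,\mathbf{R}(\mathbf{j}, 45^\circ) = \begin{bmatrix} 0.3536 & -0.8660 & 0.3536 \\ 0.6124 & 0.5000 & 0.6124 \\ -0.7071 & 0 & 0.7071 \end{bmatrix} \tag{2.36}$$

with

$$\mathbf{R}_a \neq \mathbf{R}_b$$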
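The two products of this example can be reproduced with a short script. This is a sketch in plain Python; the helper names `rot_y`, `rot_z` and `matmul` are illustrative, and the entries are rounded to four decimals only for comparison with (2.35) and (2.36).

```python
import math

def rot_y(beta):
    """Elementary rotation (2.30) about the y axis."""
    c, s = math.cos(beta), math.sin(beta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(gamma):
    """Elementary rotation (2.31) about the z axis."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Row-by-column product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

Rj = rot_y(math.radians(45.0))
Rk = rot_z(math.radians(60.0))

Ra = matmul(Rj, Rk)   # order of (2.35)
Rb = matmul(Rk, Rj)   # order of (2.36)

for row in Ra:
    print([round(x, 4) for x in row])
print()
for row in Rb:
    print([round(x, 4) for x in row])
```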

2.7.2 Planar rotations

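If the motion takes place in a plane, it is customary to describe the plane with the usual reference frame, where $\mathbf{i}$ is horizontal, $\mathbf{j}$ is vertical and $\mathbf{k}$ is orthogonal to the plane and points toward the eyes of the reader. In this context, all rotations are around the unit vector $\mathbf{k}$ (positive counterclockwise) as in Figure 2.10, and the rotation matrices can be written as

$$\begin{bmatrix} c_\gamma & -s_\gamma & 0 \\ s_\gamma & c_\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

or simply as

$$\begin{bmatrix} c_\gamma & -s_\gamma \\ s_\gamma & c_\gamma \end{bmatrix}$$

In this case the matrix product commutes, since, according to the last of (2.34),

$$\begin{bmatrix} c_\theta & -s_\theta \\ s_\theta & c_\theta \end{bmatrix} \begin{bmatrix} c_\gamma & -s_\gamma \\ s_\gamma & c_\gamma \end{bmatrix} = \begin{bmatrix} c_\gamma & -s_\gamma \\ s_\gamma & c_\gamma \end{bmatrix} \begin{bmatrix} c_\theta & -s_\theta \\ s_\theta & c_\theta \end{bmatrix} = \begin{bmatrix} c_{\theta+\gamma} & -s_{\theta+\gamma} \\ s_{\theta+\gamma} & c_{\theta+\gamma} \end{bmatrix}$$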


Figure 2.10: A planar rotation of an angle θ.

2.8 The Rotation Matrix

In this Section we will define the main mathematical properties of rotation matrices. A detailed description can be found in several textbooks, such as [1, 3, 5, 13, 17, 31, 32, 34, 35, 39, 41, 48, 52].

A generic rotation matrix is always square and orthonormal, and its properties are described in Appendix B.7.1.

It is interesting to note that translations, rotations and reflections are all members of the set of the so-called isometries, i.e., those transformations of the three-dimensional space that keep constant the Euclidean distance $d(P, Q) = \|\mathbf{x}_P - \mathbf{x}_Q\|$ between every pair of points belonging to a rigid body. Among the isometries there is also the so-called glide, i.e., a reflection followed by a translation.

However, among the isometries that are represented by orthogonal matrices, we are interested only in rotations, since reflections are not applicable to rigid bodies: your left hand cannot be transformed into your right hand.

2.8.1 Rotation Matrix Properties

All rotations, also called proper rotations, are expressed by square orthonormal matrices $\mathbf{R}$ with determinant $\det\mathbf{R} = +1$; if $\det\mathbf{R} = -1$ the matrix represents a reflection, also called a roto-reflection.

According to the Euler theorem, any rotation or composition of rotations is always represented by a single matrix $\mathbf{R}(\mathbf{u}, \theta)$. The subspace spanned by $\mathbf{u}$ has dimension 1 and is invariant with respect to the rotation (since the points on it do not rotate); hence one can write

$$
\mathbf{R}\mathbf{u} = \mathbf{u} \tag{2.37}
$$

This relation corresponds to the classical eigenvalue equation of a matrix

$$
\mathbf{R}\mathbf{u} = \lambda\mathbf{u}
$$

with $\lambda = 1$.

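A proper rotation $\mathbf{R}$ has the following canonical modal decomposition

$$
\mathbf{R} = \mathbf{M}\boldsymbol{\Lambda}\mathbf{M}^{H} =
\begin{bmatrix} \mathbf{u} & \mathbf{v} & \mathbf{v}^{*} \end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & e^{j\theta} & 0 \\
0 & 0 & e^{-j\theta}
\end{bmatrix}
\begin{bmatrix} \mathbf{u} & \mathbf{v} & \mathbf{v}^{*} \end{bmatrix}^{H}
\tag{2.38}
$$

where $j \overset{\text{def}}{=} \sqrt{-1}$ and $\mathbf{M}^{H}$ is the Hermitian transpose (i.e., the conjugate transpose) of $\mathbf{M}$.

In (2.38) the matrix $\mathbf{M}$ is the modal matrix, i.e., the matrix of the eigenvectors of $\mathbf{R}$; as said above, $\mathbf{u}$ identifies the rotation axis, while $\mathbf{v}$ and $\mathbf{v}^{*}$ define the plane normal to $\mathbf{u}$.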

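On this plane we have a planar rotation that can be represented by the complex rotation operator (also called a phasor)

$$
e^{j\theta} \overset{\text{def}}{=} \cos\theta + j\sin\theta
$$

The right-handed reference frame

$$
\mathcal{R}(\mathbf{u}, \mathrm{Re}(\mathbf{v}), \mathrm{Im}(\mathbf{v}))
$$

has $\mathbf{i} = \mathbf{u}$, while the other two basis unit vectors are $\mathbf{j} = \mathrm{Re}(\mathbf{v})$ and $\mathbf{k} = \mathrm{Im}(\mathbf{v})$.

The determinant is

$$
\det\mathbf{R} = 1 \cdot e^{j\theta} \cdot e^{-j\theta} = +1
$$

as we already knew, while the trace is

$$
\mathrm{tr}\,\mathbf{R} = 1 + (\cos\theta + j\sin\theta) + (\cos\theta - j\sin\theta) = 1 + 2\cos\theta
$$

that is equal to (2.28) for $\|\mathbf{u}\| = 1$.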
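These spectral properties can be verified on a sample matrix. The sketch below assumes NumPy and uses the product R(k, 60°)R(j, 45°) as an arbitrary test rotation; the variable names are illustrative only.

```python
import numpy as np

# Sample rotation: R(k, 60 deg) R(j, 45 deg), built from elementary matrices.
a, b = np.radians(60.0), np.radians(45.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
Ry = np.array([[np.cos(b), 0.0, np.sin(b)],
               [0.0, 1.0, 0.0],
               [-np.sin(b), 0.0, np.cos(b)]])
R = Rz @ Ry

eigenvalues, M = np.linalg.eig(R)              # columns of M are eigenvectors

k = np.argmin(np.abs(eigenvalues - 1.0))       # index of the unit eigenvalue
u = np.real(M[:, k])                           # rotation axis (sign not fixed)

theta = np.max(np.abs(np.angle(eigenvalues)))  # phase of exp(+/- j theta)

print("moduli of the eigenvalues:", np.abs(eigenvalues))
print("R u = u:", np.allclose(R @ u, u))
print("det R:", np.linalg.det(R))
print("tr R and 1 + 2 cos(theta):", np.trace(R), 1.0 + 2.0 * np.cos(theta))
```

Note that the eigenvector returned by the library fixes the axis only up to its sign, which is the ambiguity discussed in the parametrization problem.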

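If we introduce $\mathbf{S}(\mathbf{u})$ as the skew-symmetric matrix

$$
\mathbf{S}(\mathbf{u}) =
\begin{bmatrix}
0 & -u_3 & u_2 \\
u_3 & 0 & -u_1 \\
-u_2 & u_1 & 0
\end{bmatrix}
\tag{2.39}
$$

$\mathbf{R}$ obeys the following relations:

$$
\mathbf{R}(\mathbf{u}, \theta) \overset{\text{def}}{=} e^{\mathbf{S}(\mathbf{u})\theta} \overset{\text{def}}{=}
\mathbf{I} + \frac{\sin\theta}{\|\mathbf{u}\|}\,\mathbf{S}(\mathbf{u}) + \frac{1 - \cos\theta}{\|\mathbf{u}\|^{2}}\,\mathbf{S}^{2}(\mathbf{u})
\tag{2.40}
$$

$$
\mathbf{R}(\mathbf{u}, \theta)^{T} \overset{\text{def}}{=} e^{-\mathbf{S}(\mathbf{u})\theta} \overset{\text{def}}{=}
\mathbf{I} - \frac{\sin\theta}{\|\mathbf{u}\|}\,\mathbf{S}(\mathbf{u}) + \frac{1 - \cos\theta}{\|\mathbf{u}\|^{2}}\,\mathbf{S}^{2}(\mathbf{u})
\tag{2.41}
$$

where

$$
\mathbf{S}^{2}(\mathbf{u}) =
\begin{bmatrix}
-u_2^2 - u_3^2 & u_1 u_2 & u_1 u_3 \\
u_1 u_2 & -u_1^2 - u_3^2 & u_2 u_3 \\
u_1 u_3 & u_2 u_3 & -u_1^2 - u_2^2
\end{bmatrix}
\tag{2.42}
$$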

The two formulas (2.40) and (2.41) are very important, since they define in the most general way the rotation matrix as the exponential of an anti-symmetric matrix; we omit the proof since it requires some notions of differential geometry and, in particular, of Lie algebras; the interested reader can look at the textbook [34] for further details.

Notice that a planar rotation is represented by

$$
e^{j\theta}
$$

while a tridimensional rotation is represented by

$$
e^{\mathbf{S}(\mathbf{u})\theta}
$$
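As a numerical illustration of (2.40), the closed form can be compared with a truncated power series of the matrix exponential. This is only a sketch under stated assumptions: NumPy is assumed, the function names `skew`, `rot_axis_angle` and `expm_series` are hypothetical, and the series is evaluated with the normalized axis.

```python
import numpy as np

def skew(u):
    """Skew-symmetric matrix S(u) of (2.39)."""
    u1, u2, u3 = u
    return np.array([[0.0, -u3, u2],
                     [u3, 0.0, -u1],
                     [-u2, u1, 0.0]])

def rot_axis_angle(u, theta):
    """R(u, theta) from (2.40); u need not have unit norm, theta in radians."""
    u = np.asarray(u, dtype=float)
    n = np.linalg.norm(u)
    S = skew(u)
    return (np.eye(3)
            + (np.sin(theta) / n) * S
            + ((1.0 - np.cos(theta)) / n**2) * (S @ S))

def expm_series(A, terms=30):
    """Truncated series sum_k A^k / k! (adequate for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

u = np.array([1.0, 1.0, 1.0])                 # non-unit axis
theta = np.radians(120.0)

R_closed = rot_axis_angle(u, theta)
R_series = expm_series(skew(u / np.linalg.norm(u)) * theta)

print(np.round(R_closed, 6))
print("closed form matches series:", np.allclose(R_closed, R_series))
print("R(u, -theta) is the transpose:", np.allclose(rot_axis_angle(u, -theta), R_closed.T))
```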

2.8.2 Rotation Matrix Parametrization

We are now able to solve the following two problems

Problem 1

Given the rotation matrix R, compute the rotation angle θ and the unit vector u .

Solution

The angle θ is computed according to (2.28)

$$
\theta = \pm\arccos\left(\frac{\mathrm{tr}(\mathbf{R}) - 1}{2}\right) \tag{2.43}
$$

The sign ambiguity in arccos cannot be avoided, being implicit in the inverse trigonometric formulas.

The unit vector $\mathbf{u}$ could be obtained from the eigenvector relative to the unit eigenvalue of $\mathbf{R}$:

$$
\mathbf{R}\mathbf{u} = 1\,\mathbf{u} \tag{2.44}
$$

The rotation axis, without the positive direction information, can also be computed from any nonzero column of the following matrix

$$
\left(\mathbf{R} + \mathbf{R}^{T}\right) - \left(\mathrm{tr}(\mathbf{R}) - 1\right)\mathbf{I}. \tag{2.45}
$$


However, this procedure does not allow us to link the choice of the sign in ± arccos(·) with the positive direction of the axis, because of the ambiguity that comes from the solution of the eigenvector computation; indeed (2.44) yields only the subspace spanned by the eigenvector, but not its positive direction. We therefore cannot be sure a priori that this eigenvector is coherent with the sign of θ, and we need to make an a-posteriori check.

To resolve this ambiguity it is better to follow a more elaborate method, which ensures that the angle sign and the positive direction of $\mathbf{u}$ are coherent.

The angle θ is computed following (2.43), with any choice of the sign; at this point one defines a symbolic skew-symmetric matrix $\mathbf{S}(\mathbf{u})$ according to (2.39), and numerically computes $\mathbf{S}(\mathbf{u})$ by subtracting (2.41) from (2.40):

$$
\mathbf{S}(\mathbf{u}) = \frac{\|\mathbf{u}\|}{2\sin\theta}\left(\mathbf{R} - \mathbf{R}^{T}\right). \tag{2.46}
$$

The components $u_i$ are obtained by equating term by term the elements of the symbolic matrix (2.39) with those obtained from (2.46).

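As one can see, when $\mathbf{R}$ is symmetric the difference $\mathbf{R} - \mathbf{R}^{T}$ in (2.46) is zero, and so is $\sin\theta$; hence $\mathbf{S}(\mathbf{u})$ cannot be computed from (2.46). One can proceed by considering two cases:

Case 1. $\mathbf{R} = \mathbf{R}^{T} = \mathbf{I}$

$\mathbf{S}(\mathbf{u})$ is not determined since $\sin\theta = 0$ with $\theta = 0 \pm 2k\pi$; by convention θ is set to zero and $\mathbf{u}$ is undetermined, since no rotation occurs.

Case 2. $\mathbf{R} = \mathbf{R}^{T} \neq \mathbf{I}$

From the orthogonality of $\mathbf{R}$ it follows that $\mathbf{R}^{T}\mathbf{R} = \mathbf{R}\mathbf{R} = \mathbf{R}^{2} = \mathbf{I}$. This means that after two rotations of the angle θ around $\mathbf{u}$ the orientation is the starting one; this gives 2θ = 2π, therefore one sets θ = π.

To compute $\mathbf{u}$ one builds $\mathbf{S}^{2}(\mathbf{u})$ from the sum of (2.40) and (2.41):

$$
\mathbf{S}^{2} = \frac{\|\mathbf{u}\|^{2}}{1 - \cos\theta}\left(\frac{\mathbf{R} + \mathbf{R}^{T}}{2} - \mathbf{I}\right) \tag{2.47}
$$

and then applies

$$
\mathbf{S}^{2} + \|\mathbf{u}\|^{2}\,\mathbf{I} = \mathbf{u}\mathbf{u}^{T} =
\begin{bmatrix}
u_1^2 & u_1 u_2 & u_1 u_3 \\
u_2 u_1 & u_2^2 & u_2 u_3 \\
u_3 u_1 & u_3 u_2 & u_3^2
\end{bmatrix}
\tag{2.48}
$$

in order to compute the components of $\mathbf{u}$, equating the numerical values of (2.47) with the corresponding symbolic terms in (2.48).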
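The skew-symmetric procedure, including the two symmetric cases, can be sketched as follows. This is a minimal illustration and not a robust implementation: NumPy is assumed, the function name `axis_angle_from_matrix` and the tolerance `tol` are hypothetical, a unit-norm axis is assumed, the positive sign is chosen in (2.43), and rotations numerically close to (but not exactly at) θ = π would need more care.

```python
import numpy as np

def axis_angle_from_matrix(R, tol=1e-9):
    """Return (u, theta) with unit axis u and theta in [0, pi] (radians)."""
    R = np.asarray(R, dtype=float)
    identity = np.eye(3)

    if np.allclose(R, R.T, rtol=0.0, atol=tol):
        if np.allclose(R, identity, rtol=0.0, atol=tol):
            return None, 0.0          # Case 1: no rotation, axis undetermined
        # Case 2: theta = pi; (2.47)-(2.48) with ||u|| = 1 give u u^T = (R + I)/2
        uuT = (R + identity) / 2.0
        k = int(np.argmax(np.diag(uuT)))     # column with the largest u_k^2
        return uuT[:, k] / np.sqrt(uuT[k, k]), np.pi   # sign of u is arbitrary

    # Generic case: positive sign in (2.43), then (2.46) with ||u|| = 1
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    S = (R - R.T) / (2.0 * np.sin(theta))
    u = np.array([S[2, 1], S[0, 2], S[1, 0]])   # read u1, u2, u3 off (2.39)
    return u, theta

# Generic case, using the (rounded) matrix of (2.36)
R_b = np.array([[0.3536, -0.8660, 0.3536],
                [0.6124,  0.5000, 0.6124],
                [-0.7071, 0.0,    0.7071]])
u, theta = axis_angle_from_matrix(R_b)
print(np.round(u, 4), round(float(np.degrees(theta)), 1))

# Symmetric case with theta = pi
R_pi = np.array([[0.0, 0.0, -1.0],
                 [0.0, -1.0, 0.0],
                 [-1.0, 0.0, 0.0]])
u_pi, theta_pi = axis_angle_from_matrix(R_pi)
print(np.round(u_pi, 4), theta_pi)
```

Because the axis is read from the skew-symmetric part computed with the chosen sign of θ, the pair (u, θ) is coherent by construction and no a-posteriori check is required in the generic case.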


Problem 2

Given the rotation axis represented by $\mathbf{u} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}^{T}$ and the angle θ, compute $\mathbf{R}$.

Solution

One computes the skew-symmetric matrix $\mathbf{S}(\mathbf{u})$ from the known components of $\mathbf{u}$, then computes the norm $\|\mathbf{u}\|$, and after that the matrix $\mathbf{R}$ according to (2.40).

Now we present some examples to illustrate the various procedures.

Example 2.8.1

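Let us take $\mathbf{R}_b$ obtained in Example 2.7.2,

$$
\mathbf{R}_b = \mathbf{R}(\mathbf{k}, 60^\circ)\,\mathbf{R}(\mathbf{j}, 45^\circ) =
\begin{bmatrix}
0.3536 & -0.8660 & 0.3536 \\
0.6124 & 0.5000 & 0.6124 \\
-0.7071 & 0 & 0.7071
\end{bmatrix}
\tag{2.49}
$$

and compute the corresponding $\mathbf{u}$ and θ.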

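We start by computing θ from (2.43)

$$
\theta = \pm\arccos\left(\frac{\mathrm{tr}(\mathbf{R}) - 1}{2}\right) = \pm\arccos\left(\frac{1.5607 - 1}{2}\right) = \pm 73.7^\circ
$$

then we obtain $\mathbf{u}$ from (2.45), as

$$
\mathbf{A} = \left(\mathbf{R} + \mathbf{R}^{T}\right) - \left(\mathrm{tr}(\mathbf{R}) - 1\right)\mathbf{I}
$$

The result is

$$
\mathbf{A} =
\begin{bmatrix}
0.1464 & -0.2537 & -0.3536 \\
-0.2537 & 0.4393 & 0.6124 \\
-0.3536 & 0.6124 & 0.8536
\end{bmatrix}
$$

Once normalized, the three columns are respectively

$$
\mathbf{v}_1 = \begin{bmatrix} 0.3190 \\ -0.5525 \\ -0.7701 \end{bmatrix}
\qquad
\mathbf{v}_2 = \mathbf{v}_3 = \begin{bmatrix} -0.3190 \\ 0.5525 \\ 0.7701 \end{bmatrix}
$$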

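Now, in order to decide whether the rotation axis is $\mathbf{v}_1$ or $\mathbf{v}_2$, one should perform the a-posteriori check, computing both

$$
\mathbf{R}(\mathbf{v}_1, \theta) = \mathbf{R}(\mathbf{v}_2, -\theta) =
\begin{bmatrix}
0.3536 & 0.6124 & -0.7071 \\
-0.8660 & 0.5000 & 0 \\
0.3536 & 0.6124 & 0.7071
\end{bmatrix}
$$

and

$$
\mathbf{R}(\mathbf{v}_1, -\theta) = \mathbf{R}(\mathbf{v}_2, \theta) = \mathbf{R}(\mathbf{v}_1, \theta)^{T} = \mathbf{R}(\mathbf{v}_2, -\theta)^{T}
$$

We observe that

$$
\mathbf{R}_b = \mathbf{R}(\mathbf{v}_1, -\theta) = \mathbf{R}(\mathbf{v}_2, \theta)
$$

hence we have at the end

$$
\mathbf{u} = \mathbf{v}_2 = \begin{bmatrix} -0.3190 \\ 0.5525 \\ 0.7701 \end{bmatrix}
\quad\text{and}\quad \theta = +73.7^\circ
$$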

Now we use the other approach, based on the skew-symmetric matrix from (2.46); we have already computed the angle θ = ±73.7° and we choose, for instance, the negative sign, so that θ = −73.7°.

If we now assume $\|\mathbf{u}\| = 1$, we have the equality between the symbolic form (2.39) and the numerical form (2.46)

$$
\mathbf{S}(\mathbf{u}) =
\begin{bmatrix}
0 & -u_3 & u_2 \\
u_3 & 0 & -u_1 \\
-u_2 & u_1 & 0
\end{bmatrix}
=
\begin{bmatrix}
0 & 0.7701 & -0.5525 \\
-0.7701 & 0 & -0.3190 \\
0.5525 & 0.3190 & 0
\end{bmatrix}
$$

By inspection of the elements of $\mathbf{S}(\mathbf{u})$ we have

$$
u_1 = 0.3190; \quad u_2 = -0.5525; \quad u_3 = -0.7701
$$

but, considering that $\mathrm{Rot}(\mathbf{u}, \theta) = \mathrm{Rot}(-\mathbf{u}, -\theta)$, we obtain the same result as with the previous method, with no need to perform the sign check between θ and $\mathbf{u}$.

Example 2.8.2

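We want to compute the rotation matrix $\mathbf{R}$ representing a rotation of θ = 120° around the axis

$$
\mathbf{u} = \begin{bmatrix} \dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} & \dfrac{1}{\sqrt{3}} \end{bmatrix}^{T}
$$

We compute the two matrices $\mathbf{S}$ and $\mathbf{S}^{2}$

$$
\mathbf{S} = \frac{1}{\sqrt{3}}
\begin{bmatrix}
0 & -1 & 1 \\
1 & 0 & -1 \\
-1 & 1 & 0
\end{bmatrix};
\qquad
\mathbf{S}^{2} = \frac{1}{3}
\begin{bmatrix}
-2 & 1 & 1 \\
1 & -2 & 1 \\
1 & 1 & -2
\end{bmatrix}
$$

and then we compute $\mathbf{R}$ according to equation (2.40), obtaining

$$
\mathbf{R} =
\begin{bmatrix}
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{bmatrix}
$$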

Example 2.8.3

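Given

$$
\mathbf{R} =
\begin{bmatrix}
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{bmatrix}
\tag{2.50}
$$

we want to compute $\mathbf{u}$ and θ.

The eigenvector (with non-unit norm) relative to the unit eigenvalue of $\mathbf{R}$ is $\mathbf{u} = \lambda\begin{bmatrix} 1 & 1 & 1 \end{bmatrix}^{T}$, as one can obtain from the equation

$$
\begin{bmatrix}
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}
=
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}
\;\Rightarrow\;
\begin{cases}
u_3 = u_1 \\
u_1 = u_2 \\
u_2 = u_3
\end{cases}
\;\Rightarrow\;
\mathbf{u} = \lambda \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
$$

Using equation (2.45) we would have obtained the same result, since

$$
(\mathbf{R} + \mathbf{R}^{T}) - (\mathrm{tr}(\mathbf{R}) - 1)\mathbf{I} =
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
$$

The angle θ is computed as

$$
\pm\arccos\left(-\frac{1}{2}\right) = \pm\frac{2\pi}{3} = \pm 120^\circ.
$$

If we assume, for example, $\theta = -\arccos\left(-\frac{1}{2}\right) = -120^\circ$, from (2.27) we would have obtained

$$
\mathbf{R} =
\begin{bmatrix}
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0
\end{bmatrix}
$$

that is not equal to the initial matrix (2.50).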

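So we use the method of the skew-symmetric matrix, after choosing $\theta_1 = 120^\circ$, $\sin\theta_1 = \frac{\sqrt{3}}{2}$. From (2.46) we compute

$$
\mathbf{S}(\mathbf{u}_1) = \frac{1}{\sqrt{3}}\left(\mathbf{R} - \mathbf{R}^{T}\right) = \frac{1}{\sqrt{3}}
\begin{bmatrix}
0 & -1 & 1 \\
1 & 0 & -1 \\
-1 & 1 & 0
\end{bmatrix}
$$

obtaining $\mathbf{u}_1 = \lambda\begin{bmatrix} 1 & 1 & 1 \end{bmatrix}^{T}$, with $\lambda = \frac{1}{\sqrt{3}}$, so that $\|\mathbf{u}_1\| = 1$.

Had we assumed $\theta_2 = -120^\circ$, $\sin\theta_2 = -\frac{\sqrt{3}}{2}$, we would have obtained

$$
\mathbf{S}(\mathbf{u}_2) = -\frac{1}{\sqrt{3}}\left(\mathbf{R} - \mathbf{R}^{T}\right) = \frac{1}{\sqrt{3}}
\begin{bmatrix}
0 & 1 & -1 \\
-1 & 0 & 1 \\
1 & -1 & 0
\end{bmatrix}
$$

and consequently $\mathbf{u}_2 = \lambda\begin{bmatrix} -1 & -1 & -1 \end{bmatrix}^{T} = -\mathbf{u}_1$.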


Example 2.8.4

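Given the matrix

$$
\mathbf{R} =
\begin{bmatrix}
0 & 0 & -1 \\
0 & -1 & 0 \\
-1 & 0 & 0
\end{bmatrix}
$$

we want to compute $\mathbf{u}$ and θ; we observe that $\mathbf{R}$ is symmetric and

$$
\mathbf{R} = \mathbf{R}^{T} \neq \mathbf{I}
$$

therefore we use relations (2.47) and (2.48). We know that in this case θ = π and cos θ = −1. Taking a unit norm vector $\mathbf{u}$ we compute

$$
\frac{\|\mathbf{u}\|^{2}}{1 - \cos\theta}\left(\frac{\mathbf{R} + \mathbf{R}^{T}}{2} - \mathbf{I}\right) = \frac{1}{2}
\begin{bmatrix}
-1 & 0 & -1 \\
0 & -2 & 0 \\
-1 & 0 & -1
\end{bmatrix}
\equiv \mathbf{S}^{2}
$$

now using (2.48) we obtain

$$
\mathbf{S}^{2} + \|\mathbf{u}\|^{2}\,\mathbf{I} = \mathbf{u}\mathbf{u}^{T} =
\begin{bmatrix}
u_1^2 & u_1 u_2 & u_1 u_3 \\
u_2 u_1 & u_2^2 & u_2 u_3 \\
u_3 u_1 & u_3 u_2 & u_3^2
\end{bmatrix}
=
\begin{bmatrix}
0.5 & 0 & -0.5 \\
0 & 0 & 0 \\
-0.5 & 0 & 0.5
\end{bmatrix}
$$

from which one obtains

$$
\mathbf{u} = \begin{bmatrix} \dfrac{\sqrt{2}}{2} & 0 & -\dfrac{\sqrt{2}}{2} \end{bmatrix}^{T}
\quad\text{or}\quad
\begin{bmatrix} -\dfrac{\sqrt{2}}{2} & 0 & \dfrac{\sqrt{2}}{2} \end{bmatrix}^{T}
$$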

2.9 Roto-translations

In the previous Sections we have detailed the representation of both translations and rotations; now we are ready to combine them in a single operator, the so-called rototranslation or roto-translation operator.

We take two reference frames, one identified as $\mathcal{R}_0$, the other as $\mathcal{R}_m$, initially with common origin $O$ and coinciding axes. For simplicity, $\mathcal{R}_0$ identifies the “fixed” frame, while $\mathcal{R}_m$ identifies the “mobile” frame; as already noted above, these two terms are used only for ease of explanation, in order to apply the roto-translation to $\mathcal{R}_m$ and characterize it with respect to $\mathcal{R}_0$.

In addition we consider a geometrical point $P$, attached to $\mathcal{R}_m$ and represented by the vector $\mathbf{v}^m_P$; we also consider a physical vector $\overrightarrow{AB}$ represented by $\mathbf{v}^m_{AB}$.

Now we want to apply a roto-translation to $\mathcal{R}_m$, defined as follows

• a rotation of an angle θ around the axis $\mathbf{u}$ going through the common origin $O$; the rotation is represented by the matrix $\mathrm{Rot}(\mathbf{u}, \theta) = \mathbf{R}^0_m$

• a translation of the origin of $\mathcal{R}_m$ with respect to the origin of $\mathcal{R}_0$; the translation is represented in $\mathcal{R}_0$ by the vector $\mathbf{t}^0_m$.

If we want to compose rotations and translations (and rotations with rotations, as well) in the proper order, we must give additional information, namely

• The order of the various displacements: which is applied first, the translation or the rotation?

• The reference frame with respect to which the displacement is performed: is it performed with respect to the fixed frame or with respect to the mobile frame? Indeed, because of the previous displacements, the two reference frames no longer coincide.

According to these two decisions we have four possible choices, considering that the first displacement is made indifferently with respect to the fixed or the mobile frame, since they coincide.

a) First perform a rotation of the initial reference frame around the common origin, then perform a translation with respect to the axes of the fixed frame, as in Figure 2.11 a), for a simple planar case.

b) First perform a rotation of the initial reference frame around the common origin, then perform a translation with respect to the axes of the mobile frame, as in Figure 2.11 b).

c) First perform a translation of the initial reference frame, then perform a rotation around the origin of the mobile frame, as in Figure 2.11 c).

d) First perform a translation of the initial reference frame, then perform a rotation around the origin of the fixed frame, as in Figure 2.11 d).

As illustrated in Figure 2.11, the two cases a) and c) produce the same final roto-translation, while cases b) and d) produce a different one.

Now we consider the two vectors $\mathbf{v}^m_P$ and $\mathbf{v}^m_{AB}$; since they are “attached” to the mobile frame $\mathcal{R}_m$, before the displacement their representation is

$$
\mathbf{v}^0_P = \mathbf{v}^m_P \quad\text{and}\quad \mathbf{v}^0_{AB} = \mathbf{v}^m_{AB}
$$

while after the displacement they will assume a different representation in $\mathcal{R}_0$, and precisely


Figure 2.11: Different ways of performing a planar roto-translation.


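• In the two cases a) and c) we have

$$
\mathbf{v}^0_P = \underbrace{\underbrace{\mathbf{R}^0_m \mathbf{v}^m_P}_{\text{Rot}} + \mathbf{t}}_{\text{Transl}} \tag{2.51}
$$

and

$$
\mathbf{v}^0_{AB} = \mathbf{R}^0_m \mathbf{v}^m_{AB}
$$

where $\mathbf{R}^0_m$ is the rotation matrix between $\mathcal{R}_0$ and $\mathcal{R}_m$ (i.e., $\mathcal{R}_m$ represented in $\mathcal{R}_0$) and $\mathbf{t}$ is the translation vector from $\mathcal{R}_0$ to $\mathcal{R}_m$, represented in $\mathcal{R}_0$.

• In the two cases b) and d) we have

$$
\mathbf{v}^0_P = \underbrace{\mathbf{R}^0_m \underbrace{(\mathbf{v}^m_P + \mathbf{t}')}_{\text{Transl}}}_{\text{Rot}} = \mathbf{R}^0_m \mathbf{v}^m_P + \mathbf{R}^0_m \mathbf{t}' \tag{2.52}
$$

and

$$
\mathbf{v}^0_{AB} = \mathbf{R}^0_m \mathbf{v}^m_{AB}
$$

but now $\mathbf{t}'$ is the translation vector from $\mathcal{R}_0$ to $\mathcal{R}_m$, represented in $\mathcal{R}_m$, which becomes $\mathbf{R}^0_m \mathbf{t}'$ when represented in $\mathcal{R}_0$.

First, notice that the “physical” vector $\mathbf{v}^m_{AB}$ does not translate, but only changes its representation, becoming $\mathbf{R}^0_m \mathbf{v}^m_{AB}$ due to the rotation of the reference frame; this is correct, since a change of reference frame does not change a force, a torque, a velocity, etc., except for their representations.

Second, the results are different; it is therefore necessary to resolve any ambiguity that may arise. We will see in Section 2.10.2 a simple mnemonic rule to perform roto-translations in the correct order.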
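The difference between (2.51) and (2.52) can be seen on a small planar example. The following sketch assumes NumPy; the angle, the translation and the vectors are arbitrary illustrative values, not data from these notes.

```python
import numpy as np

gamma = np.radians(30.0)                 # illustrative rotation angle about k
c, s = np.cos(gamma), np.sin(gamma)
R_0m = np.array([[c, -s, 0.0],
                 [s, c, 0.0],
                 [0.0, 0.0, 1.0]])

t = np.array([2.0, 1.0, 0.0])            # translation components (illustrative)
v_P_m = np.array([1.0, 0.0, 0.0])        # point P, attached to the mobile frame
v_AB_m = np.array([0.0, 1.0, 0.0])       # oriented segment AB in the mobile frame

v_P_ac = R_0m @ v_P_m + t                # cases a), c): equation (2.51)
v_P_bd = R_0m @ (v_P_m + t)              # cases b), d): equation (2.52)
v_AB_0 = R_0m @ v_AB_m                   # physical vector: rotation only

print("cases a), c):", np.round(v_P_ac, 4))
print("cases b), d):", np.round(v_P_bd, 4))
print("difference  :", np.round(v_P_bd - v_P_ac, 4))   # equals R t - t
print("v_AB in R0  :", np.round(v_AB_0, 4))
```

The two point representations differ by the term Rt − t, while the physical vector is affected by the rotation alone in both cases.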


2.10 Homogeneous Coordinates

We have seen in the previous Section that a rotation operator corresponds to a matrix product, while a translation operator corresponds to a vector sum. It is possible to use a unique operator for translations and rotations if we introduce the



Figure 2.12: An example of vector transformations between two rotated frames.

so-called homogeneous coordinates or, with a less common term, perspective coordinates.

They come from projective and epipolar geometry and find their use in many different contexts, such as computer graphics [41] and 3D vision systems [47]. The interested reader can find a detailed description of projective/epipolar geometry and its applications to computer vision in [10, 11, 19, 37, 44].

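Given a point $P$ in a 3D space, its homogeneous representation is given by the associated 4 × 1 homogeneous vector $\tilde{\mathbf{v}}$, defined as

$$
\tilde{\mathbf{v}} \overset{\text{def}}{=}
\begin{bmatrix} \lambda p_1 \\ \lambda p_2 \\ \lambda p_3 \\ \lambda \end{bmatrix}
= \lambda
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{bmatrix}
\tag{2.53}
$$

where $\lambda \in \mathbb{R}$ is a scale factor; in the following this factor will be set to 1, giving the homogeneous coordinates representation commonly adopted to study rigid roto-translations in 3D space,

$$
\tilde{\mathbf{v}} \overset{\text{def}}{=}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{bmatrix}
\tag{2.54}
$$

Now, considering again Eqn. (2.51), written in the simpler form

$$
\mathbf{v}^0 = \mathbf{R}^0_m \mathbf{v}^m + \mathbf{t}
$$

we notice that it can be written as

$$
\tilde{\mathbf{v}}^0 = \mathbf{T}^0_m \tilde{\mathbf{v}}^m =
\begin{bmatrix} \mathbf{R}^0_m & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{v}^m \\ 1 \end{bmatrix}
\tag{2.55}
$$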



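As one can see from the previous relation, a roto-translation, i.e., a rotation plus a translation, is represented by a 4 × 4 homogeneous matrix (HM)

$$
\mathbf{T}^0_m \overset{\text{def}}{=}
\begin{bmatrix} \mathbf{R}^0_m & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.56}
$$

where $\mathbf{R}^0_m$ and $\mathbf{t}$ are the rotation and the translation, respectively, and

$$
\mathbf{0}^{T} \overset{\text{def}}{=} \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}
$$

2.10.1 Homogeneous Transformations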

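We can define two basic homogeneous transformations, namely, the pure rotation

$$
\mathbf{T}(\mathbf{R}) \equiv \mathbf{T}_R \overset{\text{def}}{=}
\begin{bmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.57}
$$

and the pure translation

$$
\mathbf{T}(\mathbf{t}) \equiv \mathbf{T}_t \overset{\text{def}}{=}
\begin{bmatrix} \mathbf{I} & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.58}
$$

The generic homogeneous transformation in (2.56) can be obtained as the product of a pure translation and a pure rotation, as in

$$
\mathbf{T} = \mathbf{T}_t \mathbf{T}_R =
\begin{bmatrix} \mathbf{I} & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
=
\begin{bmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.59}
$$

Reversing the order of the factors one obtains a different HM

$$
\mathbf{T}' = \mathbf{T}_R \mathbf{T}_t =
\begin{bmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{I} & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
=
\begin{bmatrix} \mathbf{R} & \mathbf{R}\mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.60}
$$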

We have already given in (2.55) the rule for transforming the representation of a geometrical point from $\mathcal{R}_m$ to $\mathcal{R}_0$ using homogeneous coordinates; in brief:

1. We build the HM $\mathbf{T}^0_m$ from $\mathbf{R}^0_m$ and $\mathbf{t}^0_m$ in the right order: (2.59) or (2.60);

2. We write $\mathbf{v}^m_P$ in homogeneous coordinates $\tilde{\mathbf{v}}^m_P$ as in (2.54);

3. We compute $\tilde{\mathbf{v}}^0_P$ as

$$
\tilde{\mathbf{v}}^0_P = \mathbf{T}^0_m \tilde{\mathbf{v}}^m_P
$$

4. We transform back from the homogeneous coordinates $\tilde{\mathbf{v}}^0_P$ to the vector form $\mathbf{v}^0_P$.
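The four steps above, together with the two products (2.59) and (2.60), can be sketched as follows. NumPy is assumed; the helper names `hom_rotation` and `hom_translation` and the numerical values are illustrative only.

```python
import numpy as np

def hom_rotation(R):
    """Pure-rotation homogeneous matrix T_R of (2.57)."""
    T = np.eye(4)
    T[:3, :3] = R
    return T

def hom_translation(t):
    """Pure-translation homogeneous matrix T_t of (2.58)."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])            # R(k, 90 deg)
t = np.array([1.0, 2.0, 3.0])

T = hom_translation(t) @ hom_rotation(R)        # step 1, order of (2.59)
T_prime = hom_rotation(R) @ hom_translation(t)  # reversed order, (2.60)

v_m = np.array([1.0, 0.0, 0.0])
v_m_h = np.append(v_m, 1.0)                # step 2: homogeneous coordinates
v_0_h = T @ v_m_h                          # step 3: transform
v_0 = v_0_h[:3]                            # step 4: back to a 3D vector

print(T)
print(T_prime)
print(v_0)
```

With the order of (2.59) the last column of the result holds t itself, whereas with the order of (2.60) it holds Rt.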


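It is important to notice that homogeneous vectors do not follow the classical sum or difference rules, since

$$
\tilde{\mathbf{x}} + \tilde{\mathbf{y}} =
\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \\ 2 \end{bmatrix}
$$

For this reason we introduce the homogeneous sum operator, denoted by ⊕, and the homogeneous difference operator, denoted by ⊖, respectively defined as:

$$
\tilde{\mathbf{x}} \oplus \tilde{\mathbf{y}} = \begin{bmatrix} \mathbf{x} + \mathbf{y} \\ 1 \end{bmatrix}
\qquad
\tilde{\mathbf{x}} \ominus \tilde{\mathbf{y}} = \begin{bmatrix} \mathbf{x} - \mathbf{y} \\ 1 \end{bmatrix}
\tag{2.61}
$$

Moreover these operators are not distributive, since:

$$
\begin{aligned}
\mathbf{T}(\tilde{\mathbf{x}} \oplus \tilde{\mathbf{y}}) &\neq \mathbf{T}\tilde{\mathbf{x}} \oplus \mathbf{T}\tilde{\mathbf{y}} \\
\mathbf{T}(\tilde{\mathbf{x}} \ominus \tilde{\mathbf{y}}) &\neq \mathbf{T}\tilde{\mathbf{x}} \ominus \mathbf{T}\tilde{\mathbf{y}}
\end{aligned}
\tag{2.62}
$$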

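Relations (2.62) are important only for their use on geometrical vectors, such as $\mathbf{v}_P$; indeed, when we deal with oriented segments (physical vectors) $\mathbf{v}^m_{AB} = \mathbf{v}^m_B - \mathbf{v}^m_A$, one can use the homogeneous difference $\mathbf{T}\tilde{\mathbf{v}}^m_B \ominus \mathbf{T}\tilde{\mathbf{v}}^m_A$, whose first three components give

$$
\mathbf{v}^0_{AB} = \mathbf{R}\mathbf{v}^m_B + \mathbf{t} - \mathbf{R}\mathbf{v}^m_A - \mathbf{t} = \mathbf{R}(\mathbf{v}^m_B - \mathbf{v}^m_A) \tag{2.63}
$$

or the usual difference, with the fourth component set to zero, as in

$$
\mathbf{T}
\begin{bmatrix} \mathbf{v}^m_B - \mathbf{v}^m_A \\ 0 \end{bmatrix}
=
\begin{bmatrix} \mathbf{R}(\mathbf{v}^m_B - \mathbf{v}^m_A) \\ 0 \end{bmatrix}
$$

which again gives $\mathbf{v}^0_{AB} = \mathbf{R}(\mathbf{v}^m_B - \mathbf{v}^m_A)$.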
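The behaviour of the homogeneous sum and difference can be checked numerically. In this sketch NumPy is assumed, and the function names `hsum` and `hdiff`, as well as the sample matrix and vectors, are hypothetical choices made for illustration.

```python
import numpy as np

def hsum(x_h, y_h):
    """Homogeneous sum of (2.61): add the 3D parts, keep the last entry at 1."""
    return np.append(x_h[:3] + y_h[:3], 1.0)

def hdiff(x_h, y_h):
    """Homogeneous difference of (2.61)."""
    return np.append(x_h[:3] - y_h[:3], 1.0)

R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = np.eye(4)
T[:3, :3], T[:3, 3] = R, t

a_h = np.array([0.0, 0.0, 1.0, 1.0])     # point A in homogeneous coordinates
b_h = np.array([1.0, 0.0, 0.0, 1.0])     # point B in homogeneous coordinates

# The operators are not distributive with respect to T, as stated in (2.62)
print(np.allclose(T @ hsum(a_h, b_h), hsum(T @ a_h, T @ b_h)))

# Oriented segment AB: both routes give R (v_B - v_A)
v_ab = hdiff(T @ b_h, T @ a_h)[:3]                        # as in (2.63)
v_ab_free = (T @ np.append(b_h[:3] - a_h[:3], 0.0))[:3]   # last entry 0
print(v_ab, v_ab_free)
```

Setting the fourth component to zero removes the effect of the translation column, which is why physical vectors are only rotated.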

Inverse Homogeneous Matrix

The generic HM $\mathbf{T}$ consists of two parts: a 3 × 3 submatrix $\mathbf{R}$ (the rotation operator) and a 3 × 1 vector $\mathbf{t}$ (the translation operator).

For this reason the inverse matrix is not simply its transpose, but (see also Figure 2.13):

$$
\mathbf{T}^{-1} =
\begin{bmatrix} \mathbf{R}^{T} & -\mathbf{R}^{T}\mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.64}
$$

It follows that, while the inverse of the rotation matrix is its transpose, the inverse translation is $\mathbf{t}' = -\mathbf{R}^{T}\mathbf{t}$; this has an immediate geometrical meaning: the minus sign reverses the direction of the translation vector, and $\mathbf{R}^{T}\mathbf{t}$ is the representation in $\mathcal{R}_m$ of the original $\mathbf{t}$, which was represented in $\mathcal{R}_0$.
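Relation (2.64) can be compared with a general-purpose matrix inverse. The sketch below assumes NumPy; `hom_inverse` is a hypothetical helper name and the sample matrix is an illustrative choice.

```python
import numpy as np

def hom_inverse(T):
    """Inverse of a homogeneous matrix using the block structure of (2.64)."""
    R = T[:3, :3]
    t = T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

T = np.array([[0.0, -1.0, 0.0, 1.0],
              [1.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])

T_inv = hom_inverse(T)
print(T_inv)
print("matches numpy.linalg.inv:", np.allclose(T_inv, np.linalg.inv(T)))
print("T_inv T = I:", np.allclose(T_inv @ T, np.eye(4)))
```

The block formula avoids a general 4 × 4 inversion and is valid only when the upper-left block is a rotation matrix.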

Example 2.10.1


Figure 2.13: The direct and inverse homogeneous transformation between two reference frames.

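Given the physical vector $\mathbf{v}^0_{AB} = \mathbf{v}^0_B - \mathbf{v}^0_A$, represented in $\mathcal{R}_0$, with $\mathbf{v}^0_A = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}^{T}$ and $\mathbf{v}^0_B = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}^{T}$, we want to compute its representation $\mathbf{v}^m_{AB} = \mathbf{v}^m_B - \mathbf{v}^m_A$ in $\mathcal{R}_m$, where

$$
\mathbf{T}^0_m =
\begin{bmatrix}
0 & -1 & 0 & 1 \\
1 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{bmatrix}.
$$

We have to find the inverse of $\mathbf{T}^0_m$ as

$$
\mathbf{T}^m_0 = \left(\mathbf{T}^0_m\right)^{-1} =
\begin{bmatrix}
0 & 1 & 0 & -1 \\
-1 & 0 & 0 & 1 \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 1
\end{bmatrix};
$$

Now using (2.63) we have:

$$
\tilde{\mathbf{v}}^m_{AB} =
\mathbf{T}^m_0 \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\ominus
\mathbf{T}^m_0 \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}
=
\begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \end{bmatrix}
\ominus
\begin{bmatrix} -1 \\ 1 \\ 0 \\ 1 \end{bmatrix}
=
\begin{bmatrix} 0 \\ -1 \\ -1 \\ 1 \end{bmatrix}
$$

hence

$$
\mathbf{v}^m_{AB} = \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix}.
$$

The same result can be obtained by computing directly the rotation of the oriented segment:

$$
\mathbf{v}^m_{AB} = \mathbf{R}^m_0 \left(\mathbf{v}^0_B - \mathbf{v}^0_A\right) =
\begin{bmatrix}
0 & 1 & 0 \\
-1 & 0 & 0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
=
\begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix}
$$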

2.10.2 Composition Rule for Roto-Translations

It is possible to perform any number of roto-translations, obtaining a final roto-translation as the ordered product of the single displacements; it is therefore necessary to organize the product terms in the correct sequence.

We assume to start with two reference frames, for simplicity one called “fixed”, the other “mobile”, that are related by a homogeneous transformation $\mathbf{T}$. Then we want to execute $n$ generic displacements, each one represented by the related matrix $\mathbf{T}_i$, $i = 1, \dots, n$.

To perform the various matrix products in the correct order we must apply the following rules.

We call $\mathbf{T}(i)$ the matrix product obtained after the first $i$ displacements

1. Set $i = 0$ and initialize $\mathbf{T}(0) = \mathbf{T}$.

2. If the $i$-th roto-translation $\mathbf{T}_i$ is defined with respect to the fixed reference frame, one should pre-multiply the previous matrix $\mathbf{T}(i-1)$ by $\mathbf{T}_i$; the result will be

$$
\mathbf{T}(i) = \mathbf{T}_i\,\mathbf{T}(i-1)
$$

Mnemonic Rule: pre-fix(ed)

3. If the $i$-th roto-translation $\mathbf{T}_i$ is defined with respect to the mobile reference frame, one should post-multiply the previous matrix $\mathbf{T}(i-1)$ by $\mathbf{T}_i$; the result will be

$$
\mathbf{T}(i) = \mathbf{T}(i-1)\,\mathbf{T}_i
$$

Mnemonic Rule: post-mob(ile)

These rules are also valid for the composition of rotation matrices. For the composition of translations alone, since the sum is commutative, the order is not important.
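The pre-fix / post-mob rule lends itself to a short loop. The following is a sketch under stated assumptions: NumPy is assumed, the function `compose` and its `"fixed"` / `"mobile"` labels are hypothetical conventions introduced here, and the three displacements are a rotation of +90° about i, a translation of 2 along the mobile y axis and a rotation of −90° about the fixed z axis.

```python
import numpy as np

def hom_rotation(R):
    T = np.eye(4)
    T[:3, :3] = R
    return T

def hom_translation(t):
    T = np.eye(4)
    T[:3, 3] = t
    return T

def compose(T_start, displacements):
    """Apply (T_i, frame) pairs in order; frame is "fixed" or "mobile"."""
    T = np.array(T_start, dtype=float)
    for T_i, frame in displacements:
        if frame == "fixed":
            T = T_i @ T          # pre-multiply: rule pre-fix(ed)
        elif frame == "mobile":
            T = T @ T_i          # post-multiply: rule post-mob(ile)
        else:
            raise ValueError("frame must be 'fixed' or 'mobile'")
    return T

R_i_90 = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0]])      # R(i, +90 deg)
R_k_m90 = np.array([[0.0, 1.0, 0.0],
                    [-1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0]])     # R(k, -90 deg)

T_final = compose(np.eye(4), [
    (hom_rotation(R_i_90), "fixed"),               # frames coincide: either rule
    (hom_translation([0.0, 2.0, 0.0]), "mobile"),
    (hom_rotation(R_k_m90), "fixed"),
])
print(T_final)
```

Starting from the identity, the first displacement can be labelled either way, since the fixed and mobile frames still coincide.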

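Observing again Figure 2.11, we can interpret the results as follows

• Figure 2.11 a)

It represents a rotation around the common origin of the two reference frames, followed by a translation with respect to the fixed frame (pre-multiplication by $\mathbf{T}_t$)

$$
\mathbf{T}(0) = \mathbf{I}, \quad \mathbf{T}(1) = \mathbf{T}_R, \quad \mathbf{T}(2) = \mathbf{T}_t \mathbf{T}_R
$$

• Figure 2.11 b)

It represents a rotation around the common origin of the two reference frames, followed by a translation with respect to the mobile frame (post-multiplication by $\mathbf{T}_t$)

$$
\mathbf{T}(0) = \mathbf{I}, \quad \mathbf{T}(1) = \mathbf{T}_R, \quad \mathbf{T}(2) = \mathbf{T}_R \mathbf{T}_t
$$

• Figure 2.11 c)

It represents a translation with respect to the original frame, followed by a rotation around the origin of the mobile frame (post-multiplication by $\mathbf{T}_R$)

$$
\mathbf{T}(0) = \mathbf{I}, \quad \mathbf{T}(1) = \mathbf{T}_t, \quad \mathbf{T}(2) = \mathbf{T}_t \mathbf{T}_R
$$

• Figure 2.11 d)

It represents a translation with respect to the original frame, followed by a rotation around the origin of the fixed frame (pre-multiplication by $\mathbf{T}_R$)

$$
\mathbf{T}(0) = \mathbf{I}, \quad \mathbf{T}(1) = \mathbf{T}_t, \quad \mathbf{T}(2) = \mathbf{T}_R \mathbf{T}_t
$$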

It is evident that we can interpret the final result $\mathbf{T}(2)$ in two different ways, which differ only in the associated verbal description. For instance, considering cases a) and c), i.e.,

$$
\mathbf{T}(2) = \mathbf{T}_t \mathbf{T}_R
$$

we can give two different verbal descriptions that nevertheless produce the same final result

• a rotation with respect to the original frame followed by a translation with respect to the fixed frame (rule pre-fix)

• a translation with respect to the original frame followed by a rotation with respect to the mobile frame (rule post-mob)

In conclusion, it is worth noticing again that a HM such as $\mathbf{T}^0_m$ represents a “mobile” frame $\mathcal{R}_m$ with respect to a “fixed” frame $\mathcal{R}_0$; the 3 × 3 rotation submatrix represents the orientation of the mobile reference frame with respect to the fixed one, while the upper part of the last column provides the translation of the origin. All components are expressed in $\mathcal{R}_0$.


Example 2.10.2

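We want to compute the HM $\mathbf{T}^0_m$ representing the frame $\mathcal{R}_m$ obtained by rotating the fixed frame around the basis unit vector $\mathbf{i}$ by an angle of +90°, followed by a translation $\mathbf{t} = \begin{bmatrix} 0 & 2 & 0 \end{bmatrix}^{T}$ along the $y$ axis of the resulting (mobile) frame, followed by a rotation around the $z$ axis of the fixed reference frame by an angle of −90°.

We have three displacements defined by the following HM:

First displacement = rotation:

$$
\mathbf{T}_1 = \mathbf{T}(\mathbf{R}(\mathbf{i}, 90^\circ)) =
\begin{bmatrix} \mathbf{R}(\mathbf{i}, 90^\circ) & \mathbf{0} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

Second displacement = translation:

$$
\mathbf{T}_2 = \mathbf{T}(\mathbf{t}) =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 2 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

Third displacement = rotation:

$$
\mathbf{T}_3 = \mathbf{T}(\mathbf{R}(\mathbf{k}, -90^\circ)) =
\begin{bmatrix} \mathbf{R}(\mathbf{k}, -90^\circ) & \mathbf{0} \\ \mathbf{0}^{T} & 1 \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

The composition rule is: first $\mathbf{T}_1$, then post-multiplication by $\mathbf{T}_2$, then pre-multiplication by $\mathbf{T}_3$:

$$
\begin{aligned}
\mathbf{T}^0_m = \mathbf{T}_3 \mathbf{T}_1 \mathbf{T}_2
&=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 2 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\\
&=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 1 & 0 & 2 \\
0 & 0 & 0 & 1
\end{bmatrix}
\\
&=
\begin{bmatrix}
0 & 0 & -1 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 2 \\
0 & 0 & 0 & 1
\end{bmatrix}
\end{aligned}
$$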

Example 2.10.3

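A geometrical point $P$ is represented in $\mathcal{R}_m$ by $\mathbf{v}^m = \begin{bmatrix} 4 & 3 & 2 \end{bmatrix}^{T}$; $\mathcal{R}_m$ is translated by $\mathbf{t} = \begin{bmatrix} -3 & 0 & 7 \end{bmatrix}^{T}$ and then rotated according to

$$
\mathbf{R} =
\begin{bmatrix}
0 & -1 & 0 \\
-1 & 0 & 0 \\
0 & 0 & -1
\end{bmatrix}
$$

with respect to the new frame. Find the components of $P$ in $\mathcal{R}_0$.

$\mathbf{T}^0_m$ is obtained by post-multiplying the translation operator by the rotation operator, as

$$
\mathbf{T}^0_m =
\underbrace{\begin{bmatrix}
1 & 0 & 0 & -3 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 7 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\text{translation}}
\underbrace{\begin{bmatrix}
0 & -1 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\text{rotation}}
=
\begin{bmatrix}
0 & -1 & 0 & -3 \\
-1 & 0 & 0 & 0 \\
0 & 0 & -1 & 7 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

At this point one computes the homogeneous representation of the transformed vector as

$$
\tilde{\mathbf{v}}^0 =
\begin{bmatrix}
0 & -1 & 0 & -3 \\
-1 & 0 & 0 & 0 \\
0 & 0 & -1 & 7 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 4 \\ 3 \\ 2 \\ 1 \end{bmatrix}
=
\begin{bmatrix} -6 \\ -4 \\ 5 \\ 1 \end{bmatrix}
$$

obtaining

$$
\mathbf{v}^0 = \begin{bmatrix} -6 \\ -4 \\ 5 \end{bmatrix}
$$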

2.11 Rigid Body Representation

A rigid body $B$ in a 3D space can be simply represented by a right-handed reference frame $\mathcal{R}_B$ associated to it; it is not strictly necessary that the frame lies inside the body or on its surface, but it must be rigidly joined to it.

We call pose of a rigid body the set of parameters that uniquely define its position and its orientation in $\mathbb{R}^3$. As we will detail in a later Section, the pose of a rigid body not subject to any constraint in $\mathbb{R}^3$ is given by three position parameters and three orientation parameters. In $\mathbb{R}^2$ the pose is given by two position parameters and one orientation parameter.


The pose is computed as follows: once the frame $\mathcal{R}_B$ is given, its pose with respect to an external reference frame $\mathcal{R}_0$, often called the world or (pseudo-)inertial reference frame, can be obtained from the homogeneous transformation $\mathbf{T}^0_B$ (see Figure 2.14).

Figure 2.14: A rigid body pose is characterized by the relation between the body frame RB and the world frame R0.

There are sixteen elements in $\mathbf{T}^0_B = \begin{bmatrix} \mathbf{R}^0_B & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{bmatrix}$, but only six are independent: the three elements of the translation vector and only three elements of the rotation matrix (the rotation matrix is orthonormal, so it is subject to six constraints).

We can now state that the pose of a free rigid body in space is defined by six degrees of freedom (dof): three related to the position of the origin of the reference frame that describes the body, and three related to its orientation with respect to a world reference frame. The motion of a rigid body is described by the time evolution of its pose; the study of the pose and its time history defines what we call the body kinematics. In this Section we will only detail the different ways in which the pose is described, postponing to later Sections the characterization of the time laws involved.

Usually the three position dofs are chosen as the cartesian coordinates of the origin, although sometimes it is useful to adopt spherical coordinates (but we will never use them in these notes). On the contrary, the orientation coordinates are more complex to choose: we will devote the entire Section 2.12 to describing the various possible parameterizations of orientation; for the moment we will refer to these three numbers as the “vector” $\boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix}^{T}$. The body pose is therefore formally defined as

$$
\mathbf{p}(t) \overset{\text{def}}{=}
\begin{bmatrix} p_1(t) \\ p_2(t) \\ p_3(t) \\ p_4(t) \\ p_5(t) \\ p_6(t) \end{bmatrix}
=
\begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ \alpha_1(t) \\ \alpha_2(t) \\ \alpha_3(t) \end{bmatrix}
=
\begin{bmatrix} \mathbf{x}(t) \\ \boldsymbol{\alpha}(t) \end{bmatrix}
\tag{2.65}
$$

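The first three components, arranged in the vector $\mathbf{x}$, form a true vector, i.e., a geometrical vector that can be subject to all the vector operations, while the last three components, which describe the orientation and are arranged in $\boldsymbol{\alpha}$, do not formally obey the vector rules, since the “sum” of two orientations $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ is obtained by the product of two rotation matrices and not by the sum $\begin{bmatrix} \alpha_1 + \beta_1 & \alpha_2 + \beta_2 & \alpha_3 + \beta_3 \end{bmatrix}^{T}$:

$$
\boldsymbol{\alpha} \rightarrow \mathbf{R}(\boldsymbol{\alpha}), \quad \boldsymbol{\beta} \rightarrow \mathbf{R}(\boldsymbol{\beta})
\;\Rightarrow\; \mathbf{R}(\boldsymbol{\alpha})\mathbf{R}(\boldsymbol{\beta}) = \mathbf{R}(\boldsymbol{\gamma}) \;\Rightarrow\; \boldsymbol{\gamma}
$$

So it is not possible to consider $\mathbf{p}$ a vector, and operations such as the sum and the scalar product are meaningless. This fact comes from the formal definition of the quantities involved: indeed

$$
\mathbf{p} \in \mathbb{R}^3 \times SO(3), \quad\text{with}\quad \mathbf{x} \in \mathbb{R}^3, \; \boldsymbol{\alpha} \in SO(3)
$$

where $\mathbb{R}^3$ is the real vector space of dimension 3, and $SO(3)$ is the special orthogonal group of dimension 3, also known as the rotation group; for further details on the group $SO(3)$, see [34].

Nevertheless the habit of calling $\mathbf{p}$ a vector is common, and we will continue to refer to $\mathbf{p}$ or $\boldsymbol{\alpha}$ with the term “vector”, since they are organized as a column of real numbers.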

With this in mind, we can state that there are two possible representations of a rigid body in space

1. The vector $\mathbf{p}(t)$ defined in (2.65).

2. The homogeneous matrix $\mathbf{T}(t)$, defined as:

$$
\mathbf{T}(t) \overset{\text{def}}{=}
\begin{bmatrix} \mathbf{R}(t) & \mathbf{t}(t) \\ \mathbf{0}^{T} & 1 \end{bmatrix}
\tag{2.66}
$$

These two representations are equivalent and it is possible to compute one from the other, as we will show in the next Section.


Figure 2.15: Given p, compute T.

2.11.1 From Pose to Homogeneous matrix

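Given $\mathbf{p}$, we have to compute $\mathbf{T}$, i.e., $\mathbf{R}$ and $\mathbf{t}$ (see Figure 2.15).

The vector $\mathbf{t}$ is simply obtained by taking the first three elements of $\mathbf{p}$, i.e., $\mathbf{x}$

$$
\mathbf{t} \Leftarrow \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} \equiv \mathbf{x} \tag{2.67}
$$

We note that this is true only if (as we do) we give the position in cartesian coordinates and not in cylindrical or spherical coordinates; in the latter cases one has in general $\mathbf{t} = \mathbf{f}(\mathbf{x})$: as an example, from spherical coordinates $\mathbf{x}_{sph} = \begin{bmatrix} \rho & \theta & \phi \end{bmatrix}^{T}$ to cartesian coordinates $\mathbf{x}_{car} = \begin{bmatrix} x_c & y_c & z_c \end{bmatrix}^{T}$, we have

$$
\begin{aligned}
x_c &= t_1 = \rho \sin\theta \cos\phi = f_1(\mathbf{x}_{sph}) \\
y_c &= t_2 = \rho \sin\theta \sin\phi = f_2(\mathbf{x}_{sph}) \\
z_c &= t_3 = \rho \cos\theta = f_3(\mathbf{x}_{sph})
\end{aligned}
$$

To compute $\mathbf{R}$ from $\boldsymbol{\alpha}$ it is necessary to define the physical meaning of the three components $\alpha_i$ of $\boldsymbol{\alpha}$. We assume that they are three angles characterizing the orientation of $\mathcal{R}_B$ in $\mathcal{R}_0$. The most used angles are the so-called Euler angles or the RPY angles, which will be introduced in Sections 2.12.2 and 2.12.3, respectively.

There exists a nonlinear function $\mathbf{R} = \mathbf{g}_E(\boldsymbol{\alpha}_E)$ that gives the elements $r_{ij}$ of a rotation matrix $\mathbf{R}$ from the Euler angles (2.69), and another nonlinear function $\mathbf{R} = \mathbf{g}_{RPY}(\boldsymbol{\alpha}_{RPY})$ that gives the elements $r_{ij}$ of a rotation matrix $\mathbf{R}$ from the RPY angles (2.76).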

2.11.2 From Homogeneous matrix to Pose

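Given $\mathbf{T}$, we have to compute $\mathbf{p}$ (see Figure 2.16).

The vector $\mathbf{x}$ is simply obtained by taking the first three elements of the last column of $\mathbf{T}$

$$
\mathbf{x} \equiv \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} \Leftarrow \begin{bmatrix} t_1 \\ t_2 \\ t_3 \end{bmatrix} \equiv \mathbf{t} \tag{2.68}
$$

Figure 2.16: Given T, compute p.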

Notice that the coordinates obtained in this way are those of the origin of the reference frame described by $\mathbf{T}$; if we need to compute the coordinates of a different geometrical point, we should characterize this new point as a vector in $\mathcal{R}_B$ and transform it using eqn. (2.55).

To compute $\boldsymbol{\alpha}$ from $\mathbf{R}$ it is necessary to define the physical meaning of the three components $\alpha_i$ of $\boldsymbol{\alpha}$. We assume to use the Euler angles or the RPY angles.

There exists an inverse nonlinear function $\boldsymbol{\alpha}_E = \mathbf{g}_E^{-1}(\mathbf{R})$ that gives the Euler angles from the elements $r_{ij}$ of a rotation matrix $\mathbf{R}$ (2.73), and another inverse nonlinear function $\boldsymbol{\alpha}_{RPY} = \mathbf{g}_{RPY}^{-1}(\mathbf{R})$ that gives the RPY angles from the elements $r_{ij}$ of a rotation matrix $\mathbf{R}$ (2.77).

The inverse solution, if it exists, may be non-unique, since a finite number of angle sets $\boldsymbol{\alpha}$ may exist that produce the same rotation matrix $\mathbf{R}$. Since inverse trigonometric functions are involved, we consider as equivalent those solutions that differ by integer multiples of 2π.

In the following Section we will illustrate the most used ways to characterize the orientation of a rigid body in space.

2.12 Orientation Parameters

We have seen that a rotation is a geometrical transformation acting on $\mathbb{R}^3$, defined by an orthonormal matrix $\mathbf{R}$. At the same time the orientation of a rigid body, with a local reference frame attached to it, is represented by the rotation of the local frame with respect to the world or global frame. We conclude that it is equivalent to speak of the orientation of a rigid body or of its reference frame with respect to some other “fixed” frame

Body B Orientation ⇔ Rotation Matrix $\mathbf{R}_B$

We recall that a rotation is characterized by only three parameters, and all the representations described in this Section are different ways to “organize” such parameters; in eqn. (2.65) these three parameters were indicated as $\boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix}^{T}$, but there are other parameterizations that use more than three parameters, such as the quaternions, introduced in Section 2.12.6. Another example comes from the Euler theorem, which states that any composition of rotations is always a rotation of an angle θ around an axis defined by a unit vector $\mathbf{u}$; therefore we can represent a rotation by the set $(\mathbf{u}, \theta)$, where $\mathbf{u}$ has two free parameters (since there is an implicit constraint $\|\mathbf{u}\| = 1$), and the third one coincides with θ. Other authors use a non-unit vector $\mathbf{v}$, whose norm provides the angle value $\|\mathbf{v}\| = \theta$.

Now we describe the most common forms used to represent the orientation of a rigid body in $\mathbb{R}^3$.

2.12.1 Direction Cosines

Direction cosines are nothing else than the rotation matrix $\mathbf{R}$ itself. Indeed $\mathbf{R}$ contains in its columns (or rows) the representation of the unit basis vectors of the local frame with respect to the world frame. This representation needs nine parameters that must obey six unit-norm and orthogonality constraints. In this case the parameters $\begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 \end{bmatrix}^{T}$ are hidden in $\mathbf{R}$ but can be extracted using the equations of Sections 2.12.2 and 2.12.3.

2.12.2 Euler Angles

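Historically this is the first representation of the orientation of a body: it associates to $\alpha_1, \alpha_2, \alpha_3$ three angles, called Euler angles and usually denoted by $\phi, \theta, \psi$. To understand how to build the Euler angles it is necessary to define them through an implicit procedure. The orientation of a mobile reference frame $\mathcal{R}_m$ described by the Euler angles is obtained by three successive rotations around the principal axes, following a precise rule:

First rotation, angle $\phi$

$$
\mathbf{R}_{z,\phi} \equiv \mathbf{R}(\mathbf{k}, \phi) =
\begin{bmatrix}
c_\phi & -s_\phi & 0 \\
s_\phi & c_\phi & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

rotation of an angle $\phi$ around the local (mobile) axis $z$.

Second rotation, angle $\theta$

$$
\mathbf{R}_{x,\theta} \equiv \mathbf{R}(\mathbf{i}, \theta) =
\begin{bmatrix}
1 & 0 & 0 \\
0 & c_\theta & -s_\theta \\
0 & s_\theta & c_\theta
\end{bmatrix}
$$

rotation of an angle $\theta$ around the local (mobile) axis $x$.

Third rotation, angle $\psi$

$$
\mathbf{R}_{z,\psi} \equiv \mathbf{R}(\mathbf{k}, \psi) =
\begin{bmatrix}
c_\psi & -s_\psi & 0 \\
s_\psi & c_\psi & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

rotation of an angle $\psi$ around the local (mobile) axis $z$.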

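Using the rule “pre-fixed”, “post-mobile” we compute the complete rotation matrix based on the Euler angles as

$$
\mathbf{R}(\phi, \theta, \psi) \equiv \mathbf{R}_{z,\phi}\mathbf{R}_{x,\theta}\mathbf{R}_{z,\psi} \equiv \mathbf{R}(\mathbf{k}, \phi)\mathbf{R}(\mathbf{i}, \theta)\mathbf{R}(\mathbf{k}, \psi) =
\begin{bmatrix}
c_\phi c_\psi - s_\phi c_\theta s_\psi & -c_\phi s_\psi - s_\phi c_\theta c_\psi & s_\phi s_\theta \\
s_\phi c_\psi + c_\phi c_\theta s_\psi & -s_\phi s_\psi + c_\phi c_\theta c_\psi & -c_\phi s_\theta \\
s_\theta s_\psi & s_\theta c_\psi & c_\theta
\end{bmatrix}
\tag{2.69}
$$

This composition rule is not unique in the technical literature, since many textbooks adopt a slightly different convention: the second rotation around the $x$ axis is replaced by a rotation around the mobile axis $y$, producing a different “Euler” matrix:

$$
\mathbf{R}(\phi, \theta, \psi) \equiv \mathbf{R}_{z,\phi}\mathbf{R}_{y,\theta}\mathbf{R}_{z,\psi} \equiv \mathbf{R}(\mathbf{k}, \phi)\mathbf{R}(\mathbf{j}, \theta)\mathbf{R}(\mathbf{k}, \psi) =
\begin{bmatrix}
-s_\phi s_\psi + c_\phi c_\theta c_\psi & -s_\phi c_\psi - c_\phi c_\theta s_\psi & c_\phi s_\theta \\
c_\phi s_\psi + s_\phi c_\theta c_\psi & c_\phi c_\psi - s_\phi c_\theta s_\psi & s_\phi s_\theta \\
-s_\theta c_\psi & s_\theta s_\psi & c_\theta
\end{bmatrix}
\tag{2.70}
$$

In these notes we will always adopt the first form (2.69), but we recall that in many aerospace engineering textbooks the form (2.70) is widely used.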

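The above formulas are also called direct relations, since, given the three Euler angles, they compute the rotation matrix R(ϕ, θ, ψ). The inverse relation, i.e., how to compute the Euler angles given a generic matrix R, is solved by considering the generic elements of the matrix, provided it is orthonormal:

\[
R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix},
\tag{2.71}
\]

where the elements r_ij are known. To obtain the Euler angles it is necessary to solve the following nonlinear equation system:

\[
\begin{aligned}
r_{11} &= c_\phi c_\psi - s_\phi c_\theta s_\psi \\
r_{12} &= -c_\phi s_\psi - s_\phi c_\theta c_\psi \\
r_{13} &= s_\phi s_\theta \\
r_{21} &= s_\phi c_\psi + c_\phi c_\theta s_\psi \\
r_{22} &= -s_\phi s_\psi + c_\phi c_\theta c_\psi \\
r_{23} &= -c_\phi s_\theta \\
r_{31} &= s_\theta s_\psi \\
r_{32} &= s_\theta c_\psi \\
r_{33} &= c_\theta
\end{aligned}
\tag{2.72}
\]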

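that has the following generic solution

\[
\begin{aligned}
\theta &= \pm \arccos(r_{33}) \pm 2k\pi \\
\psi &= \pm \arccos\left(\frac{r_{32}}{s_\theta}\right) \pm 2k\pi \\
\phi &= \pm \arccos\left(\frac{-r_{23}}{s_\theta}\right) \pm 2k\pi
\end{aligned}
\tag{2.73}
\]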

Unfortunately this solution presents some drawbacks:

1. the inverse trigonometric function arccos(·) is not unique: indeed cos(θ) = cos(−θ);

2. the solution is undefined for r33 = ±1, i.e., when sθ = 0; in this case the angles ϕ and ψ are not uniquely determined; only a combination of them is given (their sum when r33 = 1);

3. when θ → 0 or θ → ±180°, the second and the third equations provide inaccurate solutions, since the numeric accuracy of the arccos(·) function depends on the angle value.

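To overcome these drawbacks, instead of arccos(·) it is customary to use the function atan2(y, x), which is available in the mathematical libraries of the most common programming languages. It is defined as the angle of tan^{-1}(y/x) placed in the quadrant selected by the signs of x and y:

\[
\theta = \operatorname{atan2}(y, x) = \tan^{-1}\left(\frac{y}{x}\right), \qquad
\begin{cases}
0^\circ \le \theta \le 90^\circ & \text{if } x \ge 0,\ y \ge 0 \\
90^\circ \le \theta \le 180^\circ & \text{if } x \le 0,\ y \ge 0 \\
-180^\circ \le \theta \le -90^\circ & \text{if } x \le 0,\ y \le 0 \\
-90^\circ \le \theta \le 0^\circ & \text{if } x \ge 0,\ y \le 0
\end{cases}
\tag{2.74}
\]

Moreover, by default, atan2(0, 0) = 0.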


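Using this function, the solution to (2.72) is:

\[
\begin{aligned}
\phi &= \operatorname{atan2}(r_{13},\, -r_{23}) \pm 2k\pi \\
\psi &= \operatorname{atan2}(-c_\phi r_{12} - s_\phi r_{22},\; c_\phi r_{11} + s_\phi r_{21}) \pm 2k\pi \\
\theta &= \operatorname{atan2}(s_\phi r_{13} - c_\phi r_{23},\; r_{33}) \pm 2k\pi
\end{aligned}
\tag{2.75}
\]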
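The direct relation (2.69) and the inverse relation (2.75) can be checked numerically with a short round trip. The following is only a minimal sketch in Python: the function names are illustrative (they do not come from these notes), matrices are stored as nested lists, and the test angles are chosen with sin θ > 0, away from the singular configuration.

```python
import math

def euler_zxz_to_matrix(phi, theta, psi):
    """Direct relation (2.69): R = Rz(phi) Rx(theta) Rz(psi)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        [cf*cp - sf*ct*sp, -cf*sp - sf*ct*cp,  sf*st],
        [sf*cp + cf*ct*sp, -sf*sp + cf*ct*cp, -cf*st],
        [st*sp,             st*cp,             ct],
    ]

def matrix_to_euler_zxz(R):
    """Inverse relation (2.75), with k = 0; assumes sin(theta) != 0."""
    phi = math.atan2(R[0][2], -R[1][2])
    cf, sf = math.cos(phi), math.sin(phi)
    psi = math.atan2(-cf*R[0][1] - sf*R[1][1], cf*R[0][0] + sf*R[1][0])
    theta = math.atan2(sf*R[0][2] - cf*R[1][2], R[2][2])
    return phi, theta, psi

# Round trip on an arbitrary, non-singular triple (radians).
angles = (0.3, 1.1, -0.7)
recovered = matrix_to_euler_zxz(euler_zxz_to_matrix(*angles))
```

With sin θ > 0 the recovered triple coincides with the original one up to rounding; for sin θ < 0 the same formulas return a different, but equivalent, triple.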

Euler angles singularity

We observe that when r33 = 1 the matrix R(ϕ, θ, ψ) is equal to the elementary matrix R(k, γ), which is a function of a single angle. In this case we say that the Euler representation is singular: the three angles cannot all be determined. From (2.73) we have θ = 0, and the product R(k, ϕ)R(i, θ)R(k, ψ) reduces to

\[
R(k, \phi) R(k, \psi) = R(k, \phi + \psi)
\]

from which we have γ = ϕ + ψ; we cannot compute the two angles ϕ and ψ separately, but only their sum. This situation is described by saying that θ no longer decouples the other two rotations, so that a singular configuration is produced.

2.12.3 RPY Angles

Also in the case of Roll-Pitch-Yaw angles (RPY for short) θx, θy, θz it is necessary to define them through an implicit procedure. The orientation of a mobile reference frame Rm described by the RPY angles is obtained by three successive rotations around the principal axes, following a precise rule:

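First rotation, angle θx

\[
R_{x,\theta_x} \equiv R(i, \theta_x) =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & c_{\theta_x} & -s_{\theta_x} \\ 0 & s_{\theta_x} & c_{\theta_x} \end{bmatrix}
\]

rotation by an angle θx around the world (fixed) axis x.

Second rotation, angle θy

\[
R_{y,\theta_y} \equiv R(j, \theta_y) =
\begin{bmatrix} c_{\theta_y} & 0 & s_{\theta_y} \\ 0 & 1 & 0 \\ -s_{\theta_y} & 0 & c_{\theta_y} \end{bmatrix}
\]

rotation by an angle θy around the world (fixed) axis y.

Third rotation, angle θz

\[
R_{z,\theta_z} \equiv R(k, \theta_z) =
\begin{bmatrix} c_{\theta_z} & -s_{\theta_z} & 0 \\ s_{\theta_z} & c_{\theta_z} & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

rotation by an angle θz around the world (fixed) axis z.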

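Using the rule “pre–fixed” “post–mobile” we compute the complete rotation matrix based on the RPY angles as

\[
R(\theta_x, \theta_y, \theta_z) \equiv R_{z,\theta_z} R_{y,\theta_y} R_{x,\theta_x} \equiv R(k, \theta_z) R(j, \theta_y) R(i, \theta_x)
\overset{\text{def}}{=}
\begin{bmatrix}
c_{\theta_z} c_{\theta_y} & s_{\theta_x} s_{\theta_y} c_{\theta_z} - c_{\theta_x} s_{\theta_z} & c_{\theta_x} s_{\theta_y} c_{\theta_z} + s_{\theta_x} s_{\theta_z} \\
c_{\theta_y} s_{\theta_z} & s_{\theta_x} s_{\theta_y} s_{\theta_z} + c_{\theta_x} c_{\theta_z} & c_{\theta_x} s_{\theta_y} s_{\theta_z} - s_{\theta_x} c_{\theta_z} \\
-s_{\theta_y} & s_{\theta_x} c_{\theta_y} & c_{\theta_x} c_{\theta_y}
\end{bmatrix}
\tag{2.76}
\]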

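The RPY angles are computed applying the same approach adopted for the Euler angles, yielding:

\[
\begin{aligned}
\theta_x &= \operatorname{atan2}(r_{32},\, r_{33}) \pm 2k\pi \\
\theta_z &= \operatorname{atan2}(-c_{\theta_x} r_{12} + s_{\theta_x} r_{13},\; c_{\theta_x} r_{22} - s_{\theta_x} r_{23}) \pm 2k\pi \\
\theta_y &= \operatorname{atan2}(-r_{31},\; s_{\theta_x} r_{32} + c_{\theta_x} r_{33}) \pm 2k\pi
\end{aligned}
\tag{2.77}
\]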
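As for the Euler angles, relations (2.76) and (2.77) can be verified with a round trip. The sketch below is illustrative only: the function names are assumptions, angles are in radians, and the test triple is chosen with cos θy > 0, i.e., away from the RPY singularity.

```python
import math

def rpy_to_matrix(tx, ty, tz):
    """Direct relation (2.76): R = Rz(tz) Ry(ty) Rx(tx)."""
    cx, sx = math.cos(tx), math.sin(tx)
    cy, sy = math.cos(ty), math.sin(ty)
    cz, sz = math.cos(tz), math.sin(tz)
    return [
        [cz*cy, sx*sy*cz - cx*sz, cx*sy*cz + sx*sz],
        [cy*sz, sx*sy*sz + cx*cz, cx*sy*sz - sx*cz],
        [-sy,   sx*cy,            cx*cy],
    ]

def matrix_to_rpy(R):
    """Inverse relation (2.77), with k = 0; assumes cos(ty) != 0."""
    tx = math.atan2(R[2][1], R[2][2])
    cx, sx = math.cos(tx), math.sin(tx)
    tz = math.atan2(-cx*R[0][1] + sx*R[0][2], cx*R[1][1] - sx*R[1][2])
    ty = math.atan2(-R[2][0], sx*R[2][1] + cx*R[2][2])
    return tx, ty, tz

rpy = (0.2, -0.5, 1.3)                       # (theta_x, theta_y, theta_z)
rpy_back = matrix_to_rpy(rpy_to_matrix(*rpy))
```

Note that θx is extracted first and then reused, through its sine and cosine, to obtain θz and θy, exactly as prescribed by (2.77).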

It is interesting to notice that the product R(k, θz)R(j, θy)R(i, θx) may also be read in a different order, from left to right, i.e., applying the “post-mobile” rule: first apply a rotation R(k, θz) around axis z, then apply a rotation R(j, θy) around the mobile axis y, then apply a rotation R(i, θx) around the mobile axis x. The difference is only in the description of the rotations, not in the final result (2.76), which is the same.

Other definitions can be found in various textbooks and are sometimes used; for instance, the following alternative sequence is quite common:

1) A rotation R(k, θz) of θz around the fixed axis z;

2) A rotation R(j, θy) of θy around the fixed axis y;

3) A rotation R(i, θx) of θx around the fixed axis x.

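The resulting rotation matrix is:

\[
R(\theta_z, \theta_y, \theta_x) \equiv R_{x,\theta_x} R_{y,\theta_y} R_{z,\theta_z} \equiv R(i, \theta_x) R(j, \theta_y) R(k, \theta_z)
\overset{\text{def}}{=}
\begin{bmatrix}
c_{\theta_y} c_{\theta_z} & -c_{\theta_y} s_{\theta_z} & s_{\theta_y} \\
s_{\theta_x} s_{\theta_y} c_{\theta_z} + c_{\theta_x} s_{\theta_z} & -s_{\theta_x} s_{\theta_y} s_{\theta_z} + c_{\theta_x} c_{\theta_z} & -s_{\theta_x} c_{\theta_y} \\
-c_{\theta_x} s_{\theta_y} c_{\theta_z} + s_{\theta_x} s_{\theta_z} & c_{\theta_x} s_{\theta_y} s_{\theta_z} + s_{\theta_x} c_{\theta_z} & c_{\theta_x} c_{\theta_y}
\end{bmatrix}
\tag{2.78}
\]

In these notes we will apply only definition (2.76).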

RPY angles singularity

Roll-Pitch-Yaw angles are subject to singularities too, like the Euler angles and, in general, all three-angle representations.


In particular the following relations hold

\[
R(k, \theta_z) R(j, 90^\circ) = R(j, 90^\circ) R(i, -\theta_z)
\tag{2.79}
\]

\[
R(i, \theta_x) R(j, 90^\circ) = R(j, 90^\circ) R(k, \theta_x)
\tag{2.80}
\]

When θy = 90° the RPY matrix in (2.76) becomes singular, since from (2.79)

\[
R(k, \theta_z) R(j, 90^\circ) R(i, \theta_x) = R(j, 90^\circ) R(i, -\theta_z) R(i, \theta_x) = R(j, 90^\circ) R(i, \theta_x - \theta_z)
\]

and also the RPY matrix in (2.78) becomes singular, since from (2.80)

\[
R(i, \theta_x) R(j, 90^\circ) R(k, \theta_z) = R(j, 90^\circ) R(k, \theta_x) R(k, \theta_z) = R(j, 90^\circ) R(k, \theta_x + \theta_z)
\]

Both representations are no longer functions of three angles, but only of a combination of two of them (the difference θx − θz in the first case, as can be checked by setting θy = 90° in (2.76), and the sum θx + θz in the second); this fact is the cause of the so-called gimbal-lock problem that occurs in gyroscopes and is well known since the incident on the Apollo 10 Manned Lunar Spacecraft [22].

2.12.4 Cardan Angles

We have seen in the previous Sections that many alternatives are possible in order to define three rotations. All the resulting angles are generically called Cardan angles.

Two possible groups of rotations arise: the first group includes the products of three elementary rotations around three different axes; we call θ1, θ2 and θ3 the three generic Cardan angles involved; Table 2.1 lists all possible combinations.

The second group includes the products of three rotation matrices that involve only two distinct axes (the first and the third rotation share the same axis); in this case the Cardan angles are called α, β and γ. Table 2.2 lists all possible combinations.

Cardan angles singularities

As already noted in Sections 2.12.2 and 2.12.3, a sequence of three angles produces a singularity when a certain pattern of angles appears. For the first group it is the presence of a 90° angle that may produce a singular behaviour, since the following identities hold

\[
\begin{aligned}
R(i, \theta_x) R(j, 90^\circ) &= R(j, 90^\circ) R(k, \theta_x) \\
R(i, \theta_x) R(k, 90^\circ) &= R(k, 90^\circ) R(j, -\theta_x) \\
R(j, \theta_y) R(i, 90^\circ) &= R(i, 90^\circ) R(k, -\theta_y) \\
R(j, \theta_y) R(k, 90^\circ) &= R(k, 90^\circ) R(i, \theta_y) \\
R(k, \theta_z) R(i, 90^\circ) &= R(i, 90^\circ) R(j, \theta_z) \\
R(k, \theta_z) R(j, 90^\circ) &= R(j, 90^\circ) R(i, -\theta_z)
\end{aligned}
\tag{2.81}
\]

while for the second group it is sufficient that the middle rotation matrix is equal to the identity to give rise to a singular representation.


In the following table c_i ≡ cos θ_i and s_i ≡ sin θ_i.

R(i, θ1)R(j, θ2)R(k, θ3): see Eqn. (2.78)

R(i, θ1)R(k, θ3)R(j, θ2):

\[
\begin{bmatrix}
c_2 c_3 & -s_3 & s_2 c_3 \\
c_1 c_2 s_3 + s_1 s_2 & c_1 c_3 & c_1 s_2 s_3 - s_1 c_2 \\
s_1 c_2 s_3 - c_1 s_2 & s_1 c_3 & s_1 s_2 s_3 + c_1 c_2
\end{bmatrix}
\]

R(j, θ2)R(i, θ1)R(k, θ3):

\[
\begin{bmatrix}
s_1 s_2 s_3 + c_2 c_3 & s_1 s_2 c_3 - c_2 s_3 & c_1 s_2 \\
c_1 s_3 & c_1 c_3 & -s_1 \\
s_1 c_2 s_3 - s_2 c_3 & s_1 c_2 c_3 + s_2 s_3 & c_1 c_2
\end{bmatrix}
\]

R(j, θ2)R(k, θ3)R(i, θ1):

\[
\begin{bmatrix}
c_2 c_3 & -c_1 c_2 s_3 + s_1 s_2 & s_1 c_2 s_3 + c_1 s_2 \\
s_3 & c_1 c_3 & -s_1 c_3 \\
-s_2 c_3 & c_1 s_2 s_3 + s_1 c_2 & -s_1 s_2 s_3 + c_1 c_2
\end{bmatrix}
\]

R(k, θ3)R(i, θ1)R(j, θ2):

\[
\begin{bmatrix}
-s_1 s_2 s_3 + c_2 c_3 & -c_1 s_3 & s_1 c_2 s_3 + s_2 c_3 \\
s_1 s_2 c_3 + c_2 s_3 & c_1 c_3 & -s_1 c_2 c_3 + s_2 s_3 \\
-c_1 s_2 & s_1 & c_1 c_2
\end{bmatrix}
\]

R(k, θ3)R(j, θ2)R(i, θ1): see Eqn. (2.76)

Table 2.1: Cardan angles obtained by three elementary rotations around three different axes.

2.12.5 Euler Parameters

Given the rotation R(u, θ) described in (2.27), of an angle θ around an axis represented by the unit vector u = [u1 u2 u3]^T, we introduce four parameters, called Euler parameters vi (not to be confused with the Euler angles), defined as:

\[
v_1 = u_1 \sin\frac{\theta}{2}, \quad
v_2 = u_2 \sin\frac{\theta}{2}, \quad
v_3 = u_3 \sin\frac{\theta}{2}, \quad
v_4 = \cos\frac{\theta}{2}
\tag{2.82}
\]

Only three out of four of these parameters are independent, since the following constraint holds

\[
\sum_{i=1}^{4} v_i^2 = 1
\tag{2.83}
\]

The Euler parameters may be interpreted as the components of a unit quaternion u. The quaternions will be introduced in Section 2.12.6.


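R(i, α)R(j, β)R(i, γ):

\[
\begin{bmatrix}
c_\beta & s_\beta s_\gamma & s_\beta c_\gamma \\
s_\alpha s_\beta & -s_\alpha c_\beta s_\gamma + c_\alpha c_\gamma & -s_\alpha c_\beta c_\gamma - c_\alpha s_\gamma \\
-c_\alpha s_\beta & c_\alpha c_\beta s_\gamma + s_\alpha c_\gamma & c_\alpha c_\beta c_\gamma - s_\alpha s_\gamma
\end{bmatrix}
\]

R(i, α)R(k, β)R(i, γ):

\[
\begin{bmatrix}
c_\beta & -s_\beta c_\gamma & s_\beta s_\gamma \\
c_\alpha s_\beta & c_\alpha c_\beta c_\gamma - s_\alpha s_\gamma & -c_\alpha c_\beta s_\gamma - s_\alpha c_\gamma \\
s_\alpha s_\beta & s_\alpha c_\beta c_\gamma + c_\alpha s_\gamma & -s_\alpha c_\beta s_\gamma + c_\alpha c_\gamma
\end{bmatrix}
\]

R(j, α)R(i, β)R(j, γ):

\[
\begin{bmatrix}
-s_\alpha c_\beta s_\gamma + c_\alpha c_\gamma & s_\alpha s_\beta & s_\alpha c_\beta c_\gamma + c_\alpha s_\gamma \\
s_\beta s_\gamma & c_\beta & -s_\beta c_\gamma \\
-c_\alpha c_\beta s_\gamma - s_\alpha c_\gamma & c_\alpha s_\beta & c_\alpha c_\beta c_\gamma - s_\alpha s_\gamma
\end{bmatrix}
\]

R(j, α)R(k, β)R(j, γ):

\[
\begin{bmatrix}
c_\alpha c_\beta c_\gamma - s_\alpha s_\gamma & -c_\alpha s_\beta & c_\alpha c_\beta s_\gamma + s_\alpha c_\gamma \\
s_\beta c_\gamma & c_\beta & s_\beta s_\gamma \\
-s_\alpha c_\beta c_\gamma - c_\alpha s_\gamma & s_\alpha s_\beta & -s_\alpha c_\beta s_\gamma + c_\alpha c_\gamma
\end{bmatrix}
\]

R(k, α)R(i, β)R(k, γ): see Eqn. (2.69)

R(k, α)R(j, β)R(k, γ): see Eqn. (2.70)

Table 2.2: Cardan angles obtained by three elementary rotations around two different axes.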

Given u, θ and the Euler parameters vi, the rotation matrix R(u, θ) can be computed as:

\[
R(u, \theta) =
\begin{bmatrix}
v_1^2 - v_2^2 - v_3^2 + v_4^2 & 2(v_1 v_2 - v_3 v_4) & 2(v_1 v_3 + v_2 v_4) \\
2(v_1 v_2 + v_3 v_4) & -v_1^2 + v_2^2 - v_3^2 + v_4^2 & 2(v_2 v_3 - v_1 v_4) \\
2(v_1 v_3 - v_2 v_4) & 2(v_2 v_3 + v_1 v_4) & -v_1^2 - v_2^2 + v_3^2 + v_4^2
\end{bmatrix}
\tag{2.84}
\]

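Conversely, given R(u, θ), the Euler parameters are computed as:

\[
\begin{aligned}
v_4 &= \pm\frac{1}{2}\sqrt{1 + r_{11} + r_{22} + r_{33}} \\
v_1 &= \frac{1}{4 v_4}(r_{32} - r_{23}) \\
v_2 &= \frac{1}{4 v_4}(r_{13} - r_{31}) \\
v_3 &= \frac{1}{4 v_4}(r_{21} - r_{12})
\end{aligned}
\tag{2.85}
\]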
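Relations (2.82), (2.84) and (2.85) translate almost literally into code. The sketch below uses hypothetical function names and stores the parameters as a tuple (v1, v2, v3, v4); the inverse map takes the positive sign for v4 and is therefore not usable when v4 = 0, i.e., for θ = 180°.

```python
import math

def euler_params(u, theta):
    """Euler parameters (2.82) for a unit axis u and an angle theta (rad)."""
    s = math.sin(theta / 2)
    return (u[0]*s, u[1]*s, u[2]*s, math.cos(theta / 2))

def params_to_matrix(v):
    """Rotation matrix (2.84); only products and sums are needed."""
    v1, v2, v3, v4 = v
    return [
        [v1*v1 - v2*v2 - v3*v3 + v4*v4, 2*(v1*v2 - v3*v4), 2*(v1*v3 + v2*v4)],
        [2*(v1*v2 + v3*v4), -v1*v1 + v2*v2 - v3*v3 + v4*v4, 2*(v2*v3 - v1*v4)],
        [2*(v1*v3 - v2*v4), 2*(v2*v3 + v1*v4), -v1*v1 - v2*v2 + v3*v3 + v4*v4],
    ]

def matrix_to_params(R):
    """Inverse relation (2.85) with the positive sign; requires v4 != 0."""
    v4 = 0.5 * math.sqrt(1 + R[0][0] + R[1][1] + R[2][2])
    return ((R[2][1] - R[1][2]) / (4*v4),
            (R[0][2] - R[2][0]) / (4*v4),
            (R[1][0] - R[0][1]) / (4*v4),
            v4)

v = euler_params((0.0, 1.0, 0.0), math.pi / 2)   # rotation R(j, 90 deg)
R = params_to_matrix(v)                          # expected: rows (0,0,1), (0,1,0), (-1,0,0)
v_back = matrix_to_params(R)
```

The chosen test case is the elementary rotation R(j, 90°), so that the result can be compared by hand with the corresponding elementary matrix.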


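The sign ambiguity in v4 can be eliminated by constraining the angle as

\[
-\frac{\pi}{2} \le \frac{\theta}{2} \le \frac{\pi}{2},
\]

or −π ≤ θ ≤ π; in this way v4 cannot be negative.

The rotation angle can be computed from

\[
\cos\theta = v_4^2 - (v_1^2 + v_2^2 + v_3^2)
\]

and the unit vector u as

\[
u = \frac{1}{\sin(\theta/2)} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
\]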

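The Euler parameters can also be computed directly from the Euler angles (ϕ, θ, ψ) of the convention (2.69) as:

\[
\begin{aligned}
v_1 &= \cos\left(\frac{\phi - \psi}{2}\right) \sin\left(\frac{\theta}{2}\right) \\
v_2 &= \sin\left(\frac{\phi - \psi}{2}\right) \sin\left(\frac{\theta}{2}\right) \\
v_3 &= \sin\left(\frac{\phi + \psi}{2}\right) \cos\left(\frac{\theta}{2}\right) \\
v_4 &= \cos\left(\frac{\phi + \psi}{2}\right) \cos\left(\frac{\theta}{2}\right)
\end{aligned}
\tag{2.86}
\]

As a check, for ϕ = ψ = 0 the matrix (2.69) reduces to R(i, θ), and (2.86) gives v1 = sin(θ/2), v2 = v3 = 0, v4 = cos(θ/2), in agreement with (2.82).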

Often the Euler parameters are denoted by the symbol

\[
v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix}
\]

which looks like a vector but is not one, since applying vector operators to it makes no sense.

The Euler parameters are a very convenient form for parameterizing rotations: they are more compact than the rotation matrix R and more efficient from an algorithmic point of view, since computing R = R(v) from them, as in (2.84), does not require the use of trigonometric functions.

Moreover, given two rotations R(v_a) and R(v_b), with their respective Euler parameters v_a and v_b, one can directly compute the rotation product R(v_c) = R(v_a)R(v_b) using the following matrix product:

\[
v_c = F(v_a)\, v_b =
\begin{bmatrix}
v_{a4} & -v_{a3} & v_{a2} & v_{a1} \\
v_{a3} & v_{a4} & -v_{a1} & v_{a2} \\
-v_{a2} & v_{a1} & v_{a4} & v_{a3} \\
-v_{a1} & -v_{a2} & -v_{a3} & v_{a4}
\end{bmatrix} v_b
\tag{2.87}
\]

Only 16 multiplications are required, compared to the 27 that are involved in the product between two 3 × 3 rotation matrices.
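The product rule (2.87) can be written out row by row. The following sketch (the function name `compose` is an illustrative choice) composes the Euler parameters of the two elementary rotations R(i, 90°) and R(j, 90°); parameters are ordered as (v1, v2, v3, v4).

```python
import math

def compose(va, vb):
    """Euler parameters of R(va) R(vb), computed as F(va) vb, see (2.87)."""
    a1, a2, a3, a4 = va
    b1, b2, b3, b4 = vb
    return (
         a4*b1 - a3*b2 + a2*b3 + a1*b4,   # first row of F(va)
         a3*b1 + a4*b2 - a1*b3 + a2*b4,   # second row
        -a2*b1 + a1*b2 + a4*b3 + a3*b4,   # third row
        -a1*b1 - a2*b2 - a3*b3 + a4*b4,   # fourth row
    )

s = math.sqrt(2) / 2
va = (s, 0.0, 0.0, s)     # R(i, 90 deg), from (2.82)
vb = (0.0, s, 0.0, s)     # R(j, 90 deg), from (2.82)
vc = compose(va, vb)      # all four components equal 1/2 (up to rounding)
```

The result has all components equal to 1/2, which corresponds to a rotation of 120° around the axis [1 1 1]^T; exactly 16 multiplications appear in the function body.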

2.12.6 Quaternions

Quaternions were introduced by Hamilton, who discovered them in 1843, in order to generalize to three dimensions the complex numbers and their characteristic of being plane rotators. For a detailed history of the problems and discussions raised by the quaternions in the scientific community in the middle of the XIX century, read the interesting monograph by Crowe [7].

The quaternion algebra, with its definitions and the operators used for representing rotations, is described in detail in Appendix D.

The symbol for the generic quaternion is h.

2.12.7 Quaternions and Rotations

In order to use a quaternion u = (u0, u1, u2, u3) to represent a rotation, we assign to each element ui an Euler parameter vi:

\[
u_0 = v_4, \quad u_1 = v_1, \quad u_2 = v_2, \quad u_3 = v_3
\tag{2.88}
\]

The quaternion so defined is therefore

\[
u = (u_0, u_1, u_2, u_3) = \left(\cos\frac{\theta}{2},\; u_1 \sin\frac{\theta}{2},\; u_2 \sin\frac{\theta}{2},\; u_3 \sin\frac{\theta}{2}\right)
\]

where, on the right-hand side, u1, u2, u3 denote the components of the unit vector of the rotation axis, as in (2.82); the quaternion has unit norm. Since a biunivocal correspondence exists between the Euler parameters and the unit quaternions, any unit quaternion represents a rotation in the 3D space, as any unit complex number represents a rotation in the 2D plane.

We write R(u) to indicate that for every rotation there is a corresponding unit quaternion u and vice versa.

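In order to convert a unit quaternion u = (u0, u_v) into the corresponding matrix R(u), we use the following relation, which is equivalent to (2.84), valid for Euler parameters:

\[
R(u) = (u_0^2 - u_v^T u_v) I + 2 u_v u_v^T + 2 u_0 S(u_v) =
\begin{bmatrix}
u_0^2 + u_1^2 - u_2^2 - u_3^2 & 2(u_1 u_2 - u_3 u_0) & 2(u_1 u_3 + u_2 u_0) \\
2(u_1 u_2 + u_3 u_0) & u_0^2 - u_1^2 + u_2^2 - u_3^2 & 2(u_2 u_3 - u_1 u_0) \\
2(u_1 u_3 - u_2 u_0) & 2(u_2 u_3 + u_1 u_0) & u_0^2 - u_1^2 - u_2^2 + u_3^2
\end{bmatrix}
\tag{2.89}
\]

Here S(u_v) is the skew-symmetric matrix built from the vector part u_v; the sign of the last term of the compact form is the one that reproduces the off-diagonal elements of the matrix on the right.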


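Conversely, to compute the quaternion u starting from the elements r_ij of the rotation matrix R(u), we use a relation that is equivalent to (2.85), valid for Euler parameters:

\[
\begin{aligned}
u_0 &= \pm\frac{1}{2}\sqrt{1 + r_{11} + r_{22} + r_{33}} \\
u_1 &= \frac{1}{4 u_0}(r_{32} - r_{23}) \\
u_2 &= \frac{1}{4 u_0}(r_{13} - r_{31}) \\
u_3 &= \frac{1}{4 u_0}(r_{21} - r_{12})
\end{aligned}
\tag{2.90}
\]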

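When u0 = 0, i.e., θ/2 = π/2 → θ = π = 180°, we use a different formula

\[
\begin{aligned}
u_0 &= \frac{1}{2}\sqrt{1 + r_{11} + r_{22} + r_{33}} \\
u_1 &= \frac{1}{2}\operatorname{sgn}(r_{32} - r_{23})\sqrt{1 + r_{11} - r_{22} - r_{33}} \\
u_2 &= \frac{1}{2}\operatorname{sgn}(r_{13} - r_{31})\sqrt{1 - r_{11} + r_{22} - r_{33}} \\
u_3 &= \frac{1}{2}\operatorname{sgn}(r_{21} - r_{12})\sqrt{1 - r_{11} - r_{22} + r_{33}}
\end{aligned}
\tag{2.91}
\]

where sgn(x) is the sign function of x

\[
\operatorname{sgn}(x) =
\begin{cases}
+1 & \text{for } x > 0 \\
0 & \text{for } x = 0 \\
-1 & \text{for } x < 0
\end{cases}
\]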

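According to (2.29) – (2.31), the elementary rotations R(i, α), R(j, β) and R(k, γ) correspond to the following elementary quaternions:

\[
\begin{aligned}
R(i, \alpha) &\to u_1 = \left(\cos\frac{\alpha}{2},\; \sin\frac{\alpha}{2},\; 0,\; 0\right) \\
R(j, \beta) &\to u_2 = \left(\cos\frac{\beta}{2},\; 0,\; \sin\frac{\beta}{2},\; 0\right) \\
R(k, \gamma) &\to u_3 = \left(\cos\frac{\gamma}{2},\; 0,\; 0,\; \sin\frac{\gamma}{2}\right)
\end{aligned}
\tag{2.92}
\]

therefore the vectorial base of the quaternions corresponds to the three elementary rotations by 180° around the principal axes:

\[
\begin{aligned}
i = (0, 1, 0, 0) &\to R(i, \pi) \\
j = (0, 0, 1, 0) &\to R(j, \pi) \\
k = (0, 0, 0, 1) &\to R(k, \pi)
\end{aligned}
\tag{2.93}
\]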


Observe that, while the product of two equal elementary quaternions gives

\[
i\,i = j\,j = k\,k = i\,j\,k = (-1, 0, 0, 0),
\]

the analogous product of two equal elementary rotation matrices gives the identity matrix, i.e., a rotation of 2kπ:

\[
R(i, \pi) R(i, \pi) = R(j, \pi) R(j, \pi) = R(k, \pi) R(k, \pi) = R(i, \pi) R(j, \pi) R(k, \pi) = I
\tag{2.94}
\]

corresponding to the quaternion (1, 0, 0, 0); this strange property is related to a new entity, called spinor, that will not be further discussed here; the interested reader can find more details in [1], [18] and [38, Ch. 11].

The quaternion operations that are related to the computation of rotations are thefollowing:

1. Given n rotations R1, R2, · · · , Rn and the corresponding unit quaternions u1, u2, · · · , un, the product of rotations

\[
R(u) = R(u_1) R(u_2) \cdots R(u_n)
\]

corresponds to the quaternion product

\[
u = u_1 u_2 \cdots u_n
\]

in the same order;

2. given the rotation R(u) and the corresponding unit quaternion u, the transpose matrix R^T corresponds to the conjugate unit quaternion u*:

\[
R(u) \Leftrightarrow R^T(u^*)
\]

3. given a generic vector x, which corresponds to a quaternion consisting of the vectorial part only, x = (0, x^T) = (0, x1, x2, x3), and given the rotation R(u) corresponding to the unit quaternion u, the rotated vector x′ = R(u)x coincides with the following quaternion product, which always has a zero real part:

\[
x' = R(u)\,x \Leftrightarrow x' = u\,x\,u^*
\]

4. the product of two or more rotation matrices, e.g. R1(u1)R2(u2), is computed adopting the following identity:

\[
R_1(u_1) R_2(u_2)\,x \Leftrightarrow u_1 (u_2\, x\, u_2^*)\, u_1^* = (u_1 u_2)\, x\, (u_2^* u_1^*) = (u_1 u_2)\, x\, (u_1 u_2)^*.
\]
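The operations listed above can be sketched in a few lines of code. The snippet below is only an illustration: the helper names (`qmul`, `conj`, `rotate`) are assumptions, and a quaternion is stored as a tuple with the real part first, following the ordering adopted in these notes.

```python
import math

def qmul(a, b):
    """Quaternion product; the real part is the first element."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )

def conj(q):
    """Conjugate quaternion, corresponding to the transpose rotation."""
    return (q[0], -q[1], -q[2], -q[3])

def rotate(u, x):
    """Rotate the 3D vector x by the unit quaternion u via u x u*."""
    q = qmul(qmul(u, (0.0, *x)), conj(u))
    return q[1:]            # the real part q[0] is zero up to rounding

s = math.sqrt(2) / 2
u1 = (s, s, 0.0, 0.0)       # R(i, 90 deg), see (2.92)
u2 = (s, 0.0, s, 0.0)       # R(j, 90 deg), see (2.92)
u12 = qmul(u1, u2)          # quaternion of R(i, 90 deg) R(j, 90 deg)
x_rot = rotate(u12, (1.0, 0.0, 0.0))
```

Here `u12` equals (1/2, 1/2, 1/2, 1/2), and the unit vector i is mapped to j: indeed R(j, 90°) sends i to −k, and R(i, 90°) sends −k to j.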

As a last comment, one should take into account that in space applications and aerospace textbooks the quaternions are often organized in a list that is different from the one adopted here: namely, the real part is the fourth term of the quaternion, not the first as in our notation.


Example 2.12.1

Given the elementary rotation R(j, 90°), find the related Euler parameters and the corresponding unit quaternion u.

Since

\[
R(j, 90^\circ) = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}
\]

applying relations (2.85) and (2.90), we have

\[
u_0 = v_4 = \frac{\sqrt{2}}{2}, \quad u_1 = v_1 = 0, \quad u_2 = v_2 = \frac{\sqrt{2}}{2}, \quad u_3 = v_3 = 0
\]

that are the same as those obtained applying (2.82).

Example 2.12.2

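Find the quaternion and the Euler parameters that represent the rotation

\[
R = R_1(i, 90^\circ)\, R_2(j, 90^\circ),
\]

then compute the rotation axis u and the angle θ.

The two quaternions are:

\[
R_1(i, 90^\circ) \to u_1 = \left(\frac{\sqrt{2}}{2},\; \frac{\sqrt{2}}{2},\; 0,\; 0\right)
\]

\[
R_2(j, 90^\circ) \to u_2 = \left(\frac{\sqrt{2}}{2},\; 0,\; \frac{\sqrt{2}}{2},\; 0\right)
\]

from which we obtain the product quaternion

\[
R_1(i, 90^\circ)\, R_2(j, 90^\circ) \to u_1 u_2 =
\Big(\frac{1}{2} - \frac{1}{2}\underbrace{i \cdot j}_{0},\;\; \frac{1}{2} i + \frac{1}{2} j + \frac{1}{2}\underbrace{(i \times j)}_{k}\Big)
= \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)
\]

Therefore the Euler parameters are

\[
v_4 = h_0 = \frac{1}{2}; \quad v_1 = h_1 = \frac{1}{2}; \quad v_2 = h_2 = \frac{1}{2}; \quad v_3 = h_3 = \frac{1}{2}
\]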


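The angle is θ/2 = arccos(1/2) ⇒ θ = 120° and the axis is directed as u = [1 1 1]^T; since this vector does not have unit norm, we compute its norm ‖u‖ = √3 and, after normalizing it, applying (2.82) we obtain the same result

\[
\frac{1}{2} = \frac{1}{\sqrt{3}} \sin\frac{\theta}{2} \;\Rightarrow\; \sin\frac{\theta}{2} = \frac{\sqrt{3}}{2} \;\Rightarrow\; \frac{\theta}{2} = 60^\circ \;\Rightarrow\; \theta = 120^\circ
\]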

2.12.8 Cayley-Klein Parameters

Another parameterization of rotations was introduced by Felix Klein (1849-1925), with the aim of simplifying the integration of differential equations in complex gyroscopic problems [16].

This parameterization, like the one based on quaternions or Euler parameters, has the advantage of not requiring the computation of trigonometric functions.

The Cayley-Klein parameters (CK parameters) are the elements of a complex 2 × 2 matrix

\[
Q = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}
\tag{2.95}
\]

where α, β, γ, δ are complex variables. Q must be unitary (with unit determinant), hence the CK parameters obey the following constraints:

\[
\alpha = \delta^*, \qquad \beta = -\gamma^*
\]

and the matrix Q can be written as

\[
Q = \begin{bmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{bmatrix}
\]

with an additional constraint

\[
\alpha\alpha^* + \beta\beta^* = 1
\]

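In Q there are three free parameters and they can be used to characterize the rotations: indeed from the CK parameters it is possible to compute the rotation matrix

\[
R = \begin{bmatrix}
\frac{1}{2}(\alpha^2 - \beta^2 - \gamma^2 + \delta^2) & \frac{j}{2}(-\alpha^2 - \beta^2 + \gamma^2 + \delta^2) & \gamma\delta - \alpha\beta \\
\frac{j}{2}(\alpha^2 - \beta^2 + \gamma^2 - \delta^2) & \frac{1}{2}(\alpha^2 + \beta^2 + \gamma^2 + \delta^2) & -j(\alpha\beta + \gamma\delta) \\
\beta\delta - \alpha\gamma & j(\alpha\gamma + \beta\delta) & \alpha\delta + \beta\gamma
\end{bmatrix}
\tag{2.96}
\]

where j is the imaginary unit, j = √−1; the matrix R has real elements, even though it is a function of several complex variables.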

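Given the Euler angles (ϕ, θ, ψ), the CK parameters are computed as:

\[
\begin{aligned}
\alpha &= e^{\,j(\phi + \psi)/2} \cos\left(\frac{\theta}{2}\right) \\
\beta &= j\, e^{-j(\phi - \psi)/2} \sin\left(\frac{\theta}{2}\right) \\
\gamma &= j\, e^{\,j(\phi - \psi)/2} \sin\left(\frac{\theta}{2}\right) \\
\delta &= e^{-j(\phi + \psi)/2} \cos\left(\frac{\theta}{2}\right)
\end{aligned}
\tag{2.97}
\]

These expressions satisfy the constraints α = δ*, β = −γ* and αα* + ββ* = cos²(θ/2) + sin²(θ/2) = 1.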

The relation between the CK parameters (α, β), the quaternion (h0, h1, h2, h3) and the Euler parameters (v1, v2, v3, v4) is the following:

\[
\alpha = h_0 + j h_3 = v_4 + j v_3, \qquad
\beta = h_2 + j h_1 = v_2 + j v_1
\]

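Matrix Q can also be written as:

\[
Q = h_0 \mathbf{1} + j\,(h_1 \sigma_1 + h_2 \sigma_2 + h_3 \sigma_3)
\tag{2.98}
\]

where 1 is the 2 × 2 identity matrix and σi are the so-called Pauli spin matrices

\[
\mathbf{1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
\sigma_2 = \begin{bmatrix} 0 & -j \\ j & 0 \end{bmatrix}, \quad
\sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\tag{2.99}
\]

defined by the Pauli algebra

\[
\sigma_j \sigma_k = \delta_{jk} \mathbf{1} + j\, \epsilon_{jkm} \sigma_m
\]

where the subscripts j, k, m are indices (while the factor j is the imaginary unit), δ_jk is the Kronecker delta and ϵ_jkm are the Levi-Civita symbols, defined in (2.127).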

For further details on the CK parameters one can refer to [16, pages 145-158].

2.12.9 Rotation Vectors – Rodrigues Vectors

Another way to represent the orientation of a rigid body consists in using only three parameters without incurring the singularity problems encountered with Euler or RPY angles.

Instead of using four parameters and a unit norm constraint, as with Euler parameters or quaternions, one can introduce the so-called rotation vectors r, whose components ri describe the rotation axis, while the norm ‖r‖ provides the rotation angle or a trigonometric function of the rotation angle. In general the rotation vector r is defined as:

\[
r = f(\theta)\, u
\tag{2.100}
\]

where u is the unit norm vector characterizing the rotation axis and ‖r‖ = |f(θ)|, with f(θ) an odd function (see footnote 2) of θ.

The function f(θ) is usually chosen from the following list of odd functions:

a) f(θ) = θ

b) f(θ) = sin θ

c) f(θ) = sin(θ/2)

d) f(θ) = tan(θ/2)

Notice that choice c) is related to quaternions, since the three components of r are equal to the last three quaternion elements, or the first three Euler parameters. In this case the rotation vector r is called Euler (rotation) vector, not to be confused with the Euler parameters or the Euler angles.

Choice d) leads to the so-called Rodrigues (or Gibbs) rotation vectors, widely used in theoretical kinematics.

The relation between the Rodrigues vector r and the Euler parameters v is the following:

\[
r_1 = \frac{v_1}{v_4}, \quad r_2 = \frac{v_2}{v_4}, \quad r_3 = \frac{v_3}{v_4}
\tag{2.101}
\]

Notice that the Rodrigues vectors are undefined for angles that are odd multiples of π, θ = ±(2k + 1)π, where v4 = 0.

Given two Rodrigues vectors r_a and r_b, their “product”, denoted by the symbol ⊙, is computed as:

\[
r_a \odot r_b = \frac{r_a + r_b - r_b \times r_a}{1 - r_a^T r_b}
\tag{2.102}
\]

The relations between rotation matrices and Rodrigues vectors are established considering that:

1. the product of two rotation matrices R_a R_b is equivalent to the product r_a ⊙ r_b of the corresponding Rodrigues vectors;

2. in order to compute R given r, after setting v1 = r1, v2 = r2, v3 = r3, v4 = 1, one applies relation (2.84) and divides the elements found by (1 + r1² + r2² + r3²);

3. in order to compute r given R, one builds the skew-symmetric matrix S(r), given by

\[
S(r) = \frac{R - R^T}{1 + \operatorname{tr}(R)} =
\begin{bmatrix} 0 & -r_3 & r_2 \\ r_3 & 0 & -r_1 \\ -r_2 & r_1 & 0 \end{bmatrix}
\tag{2.103}
\]

and immediately reads off the elements ri.

Footnote 2: an odd function is defined by −f(θ) = f(−θ).
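The composition rule (2.102) and the extraction formula (2.103) can be sketched as follows. The function names are illustrative assumptions, and vectors and matrices are plain tuples and nested lists. The test case composes the Rodrigues vectors of R(i, 90°) and R(j, 90°), which are [1 0 0]^T and [0 1 0]^T since tan(45°) = 1.

```python
import math

def rodrigues_product(ra, rb):
    """Rodrigues vector of Ra Rb, following (2.102)."""
    dot = sum(a*b for a, b in zip(ra, rb))
    # -(rb x ra) is the same as (ra x rb)
    cross = (ra[1]*rb[2] - ra[2]*rb[1],
             ra[2]*rb[0] - ra[0]*rb[2],
             ra[0]*rb[1] - ra[1]*rb[0])
    return tuple((a + b + c) / (1 - dot) for a, b, c in zip(ra, rb, cross))

def rodrigues_from_matrix(R):
    """Rodrigues vector read off S(r) = (R - R^T) / (1 + tr R), see (2.103)."""
    d = 1 + R[0][0] + R[1][1] + R[2][2]
    return ((R[2][1] - R[1][2]) / d,
            (R[0][2] - R[2][0]) / d,
            (R[1][0] - R[0][1]) / d)

r = rodrigues_product((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
theta = 2 * math.atan(math.sqrt(sum(c*c for c in r)))   # norm = tan(theta/2)

# Matrix product R(i, 90 deg) R(j, 90 deg), written out by hand.
R = [[0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
r_from_R = rodrigues_from_matrix(R)
```

Both routes give r = [1 1 1]^T, and the angle recovered from the norm is 120°.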

Example 2.12.3

Considering again Example 2.12.2, we want to compute the Rodrigues vectors that represent the two rotations R1(i, 90°) and R2(j, 90°) and then compute the Rodrigues vector of the product R = R1(i, 90°)R2(j, 90°).

The Rodrigues vector related to R1(i, 90°) is r_1 = [1 0 0]^T, while the Rodrigues vector related to R2(j, 90°) is r_2 = [0 1 0]^T. The overall rotation corresponds to a Rodrigues vector computed as:

\[
R = R_1(i, 90^\circ)\, R_2(j, 90^\circ) \;\Rightarrow\; r = r_1 \odot r_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\]

Since ‖r‖ = √3, we have

\[
\frac{\theta}{2} = \arctan\sqrt{3} = 60^\circ \;\Rightarrow\; \theta = 120^\circ.
\]

2.12.10 Orientation Error and Angle-Axis Representation

Often, in satellite or UAV orientation control problems, it is necessary to compute the angular error between two different orientations: when Euler or RPY angles are used, the error between the desired angles αd and the measured ones αm is simply defined by the difference between the respective angles

\[
\Delta\alpha = \alpha_d - \alpha_m \quad \text{or} \quad \Delta\alpha' = \alpha_m - \alpha_d
\]

The first definition ∆α is commonly used by control designers, who define the error as the difference between the desired and the measured angles, while the second definition ∆α′ is used by measurement specialists, who call error the opposite difference.

Instead, when rotation matrices are used, it is necessary to define the error in another way, since it is a serious mistake to compute the difference between two rotation matrices!

Let us assume that Rd is the desired rotation matrix and Rm the measured one; the error may be defined in four different ways:


1) R_e1: defined as R_m R_e1 = R_d → R_e1 = R_m^T R_d.

2) R_e2: defined as R_d R_e2 = R_m → R_e2 = R_d^T R_m.

3) R_e3: defined as R_e3 R_m = R_d → R_e3 = R_d R_m^T.

4) R_e4: defined as R_e4 R_d = R_m → R_e4 = R_m R_d^T.

Notice that R_e1 = R_e2^T and R_e3 = R_e4^T.

Relations 1) and 2) define the orientation error as a post-product, and therefore are to be understood as a perturbation of the rotation with respect to the “mobile” reference frame, while relations 3) and 4) define the orientation error as a pre-product, and therefore are to be understood as a perturbation of the rotation with respect to the “fixed” or world reference frame.

In any case the orientation error is the product of two rotation matrices, and in order to compute the axis unit vector u_e and the angle ∆θ it would be necessary to perform the products and then use the identities (2.43) and (2.46).

Fortunately there exists another formula that one can use to compute the angle and the axis of the rotation resulting from the product of rotation matrices, without actually performing the product. This relation is used precisely to compute the axis-angle representation of the orientation error.

Assuming that the orientation error has been defined as R_e = R_1^T R_2 (cases 1) and 2)), the following identity holds

\[
u_e \sin\Delta\theta = \frac{1}{2}\left(r_2 \times r_1 + g_2 \times g_1 + b_2 \times b_1\right)
\tag{2.104}
\]

where r_i, g_i and b_i are the first, second and third rows of R_i, i = 1, 2.

One can notice that the left-hand term in (2.104) is a type b) rotation vector, which goes to zero for angles equal to 0 or ±π, for which we know that identity (2.46) is not applicable.

If we define the orientation error as R_e = R_1 R_2^T (cases 3) and 4)), then the following identity, also known as the unit vector lemma, holds

\[
u \sin\theta = \frac{1}{2}\left(r_2 \times r_1 + g_2 \times g_1 + b_2 \times b_1\right)
\tag{2.105}
\]

where r_i, g_i and b_i are the first, second and third columns of R_i, i = 1, 2.
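Identity (2.105) can be tested on a simple case in which the result is known in advance. In the sketch below (function names are illustrative assumptions), R_1 and R_2 are rotations around z by 0.9 rad and 0.5 rad, so that R_e = R_1 R_2^T is a rotation of 0.4 rad around z and the expected value of u sin θ is [0 0 sin 0.4]^T.

```python
import math

def rot_z(a):
    """Elementary rotation R(k, a), angle in radians."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def error_vector(R1, R2):
    """u sin(theta) for Re = R1 R2^T, from the columns of R1 and R2, see (2.105)."""
    cols1 = list(zip(*R1))          # columns r1, g1, b1
    cols2 = list(zip(*R2))          # columns r2, g2, b2
    total = [0.0, 0.0, 0.0]
    for c1, c2 in zip(cols1, cols2):
        w = cross(c2, c1)           # r2 x r1, g2 x g1, b2 x b1
        total = [t + 0.5*wi for t, wi in zip(total, w)]
    return tuple(total)

e = error_vector(rot_z(0.9), rot_z(0.5))
```

No matrix product is formed: only three cross products between corresponding columns are needed. Using the rows instead of the columns gives the vector associated with R_1^T R_2, as in (2.104).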

2.12.11 Conclusions

In this Section we have seen that three parameters αi are sufficient to define the orientation of a rigid body; the problem with three (angle) parameters is that they are not singularity-free.

If we use the quaternions, the Euler parameters, the Cayley-Klein parameters or the Rodrigues vectors, it is unclear which parameters take the place of the αi in (2.65). Using these alternative representations, we cannot say that the αi are “angles”, but it is necessary to speak in more general terms of “angular parameters”, whose knowledge nevertheless allows us to compute the chosen (Euler, RPY or Cardan) angles.

For instance, using the quaternions we can define:

\[
\alpha_1 = u_1 \sin\frac{\theta}{2}, \quad \alpha_2 = u_2 \sin\frac{\theta}{2}, \quad \alpha_3 = u_3 \sin\frac{\theta}{2}
\tag{2.106}
\]

and implicitly assume the unit norm. However, it is a very common practice to associate the Euler or the RPY angles with the αi, although in the last few years the quaternion representation has gained much attention, mainly in aerospace and computer graphics applications.

2.13 Point Kinematics

Having introduced in the previous Sections the different representations of a geometrical point or of a vector, either polar or axial, in a given reference frame, and having also characterized the various types of rigid displacements in the Euclidean space, we are now ready to describe the motion of geometrical points, assumed massless. This description takes the name of kinematics and is distinct from dynamics, since the former studies the motion of points or rigid bodies establishing the relations among positions, velocities and accelerations from a purely geometrical point of view, while the latter studies the influence of the forces and torques acting on the bodies on these quantities.

In a few words, kinematics studies the motion without considering its causes and its effects, while dynamics studies how the external actions on the given bodies are related to their motion.

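Suppose we have a geometrical point P moving in the 3D Euclidean space with a time law specified by a function P(t); we can assume a vector representation of P in a given reference frame

\[
x_P(t) = \begin{bmatrix} x_{1P}(t) \\ x_{2P}(t) \\ x_{3P}(t) \end{bmatrix}
\]

or, to keep the notation simple

\[
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix}
\]

Recalling the vector derivative in A.5, we write

\[
\frac{d}{dt} x(t) =
\begin{bmatrix} \frac{d}{dt} x_1(t) \\ \frac{d}{dt} x_2(t) \\ \frac{d}{dt} x_3(t) \end{bmatrix}
\equiv \dot{x}(t) =
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \end{bmatrix}
\equiv v(t)
\]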

Since the velocity is the limit

\[
\frac{dP(t)}{dt} = \lim_{\Delta T \to 0} \frac{\overrightarrow{\Delta P}}{\Delta T}
\]

the signed segment \overrightarrow{\Delta P} can be different in different reference frames.

If the point P is fixed in a local reference frame Rm, we have [\overrightarrow{\Delta P}]_{R_m} = 0; but if Rm moves with respect to a world frame R0, then [\overrightarrow{\Delta P}]_{R_0} ≠ 0. In the first case we speak of “local” velocity, in the second case we speak of “global” or absolute velocity or total velocity. Unless otherwise specified, we will always consider the total velocity of a point.

When the vector x(t) is subject to a time-varying rotation R(t)x(t), its derivative is computed according to the usual product rule, i.e.,

\[
\frac{d}{dt}\left[R(t)\, x(t)\right] = \left[\frac{d}{dt} R(t)\right] x(t) + R(t) \left[\frac{d}{dt} x(t)\right]
\]

therefore it is necessary to differentiate the rotation matrix R(t). To understand the consequences of this operation, we have to study its properties.

2.13.1 Rotation Matrix Derivative and Angular Velocity

To start, we assume that the generic rotation matrix R is a function of a generic variable x, that is R = R(x). We recall that the orthogonality of the matrix implies

\[
R(x) R^T(x) = I
\]

We differentiate this relation with respect to x, obtaining:

\[
\frac{dR(x)}{dx} R^T(x) + R(x) \frac{dR^T(x)}{dx} = O
\tag{2.107}
\]

It is evident that the first term is the transpose of the second term; therefore, recalling the definition of skew-symmetric matrix in Appendix B.6, we can define the following skew-symmetric matrix

\[
S(u(x)) \overset{\text{def}}{=} \frac{dR(x)}{dx} R^T(x),
\tag{2.108}
\]

where we have introduced a generic vector u(x) because we know that a skew-symmetric matrix in R3 embeds in its structure the components of a vector. We will discuss its precise meaning later.

Post-multiplying both sides of (2.108) by R(x) we obtain

\[
\frac{dR(x)}{dx} = S(u(x))\, R(x)
\tag{2.109}
\]

so that the derivative of an orthonormal matrix is the matrix itself pre-multiplied by a suitable skew-symmetric matrix.

S(u) in (2.108) is a function of a generic vector u(x), itself a function of a generic scalar variable x. In particular, when the rotation matrix R depends on an angle θ(t) around an axis given by the unit vector u, we can write:

\[
\frac{dR(u, \theta(t))}{dt} \equiv \dot{R}(u, \theta(t)) \overset{\text{def}}{=} S(\omega(t))\, R(u, \theta(t))
\tag{2.110}
\]

where ω(t) is the instantaneous total angular velocity vector of the reference frame represented by the matrix R(t). The angular velocity is represented in the world reference frame, but we omit writing it as ω^0 for notational simplicity.

Now we take into consideration some simple cases of vector transformation R(t)r(t): we start with r(t) = r constant in time in a “mobile” reference frame Rm and we transform it into the world reference frame R0

\[
r^0(t) = R^0_m(t)\, r^m
\]

the time derivative is computed as:

\[
\dot{r}^0(t) = \dot{R}^0_m(t)\, r^m
\tag{2.111}
\]

the vectors in this expression are represented in two different reference frames, so it is necessary to represent both in R0; considering that

\[
r^m = (R^0_m)^T r^0
\]

we have in the end, omitting for notational convenience the dependence of R on t,

\[
\dot{r}^0(t) = \dot{R}^0_m (R^0_m)^T r^0(t)
\tag{2.112}
\]

In mechanics textbooks one often encounters the following notation

\[
\Omega(t) \equiv \dot{R}^0_m (R^0_m)^T
\]

that coincides with our previous skew-symmetric matrix S(ω); indeed

\[
\Omega = \dot{R}^0_m (R^0_m)^T = S(\omega)\, R^0_m (R^0_m)^T = S(\omega)
\]

This matrix is often called the angular velocity matrix; in some textbooks the skew-symmetric matrix Ω(t) = S(ω(t)) is denoted with the symbol ω(t).

Considering again the identity (2.111), we observe that it expresses the well-known formula that gives the linear velocity of a point P fixed in a reference frame that rotates with an angular velocity ω:

\[
\dot{r}^0(t) = \Omega(t)\, r^0(t) = S(\omega(t))\, r^0(t) = \omega(t) \times r^0(t)
\tag{2.113}
\]
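The fact that the product of the time derivative of R with R^T is skew-symmetric and contains the angular velocity can be checked numerically. The sketch below is an illustration under simple assumptions: the frame rotates around z with θ(t) = 2t, the derivative is approximated by a central finite difference, and all names (`rot_z`, `matmul`, `rate`, `h`) are hypothetical.

```python
import math

def rot_z(a):
    """Elementary rotation R(k, a), angle in radians."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

rate, t, h = 2.0, 0.3, 1e-6                       # theta(t) = rate * t
Rp, Rm = rot_z(rate*(t + h)), rot_z(rate*(t - h))
# Central finite-difference approximation of the derivative of R at time t.
R_dot = [[(Rp[i][j] - Rm[i][j]) / (2*h) for j in range(3)] for i in range(3)]
Omega = matmul(R_dot, transpose(rot_z(rate*t)))   # approximates S(omega)
# Components of omega read off the skew-symmetric structure of Omega.
omega = (Omega[2][1], Omega[0][2], Omega[1][0])
```

Up to the finite-difference error, `Omega` is skew-symmetric and `omega` is [0 0 2]^T, i.e., the rate of change of θ multiplied by the unit vector of the z axis.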

Other interesting properties of the skew-symmetric matrices are related to the rigid body representation in 3D space. As shown in (2.40), it is possible to compute the rotation matrix R from the rotation axis represented by the non-unit vector v and the rotation angle θ = ‖v‖, considering the skew-symmetric matrix S(v).

This equality is a theoretical consequence of the following property, valid in general for Lie groups and algebras (refer to [34] for details):

\[
R(v) \equiv R(u, \theta) = e^{S(v)} \equiv e^{S(\theta u)} = \sum_{k=0}^{\infty} \frac{1}{k!} S^k(\theta u)
\tag{2.114}
\]

that relates the rotation matrix R to the exponential of the skew-symmetric matrix S(v) = S(θu), a function of the unit axis u and of the angle θ.

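Taking the Taylor series of the matrix exponential and considering that the rotations are elements of a cyclic group, which is reflected in the structure of the powers of S, i.e., for S = S(u) with u a unit vector,

\[
S^3 = -S, \qquad S^4 = -S^2
\]

we obtain, with v = θu:

\[
R = I + \frac{\sin\theta}{\theta} S(v) + \frac{1 - \cos\theta}{\theta^2} S^2(v)
\tag{2.115}
\]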
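The closed form (2.115) can be compared with a truncated version of the series (2.114). The following sketch is only illustrative: the helper names and the number of series terms are arbitrary assumptions, and matrices are stored as nested lists.

```python
import math

def skew(v):
    """Skew-symmetric matrix S(v) such that S(v) x = v × x."""
    return [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def rotation_from_vector(v):
    """Closed form (2.115), with theta = |v|."""
    t = math.sqrt(sum(x*x for x in v))
    if t < 1e-12:
        return identity()           # no rotation
    S = skew(v)
    S2 = matmul(S, S)
    a, b = math.sin(t) / t, (1 - math.cos(t)) / (t*t)
    I = identity()
    return [[I[i][j] + a*S[i][j] + b*S2[i][j] for j in range(3)]
            for i in range(3)]

def exp_series(v, terms=30):
    """Truncated exponential series (2.114)."""
    S = skew(v)
    term, total = identity(), identity()
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, S)]   # S^k / k!
        total = [[total[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return total

v = (0.0, 0.0, math.pi / 2)          # 90 deg around z
R_closed = rotation_from_vector(v)
R_series = exp_series(v)
```

For the chosen vector both matrices coincide, up to rounding, with the elementary rotation R(k, 90°).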

from which one gets the identities (2.40) and (2.41). Without entering into details, which can be found in the cited textbook, we write here the fundamental identity that relates the angular velocity vector ω(t), the unit vector u(t) and the angle θ(t):

\[
\omega(t) = \dot{\theta}(t)\, u(t) + \sin\theta(t)\, \dot{u}(t) + (1 - \cos\theta(t))\, S(u(t))\, \dot{u}(t)
\tag{2.116}
\]

From (2.116) we see that the vector ω(t) is not the formal derivative of another vector, except in the simple case in which u is constant, i.e., u(t) = c.

In such a case we have:

\[
\omega(t) = \dot{\theta}(t)\, c = \dot{r}(t)
\tag{2.117}
\]

that represents the total time derivative of the vector r(t) = θ(t)c.

Another useful formula to get ω(t) from a rotation matrix R = [r g b] is the following (see [17]):

\[
\omega(t) = \frac{1}{2}\left(r \times \dot{r} + g \times \dot{g} + b \times \dot{b}\right)
\tag{2.118}
\]

Notice the similarity of this equality with (2.105).


2.13.2 Infinitesimal Rotations

We have seen that, given two rotations represented respectively by the two angular parameter vectors α1 = [α11 α12 α13]^T and α2 = [α21 α22 α23]^T, the rotation resulting from their composition cannot be obtained (except in particular cases) from the simple sum of the two angle vectors; in general,

\[
\boldsymbol{\alpha} \neq \boldsymbol{\alpha}_1 + \boldsymbol{\alpha}_2
\]

Now let us consider an infinitesimal rotation defined by the infinitesimal angle vector dα = [dα1 dα2 dα3]^T; the associated rotation matrix can be written as

\[
\mathbf{R}(d\boldsymbol{\alpha}) = \mathbf{R}(\mathbf{u}(t), d\theta(t)) = \mathbf{R}_{\mathbf{u}(t),\,d\theta(t)} \tag{2.119}
\]

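From (2.115), and taking into account that for dθ → 0 we have

\[
\frac{\sin(d\theta)}{d\theta} \to 1, \qquad \frac{1 - \cos(d\theta)}{(d\theta)^2} \to \frac{1}{2}
\]

while S²(dα) contains only second-order infinitesimals and can be neglected, the following approximate relation holds

\[
\mathbf{R}(d\boldsymbol{\alpha}) \simeq \mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha}). \tag{2.120}
\]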
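The quality of the approximation (2.120) can be illustrated by comparing it with the exact rotation matrix for decreasing angles. This is a minimal sketch assuming NumPy; the axis and the list of angles are arbitrary illustrative choices.

```python
import numpy as np


def skew(v):
    """Skew-symmetric matrix S(v)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def rodrigues(v):
    """Exact rotation matrix from the rotation vector v, as in (2.115)."""
    theta = np.linalg.norm(v)
    S = skew(v)
    if theta < 1e-12:
        return np.eye(3) + S
    return (np.eye(3) + (np.sin(theta) / theta) * S
            + ((1.0 - np.cos(theta)) / theta**2) * (S @ S))


u = np.array([1.0, 2.0, 2.0]) / 3.0          # unit axis
angles = [1e-1, 1e-2, 1e-3]                  # decreasing rotation angles [rad]
errors = []
for d_theta in angles:
    d_alpha = d_theta * u
    # Frobenius norm of the difference between exact and first-order matrices.
    err = np.linalg.norm(rodrigues(d_alpha) - (np.eye(3) + skew(d_alpha)))
    errors.append(err)
    print(f"d_theta = {d_theta:g}  error = {err:.3e}")
```

Reducing the angle by a factor of ten reduces the error by about a factor of one hundred, consistent with the neglected term being of second order in dθ.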

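The inverse of R(dα) is computed as

\[
\mathbf{R}(d\boldsymbol{\alpha})^{-1} = \mathbf{R}(d\boldsymbol{\alpha})^T \simeq \mathbf{I} - \mathbf{S}(d\boldsymbol{\alpha}) = \mathbf{I} + \mathbf{S}(-d\boldsymbol{\alpha}) \simeq \mathbf{R}(-d\boldsymbol{\alpha})
\]

This relation can be verified by writing, with S = S(dα),

\[
\mathbf{R}(d\boldsymbol{\alpha})^T\,\mathbf{R}(d\boldsymbol{\alpha}) \simeq (\mathbf{I} - \mathbf{S})(\mathbf{I} + \mathbf{S}) = \mathbf{I} + \mathbf{S} - \mathbf{S} - \mathbf{S}^2 \simeq \mathbf{I}
\]

where S²(dα) ≃ O since it contains second-order infinitesimals. We conclude that the inverse of R(dα) is R(−dα).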

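Using (2.120) it is possible to show that dα behaves like a proper vector. For instance, the commutative property holds; indeed the following identity is true:

\[
\begin{aligned}
\mathbf{R}(d\boldsymbol{\alpha}_1)\,\mathbf{R}(d\boldsymbol{\alpha}_2) &= \left(\mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha}_1)\right)\left(\mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha}_2)\right) \\
&= \mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha}_1) + \mathbf{S}(d\boldsymbol{\alpha}_2) + \mathbf{S}(d\boldsymbol{\alpha}_1)\,\mathbf{S}(d\boldsymbol{\alpha}_2)
\end{aligned}
\]

Since the last term goes to zero, and recalling (B.3), we have

\[
\begin{aligned}
\mathbf{R}(d\boldsymbol{\alpha}_1)\,\mathbf{R}(d\boldsymbol{\alpha}_2) &= \mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha}_1) + \mathbf{S}(d\boldsymbol{\alpha}_2) \\
&= \mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha}_1 + d\boldsymbol{\alpha}_2) \\
&= \mathbf{R}(d\boldsymbol{\alpha}_1 + d\boldsymbol{\alpha}_2) \\
&= \mathbf{R}(d\boldsymbol{\alpha}_2 + d\boldsymbol{\alpha}_1) \\
&= \mathbf{R}(d\boldsymbol{\alpha}_2)\,\mathbf{R}(d\boldsymbol{\alpha}_1)
\end{aligned}
\tag{2.121}
\]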
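The contrast between finite and infinitesimal rotations in (2.121) can be made concrete with a small experiment. The sketch below assumes NumPy; the two axes and the function names `commutation_gap` and `composition_gap` are hypothetical choices made only for illustration.

```python
import numpy as np


def skew(v):
    """Skew-symmetric matrix S(v)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def rodrigues(v):
    """Exact rotation matrix from the rotation vector v, as in (2.115)."""
    theta = np.linalg.norm(v)
    S = skew(v)
    if theta < 1e-12:
        return np.eye(3) + S
    return (np.eye(3) + (np.sin(theta) / theta) * S
            + ((1.0 - np.cos(theta)) / theta**2) * (S @ S))


a1 = np.array([1.0, 0.0, 0.0])   # rotation about x
a2 = np.array([0.0, 1.0, 0.0])   # rotation about y


def commutation_gap(scale):
    """Norm of R(a1) R(a2) - R(a2) R(a1) for angle vectors scaled by 'scale'."""
    R1, R2 = rodrigues(scale * a1), rodrigues(scale * a2)
    return np.linalg.norm(R1 @ R2 - R2 @ R1)


def composition_gap(scale):
    """Norm of R(a1) R(a2) - R(a1 + a2) for angle vectors scaled by 'scale'."""
    R1, R2 = rodrigues(scale * a1), rodrigues(scale * a2)
    return np.linalg.norm(R1 @ R2 - rodrigues(scale * (a1 + a2)))


for scale in (1.0, 1e-1, 1e-2, 1e-3):
    print(f"angle = {scale:g} rad  commutation gap = {commutation_gap(scale):.2e}"
          f"  composition gap = {composition_gap(scale):.2e}")
```

Both gaps shrink with the square of the angle, so they are negligible with respect to the first-order terms retained in (2.120); for rotations of one radian they are of order one.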


From (2.110) we know that

\[
\dot{\mathbf{R}}(t) = \mathbf{S}(\boldsymbol{\omega}(t))\,\mathbf{R}(\mathbf{u}, \theta(t)) \tag{2.122}
\]

Since R does not depend on the choice of the representation of the parameters α used to characterize the rotation, the formula (2.122) can be re-written as:

\[
\dot{\mathbf{R}}(\boldsymbol{\alpha}(t)) = \mathbf{S}(\boldsymbol{\omega}(t))\,\mathbf{R}(\boldsymbol{\alpha}(t)) \tag{2.123}
\]

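Considering the increment of R over the time interval dt, with dα = ω(t)dt, we have

\[
\begin{aligned}
\Delta\mathbf{R} &= \mathbf{R}(\boldsymbol{\alpha} + d\boldsymbol{\alpha}) - \mathbf{R}(\boldsymbol{\alpha}) \\
&\simeq \dot{\mathbf{R}}(\boldsymbol{\alpha})\,dt \\
&= \mathbf{S}(\boldsymbol{\omega}(t))\,\mathbf{R}(\boldsymbol{\alpha})\,dt \\
&= \mathbf{S}(\boldsymbol{\omega}(t)\,dt)\,\mathbf{R}(\boldsymbol{\alpha}) \\
&= \mathbf{S}(d\boldsymbol{\alpha})\,\mathbf{R}(\boldsymbol{\alpha})
\end{aligned}
\]

from which it follows

\[
\mathbf{R}(\boldsymbol{\alpha} + d\boldsymbol{\alpha}) \simeq \mathbf{R}(\boldsymbol{\alpha}) + \mathbf{S}(d\boldsymbol{\alpha})\,\mathbf{R}(\boldsymbol{\alpha}) = \left(\mathbf{I} + \mathbf{S}(d\boldsymbol{\alpha})\right)\mathbf{R}(\boldsymbol{\alpha}) = \mathbf{R}(d\boldsymbol{\alpha})\,\mathbf{R}(\boldsymbol{\alpha}) \tag{2.124}
\]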

We recall that the matrix product in the last term of the above identity does not commute in general, since R(α) is a finite rotation.
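Relation (2.124) suggests a simple way to propagate a rotation matrix in time by repeated left-multiplication with I + S(ω dt). The sketch below, assuming NumPy, applies it to a constant angular velocity about the z axis, for which the exact result is known; the step counts, the horizon `T` and the function name `integrate` are illustrative assumptions. Note that the first-order factor is not exactly orthogonal, so the result is only an approximation of a rotation matrix.

```python
import numpy as np


def skew(v):
    """Skew-symmetric matrix S(v)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


omega = np.array([0.0, 0.0, 1.0])   # constant angular velocity about z [rad/s]
T = 1.0                             # integration horizon [s]


def integrate(n_steps):
    """Propagate R from the identity with n_steps first-order updates (2.124)."""
    dt = T / n_steps
    step = np.eye(3) + skew(omega * dt)   # R(d_alpha) ~ I + S(omega dt)
    R = np.eye(3)
    for _ in range(n_steps):
        R = step @ R                      # the increment multiplies on the left
    return R


angle = np.linalg.norm(omega) * T
R_exact = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                    [np.sin(angle), np.cos(angle), 0.0],
                    [0.0, 0.0, 1.0]])

err_100 = np.linalg.norm(integrate(100) - R_exact)
err_1000 = np.linalg.norm(integrate(1000) - R_exact)
print(err_100, err_1000)
```

In this example the accumulated error decreases roughly in proportion to the step size, which reflects the second-order terms neglected at every step.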

We conclude this Section with an observation: the role played by the skew-symmetric matrix S(ω) is fundamental in the definition of the angular velocity of a reference frame, and the two relations (2.114) and (2.122) are an example of this role.

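Now, if we introduce the following three matrices M_i:

\[
\mathbf{M}_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad
\mathbf{M}_2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \qquad
\mathbf{M}_3 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\tag{2.125}
\]

(that are commonly called infinitesimal rotation generators), we are able to build S(ω) as a weighted sum

\[
\mathbf{S}(\boldsymbol{\omega}) = \sum_{i=1}^{3} \omega_i\,\mathbf{M}_i
\]

and observe that the M_i form a basis for the skew-symmetric matrices S(ω). Moreover, the M_i have the property that the differences between their mutual products obey

\[
\mathbf{M}_i\mathbf{M}_j - \mathbf{M}_j\mathbf{M}_i \equiv [\mathbf{M}_i, \mathbf{M}_j] = \epsilon_{ijk}\,\mathbf{M}_k \tag{2.126}
\]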

where the ϵ_ijk are called permutation symbols or Levi-Civita symbols, defined as

\[
\epsilon_{ijk} =
\begin{cases}
0 & i = j,\ j = k,\ \text{or } k = i \ \text{(two symbols out of three are equal)} \\
+1 & (i, j, k) \in \{(1, 2, 3),\ (2, 3, 1),\ (3, 1, 2)\} \\
-1 & (i, j, k) \in \{(1, 3, 2),\ (2, 1, 3),\ (3, 2, 1)\}
\end{cases}
\tag{2.127}
\]

The matrix operator in (2.126) is referred to as a commutator or Lie bracket, and it defines the Lie algebra on the orthonormal matrix group. The properties and the importance of such an algebra will not be detailed here; the interested reader can find more details in [34].
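The commutation relations (2.126) and the decomposition of S(ω) on the generators (2.125) can be verified by direct computation. The following is a minimal sketch assuming NumPy; indices run from 0 to 2 in the code instead of 1 to 3, and the closed-form expression used for the Levi-Civita symbol is an implementation shortcut, not something taken from the text.

```python
import numpy as np

# Infinitesimal rotation generators (2.125), stored with 0-based indices.
M = [
    np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]),
    np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]),
    np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
]


def levi_civita(i, j, k):
    """Permutation symbol (2.127) for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def bracket(A, B):
    """Commutator (Lie bracket) of two matrices."""
    return A @ B - B @ A


# Check [M_i, M_j] = sum over k of eps_ijk M_k for every pair (i, j).
brackets_ok = all(
    np.array_equal(bracket(M[i], M[j]),
                   sum(levi_civita(i, j, k) * M[k] for k in range(3)))
    for i in range(3) for j in range(3)
)

# Build S(omega) as the weighted sum of the generators.
omega = np.array([0.3, -1.2, 0.7])
S_omega = sum(w * Mi for w, Mi in zip(omega, M))
x = np.array([1.0, 2.0, 3.0])
print(brackets_ok, S_omega @ x, np.cross(omega, x))
```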


2.13.3 Infinitesimal Rotations and Quaternions

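If the rotation angles are small, it is possible to directly compute the corresponding quaternions and vice versa. Let us assume that we have a small rotation ∆θ represented by the (small) angle parameters ∆α, and the corresponding quaternion h = [h0 h1 h2 h3]; from (2.120) we have

\[
\mathbf{R}(\Delta\boldsymbol{\alpha}) = \mathbf{I} + \mathbf{S}(\Delta\boldsymbol{\alpha}) =
\begin{bmatrix}
1 & -\Delta\alpha_3 & \Delta\alpha_2 \\
\Delta\alpha_3 & 1 & -\Delta\alpha_1 \\
-\Delta\alpha_2 & \Delta\alpha_1 & 1
\end{bmatrix}
=
\begin{bmatrix}
r_{11} & r_{12} & r_{13} \\
r_{21} & r_{22} & r_{23} \\
r_{31} & r_{32} & r_{33}
\end{bmatrix}
\tag{2.128}
\]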

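Comparing (2.128) with (2.89), we obtain:

\[
\begin{aligned}
\Delta\alpha_1 &= \tfrac{1}{2}(r_{32} - r_{23}) = 2h_1h_0 \simeq 2h_1 \\
\Delta\alpha_2 &= \tfrac{1}{2}(r_{13} - r_{31}) = 2h_2h_0 \simeq 2h_2 \\
\Delta\alpha_3 &= \tfrac{1}{2}(r_{21} - r_{12}) = 2h_3h_0 \simeq 2h_3
\end{aligned}
\]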

Taking into consideration definition (2.119), the approximation is reasonable; indeed, applying (2.114), we obtain h0 = cos(∆θ/2) ≃ 1, since ∆θ is a small angle.

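Hence, if we know the value of an infinitesimal rotation dα = [dα1 dα2 dα3]^T, we can easily compute the corresponding quaternion as

\[
\mathbf{h} \approx \begin{bmatrix} 1 & \dfrac{d\alpha_1}{2} & \dfrac{d\alpha_2}{2} & \dfrac{d\alpha_3}{2} \end{bmatrix}
\]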
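The small-angle quaternion can be compared with the exact unit quaternion of the same rotation, h = [cos(θ/2), sin(θ/2) u]. The sketch below assumes NumPy and a scalar-first ordering [h0 h1 h2 h3], as in the text; the function names and the sample rotation are illustrative.

```python
import numpy as np


def quat_exact(alpha):
    """Unit quaternion [h0, h1, h2, h3] of the rotation vector alpha = theta * u."""
    theta = np.linalg.norm(alpha)
    if theta < 1e-15:
        return np.array([1.0, 0.0, 0.0, 0.0])
    u = alpha / theta
    return np.concatenate(([np.cos(theta / 2.0)], np.sin(theta / 2.0) * u))


def quat_small(d_alpha):
    """Small-angle approximation h ~ [1, d_alpha / 2]."""
    return np.concatenate(([1.0], 0.5 * np.asarray(d_alpha, dtype=float)))


d_alpha = 1e-3 * np.array([1.0, -2.0, 2.0])   # small rotation, angle 3e-3 rad
gap = np.linalg.norm(quat_exact(d_alpha) - quat_small(d_alpha))
print(quat_exact(d_alpha), quat_small(d_alpha), gap)
```

The approximate quaternion is not exactly of unit norm; the discrepancy is of second order in the rotation angle, in line with the approximation h0 ≃ 1.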

2.14 Total Velocity and Acceleration of Points

When we speak of linear or angular velocities of a geometrical point or a rigid body, we mean the relative velocity of the geometrical point with respect to a well-defined reference frame, or the velocity of the reference frame describing the rigid body with respect to another well-defined reference frame.

To be precise, when we say that a rigid body has at the time t an angular velocity ω(t), we mean that at t the relation (2.116), which we report below for the sake of clarity, is true:

\[
\boldsymbol{\omega}(t) = \dot{\theta}(t)\,\mathbf{u}(t) + \sin\theta(t)\,\dot{\mathbf{u}}(t) + \left(1 - \cos\theta(t)\right)\mathbf{S}(\mathbf{u}(t))\,\dot{\mathbf{u}}(t)
\]

This means that we have to give not only θ̇(t), but also the instantaneous rotation axis u(t) at time t. If the rotation axis remains constant, then (2.116) reduces to (2.117).


It is important to note that the vector ω(t) cannot be translated parallel to itself without modifying the physical significance of the described phenomenon, since the application point of ω(t) gives the position of the axis around which the rotation takes place; this is not valid for the linear velocity given by the vector v(t): this one is a free vector, while ω(t) is an applied vector.

We recall that an applied vector is defined by the pair (P, w), where P is the geometrical point of application and w is defined such that $\overrightarrow{PQ} = \mathbf{w}$.

To be more precise, when we speak of angular (or linear) velocities we should use the following (cumbersome) notation:

\[
\boldsymbol{\omega}^k_{ij}(P) \tag{2.129}
\]

where we denote the relative angular velocity of the reference frame (or of the associated rigid body) Rj with respect to the reference frame Ri, represented in the reference frame Rk, and where P is the application point of ω; when P is known or superfluous, we only write $\boldsymbol{\omega}^k_{ij}$. Usually we implicitly assume to represent the angular velocity as $\boldsymbol{\omega}^i_{ij}(P)$.

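With this notation it is always true that

\[
\boldsymbol{\omega}^k_{jj} = \mathbf{0} \qquad \forall j, k \tag{2.130}
\]

since no relative motion takes place, while, in general,

\[
\boldsymbol{\omega}^j_{ij} \neq \boldsymbol{\omega}^i_{ij} \qquad \forall i, j \tag{2.131}
\]

provided that Ri and Rj are non-coincident. Moreover, given the vectorial nature of the velocity, the additive property holds

\[
\boldsymbol{\omega}^k_{im} + \boldsymbol{\omega}^k_{mj} = \boldsymbol{\omega}^k_{ij}
\]

from which we have the following property

\[
\boldsymbol{\omega}^k_{ij} + \boldsymbol{\omega}^k_{ji} = \boldsymbol{\omega}^k_{ii} = \mathbf{0} \;\Rightarrow\; \boldsymbol{\omega}^k_{ij} = -\boldsymbol{\omega}^k_{ji} \tag{2.132}
\]

which says that, if a reference frame rotates with a given angular velocity with respect to another reference frame, the latter rotates with respect to the former with an angular velocity that has the same magnitude but opposite direction.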
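Property (2.132) can be checked numerically for two frames R0 and R1. The sketch below assumes NumPy and the convention Ṙ = S(ω)R of (2.122) applied to both the rotation matrix and its transpose; the variable names (for instance `omega_1_10` for the angular velocity of R0 with respect to R1, represented in R1) are hypothetical labels that mimic the notation (2.129).

```python
import numpy as np


def skew(v):
    """Skew-symmetric matrix S(v)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def vee(S):
    """Extract the vector from a skew-symmetric matrix."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])


def rodrigues(v):
    """Rotation matrix from the rotation vector v, as in (2.115)."""
    theta = np.linalg.norm(v)
    S = skew(v)
    if theta < 1e-12:
        return np.eye(3) + S
    return (np.eye(3) + (np.sin(theta) / theta) * S
            + ((1.0 - np.cos(theta)) / theta**2) * (S @ S))


R01 = rodrigues(np.array([0.2, -0.6, 0.4]))   # orientation of R1 in R0
omega_0_01 = np.array([0.5, 0.1, -0.3])       # velocity of R1 w.r.t. R0, in R0
R01_dot = skew(omega_0_01) @ R01              # derivative of R01

# The inverse motion is described by the transposed matrix and its derivative.
R10, R10_dot = R01.T, R01_dot.T
omega_1_10 = vee(R10_dot @ R10.T)             # velocity of R0 w.r.t. R1, in R1
omega_0_10 = R01 @ omega_1_10                 # same vector represented in R0
print(omega_0_01, omega_0_10)
```

The two vectors are opposite only when represented in the same frame, which is why the superscript k is kept in (2.132).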

Now assume that $\mathbf{x}^1$ is the representation of the geometrical point A in R1 and that this frame roto-translates with respect to R0; assume also that $\mathbf{R}^0_1(t)$ and $\mathbf{t}^0_1(t)$ are respectively the rotation matrix and the translation vector between the two frames, and that $\boldsymbol{\omega}^0_{01}(P)$ is the relative angular velocity of R1 with respect to R0, represented in R0. We still have to define the application point P; it varies according to whether the rotation takes place with respect to the axes of the fixed frame R0 or with respect to the axes of the translated local frame R1. In the following we will consider the two cases.


Case A

We consider here a roto-translation described by

\[
\mathbf{T}^0_1 = \mathbf{T}(\mathbf{d}^0_1)\,\mathbf{T}(\mathbf{R}^0_1) \tag{2.133}
\]

that represents a translation $\mathbf{d}^0_1$, followed by a rotation $\mathbf{R}^0_1$, with instantaneous rotational velocity $\boldsymbol{\omega}^0_{01}$, performed with respect to the mobile frame (post-mobile rule).

The motion defined in (2.133) allows the following alternative description: “a rotation followed by a translation performed with respect to the fixed frame (pre-fixed rule)”.

In this case we can state that the application point P of the applied vector (P, ω) is the origin of R1.

In this case the vector $\mathbf{r}^1(t)$ represented in R1 is described in R0 by:

\[
\mathbf{r}^0(t) = \mathbf{R}^0_1(t)\,\mathbf{r}^1(t) + \mathbf{d}^0_1(t) \tag{2.134}
\]

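Computing the time derivative of this identity, recalling (2.113) and omitting for brevity the time dependence in the various terms, we have:

\[
\begin{aligned}
\dot{\mathbf{r}}^0 &= \dot{\mathbf{R}}^0_1\mathbf{r}^1 + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \dot{\mathbf{d}}^0_1 \\
&= \mathbf{S}(\boldsymbol{\omega}^0_{01})\,\mathbf{R}^0_1\mathbf{r}^1 + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \dot{\mathbf{d}}^0_1 \\
&= \boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\mathbf{r}^1\right) + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \dot{\mathbf{d}}^0_1
\end{aligned}
\tag{2.135}
\]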

If now we define

\[
\boldsymbol{\rho}^0(t) = \mathbf{R}^0_1\,\mathbf{r}^1(t) \tag{2.136}
\]

we have

\[
\dot{\mathbf{r}}^0(t) = \underbrace{\boldsymbol{\omega}^0_{01}(t) \times \boldsymbol{\rho}^0(t)}_{A} + \underbrace{\mathbf{R}^0_1\,\dot{\mathbf{r}}^1(t)}_{B} + \underbrace{\dot{\mathbf{d}}^0_1(t)}_{C} \tag{2.137}
\]

This is the classical relation that gives the total linear velocity in R0 of a point P that moves with a linear velocity $\dot{\mathbf{r}}^1(t)$ in the reference system R1, itself moving at a linear velocity $\dot{\mathbf{d}}^0_1(t)$ and angular velocity $\boldsymbol{\omega}^0_{01}(t)$ applied at the origin of R1.

The three terms are: A) the tangential velocity due to the rotational velocity of the frame; B) the proper velocity of the point, represented in R0; C) the translational velocity of the frame.
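The velocity formula (2.137) can be checked against a finite-difference derivative of the position (2.134). The sketch below assumes NumPy and a rotation about the z axis of R0; all the motion laws (`theta`, `r1`, `d01`) are hypothetical choices made only to have something to differentiate.

```python
import numpy as np


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Hypothetical motion laws for frame R1 and for the point P (Case A).
def theta(t):
    return 0.4 * t + 0.1 * t**2                    # rotation angle of R1 about z


def R01(t):
    return rot_z(theta(t))


def r1(t):
    return np.array([0.2 * t, 1.0, 0.1 * t**2])    # point P represented in R1


def d01(t):
    return np.array([1.0 + 0.5 * t, -0.2 * t**2, 0.3])   # origin of R1 in R0


def r0(t):
    return R01(t) @ r1(t) + d01(t)                 # position (2.134)


def velocity_terms(t):
    """Terms A, B, C of (2.137), using the analytic derivatives of the laws above."""
    omega = np.array([0.0, 0.0, 0.4 + 0.2 * t])
    r1_dot = np.array([0.2, 0.0, 0.2 * t])
    d01_dot = np.array([0.5, -0.4 * t, 0.0])
    rho0 = R01(t) @ r1(t)                          # definition (2.136)
    return np.cross(omega, rho0), R01(t) @ r1_dot, d01_dot


t, h = 0.8, 1e-5
vel_fd = (r0(t + h) - r0(t - h)) / (2 * h)         # central difference of r0
A, B, C = velocity_terms(t)
vel_formula = A + B + C
print(vel_fd, vel_formula)
```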

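The acceleration $\ddot{\mathbf{r}}^0(t)$ of the point P expressed in R0 can be computed considering the identity:

\[
\frac{d(\mathbf{a} \times \mathbf{b})}{dt} = \left(\frac{d\mathbf{a}}{dt} \times \mathbf{b}\right) + \left(\mathbf{a} \times \frac{d\mathbf{b}}{dt}\right)
\]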


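Taking the time derivative of $\dot{\mathbf{r}}^0(t)$ in (2.137), using (2.110) and (2.136), and omitting again the time dependence, we obtain:

\[
\begin{aligned}
\ddot{\mathbf{r}}^0 &= \frac{d}{dt}\left(\boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0\right) + \frac{d}{dt}\left(\mathbf{R}^0_1\dot{\mathbf{r}}^1\right) + \frac{d}{dt}\dot{\mathbf{d}}^0_1 \\
&= \dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0 + \boldsymbol{\omega}^0_{01} \times \dot{\boldsymbol{\rho}}^0 + \dot{\mathbf{R}}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1 \\
&= \dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0 + \boldsymbol{\omega}^0_{01} \times \frac{d}{dt}\left(\mathbf{R}^0_1\mathbf{r}^1\right) + \dot{\mathbf{R}}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1 \\
&= \dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0 + \boldsymbol{\omega}^0_{01} \times \left(\dot{\mathbf{R}}^0_1\mathbf{r}^1 + \mathbf{R}^0_1\dot{\mathbf{r}}^1\right) + \dot{\mathbf{R}}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1 \\
&= \dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0 + \boldsymbol{\omega}^0_{01} \times \left(\boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0\right) + \boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\dot{\mathbf{r}}^1\right) + \dot{\mathbf{R}}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1 \\
&= \dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0 + \boldsymbol{\omega}^0_{01} \times \left(\boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0\right) + \boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\dot{\mathbf{r}}^1\right) + \boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\dot{\mathbf{r}}^1\right) + \mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1 \\
&= \underbrace{\dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0}_{A} + \underbrace{\boldsymbol{\omega}^0_{01} \times \left(\boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0\right)}_{B} + \underbrace{2\,\boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\dot{\mathbf{r}}^1\right)}_{C} + \underbrace{\mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1}_{D}
\end{aligned}
\tag{2.138}
\]

where A) is the tangential acceleration; B) is the centripetal acceleration; C) is the Coriolis acceleration; D) is the sum of the linear acceleration of the point represented in the fixed frame and the acceleration of the frame (already represented in R0).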
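The four terms of (2.138) can be evaluated separately and their sum compared with a second finite difference of the position (2.134). This is a sketch assuming NumPy and a rotation about the z axis of R0; the motion laws and their analytic derivatives are hypothetical examples, not data from the text.

```python
import numpy as np


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Hypothetical motion laws (Case A): r0 = R01 r1 + d01.
def R01(t):
    return rot_z(0.4 * t + 0.1 * t**2)


def r1(t):
    return np.array([0.2 * t, 1.0, 0.1 * t**2])


def d01(t):
    return np.array([1.0 + 0.5 * t, -0.2 * t**2, 0.3])


def r0(t):
    return R01(t) @ r1(t) + d01(t)


def acceleration_terms(t):
    """Terms A, B, C, D of (2.138) from the analytic derivatives of the laws above."""
    omega = np.array([0.0, 0.0, 0.4 + 0.2 * t])
    omega_dot = np.array([0.0, 0.0, 0.2])
    r1_dot = np.array([0.2, 0.0, 0.2 * t])
    r1_ddot = np.array([0.0, 0.0, 0.2])
    d01_ddot = np.array([0.0, -0.4, 0.0])
    rho0 = R01(t) @ r1(t)                               # definition (2.136)
    A = np.cross(omega_dot, rho0)                       # tangential
    B = np.cross(omega, np.cross(omega, rho0))          # centripetal
    C = 2.0 * np.cross(omega, R01(t) @ r1_dot)          # Coriolis
    D = R01(t) @ r1_ddot + d01_ddot                     # linear
    return A, B, C, D


t, h = 0.8, 1e-3
acc_fd = (r0(t + h) - 2.0 * r0(t) + r0(t - h)) / h**2   # second central difference
A, B, C, D = acceleration_terms(t)
acc_formula = A + B + C + D
print(acc_fd, acc_formula)
```

Printing the four terms individually shows how much each contribution weighs in a given motion; in this example none of them is negligible.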

Case B

We consider here a roto-translation described by

\[
\mathbf{T}^0_1(t) = \mathbf{T}\left(\mathbf{R}^0_1\right)\mathbf{T}\left(\mathbf{d}^0_1\right) \tag{2.139}
\]

that represents a rotation $\mathbf{R}^0_1$ with instantaneous rotational velocity $\boldsymbol{\omega}^0_{01}$, followed by a translation $\mathbf{d}^0_1$ performed with respect to the mobile frame (post-mobile rule).

The motion defined in (2.139) allows the following alternative description: “a translation followed by a rotation performed with respect to the fixed frame (pre-fixed rule)”.

In this case we can state that the application point P of the applied vector (P, ω) is the origin of R0.

The point $\mathbf{r}^1$ is described in R0 by:

\[
\mathbf{r}^0(t) = \mathbf{R}^0_1(t)\,\mathbf{r}^1(t) + \mathbf{R}^0_1(t)\,\mathbf{d}^0_1(t) \tag{2.140}
\]

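Taking the time derivative and omitting again the time dependence, we obtain:

\[
\begin{aligned}
\dot{\mathbf{r}}^0 &= \dot{\mathbf{R}}^0_1\mathbf{r}^1 + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \dot{\mathbf{R}}^0_1\mathbf{d}^0_1 + \mathbf{R}^0_1\dot{\mathbf{d}}^0_1 \\
&= \mathbf{S}(\boldsymbol{\omega}^0_{01})\,\mathbf{R}^0_1\left(\mathbf{r}^1 + \mathbf{d}^0_1\right) + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\dot{\mathbf{d}}^0_1 \\
&= \boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\mathbf{r}^1 + \mathbf{R}^0_1\mathbf{d}^0_1\right) + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\dot{\mathbf{d}}^0_1
\end{aligned}
\tag{2.141}
\]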

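If now we define

\[
\boldsymbol{\rho}^0 = \mathbf{R}^0_1\mathbf{r}^1 + \mathbf{R}^0_1\mathbf{d}^0_1 \tag{2.142}
\]

we have

\[
\dot{\mathbf{r}}^0 = \boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0 + \mathbf{R}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\dot{\mathbf{d}}^0_1 \tag{2.143}
\]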

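that is identical to (2.137), with $\mathbf{R}^0_1\dot{\mathbf{d}}^0_1$ instead of $\dot{\mathbf{d}}^0_1$. Taking the time derivative of (2.143) we obtain the total acceleration:

\[
\ddot{\mathbf{r}}^0(t) = \underbrace{\dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0}_{A} + \underbrace{\boldsymbol{\omega}^0_{01} \times \left(\boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0\right)}_{B} + \underbrace{2\,\boldsymbol{\omega}^0_{01} \times \left(\mathbf{R}^0_1\dot{\mathbf{r}}^1 + \mathbf{R}^0_1\dot{\mathbf{d}}^0_1\right)}_{C} + \underbrace{\mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \mathbf{R}^0_1\ddot{\mathbf{d}}^0_1}_{D} \tag{2.144}
\]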

The terms A), B), C) and D) of this formula have the same structure as those in (2.138), the only difference being the way in which the translational velocity and acceleration of the frame R1 are expressed in the fixed reference R0.
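For Case B, both the velocity (2.143) and the acceleration (2.144) can be checked against finite differences of the position (2.140). The sketch assumes NumPy and a rotation about the z axis of R0; the motion laws below are hypothetical examples chosen only for the check.

```python
import numpy as np


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Hypothetical motion laws (Case B): r0 = R01 (r1 + d01).
def R01(t):
    return rot_z(0.5 * t + 0.2 * t**2)


def r1(t):
    return np.array([1.0, 0.3 * t, 0.0])


def d01(t):
    return np.array([0.5 * t**2, 0.0, 0.2 * t])


def r0(t):
    return R01(t) @ (r1(t) + d01(t))                    # position (2.140)


def case_b_formulas(t):
    """Velocity (2.143) and acceleration (2.144) from analytic derivatives."""
    omega = np.array([0.0, 0.0, 0.5 + 0.4 * t])
    omega_dot = np.array([0.0, 0.0, 0.4])
    r1_dot, r1_ddot = np.array([0.0, 0.3, 0.0]), np.zeros(3)
    d01_dot, d01_ddot = np.array([t, 0.0, 0.2]), np.array([1.0, 0.0, 0.0])
    R = R01(t)
    rho0 = R @ (r1(t) + d01(t))                         # definition (2.142)
    v_rel = R @ (r1_dot + d01_dot)                      # velocity seen in the frame
    vel = np.cross(omega, rho0) + v_rel
    acc = (np.cross(omega_dot, rho0)                    # A: tangential
           + np.cross(omega, np.cross(omega, rho0))     # B: centripetal
           + 2.0 * np.cross(omega, v_rel)               # C: Coriolis
           + R @ (r1_ddot + d01_ddot))                  # D: linear
    return vel, acc


t = 0.6
vel_fd = (r0(t + 1e-5) - r0(t - 1e-5)) / 2e-5
acc_fd = (r0(t + 1e-3) - 2.0 * r0(t) + r0(t - 1e-3)) / 1e-6
vel_formula, acc_formula = case_b_formulas(t)
print(vel_fd, vel_formula)
print(acc_fd, acc_formula)
```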

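In conclusion, the total acceleration of a point that moves in a local reference frame, which itself moves with respect to a fixed frame, is the sum of the following terms (time dependency omitted for brevity):

Tangential acceleration
Due to the rotational acceleration of the local frame with respect to the fixed frame:

\[
A = \dot{\boldsymbol{\omega}}^0_{01} \times \boldsymbol{\rho}^0
\]

where $\boldsymbol{\rho}^0$ is the vector, represented in R0, that goes from the application point of ω to the point P.

Centripetal acceleration
Due to the change in direction of the tangential velocity, which produces an acceleration directed toward the instantaneous rotation center:

\[
B = \boldsymbol{\omega}^0_{01} \times \left(\boldsymbol{\omega}^0_{01} \times \boldsymbol{\rho}^0\right)
\]

Coriolis acceleration
Due to the linear velocity in a rotating frame:

\[
C = 2\,\boldsymbol{\omega}^0_{01} \times \mathbf{v}^0
\]

where

\[
\mathbf{v}^0 = \mathbf{R}^0_1\dot{\mathbf{r}}^1 \quad \text{or} \quad \mathbf{v}^0 = \mathbf{R}^0_1\left(\dot{\mathbf{r}}^1 + \dot{\mathbf{d}}^0_1\right)
\]

is the linear velocity, represented in R0, in Case A and Case B respectively;

Linear acceleration
Due to the linear acceleration of the point P and of the reference frame:

\[
\mathbf{a}^0 = \mathbf{R}^0_1\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1 \quad \text{or} \quad \mathbf{a}^0 = \mathbf{R}^0_1\left(\ddot{\mathbf{r}}^1 + \ddot{\mathbf{d}}^0_1\right)
\]

in Case A and Case B respectively.

Notice that the symbol $\boldsymbol{\rho}^0$ has a different meaning in the two cases, as shown by (2.136) and (2.142).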
