
Calculus and Differential Geometry:

An Introduction to Curvature

Donna Dietz

Howard Iseri

Department of Mathematics and Computer Information Science, Mansfield University, Mansfield, PA 16933

E-mail address : [email protected]

Contents

Chapter 1. Angles and Curvature
1. Rotation
2. Angles
3. Rotation
4. Definition of Curvature
5. Impulse Curvature

Chapter 2. Solid Angles and Gauss Curvature
1. Total curvature for cone points
2. Total curvature for smooth surfaces
3. Gauss curvature and impulse curvature
4. Gauss-Bonnet Theorem (exact excerpt from Creative Visualization handout)
5. Defining Gauss curvature
6. Intrinsic aspects of the Gauss curvature

Chapter 3. Intrinsic Curvature
1. Parallel vectors

Chapter 4. Functions
1. Introduction
2. Piecewise-Linear Approximations for Functions of One Variable
3. Uniform Continuity
4. Differentiation in One Variable
5. Derivatives and PL Approximations
6. Parametrizations of Curves
7. Functions of Two Variables
8. Differentiability for Functions of Two Variables

Chapter 5. The Riemannian Curvature Tensor in Two Dimensions
1. Parametrizations

Chapter 6. Riemannian Curvature Tensor
1. The Riemannian Metric for a Plane
2. The Riemannian Metric for Curved Surfaces
3. Curvature
4. The Inverse of the Metric

Chapter 7. Riemannian Curvature Tensor
1. Intrinsic Interpretations

Chapter 8. Curvature of 3-Dimensional Spaces
1. What we know
2. What is the geometry like around a vertex of a cubed 3-manifold?
3. A positive curvature example

CHAPTER 1

Angles and Curvature

0.1. Overview. As you walk around a closed path (along a simple closed curve on the floor), the direction you are facing will make a net rotation of 2π radians or 360°.

1. Rotation

Imagine a circle drawn on the floor (the radius might be ten feet). You are to walk around the circle once in a counter-clockwise direction. If you are initially facing north, you will soon be facing north-west and then west. We can naturally say that the direction in which you are facing has changed by 90° or π/2 radians. After that, you will face south, then east, and finally north again. The direction in which you are facing has experienced a rotation of 360°. We will want to think of this rotation as describing how the direction you are facing has changed as opposed to your change in location as you make an orbit around the circle.

For a curve in the plane, we can talk about the rotation of a tangent vector in the same way that we have talked about the rotation of our body as we walk along a curve drawn on the floor. Intuitively at least, we would like to identify these two concepts. That is, what we discover about one should apply equally to the other.

Throughout this book, we will use the convention that counter-clockwise rotations are positive. For example, if you were to turn 45° to the left and then 90° to the right, the net rotation would be −45°.

Figure 1. Walk along this path marked on the floor, with points A, B, and C labeled. (Exercise 1)

1.1. Exercises.

1. Suppose you are walking around the curve shown in Figure 1 in a counter-clockwise direction. Assume that the curve is smooth (the direction varies smoothly) and that the direction you are facing is the same as that of a tangent vector. How does the direction you face change as you move from the starting point A to the point B? From B to C? From A to C? What is your total (net) rotation for the entire circuit?

2. What would your total rotation be as you walked in the direction indicated around the path shown in Figure 2?

3. What would your total rotation be as you walked in the direction indicated around the path shown in Figure 3?

4. Make a conjecture about the net rotation of a tangent vector moving around a simple closed curve in the plane in a counter-clockwise direction.

5. Make a conjecture about the net rotation of a normal vector moving around a simple closed curve in the plane in a counter-clockwise direction. Does it make a difference whether the normal vector is pointing outward or inward? Are there other directions that a normal vector can point?

1.2. Overview. Angles are abrupt changes in direction. Total curvature is the net change in direction over some section of a curve or polygonal path.

2. Angles

One of the most important theorems in Euclidean geometry states that the sum of the angles of a triangle is 180°. Virtually all of the theorems that involve angle measure or parallelism can be proved with this fact. Among these would be that the angle sum of a quadrilateral is 360°, the angle sum of a pentagon is 540°, the angle sum of a hexagon is 720°, and in general,

Theorem 1. The angle sum of a (convex) n-gon is (n − 2) · 180°.

Figure 4. The turning angles for a quadrilateral. (Angle labels in the figure: 100°, 95°, 95°, 70°.)

This is all very nice, but the sequence of theorems just mentioned can be restated more simply and intuitively in terms of the turning angle or angle defect. The reason for using the term turning angle should become clear, and angle defect refers to the idea that the turning angle measures how far the angle is from being a straight angle. In Figure 4, a quadrilateral is shown with the turning angles marked. You should imagine yourself walking around the quadrilateral in a counter-clockwise direction. The turning angles then measure the amount you must turn to your left as you start the next edge. In this case, the sum of the turning angles is 360°. If you imagine yourself walking around any closed path, taking left turns, and coming back to your original position, you must have rotated a full 360°. This should agree completely with your answers to the exercises in the previous section. It seems reasonable, therefore, that the sum of the turning angles is 360° for any polygon. This is in fact true, and Theorem 1 can be restated as

Theorem 2. The turning angle sum of a (convex) n-gon is 360°.

It is not necessarily true that Theorem 2 is a better theorem than Theorem 1, but it is certainly simpler and more intuitive. The angle sum theorem is probably more convenient for analyzing geometric figures, but we want to understand curvature, and the turning angle sum theorem sets us off in the right direction.
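The relationship between the two theorems can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name turning_angles and the sample quadrilateral are illustrative assumptions. It measures the signed left turn at each vertex of a polygon walked counter-clockwise, then compares the turning angle sum with the ordinary angle sum.

```python
import math

def turning_angles(vertices):
    """Signed turning angle (radians) at each vertex of a closed polygon.

    Vertices are (x, y) pairs in walking order; a left turn is positive.
    """
    n = len(vertices)
    angles = []
    for i in range(n):
        x0, y0 = vertices[i - 1]
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Edge arriving at the vertex and edge leaving it.
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        # atan2(cross, dot) is the signed angle from the first edge to the second.
        angles.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
    return angles

# A hypothetical convex quadrilateral, listed counter-clockwise.
quad = [(0, 0), (4, 0), (5, 3), (1, 4)]
angles = turning_angles(quad)
interior = [math.pi - a for a in angles]  # each angle is a straight angle minus the turn

print([round(math.degrees(a), 1) for a in angles])
print(round(math.degrees(sum(angles)), 6))    # turning angle sum (Theorem 2)
print(round(math.degrees(sum(interior)), 6))  # angle sum (Theorem 1)
```

Since each angle is 180° minus its turning angle, an n-gon whose turning angles sum to 360° has angle sum n · 180° − 360° = (n − 2) · 180°, which is how Theorem 2 gives back Theorem 1.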

2.1. Exercises.

6. Theorem 1 states that the angle sum of an n-gon is (n − 2) · 180°, or n − 2 times the angle sum of a triangle. Draw a figure illustrating that a convex pentagon has the angle sum of three triangles. Do the same for a hexagon.

7. Suppose the quadrilateral of Figure 4 is drawn on the floor with up in the picture corresponding to north, and you are to walk around it in the counter-clockwise direction. Draw a picture of the face of a compass, and for one of the sides, draw the position of the needle corresponding to the direction you are facing as you walk along it. On the same picture, draw the needle positions corresponding to the other three sides. At each vertex, you would need to pivot as you finish walking along one side of the quadrilateral and start on the next. In your picture, for each vertex, indicate which directions you sweep through as you turn to the left.

3. Rotation

Our goal is to formulate definitions in differential geometry. Before we do that for curves in the plane, let us summarize what we have so far.

Given an object moving in a counter-clockwise direction around a simple closed curve, a vector tangent to the curve and associated with the object must make a “full” rotation of 2π radians or 360°. In other words, if we were to think of this tangent vector (or, if you wish, a copy of it) as having its tail fixed at the origin, then as the object moves around the curve, the tangent vector will sweep through all possible directions. This rotation of the tangent vector will be predominantly in the counter-clockwise direction, but it may, for example, sweep clockwise for a bit, come back counter-clockwise an equal amount, and then continue on. These clockwise rotations are always countered by an extra counter-clockwise rotation, and the total net result is always 360° of counter-clockwise rotation.
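This summary can be tried out on a curve that is not convex. The sketch below is an illustration under assumptions, not something from the text: the curve is the polar curve r = 2 + cos(3t), chosen only because it is simple, closed, and has inward dents, and the function names are made up for the example. It adds up the small signed changes in the direction of the tangent vector over one counter-clockwise circuit, and separately records how much of that motion was clockwise.

```python
import math

def net_tangent_rotation(tangent, n=4000):
    """Signed change in tangent direction over one circuit, t in [0, 2*pi].

    Returns (net rotation, total of the clockwise steps), both in radians.
    """
    total = 0.0
    clockwise = 0.0
    dx, dy = tangent(0.0)
    prev = math.atan2(dy, dx)
    for k in range(1, n + 1):
        dx, dy = tangent(2 * math.pi * k / n)
        cur = math.atan2(dy, dx)
        # Wrap the step into [-pi, pi) so the branch cut of atan2 does no harm.
        step = (cur - prev + math.pi) % (2 * math.pi) - math.pi
        total += step
        if step < 0:
            clockwise += step
        prev = cur
    return total, clockwise

def wavy_tangent(t):
    """Tangent vector of the assumed example curve r = 2 + cos(3t) in polar form."""
    r = 2 + math.cos(3 * t)
    dr = -3 * math.sin(3 * t)
    return (dr * math.cos(t) - r * math.sin(t),
            dr * math.sin(t) + r * math.cos(t))

total, clockwise = net_tangent_rotation(wavy_tangent)
print(total / math.pi)  # net rotation in units of pi; expected to be close to 2
print(clockwise)        # negative: part of the sweep is clockwise
```

The clockwise total is nonzero because of the dents in the curve, yet the net rotation still comes out to one full counter-clockwise turn.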

If the curve is smooth (whatever that means), we can easily describe a tangent vector in terms of a derivative. There are some difficulties at non-smooth parts of a curve. At the corners of a quadrilateral, for example, a derivative will not specify a unique tangent direction. In this case at least, we will be able to find a tangent direction entering the vertex and one leaving. We can and will account for the directions swept through as we pivot from one direction to the other, and we will avoid curves that are “less smooth” than this.

In order to motivate the definitions describing rotations in terms of derivatives, we will consider the following. Looking at the unit tangent vector as we move around a vertex of a polygonal path, we see that the direction of the tangent vector stays the same, pivots through some angle θ at the vertex, and then again remains the same until another vertex is encountered. An example of this is illustrated in Figure 5, and in this picture the angle θ will be positive. Later, we will be interested in understanding curvature in higher dimensions, and it will be more convenient to speak in terms of a unit normal vector rather than a unit tangent. For a curve in the plane (we will assume that polygonal paths are curves) a unit normal to a curve will experience the same changes in direction that a unit tangent will. The unit normal to the same curve shown in Figure 5 will also sweep through the same angle θ, as shown in Figure 6. As described earlier, the rotation is a measure of how the direction of the unit tangent or unit normal vectors changes. If we take the unit normal at each point of the curve, and put its tail at the origin, the head of the vector will stay on the unit circle and serve as a “direction-o-meter,” as shown in Figure 7. As we move along the curve, the direction-o-meter will stay fixed until we reach the vertex, and then it will swing over to the left as we pass through the vertex. Formally, it is common to associate each point on the curve with a point on the unit circle determined by the unit normal in this way. This association is called the Gauss map, and this will be something that we will be able to differentiate in a meaningful way.

Figure 5. Following the unit tangent vectors around a vertex with turning angle θ.

Figure 6. Following the unit normal vectors around a vertex with turning angle θ.

Figure 7. The unit normal vectors moved to the origin.

In order to perform this differentiation, we need to consider a situation where the direction of the normal vector changes over some interval, and not all at once. If we take the polygonal curve we have been using and “smooth” it out, the change in direction is spread out over the curve, as shown in Figure 8. In particular, note that the total change in direction is the same, the positive angle θ; it is only that the direction-o-meter swings to the left more gradually.

Figure 8. Following the unit normal vectors around a vertex with turning angle θ.

The total rotation, which we will call the total curvature, is a quantity that applies to both polygonal curves and smooth ones. With the smooth curves, however, we can also talk about the rate of rotation (it does make some sense to say that the rate of rotation at the vertex of the polygonal curve is infinite). There are two quantities that are natural candidates to which we will compare the rotation, time and distance. It makes sense to say that if a given amount of rotation takes place over a very short distance, then the curve must be very sharply curved, and conversely, the curve is not so sharply curved if the rotation takes place over a longer distance. Therefore, if we simply divide,

\[ \text{average rate of rotation} = \frac{\text{total rotation}}{\text{distance}} \tag{1} \]

we get a reasonable measure of how much a curve curves. We will call this average rate of rotation the average curvature. The next logical step is to take a limit as the distance approaches zero, and this suggests a definition for curvature. If ρ is a quantity measuring rotation, and s is an arclength parameter, then the curvature κ should be defined as

\[ \kappa = \frac{d\rho}{ds}. \tag{2} \]

In the next section, we will express this definition more formally as a formula. In particular, we need to find a function corresponding to ρ. This formula will correspond exactly to the one given in calculus classes. Before we do that, however, we can check a simple case.

The circumference of a circle of radius r is 2πr. The total rotation is 2π radians. Radians are more natural in this context than are degrees, but degrees would work OK. If we assume that the curvature of a circle is constant, then the curvature should be the same as the average curvature. The curvature must be, therefore,

\[ \kappa = \frac{2\pi}{2\pi r} = \frac{1}{r}, \tag{3} \]

and this agrees with the calculus definition of curvature. The point of this book is to show that the definitions for the curvature of surfaces and of three-dimensional spaces can be motivated in an analogous way.

4. Definition of Curvature

We are in search of a function that measures the angle of rotation for the unit normal vector, or equivalently, the unit tangent vector. In terms of the Gauss map, the head of the unit normal vector always lies on the unit circle. Therefore, the derivative of the unit normal vector must always be tangent to the unit circle. This is a manifestation of the fact that the derivative of a vector function that has constant magnitude is always perpendicular to the original vector function. Two notions point the way. First, over small distances, the arc of a circle near a point on the circle and the tangent line through that point are very similar. Second, the length of an arc of the unit circle is equal to the corresponding angle measured in radians. Therefore, a derivative of the unit normal vector measures change along a tangent to the unit circle (as in the Gauss map), and this change is essentially the same as the change along the unit circle, which is equal to a change in the direction of the normal vector measured in radians. In other words, the conclusions of the last section suggest that the curvature can be defined as the magnitude of the derivative of the unit normal vector with respect to arclength. It can also be defined as the magnitude of the derivative of the unit tangent vector with respect to arclength. That is,

\[ \kappa(s) = \left\| \frac{d\mathbf{n}}{ds} \right\| = \left\| \frac{d\mathbf{T}}{ds} \right\|. \tag{4} \]
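Equation (4) can be compared with the circle computation of equation (3). The following is a rough numerical sketch under stated assumptions: the function curvature, the step size h, and the sample radius are all illustrative choices, and derivatives are approximated by central differences rather than computed exactly. It estimates the unit tangent T, differentiates it with respect to the parameter, and divides by the speed ds/dt to convert to a derivative with respect to arclength.

```python
import math

def curvature(point, t, h=1e-4):
    """Estimate kappa = ||dT/ds|| at parameter t for a plane curve t -> (x, y)."""
    def unit_tangent(u):
        # Central difference for the velocity, then normalize.
        (xa, ya), (xb, yb) = point(u - h), point(u + h)
        vx, vy = (xb - xa) / (2 * h), (yb - ya) / (2 * h)
        speed = math.hypot(vx, vy)
        return vx / speed, vy / speed, speed

    tx0, ty0, _ = unit_tangent(t - h)
    tx1, ty1, _ = unit_tangent(t + h)
    _, _, speed = unit_tangent(t)
    # Chain rule: dT/ds = (dT/dt) / (ds/dt).
    return math.hypot(tx1 - tx0, ty1 - ty0) / (2 * h) / speed

r = 3.0  # assumed sample radius

def circle(t):
    return (r * math.cos(t), r * math.sin(t))

print(curvature(circle, 0.7), 1 / r)  # the two values should nearly agree
```

The same function can be pointed at any other smooth parametrization, provided the speed is never zero at the parameter value being tested.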

4.1. Exercises.

8. Show that the derivative of the unit normal vector is perpendicular to the unit normal vector. Use the fact that the unit normal vector has constant magnitude, i.e., ‖n‖ = 1, and that the “dot product rule” looks like the product rule from calculus, i.e.,

\[ \frac{d}{ds} \langle \mathbf{x}, \mathbf{y} \rangle = \left\langle \mathbf{x}, \frac{d\mathbf{y}}{ds} \right\rangle + \left\langle \frac{d\mathbf{x}}{ds}, \mathbf{y} \right\rangle. \]

9. The derivative of the unit normal vector expresses a rate of change along a tangent to the Gauss map circle. I claimed that this rate of change could be interpreted as a rate of change of direction in terms of an angle measured in radians. Show that this interpretation is a reasonable one by showing that for the quantities shown in Figure 9,

\[ \frac{d\tau}{d\theta}(0) = \frac{d\tau}{d\rho}(0) = \frac{d\rho}{d\theta}(0) = 1. \]

Hint: Don’t work too hard. Just use trigonometry to express each as a function of one of the others.

Figure 9. Comparing the quantities measured along the tangent line, along the unit circle, and the angle at the center of the circle. (Labels in the figure: θ, τ, ρ, and 1.)

10. Suppose an object is moving along a curve, and at a point P on the curve, the derivative of the unit normal vector with respect to time is dn/dt = [ 3, 2 ] (this is a velocity for the head of the vector n in terms of the Gauss map in feet per second perhaps). We could say that the unit normal vector is rotating at a certain rate measured in radians per second. What is this rate? Suppose that the velocity of the object as it passes through P is v = [ 6, 4 ]. What is its speed (a.k.a. ds/dt)? What is the curvature of the curve at the point P? Hint: You can use the chain rule, if you want.

11. The vector function x(t) = [ t, t² ] is a parametrization of a parabola. Find the curvature of the curve at the points corresponding to t = 0 and t = 2.


5. Impulse Curvature

We can define curvature for smooth curves, but this definition will not work for curves with sharp corners in them. The notion of total curvature applies to both cases, however. We can develop a notion of curvature that works for corners that we will call impulse curvature. Let us look closely first at a total curvature function, and how total curvature and (instantaneous) curvature are related through derivatives and integrals. Consider the smooth curve of Figure 8. As we move along the curve from right to left, the unit normal vector initially makes an angle with the “positive x-axis” of π/6 radians (in this particular example). In Figure 10, the initial point of the graph corresponds to this value of ρ. The curve in Figure 8 is straight initially, so the direction of the unit normal is constant, and this manifests itself in the horizontal section of the graph in Figure 10. As we move into the curved section of the curve, the unit normal vector begins to rotate counter-clockwise, so the angle ρ increases, and this is reflected in the graph. The angle reaches a maximum as we move into the final straight portion of the curve, and ρ is again constant, now at a value of about 5π/6 radians. We can think of ρ(s) as being a total curvature function, and the difference between the starting and finishing values, θ = 5π/6 − π/6 = 4π/6, represents the total curvature of this section of the curve.

Figure 10. Graph of the direction of the unit normal vector for the smooth curve of Figure 8 with respect to arclength. (Axes: ρ against s, with π and θ marked.)

If we graph the total curvature function ρ for the polygonal curve of Figure 8, the initial and final values for ρ are the same, but the increase in the value of ρ occurs at a single point on the curve. The graph for ρ is a step function in this case, but the total curvature is still the difference between the initial and final values of ρ. This total curvature function ρ keeps track of the direction of the unit normal vector, and total curvature is the net change in this function. As concluded before, curvature is the derivative of this function, dρ/ds. Therefore, we can see curvature in the graphs of Figures 10 and 11 as slopes. The slopes are zero on either end of the graph of Figure 10, so the curvature of the corresponding parts of the smooth curve of Figure 8 must also be zero. This agrees with the fact that the ends of this curve are straight. The slopes are positive in the middle of the graph, and this corresponds with the fact that the middle section of the smooth curve has positive curvature (positive, since the unit normal vector is rotating counter-clockwise). On the other hand, the polygonal curve of Figure 8 is straight everywhere except at the vertex. Therefore, the slopes in the graph of Figure 11 are zero everywhere except at the point in the middle. Here the slope and the curvature at the vertex are both infinite. The curvature graphs are shown in Figures 12 and 13.

Figure 11. Graph of the direction of the unit normal vector for the polygonal curve of Figure 8 with respect to arclength. (Axes: ρ against s, with π and θ marked.)

Figure 12. Graph of the derivative of ρ with respect to s for the smooth curve of Figure 8. (Axes: dρ/ds = κ against s.)

Figure 13. Graph of the derivative of ρ with respect to s for the polygonal curve of Figure 8. (Axes: dρ/ds = κ against s.)

Of particular interest is the fact that it is possible to assign a finite value to the infinite curvature at the vertex of the polygonal curve. More specifically, we will think of the curvature at this point as being infinite, but if we were to integrate across this infinite function value, we would obtain a definite finite value, namely the total curvature. The total curvature at the vertex of the polygonal curve of Figure 8 is 4π/6 radians, so we will say that the curvature at this point is ∞ ∗ 4π/6. We will call this impulse curvature, and the notation will simply remind us that when we integrate across such a value, the result will be 4π/6. For example, for any interval [a, b] containing the undefined point for the function in Figure 13, we would have

\[ \int_a^b \frac{d\rho}{ds} \, ds = \frac{4\pi}{6}. \tag{5} \]
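One way to make the notation ∞ ∗ 4π/6 plausible is to round the corner off and let the rounding shrink. The sketch below is an illustration under assumptions, not the authors' construction: the corner is replaced by a circular arc of radius eps (a hypothetical parameter), so the curvature is 1/eps along an arc of length eps · θ and zero on the straight legs, and the integral is estimated with a simple midpoint rule.

```python
import math

theta = 4 * math.pi / 6  # total curvature of the corner in the running example

def rounded_corner_curvature(s, eps):
    """Curvature along a corner rounded by a circular arc of radius eps.

    The arc turns through theta, so it occupies 0 <= s <= eps * theta;
    the straight legs on either side have curvature 0.
    """
    return 1.0 / eps if 0.0 <= s <= eps * theta else 0.0

def integrate(f, a, b, n=100000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

totals = {}
for eps in (0.2, 0.05, 0.01):
    peak = 1.0 / eps  # height of the curvature graph on the arc
    totals[eps] = integrate(lambda s: rounded_corner_curvature(s, eps), -1.0, 1.0)
    print(eps, peak, totals[eps], theta)
```

As eps decreases, the peak curvature grows without bound while the integral across the corner stays near θ, which is the behavior that equation (5) records for the limiting polygonal curve.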


5.1. Exercises.

12. Consider a square with sides of length 1. Choose the midpoint of one of the sides as a starting point and consider an object moving around the square in a counter-clockwise direction. Let κ(s) be the curvature function with respect to arclength from this starting point. Compute each of the following.

\[ \kappa(0), \qquad \kappa\!\left(\tfrac{1}{2}\right), \qquad \int_0^1 \kappa(s)\,ds, \qquad \int_0^4 \kappa(s)\,ds. \]

13. Consider the graph y = |sin(x)|. What is the curvature at the point (0, 0)?

CHAPTER 2

Solid Angles and Gauss Curvature

1. Total curvature for cone points

The goal here is to generalize our notions of curvature to surfaces. This can be done in a number of ways, but our intention will be to eventually end up with an intuitive understanding of the Gauss curvature. In the previous chapter, the curvature of a curve was obtained by extending the notion of the turning angle for the vertex of a polygonal curve. This suggests, perhaps, that we first consider the vertex of a polyhedral surface. If we imagine the vertex of a pyramid, the 3-dimensional region interior to that vertex can be compared to the 2-dimensional region inside an angle. This solid angle parallels the notion of a (plane) angle in a number of ways.

It was the turning angle, however, that became the total curvature. One way of measuring this was to consider the length of the arc on the unit circle that all of the possible unit normal vectors swept out at the vertex in terms of the Gauss map. These normals were perpendicular to the tangent lines through the vertex that were outside of the angle. For the vertex of the pyramid, we could consider all tangent planes through the vertex outside of the pyramid and the unit vectors normal to these planes. Under the Gauss map (now to the unit sphere instead of the unit circle), the heads of these normals would sweep out a region on the unit sphere. The area of this region would be a natural candidate for the total curvature (and also the impulse curvature) at this vertex. This works amazingly well, but it is a bit simpler to look at a cone.

We can make a cone out of a piece of paper by removing a wedge and taping the edges together. Let us suppose that we remove a wedge with angle θ as in Figure 1. A circle of radius R is shown in Figure 1, and θ radians have been removed from the circle as well (an arc of length Rθ). After joining the edges, we get something like the cone in Figure 2. The circle, as a circle on the cone, still has radius R, and the radius is measured to the vertex of the cone. As a curve in space, it also has a smaller radius r. The angle between the radius of length r and the surface of the cone is marked φ, and the angle between the central axis and the normals to the surface of the cone is also φ. The angle between the central axis and the surface of the cone is marked ψ.

We are interested in computing the area of the region on the unit sphere corresponding to the normal vectors at the vertex. The normal vectors to the surface of the cone determine a circle on the unit sphere under the Gauss map as shown in Figure 2. This circle separates the sphere into two pieces, and we are interested in the area of the upper one.

Figure 1. Remove a θ-wedge to construct a cone. (Labels in the figure: θ and R.)

Figure 2. A cone with total curvature θ. (Labels in the figure: φ, ψ, r, and R.)

To find φ, consider the circle of radius R (shown in Figure 1). After removing the wedge and joining the edges, this circle becomes a circle on the cone. It has radius R along the surface, and it has radius r in R³. We can, therefore, compute its circumference two ways. We have C = 2πr, as a circle in space, and C = (2π − θ)R, as a circle on the cone having been formed by removing a θ radian wedge. The right triangle shown in Figure 2 with hypotenuse R and base r has an angle φ, so

\[ \cos\phi = \frac{r}{R} = \frac{2\pi - \theta}{2\pi}. \tag{6} \]

Equation (6) determines φ in terms of the angle θ. We can determine the area swept out by the normals at the vertex under the Gauss map in terms of φ easily using φ̄, θ̄, and ρ̄ as spherical coordinates for the sphere. Here, 0 ≤ φ̄ ≤ φ, 0 ≤ θ̄ ≤ 2π, and ρ̄ = 1. The desired area is then

\[
\begin{aligned}
\int_0^{2\pi}\!\int_0^{\phi} \bar\rho^{2} \sin\bar\phi \, d\bar\phi \, d\bar\theta
  &= \int_0^{2\pi}\!\int_0^{\phi} \sin\bar\phi \, d\bar\phi \, d\bar\theta \\
  &= \int_0^{2\pi} (1 - \cos\phi) \, d\bar\theta \\
  &= 2\pi(1 - \cos\phi).
\end{aligned}
\tag{7}
\]

The total curvature at the vertex is therefore 2π(1 − cos φ). Quite remarkable is the fact that this total curvature is precisely θ, the measure of the wedge removed to form the cone. This can be seen by solving equation (6) for θ.

Definition 1. The impulse curvature, or total curvature, at the vertex of a cone is the area swept out by the unit normal vectors at the vertex under the Gauss map.

Theorem 3. The total curvature at the vertex of a cone is equal to the angle of the wedge removed to construct it.
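For reference, the step from equation (6) to Theorem 3 can be written out as follows. This is only the algebra already implied by equations (6) and (7), using the same symbols φ and θ.

```latex
\begin{align*}
\cos\phi &= \frac{2\pi - \theta}{2\pi} = 1 - \frac{\theta}{2\pi}
  && \text{by equation (6),} \\
2\pi\,(1 - \cos\phi) &= 2\pi \cdot \frac{\theta}{2\pi} = \theta
  && \text{substituting into equation (7).}
\end{align*}
```

As a quick instance, removing a quarter of the disk (θ = π/2) gives cos φ = 3/4, and the area swept out on the unit sphere is 2π(1 − 3/4) = π/2, the angle of the wedge.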

1.1. Exercises.

14. Show that the results are the same if we used a pyramid instead of a cone. Note that there are only four different unit normals obtained from the lateral surfaces of the pyramid. The rest of the boundary of the region on the unit sphere comes from those tangent planes that contain an edge leading into the vertex. The normals would be those unit vectors perpendicular to the edge between the normals for the faces. The normal vectors at the vertex are normal to planes through the vertex that lie outside of the pyramid.

15. What is the total curvature of any region of the cone not containing the vertex? (Note: a curve on the surface of the unit sphere has no area.)

2. Total curvature for smooth surfaces

If v is the vertex of a cone, then all of the area on the unit sphere under the Gauss map comes from unit normal vectors at the vertex. If we were to smooth the vertex, as in Figure 3, then these unit normals will be spread out over the smooth surface, and there will only be one unit normal at each point of the surface, but the area under the Gauss map would be the same, since we have precisely the same collection of unit normals. Therefore, smoothing the vertex should not change the total curvature, and the geometry of the surface near the circle shown is exactly the same as the geometry on the cone. With this notion of total curvature for surfaces described intuitively, we can define an instantaneous curvature for smooth surfaces, generally known as the Gauss curvature, as we did with smooth curves.

2.1. Exercises.

16. What is the total curvature of a sphere? A cube? A tetrahedron?

17. It is possible to find a triangle on the unit sphere that has one vertex at the north pole, two vertices on the equator, and three right angles. What is the area of this triangle? What is the total curvature of the region inside of this triangle? Find a triangle with two right angles and one angle measuring π/4 radians. What is the total curvature of the region inside of it? Do you see a relationship between the total curvature within and the angle sums of these two triangles?

Figure 3. We have the same total curvature as the cone in Figure 2, θ, if we smooth the vertex of the cone.

3. Gauss curvature and impulse curvature

The total curvature of a curve was defined as the length of an arc of the unit circle under the Gauss map. This was an extension of the idea of a turning angle to curves. The measure of the turning angle, as interpreted through the Gauss map, can be applied to the vertex of a cone by considering the area of a region of the unit sphere under the (spherical) Gauss map. This idea extends to smooth surfaces in the same way as the turning angle extends to smooth curves. We obtained an instantaneous curvature for curves by taking a limit comparing the length along the unit circle with the corresponding length along the curve. We can do the same thing here by comparing the area on the unit sphere with the corresponding area on the surface. This is the notion of curvature of surfaces used by Gauss, and it is called the Gauss curvature.

Definition 2. At a point p on a surface S, the Gauss curvature at p is the limit

\[ K = \lim_{\Delta A \to 0} \frac{\Delta\Theta}{\Delta A}, \tag{8} \]

where ∆A is the area of some region on the surface containing p and ∆Θ is the total curvature of that region.

If we think of the measure of an angle in terms of the possible directions of the unit normal vector at a vertex (the turning angle), and then extend this into the curvature of the curve, then this is the most natural notion of curvature for surfaces, since it is a direct translation of the relevant notions in terms of curves to surfaces.

For computational purposes, this is not the most convenient formula, but this is probably one of the more intuitive ways to think about what Gauss curvature is.
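To see the limit in equation (8) at work on a surface other than a sphere, here is a small sketch. Everything specific in it is an assumption made for illustration: the surface is the paraboloid z = (k/2)(x² + y²), the point p is its vertex, the regions are the pieces of the surface over disks of radius a, and the name gauss_ratio is hypothetical. Both areas have closed forms for this surface, so no numerical integration is needed.

```python
import math

def gauss_ratio(k, a):
    """(Area of the Gauss map image) / (surface area) for z = (k/2) r^2 over r <= a."""
    # The unit normal at radius r tilts by arctan(k r) from the vertical axis, so the
    # normals over the region fill a spherical cap of angular radius arctan(k a),
    # and cos(arctan(x)) = 1 / sqrt(1 + x^2).
    image_area = 2 * math.pi * (1 - 1 / math.sqrt(1 + (k * a) ** 2))
    # Surface area of the graph over the disk r <= a:
    # the integral of sqrt(1 + k^2 r^2) * 2 pi r dr from 0 to a.
    surface_area = 2 * math.pi / (3 * k ** 2) * ((1 + (k * a) ** 2) ** 1.5 - 1)
    return image_area / surface_area

k = 2.0  # assumed sample value
for a in (0.5, 0.1, 0.02, 0.004):
    print(a, gauss_ratio(k, a))  # the ratio should approach k**2 as a shrinks
```

The ratios settle toward k², which is the standard value of the Gauss curvature at the vertex of this paraboloid, so the intuitive definition and the usual computed value agree in this example.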

3.1. Exercises.

18. Find the total curvature of a sphere with radius r. What is the Gauss curvature?

4. Gauss-Bonnet Theorem (exact excerpt from Creative Visualization handout)

I do not address the Gauss-Bonnet theorem in any of the labs, but after the students have completed the last lab, I would look at the cone point version of the Gauss-Bonnet theorem. From here, the definition for Gauss curvature on a smooth surface should make sense intuitively.

Figure 4. The angle defect corresponds to total curvature. (Labels in the figure: θ, r, and C.)

The basic idea can be seen using circles and spheres. Consider a circle of radius r centered at the cone point of a cone with angle defect θ, as in Figure 4. In the plane, this circle will have curvature κ = 1/r. Since the local geometry on the cone is Euclidean away from the cone point, the geodesic curvature for this circle as a curve on the cone must be the same. That is, κ_g = 1/r. What is different about this circle and a circle in the plane with the same radius is that the circle on the cone has a smaller circumference. In fact, the difference must be θr.

We can now compute the total geodesic curvature.

\[ \int_C \kappa_g \, ds = \frac{1}{r} \int_C ds = \frac{1}{r}(2\pi r - \theta r) = 2\pi - \theta. \tag{9} \]

Since curvature measures the rate of rotation of the tangent vector, it should make sense to students that the total rotation for a simple closed curve in the plane must always be 2π. Since any small deformation of the circle essentially takes place in the plane, it should also make sense that the total rotation for a simple closed curve around the cone point will always be 2π minus the angle defect. In any case, the formulation of the Gauss-Bonnet theorem should seem natural.

Comparing Equation (9) to the Gauss-Bonnet theorem,

\[ \int_C \kappa_g \, ds = 2\pi - \int_R K \, dA, \tag{10} \]

it’s obvious that the angle defect corresponds with the total curvature ∫ K dA. In fact, I think it makes perfect sense to motivate the definition of the Gauss curvature K in terms of this formula. I might start out by doing the following.

Consider a sphere tangent to a cone, as shown in Figure 5. The geodesic curvature for the circle of tangency will be the same on both surfaces. Therefore, the total curvature for the regions contained by the circle on both surfaces should be the same. We can then require that the Gauss curvature be an infinitesimal version of the total curvature and that it be constant on the sphere. That is,

Figure 5. The circle of tangency will have the same geodesic curvature on both surfaces. (Labels in the figure: R, r, C, and φ.)

\[ \theta = \int_D K \, dA = K \int_D dA = K R^2 \theta, \tag{11} \]

and

\[ K = \frac{1}{R^2}. \tag{12} \]

I think the actual computation is a bit tricky, but there may be a simpler way. Inany case, the area integral is

(13)∫

D

dA =∫ 2π

0

∫ φ

0

R2 sin p dpdt = R2(1 − cosφ)2π,

where the parameters p and t are the phi and theta from spherical coordinates. Toexpress this expression in terms of θ, note that the circumference of the circle C is2πr−θr on the cone. If the radius of this circle in space is ρ, then this circumferenceis also 2πρ. Since R sinφ = ρ, we have that

(14) 2πr − θr = 2πR sinφ,

and

(15) $\theta = 2\pi \left( 1 - \frac{R}{r} \sin\phi \right).$

Now, tan φ = r/R, so

(16) $\theta = 2\pi \left( 1 - \frac{\cos\phi}{\sin\phi} \sin\phi \right) = 2\pi (1 - \cos\phi).$

Equations (13) and (16) establish equation (11).
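The agreement between Equations (13) and (16) can also be checked numerically. The following is only a sketch: the function names and the sample values of R and φ are hypothetical, and the area integral is estimated with a simple midpoint rule rather than evaluated exactly. It assumes 0 < φ < π/2 so that the tangent cone exists.

```python
import math

def angle_defect(R, phi):
    """Angle defect theta of the tangent cone, from equation (15) with tan(phi) = r/R."""
    r = R * math.tan(phi)                  # distance from the cone point to the circle C
    return 2 * math.pi * (1 - (R / r) * math.sin(phi))

def cap_area(R, phi, steps=20000):
    """Midpoint-rule estimate of the area integral in equation (13)."""
    dp = phi / steps
    strip = sum(R * R * math.sin((i + 0.5) * dp) for i in range(steps)) * dp
    return 2 * math.pi * strip             # the t-integral contributes a factor of 2*pi

R, phi = 3.0, 0.7                          # hypothetical sample values
print(angle_defect(R, phi), cap_area(R, phi) / R**2)
```

The two printed numbers should agree to many digits, which is the content of Equation (11) with K = 1/R².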

5. Defining Gauss curvature

The Gauss curvature at a point on a surface is generally defined to be the product of the two principal curvatures. Very roughly, this can be described as follows. At a point on a surface in space, we can choose one of two possible unit vectors normal to the surface (one normal is as good as the other). For every plane that contains the point and the normal vector, the intersection of the plane and the surface is a curve that has a curvature within that plane. If the curve bends towards the normal vector, we will associate a positive sign with this curvature, and if the curve bends away from the normal vector, we will associate a negative sign. In other words, if the normal vector chosen points upwards and the curve is concave up, then the curvature will be positive. These signed curvatures are called normal curvatures. The maximum normal curvature (most positive) and the minimum normal curvature (most negative) are the principal curvatures. The choice of normal vector, and hence of which curvatures are positive, is quite arbitrary, but of significance is that the Gauss curvature of a bowl-shaped surface will always be positive, and the Gauss curvature of a saddle-shaped surface will always be negative, regardless of how the choices were made.

That this is as simple a definition for the curvature of a surface as could be expected is one thing, and that it works incredibly well is made very clear in any book on the subject. What is not so clear is why anyone would consider the definition in the first place and what it really represents. What we will do here is to show that this definition is a rigorous implementation of the definition we have already described and how the previous definition leads to this one.

A lot of insight into what Gauss curvature is can be obtained by examining the connection between the intuitive definition given earlier and the one involving the principal curvatures. We will start with the intuitive definition of the Gauss curvature at a point. This was expressed in Equation (8). The biggest problem with this formula is that it does not say how ∆A goes to zero. Different values for K can be obtained if there are no restrictions. We will want to choose the most boring limit possible. Sufficient for our purposes, we can take a small sphere in space centered at the point P. Each point on the surface contained in the sphere (this region has area ∆A) has a normal vector, and thus an image under the Gauss map. These Gauss map images will determine a region on the unit sphere having a well-defined area (if the surface is sufficiently smooth, which we will always assume is the case), and this area is ∆Θ. We can then take the limit as the radius of the sphere about P goes to zero. If the surface is sufficiently smooth, this limit should exist, and we will assume that all surfaces under consideration are sufficiently smooth, unless otherwise noted.

As it stands, this definition is non-trivial to apply directly, so we will formulate an alternative in terms of derivatives. For one of the small regions on the surface about P contained in the small sphere, the region should be roughly disk shaped, and we can imagine it as consisting of a bouquet of radial arcs. The normal vector at P will determine one point on the unit sphere under the Gauss map. The normal vectors from the points on the radial arcs will determine arcs on the unit sphere also under the Gauss map. Of relevance is the fact that the length of the arc on the unit sphere under the Gauss map divided by the length of the radial arc on the surface will limit on the curvature of the radial arc at P. Also of relevance is the observation that the area ∆Θ is determined by the extent of these arcs. It would seem reasonable to assume, therefore, that the limit of Equation (8) will depend only on the curvatures of arcs through P. The one important assumption that we will make in the formulation of the alternative definition of the Gauss curvature is that it depends only on information provided by first and second derivatives.

Suppose we have a point P on a surface in space, and we will define the Gauss curvature of the surface at P. The curvature is independent of the surface’s position and orientation in space, so we will assume that the point P is at the origin and the surface is tangent to the xy-plane. In a region about P, we will assume that the surface can be described as the graph of a function f(x, y), and since the curvature depends only on first and second derivatives, we will only consider surfaces that ensure that f has continuous first and second derivatives (i.e., f is $C^2$). Since we will only use information from the first and second derivatives at P, we can also assume that f is quadratic, $f(0, 0) = 0$, $f_x(0, 0) = 0$, and $f_y(0, 0) = 0$. Therefore, f must take the form

(17) $f(x, y) = a x^2 + b x y + c y^2.$

It will be convenient to use vector notation and terminology, so we will work with the parametrization

(18) $\mathbf{x}(x, y) = [\, x, \; y, \; a x^2 + b x y + c y^2 \,].$

The first (partial) derivatives, $\frac{d\mathbf{x}}{dx} = [\, 1, 0, 2ax + by \,]$ and $\frac{d\mathbf{x}}{dy} = [\, 0, 1, bx + 2cy \,]$, are vectors tangent to the surface, and at each point, these two vectors span a plane tangent to the surface at that point. All vectors tangent to the surface at this point will lie in this plane. That $\frac{d\mathbf{x}}{dx}(0, 0) = [\, 1, 0, 0 \,]$ and $\frac{d\mathbf{x}}{dy}(0, 0) = [\, 0, 1, 0 \,]$ reiterate the fact that the surface is tangent to the xy-plane.

The unit normal vector at each point of the surface must, essentially by definition, be perpendicular to the tangent plane. It must be perpendicular to both tangent vectors, and so can be obtained from the cross product.

(19) $\mathbf{n} = \frac{[\, -2ax - by, \; -bx - 2cy, \; 1 \,]}{\sqrt{b^2x^2 + 4bcxy + 4c^2y^2 + 4a^2x^2 + 4abxy + b^2y^2 + 1}}$

We are interested in how much the unit normal vector varies over a small piece of the surface about the origin, and then how it shrinks to zero. The unit vector n ranges over a region on the unit sphere, and what we want is essentially a derivative of n over two dimensions. The appropriate object is a linear function associated with a tangent plane, in particular the plane determined by the partial derivatives of n. These partial derivatives are a bit messy, but we only need to know them at (0, 0). The partial with respect to x is

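(20) $\frac{d\mathbf{n}}{dx} = \frac{\sqrt{b^2x^2 + 4bcxy + 4c^2y^2 + 4a^2x^2 + 4abxy + b^2y^2 + 1} \; [\, -2a, -b, 0 \,]}{\left( \sqrt{b^2x^2 + 4bcxy + 4c^2y^2 + 4a^2x^2 + 4abxy + b^2y^2 + 1} \right)^2} - \frac{[\, -2ax - by, \; -bx - 2cy, \; 1 \,] \left( \frac{1}{2} \right) (2b^2x + 4bcy + 8a^2x + 4aby)}{\left( \sqrt{b^2x^2 + 4bcxy + 4c^2y^2 + 4a^2x^2 + 4abxy + b^2y^2 + 1} \right)^3},$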

which at (0, 0) is

(21) $\frac{d\mathbf{n}}{dx}(0, 0) = [\, -2a, -b, 0 \,].$

Similarly,

(22) $\frac{d\mathbf{n}}{dy}(0, 0) = [\, -b, -2c, 0 \,].$

The partial derivatives $\frac{d\mathbf{n}}{dx}$ and $\frac{d\mathbf{n}}{dy}$ describe the linear approximation to how the unit vector n varies near the origin. For a short distance ε in the x-direction, therefore, the unit normal vector moves by approximately [ −2aε, −bε, 0 ], and for the same distance in the y-direction, it moves by approximately [ −bε, −2cε, 0 ]. This completely determines the linear approximation, so an ε-square on the xy-plane corresponds to a “parallelogram” on the unit sphere under the Gauss map spanned by these vectors. The area of this parallelogram is given by the cross product

(23) $\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -2a & -b & 0 \\ -b & -2c & 0 \end{vmatrix} = \begin{vmatrix} -2a & -b \\ -b & -2c \end{vmatrix}$

This determinant describes how areas under the Gauss map compare to areas in the domain near (0, 0), and so this should define the Gauss curvature.
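The determinant on the right of Equation (23) expands to 4ac − b², and this can be compared with a direct numerical estimate of how the Gauss map scales areas at the origin. The sketch below is only an illustration: the function names, the step size, and the sample coefficients are hypothetical, and the partial derivatives of n are approximated by central differences instead of the exact formulas (21) and (22).

```python
import math

def unit_normal(a, b, c, x, y):
    """Unit normal of z = a*x^2 + b*x*y + c*y^2, as in equation (19)."""
    nx, ny, nz = -2 * a * x - b * y, -b * x - 2 * c * y, 1.0
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def gauss_map_area_ratio(a, b, c, eps=1e-5):
    """Central-difference estimate of the signed area scaling of the Gauss map at (0, 0)."""
    n_xp, n_xm = unit_normal(a, b, c, eps, 0.0), unit_normal(a, b, c, -eps, 0.0)
    n_yp, n_ym = unit_normal(a, b, c, 0.0, eps), unit_normal(a, b, c, 0.0, -eps)
    dn_dx = [(p - m) / (2 * eps) for p, m in zip(n_xp, n_xm)]
    dn_dy = [(p - m) / (2 * eps) for p, m in zip(n_yp, n_ym)]
    # k-component of the cross product dn/dx x dn/dy, as in equation (23)
    return dn_dx[0] * dn_dy[1] - dn_dx[1] * dn_dy[0]

a, b, c = 1.0, 0.5, -2.0                   # hypothetical saddle-shaped example
print(gauss_map_area_ratio(a, b, c), 4 * a * c - b * b)
```

For the saddle chosen here both numbers are negative, in line with the earlier remark that saddle-shaped surfaces have negative Gauss curvature.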

Note that the matrix

(24) $\begin{bmatrix} -2a & -b \\ -b & -2c \end{bmatrix}$

completely describes the linear approximation to the normal vector at (0, 0). As a point passes through the origin, this matrix describes how the corresponding normal vector is changing at the origin. For example, if we move in the direction of [ 1, 0 ] from the origin, then the direction of the unit normal to the surface is changing at the following (vector) rate.

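(25) $\begin{bmatrix} -2a & -b \\ -b & -2c \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -2a \\ -b \end{bmatrix}$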

This is almost, but not quite, a curvature. Specifically, if we considered the curve on the surface above the x-axis, we would have a parabola, $z = ax^2$, and this curve corresponds to the direction determined by the vector [ 1, 0 ]. The curvature for this curve is 2a at the origin, but this comes from the rotation of the normal to the curve in the xz-plane. The normal to the surface may rotate in the y-direction as well, as indicated by the component −b.

5.1. Exercises.

19. Determine the magnitude of the tangent vector $\frac{d\mathbf{x}}{dx}(x, 0)$, and then differentiate with respect to x to verify the claim made above.

20. Verify the claim above that the curve above or below the x-axis has curvature 2a at the origin by computing $\frac{d\mathbf{T}}{ds}$, where $\mathbf{T} = \frac{\frac{d\mathbf{x}}{dx}(x, 0)}{\left\| \frac{d\mathbf{x}}{dx}(x, 0) \right\|}$.

6. Intrinsic aspects of the Gauss curvature

In the discussion leading to the definition of the Gauss curvature, we stumbled across a surprising relationship. If we remove a θ-wedge to form a cone, then the total curvature of the vertex of that cone is also θ. Imagine that you are a 2-dimensional person living on the surface of the cone, who is completely unaware of a third dimension. Without a concept of a third dimension, the sharpness of the vertex would be completely outside of your experiences, just as concepts requiring a fourth dimension lie outside of our 3-dimensional minds. You would perhaps notice that circles around the vertex have smaller circumferences than circles that did not contain the vertex. From this you might be able to see that the vertex of the cone only had radian measure 2π − θ around it, while there are 2π radians around every other point. We will say that the sharpness of the cone is extrinsic (seen from the outside), and the fact that a θ-wedge is missing is intrinsic (seen from the inside).

This illustrates an interesting difference between curves and surfaces. The total curvature of a curve is purely extrinsic, since a 1-dimensional person living in a curve would be totally unaware of it. The total curvature of a surface is both extrinsic and intrinsic. It is equally measurable from outside the surface and from within it. This idea, which originates with Descartes and Gauss, is exploited by Riemann and others, in particular Einstein, to show that while curvature is basically an extrinsic concept, it is possible to talk about the curvature of our space without there being more dimensions.

We can illustrate some aspects of this by looking at the geometry of geodesics on a cone.

CHAPTER 3

Intrinsic Curvature

1. Parallel vectors

Understanding what it means for two vectors in the plane to be parallel is hardly an issue. It is even difficult to explain the concept, since the concept of parallel vectors seems so obvious. Imagine taking a vector in the plane based at the origin. If you were to move it to some other point without altering its direction, then few would argue with the claim that the result is a vector parallel to the original. At issue, however, is what it means for the direction to remain the same and how you would know. For vectors tangent to a sphere, on the other hand, it is impossible in most cases to move the vector in a way that keeps the vector tangent to the sphere and does not change its direction. Here the concept of direction is taken from the direction of a vector in Euclidean 3-space, which most of us would think is intuitively clear. If we were to restrict our attention to the surface of the sphere, and make no reference to an ambient space, the issue is much less clear. If we were two-dimensional creatures living on the sphere with no awareness of an ambient space, we probably would have some notion of moving an object without rotating it. This notion cannot, however, be consistent with the notion of direction in 3-space mentioned above. One possible basis for such a notion is the concept of parallel transport.

Consider the three vectors shown in Figure 1. The angle between each vector and the straight line is the same angle θ. This is consistent with our intuitive notion that all three vectors are parallel. We can phrase this as a trivial axiom: If we move a vector along a straight line and keep the angle between the vector and line constant, then the resulting vector is parallel to the original.

Figure 1. Moving a vector without changing its direction.

If you were a 2-dimensional creature on the sphere, then a great circle would be the object for you corresponding to a straight line. This would be a curve that turns neither to the right nor left. In other words, it does not change direction (as far as you are concerned). If you were to move a vector along a great circle at a constant angle, then you must conclude that the vector did not rotate, and the resultant vector is therefore parallel to the original. If this sphere sat in a 3-dimensional Euclidean space (and there is no real reason to assume that it did), then a 3-dimensional Euclidean creature would see this differently. One of the fundamental notions of the study of manifolds is that the 3-dimensional Euclidean view is not necessarily the correct one. It is simply one of many.

One very important aspect of this notion of parallel vectors on the sphere is that it is dependent on path. We can see this in the following example. Figure 2 shows parts of four great circles. One is the equator, two meet the equator at right angles, and a fourth intersects one of the vertical great circles at a right angle. This forms a quadrilateral with three right angles. We know that the angle sum of this quadrilateral must be greater than 2π, so the fourth angle must be larger than a right angle. In fact, if this sphere has radius 1, then the difference between this angle and a right angle must be equal to the area of the quadrilateral. The quadrilateral is shown in a flattened version to give a different view in Figure 3. We will perform a parallel transport from vertex A to vertex C two ways: first from A to B to C, and then from A to D to C. Suppose the vector in question is tangent to side AD at A and points towards D. It is perpendicular to side AB, so as we parallel transport it to B, it maintains this right angle. This results in a vector tangent to BC at B. Parallel transport along BC entails maintaining a zero angle, and so the resultant vector at vertex C will still be tangent to BC. On the other hand, if we parallel transport to vertex D first, we get a vector tangent to AD at vertex D. This is perpendicular to side DC, so this right angle is maintained as we slide it upwards to vertex C. The result is a vector at C that is perpendicular to DC. We have, therefore, two vectors at C that have equal claim to being parallel to the original vector at A. With a Euclidean bias, we might conclude that this contradiction proves that parallel transport is a flawed concept. From a more enlightened manifold view, however, we would just say that independence of path is a special feature of parallel transport in Euclidean spaces.

Figure 2. A quadrilateral on the sphere.

Figure 3. The quadrilateral flattened. (Figure labels: vertices A, B, C, D and the angle θ.)

Even from a fundamentalist Euclidean point of view, the notion of parallel transport has some value. A few simple calculations show that the angle marked θ in Figure 3 is equal to the difference between the angle sum of the quadrilateral and 2π. This is equal to the total curvature contained within the quadrilateral. This provides a way of computing total curvature, and is basically equivalent to using the angle sum, the turning angles, or the total rotation of a tangent vector.
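The link between parallel transport and total curvature can be tested numerically. The sketch below does not use the quadrilateral of Figure 2; as a simplifying assumption it transports a vector around the spherical triangle cut out by one octant of the unit sphere, whose area is π/2. It also assumes the standard fact that parallel transport along a great-circle arc is given by the rotation of space that carries the arc along itself. All names in the code are hypothetical.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about a unit axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v, k_dot_v = cross(axis, v), dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1 - c) for i in range(3))

def transport(vec, p, q):
    """Parallel transport of a tangent vector from p to q along the shorter great-circle arc.

    Assumes p and q are distinct, non-antipodal unit vectors.
    """
    axis = cross(p, q)
    norm = math.sqrt(dot(axis, axis))
    axis = tuple(x / norm for x in axis)
    return rotate(vec, axis, math.atan2(norm, dot(p, q)))

A, B, N = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)   # vertices of the octant
start = (0.0, 1.0, 0.0)                                      # tangent at A, pointing toward B
v = transport(transport(transport(start, A, B), B, N), N, A)
holonomy = math.acos(max(-1.0, min(1.0, dot(start, v))))
print(holonomy, math.pi / 2)
```

The vector returns to A rotated by a right angle, which matches the area of the octant on the unit sphere.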

We are headed towards a way of computing (actually defining) curvature using derivatives both extrinsically and intrinsically. We will be imposing coordinate systems on surfaces, which, very roughly, means imposing a grid system (à la graph paper) on the surface. In other words, we will be breaking the surface into tiny quadrilaterals, and the two parallel transported vectors just mentioned will have natural interpretations corresponding to second and third derivatives.

CHAPTER 4

Functions

1. Introduction

The point of this book is to gain some understanding of the geometry of space, in particular, a space that we could live in. With that in mind, we will want to assume that the functions we use to describe these spaces are basically well-behaved. This chapter will explain what we mean by that. In order to gain an understanding of the curvature of three-dimensional space, we will first explore the curvature of lower-dimensional spaces. This approach should seem reasonable in that the lower-dimensional spaces are simpler, but that they also require concepts that generalize to the three-dimensional case. A less obvious class of relevant spaces will also provide us with significant insight into the geometry of all the spaces just mentioned. These are spaces with isolated singularities. These singularities will include sharp corners in graphs and cone points on surfaces. The functions we consider, therefore, will have these kinds of characteristics.

One basic principle that we will try to exploit is that linear functions are easier to understand than functions in general, and that straight lines are easier to understand than general curves. Furthermore, finite and discrete objects are easier to understand than are infinite and continuous ones. The general aim of this book is to use what we know about the easier to understand objects to gain some insight into the harder to understand ones. This chapter takes this approach to the study of functions.

The graph of a function of two variables is shown in Figure 1. This graph was produced by a software package called Maple, and we can simplistically describe the process that Maple used as follows. A 25 × 25 grid is imposed on the [ −0.1, 0.1 ] × [ −0.1, 0.1 ] portion of the domain, and function values are computed at each of the lattice points. This provides Maple with the coordinates for 676 points on the surface (a 25 × 25 array of squares makes a 26 × 26 array of lattice points). For each line segment in the grid, Maple draws a line segment between the appropriate points on the surface, essentially projecting the grid lines onto the surface.
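The sampling step described above can be sketched in a few lines. This is not Maple's actual procedure, only an illustration of the lattice it is described as using; the helper names are hypothetical, and since the function plotted in Figure 1 is not given here, a stand-in function is used.

```python
def lattice(lo, hi, n):
    """Return the n + 1 equally spaced lattice values on [lo, hi]."""
    step = (hi - lo) / n
    return [lo + i * step for i in range(n + 1)]

def sample_surface(f, lo=-0.1, hi=0.1, n=25):
    """Evaluate f at every lattice point of an n-by-n grid of squares."""
    xs, ys = lattice(lo, hi, n), lattice(lo, hi, n)
    return [[(x, y, f(x, y)) for x in xs] for y in ys]

def f(x, y):
    # Hypothetical stand-in; the function graphed in Figure 1 is not specified.
    return x * x - y * y

points = sample_surface(f)
print(sum(len(row) for row in points))   # number of lattice points on the surface
```

Joining neighboring points in each row and each column with straight segments gives the wire-frame picture described in the text.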

Looking at the surface in Figure 1, it appears that the Maple graph is a good representation of the surface. The fact that we are looking at a set of straight line segments is not overly obtrusive, and it is easy to accept that this is a representation of a nicely curving surface. It is not inconceivable, therefore, that the line segments themselves contain useful information about the underlying surface.

2. Piecewise-Linear Approximations for Functions of One Variable

We are not interested in studying functions in general, but how we can use functions to describe and understand geometry. We will take a very constrained view of piecewise-linear approximations, therefore. While we will use functions that are defined on the entire real line, at any one time, we will generally only be interested in that function over some closed interval. If we were going to graph a function, for example, we would typically only graph part of it. So we will define a piecewise-linear approximation for real-valued functions over closed intervals.

Figure 1. The graph of a function of two variables z = f(x, y).

Definition 3. Let $f : [a, b] \to \mathbb{R}$. For some positive integer n, we divide [ a, b ] into n equal subintervals, $[x_1, x_2], [x_2, x_3], \ldots, [x_n, x_{n+1}]$. This collection of subintervals will be called a partition, the interval $[x_i, x_{i+1}]$ will be called the i-th subinterval, and the common length of the subintervals $\Delta x = x_{i+1} - x_i = \frac{b - a}{n}$ will be called the mesh. For each subinterval, the line segment from $(x_i, f(x_i))$ to $(x_{i+1}, f(x_{i+1}))$ will be called the i-th segment. The piecewise-linear (PL) approximation of f is the function $\hat{f} : [a, b] \to \mathbb{R}$ whose graph coincides with the collection of all n segments.

Example 1. Consider the function $f(x) = x^2$. The PL approximation of f with n = 4 has lattice { −2, −1, 0, 1, 2 } and mesh ∆x = 1. The four segments and the graph of f are shown in Figure 2. We generally will not make specific use of a formula for $\hat{f}$, but it is possible to come up with one. In this case, for example, we have

(26) $\hat{f}(x) = \begin{cases} -3x - 2 & \text{for } x \in [-2, -1], \\ -x & \text{for } x \in [-1, 0], \\ x & \text{for } x \in [0, 1], \\ 3x - 2 & \text{for } x \in [1, 2]. \end{cases}$
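Definition 3 translates directly into a short program. The following is a minimal sketch, with hypothetical names, that builds the PL approximation from the lattice values and evaluates it on each segment; it is meant only to reproduce the four formulas in (26), not to be an efficient implementation.

```python
def pl_approximation(f, a, b, n):
    """Return the PL approximation of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n                                   # the mesh
    xs = [a + i * dx for i in range(n + 1)]            # lattice points
    ys = [f(x) for x in xs]

    def f_hat(x):
        if not a <= x <= b:
            raise ValueError("x lies outside [a, b]")
        i = min(int((x - a) / dx), n - 1)              # index of the segment containing x
        slope = (ys[i + 1] - ys[i]) / dx
        return ys[i] + slope * (x - xs[i])

    return f_hat

f_hat = pl_approximation(lambda x: x * x, -2.0, 2.0, 4)
print(f_hat(-1.5), f_hat(-0.5), f_hat(0.5), f_hat(1.5))
```

The printed values can be compared with −3x − 2, −x, x, and 3x − 2 evaluated at the same points.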

Figure 2. The PL approximation of $f(x) = x^2$ with a mesh of 1.

The PL approximation of f with n = 10 gives a more accurate looking graph, as shown in Figure 3.

Figure 3. The PL approximation of $f(x) = x^2$ with a mesh of 0.4.

As can be seen in Example 1, increasing n (or equivalently decreasing ∆x) makes the PL approximation $\hat{f}$ more closely resemble f. We will want to make the assumption that the difference between the two functions can be made arbitrarily small. This is not necessarily the case, as can be seen in the next example, but we will be able to make assumptions of this type by restricting our attention to sufficiently nice functions, as will be explored throughout this chapter.
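The improvement with increasing n can be quantified for the function of Example 1. The sketch below, with hypothetical names, samples each segment at evenly spaced points and records the largest gap between the function and its PL approximation; a sampled maximum is only an estimate of the true maximum in general.

```python
def pl_max_error(f, a, b, n, samples_per_segment=50):
    """Largest sampled gap between f and its PL approximation on [a, b]."""
    dx = (b - a) / n
    worst = 0.0
    for i in range(n):
        x0, x1 = a + i * dx, a + (i + 1) * dx
        y0, y1 = f(x0), f(x1)
        for j in range(samples_per_segment + 1):
            t = j / samples_per_segment
            x = x0 + t * (x1 - x0)
            chord = y0 + t * (y1 - y0)        # height of the segment above x
            worst = max(worst, abs(f(x) - chord))
    return worst

def square(x):
    return x * x

for n in (4, 10, 100):
    print(n, pl_max_error(square, -2.0, 2.0, n))
```

For this particular function the gap on each segment is largest at the midpoint, where it equals (∆x)²/4, so the sampled values shrink quadratically with the mesh.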

Example 2. As can be seen in Figure 4, the PL approximation experiences some difficulty in following the graph of $f(x) = x \sin\left(\frac{1}{x}\right)$ near x = 0, even with n = 100. There are an infinite number of oscillations in the graph in any interval about x = 0, so there is no way that a finite number of segments can portray this to any great degree of satisfaction. As mentioned, we will seek to avoid functions with characteristics such as these.

3. Uniform Continuity

As we go through this chapter, we will be laying out the conditions we expect our functions to satisfy. Our underlying goal is to build an understanding of smooth geometry, so at the very least, we might expect our functions to be continuous. That we will be making use of PL approximations also speaks to the need for the assumption of continuity. We will make important use of non-continuous functions, but the discontinuities will be isolated and simple in nature.

Figure 4. The graph of $x \sin\left(\frac{1}{x}\right)$.

We will be making occasional reference to the definition of continuity, so let us state it here.

Definition 4. For a function $f : [a, b] \to \mathbb{R}^n$ and an x ∈ [ a, b ], f is continuous at x if for every ε > 0, there is a δ > 0 such that whenever |∆x| < δ (and x + ∆x ∈ [ a, b ]),

(27) | f(x+ ∆x) − f(x) | < ε.

For n > 1, we will take | f(x + ∆x) − f(x) | to mean the magnitude of this difference as vectors. If this definition is satisfied at each x ∈ [ a, b ], we will say that f is continuous on [ a, b ].

Note that if f is continuous on an interval, this definition allows a different δ for each x. This will be more than a little inconvenient for us, so we would like a slightly stronger notion of continuity. This will be the following.

Definition 5. For a function $f : [a, b] \to \mathbb{R}^n$, f is said to be uniformly continuous on [ a, b ] if for every ε > 0, there is a δ > 0 such that for every x ∈ [ a, b ], whenever |∆x| < δ (and x + ∆x ∈ [ a, b ]),

(28) | f(x+ ∆x) − f(x) | < ε.

The main point of this definition is that the same δ works for any x. From our experience in calculus, we are familiar with a wide range of continuous functions and a few non-continuous ones. We may not, however, be as comfortable with determining which functions are uniformly continuous. It turns out that this will not be a concern, since we will almost exclusively be interested in functions over some closed interval. This is a result of the following theorem, about which more details can be found in a book on topology or real analysis.

Theorem 4. Let $f : A \subset \mathbb{R}^m \to \mathbb{R}^n$ be a continuous function. If A is closed and bounded, then f is uniformly continuous.

Uniform continuity is actually less than we desire. The function of Example 2 is continuous everywhere, and therefore by Theorem 4, is uniformly continuous over any closed interval about x = 0. So while no graph can capture the oscillations present near x = 0, since the magnitudes of the oscillations become very small, the graph can stay close to the sampling points. Requiring uniform continuity, therefore, will not exclude all of the functions we would want to exclude. It should be emphasized that we want uniform continuity for use in proofs, not necessarily to exclude bad functions.
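The claim that the graph can stay close to the sampling points, even though it cannot follow the oscillations, can be explored numerically. The sketch below uses hypothetical names, takes the interval [ −1, 1 ] as an arbitrary choice, and assumes the function of Example 2 is given the value 0 at x = 0. It estimates the largest gap between the function and its PL approximation by sampling, so the reported numbers are lower estimates of the true gaps.

```python
import math

def f_osc(x):
    """The function x*sin(1/x) of Example 2, extended by the value 0 at x = 0."""
    return 0.0 if x == 0 else x * math.sin(1 / x)

def pl_max_error(f, a, b, n, samples_per_segment=50):
    """Largest sampled gap between f and its PL approximation on [a, b]."""
    dx = (b - a) / n
    worst = 0.0
    for i in range(n):
        x0, x1 = a + i * dx, a + (i + 1) * dx
        y0, y1 = f(x0), f(x1)
        for j in range(samples_per_segment + 1):
            t = j / samples_per_segment
            worst = max(worst, abs(f(x0 + t * (x1 - x0)) - (y0 + t * (y1 - y0))))
    return worst

coarse = pl_max_error(f_osc, -1.0, 1.0, 10)
fine = pl_max_error(f_osc, -1.0, 1.0, 1000)
print(coarse, fine)
```

The gap for the finer mesh is noticeably smaller, even though neither approximation resolves the infinitely many oscillations near x = 0.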

4. Differentiation in One Variable

All of our work with differentiation will extend the basic notion of the derivative studied in calculus. We will begin with the definition.

Definition 6. For the function $f : [a, b] \to \mathbb{R}$, and for any x ∈ ( a, b ), let f′(x) be defined by

(29) $f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x},$

if the limit exists. For the endpoints a and b, the derivative is defined by

(30) $f'(a) = \lim_{\Delta x \to 0^+} \frac{f(a + \Delta x) - f(a)}{\Delta x},$

and

(31) $f'(b) = \lim_{\Delta x \to 0^-} \frac{f(b + \Delta x) - f(b)}{\Delta x},$

where the first involves a limit from the right and the second a limit from the left. These will sometimes be specifically referred to as right- and left-sided derivatives. At each x for which the derivative exists (including a and b), we will say that f is differentiable at x. If f is differentiable at every point of [ a, b ], we will say that f is differentiable on [ a, b ].

At a particular value of x, the number f′(x) is typically associated with the slope of a tangent line. Let us look a bit at what that means. If the limit in (29) exists, then for any ε > 0, there is a δ > 0 such that as long as | ∆x | < δ,

(32) $\left| f'(x) - \frac{f(x + \Delta x) - f(x)}{\Delta x} \right| < \varepsilon.$

In other words, for any epsilon, there is an interval (−δ, δ) such that

(33) $| f'(x) \Delta x + f(x) - f(x + \Delta x) | < \varepsilon \, |\Delta x|$

for any ∆x in this interval. This tells us that the linear function t(∆x) = f′(x)∆x + f(x) is a reasonable approximation to the function F(∆x) = f(x + ∆x) for a fixed value of x. Since ε is arbitrary, no other linear function will fit as well. If f is differentiable at x, therefore, then there is a unique tangent line that fits the curve better than any other line. Increasing or decreasing the slope, as in Figure 5, results in a line that does not fit as well, so intuitively, we see a certain amount of symmetry. If the limit in the derivative definition does not exist, then a line cannot be singled out as fitting better than the rest. In Figure 6, we see a point of non-differentiability where a single line cannot fit the curve on both sides of the point as we are accustomed to seeing in a tangent line, and the most “symmetric” line does not fit the curve very well at all. We do see lines that look tangent on one side of the non-differentiable point or the other, so the function may be differentiable from the right or left at this point.

Figure 5. Lines with slopes different from the derivative do not fit as well.

Figure 6. At a non-differentiable point, there is no single best linear approximation, and no single line fits very well.

The tangent line, if it exists, is closely associated with a linear function. Since it is the slope of this function that is most important to us, we will often talk about this linear function in terms of a coordinate system whose origin is at the point (x, f(x)).

Definition 7. If f is differentiable at the point x, then the differential of f at x is the linear function

(34) df(dx) = f ′(x)dx.

Note that the variable names in this coordinate system are dx and dy and that the expression f′(x) is a constant.

Note that for the differential, the origin for the dxdy-coordinate system need not be thought of as being at the point (x, f(x)) as shown in Figure 7. Compare this with the notion of putting the base of a tangent vector at a relevant point on the graph. In fact, the differential (as well as the tangent line) can be identified with the collection of all possible tangent vectors. As a result, where a vector has both a direction and a magnitude, the differential has only direction. We will use the differential, therefore, as a way of generalizing slope to higher dimensions.

Figure 7. We can think of the origin of the dxdy-coordinate system as being based at the relevant point on the curve, but we don’t have to. (Figure labels: dx, dy, df.)

Example 3. Consider the function $f(x) = x^2 + 1$. Its derivative is f′(x) = 2x, and f′(0) = 0. The slope of the tangent line at the point (0, 1) is therefore 0, and the equation of the tangent line is t(x) = 0x + 1 (note that the differential at this point is df = 0 dx). For ε = .1 in (33), the graph of f must lie between the lines y = ±.1x + 1 over some interval about x = 0. This is shown in Figure 8. No matter how small we make ε, there will be some interval about x = 0 in which the parabola lies between the lines y = ±εx + 1. This is a geometric description of our concept of a tangent line.

Figure 8. For the function $f(x) = x^2 + 1$, the line y = 1 is tangent to the curve at x = 0. Here the graph of f lies between the lines y = ±.1x + 1 over some interval close to x = 0.
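The wedge condition of Example 3 can be tested by brute force. In the sketch below, the function name and the sampling scheme are hypothetical; it checks the inequality in (33) at sampled values of ∆x on both sides of the point, which can suggest but not prove that the inequality holds on the whole interval.

```python
def within_wedge(f, x0, slope, eps, delta, samples=1000):
    """Check |slope*dx + f(x0) - f(x0 + dx)| < eps*|dx| at sampled 0 < |dx| < delta."""
    for k in range(1, samples):
        dx = delta * k / samples
        for step in (dx, -dx):
            if not abs(slope * step + f(x0) - f(x0 + step)) < eps * abs(step):
                return False
    return True

def f(x):
    return x * x + 1

print(within_wedge(f, 0.0, 0.0, eps=0.1, delta=0.1))
print(within_wedge(f, 0.0, 0.0, eps=0.1, delta=0.2))
```

For this parabola the left side of (33) is (∆x)², so the inequality holds exactly when |∆x| < ε. The first check therefore succeeds and the second fails, since the interval of radius 0.2 is too wide for ε = .1.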


Example 4. Differentiable functions allow graph behavior that lies beyond what we would like to consider. For example, note that the function $g(x) = -x^2 + 1$ has the same tangent line at x = 0 as the function f just mentioned. It follows that any function that lies between f and g must also have the same tangent line. Consider the function h defined below and graphed in Figure 9 with f, g, and t.

(35) $h(x) = \begin{cases} x^2 \sin\left( \frac{1}{x^2} \right) + 1, & x \neq 0, \\ 1, & x = 0. \end{cases}$

Clearly h lies between any pair of lines y = ±εx + 1 over some interval, since both f and g do. It appears in the graph depicted in Figure 9, however, that the tangents to h near x = 0 can have slopes of high magnitude. This is confirmed by a computation of the derivative. The derivative of h away from zero can be found using the basic techniques of calculus, so the derivative of h must be

Figure 9. The graph of h lies between the graphs of f and g, so it has the same tangent at x = 0.

(36) $h'(x) = \begin{cases} 2x \sin\left( \frac{1}{x^2} \right) - \frac{2}{x} \cos\left( \frac{1}{x^2} \right), & x \neq 0, \\ 0, & x = 0, \end{cases}$

and it is clear that h′(x) takes large values arbitrarily close to x = 0. The slopes of the tangent lines to h vary so wildly that h′ is not even continuous at x = 0. The point of this book is to study the relationship between curvature and the geometry of curves and surfaces and to understand what it might mean for the universe in which we live to have curvature. As a result, we are most interested in objects that curve very gently. As this last example illustrates, differentiability alone does not guarantee the gentle curving we desire. The oscillations that are seen in the graph of Figure 9 are not really the problem. The problem is that the oscillations become more wild as we approach x = 0, and this is sufficient to make the derivative h′ not continuous. This particular example can be eliminated, of course, if we only considered functions with continuous derivatives.
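The growth of h′ near zero is easy to see by evaluating formula (36) along a suitable sequence. The sketch below uses the points $x_k = 1/\sqrt{2\pi k}$, chosen here (not in the text) because the sine term vanishes and the cosine term equals 1 there, so that h′($x_k$) should be close to −2/$x_k$.

```python
import math

def h_prime(x):
    """Derivative of h from equation (36)."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x**2) - (2 / x) * math.cos(1 / x**2)

for k in (1, 100, 10000):
    x_k = 1 / math.sqrt(2 * math.pi * k)    # points where 1/x^2 = 2*pi*k
    print(x_k, h_prime(x_k), -2 * math.sqrt(2 * math.pi * k))
```

As k grows, the points approach 0 while the slopes grow without bound in magnitude, so h′ cannot be continuous at x = 0 even though h′(0) = 0.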

As we have just seen in Example 4, a function can be differentiable with a non-continuous derivative. We do not want to consider functions that are this wild, so we will require that our functions have continuous derivatives unless specifically noted otherwise. Such functions are called continuously differentiable or $C^1$.


Definition 8. For a function $f : [a, b] \to \mathbb{R}$, if f is differentiable on [ a, b ] and f′ is continuous on [ a, b ], then f is said to be continuously differentiable on [ a, b ]. We will also use the notation $C^1$ for continuously differentiable functions. If the second derivative is also continuous (more specifically, if f′ is continuously differentiable), then f is $C^2$. Similarly, we may speak of functions that are $C^n$ for any positive integer n, or even $C^\infty$ (f and all of its derivatives are continuously differentiable). The notation $C^0$ is sometimes used to describe continuous functions.

It is dangerous to place too much weight on what a differentiable or a continuously differentiable function might look like, but in general, we can think of a $C^1$ function as looking smoother than a function that was differentiable, but not $C^1$. A $C^2$ function would look smoother still, but the differences become much more subtle as we consider higher levels of continuous differentiability.

For a function $f$ to be differentiable at $x$, we consider the slopes of secant lines. We can imagine ourselves at the point $(x, f(x))$ seeing an object approaching us along the graph. The expression

$$\frac{f(x + dx) - f(x)}{dx} \tag{37}$$

describes the observed direction we look in to see the object. For $f$ to be differentiable, we would expect this direction to have a limit, and this limit would agree with the direction for an object approaching from the opposite direction. The limit would correspond to the directions determined by the tangent to the curve. If the object were a car with its headlights on, the direction the headlights pointed in would correspond to the tangent line at the point of the graph occupied by the car. These directions correspond to the values $f'(x + dx)$. For the function $h$ described above, the direction of the headlights would swing wildly back and forth between directions perpendicular to the tangent. If $f$ is continuously differentiable, then these values must limit on $f'(x)$. In other words, the direction the headlights point must approach the direction of the tangent line, and they would always be pointing in your general direction.

5. Derivatives and PL Approximations

Given a PL approximation of a function $f$, the segments are each a portion of a secant line. At each individual lattice point, the slope of the secant through $(x_i, f(x_i))$ and $(x_i + \Delta x, f(x_i + \Delta x))$ limits on the derivative as $\Delta x \to 0$. We would expect, therefore, that for very small values of the mesh, the slopes of the segments will very closely approximate the derivatives at the lattice points. We can establish this easily with reference to the Mean Value Theorem, which we state here.

Mean Value Theorem. If $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then there is a point $c \in (a, b)$ such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}. \tag{38}$$

Let $f$ be a continuously differentiable function on $[a, b]$, and let $\bar{f}$ be the PL approximation of $f$ with mesh $\Delta x$ and lattice points $\{x_1, x_2, \ldots, x_{n+1}\}$. On any particular segment, the Mean Value Theorem states that there must be a $c_i \in (x_i, x_{i+1})$ such that

$$f'(c_i) = \frac{f(x_{i+1}) - f(x_i)}{\Delta x}. \tag{39}$$


Since the function $f'$ is continuous, it is uniformly continuous, so given any $\varepsilon > 0$, there is a $\delta > 0$ such that if $\Delta x < \delta$, then $| f'(c_i) - f'(x_i) | < \varepsilon$ for all $i$. It follows that, for all $i$,

$$\left| f'(x_i) - \frac{f(x_{i+1}) - f(x_i)}{\Delta x} \right| < \varepsilon. \tag{40}$$

We can conclude, therefore, that the slopes of the segments of the PL approximation are good approximations of the derivatives of $f$ at the lattice points, and that the error can be made arbitrarily small by reducing the mesh.
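The estimate (40) is easy to test numerically. The following Python sketch is an illustration under our own assumptions: the helper name `pl_slope_errors` and the choice of $f(x) = \sin x$ on $[0, \pi]$ are not part of the text. It computes the largest gap between $f'(x_i)$ and the slope of the segment starting at $x_i$ for several meshes.

```python
import math

def pl_slope_errors(f, fprime, a, b, n):
    """Largest |f'(x_i) - (f(x_{i+1}) - f(x_i)) / dx| over the n segments."""
    dx = (b - a) / n
    worst = 0.0
    for i in range(n):
        xi = a + i * dx
        segment_slope = (f(xi + dx) - f(xi)) / dx
        worst = max(worst, abs(fprime(xi) - segment_slope))
    return worst

# Illustrative choice: f = sin, f' = cos, on [0, pi].
errors = {n: pl_slope_errors(math.sin, math.cos, 0.0, math.pi, n)
          for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err)
```

For this example the worst error shrinks roughly in proportion to the mesh, which is consistent with the claim that it can be made arbitrarily small.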

Definition 9. We will define the PL differential of $f$ at $x_i$ (and, if it is convenient to have done so, at any point in $[x_i, x_{i+1})$) to be

$$Df(x_i) = f(x_{i+1}) - f(x_i). \tag{41}$$

Figure 10. The vector $Df$.

Note that

$$\lim_{\Delta x \to 0} \frac{Df(x_i)}{\Delta x} = f'(x_i), \tag{42}$$

where $\Delta x = \frac{b - a}{n}$ for positive integers $n$, and $n \to \infty$. If the function $f$ were constant, its graph would be a horizontal straight line, and the PL differential would be 0. As shown in Figure 10, the PL differential measures the increase in $f$ (or the decrease) as we move from one lattice point to the next.

We can talk about the function $f''$ in a similar way. If $f$ is $C^2$, then for each $i$, there is a $c'_i \in (x_i, x_{i+1})$ such that

$$f''(c'_i) = \frac{f'(x_{i+1}) - f'(x_i)}{\Delta x}. \tag{43}$$

Since $f''$ is continuous, there is a $\delta' > 0$ smaller than the $\delta$ mentioned above such that if $\Delta x < \delta'$,

$$| f''(c'_i) - f''(x_i) | < \varepsilon. \tag{44}$$

Since

$$\left| f'(x_i) - \frac{f(x_{i+1}) - f(x_i)}{\Delta x} \right| < \varepsilon, \tag{45}$$

and

$$\left| f'(x_{i+1}) - \frac{f(x_{i+2}) - f(x_{i+1})}{\Delta x} \right| < \varepsilon, \tag{46}$$

we can conclude that

$$\left| f''(x_i) - \frac{\dfrac{f(x_{i+2}) - f(x_{i+1})}{\Delta x} - \dfrac{f(x_{i+1}) - f(x_i)}{\Delta x}}{\Delta x} \right| = \left| f''(x_i) - \frac{Df(x_{i+1}) - Df(x_i)}{\Delta x^2} \right| < 3\varepsilon. \tag{47}$$

With this in mind, we will make the following definition.

Definition 10. The second PL differential of $f$ is defined as

$$D^2f(x_i) = Df(x_{i+1}) - Df(x_i). \tag{48}$$

Note that

$$\lim_{\Delta x \to 0} \frac{D^2f(x_i)}{\Delta x^2} = f''(x_i). \tag{49}$$
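The limit (49) can be checked on a concrete function. The sketch below is only an illustration: the function names and the choice of $f(x) = \sin x$ at $x_i = 1$ are our assumptions. It forms $D^2f(x_i)/\Delta x^2$ for a few meshes and records its distance from both $f''(x_i)$ and $f''(x_{i+1})$.

```python
import math

def pl_differential(f, x, dx):
    """Df(x) = f(x + dx) - f(x)."""
    return f(x + dx) - f(x)

def second_pl_differential(f, x, dx):
    """D^2 f(x_i) = Df(x_{i+1}) - Df(x_i)."""
    return pl_differential(f, x + dx, dx) - pl_differential(f, x, dx)

x_i = 1.0  # illustrative lattice point
results = []
for dx in (0.1, 0.01, 0.001):
    quotient = second_pl_differential(math.sin, x_i, dx) / dx**2
    err_at_xi = abs(quotient - (-math.sin(x_i)))         # compare with f''(x_i)
    err_at_next = abs(quotient - (-math.sin(x_i + dx)))  # compare with f''(x_{i+1})
    results.append((dx, quotient, err_at_xi, err_at_next))

for row in results:
    print(row)
```

In this example the quotient approaches $f''(x_i)$ as the mesh shrinks, and at each mesh it sits closer to $f''(x_{i+1})$ than to $f''(x_i)$.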

It should be noted that $D^2f(x_i)$ is probably a better approximation of $f''(x_{i+1})$ than it is of $f''(x_i)$, but this formulation will be more convenient for us. In Figure 11, if the graph of $f$ were a straight line, then we would expect $Df$ to be constant, and $Df(x_i)$ and $Df(x_{i+1})$ would be the same. In this case, the graph of $f$ would continue to the point marked $A$. Instead, the graph of $f$ proceeds to the point marked $B$. The difference between $A$ and $B$ is the quantity $D^2f(x_i)$. Therefore, $D^2f$ and $f''$ measure how much the graph is not a straight line, and so they are measures of curvature in some way. They measure the deviation from straightness in the vertical direction, however. These values will change if the graph is rotated, for example, so they are not convenient quantities to use to describe a curve's shape. They are easy to compute, however, and they contain the information necessary to describe a curve's curvature, so they will be of use to us.

Figure 11. The distance $D^2f(x_i)$ is equal to the sum of the distances $Df(x_{i+1})$ and $-Df(x_i)$.

6. Parametrizations of Curves

Figure 12. The vector $Df$ (labeled points: $(x_i, f(x_i))$ and $(x_{i+1}, f(x_{i+1}))$).

Figure 13. The vector $D^2f$.


7. Functions of Two Variables

Earlier in the chapter, we discussed briefly the graph of a function of two variables (see Figure 1). The computer representation of the graph consists of a collection of line segments. These segments will play a role in our study of functions of two variables as the segments of a PL approximation did with functions of one variable. The segments form a grid on the surface, breaking the surface into quadrilaterals that we will call grid parallelograms. In general, these grid parallelograms are not true parallelograms, which, among other things, always lie in a plane. In fact, while it would seem natural to use the segments directly to define a PL approximation to a surface, the four vertices of a grid parallelogram generally will not lie in a plane. Since any set of three points is always coplanar, we can, in some sense, fold each grid parallelogram along a diagonal to fit the segments. We can, therefore, approximate the graph of a function of two variables with a collection of flat triangular disks. From this we can naturally find a piecewise linear function.

Definition 11. For the function $f : [a, b] \times [c, d] \to \mathbb{R}$, we can define the piecewise-linear (PL) approximation as follows. Given positive integers $m$ and $n$, we can break $[a, b]$ into $m$ equal subintervals with lattice points $\{x_1, x_2, \ldots, x_{m+1}\}$, and we can break $[c, d]$ into $n$ equal subintervals with lattice points $\{y_1, y_2, \ldots, y_{n+1}\}$. From these, we can divide the rectangle $[a, b] \times [c, d]$ into $mn$ equal rectangles $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$ with width $\Delta x$ and height $\Delta y$. We will say that the mesh is $\Delta x \times \Delta y$. The set of rectangles is called the partition, and the points $(x_i, y_j)$ are the lattice points. The $(i, j)$-th rectangle has vertices $(x_i, y_j)$, $(x_{i+1}, y_j)$, $(x_{i+1}, y_{j+1})$, and $(x_i, y_{j+1})$. There is a flat (planar) triangular disk with vertices $(x_i, y_j, f(x_i, y_j))$, $(x_{i+1}, y_j, f(x_{i+1}, y_j))$, and $(x_i, y_{j+1}, f(x_i, y_{j+1}))$, and there is another flat triangular disk with vertices $(x_{i+1}, y_j, f(x_{i+1}, y_j))$, $(x_{i+1}, y_{j+1}, f(x_{i+1}, y_{j+1}))$, and $(x_i, y_{j+1}, f(x_i, y_{j+1}))$. Together these form the $(i, j)$-th grid parallelogram. The PL approximation of $f$ is the function $\bar{f} : [a, b] \times [c, d] \to \mathbb{R}$ whose graph consists of all of the grid parallelograms.

In practice, we will use only the grid segments from the PL approximation, and the pair of triangular disks that make up each grid parallelogram along with the diagonal between them will be of only secondary importance. Using the notation $| (x_1, y_1) - (x_2, y_2) |$ for the distance between the two points, we can define continuity for a function of two variables as follows.

Definition 12. For $f : [a, b] \times [c, d] \to \mathbb{R}$, $f$ is continuous at $(x, y)$ if for every $\varepsilon > 0$, there is a $\delta > 0$ such that whenever $| (x, y) - (x + \Delta x, y + \Delta y) | < \delta$, we have

$$| f(x + \Delta x, y + \Delta y) - f(x, y) | < \varepsilon. \tag{50}$$

If $\delta$ can be chosen independently of the point $(x, y)$, then $f$ is said to be uniformly continuous. By Theorem 4, $f$ is uniformly continuous if it is continuous on $[a, b] \times [c, d]$.

8. Differentiability for Functions of Two Variables

Our notion of differentiability for functions of more than one variable will be based on the concept of a partial derivative. Given a function $f$ in several variables, $f : \mathbb{R}^3 \to \mathbb{R}$ for example, we can take the derivative of $f$ with respect to one of the variables by holding the others constant.


Consider the function $f(x, y) = 3xy + x^4$. Taking $y$ to be constant, we can differentiate with respect to $x$ to obtain the expression $3y + 4x^3$. We will use the notations

$$\frac{df}{dx} = f_x = f_x(x, y) = 3y + 4x^3 \tag{51}$$

for the partial derivative with respect to $x$. It is common to use curly ∂'s in the notation for partial derivatives, but since we will be using partial derivatives almost exclusively, there is no significant advantage to making a distinction between partial derivatives and regular ones. Of course, the partial derivative with respect to $y$ would be written as

$$\frac{df}{dy} = f_y = f_y(x, y) = 3x. \tag{52}$$
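Holding one variable constant is exactly what a one-variable difference quotient does, so (51) and (52) can be checked numerically. The sketch below is a hedged illustration: the helper names, the sample point $(2, -1)$, and the step size are our assumptions.

```python
def f(x, y):
    return 3 * x * y + x**4

def partial_x(f, x, y, h=1e-6):
    """Centered difference quotient in x, with y held constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Centered difference quotient in y, with x held constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 2.0, -1.0                # illustrative sample point
fx_numeric = partial_x(f, x0, y0)
fy_numeric = partial_y(f, x0, y0)
fx_exact = 3 * y0 + 4 * x0**3     # formula (51)
fy_exact = 3 * x0                 # formula (52)
print(fx_numeric, fx_exact)
print(fy_numeric, fy_exact)
```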

A small portion of the graph of this function is shown in Figure 14. As we have discussed, this depiction of the graph imposes gridlines on the surface. What we see is a collection of straight line segments, each an edge shared by two grid parallelograms. Half of the gridlines correspond to fixed values of $y$ and the other half to fixed values of $x$. For example, if we were to fix $y$ to a value of zero, this would single out those points lying on a curve corresponding to the function values $f(x, 0)$. The points $(x, 0, f(x, 0))$ all lie in the $xz$-plane, and if $y = 0$ is one of the lattice coordinates, the corresponding segments would form a PL approximation of $f(x, 0)$. The partial derivatives $f_x(x, 0)$ can be interpreted as slopes in the $xz$-plane, and these would be approximated by the slopes of the segments. Fixing $y = 1$ singles out the gridline on the closest face of the cube in Figure 14. The values of $f_x(x, 1)$ can be interpreted as slopes in this plane.

Figure 14. Graph of $f(x, y) = 3xy + x^4$.

For a function $f$ of one variable, being differentiable implies the continuity of $f$. This does not apply to the partial derivatives of a function of more than one variable. A standard counterexample is as follows.

$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x, y) \neq (0, 0), \\ 0, & (x, y) = (0, 0). \end{cases} \tag{53}$$

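The partial derivatives of this function are

$$f_x(x, y) = \begin{cases} \dfrac{y^3 - x^2 y}{(x^2 + y^2)^2}, & (x, y) \neq (0, 0), \\ 0, & (x, y) = (0, 0), \end{cases} \tag{54}$$

$$f_y(x, y) = \begin{cases} \dfrac{x^3 - x y^2}{(x^2 + y^2)^2}, & (x, y) \neq (0, 0), \\ 0, & (x, y) = (0, 0). \end{cases} \tag{55}$$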

The partial derivatives exist at all points, and in particular at $(0, 0)$. The function $f$ is not continuous at $(0, 0)$, however. Since $f(t, t) = \frac{1}{2}$ for all $t \neq 0$, there are function values equal to $\frac{1}{2}$ arbitrarily close to $(0, 0)$. A portion of the surface is shown in Figure 15. The graph is not accurate around the discontinuity, but some sense of the surface can be obtained from the picture. In fact, discontinuities can often be seen in a graph such as this with badly distorted grid parallelograms near the discontinuity. Note that the $x$- and $y$-axes lie on the surface and that the horizontal lines with points $(t, t, \frac{1}{2})$ and with points $(t, -t, -\frac{1}{2})$ also lie on the surface everywhere except for when $t = 0$. In particular, for any $\delta > 0$ and any particular value between $-\frac{1}{2}$ and $\frac{1}{2}$, there is a point $(x, y)$ within $\delta$ of $(0, 0)$ at which $f$ takes that value.

Figure 15. Graph of $f(x, y) = \frac{xy}{x^2 + y^2}$.

If we look at a few of the gridlines, we see that these are nicely smooth individually. For example, in Figure 16, graphs in the $xz$-plane corresponding to fixed values of $y = 1, .5, .1, .02$ are shown. Each is the graph of a differentiable function. In fact, they are continuously differentiable as functions of one variable. For $y = 0$, $f(x, 0) = 0$, so this gridline is also nicely smooth. It is the transition to the gridline at $y = 0$ that is not continuous. Considering the gridlines where $x$ is held constant shows a similar situation. What we see, therefore, is that the partial derivatives only address the differentiability of the individual gridlines, so the continuity of the function of two variables is not necessarily guaranteed.

Figure 16. Gridlines with $y = 1, .5, .1, .02$.
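The behavior of the counterexample (53) near the origin can be seen directly by evaluating it along the axes and along the two diagonals. This is a minimal sketch; the variable names and the sample values of $t$ are our own choices.

```python
def f(x, y):
    """The counterexample (53)."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x**2 + y**2)

steps = [10.0**(-k) for k in range(1, 8)]   # t = 0.1, 0.01, ..., 1e-7

# Difference quotients at the origin along each axis (these define fx and fy there).
fx_at_origin = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in steps]
fy_at_origin = [(f(0.0, h) - f(0.0, 0.0)) / h for h in steps]

# Function values along the lines (t, t) and (t, -t).
diagonal = [f(t, t) for t in steps]
antidiagonal = [f(t, -t) for t in steps]

print(fx_at_origin, fy_at_origin)
print(diagonal, antidiagonal)
```

The difference quotients along the axes are all zero, so both partial derivatives exist at the origin, while the values along the diagonals stay at $\frac{1}{2}$ and $-\frac{1}{2}$ no matter how small $t$ is.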

Our interests lie in the geometry of nice surfaces, and we would like to avoid discontinuities such as the one exhibited by this last example. It turns out that if the partial derivatives are continuous (as functions of two variables), then the original function is also continuous. With this in mind, we make the following definition.

Definition 13. If the partial derivatives of a function $f$ are continuous, we will call $f$ continuously differentiable or $C^1$. If all of the second partial derivatives are also continuous, $f$ is $C^2$. As before, we can also speak of $C^n$ functions in general and $C^\infty$ functions.

We can roughly say that if $n > m$, then $C^n$ functions are more smooth than $C^m$ functions. For the most part, we will assume that whenever we speak of the derivative of a function, that derivative will be continuous. We will purposely encounter instances of non-differentiability and non-continuity, and these will be very important and very specific. Otherwise, if we speak of a third partial derivative of a function $f$, for example, we will implicitly assume that $f$ is at least $C^3$.

For the non-continuous function $f$ in the example above, it is difficult to imagine a plane tangent to the surface at $(0, 0)$ (except, perhaps, for a vertical one). This function was not $C^1$, however, and one important consequence of a function of two variables being $C^1$ is that it is always possible to determine a tangent plane in a reasonable way. Let us examine this in some detail, since this will be central to much of what we study.

Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$. If the first partial derivatives exist, then at a point $(a, b)$, $f_x(a, b)$ and $f_y(a, b)$ can be interpreted as being the slopes of lines tangent to the gridlines at $(a, b, f(a, b))$. It should seem reasonable that if a plane were to be tangent to the surface at this point, then the two tangent lines would lie on this tangent plane. In the example above, this would mean that the $xy$-plane would be the only possible candidate for a tangent plane at the origin, and we would not want to consider this to be the case. No reasonable tangent plane exists in this situation. Again, however, the function in the example was not $C^1$. In any case, we can assume that the plane determined by the two partial derivatives should be the only possible candidate for a tangent plane. We wish to show that if $f$ is $C^1$, then we can reasonably call this plane a tangent plane.

Let us now suppose that $f$ is indeed $C^1$ (as always, we assume that $f$ is differentiable in some region around the point under consideration). For convenience, suppose that $f_x(a, b) = m$ and $f_y(a, b) = n$. We can reasonably call these the $x$- and $y$-slopes at $(a, b)$. The plane determined by these two slopes is the graph of a linear function $t(x, y) = mx + ny + c$, where $c$ is the constant that makes $t(a, b) = f(a, b)$. Consider a nearby point $(a + dx, b + dy)$, and we will attempt to estimate the difference between $t(a + dx, b + dy)$ and $f(a + dx, b + dy)$. Our strategy will be to consider $f(a + dx, b)$ first and then $f(a + dx, b + dy)$ (we could just as easily start with $f(a, b + dy)$). We can use the partial derivative $f_x(a, b)$ to estimate $f(a + dx, b)$. The fact that $f_y$ is continuous allows us to estimate $f_y(a + dx, b)$, and this in turn can be used to estimate $f(a + dx, b + dy)$.

Let $\varepsilon > 0$. There is a $\delta_1 > 0$ such that if $|dx| < \delta_1$, then

$$| f(a, b) + f_x(a, b)\,dx - f(a + dx, b) | = | t(a, b) + m\,dx - f(a + dx, b) | \le \varepsilon\,|dx|. \tag{56}$$

We can make a similar approximation using $f_y(a + dx, b)$, but this would necessarily depend on $dx$, complicating matters significantly. We can, however, consider $f(a + dx, b + dy) - f(a + dx, b)$. Since $f$ is differentiable with respect to $y$, the Mean Value Theorem tells us that there is an $\eta$ between $b$ and $b + dy$ ($dy$ could be negative) such that

$$f(a + dx, b + dy) - f(a + dx, b) = f_y(a + dx, \eta)\,dy. \tag{57}$$

The continuity of $f_y$ guarantees the existence of a $\delta_2 > 0$ such that if $(a + dx, \eta)$ is within $\delta_2$ of $(a, b)$, then

$$| f_y(a + dx, \eta) - f_y(a, b) | = | f_y(a + dx, \eta) - n | < \varepsilon. \tag{58}$$

There is a $\delta > 0$, therefore, such that whenever $(a + dx, b + dy)$ is within $\delta$ of $(a, b)$, we have $|dx| < \delta_1$ and $(a + dx, \eta)$ within $\delta_2$ of $(a, b)$. For a point $(a + dx, b + dy)$ satisfying these conditions, we have

\begin{align}
& | f(a + dx, b + dy) - t(a + dx, b + dy) | \tag{59} \\
&= | f(a + dx, b + dy) - f(a + dx, b) + f(a + dx, b) - t(a + dx, b + dy) | \tag{60} \\
&= | f(a + dx, b + dy) - f(a + dx, b) + f(a + dx, b) - f(a, b) - m\,dx - n\,dy | \tag{61} \\
&\le | f(a + dx, b + dy) - f(a + dx, b) - n\,dy | + | f(a + dx, b) - f(a, b) - m\,dx | \tag{62} \\
&\le \varepsilon |dy| + \varepsilon |dx| = \varepsilon (|dy| + |dx|) \le 2\varepsilon \sqrt{|dx|^2 + |dy|^2}. \tag{63}
\end{align}

This can be interpreted as showing that the directional derivative in the direction of the vector $[dx, dy]$ exists. In other words, if we were to consider the curve on the surface containing the points $(a + t\,dx, b + t\,dy, f(a + t\,dx, b + t\,dy))$, then the tangent line to this curve would lie on the plane determined by the two partial derivatives. A third way of saying this is that if $f$ is continuously differentiable, then the set of tangent lines to the surface at $(a, b)$ will form a plane, and this plane is the same as the one determined by $f_x(a, b)$ and $f_y(a, b)$.
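The estimate (59)–(63) says that the gap between $f$ and the plane $t$ is small compared with the distance from $(a, b)$. The sketch below checks this for one $C^1$ function. Everything specific in it is an assumption made for illustration: the function $f(x, y) = 3xy + x^4$, the base point $(1, 2)$, the radii, and the helper name `worst_ratio`.

```python
import math

def f(x, y):
    return 3 * x * y + x**4

a, b = 1.0, 2.0            # illustrative base point
m = 3 * b + 4 * a**3       # x-slope f_x(a, b)
n = 3 * a                  # y-slope f_y(a, b)

def t(x, y):
    """Linear function whose graph is the candidate tangent plane at (a, b)."""
    return f(a, b) + m * (x - a) + n * (y - b)

def worst_ratio(r, samples=360):
    """Largest |f - t| / r over sample points at distance r from (a, b)."""
    worst = 0.0
    for k in range(samples):
        phi = 2 * math.pi * k / samples
        dx, dy = r * math.cos(phi), r * math.sin(phi)
        gap = abs(f(a + dx, b + dy) - t(a + dx, b + dy))
        worst = max(worst, gap / r)
    return worst

ratios = [worst_ratio(r) for r in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratio shrinks with the radius in every sampled direction, which is the numerical counterpart of saying that the plane is tangent to the surface.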

We now can state that for a function of two variables defined in some region about $(a, b)$ and continuously differentiable in that region, it is perfectly reasonable to speak of a tangent plane to the graph of the function. Continuous differentiability is not a necessary condition in this regard (see a book in real analysis), but since we will never consider a non-continuous derivative outside of several important special cases, it is reasonable to proceed in this way.

Continuously differentiable functions have one other important property that we will exploit heavily. This is in regards to the second partial derivatives. The partial derivative of $f_x$ with respect to $y$ will be denoted by $f_{xy}$. It is important to note that in $f_{xy}$ we differentiated with respect to $x$ first and then $y$. This same function is denoted

$$f_{xy} = \frac{d^2 f}{dy\,dx}. \tag{64}$$

With the $\frac{d}{dx}$ notation, we indicate differentiation by placing symbols in front of the function name, so the latest derivative should be furthest left. Even though it is fundamentally important that $C^2$ functions are such that $f_{xy} = f_{yx}$, as we will now investigate, the order of the differentiation is critical to understanding the relationship between geometry and differentiation.

Suppose we have a $C^2$ function $f$ in two variables. At a point $(a, b)$, $f_x(a, b)$ is the slope of the tangent line to the gridline with $y = b$, and $f_y(a, b)$ is the slope of the tangent to the gridline with $x = a$. The second partial derivative $f_{xy}(a, b)$ describes the rate of rotation of the tangent in the $x$-direction as we move it in the $y$-direction. Similarly, $f_{yx}(a, b)$ describes the rate of rotation of the tangent in the $y$-direction as we move it in the $x$-direction. Expressed this way, there appears to be no reason to expect that $f_{xy}(a, b) = f_{yx}(a, b)$. Let us try to understand why this might be the case.


A standard example of a function with unequal cross-partials is as follows.

$$f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2}, & (x, y) \neq (0, 0), \\ 0, & (x, y) = (0, 0). \end{cases} \tag{65}$$

The graph of this function looks unremarkable (see Figure 17), but anomalies of the second derivative will not be obvious in graphs such as this.

Figure 17. For this function $f_{xy}(0, 0) \neq f_{yx}(0, 0)$.

The first partial derivatives for this function are

$$f_x(x, y) = \begin{cases} \dfrac{x^4 y - y^5 + 4x^2 y^3}{(x^2 + y^2)^2}, & (x, y) \neq (0, 0), \\ 0, & (x, y) = (0, 0), \end{cases} \tag{66}$$

$$f_y(x, y) = \begin{cases} \dfrac{x^5 - x y^4 - 4x^3 y^2}{(x^2 + y^2)^2}, & (x, y) \neq (0, 0), \\ 0, & (x, y) = (0, 0). \end{cases} \tag{67}$$

Away from $(0, 0)$, the functions $f_{xy}$ and $f_{yx}$ are equal.

$$f_{xy}(x, y) = f_{yx}(x, y) = \frac{x^6 + 9x^4 y^2 - 9x^2 y^4 - y^6}{(x^2 + y^2)^3}, \quad \text{for } (x, y) \neq (0, 0). \tag{68}$$

This should not be surprising, since all of the second partial derivatives are continuous away from $(0, 0)$. The common graph for the cross-partials is shown in Figure 18, and it is apparent that neither function can be continuous at $(0, 0)$. Both functions have values at $(0, 0)$, however. Since $f_x(0, y) = -y$, the partial derivative of this function at $(0, 0)$ must be $f_{xy}(0, 0) = -1$. Similarly, since $f_y(x, 0) = x$, we have that $f_{yx}(0, 0) = 1$.
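The two values at the origin can be approximated with nested difference quotients. The sketch below is an illustration only; the helper names and the step sizes `h` and `k` are our assumptions. The inner step must be much smaller than the outer one so that each first partial derivative is resolved before it is differenced again.

```python
def f(x, y):
    """The function (65)."""
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def fx(x, y, h=1e-6):
    """Centered difference approximation of the first partial in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-6):
    """Centered difference approximation of the first partial in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, much larger than the inner step h

# f_xy: differentiate in x first, then in y.
fxy_origin = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)
# f_yx: differentiate in y first, then in x.
fyx_origin = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)

print(fxy_origin, fyx_origin)
```

With these step sizes the two quotients land near $-1$ and $1$ respectively, in agreement with the values computed above.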

Figure 18. This is the graph of $f_{xy}$ or $f_{yx}$. They agree everywhere except at $(0, 0)$.

The nature of this example indicates that for nice functions, the cross-partials should be expected to be equal. Let us look at why we might believe this to be the case. The grid lines on the graph break the surface into grid parallelograms corresponding to a PL approximation of $f$. In general, these are not actual parallelograms, since the four corners are assumed to lie on a curved surface, and so pairs of opposite sides cannot be expected to be parallel or the same length. For a grid with mesh $\Delta x \times \Delta y$, we can consider the grid parallelogram at a point $(a, b)$. In particular, the four corners are $(a, b, f(a, b))$, $(a + \Delta x, b, f(a + \Delta x, b))$, $(a + \Delta x, b + \Delta y, f(a + \Delta x, b + \Delta y))$, and $(a, b + \Delta y, f(a, b + \Delta y))$. The grid parallelogram is depicted in Figure 19 along with the true parallelogram determined by the grid segments adjacent to $(a, b, f(a, b))$. Let $\mathbf{D}_x f(a, b)$ be the vector from $(a, b, f(a, b))$ to $(a + \Delta x, b, f(a + \Delta x, b))$, as shown in Figure 19. The coordinates of $\mathbf{D}_x f(a, b)$ are

$$\mathbf{D}_x f = [\, \Delta x,\ 0,\ f(a + \Delta x, b) - f(a, b) \,]. \tag{69}$$

Compare this vector to the slope of this side of the grid parallelogram

$$m = \frac{f(a + \Delta x, b) - f(a, b)}{\Delta x}. \tag{70}$$

We are looking at the slope of a secant line whose limit is $f_x(a, b)$, and the vector $\mathbf{D}_x f(a, b)$ gives us information that is equivalent to this slope. The change in the values of $f$, which is the third coordinate of the vector $\mathbf{D}_x f$, will be called the PL partial differential of $f$ with respect to $x$, and will be denoted $D_x f$. The vector $\mathbf{D}_y f(a, b)$ similarly contains information about a secant line whose slope approximates $f_y(a, b)$, and we define $D_y f$ in the obvious way.

Figure 19. A grid parallelogram determined by four points on the surface and the actual parallelogram determined by $\mathbf{D}_x f$ and $\mathbf{D}_y f$.

Figure 20 shows $\mathbf{D}_y f(a, b)$ and $\mathbf{D}_y f(a + \Delta x, b)$. These reflect a change in $f_y$ as it moves in the $x$-direction. In other words, the difference between these two vectors, once it is scaled by $\Delta x\,\Delta y$, is an approximation of $f_{yx}(a, b)$, and as long as the limits are sufficiently well-behaved,

$$\lim_{\Delta y \to 0}\ \lim_{\Delta x \to 0} \frac{\mathbf{D}_y f(a + \Delta x, b) - \mathbf{D}_y f(a, b)}{\Delta x\,\Delta y} = [\, 0,\ 0,\ f_{yx}(a, b) \,]. \tag{71}$$

If there were no difference between $\mathbf{D}_y f(a, b)$ and $\mathbf{D}_y f(a + \Delta x, b)$, then $\mathbf{D}_y f(a + \Delta x, b)$ would occupy the opposite side of the true parallelogram shown in Figure 20. The difference between the two vectors must be a vector between the corners of the true parallelogram and the grid parallelogram corresponding to $(a + \Delta x, b + \Delta y)$, as shown in Figure 21.

Figure 20. The vectors $\mathbf{D}_y f(a, b)$ and $\mathbf{D}_y f(a + \Delta x, b)$.

The vector $\mathbf{D}_{xy} f(a, b)$ must have precisely the same geometric interpretation, so as long as the limits are well-behaved, it should be that $f_{xy}(a, b) = f_{yx}(a, b)$.
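At the PL level the two mixed differentials are not merely close. Both reduce algebraically to $f(a + \Delta x, b + \Delta y) - f(a + \Delta x, b) - f(a, b + \Delta y) + f(a, b)$. The sketch below illustrates this with the scalar (third-coordinate) differentials. The test function, base point, mesh, and helper names are our assumptions, chosen only for the example.

```python
import math

def f(x, y):
    # Illustrative smooth function; its mixed partial is cos(x) * exp(y) + 2 * y.
    return math.sin(x) * math.exp(y) + x * y**2

def Dx(f, a, b, dx):
    """PL partial differential in x: change in f over one step in x."""
    return f(a + dx, b) - f(a, b)

def Dy(f, a, b, dy):
    """PL partial differential in y: change in f over one step in y."""
    return f(a, b + dy) - f(a, b)

def Dyx(f, a, b, dx, dy):
    """Change in D_y f as the base point moves one step in x."""
    return Dy(f, a + dx, b, dy) - Dy(f, a, b, dy)

def Dxy(f, a, b, dx, dy):
    """Change in D_x f as the base point moves one step in y."""
    return Dx(f, a, b + dy, dx) - Dx(f, a, b, dx)

a, b, dx, dy = 0.5, 0.3, 1e-3, 1e-3
d_yx = Dyx(f, a, b, dx, dy)
d_xy = Dxy(f, a, b, dx, dy)
mixed_exact = math.cos(a) * math.exp(b) + 2 * b

print(d_yx, d_xy)
print(d_yx / (dx * dy), mixed_exact)
```

The two PL quantities agree up to floating-point rounding, and their common quotient by $\Delta x\,\Delta y$ is close to the mixed partial derivative of this smooth function.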

Figure 21. The vector corresponding to $f_{yx}(a, b)$.

CHAPTER 5

The Riemannian Curvature Tensor in Two Dimensions

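For a surface parametrized by $\mathbf{x}(u^1, u^2) = [\, x^1(u^1, u^2),\ x^2(u^1, u^2),\ x^3(u^1, u^2) \,]$, the Riemannian curvature tensor is defined to be

$$R^l_{ijk} = \frac{d\Gamma^l_{ik}}{du^j} - \frac{d\Gamma^l_{ij}}{du^k} + \Gamma^p_{ik}\Gamma^l_{pj} - \Gamma^p_{ij}\Gamma^l_{pk}, \tag{72}$$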

where $i, j, k, l, p \in \{1, 2\}$, and the Einstein summation convention is used (i.e., since $p$ occurs as both an upper and lower index, it is summed over). Using the notation $\mathbf{x}_i = \frac{d\mathbf{x}}{du^i}$ and $\mathbf{x}_{ij} = \frac{d^2\mathbf{x}}{du^j\,du^i}$, we define the Christoffel symbols $\Gamma^k_{ij}$ along with the coefficients of the Second Fundamental Form by

$$\mathbf{x}_{ij} = L_{ij}\mathbf{n} + \Gamma^k_{ij}\mathbf{x}_k, \tag{73}$$

again using the Einstein summation convention. Note that the $\Gamma^k_{ij}$ describe the tangential change in the tangent vectors $\mathbf{x}_i$ in terms of the $\mathbf{x}_i$, and they can be obtained by making measurements along the surface (i.e., they are intrinsic). The $L_{ij}$ are extrinsic, and the principal curvatures, $\kappa_1$ and $\kappa_2$, can be obtained from them, and so the Gauss curvature, $K = \kappa_1\kappa_2$, also depends on the $L_{ij}$. The Gauss curvature can also be obtained from $R^l_{ijk}$, in particular

$$K = \frac{R^l_{121}\, g_{l2}}{g}, \tag{74}$$

where $g_{ij} = \langle \mathbf{x}_i, \mathbf{x}_j \rangle$ and $g = | g_{ij} |$. There are several other choices for the indices that will result in $\pm K$. All quantities used here are intrinsic, so a proof of the relationships stated here also proves Gauss' Theorema Egregium, that Gauss curvature is intrinsic.

The $R^l_{ijk}$ contain more curvature information than $K$, and the Riemannian curvature tensor generalizes to higher dimensions, where the Gauss curvature does not. Motivation for the Riemannian curvature tensor comes from the following observations. What I describe is partly nonsense, but it lays out the basic idea.

Suppose a simple closed curve $C$ on the surface bounds a region $S$. The total curvature of $S$ is $\theta = \int_S K\,dA$. If we were to parallel transport a vector around $C$, then the resultant vector would differ from the original by an angle $\theta$. If we had points $A$ and $B$ on the curve, then there are two ways that we could parallel transport a vector along $C$ from $A$ to $B$. The angle between the two resultant vectors would also be $\theta$.

The Riemannian curvature tensor captures this idea using differentials and derivatives. For example, if we were to follow the tangent vector $\mathbf{x}_1$ as it moved a small distance $du^1$ in the $u^1$-direction, and then a small distance $du^2$ in the $u^2$-direction, we would get some vector $\mathbf{x}_1(12)$. We could go the other way, that is, we could move first in the $u^2$-direction, and then the $u^1$-direction. We'll call this $\mathbf{x}_1(21)$. These two vectors will be the same, but if we were to keep track of the rotation of the vectors relative to the surface, we would get a discrepancy of $\theta$, where $\theta$ is the total curvature inside the little parallelogram with sides $du^1$ and $du^2$. The average curvature would be

$$K \approx \frac{\theta}{\| du^1 \times du^2 \|}. \tag{75}$$

Note also that

$$\theta \approx \frac{\| \mathbf{x}_1(12) - \mathbf{x}_1(21) \|}{\| \mathbf{x}_1 \|}. \tag{76}$$

Very roughly then, the Riemannian curvature tensor describes the following. Ignoring the normal component of the change in $\mathbf{x}_i$, move it in the $u^j$-direction, and then in the $u^k$-direction to obtain $\mathbf{x}_i(jk)$. Obtain $\mathbf{x}_i(kj)$ similarly and subtract. Expressed in terms of the tangent vectors $\mathbf{x}_1$ and $\mathbf{x}_2$, we have

$$\mathbf{x}_i(jk) - \mathbf{x}_i(kj) = R^1_{ijk}\mathbf{x}_1 + R^2_{ijk}\mathbf{x}_2. \tag{77}$$

1. Parametrizations

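Let $\mathbf{x}(u^1, u^2) = [\, x^1(u^1, u^2),\ x^2(u^1, u^2),\ x^3(u^1, u^2) \,]$ be a vector function $\mathbf{x} : \mathbb{R}^2 \to \mathbb{R}^3$. Let $\bar{\mathbf{x}}$ be a piecewise linear approximation of $\mathbf{x}$ with mesh $\Delta u^1 \times \Delta u^2$. At a lattice point $(u^1, u^2) = (a, b)$, define the partial PL-differential with respect to $u^i$ to be

\begin{align}
\mathbf{x}_1 = D_1\mathbf{x}(a, b) &= \mathbf{x}(a + \Delta u^1, b) - \mathbf{x}(a, b), \tag{78} \\
\text{and } \mathbf{x}_2 = D_2\mathbf{x}(a, b) &= \mathbf{x}(a, b + \Delta u^2) - \mathbf{x}(a, b), \tag{79}
\end{align}

and the second partial PL-differentials with respect to $u^i$ and $u^j$ to be

\begin{align}
\mathbf{x}_{11} = D_{11}\mathbf{x}(a, b) &= D_1\mathbf{x}(a + \Delta u^1, b) - D_1\mathbf{x}(a, b), \tag{80} \\
\mathbf{x}_{12} = D_{12}\mathbf{x}(a, b) &= D_1\mathbf{x}(a, b + \Delta u^2) - D_1\mathbf{x}(a, b), \tag{81} \\
\mathbf{x}_{21} = D_{21}\mathbf{x}(a, b) &= D_2\mathbf{x}(a + \Delta u^1, b) - D_2\mathbf{x}(a, b), \tag{82} \\
\text{and } \mathbf{x}_{22} = D_{22}\mathbf{x}(a, b) &= D_2\mathbf{x}(a, b + \Delta u^2) - D_2\mathbf{x}(a, b). \tag{83}
\end{align}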

Also at each lattice point, we can define the unit PL-normal vector at $(a, b)$ to be

$$\mathbf{n}(a, b) = \frac{\mathbf{x}_1 \times \mathbf{x}_2}{\| \mathbf{x}_1 \times \mathbf{x}_2 \|}. \tag{84}$$

Note that $\mathbf{n}$ is normal to the plane determined by $\mathbf{x}_1$ and $\mathbf{x}_2$.

One important goal is to understand the curvature of space, so it is important to understand curvature intrinsically. It is possible to decompose the $\mathbf{x}_{ij}$ into tangential and normal components.

$$\mathbf{x}_{ij} = L_{ij}\mathbf{n} + \Gamma^k_{ij}\mathbf{x}_k. \tag{85}$$
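The decomposition (85) can be carried out numerically at a single lattice point. The sketch below is a hedged illustration, not a prescribed method: the surface (a unit sphere in spherical coordinates), the lattice point, the mesh, and all helper names are our assumptions. Because $\mathbf{n}$ is a unit vector perpendicular to $\mathbf{x}_1$ and $\mathbf{x}_2$, the normal coefficient is a dot product with $\mathbf{n}$, and the tangential coefficients come from solving a $2 \times 2$ system built from the dot products $\langle \mathbf{x}_i, \mathbf{x}_j \rangle$.

```python
import math

def x(u1, u2):
    """Illustrative surface: unit sphere, u1 = polar angle, u2 = azimuth."""
    return (math.sin(u1) * math.cos(u2),
            math.sin(u1) * math.sin(u2),
            math.cos(u1))

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def D1(a, b, du1):
    """Partial PL-differential with respect to u1, as in (78)."""
    return sub(x(a + du1, b), x(a, b))

def D2(a, b, du2):
    """Partial PL-differential with respect to u2, as in (79)."""
    return sub(x(a, b + du2), x(a, b))

a, b, du1, du2 = 1.0, 0.5, 0.01, 0.01
x1, x2 = D1(a, b, du1), D2(a, b, du2)

# Unit PL-normal, as in (84).
c = cross(x1, x2)
n = tuple(ci / math.sqrt(dot(c, c)) for ci in c)

# Second partial PL-differential x11, as in (80).
x11 = sub(D1(a + du1, b, du1), x1)

# Normal coefficient: n is a unit vector perpendicular to x1 and x2.
L11 = dot(x11, n)

# Tangential coefficients: solve the 2x2 system of dot products.
g11, g12, g22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
g = g11 * g22 - g12**2
r1, r2 = dot(x11, x1), dot(x11, x2)
gamma1_11 = (g22 * r1 - g12 * r2) / g
gamma2_11 = (g11 * r2 - g12 * r1) / g

# Rebuild x11 from its components as a check on (85).
rebuilt = tuple(L11 * ni + gamma1_11 * p + gamma2_11 * q
                for ni, p, q in zip(n, x1, x2))
print(L11, gamma1_11, gamma2_11)
```

The same steps apply to $\mathbf{x}_{12}$, $\mathbf{x}_{21}$, and $\mathbf{x}_{22}$ by swapping in the corresponding second PL-differential.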

The $L_{ij}$ are the coefficients of the PL-second form (??). The $\Gamma^k_{ij}$ are the PL-Christoffel symbols, and they describe the tangential (or geodesic) change in the tangent vectors $\mathbf{x}_i$. From an intrinsic point of view, we can define quantities that correspond roughly to the PL-Christoffel symbols. We will do this by working from the fact that any two adjoining grid parallelograms can be laid flat (i.e., can be embedded in a plane).


Let us first consider the intrinsic change corresponding to $\mathbf{x}_{11}$. At a lattice point $(a, b)$, we are looking at

$$\mathbf{x}_{11} = \mathbf{x}_1(a + \Delta u^1, b) - \mathbf{x}_1(a, b), \tag{86}$$

so we are interested in the grid parallelograms corresponding to $(a, b)$ and $(a + \Delta u^1, b)$. Both grid parallelograms can be embedded in the vector space spanned by $\mathbf{x}_1(a, b)$ and $\mathbf{x}_2(a, b)$, and in particular, $\mathbf{x}_1(a + \Delta u^1, b)$ lies in this plane. With the following subtraction taking place in this vector space, we can therefore define the intrinsic PL-differential, $\delta_1\mathbf{x}_1$, and the intrinsic PL-Christoffel symbols, $\gamma^k_{11}$, by

$$\mathbf{x}_1(a + \Delta u^1, b) - \mathbf{x}_1(a, b) = \delta_1\mathbf{x}_1(a, b) = \gamma^1_{11}\mathbf{x}_1(a, b) + \gamma^2_{11}\mathbf{x}_2(a, b). \tag{87}$$

In general, we define $\delta_i\mathbf{x}_j$ and $\gamma^k_{ij}$ the same way.

Figure 1. Pushing $\mathbf{x}_1$ around the grid parallelogram.

If we consider the grid parallelograms at $(a, b)$, $(a + \Delta u^1, b)$, $(a, b + \Delta u^2)$, and $(a + \Delta u^1, b + \Delta u^2)$, it will not be possible to lay these flat if there is non-zero impulse curvature at $(a + \Delta u^1, b + \Delta u^2)$. If we cut along the vector $\mathbf{x}_1(a + \Delta u^1, b + \Delta u^2)$, however, we can pull all four grid parallelograms into the plane spanned by the vectors $\mathbf{x}_1(a, b)$ and $\mathbf{x}_2(a, b)$, as shown in Figure 1. The angle $\theta$ between the two copies of $\mathbf{x}_1(a + \Delta u^1, b + \Delta u^2)$ is precisely the impulse curvature at $(a + \Delta u^1, b + \Delta u^2)$. The Riemannian curvature tensor exploits this observation, and we will start to build up to $R^l_{ijk}$ here.

The intrinsic PL-differential $\delta_1\mathbf{x}_1(a, b)$ measures the difference between the vectors marked $\mathbf{x}_1$ and $\mathbf{x}_1(1)$ in Figure 1. Similarly, $\delta_2\mathbf{x}_1(a + \Delta u^1, b)$ is the difference between the vectors marked $\mathbf{x}_1(1)$ and $\mathbf{x}_1(12)$. Both of these differences are computed in the plane spanned by $\mathbf{x}_1(a, b)$ and $\mathbf{x}_2(a, b)$. Also in this plane, the difference between $\mathbf{x}_1$ and $\mathbf{x}_1(21)$ is measured by $\delta_2\mathbf{x}_1(a, b)$ and $\delta_1\mathbf{x}_1(a, b + \Delta u^2)$. Finding the angle between the vectors $\mathbf{x}_1(12)$ and $\mathbf{x}_1(21)$ is a bit awkward, but we can get a good approximation for small $\theta$'s with

$$\theta \approx \frac{\| \mathbf{x}_1(12) - \mathbf{x}_1(21) \|}{\| \mathbf{x}_1(12) \|}, \tag{88}$$

since the vectors $\mathbf{x}_1(21)$ and $\mathbf{x}_1(12)$ are the same length, and so the difference between the vectors is essentially an arc of a circle with radius $\| \mathbf{x}_1(12) \|$ (or equivalently, $\| \mathbf{x}_1(21) \|$). With this in mind, we define the PL-Riemannian curvature tensor by

$$\mathbf{x}_i(jk) - \mathbf{x}_i(kj) = R^1_{ijk}\mathbf{x}_1 + R^2_{ijk}\mathbf{x}_2. \tag{89}$$

In words, we take the vector $\mathbf{x}_i$ and move it first in the $\mathbf{x}_j$-direction, and then the $\mathbf{x}_k$-direction. From this we subtract the result of moving this vector first in the $\mathbf{x}_k$-direction, and then the $\mathbf{x}_j$-direction. This description makes it apparent that we have the following relationships.

\begin{align}
R^l_{ijk} &= -R^l_{ikj}, \tag{90} \\
\text{and } R^l_{ijj} &= 0. \tag{91}
\end{align}

There are eight pairs of numbers $R^1_{ijk}$ and $R^2_{ijk}$, and each of the four pairs for which $j \neq k$, along with $\mathbf{x}_1$ and $\mathbf{x}_2$, gives us an approximation of the impulse curvature at $(a + \Delta u^1, b + \Delta u^2)$.

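We can express the quantities in (89) in terms of differentials at $(a, b)$. Specifically, we need the following

\begin{align}
\delta_1\gamma^k_{ij}(a, b) &= \gamma^k_{ij}(a + \Delta u^1, b) - \gamma^k_{ij}(a, b), \tag{92} \\
\text{and } \delta_2\gamma^k_{ij}(a, b) &= \gamma^k_{ij}(a, b + \Delta u^2) - \gamma^k_{ij}(a, b). \tag{93}
\end{align}

For example, we have that

$$\gamma^k_{ij}(a + \Delta u^1, b) = \gamma^k_{ij}(a, b) + \delta_1\gamma^k_{ij}(a, b). \tag{94}$$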

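We can find $\mathbf{x}_1(12)$ as follows, using numbers in parentheses to designate the lattice point. Since

\begin{align}
\mathbf{x}_1(1) &= \mathbf{x}_1 + \gamma^1_{11}\mathbf{x}_1 + \gamma^2_{11}\mathbf{x}_2, \tag{95} \\
\mathbf{x}_2(1) &= \mathbf{x}_2 + \gamma^1_{21}\mathbf{x}_1 + \gamma^2_{21}\mathbf{x}_2, \tag{96} \\
\text{and } \gamma^i_{12}(1) &= \gamma^i_{12} + \delta_1\gamma^i_{12}, \tag{97}
\end{align}

we have that

\begin{align}
\mathbf{x}_1(12) &= \mathbf{x}_1(1) + \gamma^1_{12}(1)\,\mathbf{x}_1(1) + \gamma^2_{12}(1)\,\mathbf{x}_2(1) \tag{98} \\
&= \mathbf{x}_1 + \gamma^1_{11}\mathbf{x}_1 + \gamma^2_{11}\mathbf{x}_2 \tag{99} \\
&\quad + (\gamma^1_{12} + \delta_1\gamma^1_{12})(\mathbf{x}_1 + \gamma^1_{11}\mathbf{x}_1 + \gamma^2_{11}\mathbf{x}_2) \tag{100} \\
&\quad + (\gamma^2_{12} + \delta_1\gamma^2_{12})(\mathbf{x}_2 + \gamma^1_{21}\mathbf{x}_1 + \gamma^2_{21}\mathbf{x}_2). \tag{101}
\end{align}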

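Similarly,

\begin{align}
\mathbf{x}_1(21) &= \mathbf{x}_1(2) + \gamma^1_{11}(2)\,\mathbf{x}_1(2) + \gamma^2_{11}(2)\,\mathbf{x}_2(2) \tag{102} \\
&= \mathbf{x}_1 + \gamma^1_{12}\mathbf{x}_1 + \gamma^2_{12}\mathbf{x}_2 \tag{103} \\
&\quad + (\gamma^1_{11} + \delta_2\gamma^1_{11})(\mathbf{x}_1 + \gamma^1_{12}\mathbf{x}_1 + \gamma^2_{12}\mathbf{x}_2) \tag{104} \\
&\quad + (\gamma^2_{11} + \delta_2\gamma^2_{11})(\mathbf{x}_2 + \gamma^1_{22}\mathbf{x}_1 + \gamma^2_{22}\mathbf{x}_2). \tag{105}
\end{align}

It follows that

\begin{align}
\mathbf{x}_1(12) - \mathbf{x}_1(21) &= (\gamma^1_{12}\gamma^1_{11} + \delta_1\gamma^1_{12} + \delta_1\gamma^1_{12}\gamma^1_{11} + \gamma^2_{12}\gamma^1_{21} + \delta_1\gamma^2_{12}\gamma^1_{21})\,\mathbf{x}_1 \tag{106} \\
&\quad + (\gamma^1_{12}\gamma^2_{11} + \delta_1\gamma^1_{12}\gamma^2_{11} + \gamma^2_{12}\gamma^2_{21} + \delta_1\gamma^2_{12} + \delta_1\gamma^2_{12}\gamma^2_{21})\,\mathbf{x}_2 \tag{107} \\
&\quad - (\gamma^1_{11}\gamma^1_{12} + \delta_2\gamma^1_{11} + \delta_2\gamma^1_{11}\gamma^1_{12} + \gamma^2_{11}\gamma^1_{22} + \delta_2\gamma^2_{11}\gamma^1_{22})\,\mathbf{x}_1 \tag{108} \\
&\quad - (\gamma^1_{11}\gamma^2_{12} + \delta_2\gamma^1_{11}\gamma^2_{12} + \gamma^2_{11}\gamma^2_{22} + \delta_2\gamma^2_{11} + \delta_2\gamma^2_{11}\gamma^2_{22})\,\mathbf{x}_2. \tag{109}
\end{align}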

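The $\mathbf{x}_1$-component can be written in Einstein notation as

\[ \delta_1\gamma^1_{12} - \delta_2\gamma^1_{11} + \gamma^p_{12}\gamma^1_{p1} - \gamma^p_{11}\gamma^1_{p2} + \delta_1\gamma^p_{12}\gamma^1_{p1} - \delta_2\gamma^p_{11}\gamma^1_{p2}, \tag{110} \]

and the $\mathbf{x}_2$-component can be written as

\[ \delta_1\gamma^2_{12} - \delta_2\gamma^2_{11} + \gamma^p_{12}\gamma^2_{p1} - \gamma^p_{11}\gamma^2_{p2} + \delta_1\gamma^p_{12}\gamma^2_{p1} - \delta_2\gamma^p_{11}\gamma^2_{p2}. \tag{111} \]

In general, the $\mathbf{x}_i$-component would be

\[ \delta_1\gamma^i_{12} - \delta_2\gamma^i_{11} + \gamma^p_{12}\gamma^i_{p1} - \gamma^p_{11}\gamma^i_{p2} + \delta_1\gamma^p_{12}\gamma^i_{p1} - \delta_2\gamma^p_{11}\gamma^i_{p2}. \tag{112} \]

All cases are covered by

\[ R^l_{ijk} = \delta_j\gamma^l_{ik} - \delta_k\gamma^l_{ij} + \gamma^p_{ik}\gamma^l_{pj} - \gamma^p_{ij}\gamma^l_{pk} + \delta_j\gamma^p_{ik}\gamma^l_{pj} - \delta_k\gamma^p_{ij}\gamma^l_{pk}. \tag{113} \]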

Compare this to the definition of the (non-PL) Riemannian curvature tensor.
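As a rough illustration of formula (113), the following Python sketch evaluates $R^l_{ijk}$ from tables of the $\gamma^l_{ij}$ and their differences $\delta_j\gamma^l_{ik}$. The function name `pl_curvature`, the array layout, and the sample numbers are all assumptions made for this example; they are not taken from the text. The sketch only checks that the relationships (90) and (91) hold for arbitrary input.

```python
def pl_curvature(gamma, delta):
    """Evaluate formula (113) with 0-based indices.

    gamma[l][i][j]    stands for gamma^l_{ij} at the base lattice point.
    delta[j][l][i][k] stands for delta_j gamma^l_{ik}, the change in
                      gamma^l_{ik} after one step in the u^j direction.
    Returns R with R[l][i][j][k] standing for R^l_{ijk}.
    """
    idx = range(len(gamma))
    R = [[[[0.0 for _ in idx] for _ in idx] for _ in idx] for _ in idx]
    for l in idx:
        for i in idx:
            for j in idx:
                for k in idx:
                    total = delta[j][l][i][k] - delta[k][l][i][j]
                    for p in idx:  # Einstein sum over the repeated index p
                        total += gamma[p][i][k] * gamma[l][p][j]
                        total -= gamma[p][i][j] * gamma[l][p][k]
                        total += delta[j][p][i][k] * gamma[l][p][j]
                        total -= delta[k][p][i][j] * gamma[l][p][k]
                    R[l][i][j][k] = total
    return R


# Made-up sample values, chosen only so that nothing is accidentally symmetric.
n = 2
gamma = [[[0.01 * (1 + l + 2 * i + 3 * j) for j in range(n)]
          for i in range(n)] for l in range(n)]
delta = [[[[0.001 * (1 + j + 2 * l + 3 * i + 5 * k) for k in range(n)]
           for i in range(n)] for l in range(n)] for j in range(n)]

R = pl_curvature(gamma, delta)
```

Every term of (113) changes sign when $j$ and $k$ are exchanged, which is why the antisymmetry shows up regardless of the input values.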

CHAPTER 6

Riemannian Curvature Tensor

1. The Riemannian Metric for a Plane

One thing that should always be kept in mind is that the derivative can always be interpreted as being a linear approximation. In other words, it is often helpful to understand a situation concerning a differentiable object by studying the corresponding linear algebra situation. We will be exploring something that differential geometers call a metric or a Riemannian metric. This is not to be confused with the metric a topologist would impose on a metric space, although a metric-space metric can always be constructed from a Riemannian metric.

Differential geometers study objects that don’t have natural (or at least not convenient) coordinate systems. We can impose coordinate systems on these spaces by identifying a piece of the space with a piece of a Euclidean space. Right now, we will associate a piece of a surface with a piece of the Euclidean plane, $\mathbb{R}^2$. We can talk about points of the surface using a coordinate system for $\mathbb{R}^2$. The geometry, however, for the two spaces will be different, and we will impose a funny geometry on $\mathbb{R}^2$ and pretend that $\mathbb{R}^2$ actually is the surface. We will start to describe how this is done in the case when the surface is another plane, and at the same time, develop some of the notation and terminology that is used in differential geometry.

We will think of the derivatives at a point of a surface as being generalizations of the notion of a linear transformation, so let’s look at a linear transformation. Due to the repetitive nature of linear algebra and differential geometry, it is convenient to talk about things such as the $u^1u^2$-plane rather than the $uv$-plane. For the most part, superscripts are used in the same way that subscripts are. This interferes with the use of exponents, but the handiness of the notation more than makes up for this. Suppose we have a linear transformation $A$ from the $u^1u^2$-plane to the $x^1x^2$-plane. Linear transformations can be defined in terms of matrix multiplication, so we have

\[ A(u^1, u^2) = \begin{bmatrix} a^1_1 & a^1_2 \\ a^2_1 & a^2_2 \end{bmatrix} \begin{bmatrix} u^1 \\ u^2 \end{bmatrix}. \tag{114} \]

Note that $a^i_j$ is the entry in the $i$-th row and $j$-th column. Part of the reason for using superscripts in this way will be apparent in a minute. More will be made clear later. Expanding the multiplication, we see that

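\[ \begin{bmatrix} x^1 \\ x^2 \end{bmatrix} = A(u^1, u^2) = \begin{bmatrix} a^1_1u^1 + a^1_2u^2 \\ a^2_1u^1 + a^2_2u^2 \end{bmatrix} = \begin{bmatrix} \sum_{j=1}^2 a^1_ju^j \\ \sum_{j=1}^2 a^2_ju^j \end{bmatrix}. \tag{115} \]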

The use of the summation notation indicates some of the repetitiveness. More can be seen in the fact that both entries look the same. Another common way of expressing this relationship is

\[ x^1 = \sum_{j=1}^2 a^1_ju^j, \qquad x^2 = \sum_{j=1}^2 a^2_ju^j. \tag{116} \]

Both equations look the same, except for the superscripts, which are both 1 in the first equation, and both 2 in the second. Note also that the summation index $j$ in both summations appears as both a subscript and a superscript. Albert Einstein is often credited with noticing that there is a nice scheme for deciding which indices should be subscripts and which should be superscripts, and if this is followed, summations will always range over one subscript and one superscript. The terms covariant and contravariant tensors are used in this scheme, and we will discuss this later. The important thing here is that we will always sum over an index that appears both as a superscript and a subscript. This makes the summation sign redundant (the context will make it clear what the range of the index is supposed to be), and so using the Einstein summation convention, we can write the equations in (116) as one equation and without the summation sign,

\[ x^i = a^i_ju^j. \tag{117} \]

This may look odd at first, but it turns out to be a powerful use of notation that takes care of itself.

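Let’s look at an example. Consider the linear transformation defined by the matrix $A = \begin{bmatrix} 1 & 3 \\ 2 & 2 \end{bmatrix}$. The vectors (the points) $[\,1, 0\,]$ and $[\,0, 1\,]$ in the $u^1u^2$-plane map to the following vectors in the $x^1x^2$-plane. We will use $A$ for the name of the function, as well.

\[ A(1, 0) = \begin{bmatrix} 1 & 3 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad A(0, 1) = \begin{bmatrix} 1 & 3 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix} \tag{118} \]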
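The summation convention $x^i = a^i_ju^j$ translates directly into a loop over the repeated index. The short Python sketch below applies it to the matrix $A$ of this example; the helper name `apply_linear_map` is made up for illustration.

```python
def apply_linear_map(a, u):
    """Return x with x^i = a^i_j u^j, summing over the repeated index j.

    a[i][j] stands for a^i_j (row i, column j); u[j] stands for u^j.
    """
    return [sum(a[i][j] * u[j] for j in range(len(u))) for i in range(len(a))]


A = [[1, 3],
     [2, 2]]

x_e1 = apply_linear_map(A, [1, 0])  # image of [1, 0]: first column of A
x_e2 = apply_linear_map(A, [0, 1])  # image of [0, 1]: second column of A
```

The free index $i$ becomes the outer loop and the summed index $j$ the inner one, which is all the convention encodes.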

Remember that we are building up intuition and notation for more complicated situations. What we want to do here is to talk about points on the $x^1x^2$-plane using coordinates from the $u^1u^2$-plane. The game is that we will use the label $[\,1, 0\,]$ to talk about the vector $[\,1, 2\,]$ in the $x^1x^2$-plane. We can compute the magnitude of the vector $[\,1, 2\,]$ using the dot product. That is,

\[ \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = 1 + 4 = 5, \tag{119} \]

so the magnitude of the vector must be $\sqrt{5}$. Under the new rules for the game and the new geometry for the $u^1u^2$-plane, the magnitude of $[\,1, 0\,]$ is $\sqrt{5}$. Similarly, the magnitude of the vector $[\,0, 1\,]$ is the same as the magnitude of the vector $[\,3, 2\,]$, which is $\sqrt{9 + 4} = \sqrt{13}$. Under this new geometry, as we move from the point $(0, 0)$ to the point $(1, 0)$, we have traveled a distance of $\sqrt{5}$ units, and as we move from $(0, 0)$ to $(0, 1)$, we have moved a distance of $\sqrt{13}$. Our notion of distance has changed dramatically, and the way we measure angles can be different, as well. In the old geometry of the $u^1u^2$-plane, the two vectors $[\,1, 0\,]$ and $[\,0, 1\,]$ are perpendicular, but in the new geometry, the angle is the same as the angle between $[\,1, 2\,]$ and $[\,3, 2\,]$ in the $x^1x^2$-plane. We can compute this angle using the dot product, as we did before. That is,

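\[ \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = 3 + 4 = 7, \tag{120} \]

and so the angle $\theta$ between them must satisfy

\[ 7 = \sqrt{5}\sqrt{13}\cos\theta. \tag{121} \]

In other words,

\[ \theta = \cos^{-1}\left( \frac{7}{\sqrt{5}\sqrt{13}} \right). \tag{122} \]

In the new geometry for the $u^1u^2$-plane, therefore, this must be the angle between $[\,1, 0\,]$ and $[\,0, 1\,]$. All that we have done here depends on taking dot products in the $x^1x^2$-plane. The rules of the game, therefore, can be condensed into a funny dot product on the $u^1u^2$-plane. These generalizations of the dot product are called inner products, and inner products can always be expressed in terms of a matrix product as follows. Given an inner product, there is a matrix $[\,g_{ij}\,]$ such that the inner product of vectors $[\,a^1, a^2\,]$ and $[\,b^1, b^2\,]$ is

\[
\begin{aligned}
\left\langle [\,a^1, a^2\,], [\,b^1, b^2\,] \right\rangle &= \begin{bmatrix} a^1 & a^2 \end{bmatrix} \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} b^1 \\ b^2 \end{bmatrix} \\
&= a^1b^1g_{11} + a^2b^1g_{21} + a^1b^2g_{12} + a^2b^2g_{22} \\
&= a^ib^jg_{ij} \quad \text{(in Einstein notation)}
\end{aligned}
\tag{123}
\]

Assuming the existence of such a matrix, it is easy to find the matrix $[\,g_{ij}\,]$ for this example.

\[ [\,g_{ij}\,] = \begin{bmatrix} 5 & 7 \\ 7 & 13 \end{bmatrix} \tag{124} \]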

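This inner product, and equivalently the matrix $[\,g_{ij}\,]$, determines the new geometry completely. Letting $\mathbf{e}_1 = [\,1, 0\,]$ and $\mathbf{e}_2 = [\,0, 1\,]$, the $g_{ij}$ are determined by what the inner products between these vectors should be. That is,

\[ g_{ij} = \langle \mathbf{e}_i, \mathbf{e}_j \rangle. \tag{125} \]

We will define magnitudes and angles in the new geometry using the inner product in place of the dot product.

\begin{align}
\|\mathbf{x}\| &= \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} \tag{126} \\
\langle \mathbf{x}, \mathbf{y} \rangle &= \|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta \tag{127}
\end{align}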
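The definitions above can be tried out numerically. The following Python sketch builds the inner product matrix for the linear transformation with matrix rows $[1, 3]$ and $[2, 2]$ used in this section, taking $g_{ij}$ to be the dot product of the images of $\mathbf{e}_i$ and $\mathbf{e}_j$ (that is, of columns $i$ and $j$ of $A$), and then measures lengths and an angle with it. The function and variable names are illustrative choices, not notation from the text.

```python
import math


def inner(g, a, b):
    """Inner product a^i b^j g_ij of two 2-component vectors."""
    return sum(a[i] * b[j] * g[i][j] for i in range(2) for j in range(2))


A = [[1, 3],
     [2, 2]]

# g_ij = <e_i, e_j>: dot product of column i and column j of A.
g = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

e1, e2 = [1, 0], [0, 1]
len_e1 = math.sqrt(inner(g, e1, e1))   # magnitude of [1, 0] in the new geometry
len_e2 = math.sqrt(inner(g, e2, e2))   # magnitude of [0, 1] in the new geometry
theta = math.acos(inner(g, e1, e2) / (len_e1 * len_e2))  # angle between them
```

Once `g` is known, everything is computed from coordinates in the $u^1u^2$-plane; the matrix `A` is never consulted again.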

1.1. Exercises.

21. Let $A$ be the linear transformation from the $u^1u^2$-plane to the $x^1x^2$-plane determined by the matrix $\begin{bmatrix} 2 & -3 \\ 1 & 4 \end{bmatrix}$. Find the inner product matrix $[\,g_{ij}\,]$ for the new geometry. Under the new geometry, determine the distance traveled along the path from $(0, 0)$ to $(1, 0)$, and along the path from $(0, 0)$ to $(1, 1)$ to $(0, 3)$. Find the angle between the vectors $[\,2, 3\,]$ and $[\,-3, 1\,]$.

22. Write the product of the square matrices $[\,a^i_j\,]$ and $[\,b^j_k\,]$ in Einstein notation (use $[\,c^i_k\,]$ for the product).

2. The Riemannian Metric for Curved Surfaces

For a function of one variable, the first derivative can be interpreted as describing an approximating tangent line, and the second derivative an approximating parabola. We can make similar comparisons with surfaces.

Suppose we have a surface parametrized by the following vector function.

\[ \mathbf{x}(u^1, u^2) = \left[\, x^1(u^1, u^2),\ x^2(u^1, u^2),\ x^3(u^1, u^2) \,\right] \tag{128} \]

Suppose also that we are interested in investigating the curvature of the surface at any particular point on the surface. We will interpret the derivatives of $\mathbf{x}$ as describing linear approximations to the surface. In other words, the first partial derivatives define a tangent plane, and they also determine a linear transformation from the $u^1u^2$-plane to that tangent plane. Let us take a look at these first partials and discuss their meaning.

\begin{align}
\frac{d\mathbf{x}}{du^1} &= \left[\, \frac{dx^1}{du^1},\ \frac{dx^2}{du^1},\ \frac{dx^3}{du^1} \,\right] \tag{129} \\
\frac{d\mathbf{x}}{du^2} &= \left[\, \frac{dx^1}{du^2},\ \frac{dx^2}{du^2},\ \frac{dx^3}{du^2} \,\right] \tag{130}
\end{align}

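These two vectors are tangent to the surface at their respective points. They can be used directly to describe a plane tangent to the surface, and the chain rule provides justification for doing this. Suppose we have a line in the $u^1u^2$-plane parametrized by $\gamma(t) = [\,a^1t, a^2t\,]$ (in other words, $u^1 = a^1t$ and $u^2 = a^2t$). The corresponding curve on the surface can be parametrized by $\mathbf{x}(\gamma(t)) = [\,x^1(a^1t, a^2t),\ x^2(a^1t, a^2t),\ x^3(a^1t, a^2t)\,]$. We can differentiate $\mathbf{x}$ with respect to the new parameter $t$, and the chain rule illustrates the linear character of the derivative.

\begin{align}
\frac{d\mathbf{x}}{dt} &= \left[\, \frac{dx^1}{du^1}\frac{du^1}{dt} + \frac{dx^1}{du^2}\frac{du^2}{dt},\ \frac{dx^2}{du^1}\frac{du^1}{dt} + \frac{dx^2}{du^2}\frac{du^2}{dt},\ \frac{dx^3}{du^1}\frac{du^1}{dt} + \frac{dx^3}{du^2}\frac{du^2}{dt} \,\right] \tag{131} \\
&= \left[\, \frac{dx^1}{du^1}a^1 + \frac{dx^1}{du^2}a^2,\ \frac{dx^2}{du^1}a^1 + \frac{dx^2}{du^2}a^2,\ \frac{dx^3}{du^1}a^1 + \frac{dx^3}{du^2}a^2 \,\right] \tag{132} \\
&= a^1\left[\, \frac{dx^1}{du^1},\ \frac{dx^2}{du^1},\ \frac{dx^3}{du^1} \,\right] + a^2\left[\, \frac{dx^1}{du^2},\ \frac{dx^2}{du^2},\ \frac{dx^3}{du^2} \,\right] \tag{133} \\
&= a^1\frac{d\mathbf{x}}{du^1} + a^2\frac{d\mathbf{x}}{du^2} \tag{134}
\end{align}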

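This could be interpreted as follows. If a point is moving along a line according to the parametrization $\gamma$, it would pass through any particular point with velocity $[\,a^1, a^2\,]$. The image of this point on the surface will have velocity $a^1\frac{d\mathbf{x}}{du^1} + a^2\frac{d\mathbf{x}}{du^2}$. Velocity vectors in the $u^1u^2$-plane at a particular point and velocity vectors at the corresponding image point on the surface are related by a linear transformation. This function is known as the differential and can be defined by

\[ d\mathbf{x} = \frac{d\mathbf{x}}{du^1}\,du^1 + \frac{d\mathbf{x}}{du^2}\,du^2, \tag{135} \]

where a velocity vector $[\,du^1, du^2\,]$ at a point in the $u^1u^2$-plane corresponds to a velocity vector $d\mathbf{x}$ at the corresponding image point on the surface. This function is also described by a matrix called the Jacobian,

\[ J(\mathbf{x}) = \begin{bmatrix} \frac{dx^1}{du^1} & \frac{dx^1}{du^2} \\ \frac{dx^2}{du^1} & \frac{dx^2}{du^2} \\ \frac{dx^3}{du^1} & \frac{dx^3}{du^2} \end{bmatrix}. \tag{136} \]

The differential then becomes

\begin{align}
d\mathbf{x}(du^1, du^2) &= J(\mathbf{x}) \begin{bmatrix} du^1 \\ du^2 \end{bmatrix} \tag{137} \\
&= \begin{bmatrix} \frac{dx^1}{du^1} & \frac{dx^1}{du^2} \\ \frac{dx^2}{du^1} & \frac{dx^2}{du^2} \\ \frac{dx^3}{du^1} & \frac{dx^3}{du^2} \end{bmatrix} \begin{bmatrix} du^1 \\ du^2 \end{bmatrix} \tag{138}
\end{align}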

Once we have this linear function, it is easy to compare areas in the $u^1u^2$-plane with areas on the tangent plane. The unit square determined by the vectors $(1, 0)$ and $(0, 1)$ gets mapped to a parallelogram determined by the vectors $\frac{d\mathbf{x}}{du^1}$ and $\frac{d\mathbf{x}}{du^2}$. The area of this parallelogram, if you remember from calculus, is equal to the magnitude of the cross product of these two vectors. We’ll come back to this later.
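To make the Jacobian and the area comparison concrete, here is a small Python sketch. The surface is a hypothetical example chosen for this illustration (the paraboloid $x^3 = (u^1)^2 + (u^2)^2$), and the derivatives are estimated with central differences rather than computed symbolically, so the results are approximations.

```python
import math


def surface(u1, u2):
    """Hypothetical sample surface: the paraboloid x3 = (u1)^2 + (u2)^2."""
    return (u1, u2, u1 * u1 + u2 * u2)


def tangent_vectors(x, u1, u2, h=1e-5):
    """Central-difference estimates of dx/du1 and dx/du2 at (u1, u2)."""
    p, m = x(u1 + h, u2), x(u1 - h, u2)
    x1 = tuple((a - b) / (2 * h) for a, b in zip(p, m))
    p, m = x(u1, u2 + h), x(u1, u2 - h)
    x2 = tuple((a - b) / (2 * h) for a, b in zip(p, m))
    return x1, x2


def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(sum(c * c for c in a))


x1, x2 = tangent_vectors(surface, 1.0, 0.0)
jacobian = [[x1[i], x2[i]] for i in range(3)]  # columns are dx/du1 and dx/du2
area = norm(cross(x1, x2))  # area of the image of the unit square
```

At the point $(1, 0)$ the tangent vectors are approximately $[1, 0, 2]$ and $[0, 1, 0]$, so the unit square in the $u^1u^2$-plane corresponds to a parallelogram of area about $\sqrt{5}$ on the tangent plane.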

We have already talked a bit about a point in the $u^1u^2$-plane and the corresponding point on the surface. The parametrization ties points together, and we can identify the pairs of points and speak of them almost as if they were one. Expressed another way, we are again playing the game of using labels from the $u^1u^2$-plane to describe points on the surface. This is an important concept when dealing with manifolds, since there may be no way, or at least no convenient way, of describing individual points on a manifold. What we will do is to talk about points in the $u^1u^2$-plane and endow them with properties from the manifold, or in this case, the surface. For example, let us say that we move along the segment from $(0, 0)$ to $(1, 0)$ in the $u^1u^2$-plane. We have traveled a distance of 1. The image of this segment on the surface might be a curve with length 5. If we change the way that we measure distances in the $u^1u^2$-plane, as we did in the case when the surface was simply another plane, then in some new geometry, the length of the segment just mentioned would be 5, and then doing geometry in the $u^1u^2$-plane becomes more like doing geometry on the surface. The goal here is to come up with a funny way of measuring things like distances and angles so that doing geometry with these new measurement schemes is equivalent to doing geometry on the surface. This is a rough description of what differential geometry is about.

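The differential tells us how velocity vectors in the $u^1u^2$-plane correspond to velocity vectors on the surface via a linear transformation that changes from point to point. These contain the necessary information to find a relationship between distances and angles. Note that both of these quantities for vectors are obtainable from the dot product, so if we know how the dot products compare, then we should be able to get what we need for distances and angles. Suppose the Jacobian at $(0, 0)$ is

\[ J(\mathbf{x})(0, 0) = \begin{bmatrix} c^1_1 & c^1_2 \\ c^2_1 & c^2_2 \\ c^3_1 & c^3_2 \end{bmatrix}. \tag{139} \]

Consider two velocity vectors in the $u^1u^2$-plane, $[\,a^1, a^2\,]$ and $[\,b^1, b^2\,]$. Their images are

\[ \begin{bmatrix} c^1_1 & c^1_2 \\ c^2_1 & c^2_2 \\ c^3_1 & c^3_2 \end{bmatrix} \begin{bmatrix} a^1 \\ a^2 \end{bmatrix} = \begin{bmatrix} c^1_1a^1 + c^1_2a^2 \\ c^2_1a^1 + c^2_2a^2 \\ c^3_1a^1 + c^3_2a^2 \end{bmatrix}, \tag{140} \]

and similarly

\[ \begin{bmatrix} c^1_1 & c^1_2 \\ c^2_1 & c^2_2 \\ c^3_1 & c^3_2 \end{bmatrix} \begin{bmatrix} b^1 \\ b^2 \end{bmatrix} = \begin{bmatrix} c^1_1b^1 + c^1_2b^2 \\ c^2_1b^1 + c^2_2b^2 \\ c^3_1b^1 + c^3_2b^2 \end{bmatrix}. \tag{141} \]

The dot product of these two vectors is

\[
\begin{aligned}
&\begin{bmatrix} c^1_1a^1 + c^1_2a^2 & c^2_1a^1 + c^2_2a^2 & c^3_1a^1 + c^3_2a^2 \end{bmatrix} \begin{bmatrix} c^1_1b^1 + c^1_2b^2 \\ c^2_1b^1 + c^2_2b^2 \\ c^3_1b^1 + c^3_2b^2 \end{bmatrix} \\
&\quad = (c^1_1a^1 + c^1_2a^2)(c^1_1b^1 + c^1_2b^2) + (c^2_1a^1 + c^2_2a^2)(c^2_1b^1 + c^2_2b^2) \\
&\qquad + (c^3_1a^1 + c^3_2a^2)(c^3_1b^1 + c^3_2b^2)
\end{aligned}
\tag{142}
\]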

Of particular interest is the way the distributive property translates to a property called bilinearity for the dot product. Note how the $a^i$ factor out, and then the $b^i$. The $g_{ij}$ represent the quantities in parentheses.

\[
\begin{aligned}
&= a^1\left[ c^1_1(c^1_1b^1 + c^1_2b^2) + c^2_1(c^2_1b^1 + c^2_2b^2) + c^3_1(c^3_1b^1 + c^3_2b^2) \right] \\
&\quad + a^2\left[ c^1_2(c^1_1b^1 + c^1_2b^2) + c^2_2(c^2_1b^1 + c^2_2b^2) + c^3_2(c^3_1b^1 + c^3_2b^2) \right] \\
&= a^1b^1(c^1_1c^1_1 + c^2_1c^2_1 + c^3_1c^3_1) + a^1b^2(c^1_1c^1_2 + c^2_1c^2_2 + c^3_1c^3_2) \\
&\quad + a^2b^1(c^1_2c^1_1 + c^2_2c^2_1 + c^3_2c^3_1) + a^2b^2(c^1_2c^1_2 + c^2_2c^2_2 + c^3_2c^3_2) \\
&= a^1b^1g_{11} + a^1b^2g_{12} + a^2b^1g_{21} + a^2b^2g_{22}.
\end{aligned}
\tag{143}
\]

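We could have established this with less clutter by assuming that the dot product follows a distributive law, which it does. The vectors $[\,a^1, a^2\,]$ and $[\,b^1, b^2\,]$ map to the vectors on the surface $a^1\frac{d\mathbf{x}}{du^1} + a^2\frac{d\mathbf{x}}{du^2}$ and $b^1\frac{d\mathbf{x}}{du^1} + b^2\frac{d\mathbf{x}}{du^2}$. Therefore,

\[
\begin{aligned}
&\left( a^1\frac{d\mathbf{x}}{du^1} + a^2\frac{d\mathbf{x}}{du^2} \right) \cdot \left( b^1\frac{d\mathbf{x}}{du^1} + b^2\frac{d\mathbf{x}}{du^2} \right) \\
&\quad = a^1b^1\frac{d\mathbf{x}}{du^1} \cdot \frac{d\mathbf{x}}{du^1} + a^1b^2\frac{d\mathbf{x}}{du^1} \cdot \frac{d\mathbf{x}}{du^2} + a^2b^1\frac{d\mathbf{x}}{du^2} \cdot \frac{d\mathbf{x}}{du^1} + a^2b^2\frac{d\mathbf{x}}{du^2} \cdot \frac{d\mathbf{x}}{du^2} \\
&\quad = a^1b^1g_{11} + a^1b^2g_{12} + a^2b^1g_{21} + a^2b^2g_{22}.
\end{aligned}
\tag{144}
\]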

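Note that the $g_{ij}$ represent the same quantities in both derivations. The dot product is a special case of a vector space product called an inner product, and all inner products share the basic property of bilinearity. Bilinearity is described as

\[ \langle a\mathbf{x} + b\mathbf{y}, \mathbf{z} \rangle = a\langle \mathbf{x}, \mathbf{z} \rangle + b\langle \mathbf{y}, \mathbf{z} \rangle \tag{145} \]

and

\[ \langle \mathbf{x}, b\mathbf{y} + c\mathbf{z} \rangle = b\langle \mathbf{x}, \mathbf{y} \rangle + c\langle \mathbf{x}, \mathbf{z} \rangle. \tag{146} \]

This new inner product on the vectors $[\,a^1, a^2\,]$ and $[\,b^1, b^2\,]$ is defined by

\[ \left\langle [\,a^1, a^2\,], [\,b^1, b^2\,] \right\rangle = \begin{bmatrix} a^1 & a^2 \end{bmatrix} \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} b^1 \\ b^2 \end{bmatrix} \tag{147} \]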

The matrix $[\,g_{ij}\,]$, or the bilinear function (i.e., the inner product) defined by it, is called the first fundamental form. The entries of the matrix, the $g_{ij}$, generally vary from point to point, and we usually want to consider surfaces and parametrizations where these vary smoothly.

If we were to consider a vector at a certain point, say $[\,1, 0\,]$, we can compute its inner product with itself using the first fundamental form.

\[ \langle [\,1, 0\,], [\,1, 0\,] \rangle = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = [\,g_{11}\,] \tag{148} \]

The quantity $g_{11}$ came from the inner product of the vector on the surface corresponding to $[\,1, 0\,]$, so this should not be surprising. Perhaps more importantly, note that once we have the matrix $[\,g_{ij}\,]$, we can work exclusively with vectors in the $u^1u^2$-plane, and while this vector is a unit vector under the dot product, with respect to this new inner product, it has magnitude $\sqrt{\langle [\,1, 0\,], [\,1, 0\,] \rangle} = \sqrt{g_{11}}$.

The matrix $[\,g_{ij}\,]$ is also called the Riemannian metric or simply the metric.
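As a small worked instance, the Python sketch below computes the first fundamental form $g_{ij} = \frac{d\mathbf{x}}{du^i} \cdot \frac{d\mathbf{x}}{du^j}$ for a hypothetical sample surface, the paraboloid $\mathbf{x}(u^1, u^2) = [\,u^1, u^2, (u^1)^2 + (u^2)^2\,]$, whose tangent vectors are easy to write down exactly. The surface and the function names are assumptions made for this example only.

```python
import math


def tangents(u1, u2):
    """Exact dx/du1 and dx/du2 for the sample surface x = (u1, u2, u1^2 + u2^2)."""
    return (1.0, 0.0, 2.0 * u1), (0.0, 1.0, 2.0 * u2)


def first_fundamental_form(u1, u2):
    """Return [g_ij] with g_ij the dot product of tangent vectors i and j."""
    t = tangents(u1, u2)
    return [[sum(a * b for a, b in zip(t[i], t[j])) for j in range(2)]
            for i in range(2)]


g = first_fundamental_form(1.0, 0.0)
speed_e1 = math.sqrt(g[0][0])  # magnitude of [1, 0] under the new inner product
```

At $(1, 0)$ this gives $g_{11} = 5$, $g_{12} = g_{21} = 0$ and $g_{22} = 1$, so the vector $[\,1, 0\,]$ has magnitude $\sqrt{5}$ there, while at $(0, 0)$ the metric is the identity and it has magnitude 1. This is the point-to-point variation described above.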


3. Curvature

The curvature of the surface at a point depends on how the unit normal vector is rotating as it moves past the point. This depends, of course, on how the normal is moving past the point, but this relationship is based on the derivative, and so, it is linear. We need, therefore, to find the derivative of the unit normal in two directions, and this is most convenient in the directions corresponding to $u^1$ and $u^2$. In other words, the curvature of the surface is completely described by $\frac{d\mathbf{n}}{du^1}$ and $\frac{d\mathbf{n}}{du^2}$. In practice, these can be complicated derivatives to compute. For that reason, and also to understand them better, we will look at their relationship with other derivatives.

The first derivatives of $\mathbf{x}$ determine $\mathbf{n}$, so the second derivatives of $\mathbf{x}$ should also determine the derivatives of $\mathbf{n}$. These relationships are linear, so all should be expressible in terms of matrix multiplications. Note that the product rule and chain rule generalize to inner products and cross products in a natural way, and we will use these as we need them.

The unit normal vector can be written in terms of the first derivatives, $\frac{d\mathbf{x}}{du^1}$ and $\frac{d\mathbf{x}}{du^2}$, as follows, since the cross product is perpendicular to the two factors.

\[ \mathbf{n} = \frac{\frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2}}{\left\| \frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2} \right\|} \tag{149} \]

Differentiating this expression directly is not immediately illuminating, so we will approach this from another direction. Each of the second partial derivatives $\frac{d^2\mathbf{x}}{du^j du^i}$ measures the change in the first derivative $\frac{d\mathbf{x}}{du^i}$ at each point of the surface. Some of this change occurs in the form of a change in magnitude, and some of this change is a result of the vector rotating. Furthermore, some change occurs along the tangent plane, and the rest in the direction of the normal vector. In any case, the two tangent vectors and the unit normal at each point form a basis for $\mathbb{R}^3$, so each second derivative can be expressed as a linear combination of these three vectors.

\[
\begin{aligned}
\frac{d^2\mathbf{x}}{du^j du^i} &= L_{ij}\mathbf{n} + \Gamma^1_{ij}\frac{d\mathbf{x}}{du^1} + \Gamma^2_{ij}\frac{d\mathbf{x}}{du^2} \\
&= L_{ij}\mathbf{n} + \Gamma^k_{ij}\frac{d\mathbf{x}}{du^k} \quad \text{(in Einstein notation)}
\end{aligned}
\tag{150}
\]

The four numbers $L_{ij}$ measure how quickly the first derivatives turn away from the surface. These together, therefore, can conceivably contain all of the surface’s curvature information. We are assuming that this curvature information comes from the linear approximation of the rotation of the unit normal vector at each point. We should look for the relationship between the $L_{ij}$ and the derivatives of the unit normal vector. Now, the unit normal has constant magnitude, so its derivatives are perpendicular to $\mathbf{n}$. In other words, the derivatives of $\mathbf{n}$ must be parallel to the tangent plane. They can be expressed, therefore, as a linear combination of the first derivatives.

\[ \frac{d\mathbf{n}}{du^i} = -L^1_i\frac{d\mathbf{x}}{du^1} - L^2_i\frac{d\mathbf{x}}{du^2} = -L^j_i\frac{d\mathbf{x}}{du^j} \tag{151} \]

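The four numbers $-L^j_i$ are different from the $L_{ij}$, and in general, the position of the indices should be understood to indicate distinct variable names. The two $L$’s are closely related, and differences in index placement will usually imply a particular relationship. Furthermore, the negative signs on the $-L^j_i$ are customary and are used to simplify the relationship between the $L$’s. We have actually seen the $L^j_i$ before. The determinant of the matrix $[\,L^j_i\,]$ is the Gauss curvature.

To establish a formula tying the $L$’s together, we will differentiate the inner product $\left\langle \frac{d\mathbf{x}}{du^i}, \mathbf{n} \right\rangle$, which is zero, since the two vectors are perpendicular.

\[
\begin{aligned}
0 = \frac{d}{du^k}\left\langle \frac{d\mathbf{x}}{du^i}, \mathbf{n} \right\rangle &= \left\langle \frac{d\mathbf{x}}{du^i}, \frac{d\mathbf{n}}{du^k} \right\rangle + \left\langle \frac{d^2\mathbf{x}}{du^k du^i}, \mathbf{n} \right\rangle \\
&= \left\langle \frac{d\mathbf{x}}{du^i}, -L^j_k\frac{d\mathbf{x}}{du^j} \right\rangle + L_{ik} \\
&= -L^j_k\left\langle \frac{d\mathbf{x}}{du^i}, \frac{d\mathbf{x}}{du^j} \right\rangle + L_{ik} \\
&= -L^j_kg_{ij} + L_{ik}
\end{aligned}
\tag{152}
\]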

We have, therefore, that

\[ L_{ik} = L^j_kg_{ij} \quad \text{(in Einstein notation)}. \tag{153} \]

This is a typical arrangement. We may speak of lowering an index, which means multiplying by the matrix associated with the metric. Note that the ordering of the indices is not critical, since the matrices we will be dealing with are for the most part symmetric. The matrix $[\,g_{ij}\,]$ has a matrix inverse, which we will denote by

\[ [\,g_{ij}\,]^{-1} = [\,g^{jk}\,]. \tag{154} \]

Again, the $g$ with two superscripts is distinct from the $g$ with two subscripts. By definition, one is the metric, the other is the metric’s inverse. In particular, let

\[ \delta^i_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{155} \]

Essentially, $[\,\delta^i_j\,]$ is the identity matrix. We can express the fact that $[\,g_{ij}\,]$ and $[\,g^{jk}\,]$ are inverse matrices in Einstein notation by

\[ g_{ij}g^{jk} = \delta^k_i. \tag{156} \]

Equation (153) can be reversed using Einstein notation as

\begin{align}
L_{ik} &= L^j_kg_{ij} \tag{157} \\
L_{ik}g^{il} &= L^j_kg_{ij}g^{il} \tag{158} \\
L_{ik}g^{il} &= L^j_k\delta^l_j = L^l_k. \tag{159}
\end{align}
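The round trip in (157) through (159) can be checked numerically. In the Python sketch below, the metric and the numbers $L^j_k$ are made-up sample values (any symmetric invertible matrix would do for the metric), and the helper names are illustrative only.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def lower(L_up, g):
    """L_ik = L^j_k g_ij, with L_up[j][k] standing for L^j_k."""
    return [[sum(L_up[j][k] * g[i][j] for j in range(2)) for k in range(2)]
            for i in range(2)]


def raise_index(L_down, g_inv):
    """L^l_k = L_ik g^il, with L_down[i][k] standing for L_ik."""
    return [[sum(L_down[i][k] * g_inv[i][l] for i in range(2)) for k in range(2)]
            for l in range(2)]


g = [[2.0, 1.0],
     [1.0, 3.0]]        # made-up symmetric metric
L_up = [[0.5, -1.0],
        [2.0, 0.25]]    # made-up values for L^j_k

L_down = lower(L_up, g)                          # equation (157)
L_back = raise_index(L_down, inverse_2x2(g))     # equations (158)-(159)
```

Up to rounding, `L_back` agrees with `L_up`, which is the content of (159). The symmetry of `g` is what allows the index order in `g[i][j]` and `g_inv[i][l]` to be treated loosely.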

The matrix $[\,L^i_j\,]$ defines a linear transformation relating the rate at which the unit normal vector rotates with a corresponding velocity vector on the surface at a particular point on the surface. This linear transformation is called the Weingarten map. At each point of the surface the Weingarten map is a linear approximation to the Gauss map. It is, therefore, a derivative of the Gauss map, at least in some sense. The ratio of areas under the Gauss map with areas on the surface is the Gauss curvature, so the Weingarten map is intimately related to the curvature of the surface. Since the relationship is linear, the particular region we choose to use to compute the two areas is not important. The easiest choice is the square determined by the unit vectors $[\,1, 0\,]$ and $[\,0, 1\,]$ in the $u^1u^2$-plane. The image of this square under $d\mathbf{x}$ is the parallelogram determined by the two tangent vectors $\frac{d\mathbf{x}}{du^1}$ and $\frac{d\mathbf{x}}{du^2}$. The area of this parallelogram can be found using the cross product

\[ \text{Area on surface} = \left\| \frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2} \right\|. \tag{160} \]

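The corresponding vectors under the Weingarten map are $\frac{d\mathbf{n}}{du^1} = -L^1_1\frac{d\mathbf{x}}{du^1} - L^2_1\frac{d\mathbf{x}}{du^2}$ and $\frac{d\mathbf{n}}{du^2} = -L^1_2\frac{d\mathbf{x}}{du^1} - L^2_2\frac{d\mathbf{x}}{du^2}$. The area of the parallelogram determined by these two vectors can also be found using the cross product. Using the fact that the cross product is bilinear and anti-symmetric, we see that

\[
\begin{aligned}
\frac{d\mathbf{n}}{du^1} \times \frac{d\mathbf{n}}{du^2} &= \left( -L^1_1\frac{d\mathbf{x}}{du^1} - L^2_1\frac{d\mathbf{x}}{du^2} \right) \times \left( -L^1_2\frac{d\mathbf{x}}{du^1} - L^2_2\frac{d\mathbf{x}}{du^2} \right) \\
&= L^1_1L^1_2\,\frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^1} + L^1_1L^2_2\,\frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2} \\
&\quad + L^2_1L^1_2\,\frac{d\mathbf{x}}{du^2} \times \frac{d\mathbf{x}}{du^1} + L^2_1L^2_2\,\frac{d\mathbf{x}}{du^2} \times \frac{d\mathbf{x}}{du^2} \\
&= 0 + L^1_1L^2_2\,\frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2} - L^2_1L^1_2\,\frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2} + 0 \\
&= (L^1_1L^2_2 - L^2_1L^1_2)\,\frac{d\mathbf{x}}{du^1} \times \frac{d\mathbf{x}}{du^2}
\end{aligned}
\tag{161}
\]

The ratio of the areas under the Gauss map and areas on the surface, therefore, is given by the determinant of the matrix $[\,L^i_j\,]$.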
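The following Python sketch estimates this determinant numerically for a sample surface. It is only a sanity check under stated assumptions: the surface is a sphere of radius 2 (chosen for this illustration), all derivatives are central-difference estimates, and instead of solving for the $L^i_j$ individually it uses the fact that $L_{ik} = L^j_kg_{ij}$ implies $\det[\,L_{ik}\,] = \det[\,L^j_k\,]\det[\,g_{ij}\,]$. The names `sphere`, `partial`, and `gauss_curvature` are made up for the example.

```python
import math


def sphere(u1, u2, r=2.0):
    """Sample surface: sphere of radius r (u1 = polar angle, u2 = longitude)."""
    return (r * math.sin(u1) * math.cos(u2),
            r * math.sin(u1) * math.sin(u2),
            r * math.cos(u1))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def partial(f, i, h):
    """Central-difference derivative of the vector function f(u1, u2) in u^(i+1)."""
    def df(u1, u2):
        step = (h, 0.0) if i == 0 else (0.0, h)
        p = f(u1 + step[0], u2 + step[1])
        m = f(u1 - step[0], u2 - step[1])
        return tuple((a - b) / (2 * h) for a, b in zip(p, m))
    return df


def gauss_curvature(x, u1, u2, h=1e-4):
    d = [partial(x, i, h) for i in range(2)]
    t = [d[i](u1, u2) for i in range(2)]                # tangent vectors dx/du^i
    c = cross(t[0], t[1])
    n = tuple(ci / math.sqrt(dot(c, c)) for ci in c)    # unit normal, as in (149)
    g = [[dot(t[i], t[j]) for j in range(2)] for i in range(2)]
    # L_ij = <d2x/du^j du^i, n>, read off from (150)
    L = [[dot(partial(d[i], j, h)(u1, u2), n) for j in range(2)]
         for i in range(2)]
    det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    det_L = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    return det_L / det_g                                # det of [L^i_j]


K = gauss_curvature(sphere, 1.0, 0.5)
```

For a sphere of radius $r$ the Gauss curvature is $1/r^2$, so `K` should come out close to $0.25$ here, up to the error of the finite differences.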

4. The Inverse of the Metric

It is more difficult to compute distances along a surface, since the metric changes from point to point. At a particular point, however, the vector $[\,1, 0\,]$ say, has an interpretation as a velocity vector with magnitude, or speed, $\sqrt{g_{11}}$. A small movement in this direction from this point, say $[\,\Delta u^1, 0\,]$, corresponds to a distance $\sqrt{g_{11}}\,\Delta u^1$. This would be a good approximation for a distance along the surface in this direction for small values of $\Delta u^1$. From this point, it should be conceivable that we could compute distances using integration, but that is not the concern here. We will consider differentiation first.

Remember that we obtained the metric from the first partial derivatives of the parametrization, and for unit vectors $\mathbf{u}$, $\sqrt{\langle \mathbf{u}, \mathbf{u} \rangle}$ is the magnitude of the corresponding tangent vector on the surface. It is, in some sense, a directional derivative.

CHAPTER 7

Riemannian Curvature Tensor

1. Intrinsic Interpretations

We have discussed the second derivatives of the vector function $\mathbf{x}$ in terms of the following.

\[ \frac{d^2\mathbf{x}}{du^j du^i} = L_{ij}\mathbf{n} + \Gamma^k_{ij}\frac{d\mathbf{x}}{du^k} \tag{162} \]

These are called Gauss’ formulas. The $L_{ij}$ are called the coefficients of the second fundamental form, and the $\Gamma^k_{ij}$ are called the Christoffel symbols (of the second kind). We have talked about the $L_{ij}$ a bit, and right now, we will focus on the $\Gamma^k_{ij}$.

We saw earlier that if we followed a tangent vector around a closed path in a plane, then the net rotation of the tangent vector would be $2\pi$ (radians). Any deviation from $2\pi$ gives us direct information about total curvature contained within the closed curve. Let us try to follow a tangent vector around the path corresponding to the unit square in the $u^1u^2$-plane ($[0, 1] \times [0, 1]$) using information only available at the point (via derivatives). That is, we will use the $\Gamma^k_{ij}$ and perhaps derivatives of these at the point. Imposing the normal looking graph paper of the $u^1u^2$-plane on the surface, we will be running around one of the “squares,” which we’ll call s-squares, to have a name.

Running from $(0, 0)$ to $(1, 0)$, the tangent vector $\frac{d\mathbf{x}}{du^1}$ gives us a velocity on the surface. We will assume that this vector is tangent to the first side of the s-square. The vector $\frac{d\mathbf{x}}{du^1}$ will move a distance $\left\| \frac{d\mathbf{x}}{du^1} \right\|$ (approximately) to the next vertex. It will turn towards $\frac{d\mathbf{x}}{du^2}$ according to $\Gamma^2_{11}$. It changes magnitude according to $\Gamma^1_{11}$. In particular, traversing the first side affects the tangent vector in the following way

\[ \frac{d\mathbf{x}}{du^1} \to (1 + \Gamma^1_{11})\frac{d\mathbf{x}}{du^1} + \Gamma^2_{11}\frac{d\mathbf{x}}{du^2}. \tag{163} \]

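Now we want to push this vector along the path corresponding to the segment from $(1, 0)$ to $(1, 1)$. We’re pushing in the direction of $u^2$, so we are interested in $\Gamma^1_{12}$ and $\Gamma^2_{12}$, but these have changed slightly, since we’re starting at a different point. This change can be approximated using $\frac{d\Gamma^1_{12}}{du^1}$ and $\frac{d\Gamma^2_{12}}{du^1}$. We have moved a distance corresponding to a change in $u^1$ of one unit, so these are the differences we need. We have the effect of traversing the next side as

\[
\begin{aligned}
&\left[ 1 + \Gamma^1_{11} + \left( 1 + \Gamma^1_{11} \right)\left( \Gamma^1_{12} + \frac{d\Gamma^1_{12}}{du^1} \right) \right]\frac{d\mathbf{x}}{du^1} + \left[ \Gamma^2_{11} + \left( 1 + \Gamma^1_{11} \right)\left( \Gamma^2_{12} + \frac{d\Gamma^2_{12}}{du^1} \right) \right]\frac{d\mathbf{x}}{du^2} \\
&\quad = \left[ 1 + \Gamma^1_{11} + \Gamma^1_{12} + \frac{d\Gamma^1_{12}}{du^1} + \Gamma^1_{11}\Gamma^1_{12} + \Gamma^1_{11}\frac{d\Gamma^1_{12}}{du^1} \right]\frac{d\mathbf{x}}{du^1} \\
&\qquad + \left[ \Gamma^2_{11} + \Gamma^2_{12} + \frac{d\Gamma^2_{12}}{du^1} + \Gamma^1_{11}\Gamma^2_{12} + \Gamma^1_{11}\frac{d\Gamma^2_{12}}{du^1} \right]\frac{d\mathbf{x}}{du^2}
\end{aligned}
\tag{164}
\]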

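Pushing $\frac{d\mathbf{x}}{du^1}$ the other way around the square, from $(0, 0)$ to $(0, 1)$ to $(1, 1)$, yields

\[
\begin{aligned}
&\left[ 1 + \Gamma^1_{12} + \left( 1 + \Gamma^1_{12} \right)\left( \Gamma^1_{11} + \frac{d\Gamma^1_{11}}{du^2} \right) \right]\frac{d\mathbf{x}}{du^1} + \left[ \Gamma^2_{12} + \left( 1 + \Gamma^1_{12} \right)\left( \Gamma^2_{11} + \frac{d\Gamma^2_{11}}{du^2} \right) \right]\frac{d\mathbf{x}}{du^2} \\
&\quad = \left[ 1 + \Gamma^1_{12} + \Gamma^1_{11} + \frac{d\Gamma^1_{11}}{du^2} + \Gamma^1_{12}\Gamma^1_{11} + \Gamma^1_{12}\frac{d\Gamma^1_{11}}{du^2} \right]\frac{d\mathbf{x}}{du^1} \\
&\qquad + \left[ \Gamma^2_{12} + \Gamma^2_{11} + \frac{d\Gamma^2_{11}}{du^2} + \Gamma^1_{12}\Gamma^2_{11} + \Gamma^1_{12}\frac{d\Gamma^2_{11}}{du^2} \right]\frac{d\mathbf{x}}{du^2}
\end{aligned}
\tag{165}
\]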

This is all wrong. Start over again.

We want to push the vector $\mathbf{x}_1$ around the square two ways: in the $u^1$ direction and then the $u^2$ direction, and also in the $u^2$ direction and then the $u^1$ direction. Let’s call these $\mathbf{x}_1(12)$ and $\mathbf{x}_1(21)$. We’ll also say $\mathbf{x}_1(1)$ is the intermediate vector after pushing $\mathbf{x}_1$ in only the $u^1$ direction.

Based on the information available at the original point, we can make the following “best guess.” We start with

\begin{align}
\mathbf{x}_1(1) &= \mathbf{x}_1 + \Gamma^1_{11}\mathbf{x}_1 + \Gamma^2_{11}\mathbf{x}_2 \tag{166} \\
\mathbf{x}_2(1) &= \mathbf{x}_2 + \Gamma^1_{21}\mathbf{x}_1 + \Gamma^2_{21}\mathbf{x}_2 \tag{167}
\end{align}

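As we push $\mathbf{x}_1(1)$ to $\mathbf{x}_1(12)$, we can use the best information we have about the current states of the various quantities. We have estimates of $\mathbf{x}_1(1)$ and $\mathbf{x}_2(1)$, and we can also estimate the new $\Gamma^k_{12}$ with

\begin{align}
\Gamma^1_{12}(1) &= \Gamma^1_{12} + \frac{d}{du^1}\Gamma^1_{12} \tag{168} \\
\Gamma^2_{12}(1) &= \Gamma^2_{12} + \frac{d}{du^1}\Gamma^2_{12}. \tag{169}
\end{align}

Therefore, we can make the estimate

\[ \mathbf{x}_1(12) = \mathbf{x}_1(1) + \Gamma^1_{12}(1)\mathbf{x}_1(1) + \Gamma^2_{12}(1)\mathbf{x}_2(1), \tag{170} \]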

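and substitution yields

\[
\begin{aligned}
\mathbf{x}_1(12) &= \mathbf{x}_1 + \Gamma^1_{11}\mathbf{x}_1 + \Gamma^2_{11}\mathbf{x}_2 \\
&\quad + \left( \Gamma^1_{12} + \frac{d}{du^1}\Gamma^1_{12} \right)\left( \mathbf{x}_1 + \Gamma^1_{11}\mathbf{x}_1 + \Gamma^2_{11}\mathbf{x}_2 \right) \\
&\quad + \left( \Gamma^2_{12} + \frac{d}{du^1}\Gamma^2_{12} \right)\left( \mathbf{x}_2 + \Gamma^1_{21}\mathbf{x}_1 + \Gamma^2_{21}\mathbf{x}_2 \right) \\
&= \left[ 1 + \Gamma^1_{11} + \Gamma^1_{12} + \frac{d\Gamma^1_{12}}{du^1} + \Gamma^1_{12}\Gamma^1_{11} + \frac{d\Gamma^1_{12}}{du^1}\Gamma^1_{11} + \Gamma^2_{12}\Gamma^1_{21} + \frac{d\Gamma^2_{12}}{du^1}\Gamma^1_{21} \right]\mathbf{x}_1 \\
&\quad + \left[ \Gamma^2_{11} + \Gamma^1_{12}\Gamma^2_{11} + \frac{d\Gamma^1_{12}}{du^1}\Gamma^2_{11} + \Gamma^2_{12} + \frac{d\Gamma^2_{12}}{du^1} + \Gamma^2_{12}\Gamma^2_{21} + \frac{d\Gamma^2_{12}}{du^1}\Gamma^2_{21} \right]\mathbf{x}_2
\end{aligned}
\tag{171}
\]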

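For $\mathbf{x}_1(21)$, we can make similar approximations.

\[ \mathbf{x}_1(21) = \mathbf{x}_1(2) + \Gamma^1_{11}(2)\mathbf{x}_1(2) + \Gamma^2_{11}(2)\mathbf{x}_2(2). \tag{172} \]

The intermediate values are

\begin{align}
\mathbf{x}_1(2) &= \mathbf{x}_1 + \Gamma^1_{12}\mathbf{x}_1 + \Gamma^2_{12}\mathbf{x}_2 \tag{173} \\
\mathbf{x}_2(2) &= \mathbf{x}_2 + \Gamma^1_{22}\mathbf{x}_1 + \Gamma^2_{22}\mathbf{x}_2 \tag{174} \\
\Gamma^1_{11}(2) &= \Gamma^1_{11} + \frac{d}{du^2}\Gamma^1_{11} \tag{175} \\
\Gamma^2_{11}(2) &= \Gamma^2_{11} + \frac{d}{du^2}\Gamma^2_{11}. \tag{176}
\end{align}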

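Therefore,

\[
\begin{aligned}
\mathbf{x}_1(21) &= \mathbf{x}_1 + \Gamma^1_{12}\mathbf{x}_1 + \Gamma^2_{12}\mathbf{x}_2 \\
&\quad + \left( \Gamma^1_{11} + \frac{d}{du^2}\Gamma^1_{11} \right)\left( \mathbf{x}_1 + \Gamma^1_{12}\mathbf{x}_1 + \Gamma^2_{12}\mathbf{x}_2 \right) \\
&\quad + \left( \Gamma^2_{11} + \frac{d}{du^2}\Gamma^2_{11} \right)\left( \mathbf{x}_2 + \Gamma^1_{22}\mathbf{x}_1 + \Gamma^2_{22}\mathbf{x}_2 \right) \\
&= \left[ 1 + \Gamma^1_{12} + \Gamma^1_{11} + \frac{d\Gamma^1_{11}}{du^2} + \Gamma^1_{11}\Gamma^1_{12} + \frac{d\Gamma^1_{11}}{du^2}\Gamma^1_{12} + \Gamma^2_{11}\Gamma^1_{22} + \frac{d\Gamma^2_{11}}{du^2}\Gamma^1_{22} \right]\mathbf{x}_1 \\
&\quad + \left[ \Gamma^2_{12} + \Gamma^1_{11}\Gamma^2_{12} + \frac{d\Gamma^1_{11}}{du^2}\Gamma^2_{12} + \Gamma^2_{11} + \frac{d\Gamma^2_{11}}{du^2} + \Gamma^2_{11}\Gamma^2_{22} + \frac{d\Gamma^2_{11}}{du^2}\Gamma^2_{22} \right]\mathbf{x}_2.
\end{aligned}
\tag{177}
\]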


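Some measure of the curvature is given by how different $\mathbf{x}_1(12)$ is from $\mathbf{x}_1(21)$. If we subtract, we see that

\[
\begin{aligned}
\mathbf{x}_1(12) - \mathbf{x}_1(21) &= \left[ \frac{d\Gamma^1_{12}}{du^1} - \frac{d\Gamma^1_{11}}{du^2} + \Gamma^1_{12}\Gamma^1_{11} - \Gamma^1_{11}\Gamma^1_{12} + \frac{d\Gamma^1_{12}}{du^1}\Gamma^1_{11} - \frac{d\Gamma^1_{11}}{du^2}\Gamma^1_{12} \right. \\
&\qquad \left. + \Gamma^2_{12}\Gamma^1_{21} - \Gamma^2_{11}\Gamma^1_{22} + \frac{d\Gamma^2_{12}}{du^1}\Gamma^1_{21} - \frac{d\Gamma^2_{11}}{du^2}\Gamma^1_{22} \right]\mathbf{x}_1 \\
&\quad + \left[ \Gamma^1_{12}\Gamma^2_{11} - \Gamma^1_{11}\Gamma^2_{12} + \frac{d\Gamma^1_{12}}{du^1}\Gamma^2_{11} - \frac{d\Gamma^1_{11}}{du^2}\Gamma^2_{12} + \frac{d\Gamma^2_{12}}{du^1} - \frac{d\Gamma^2_{11}}{du^2} \right. \\
&\qquad \left. + \Gamma^2_{12}\Gamma^2_{21} - \Gamma^2_{11}\Gamma^2_{22} + \frac{d\Gamma^2_{12}}{du^1}\Gamma^2_{21} - \frac{d\Gamma^2_{11}}{du^2}\Gamma^2_{22} \right]\mathbf{x}_2 \\
&= \left[ \frac{d\Gamma^1_{12}}{du^1} - \frac{d\Gamma^1_{11}}{du^2} + \Gamma^p_{12}\Gamma^1_{p1} - \Gamma^p_{11}\Gamma^1_{p2} + \frac{d\Gamma^p_{12}}{du^1}\Gamma^1_{p1} - \frac{d\Gamma^p_{11}}{du^2}\Gamma^1_{p2} \right]\mathbf{x}_1 \\
&\quad + \left[ \Gamma^p_{12}\Gamma^2_{p1} - \Gamma^p_{11}\Gamma^2_{p2} + \frac{d\Gamma^p_{12}}{du^1}\Gamma^2_{p1} - \frac{d\Gamma^p_{11}}{du^2}\Gamma^2_{p2} + \frac{d\Gamma^2_{12}}{du^1} - \frac{d\Gamma^2_{11}}{du^2} \right]\mathbf{x}_2.
\end{aligned}
\tag{178}
\]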

If we were pushing $\mathbf{x}_1$ around an $\varepsilon$-square, then all terms will have a factor of $\varepsilon$, except for the terms like $\frac{d\Gamma^1_{12}}{du^1}\Gamma^1_{11}$, which would have a factor of $\varepsilon^2$. It is conceivable that in a comparison with the actual function, the $\varepsilon^2$-terms would become irrelevant. This brings agreement with the Riemannian curvature tensor. That is, $\mathbf{x}_1(12) - \mathbf{x}_1(21) = R^k_{112}\mathbf{x}_k$. In Millman and Parker, it is shown that

\[ K = \frac{R^l_{121}g_{l2}}{g}. \tag{179} \]
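As a numerical sanity check of (179), the Python sketch below evaluates it on the unit sphere. Several things here are assumptions rather than statements from the text: $R^l_{ijk}$ is taken to be the expression obtained from the bracketed terms above by dropping the products that involve a derivative and letting the indices range freely, namely $R^l_{ijk} = \frac{d\Gamma^l_{ik}}{du^j} - \frac{d\Gamma^l_{ij}}{du^k} + \Gamma^p_{ik}\Gamma^l_{pj} - \Gamma^p_{ij}\Gamma^l_{pk}$; the sphere uses the standard parametrization with $g_{11} = 1$, $g_{22} = \sin^2 u^1$, $g_{12} = 0$, whose nonzero Christoffel symbols are $\Gamma^1_{22} = -\sin u^1\cos u^1$ and $\Gamma^2_{12} = \Gamma^2_{21} = \cot u^1$; and $g$ denotes the determinant of the metric. Derivatives of the $\Gamma^k_{ij}$ are central-difference estimates.

```python
import math


def christoffel(u1, u2):
    """G[k][i][j] = Gamma^k_{ij} for the unit sphere (0-based indices),
    with metric g11 = 1, g22 = sin^2(u1), g12 = 0."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(u1) * math.cos(u1)               # Gamma^1_22
    G[1][0][1] = G[1][1][0] = math.cos(u1) / math.sin(u1)   # Gamma^2_12 = Gamma^2_21
    return G


def d_gamma(j, u1, u2, h=1e-5):
    """Central-difference derivative of every Gamma^k_{ij} in the u^(j+1) direction."""
    du = (h, 0.0) if j == 0 else (0.0, h)
    P = christoffel(u1 + du[0], u2 + du[1])
    M = christoffel(u1 - du[0], u2 - du[1])
    return [[[(P[k][i][m] - M[k][i][m]) / (2 * h) for m in range(2)]
             for i in range(2)] for k in range(2)]


def riemann(l, i, j, k, u1, u2):
    """R^l_{ijk} = d_j Gamma^l_ik - d_k Gamma^l_ij
                   + Gamma^p_ik Gamma^l_pj - Gamma^p_ij Gamma^l_pk (assumed form)."""
    G = christoffel(u1, u2)
    dG = [d_gamma(0, u1, u2), d_gamma(1, u1, u2)]
    value = dG[j][l][i][k] - dG[k][l][i][j]
    for p in range(2):
        value += G[p][i][k] * G[l][p][j] - G[p][i][j] * G[l][p][k]
    return value


u1, u2 = 1.0, 0.3
g = [[1.0, 0.0], [0.0, math.sin(u1) ** 2]]
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]

# K = R^l_{121} g_{l2} / g; the 1-based indices (1, 2, 1) become (0, 1, 0).
K = sum(riemann(l, 0, 1, 0, u1, u2) * g[l][1] for l in range(2)) / det_g
```

Under these assumptions `K` comes out close to 1, the Gauss curvature of the unit sphere. Swapping the last two indices of $R$ flips the sign of the result, so the index order in the formula matters.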

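It seems that $R^l_{112}$ or $R^l_{212}$ could also be used with an appropriate $g_{ij}$. In our case, this appears to be

\[ K = \frac{R^l_{112}g_{l2}}{g}. \tag{180} \]

This appears to be

\[ \frac{\langle \mathbf{x}_1(12) - \mathbf{x}_1(21), \mathbf{x}_2 \rangle}{g}. \tag{181} \]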

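To see that the difference seen in the vector $\mathbf{x}_1$ as it is pushed around the square two different ways is relevant, we get an initial confirmation from the following.

In the Smarandache Manifolds book, it is shown that the relative angle between two geodesics increases or decreases depending on the total curvature between the geodesics. If the relative angle decreases by $\theta$ radians, then there must be $\theta$ total curvature between. The total curvature, therefore, must be $\pm$ the angle between $\mathbf{x}_1(12)$ and $\mathbf{x}_1(21)$. The formula given by Millman and Parker computes this angle. We can verify this as follows.

\begin{align}
\langle \mathbf{x}_1(12), \mathbf{x}_2 \rangle &= \|\mathbf{x}_1(12)\|\,\|\mathbf{x}_2\| \cos\theta_{12} \approx \|\mathbf{x}_1\|\,\|\mathbf{x}_2\| \cos\theta_{12} \tag{182} \\
\langle \mathbf{x}_1(21), \mathbf{x}_2 \rangle &= \|\mathbf{x}_1(21)\|\,\|\mathbf{x}_2\| \cos\theta_{21} \approx \|\mathbf{x}_1\|\,\|\mathbf{x}_2\| \cos\theta_{21}, \tag{183}
\end{align}

and we want $\theta_{12} - \theta_{21}$. We can use the trig identity

\[ \cos\alpha - \cos\beta = -2\sin\left( \frac{\alpha + \beta}{2} \right)\sin\left( \frac{\alpha - \beta}{2} \right). \tag{184} \]

We also have the fact that

\[ g = \|\mathbf{x}_1 \times \mathbf{x}_2\|^2. \tag{185} \]

Therefore,

\[
\begin{aligned}
\frac{R^l_{121}g_{l2}}{g} &= \frac{\|\mathbf{x}_1\|\,\|\mathbf{x}_2\|\,(-2)\sin\left( \frac{\theta_{12} + \theta_{21}}{2} \right)\sin\left( \frac{\theta_{12} - \theta_{21}}{2} \right)}{\|\mathbf{x}_1\|^2\,\|\mathbf{x}_2\|^2 \sin^2\theta} \\
&\approx \frac{\theta_{12} - \theta_{21}}{\|\mathbf{x}_1\|\,\|\mathbf{x}_2\| \sin\theta} \\
&= K
\end{aligned}
\tag{186}
\]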

CHAPTER 8

Curvature of 3-Dimensional Spaces

1. What we know

There has been some work done with polyhedral metrics by Gromov, and later by Aitchison and Rubinstein. The latter two worked with cubings of 3-manifolds, which consist of flat 3-cubes, and the curvature is concentrated in the 1-skeleton. The dihedral angle around each edge must be at least $2\pi$, so the geometry is Euclidean or hyperbolic around the edges. At each vertex, it is required that $\mathrm{lk}(v)$ has the property that every 1-cycle has at least three edges and that every 1-cycle with exactly three edges bounds a triangle contained in exactly one cube. I’m not sure what $\mathrm{lk}(v)$ is, but my first guess is that it is a small ball about the vertex, and I think this means that it is a simplicial complex. My second guess is that it is the surface of this ball. Each triangle corresponds to a cube. We’re looking at the triangulation of a 2-sphere. OK, I think this is it, and I believe $\mathrm{lk}(v)$ stands for the link of $v$.

2. What is the geometry like around a vertex of a cubed 3-manifold?

The simplest case might be eight cubes arranged like the octants of $\mathbb{R}^3$ about the origin. Adding two more in the most obvious way yields the geometry of stacked cones with impulse curvature $-\frac{\pi}{2}$.

Question 1. What is the nature of the curvature at a point beyond this type of dihedral curvature?

3. A positive curvature example

This simple configuration consists of the corners of four cubes meeting at one vertex. A polyhedral ball about this vertex would form a tetrahedron. Since only one vertex is being considered, we can build the space out of four of the eight octants of $\mathbb{R}^3$. We will use the $x^+y^+z^+$-octant and the three octants adjacent to it, the $x^-y^+z^+$-octant, the $x^+y^-z^+$-octant, and the $x^+y^+z^-$-octant. From this configuration, we will identify the $x^-z^+$-quarter plane of the $x^-y^+z^+$-octant with the $y^-z^+$-quarter plane of the $x^+y^-z^+$-octant; the $x^-y^+$-quarter plane of the $x^-y^+z^+$-octant with the $y^+z^-$-quarter plane of the $x^+y^+z^-$-octant; and the $x^+z^-$-quarter plane of the $x^+y^+z^-$-octant with the $x^+y^-$-quarter plane of the $x^+y^-z^+$-octant. A view of this using cubes is shown in Figure 1. Note that we are left with four semi-axes. The $x^+$-, $y^+$-, and $z^+$-axes remain, and the three negative axes have been identified. We will refer to this last axis as the negative axis.

The geometry in the interior of each of the octants is Euclidean, and there is $\frac{\pi}{2}$ radians of dihedral curvature along each of the axes. This can be seen in the 2-gon shown in Figure 2. It might be that it makes sense to say that there is $2\pi$ steradians of curvature at the central vertex. One possible effect is some torsion in the curvature of a curve near it.

Figure 1. A depiction of the identifications: $x^-z^+ \equiv y^-z^+$, $x^-y^+ \equiv y^+z^-$, and $x^+z^- \equiv x^+y^-$.

Figure 2. A 2-gon with total curvature $\frac{\pi}{2}$.
