ESSENTIAL PHYSICS

Part 1

RELATIVITY, PARTICLE DYNAMICS, GRAVITATION, AND WAVE MOTION

FRANK W. K. FIRK
Professor Emeritus of Physics
Yale University

2000

CONTENTS

PREFACE

1 MATHEMATICAL PRELIMINARIES
  1.1 Invariants
  1.2 Some geometrical invariants
  1.3 Elements of differential geometry
  1.4 Gaussian coordinates and the invariant line element
  1.5 Geometry and groups
  1.6 Vectors
  1.7 Quaternions
  1.8 3-vector analysis
  1.9 Linear algebra and n-vectors
  1.10 The geometry of vectors
  1.11 Linear operators and matrices
  1.12 Rotation operators
  1.13 Components of a vector under coordinate rotations

2 KINEMATICS: THE GEOMETRY OF MOTION
  2.1 Velocity and acceleration
  2.2 Differential equations of kinematics
  2.3 Velocity in Cartesian and polar coordinates
  2.4 Acceleration in Cartesian and polar coordinates

3 CLASSICAL AND SPECIAL RELATIVITY
  3.1 The Galilean transformation
  3.2 Einstein's spacetime symmetry: the Lorentz transformation
  3.3 The invariant interval: contravariant and covariant vectors
  3.4 The group structure of Lorentz transformations
  3.5 The rotation group
  3.6 The relativity of simultaneity: time dilation and length contraction
  3.7 The 4-velocity

4 NEWTONIAN DYNAMICS
  4.1 The law of inertia
  4.2 Newton's laws of motion
  4.3 Systems of many interacting particles: conservation of linear and angular momentum
  4.4 Work and energy in Newtonian dynamics
  4.5 Potential energy
  4.6 Particle interactions
  4.7 The motion of rigid bodies
  4.8 Angular velocity and the instantaneous center of rotation
  4.9 An application of the Newtonian method

5 INVARIANCE PRINCIPLES AND CONSERVATION LAWS
  5.1 Invariance of the potential under translations and the conservation of linear momentum
  5.2 Invariance of the potential under rotations and the conservation of angular momentum

6 EINSTEINIAN DYNAMICS
  6.1 4-momentum and the energy-momentum invariant
  6.2 The relativistic Doppler shift
  6.3 Relativistic collisions and the conservation of 4-momentum
  6.4 Relativistic inelastic collisions
  6.5 The Mandelstam variables
  6.6 Positron-electron annihilation-in-flight

7 NEWTONIAN GRAVITATION
  7.1 Properties of motion along curved paths in the plane
  7.2 An overview of Newtonian gravitation
  7.3 Gravitation: an example of a central force
  7.4 Motion under a central force and the conservation of angular momentum
  7.5 Kepler's 2nd law explained
  7.6 Central orbits
  7.7 Bound and unbound orbits
  7.8 The concept of the gravitational field
  7.9 The gravitational potential

8 EINSTEINIAN GRAVITATION: AN INTRODUCTION TO GENERAL RELATIVITY
  8.1 The principle of equivalence
  8.2 Time and length changes in a gravitational field
  8.3 The Schwarzschild line element
  8.4 The metric in the presence of matter
  8.5 The weak field approximation
  8.6 The refractive index of spacetime in the presence of mass
  8.7 The deflection of light grazing the sun

9 AN INTRODUCTION TO THE CALCULUS OF VARIATIONS
  9.1 The Euler equation
  9.2 The Lagrange equations
  9.3 The Hamilton equations

10 CONSERVATION LAWS, AGAIN
  10.1 The conservation of mechanical energy
  10.2 The conservation of linear and angular momentum

11 CHAOS
  11.1 The general motion of a damped, driven pendulum
  11.2 The numerical solution of differential equations

12 WAVE MOTION
  12.1 The basic form of a wave
  12.2 The general wave equation
  12.3 The Lorentz invariant phase of a wave and the relativistic Doppler shift
  12.4 Plane harmonic waves
  12.5 Spherical waves
  12.6 The superposition of harmonic waves
  12.7 Standing waves

13 ORTHOGONAL FUNCTIONS AND FOURIER SERIES
  13.1 Definitions
  13.2 Some trigonometric identities and their Fourier series
  13.3 Determination of the Fourier coefficients of a function
  13.4 The Fourier series of a periodic sawtooth waveform

APPENDIX A SOLVING ORDINARY DIFFERENTIAL EQUATIONS

BIBLIOGRAPHY

PREFACE

Throughout the decade of the 1990s, I taught a one-year course of a specialized nature to students who entered Yale College with excellent preparation in Mathematics and the Physical Sciences, and who expressed an interest in Physics or a closely related field. The level of the course was that typified by the Feynman Lectures on Physics. My one-year course was necessarily more restricted in content than the two-year Feynman Lectures. The depth of treatment of each topic was limited by the fact that the course consisted of a total of fifty-two lectures, each lasting one-and-a-quarter hours. The key role played by invariants in the Physical Universe was constantly emphasized.

The material that I covered each Fall is presented, almost verbatim, in this book. The first chapter contains key mathematical ideas, including some invariants of geometry and algebra, generalized coordinates, and the algebra and geometry of vectors. The importance of linear operators and their matrix representations is stressed in the early lectures. These mathematical concepts are required in the presentation of a unified treatment of both Classical and Special Relativity. Students are encouraged to develop a relativistic outlook at an early stage. The fundamental Lorentz transformation is developed using arguments based on symmetrizing the classical Galilean transformation. Key 4-vectors, such as the 4-velocity and 4-momentum, and their invariant norms, are shown to evolve in a natural way from their classical forms.

A basic change in the subject matter occurs at this point in the book. It is necessary to introduce the Newtonian concepts of mass, momentum, and energy, and to discuss the conservation laws of linear and angular momentum, and mechanical energy, and their associated invariants.
The discovery of these laws, and their applications to everyday problems, represents the high point in the scientific endeavor of the 17th and 18th centuries. An introduction to the general dynamical methods of Lagrange and Hamilton is delayed until Chapter 9, where they are included in a discussion of the Calculus of Variations.

The key subject of Einsteinian dynamics is treated at a level not usually met at the introductory level. The 4-momentum invariant and its uses in relativistic collisions, both elastic and inelastic, are discussed in detail in Chapter 6. Further developments in the use of relativistic invariants are given in the discussion of the Mandelstam variables, and their application to the study of high-energy collisions. Following an overview of Newtonian Gravitation, the general problem of central orbits is discussed using the powerful method of [p, r] coordinates.

Einstein's General Theory of Relativity is introduced using the Principle of Equivalence and the notion of extended inertial frames that include those frames in free fall in a gravitational field of small size in which there is no measurable field gradient. A heuristic argument is given to deduce the Schwarzschild line element in the weak field approximation; it is used as a basis for a discussion of the refractive index of spacetime in the presence of matter. Einstein's famous predicted value for the bending of a beam of light grazing the surface of the Sun is calculated.

The Calculus of Variations is an important topic in Physics and Mathematics; it is introduced in Chapter 9, where it is shown to lead to the ideas of the Lagrange and Hamilton functions. These functions are used to illustrate in a general way the conservation laws of momentum and angular momentum, and the relation of these laws to the homogeneity and isotropy of space. The subject of chaos is introduced by considering the motion of a damped, driven pendulum.
A method for solving the nonlinear equation of motion of the pendulum is outlined. Wave motion is treated from the point of view of invariance principles. The form of the general wave equation is derived, and the Lorentz invariance of the phase of a wave is discussed in Chapter 12. The final chapter deals with the problem of orthogonal functions in general, and Fourier series in particular.

At this stage in their training, students are often underprepared in the subject of Differential Equations. Some useful methods of solving ordinary differential equations are therefore given in an appendix. The students taking my course were generally required to take a parallel one-year course in the Mathematics Department that covered Vector and Matrix Algebra and Analysis at a level suitable for potential majors in Mathematics.

Here, I have presented my version of a first-semester course in Physics: a version that deals with the essentials in a no-frills way. Over the years, I demonstrated that the contents of this compact book could be successfully taught in one semester. Textbooks are concerned with taking many known facts and presenting them in clear and concise ways; my understanding of the facts is largely based on the writings of a relatively small number of celebrated authors whose work I am pleased to acknowledge in the bibliography.

Guilford, Connecticut
February, 2000

I am grateful to several readers for pointing out errors and unclear statements in my first version of this book. The comments of Dr Andre Mirabelli were particularly useful, and were taken to heart.

March, 2003

1 MATHEMATICAL PRELIMINARIES

1.1 Invariants

It is a remarkable fact that very few fundamental laws are required to describe the enormous range of physical phenomena that take place throughout the universe. The study of these fundamental laws is at the heart of Physics.
The laws are found to have a mathematical structure; the interplay between Physics and Mathematics is therefore emphasized throughout this book. For example, Galileo found by observation, and Newton developed within a mathematical framework, the Principle of Relativity: the laws governing the motions of objects have the same mathematical form in all inertial frames of reference. Inertial frames move at constant speed in straight lines with respect to each other; they are mutually non-accelerating. We say that Newton's laws of motion are invariant under the Galilean transformation (see later discussion). The discovery of key invariants of Nature has been essential for the development of the subject.

Einstein extended the Newtonian Principle of Relativity to include the motions of beams of light and of objects that move at speeds close to the speed of light. This extended principle forms the basis of Special Relativity. Later, Einstein generalized the principle to include accelerating frames of reference. The general principle is known as the Principle of Covariance; it forms the basis of the General Theory of Relativity (a theory of Gravitation).

A review of the elementary properties of geometrical invariants, generalized coordinates, linear vector spaces, and matrix operators is given at a level suitable for a sound treatment of Classical and Special Relativity. Other mathematical methods, including contra- and covariant 4-vectors, variational principles, orthogonal functions, and ordinary differential equations, are introduced as required.

1.2 Some geometrical invariants

In his book The Ascent of Man, Bronowski discusses the lasting importance of the discoveries of the Greek geometers. He gives a proof of the most famous theorem of Euclidean Geometry, namely Pythagoras' theorem, that is based on the invariance of length and angle (and therefore of area) under translations and rotations in space.
Let a right-angled triangle with sides a, b, and c be translated and rotated into the following four positions to form a square of side c:

[Figure: four copies of the triangle, labelled 1 to 4, arranged inside a square of side c; they enclose a shaded inner square of side (b - a).]

The total area of the square = $c^2$ = area of four triangles + area of shaded square. If the right-angled triangle is translated and rotated to form the rectangle:

[Figure: the four triangles, labelled 1 to 4, rearranged into a rectangle with sides labelled a and b.]

then the area of the four triangles = 2ab. The area of the shaded square is

\[ (b - a)^2 = b^2 - 2ab + a^2 . \]

We have postulated the invariance of length and angle under translations and rotations, and therefore

\[ c^2 = 2ab + (b - a)^2 = a^2 + b^2 . \tag{1.1} \]

We shall see that this key result characterizes the locally flat space in which we live. It is the only form that is consistent with the invariance of lengths and angles under translations and rotations.

The scalar product is an important invariant in Mathematics and Physics. Its invariance properties can best be seen by developing Pythagoras' theorem in a three-dimensional coordinate form. Consider the square of the distance between the points $P[x_1, y_1, z_1]$ and $Q[x_2, y_2, z_2]$ in Cartesian coordinates:

[Figure: Cartesian axes x, y, z with origin O, the points $P[x_1, y_1, z_1]$ and $Q[x_2, y_2, z_2]$, and the angle $\alpha$ between OP and OQ.]

We have

\[
\begin{aligned}
(PQ)^2 &= (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \\
&= x_2^2 - 2x_1x_2 + x_1^2 + y_2^2 - 2y_1y_2 + y_1^2 + z_2^2 - 2z_1z_2 + z_1^2 \\
&= (x_1^2 + y_1^2 + z_1^2) + (x_2^2 + y_2^2 + z_2^2) - 2(x_1x_2 + y_1y_2 + z_1z_2) \\
&= (OP)^2 + (OQ)^2 - 2(x_1x_2 + y_1y_2 + z_1z_2).
\end{aligned}
\tag{1.2}
\]

The lengths PQ, OP, OQ, and their squares, are invariants under rotations, and therefore the entire right-hand side of this equation is an invariant. The admixture of the coordinates $(x_1x_2 + y_1y_2 + z_1z_2)$ is therefore an invariant under rotations.
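The invariance just stated can be checked numerically. The following is a minimal sketch, not part of the original lectures: the helper names `rotate_z` and `dot`, the two sample points, and the rotation angle are all illustrative assumptions. It rotates both points about the z-axis and compares the combination $x_1x_2 + y_1y_2 + z_1z_2$ before and after.

```python
import math

def rotate_z(p, theta):
    """Rotate the point p = (x, y, z) about the z-axis through theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def dot(p, q):
    """The combination x1*x2 + y1*y2 + z1*z2 of the coordinates of p and q."""
    return sum(a * b for a, b in zip(p, q))

# Illustrative points and angle (arbitrary choices, not taken from the text).
P = (1.0, 2.0, 3.0)
Q = (-2.0, 0.5, 4.0)
theta = 0.7

P_rot = rotate_z(P, theta)
Q_rot = rotate_z(Q, theta)

# The individual coordinates change, but the combination does not
# (the two printed values agree to rounding error).
print(dot(P, Q), dot(P_rot, Q_rot))
```

Only a rotation about one axis is tried here; a general rotation can be built from successive rotations about the coordinate axes, each of which leaves the combination unchanged.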
The combination $x_1x_2 + y_1y_2 + z_1z_2$ has a geometric interpretation: in the triangle OPQ, with $\alpha$ the angle between OP and OQ, we have the generalized Pythagorean theorem

\[ (PQ)^2 = (OP)^2 + (OQ)^2 - 2\,OP \cdot OQ \cos\alpha , \]

therefore

\[ OP \cdot OQ \cos\alpha = x_1x_2 + y_1y_2 + z_1z_2 \equiv \text{the scalar product.} \tag{1.3} \]

Invariants in spacetime with scalar-product-like forms, such as the interval between events (see section 3.3), are of fundamental importance in the Theory of Relativity.

Although rotations in space are part of our everyday experience, the idea of rotations in spacetime is counter-intuitive. In Chapter 3 this idea is discussed in terms of the relative motion of inertial observers.

1.3 Elements of differential geometry

Nature does not prescribe a particular coordinate system or mesh. We are free to select the system that is most appropriate for the problem at hand. In the familiar Cartesian system, in which the mesh lines are orthogonal, equidistant, straight lines in the plane, the key advantage stems from our ability to calculate distances given the coordinates: we can apply Pythagoras' theorem directly. Consider an arbitrary mesh:

[Figure: an arbitrary mesh with mesh lines labelled $1_u, 2_u, 3_u$ in the u direction and $1_v$ to $4_v$ in the v direction, origin O, the point $P[3_u, 4_v]$, and an infinitesimal parallelogram with sides du and dv, included angle $\alpha$, and diagonal ds (a length).]

Given the point $P[3_u, 4_v]$, we cannot use Pythagoras' theorem to calculate the distance OP.

In the infinitesimal parallelogram shown, we might think it appropriate to write

\[ ds^2 = du^2 + dv^2 + 2\,du\,dv\cos\alpha \qquad (ds^2 \equiv (ds)^2, \text{ a squared length}). \]

This we cannot do! The differentials du and dv are not lengths; they are simply differences between two numbers that label the mesh. We must therefore multiply each differential by a quantity that converts each one into a length.
Introducing dimensioned coefficients, we have

\[ ds^2 = g_{11}\,du^2 + 2g_{12}\,du\,dv + g_{22}\,dv^2 \tag{1.4} \]

where $\sqrt{g_{11}}\,du$ and $\sqrt{g_{22}}\,dv$ are now lengths.

The problem is therefore one of finding general expressions for the coefficients; it was solved by Gauss, the pre-eminent mathematician of his age. We shall restrict our discussion to the case of two variables. Before treating this problem, it will be useful to recall the idea of a total differential associated with a function of more than one variable.

Let u = f(x, y) be a function of two variables, x and y. As x and y vary, the corresponding values of u describe a surface. For example, if $u = x^2 + y^2$, the surface is a paraboloid of revolution. The partial derivatives of u are defined by

\[ \frac{\partial f(x, y)}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \quad \text{(treat y as a constant),} \tag{1.5} \]

and

\[ \frac{\partial f(x, y)}{\partial y} = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k} \quad \text{(treat x as a constant).} \tag{1.6} \]

For example, if $u = f(x, y) = 3x^2 + 2y^3$ then

\[ \frac{\partial f}{\partial x} = 6x, \quad \frac{\partial^2 f}{\partial x^2} = 6, \quad \frac{\partial^3 f}{\partial x^3} = 0 \]

and

\[ \frac{\partial f}{\partial y} = 6y^2, \quad \frac{\partial^2 f}{\partial y^2} = 12y, \quad \frac{\partial^3 f}{\partial y^3} = 12, \quad \frac{\partial^4 f}{\partial y^4} = 0. \]

If u = f(x, y) then the total differential of the function is

\[ du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \]

corresponding to the changes $x \to x + dx$ and $y \to y + dy$. (Note that du is a function of x, y, dx, and dy, where x and y are the independent variables.)

1.4 Gaussian coordinates and the invariant line element

Consider the infinitesimal separation between two points P and Q that are described in either Cartesian or Gaussian coordinates:

[Figure: two panels. Cartesian: P at [x, y] and Q at [x + dx, y + dy], separated by ds. Gaussian: P at [u, v] and Q at [u + du, v + dv], separated by ds.]

In the Gaussian system, du and dv do not represent distances. Let

\[ x = f(u, v) \quad \text{and} \quad y = F(u, v) \tag{1.7 a,b} \]

then, in the infinitesimal limit,

\[ dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv \quad \text{and} \quad dy = \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv . \]
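In the Cartesian system, there is a direct correspondence between the mesh-numbers and distances so that

\[ ds^2 = dx^2 + dy^2 . \tag{1.8} \]

But

\[ dx^2 = \left(\frac{\partial x}{\partial u}\right)^2 du^2 + 2\,\frac{\partial x}{\partial u}\frac{\partial x}{\partial v}\,du\,dv + \left(\frac{\partial x}{\partial v}\right)^2 dv^2 \]

and

\[ dy^2 = \left(\frac{\partial y}{\partial u}\right)^2 du^2 + 2\,\frac{\partial y}{\partial u}\frac{\partial y}{\partial v}\,du\,dv + \left(\frac{\partial y}{\partial v}\right)^2 dv^2 . \]

We therefore obtain

\[
\begin{aligned}
ds^2 &= \left\{\left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2\right\} du^2
+ 2\left\{\frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v}\right\} du\,dv
+ \left\{\left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2\right\} dv^2 \\
&= g_{11}\,du^2 + 2g_{12}\,du\,dv + g_{22}\,dv^2 .
\end{aligned}
\tag{1.9}
\]

If we put $u = u_1$ and $v = u_2$ then

\[ ds^2 = \sum_i \sum_j g_{ij}\,du_i\,du_j \quad \text{where } i, j = 1, 2. \tag{1.10} \]

(This is a general form for an n-dimensional space: i, j = 1, 2, 3, ... n.)

Two important points connected with this invariant differential line element are:

1. Interpretation of the coefficients $g_{ij}$

Consider a Euclidean mesh of equispaced parallelograms:

[Figure: a mesh of equispaced parallelograms with axes u and v at angle $\alpha$; in the triangle PQR, PQ = du, QR = dv, and PR = ds.]

In PQR

\[ ds^2 = 1 \cdot du^2 + 1 \cdot dv^2 + 2\cos\alpha\,du\,dv = g_{11}\,du^2 + g_{22}\,dv^2 + 2g_{12}\,du\,dv \tag{1.11} \]

therefore $g_{11} = g_{22} = 1$ (the mesh-lines are equispaced) and $g_{12} = \cos\alpha$, where $\alpha$ is the angle between the u and v axes. We see that if the mesh-lines are locally orthogonal then $g_{12} = 0$.

2. Dependence of the $g_{ij}$'s on the coordinate system and the local values of u, v.

A specific example will illustrate the main points of this topic: consider a point P described in three coordinate systems, Cartesian P[x, y], Polar P[r, $\phi$], and Gaussian P[u, v], and the square $ds^2$ of the line element in each system.

The transformation [x, y] $\to$ [r, $\phi$] is

\[ x = r\cos\phi \quad \text{and} \quad y = r\sin\phi . \tag{1.12 a,b} \]

The transformation [r, $\phi$] $\to$ [u, v] is direct, namely r = u and $\phi$ = v. Now,

\[ \frac{\partial x}{\partial r} = \cos\phi, \quad \frac{\partial y}{\partial r} = \sin\phi, \quad \frac{\partial x}{\partial \phi} = -r\sin\phi, \quad \frac{\partial y}{\partial \phi} = r\cos\phi \]

therefore

\[ \frac{\partial x}{\partial u} = \cos v, \quad \frac{\partial y}{\partial u} = \sin v, \quad \frac{\partial x}{\partial v} = -u\sin v, \quad \frac{\partial y}{\partial v} = u\cos v . \]

The coefficients are therefore

\[
\begin{aligned}
g_{11} &= \cos^2 v + \sin^2 v = 1, \\
g_{22} &= (-u\sin v)^2 + (u\cos v)^2 = u^2, \\
g_{12} &= \cos v\,(-u\sin v) + \sin v\,(u\cos v) = 0 \quad \text{(an orthogonal mesh).}
\end{aligned}
\tag{1.13 a-c}
\]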
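These three coefficients can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's method: it estimates the partial derivatives of x = u cos v and y = u sin v by central differences and assembles $g_{11}$, $g_{12}$, $g_{22}$ from the bracketed expressions in Eq. (1.9). The function names, the step size h, and the sample point are arbitrary choices made for this example.

```python
import math

def x_of(u, v):
    """x = u cos v, i.e. Eq. (1.12a) with r = u and phi = v."""
    return u * math.cos(v)

def y_of(u, v):
    """y = u sin v, i.e. Eq. (1.12b) with r = u and phi = v."""
    return u * math.sin(v)

def partial(f, u, v, wrt, h=1e-6):
    """Central-difference estimate of df/du (wrt='u') or df/dv (wrt='v')."""
    if wrt == "u":
        return (f(u + h, v) - f(u - h, v)) / (2 * h)
    return (f(u, v + h) - f(u, v - h)) / (2 * h)

def metric(u, v):
    """Return (g11, g12, g22) assembled as in Eq. (1.9)."""
    xu, xv = partial(x_of, u, v, "u"), partial(x_of, u, v, "v")
    yu, yv = partial(y_of, u, v, "u"), partial(y_of, u, v, "v")
    return (xu * xu + yu * yu, xu * xv + yu * yv, xv * xv + yv * yv)

# Sample point (an arbitrary choice): u = 2, v = 0.5 radians.
g11, g12, g22 = metric(2.0, 0.5)
print(g11, g12, g22)  # approximately 1, 0, and u**2 = 4
```

Evaluating `metric` at other points shows that $g_{22}$ tracks $u^2$ while $g_{11}$ and $g_{12}$ stay fixed, which is the local dependence on u described in point 2.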
We therefore have

\[
\begin{aligned}
ds^2 &= dx^2 + dy^2 \\
&= du^2 + u^2\,dv^2 \\
&= dr^2 + r^2\,d\phi^2 .
\end{aligned}
\tag{1.14 a-c}
\]

In this example, the coefficient $g_{22} = f(u)$.

The essential point of Gaussian coordinate systems is that the coefficients $g_{ij}$ completely characterize the surface; they are intrinsic features. We can, in principle, determine the nature of a surface by measuring the local values of the coefficients as we move over the surface. We do not need to leave a surface to study its form.

1.5 Geometry and groups

Felix Klein (1849-1925) introduced his influential Erlanger Program in 1872. In this program, Geometry is developed from the viewpoint of the invariants associated with groups of transformations. In Euclidean Geometry, the fundamental objects are taken to be rigid bodies that remain fixed in size and shape as they are moved from place to place. The notion of a rigid body is an idealization.

Klein considered transformations of the entire plane: mappings of the set of all points in the plane onto itself. The proper set of rigid motions in the plane consists of translations and rotations. A reflection is an improper rigid motion in the plane; it is a physical impossibility in the plane itself. The set of all rigid motions, both proper and improper, forms a group that has the proper rigid motions as a subgroup.

A group G is a set of distinct elements $\{g_i\}$ for which a law of composition $\circ$ is given such that the composition of any two elements of the set satisfies:

Closure: if $g_i$, $g_j$ belong to G then $g_k = g_i \circ g_j$ belongs to G for all elements $g_i$, $g_j$, and

Associativity: for all $g_i$, $g_j$, $g_k$ in G, $g_i \circ (g_j \circ g_k) = (g_i \circ g_j) \circ g_k$.

Furthermore, the set contains

A unique identity e, such that $g_i \circ e = e \circ g_i = g_i$ for all $g_i$ in G, and

A unique inverse $g_i^{-1}$ for every element $g_i$ in G, such that $g_i \circ g_i^{-1} = g_i^{-1} \circ g_i = e$.

A group that contains a finite number n of distinct elements is said to be a finite group of order n.
The set of integers Z is a subset of the reals R; both sets form infinite groups under the composition of addition. Z is a subgroup of R.

Permutations of a set X form a group $S_X$ under composition of functions; if a: X $\to$ X and b: X $\to$ X are permutations, the composite function ab: X $\to$ X given by ab(x) = a(b(x)) is a permutation. If the set X contains the first n positive numbers, the n! permutations form a group, the symmetric group $S_n$. For example, the arrangements of the three numbers 123 form the group

\[ S_3 = \{\,123,\ 312,\ 231,\ 132,\ 321,\ 213\,\}. \]

If the vertices of an equilateral triangle are labelled 123, the six possible symmetry arrangements of the triangle are obtained by three successive rotations through 120° about its center of gravity, and by the three reflections in the planes I, II, III:

[Figure: an equilateral triangle with vertices labelled 1, 2, 3 and the three reflection planes I, II, III.]

This group of isometries of the equilateral triangle (called the dihedral group, $D_3$) has the same structure as the group of permutations of three objects. The groups $S_3$ and $D_3$ are said to be isomorphic.

According to Klein, plane Euclidean Geometry is the study of those properties of plane rigid figures that are unchanged by the group of isometries. (The basic invariants are length and angle.) In his development of the subject, Klein considered Similarity Geometry, which involves isometries with a change of scale (the basic invariant is angle); Affine Geometry, in which figures can be distorted under transformations of the form

\[
\begin{aligned}
x' &= ax + by + c \\
y' &= dx + ey + f
\end{aligned}
\tag{1.15 a,b}
\]

where [x, y] are Cartesian coordinates, and a, b, c, d, e, f are real coefficients; and Projective Geometry, in which all conic sections (circles, ellipses, parabolas, and hyperbolas) can be transformed into one another by a projective transformation.
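Returning to the group $S_3$ introduced above, the group axioms can be verified by brute force. This is a minimal sketch under stated assumptions: the representation of a permutation as a tuple of 0-based images, and the helper names `compose` and `inverse`, are choices made for this illustration and do not come from the text.

```python
from itertools import permutations

# Each element is a tuple p in which p[x] is the image of x (0-based labels).
S3 = list(permutations(range(3)))

def compose(a, b):
    """Composite ab, defined by ab(x) = a(b(x))."""
    return tuple(a[b[x]] for x in range(len(b)))

def inverse(a):
    """The permutation that undoes a."""
    inv = [0] * len(a)
    for x, ax in enumerate(a):
        inv[ax] = x
    return tuple(inv)

identity = (0, 1, 2)

closed = all(compose(a, b) in S3 for a in S3 for b in S3)
associative = all(
    compose(a, compose(b, c)) == compose(compose(a, b), c)
    for a in S3 for b in S3 for c in S3
)
has_inverses = all(
    compose(a, inverse(a)) == identity == compose(inverse(a), a) for a in S3
)
commutative = all(compose(a, b) == compose(b, a) for a in S3 for b in S3)

print(len(S3), closed, associative, has_inverses, commutative)
# prints: 6 True True True False
```

The final `False` records that the order of composition matters in $S_3$, so the group is non-commutative even though it satisfies all four axioms.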
It will be shown that the Lorentz transformations (the fundamental transformations of events in space and time, as described by different inertial observers) form a group.

1.6 Vectors

The idea that a line with a definite length and a definite direction (a vector) can be used to represent a physical quantity that possesses magnitude and direction is an ancient one. The combined action of two vectors A and B is obtained by means of the parallelogram law, illustrated in the following diagram:

[Figure: the parallelogram formed by the vectors A and B, with the diagonal A + B.]

The diagonal of the parallelogram formed by A and B gives the magnitude and direction of the resultant vector C. Symbolically, we write

\[ \mathbf{C} = \mathbf{A} + \mathbf{B} \tag{1.16} \]

in which the "=" sign has a meaning that is clearly different from its meaning in ordinary arithmetic. Galileo used this empirically-based law to obtain the resultant force acting on a body. Although a geometric approach to the study of vectors has an intuitive appeal, it will often be advantageous to use the algebraic method, particularly in the study of Einstein's Special Relativity and Maxwell's Electromagnetism.

1.7 Quaternions

In the decade 1830-1840, the renowned Hamilton introduced new kinds of numbers that contain four components, and that do not obey the commutative property of multiplication. He called the new numbers quaternions. A quaternion has the form

\[ u + x\mathbf{i} + y\mathbf{j} + z\mathbf{k} \tag{1.17} \]

in which the quantities i, j, k are akin to the quantity $i = \sqrt{-1}$ in complex numbers, x + iy. The component u forms the scalar part, and the three components $x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ form the vector part of the quaternion. The coefficients x, y, z can be considered to be the Cartesian components of a point P in space. The quantities i, j, k are qualitative units that are directed along the coordinate axes. Two quaternions are equal if their scalar parts are equal, and if their coefficients x, y, z of i, j, k are respectively equal.
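The sum of two quaternions is a quaternion. In operations that involve quaternions, the usual rules of multiplication hold except in those terms in which products of i, j, k occur; in these terms, the commutative law does not hold. For example

\[ \mathbf{jk} = \mathbf{i}, \quad \mathbf{kj} = -\mathbf{i}, \quad \mathbf{ki} = \mathbf{j}, \quad \mathbf{ik} = -\mathbf{j}, \quad \mathbf{ij} = \mathbf{k}, \quad \mathbf{ji} = -\mathbf{k} \tag{1.18} \]

(these products obey a right-hand rule), and

\[ \mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1 \quad \text{(note the relation to } i^2 = -1\text{).} \tag{1.19} \]

The product of two quaternions does not commute. For example, if

\[ p = 1 + 2\mathbf{i} + 3\mathbf{j} + 4\mathbf{k} \quad \text{and} \quad q = 2 + 3\mathbf{i} + 4\mathbf{j} + 5\mathbf{k} \]

then

\[ pq = -36 + 6\mathbf{i} + 12\mathbf{j} + 12\mathbf{k} \]

whereas

\[ qp = -36 + 8\mathbf{i} + 8\mathbf{j} + 14\mathbf{k} . \]

Multiplication is associative.

Quaternions can be used as operators to rotate and scale a given vector into a new vector:

\[ (a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k})(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = (x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k}) . \]

If the law of composition is quaternionic multiplication then the set

\[ Q = \{\pm 1, \pm\mathbf{i}, \pm\mathbf{j}, \pm\mathbf{k}\} \]

is found to be a group of order 8. It is a non-commutative group.

Hamilton developed the Calculus of Quaternions. He considered, for example, the properties of the differential operator

\[ \nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z} . \tag{1.20} \]

(He called this operator "nabla".) If f(x, y, z) is a scalar point function (single-valued) then

\[ \nabla f = \mathbf{i}\,\frac{\partial f}{\partial x} + \mathbf{j}\,\frac{\partial f}{\partial y} + \mathbf{k}\,\frac{\partial f}{\partial z}, \quad \text{a vector.} \]

If $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ is a continuous vector point function, where the $v_i$'s are functions of x, y, and z, Hamilton introduced the operation

\[
\begin{aligned}
\nabla\mathbf{v} &= \left(\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right)(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) \\
&= -\left(\frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z}\right)
+ \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right)\mathbf{i}
+ \left(\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}\right)\mathbf{j}
+ \left(\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right)\mathbf{k} \\
&= \text{a quaternion.}
\end{aligned}
\tag{1.21}
\]

The scalar part is the negative of the divergence of v (a term due to Clifford), and the vector part is the curl of v (a term due to Maxwell). Maxwell used the repeated operator $\nabla^2$, which he called the Laplacian.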
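The multiplication rules of this section can be applied mechanically. The sketch below is an illustration, not part of the original text: it stores a quaternion u + xi + yj + zk as the tuple (u, x, y, z), and the function name `qmul` is an assumption of this example. It multiplies out p = 1 + 2i + 3j + 4k and q = 2 + 3i + 4j + 5k in both orders using ij = k, jk = i, ki = j and i² = j² = k² = -1.

```python
def qmul(p, q):
    """Product of two quaternions stored as (u, x, y, z) = u + x i + y j + z k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # scalar part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # coefficient of i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # coefficient of j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # coefficient of k
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

p = (1, 2, 3, 4)
q = (2, 3, 4, 5)

print(qmul(i, j), qmul(j, i))  # (0, 0, 0, 1) (0, 0, 0, -1): ij = k, ji = -k
print(qmul(p, q))              # (-36, 6, 12, 12)
print(qmul(q, p))              # (-36, 8, 8, 14)
```

The two products share the same scalar part and differ only in their vector parts, which is where the non-commuting products of i, j, k enter.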
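1.8 3-vector analysis

Gibbs, in his notes for Yale students, written in the period 1881-1884, and Heaviside, in articles published in the Electrician in the 1880s, independently developed 3-dimensional Vector Analysis as a subject in its own right, detached from quaternions.

In the Sciences, and in parts of Mathematics (most notably in Analytical and Differential Geometry), their methods are widely used. Two kinds of vector multiplication were introduced: scalar multiplication and vector multiplication. Consider two vectors $\mathbf{v}$ and $\mathbf{v}'$ where

\[ \mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3 \quad \text{and} \quad \mathbf{v}' = v_1'\mathbf{e}_1 + v_2'\mathbf{e}_2 + v_3'\mathbf{e}_3 . \]

The quantities $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are vectors of unit length pointing along mutually orthogonal axes, labelled 1, 2, and 3.

i) The scalar multiplication of $\mathbf{v}$ and $\mathbf{v}'$ is defined as

\[ \mathbf{v} \cdot \mathbf{v}' = v_1v_1' + v_2v_2' + v_3v_3' \tag{1.22} \]

where the unit vectors have the properties

\[ \mathbf{e}_1 \cdot \mathbf{e}_1 = \mathbf{e}_2 \cdot \mathbf{e}_2 = \mathbf{e}_3 \cdot \mathbf{e}_3 = 1, \tag{1.23} \]

and

\[ \mathbf{e}_1 \cdot \mathbf{e}_2 = \mathbf{e}_2 \cdot \mathbf{e}_1 = \mathbf{e}_1 \cdot \mathbf{e}_3 = \mathbf{e}_3 \cdot \mathbf{e}_1 = \mathbf{e}_2 \cdot \mathbf{e}_3 = \mathbf{e}_3 \cdot \mathbf{e}_2 = 0. \tag{1.24} \]

The most important property of the scalar product of two vectors is its invariance under rotations and translations of the coordinates (see section 1.2).

ii) The vector product of two vectors $\mathbf{v}$ and $\mathbf{v}'$ is defined as

\[
\mathbf{v} \times \mathbf{v}' =
\begin{vmatrix}
\mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\
v_1 & v_2 & v_3 \\
v_1' & v_2' & v_3'
\end{vmatrix}
\quad (\text{where } |\ldots| \text{ is the determinant})
\tag{1.25}
\]

\[ = (v_2v_3' - v_3v_2')\,\mathbf{e}_1 + (v_3v_1' - v_1v_3')\,\mathbf{e}_2 + (v_1v_2' - v_2v_1')\,\mathbf{e}_3 . \]

The unit vectors have the properties

\[ \mathbf{e}_1 \times \mathbf{e}_1 = \mathbf{e}_2 \times \mathbf{e}_2 = \mathbf{e}_3 \times \mathbf{e}_3 = 0 \tag{1.26 a,b} \]

(note that these properties differ from the quaternionic products of the i, j, k's), and

\[ \mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3, \quad \mathbf{e}_2 \times \mathbf{e}_1 = -\mathbf{e}_3, \quad \mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1, \quad \mathbf{e}_3 \times \mathbf{e}_2 = -\mathbf{e}_1, \quad \mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2, \quad \mathbf{e}_1 \times \mathbf{e}_3 = -\mathbf{e}_2 . \]

These non-commuting vector products, or cross products, obey the standard right-hand rule.

The vector product of two parallel vectors is zero even when neither vector is zero.

The non-associative property of a vector product is illustrated in the following example:

\[ (\mathbf{e}_1 \times \mathbf{e}_2) \times \mathbf{e}_2 = \mathbf{e}_3 \times \mathbf{e}_2 = -\mathbf{e}_1 \;\neq\; \mathbf{e}_1 \times (\mathbf{e}_2 \times \mathbf{e}_2) = 0. \]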
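The non-associative example can be reproduced directly from the component formula for the vector product. This is a minimal sketch: the function name `cross` and the use of plain tuples for the unit vectors are assumptions made for illustration.

```python
def cross(a, b):
    """Vector product of two 3-vectors, from the component form of the determinant."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

left = cross(cross(e1, e2), e2)   # (e1 x e2) x e2 = e3 x e2
right = cross(e1, cross(e2, e2))  # e1 x (e2 x e2) = e1 x 0

print(left)   # (-1, 0, 0), i.e. -e1
print(right)  # (0, 0, 0)
```

Because the two bracketings give different results, an expression such as e1 x e2 x e2 is ambiguous unless the brackets are written out.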
Important operations in Vector Analysis that follow directly from those introduced in the theory of quaternions are:

1) the gradient of a scalar function $f(x_1, x_2, x_3)$

\[ \nabla f = \frac{\partial f}{\partial x_1}\,\mathbf{e}_1 + \frac{\partial f}{\partial x_2}\,\mathbf{e}_2 + \frac{\partial f}{\partial x_3}\,\mathbf{e}_3 , \tag{1.27} \]

2) the divergence of a vector function $\mathbf{v}$

\[ \nabla \cdot \mathbf{v} = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \frac{\partial v_3}{\partial x_3} \tag{1.28} \]

where $\mathbf{v}$ has components $v_1, v_2, v_3$ that are functions of $x_1, x_2, x_3$, and

3) the curl of a vector function $\mathbf{v}$

\[
\nabla \times \mathbf{v} =
\begin{vmatrix}
\mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\
\partial/\partial x_1 & \partial/\partial x_2 & \partial/\partial x_3 \\
v_1 & v_2 & v_3
\end{vmatrix} .
\tag{1.29}
\]

The physical significance of these operations is discussed later.

1.9 Linear algebra and n-vectors

A major part of Linear Algebra is concerned with the extension of the algebraic properties of vectors in the plane (2-vectors), and in space (3-vectors), to vectors in higher dimensions (n-vectors). This area of study has its origin in the work of Grassmann (1809-77), who generalized the quaternions (4-component hyper-complex numbers) introduced by Hamilton.

An n-dimensional vector is defined as an ordered column of numbers

\[
\mathbf{x}_n =
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{pmatrix} .
\tag{1.30}
\]

It will be convenient to write this as an ordered row in square brackets

\[ \mathbf{x}_n = [x_1, x_2, \ldots, x_n] . \tag{1.31} \]

The transpose of the column vector is the row vector

\[ \mathbf{x}_n^T = (x_1\ x_2\ \ldots\ x_n). \tag{1.32} \]

The numbers $x_1, x_2, \ldots, x_n$ are called the components of $\mathbf{x}$, and the integer n is the dimension of $\mathbf{x}$. The order of the components is important; for example

\[ [1, 2, 3] \neq [2, 3, 1]. \]

The two vectors $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ and $\mathbf{y} = [y_1, y_2, \ldots, y_n]$ are equal if $x_i = y_i$ (i = 1 to n).

The laws of Vector Algebra are (1.33 a-e):

1. $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$
2. $[\mathbf{x} + \mathbf{y}] + \mathbf{z} = \mathbf{x} + [\mathbf{y} + \mathbf{z}]$
3. $a[\mathbf{x} + \mathbf{y}] = a\mathbf{x} + a\mathbf{y}$, where a is a scalar
4. $(a + b)\mathbf{x} = a\mathbf{x} + b\mathbf{x}$, where a, b are scalars
5. $(ab)\mathbf{x} = a(b\mathbf{x})$, where a, b are scalars

If a = 1 and b = -1 then $\mathbf{x} + [-\mathbf{x}] = \mathbf{0}$, where $\mathbf{0} = [0, 0, \ldots, 0]$ is the zero vector.
The vectors $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ and $\mathbf{y} = [y_1, y_2, \ldots, y_n]$ can be added to give their sum or resultant:

\[ \mathbf{x} + \mathbf{y} = [x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n]. \tag{1.34} \]

The set of vectors that obeys the above rules is called the space of all n-vectors, or the vector space of dimension n.

In general, a vector $\mathbf{v} = a\mathbf{x} + b\mathbf{y}$ lies in the plane of $\mathbf{x}$ and $\mathbf{y}$. The vector $\mathbf{v}$ is said to depend linearly on $\mathbf{x}$ and $\mathbf{y}$; it is a linear combination of $\mathbf{x}$ and $\mathbf{y}$.

A vector $\mathbf{v}$ is said to depend linearly on the k vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$ if there are scalars $a_i$ such that

\[ \mathbf{v} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + \ldots + a_k\mathbf{u}_k . \tag{1.35} \]

For example

\[ [3, 5, 7] = [3, 6, 6] + [0, -1, 1] = 3[1, 2, 2] + 1[0, -1, 1], \]

a linear combination of the vectors [1, 2, 2] and [0, -1, 1].

A set of vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$ is called linearly dependent if one of these vectors depends linearly on the rest. For example, if

\[ \mathbf{u}_1 = a_2\mathbf{u}_2 + a_3\mathbf{u}_3 + \ldots + a_k\mathbf{u}_k , \tag{1.36} \]

the set $\mathbf{u}_1, \ldots, \mathbf{u}_k$ is linearly dependent.

If none of the vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$ can be written linearly in terms of the remaining ones, we say that the vectors are linearly independent.

Alternatively, the vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$ are linearly dependent if and only if there is an equation of the form

\[ c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \ldots + c_k\mathbf{u}_k = \mathbf{0} \tag{1.37} \]

in which the scalars $c_i$ are not all zero.

Consider the vectors $\mathbf{e}_i$ obtained by putting the i-th component equal to 1, and all the other components equal to zero:

\[ \mathbf{e}_1 = [1, 0, 0, \ldots, 0], \quad \mathbf{e}_2 = [0, 1, 0, \ldots, 0], \quad \ldots \]

then every vector of dimension n depends linearly on $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, thus

\[ \mathbf{x} = [x_1, x_2, \ldots, x_n] = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \ldots + x_n\mathbf{e}_n . \tag{1.38} \]

The $\mathbf{e}_i$'s are said to span the space of all n-vectors; they form a basis. Every basis of an n-space has exactly n elements.
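Linear dependence can be tested mechanically. The sketch below uses Gaussian elimination over exact fractions to count the independent vectors in a set; this is a technique chosen for the illustration and is not a method described in the text, and the function names `rank` and `linearly_dependent` are assumptions. It is applied to the set {[3, 5, 7], [1, 2, 2], [0, -1, 1]}, which is dependent because [3, 5, 7] = 3[1, 2, 2] + 1[0, -1, 1], and to the basis vectors of 3-space.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, by Gaussian elimination on rows."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # index of the next pivot row
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_dependent(vectors):
    """True if some vector in the set depends linearly on the rest."""
    return rank(vectors) < len(vectors)

print(linearly_dependent([[3, 5, 7], [1, 2, 2], [0, -1, 1]]))  # True
print(linearly_dependent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))   # False
```

Exact fractions are used instead of floating-point numbers so that a row that should reduce to zero does so exactly.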
The connection between a vector $\mathbf{x}$ and a definite coordinate system is made by choosing a set of basis vectors $\mathbf{e}_i$.

1.10 The geometry of vectors

The laws of vector algebra can be interpreted geometrically for vectors of dimension 2 and 3. Let the zero vector represent the origin of a coordinate system, and let the 2-vectors $\mathbf{x}$ and $\mathbf{y}$ correspond to points in the plane: $P[x_1, x_2]$ and $Q[y_1, y_2]$. The vector sum $\mathbf{x} + \mathbf{y}$ is represented by the point R, as shown.

[Figure: the points O[0, 0], $P[x_1, x_2]$, $Q[y_1, y_2]$ and $R[x_1 + y_1, x_2 + y_2]$ plotted against the 1st component (horizontal) and the 2nd component (vertical).]

R is in the plane OPQ, even if $\mathbf{x}$ and $\mathbf{y}$ are 3-vectors.

Every vector point on the line OR represents the sum of the two corresponding vector points on the lines OP and OQ. We therefore introduce the concept of the directed vector lines OP, OQ, and OR, related by the vector equation

$$\mathbf{OP} + \mathbf{OQ} = \mathbf{OR} \tag{1.39}$$

A vector $\mathbf{V}$ can be represented as a line of length OP pointing in the direction of the unit vector $\hat{\mathbf{v}}$, thus $\mathbf{V} = \hat{\mathbf{v}}\,\mathrm{OP}$.

[Figure: the vector $\mathbf{V}$ drawn from O to P, with the unit vector $\hat{\mathbf{v}}$ along OP.]

A vector $\mathbf{V}$ is unchanged by a pure displacement: $\mathbf{V}_1 = \mathbf{V}_2$,

[Figure: two parallel vectors $\mathbf{V}_1$ and $\mathbf{V}_2$ of equal length.]

where the "=" sign means equality in magnitude and direction.

Two classes of vectors will be met in future discussions; they are

1. Polar vectors: the vector is drawn in the direction of the physical quantity being represented, for example a velocity, and
2. Axial vectors: the vector is drawn parallel to the axis about which the physical quantity acts, for example an angular velocity.

The associative property of the sum of vectors can be readily demonstrated, geometrically.

[Figure: the vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ placed head to tail, with resultant $\mathbf{V}$.]

We see that

$$\mathbf{V} = \mathbf{A} + \mathbf{B} + \mathbf{C} = (\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C}) = (\mathbf{A} + \mathbf{C}) + \mathbf{B} \tag{1.40}$$

The process of vector addition can be reversed; a vector $\mathbf{V}$ can be decomposed into the sum of n vectors of which (n − 1) are arbitrary, and the n-th vector closes the polygon. The vectors need not be in the same plane.
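A special case of this process is the decomposition of a 3-vector into its Cartesian components.

[Figure: "A general case": $\mathbf{V}$ decomposed into $\mathbf{V}_1, \mathbf{V}_2, \mathbf{V}_3, \mathbf{V}_4$ (arbitrary) and $\mathbf{V}_5$, which closes the polygon. "A special case": $\mathbf{V}$ decomposed into $\mathbf{V}_x$, $\mathbf{V}_y$ and $\mathbf{V}_z$, where $\mathbf{V}_z$ closes the polygon.]

The vector product of $\mathbf{A}$ and $\mathbf{B}$ is an axial vector, perpendicular to the plane containing $\mathbf{A}$ and $\mathbf{B}$.

[Figure: axes x, y, z; the vectors $\mathbf{A}$ and $\mathbf{B}$ separated by the angle $\alpha$; $\hat{\mathbf{n}}$ is a unit vector perpendicular to the $\mathbf{A}$, $\mathbf{B}$ plane.]

$$\mathbf{A} \times \mathbf{B} = AB \sin\alpha \, \hat{\mathbf{n}} = -\mathbf{B} \times \mathbf{A} \tag{1.41}$$

1.11 Linear Operators and Matrices

Transformations from a coordinate system [x, y] to another system [x′, y′], without shift of the origin, or from a point P[x, y] to another point P′[x′, y′], in the same system, that have the form

x′ = ax + by
y′ = cx + dy

where a, b, c, d are real coefficients, can be written in matrix notation, as follows

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{1.41}$$

Symbolically,

$$\mathbf{x}' = \mathbf{M}\mathbf{x} \tag{1.42}$$

where $\mathbf{x} = [x, y]$ and $\mathbf{x}' = [x', y']$, both column 2-vectors, and

$$\mathbf{M} = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

a 2 × 2 matrix operator that changes [x, y] into [x′, y′].

In general, $\mathbf{M}$ transforms a unit square into a parallelogram:

[Figure: the unit square with vertices [0, 0], [1, 0], [1, 1], [0, 1] in the x, y plane is mapped to the parallelogram with vertices [0, 0], [a, c], [a + b, c + d], [b, d] in the x′, y′ plane.]

This transformation plays a key rôle in Einstein's Special Theory of Relativity (see later discussion).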
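As a small illustration of the action of a 2 × 2 matrix operator on the unit square, the sketch below applies x′ = ax + by, y′ = cx + dy to the four corners. The numerical entries a = 2, b = 1, c = 1, d = 3 and the helper name `apply_matrix` are assumptions chosen for the example; they do not come from the text.

```python
def apply_matrix(M, point):
    """Return [x', y'] = M [x, y] for a 2 x 2 matrix M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    x, y = point
    return (a * x + b * y, c * x + d * y)


# illustrative coefficients (assumed values)
M = ((2, 1), (1, 3))

# corners of the unit square, taken anticlockwise from the origin
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# images of the corners: [0, 0], [a, c], [a + b, c + d], [b, d]
parallelogram = [apply_matrix(M, p) for p in unit_square]
print(parallelogram)

# the area of the image parallelogram is |ad - bc|
(a, b), (c, d) = M
area = abs(a * d - b * c)
print(area)
```

Opposite sides of the image remain parallel because the transformation is linear; the origin is not shifted.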
1.12 Rotation operators

Consider the rotation of an x, y coordinate system about the origin through an angle $\phi$:

[Figure: axes x, y and rotated axes x′, y′ with common origin O, O′; the point P[x, y] or P′[x′, y′]; the rotation angle is $+\phi$.]

From the diagram, we see that

$x' = x\cos\phi + y\sin\phi$
and
$y' = -x\sin\phi + y\cos\phi$

or

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

Symbolically,

$$\mathbf{P}' = \mathcal{R}_c(\phi)\,\mathbf{P} \tag{1.43}$$

where

$$\mathcal{R}_c(\phi) = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}$$

is the rotation operator. The subscript c denotes a rotation of the coordinates through an angle $+\phi$.

The inverse operator, $\mathcal{R}_c^{-1}(\phi)$, is obtained by reversing the angle of rotation: $+\phi \to -\phi$. We see that the matrix product

$$\mathcal{R}_c^{-1}(\phi)\,\mathcal{R}_c(\phi) = \mathcal{R}_c^{T}(\phi)\,\mathcal{R}_c(\phi) = \mathbf{I} \tag{1.44}$$

where the superscript T indicates the transpose (rows ⇔ columns), and

$$\mathbf{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \text{ is the identity operator.} \tag{1.45}$$

Eq. (1.44) is the defining property of an orthogonal matrix.

If we leave the axes fixed and rotate the point P[x, y] to P′[x′, y′], then we have

[Figure: fixed axes x, y with origin O; the point P[x, y] rotated through $\phi$ to P′[x′, y′].]

From the diagram, we see that

$x' = x\cos\phi - y\sin\phi$ and $y' = x\sin\phi + y\cos\phi$

or

$$\mathbf{P}' = \mathcal{R}_v(\phi)\,\mathbf{P} \tag{1.46}$$

where

$$\mathcal{R}_v(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix},$$

the operator that rotates a vector through $+\phi$.

1.13 Components of a vector under coordinate rotations

Consider a vector $\mathbf{V}[v_x, v_y]$, and the same vector $\mathbf{V}'$ with components $[v_{x'}, v_{y'}]$, in a coordinate system (primed), rotated through an angle $+\phi$.

[Figure: the vector $\mathbf{V} = \mathbf{V}'$ with components $v_x, v_y$ on the x, y axes and $v_{x'}, v_{y'}$ on the x′, y′ axes; common origin O, O′; rotation angle $\phi$.]

We have met the transformation [x, y] → [x′, y′] under the operation $\mathcal{R}_c(\phi)$; here, we have the same transformation but now it operates on the components of the vector, $v_x$ and $v_y$:

$$[v_{x'}, v_{y'}] = \mathcal{R}_c(\phi)\,[v_x, v_y]. \tag{1.47}$$

PROBLEMS

1-1 i) If $u = 3^{x/y}$ show that $\partial u/\partial x = (3^{x/y} \ln 3)/y$ and $\partial u/\partial y = (-3^{x/y}\, x \ln 3)/y^2$.

ii) If $u = \ln\{(x^3 + y)/x^2\}$ show that $\partial u/\partial x = (x^3 - 2y)/(x(x^3 + y))$ and $\partial u/\partial y = 1/(x^3 + y)$.

1-2 Calculate the second partial derivatives of $f(x, y) = (1/\sqrt{y})\exp\{-(x - a)^2/4y\}$, a = constant.
1-3 Check the answers obtained in problem 1-2 by showing that the function f(x, y) in 1-2 is a solution of the partial differential equation $\partial^2 f/\partial x^2 - \partial f/\partial y = 0$.

1-4 If $f(x, y, z) = 1/(x^2 + y^2 + z^2)^{1/2} = 1/r$, show that f(x, y, z) = 1/r is a solution of Laplace's equation $\partial^2 f/\partial x^2 + \partial^2 f/\partial y^2 + \partial^2 f/\partial z^2 = 0$. This important equation occurs in many branches of Physics.

1-5 At a given instant, the radius of a cylinder is r(t) = 4 cm and its height is h(t) = 10 cm. If r(t) and h(t) are both changing at a rate of 2 cm.s$^{-1}$, show that the instantaneous increase in the volume of the cylinder is $192\pi$ cm$^3$.s$^{-1}$.

1-6 The transformation between Cartesian coordinates [x, y, z] and spherical polar coordinates $[r, \theta, \phi]$ is

$x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$.

Show, by calculating all necessary partial derivatives, that the square of the line element is

$$ds^2 = dr^2 + r^2\sin^2\theta\, d\phi^2 + r^2 d\theta^2.$$

Obtain this result using geometrical arguments. This form of the square of the line element will be used on several occasions in the future.

1-7 Prove that the inverse of each element of a group is unique.

1-8 Prove that the set of positive rational numbers does not form a group under division.

1-9 A finite group of order n has $n^2$ products that may be written in an n × n array, called the group multiplication table. For example, the 4th-roots of unity {e, a, b, c} = {±1, ±i}, where $i = \sqrt{-1}$, forms a group under multiplication ($1 \cdot i = i$, $i(-i) = 1$, $i^2 = (-i)^2 = -1$, etc.) with a multiplication table

$$\begin{array}{c|cccc} & e = 1 & a = i & b = -1 & c = -i \\ \hline e & 1 & i & -1 & -i \\ a & i & -1 & -i & 1 \\ b & -1 & -i & 1 & i \\ c & -i & 1 & i & -1 \end{array}$$

In this case, the table is symmetric about the main diagonal; this is a characteristic feature of a group in which all products commute (ab = ba); it is an Abelian group.

If G is the dihedral group $D_3$, discussed in the text, where $G = \{e, a, a^2, b, c, d\}$, and e is the identity, obtain the group multiplication table. Is it an Abelian group?
Notice that the three elements $\{e, a, a^2\}$ form a subgroup of G, whereas the three elements {b, c, d} do not; there is no identity in this subset.

The group $D_3$ has the same multiplication table as the group of permutations of three objects. This is the condition that signifies group isomorphism.

1-10 Are the sets

i) {[0, 1, 1], [1, 0, 1], [1, 1, 0]}
and
ii) {[1, 3, 5, 7], [4, 2, 1], [2, 1, 4, 5]}

linearly dependent? Explain.

1-11 i) Prove that the vectors [0, 1, 1], [1, 0, 1], [1, 1, 0] form a basis for Euclidean space $R^3$.

ii) Do the vectors [1, i] and [i, −1], ($i = \sqrt{-1}$), form a basis for the complex space $C^2$?

1-12 Interpret the linear independence of two 3-vectors geometrically.

1-13 i) If $\mathbf{X} = [1, 2, 3]$ and $\mathbf{Y} = [3, 2, 1]$, prove that their cross product is orthogonal to the $\mathbf{X}$-$\mathbf{Y}$ plane.

ii) If $\mathbf{X}$ and $\mathbf{Y}$ are 3-vectors, prove that $\mathbf{X} \times \mathbf{Y} = \mathbf{0}$ iff $\mathbf{X}$ and $\mathbf{Y}$ are linearly dependent.

1-14 If

$$\mathbf{T} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{pmatrix}$$

represents a linear transformation of the plane under which distance is an invariant, show that the following relations must hold:

$a_{11}^2 + a_{21}^2 = a_{12}^2 + a_{22}^2 = 1$, and $a_{11}a_{12} + a_{21}a_{22} = 0$.

2 KINEMATICS: THE GEOMETRY OF MOTION

2.1 Velocity and acceleration

The most important concepts in Kinematics (a subject in which the properties of the forces responsible for the motion are ignored) can be introduced by studying the simplest of all motions, namely that of a point P moving in a straight line.

Let a point P[t, x] be at a distance x from a fixed point O at a time t, and let it be at a point P′[t′, x′] = P′[t + Δt, x + Δx] at a time Δt later. The average speed of P in the interval Δt is Δx/Δt; in the limit Δt → 0 this becomes the instantaneous speed v = dx/dt.

[Figure: the curve of distance x against time t.]

The tangent of the angle made by the tangent to the curve at any point gives the value of the instantaneous speed at the point.
The instantaneous acceleration a of the point P is given by the time rate-of-change of the velocity

$$a = dv/dt = d(dx/dt)/dt = d^2x/dt^2 = \ddot{x} \tag{2.3}$$

A change of variable from t to x gives

$$a = dv/dt = (dv/dx)(dx/dt) = v\,(dv/dx). \tag{2.4}$$

This is a useful relation when dealing with problems in which the velocity is given as a function of the position. For example

[Figure: the curve of v against x; P is a point on the curve at height v, $\alpha$ is the angle made by the tangent at P, N is the foot of the ordinate from P, and Q is the point where the normal at P meets the x-axis.]

The gradient is dv/dx and $\tan\alpha = dv/dx$, therefore

$$NQ, \text{ the subnormal} = v\,(dv/dx) = a_P, \text{ the acceleration of P.} \tag{2.5}$$

The area under a curve of the speed as a function of time between the times $t_1$ and $t_2$ is

$$[A]_{[t_1, t_2]} = \int_{t_1}^{t_2} v(t)\,dt = \int_{t_1}^{t_2} (dx/dt)\,dt = \int_{x_1}^{x_2} dx = (x_2 - x_1) = \text{distance travelled in the time } t_2 - t_1 \tag{2.6}$$

The solution of a kinematical problem is sometimes simplified by using a graphical method, for example:

A point A moves along an x-axis with a constant speed $v_A$. Let it be at the origin O (x = 0) at time t = 0. It continues for a distance $x_A$, at which point it decelerates at a constant rate, finally stopping at a distance X from O at time T.

A second point B moves away from O in the +x-direction with constant acceleration. Let it begin its motion at t = 0. It continues to accelerate until it reaches a maximum speed $v_B^{max}$ at a time $t_B^{max}$ when at $x_B^{max}$ from O. At $x_B^{max}$ it begins to decelerate at a constant rate, finally stopping at X at time T. To prove that the maximum speed of B during its motion is

$$v_B^{max} = v_A\{1 - (x_A/2X)\}^{-1},$$

a value that is independent of the time at which the maximum speed is reached.

The velocity-time curves of the points are

[Figure: v against t. Curve A is constant at $v_A$ from t = 0 to $t_A$ (x = 0 to $x_A$), then falls linearly to zero at t = T (x = X). Curve B rises linearly from zero to $v_B^{max}$ at $t_B^{max}$ ($x_B^{max}$), then falls linearly to zero at T; a second possible path for B is also shown.]

The areas under the curves give $X = v_A t_A + v_A(T - t_A)/2 = v_B^{max}T/2$, so that $v_B^{max} = v_A(1 + (t_A/T))$, but $v_A T = 2X - x_A$, therefore

$$v_B^{max} = v_A\{1 - (x_A/2X)\}^{-1} \neq f(t_B^{max}).$$
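The graphical argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the numerical values $v_A$ = 10, $x_A$ = 60 and X = 100 (in arbitrary consistent units) and the function names are assumptions. It computes the total time T from the motion of A, obtains the maximum speed of B from the area of its triangular velocity-time curve, and compares the result with $v_A\{1 - (x_A/2X)\}^{-1}$.

```python
def v_b_max_from_areas(v_a, x_a, X):
    """Maximum speed of B from the areas under the velocity-time curves."""
    t_a = x_a / v_a                    # time A spends at constant speed v_a
    # A's deceleration phase is a triangle of area X - x_a = v_a (T - t_a) / 2
    T = t_a + 2.0 * (X - x_a) / v_a
    # B's curve is a triangle of area X = v_b_max * T / 2, wherever its peak lies
    return 2.0 * X / T


def v_b_max_formula(v_a, x_a, X):
    """Closed form v_a / (1 - x_a / 2X)."""
    return v_a / (1.0 - x_a / (2.0 * X))


# assumed sample values
v_a, x_a, X = 10.0, 60.0, 100.0

from_areas = v_b_max_from_areas(v_a, x_a, X)
from_formula = v_b_max_formula(v_a, x_a, X)
print(from_areas, from_formula)
```

The time $t_B^{max}$ never enters the calculation, which is the content of the statement that the maximum speed is independent of when it is reached.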
2.2 Differential equations of kinematics

If the acceleration is a known function of time then the differential equation

$$a(t) = dv/dt \tag{2.7}$$

can be solved by performing the integrations (either analytically or numerically)

$$\int a(t)\,dt = \int dv \tag{2.8}$$

If a(t) is constant then the result is simply

at + C = v, where C is a constant that is given by the initial conditions.

Let v = u when t = 0, then C = u and we have

$$at + u = v. \tag{2.9}$$

This is the standard result for motion under constant acceleration.

We can continue this approach by writing:

v = dx/dt = u + at.

Separating the variables,

dx = u dt + at dt.

Integrating gives

$x = ut + (1/2)at^2 + C'$ (for constant a).

If x = 0 when t = 0 then C′ = 0, and

$$x(t) = ut + (1/2)at^2 \tag{2.10}$$

Multiplying this equation throughout by 2a gives

$2ax = 2aut + (at)^2 = 2aut + (v - u)^2$

and therefore, rearranging, we obtain

$$v^2 = 2ax - 2aut + 2vu - u^2 = 2ax + 2u(v - at) - u^2 = 2ax + u^2 \tag{2.11}$$

In general, the acceleration is a given function of time or distance or velocity:

1) If a = f(t) then

$$a = dv/dt = f(t), \tag{2.12}$$

dv = f(t)dt, therefore

$v = \int f(t)\,dt + C$ (a constant).

This equation can be written

v = dx/dt = F(t) + C, therefore dx = F(t)dt + C dt.

Integrating gives

$$x(t) = \int F(t)\,dt + Ct + C'. \tag{2.13}$$

The constants of integration can be determined if the velocity and the position are known at a given time.

2) If a = g(x) = v(dv/dx) then (2.14)

v dv = g(x)dx.

Integrating gives

$v^2 = 2\int g(x)\,dx + D$, therefore $v^2 = G(x) + D$

so that

$$v = (dx/dt) = \pm\sqrt{G(x) + D}. \tag{2.15}$$

Integrating this equation leads to

$$\int dx/\{\pm\sqrt{G(x) + D}\} = t + D'. \tag{2.16}$$

Alternatively, if

$a = d^2x/dt^2 = g(x)$

then, multiplying throughout by 2(dx/dt) gives

$2(dx/dt)(d^2x/dt^2) = 2(dx/dt)\,g(x).$

Integrating then gives

$(dx/dt)^2 = 2\int g(x)\,dx + D$ etc.
As an example of this method, consider the equation of simple harmonic motion (see later discussion)

$$d^2x/dt^2 = -\omega^2 x. \tag{2.17}$$

Multiply throughout by 2(dx/dt), then

$2(dx/dt)\,d^2x/dt^2 = -2\omega^2 x\,(dx/dt).$

This can be integrated to give

$(dx/dt)^2 = -\omega^2 x^2 + D.$

If dx/dt = 0 when x = A then $D = \omega^2 A^2$, therefore

$(dx/dt)^2 = \omega^2(A^2 - x^2) = v^2$

so that

$dx/dt = \pm\omega\sqrt{A^2 - x^2}.$

Separating the variables, we obtain

$-dx/\{\sqrt{A^2 - x^2}\} = \omega\,dt.$ (The minus sign is chosen because dx and dt have opposite signs).

Integrating, gives

$\cos^{-1}(x/A) = \omega t + D'.$

But x = A when t = 0, therefore D′ = 0, so that

$$x(t) = A\cos(\omega t), \text{ where A is the amplitude.} \tag{2.18}$$

3) If a = h(v), then (2.19)

dv/dt = h(v)

therefore

dv/h(v) = dt,

and

$$\int dv/h(v) = t + B. \tag{2.20}$$

Some of the techniques used to solve ordinary differential equations are discussed in Appendix A.

2.3 Velocity in Cartesian and polar coordinates

The transformation from Cartesian to Polar Coordinates is represented by the equations

$$x = r\cos\phi \text{ and } y = r\sin\phi \tag{2.21 a,b}$$

or

$x = f(r, \phi)$ and $y = g(r, \phi)$.

The differentials are

$dx = (\partial f/\partial r)\,dr + (\partial f/\partial\phi)\,d\phi$ and $dy = (\partial g/\partial r)\,dr + (\partial g/\partial\phi)\,d\phi.$

We are interested in the transformation of the components of the velocity vector under $[x, y] \to [r, \phi]$.
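The velocity components involve the rates of change of x and y with respect to time:

$dx/dt = (\partial f/\partial r)\,dr/dt + (\partial f/\partial\phi)\,d\phi/dt$ and $dy/dt = (\partial g/\partial r)\,dr/dt + (\partial g/\partial\phi)\,d\phi/dt$

or

$$\dot{x} = (\partial f/\partial r)\,\dot{r} + (\partial f/\partial\phi)\,\dot{\phi} \text{ and } \dot{y} = (\partial g/\partial r)\,\dot{r} + (\partial g/\partial\phi)\,\dot{\phi} \tag{2.22}$$

But,

$\partial f/\partial r = \cos\phi$, $\partial f/\partial\phi = -r\sin\phi$, $\partial g/\partial r = \sin\phi$, and $\partial g/\partial\phi = r\cos\phi$,

therefore, the velocity transformations are

$$\dot{x} = \cos\phi\,\dot{r} - \sin\phi\,(r\dot{\phi}) = v_x \tag{2.23}$$

and

$$\dot{y} = \sin\phi\,\dot{r} + \cos\phi\,(r\dot{\phi}) = v_y \tag{2.24}$$

These equations can be written

$$\begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} dr/dt \\ r\,d\phi/dt \end{pmatrix}$$

Changing $\phi \to -\phi$ gives the inverse equations

$$\begin{pmatrix} dr/dt \\ r\,d\phi/dt \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} v_x \\ v_y \end{pmatrix}$$

or

$$\begin{pmatrix} v_r \\ v_\phi \end{pmatrix} = \mathcal{R}_c(\phi) \begin{pmatrix} v_x \\ v_y \end{pmatrix} \tag{2.25}$$

The velocity components in $[r, \phi]$ coordinates are therefore

$|v_\phi| = r\dot{\phi} = r\,d\phi/dt$ and $|v_r| = \dot{r} = dr/dt.$

[Figure: the point $P[r, \phi]$ at distance r from O, with the velocity $\mathbf{V}$ resolved into $v_r$ along r and $v_\phi$ perpendicular to r; $+\phi$ is measured anticlockwise from the x-axis.]

The quantity $d\phi/dt$ is called the angular velocity of P about the origin O.

2.4 Acceleration in Cartesian and polar coordinates

We have found that the velocity components transform from [x, y] to $[r, \phi]$ coordinates as follows

$v_x = \cos\phi\,\dot{r} - \sin\phi\,(r\dot{\phi}) = \dot{x}$

and

$v_y = \sin\phi\,\dot{r} + \cos\phi\,(r\dot{\phi}) = \dot{y}.$

The acceleration components are given by

$a_x = dv_x/dt$ and $a_y = dv_y/dt.$

We therefore have

$$a_x = (d/dt)\{\cos\phi\,\dot{r} - \sin\phi\,(r\dot{\phi})\} = \cos\phi\,(\ddot{r} - r\dot{\phi}^2) - \sin\phi\,(2\dot{r}\dot{\phi} + r\ddot{\phi}) \tag{2.26}$$

and

$$a_y = (d/dt)\{\sin\phi\,\dot{r} + \cos\phi\,(r\dot{\phi})\} = \cos\phi\,(2\dot{r}\dot{\phi} + r\ddot{\phi}) + \sin\phi\,(\ddot{r} - r\dot{\phi}^2). \tag{2.27}$$

These equations can be written

$$\begin{pmatrix} a_r \\ a_\phi \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} a_x \\ a_y \end{pmatrix} \tag{2.28}$$

The acceleration components in $[r, \phi]$ coordinates are therefore

$|a_\phi| = 2\dot{r}\dot{\phi} + r\ddot{\phi}$ and $|a_r| = \ddot{r} - r\dot{\phi}^2.$

[Figure: the point $P[r, \phi]$ with the acceleration $\mathbf{A}$ resolved into $a_r$ along r and $a_\phi$ perpendicular to r.]

These expressions for the components of acceleration will be of key importance in discussions of Newton's Theory of Gravitation.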
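The polar components of the acceleration can be checked numerically. The following is a minimal sketch under stated assumptions: the trajectory r(t) = 1 + 0.5t, φ(t) = t² is an arbitrary choice made for the example, the Cartesian acceleration is estimated by a central second difference with step h, and the function names are illustrative. The Cartesian components are rotated with the matrix of Eq. (2.28) and compared with $\ddot{r} - r\dot{\phi}^2$ and $2\dot{r}\dot{\phi} + r\ddot{\phi}$.

```python
import math


def r(t):
    return 1.0 + 0.5 * t       # assumed radial motion


def phi(t):
    return t * t               # assumed angular motion


def xy(t):
    """Cartesian coordinates x = r cos(phi), y = r sin(phi)."""
    return r(t) * math.cos(phi(t)), r(t) * math.sin(phi(t))


def cartesian_acceleration(t, h=1e-4):
    """Central second-difference estimate of (a_x, a_y)."""
    x0, y0 = xy(t - h)
    x1, y1 = xy(t)
    x2, y2 = xy(t + h)
    return (x0 - 2 * x1 + x2) / h**2, (y0 - 2 * y1 + y2) / h**2


def polar_acceleration(t, h=1e-4):
    """Rotate (a_x, a_y) into (a_r, a_phi) as in Eq. (2.28)."""
    ax, ay = cartesian_acceleration(t, h)
    c, s = math.cos(phi(t)), math.sin(phi(t))
    return c * ax + s * ay, -s * ax + c * ay


t = 1.0
a_r, a_phi = polar_acceleration(t)

# analytic derivatives of the assumed trajectory
r_dot, r_ddot = 0.5, 0.0
phi_dot, phi_ddot = 2.0 * t, 2.0
expected_a_r = r_ddot - r(t) * phi_dot**2
expected_a_phi = 2.0 * r_dot * phi_dot + r(t) * phi_ddot

print(a_r, expected_a_r)
print(a_phi, expected_a_phi)
```

The two pairs of numbers agree only to the accuracy of the finite-difference estimate, so a tolerance is needed in any comparison.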
We note that, if r is constant, and the angular velocity $\omega$ is constant then

$$a_\phi = r\ddot{\phi} = r\dot{\omega} = 0, \tag{2.29}$$

$$a_r = -r\dot{\phi}^2 = -r\omega^2 = -r(v_\phi/r)^2 = -v_\phi^2/r, \tag{2.30}$$

and

$$v_\phi = r\dot{\phi} = r\omega \tag{2.31}$$

These equations are true for circular motion.

PROBLEMS

2-1 A point moves with constant acceleration, a, along the x-axis. If it moves distances $\Delta x_1$ and $\Delta x_2$ in successive intervals of time $\Delta t_1$ and $\Delta t_2$, prove that the acceleration is

$a = 2(v_2 - v_1)/T$

where $v_1 = \Delta x_1/\Delta t_1$, $v_2 = \Delta x_2/\Delta t_2$, and $T = \Delta t_1 + \Delta t_2$.

2-2 A point moves along the x-axis with an instantaneous deceleration (negative acceleration):

$a(t) \propto -v^{n+1}(t)$

where v(t) is the instantaneous speed at time t, and n is a positive integer. If the initial speed of the point is u (at t = 0), show that

$k_n t = \{(u^n - v^n)/(uv)^n\}/n$, where $k_n$ is a constant of proportionality,

and that the distance travelled, x(t), by the point from its initial position is

$k_n x(t) = \{(u^{n-1} - v^{n-1})/(uv)^{n-1}\}/(n - 1).$

2-3 A point moves along the x-axis with an instantaneous deceleration $kv^3(t)$, where v(t) is the speed and k is a constant. Show that

v(t) = u/(1 + kux(t))

where x(t) is the distance travelled, and u is the initial speed of the point.

2-4 A point moves along the x-axis with an instantaneous acceleration

$d^2x/dt^2 = -\omega^2/x^2$

where $\omega$ is a constant. If the point starts from rest at x = a, show that the speed of the particle is

$dx/dt = -\omega\{2(a - x)/(ax)\}^{1/2}.$

Why is the negative square root chosen?

2-5 A point P moves with constant speed v along the x-axis of a Cartesian system, and a point Q moves with constant speed u along the y-axis. At time t = 0, P is at x = 0, and Q, moving towards the origin, is at y = D.
Show that the minimum distance, $d_{min}$, between P and Q during their motion is

$d_{min} = D\{1/(1 + (u/v)^2)\}^{1/2}.$

Solve this problem in two ways: 1) by direct minimization of a function, and 2) by a geometrical method that depends on the choice of a more suitable frame of reference (for example, the rest frame of P).

2-6 Two ships are sailing with constant velocities $\mathbf{u}$ and $\mathbf{v}$ on straight courses that are inclined at an angle $\theta$. If, at a given instant, their distances from the point of intersection of their courses are a and b, find their minimum distance apart.

2-7 A point moves along the x-axis with an acceleration $a(t) = kt^2$, where t is the time the point has been in motion, and k is a constant. If the initial speed of the point is u, show that the distance travelled in time t is

$x(t) = ut + (1/12)kt^4.$

2-8 A point, moving along the x-axis, travels a distance x(t) given by the equation

$x(t) = a\exp\{kt\} + b\exp\{-kt\}$

where a, b, and k are constants. Prove that the acceleration of the point is proportional to the distance travelled.

2-9 A point moves in the plane with the equations of motion

$$\begin{pmatrix} d^2x/dt^2 \\ d^2y/dt^2 \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

Let the following coordinate transformation be made

u = (x + y)/2 and v = (x − y)/2.

Show that in the u-v frame, the equations of motion have a simple form, and that the time-dependence of the coordinates is given by

u = A cos t + B sin t,

and

$v = C\cos\sqrt{3}\,t + D\sin\sqrt{3}\,t$, where A, B, C, D are constants.

This coordinate transformation has diagonalized the original matrix:

$$\begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix} \to \begin{pmatrix} -1 & 0 \\ 0 & -3 \end{pmatrix}$$

The matrix with zeros everywhere, except along the main diagonal, has the interesting property that it does not rotate the vectors on which it acts. The diagonal elements are called the eigenvalues of the diagonal matrix, and the renormalized vectors are called eigenvectors. A small industry exists that is devoted to finding optimum ways of diagonalizing large matrices.
Illustrate the motion of the system in the x-y frame and in the u-v frame.

3 CLASSICAL AND SPECIAL RELATIVITY

3.1 The Galilean transformation

Events belong to the physical world; they are not abstractions. We shall, nonetheless, introduce the idea of an ideal event that has neither extension nor duration. Ideal events may be represented as points in a space-time geometry. An event is described by a four-vector E[t, x, y, z], where t is the time, and x, y, z are the spatial coordinates, referred to arbitrarily chosen origins.

Let an event E[t, x], recorded by an observer O at the origin of an x-axis, be recorded as the event E′[t′, x′] by a second observer O′, moving at constant speed V along the x-axis. We suppose that their clocks are synchronized at t = t′ = 0 when they coincide at a common origin, x = x′ = 0.

At time t, we write the plausible equations

t′ = t

and

x′ = x − Vt,

where Vt is the distance travelled by O′ in a time t. These equations can be written

$$\mathbf{E}' = \mathbf{G}\mathbf{E} \tag{3.1}$$

where

$$\mathbf{G} = \begin{pmatrix} 1 & 0 \\ -V & 1 \end{pmatrix}$$

$\mathbf{G}$ is the operator of the Galilean transformation.

The inverse equations are

t = t′

and

x = x′ + Vt′

or

$$\mathbf{E} = \mathbf{G}^{-1}\mathbf{E}' \tag{3.2}$$

where $\mathbf{G}^{-1}$ is the inverse Galilean operator. (It undoes the effect of $\mathbf{G}$).

If we multiply t and t′ by the constants k and k′, respectively, where k and k′ have dimensions of velocity, then all terms have dimensions of length.

In space-space, we have the Pythagorean form $x^2 + y^2 = r^2$ (an invariant under rotations). We are therefore led to ask the question: is $(kt)^2 + x^2$ an invariant under $\mathbf{G}$ in space-time? Direct calculation gives

$(kt)^2 + x^2 = (k't')^2 + x'^2 + 2Vx't' + V^2t'^2 = (k't')^2 + x'^2$ only if V = 0.

We see, therefore, that Galilean space-time does not leave the sum of squares invariant.
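The properties of the Galilean operator can be verified with simple arithmetic. This is a sketch only: the values V = 3, the event [t, x] = [2, 5], the choice k = 1, and the helper names are assumptions made for illustration. It checks that the operator with V replaced by −V undoes the transformation, and that the sum of squares changes when V is not zero.

```python
def galilean(V):
    """The 2 x 2 Galilean operator G acting on an event [t, x]."""
    return ((1.0, 0.0), (-V, 1.0))


def mat_vec(M, E):
    """Apply a 2 x 2 matrix to a 2-component event."""
    return (M[0][0] * E[0] + M[0][1] * E[1],
            M[1][0] * E[0] + M[1][1] * E[1])


def mat_mul(A, B):
    """Product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def sum_of_squares(E, k=1.0):
    """(kt)^2 + x^2 for an event E = [t, x]."""
    return (k * E[0]) ** 2 + E[1] ** 2


V = 3.0                      # assumed relative speed
E = (2.0, 5.0)               # assumed event [t, x] recorded by O

E_prime = mat_vec(galilean(V), E)                # t' = t, x' = x - Vt
product = mat_mul(galilean(-V), galilean(V))     # inverse times G

print(E_prime)
print(product)
print(sum_of_squares(E), sum_of_squares(E_prime))
```

Changing V to zero makes the two sums of squares equal, in line with the direct calculation in the text.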
We note, however, the key rôle played by acceleration in Galilean-Newtonian physics:

The velocities of the events according to O and O′ are obtained by differentiating x′ = −Vt + x with respect to time, giving

$$v' = -V + v, \tag{3.3}$$

a result that agrees with everyday observations.

Differentiating v′ with respect to time gives

$$dv'/dt' = a' = dv/dt = a \tag{3.4}$$

where a and a′ are the accelerations in the two frames of reference. The classical acceleration is an invariant under the Galilean transformation. If the relationship v′ = v − V is used to describe the motion of a pulse of light, moving in empty space at $v = c \cong 3 \times 10^8$ m/s, it does not fit the facts. For example, if V is 0.5c, we expect to obtain v′ = 0.5c, whereas, it is found that v′ = c. Indeed, in all cases studied, v′ = c for all values of V.

3.2 Einstein's space-time symmetry: the Lorentz transformation

It was Einstein, above all others, who advanced our understanding of the nature of space-time and relative motion. He made use of a symmetry argument to find the changes that must be made to the Galilean transformation if it is to account for the relative motion of rapidly moving objects and of beams of light. Einstein recognized an inconsistency in the Galilean-Newtonian equations, based as they are, on everyday experience. The discussion will be limited to non-accelerating, or so called inertial, frames.

We have seen that the classical equations relating the events E and E′ are $\mathbf{E}' = \mathbf{G}\mathbf{E}$, and the inverse $\mathbf{E} = \mathbf{G}^{-1}\mathbf{E}'$, where

$$\mathbf{G} = \begin{pmatrix} 1 & 0 \\ -V & 1 \end{pmatrix} \text{ and } \mathbf{G}^{-1} = \begin{pmatrix} 1 & 0 \\ V & 1 \end{pmatrix}$$

These equations are connected by the substitution V ↔ −V; this is an algebraic statement of the Newtonian principle of relativity. Einstein incorporated this principle in his theory. He also retained the linearity of the classical equations in the absence of any evidence to the contrary.
(Equispaced intervals of time and distance in one inertial frame remain equispaced in any other inertial frame). He symmetrized the space-time equations as follows:

$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \begin{pmatrix} 1 & -V \\ -V & 1 \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} \tag{3.5}$$

Note, however, the inconsistency in the dimensions of the time-equation that has now been introduced:

t′ = t − Vx.

The term Vx has dimensions of $[L]^2/[T]$, and not [T]. This can be corrected by introducing the invariant speed of light, c, a postulate in Einstein's theory that is consistent with the result of the Michelson-Morley experiment:

ct′ = ct − Vx/c

so that all terms now have dimensions of length.

Einstein went further, and introduced a dimensionless quantity $\gamma$ instead of the scaling factor of unity that appears in the Galilean equations of space-time. This factor must be consistent with all observations. The equations then become

$ct' = \gamma\,ct - \beta\gamma\,x$
$x' = -\beta\gamma\,ct + \gamma\,x$, where $\beta = V/c$.

These can be written

$$\mathbf{E}' = \mathbf{L}\mathbf{E} \tag{3.6}$$

where

$$\mathbf{L} = \begin{pmatrix} \gamma & -\beta\gamma \\ -\beta\gamma & \gamma \end{pmatrix}, \text{ and } \mathbf{E} = [ct, x].$$

$\mathbf{L}$ is the operator of the Lorentz transformation.

The inverse equation is

$$\mathbf{E} = \mathbf{L}^{-1}\mathbf{E}' \tag{3.7}$$

where

$$\mathbf{L}^{-1} = \begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix}$$

This is the inverse Lorentz transformation, obtained from $\mathbf{L}$ by changing $\beta \to -\beta$ (V → −V); it has the effect of undoing the transformation $\mathbf{L}$. We can therefore write

$$\mathbf{L}\mathbf{L}^{-1} = \mathbf{I} \tag{3.8}$$

Carrying out the matrix multiplications, and equating elements gives

$\gamma^2 - \beta^2\gamma^2 = 1$

therefore,

$$\gamma = 1/\sqrt{1 - \beta^2} \text{ (taking the positive root).} \tag{3.9}$$

As V → 0, $\beta$ → 0 and therefore $\gamma$ → 1; this represents the classical limit in which the Galilean transformation is, for all practical purposes, valid. In particular, time and space intervals have the same measured values in all Galilean frames of reference, and acceleration is the single fundamental invariant.
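The relations (3.8) and (3.9) can be confirmed numerically. The sketch below is illustrative: the value β = 0.6 and the function names `gamma`, `lorentz` and `mat_mul` are assumptions, and the product is compared with the identity only to within floating-point rounding.

```python
import math


def gamma(beta):
    """The factor 1 / sqrt(1 - beta^2) of Eq. (3.9)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def lorentz(beta):
    """The 2 x 2 Lorentz operator L acting on an event [ct, x]."""
    g = gamma(beta)
    return ((g, -beta * g), (-beta * g, g))


def mat_mul(A, B):
    """Product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


beta = 0.6                                   # assumed value of V/c

L = lorentz(beta)
L_inverse = lorentz(-beta)                   # beta -> -beta gives the inverse
product = mat_mul(L, L_inverse)              # should be the identity

print(gamma(beta))
print(product)

# classical limit: gamma -> 1 as beta -> 0
print(gamma(1e-6))
```

For β = 0.6 the diagonal elements of the product are $\gamma^2 - \beta^2\gamma^2$, which is the combination set equal to 1 in the text.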
3.3 The invariant interval: contravariant and covariant vectors

Previously, it was shown that the space-time of Galileo and Newton is not Pythagorean under $\mathbf{G}$. We now ask the question: is Einsteinian space-time Pythagorean under $\mathbf{L}$? Direct calculation leads to

$(ct)^2 + x^2 = \gamma^2(1 + \beta^2)(ct')^2 + 4\beta\gamma^2 x'ct' + \gamma^2(1 + \beta^2)x'^2 \neq (ct')^2 + x'^2$ if $\beta > 0$.

Note, however, that the difference of squares is an invariant:

$$(ct)^2 - x^2 = (ct')^2 - x'^2 \tag{3.10}$$

because

$\gamma^2(1 - \beta^2) = 1.$

Space-time is said to be pseudo-Euclidean.

The negative sign that characterizes Lorentz invariance can be included in the theory in a general way as follows.

We introduce two kinds of 4-vectors

$$x^\mu = [x^0, x^1, x^2, x^3], \text{ a contravariant vector} \tag{3.11}$$

and

$$x_\mu = [x_0, x_1, x_2, x_3], \text{ a covariant vector, where } x_\mu = [x^0, -x^1, -x^2, -x^3]. \tag{3.12}$$

The scalar (or inner) product of the vectors is defined as

$$x^{\mu T}x_\mu = (x^0, x^1, x^2, x^3)[x^0, -x^1, -x^2, -x^3] = (x^0)^2 - ((x^1)^2 + (x^2)^2 + (x^3)^2), \tag{3.13}$$

written to conform to matrix multiplication (row × column). The superscript T is usually omitted in writing the invariant; it is implied in the form $x^\mu x_\mu$.

The event 4-vector is

$E^\mu = [ct, x, y, z]$ and the covariant form is $E_\mu = [ct, -x, -y, -z]$

so that the invariant scalar product is

$$E^\mu E_\mu = (ct)^2 - (x^2 + y^2 + z^2). \tag{3.14}$$

A general Lorentz 4-vector $x^\mu$ transforms as follows:

$$x'^\mu = \mathbf{L}x^\mu \tag{3.15}$$

where

$$\mathbf{L} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

This is the operator of the Lorentz transformation if the motion of O′ is along the x-axis of O's frame of reference, and the initial times are synchronized (t = t′ = 0 at x = x′ = 0).
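The invariance of the scalar product can be illustrated for a single event. In the sketch below the event [ct, x, y, z] = [5, 3, 1, 2], the value β = 0.6, and the function names are assumptions made for the example. The 4 × 4 operator for motion along the x-axis is applied to the event, and the difference of squares is compared with the ordinary sum of squares.

```python
import math


def boost_x(beta):
    """4 x 4 Lorentz operator for relative motion along the x-axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [g, -beta * g, 0.0, 0.0],
        [-beta * g, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(L, x):
    """Matrix-vector product L x for a 4-vector x."""
    return [sum(L[i][k] * x[k] for k in range(4)) for i in range(4)]


def interval(x):
    """Scalar product (x0)^2 - ((x1)^2 + (x2)^2 + (x3)^2)."""
    return x[0] ** 2 - (x[1] ** 2 + x[2] ** 2 + x[3] ** 2)


def euclidean(x):
    """Ordinary sum of squares, for comparison."""
    return sum(c * c for c in x)


event = [5.0, 3.0, 1.0, 2.0]                 # assumed event [ct, x, y, z]
event_prime = apply(boost_x(0.6), event)

print(event_prime)
print(interval(event), interval(event_prime))
print(euclidean(event), euclidean(event_prime))
```

The components y and z are unchanged because the relative motion is along x; only the first two components mix.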
Two important consequences of the Lorentz transformation, discussed in 3.6, are that intervals of time measured in two different inertial frames are not the same; they are related by the equation

$$\Delta t' = \gamma\,\Delta t \tag{3.16}$$

where Δt is an interval measured on a clock at rest in O's frame, and distances are given by

$$\Delta l' = \Delta l/\gamma \tag{3.17}$$

where Δl is a length measured on a ruler at rest in O's frame.

3.4 The group structure of Lorentz transformations

The square of the invariant interval s, between the origin [0, 0, 0, 0] and an arbitrary event $x^\mu = [x^0, x^1, x^2, x^3]$ is, in index notation

$$s^2 = x^\mu x_\mu = x'^\mu x'_\mu \text{ (sum over } \mu = 0, 1, 2, 3). \tag{3.18}$$

The lower indices can be raised using the metric tensor $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, so that

$$s^2 = \eta_{\mu\nu}x^\mu x^\nu = \eta_{\mu\nu}x'^\mu x'^\nu \text{ (sum over } \mu \text{ and } \nu). \tag{3.19}$$

The vectors now have contravariant forms.

In matrix notation, the invariant is

$$s^2 = \mathbf{x}^T\boldsymbol{\eta}\,\mathbf{x} = \mathbf{x}'^T\boldsymbol{\eta}\,\mathbf{x}' \tag{3.20}$$

(The transpose must be written explicitly).

The primed and unprimed column matrices (contravariant vectors) are related by the Lorentz matrix operator, $\mathbf{L}$:

$\mathbf{x}' = \mathbf{L}\mathbf{x}.$

We therefore have

$\mathbf{x}^T\boldsymbol{\eta}\,\mathbf{x} = (\mathbf{L}\mathbf{x})^T\boldsymbol{\eta}\,(\mathbf{L}\mathbf{x}) = \mathbf{x}^T\mathbf{L}^T\boldsymbol{\eta}\,\mathbf{L}\mathbf{x}.$

The $\mathbf{x}$'s are arbitrary, therefore

$$\mathbf{L}^T\boldsymbol{\eta}\,\mathbf{L} = \boldsymbol{\eta} \tag{3.21}$$

This is the defining property of the Lorentz transformations.

The set of all Lorentz transformations is the set $\mathcal{L}$ of all 4 × 4 matrices that satisfies the defining property

$\mathcal{L} = \{\mathbf{L} : \mathbf{L}^T\boldsymbol{\eta}\,\mathbf{L} = \boldsymbol{\eta};\ \mathbf{L} \text{ all } 4 \times 4 \text{ real matrices};\ \boldsymbol{\eta} = \mathrm{diag}(1, -1, -1, -1)\}.$

(Note that each $\mathbf{L}$ has 16 (independent) real matrix elements, and therefore belongs to the 16-dimensional space, $R^{16}$).

Consider the result of two successive Lorentz transformations $\mathbf{L}_1$ and $\mathbf{L}_2$ that transform a 4-vector $\mathbf{x}$ as follows

$\mathbf{x} \to \mathbf{x}' \to \mathbf{x}''$

where

$\mathbf{x}' = \mathbf{L}_1\mathbf{x}$, and $\mathbf{x}'' = \mathbf{L}_2\mathbf{x}'$.

The resultant vector $\mathbf{x}''$ is given by

$\mathbf{x}'' = \mathbf{L}_2(\mathbf{L}_1\mathbf{x}) = \mathbf{L}_2\mathbf{L}_1\mathbf{x} = \mathbf{L}_c\mathbf{x}$

where

$\mathbf{L}_c = \mathbf{L}_2\mathbf{L}_1$ ($\mathbf{L}_1$ followed by $\mathbf{L}_2$).
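(3.22)

If the combined operation $\mathbf{L}_c$ is always a Lorentz transformation then it must satisfy

$\mathbf{L}_c^T\boldsymbol{\eta}\,\mathbf{L}_c = \boldsymbol{\eta}.$

We must therefore have

$(\mathbf{L}_2\mathbf{L}_1)^T\boldsymbol{\eta}\,(\mathbf{L}_2\mathbf{L}_1) = \boldsymbol{\eta}$

or

$\mathbf{L}_1^T(\mathbf{L}_2^T\boldsymbol{\eta}\,\mathbf{L}_2)\mathbf{L}_1 = \boldsymbol{\eta}$

so that

$\mathbf{L}_1^T\boldsymbol{\eta}\,\mathbf{L}_1 = \boldsymbol{\eta}$, ($\mathbf{L}_1, \mathbf{L}_2 \in \mathcal{L}$)

therefore

$$\mathbf{L}_c = \mathbf{L}_2\mathbf{L}_1 \in \mathcal{L} \tag{3.23}$$

Any number of successive Lorentz transformations may be carried out to give a resultant that is itself a Lorentz transformation.

If we take the determinant of the defining equation of $\mathbf{L}$,

$\det(\mathbf{L}^T\boldsymbol{\eta}\,\mathbf{L}) = \det\boldsymbol{\eta}$

we obtain

$(\det\mathbf{L})^2 = 1$ ($\det\mathbf{L} = \det\mathbf{L}^T$)

so that

$$\det\mathbf{L} = \pm 1 \tag{3.24}$$

Since the determinant of $\mathbf{L}$ is not zero, an inverse transformation $\mathbf{L}^{-1}$ exists, and the equation $\mathbf{L}^{-1}\mathbf{L} = \mathbf{I}$, the identity, is always valid.

Consider the inverse of the defining equation

$(\mathbf{L}^T\boldsymbol{\eta}\,\mathbf{L})^{-1} = \boldsymbol{\eta}^{-1}$

or

$\mathbf{L}^{-1}\boldsymbol{\eta}^{-1}(\mathbf{L}^T)^{-1} = \boldsymbol{\eta}^{-1}.$

Using $\boldsymbol{\eta} = \boldsymbol{\eta}^{-1}$, and rearranging, gives

$$\mathbf{L}^{-1}\boldsymbol{\eta}\,(\mathbf{L}^{-1})^T = \boldsymbol{\eta} \tag{3.25}$$

This result shows that the inverse $\mathbf{L}^{-1}$ is always a member of the set $\mathcal{L}$.

The Lorentz transformations $\mathbf{L}$ are matrices, and therefore they obey the associative property under matrix multiplication.

We therefore see that

1. If $\mathbf{L}_1$ and $\mathbf{L}_2 \in \mathcal{L}$ then $\mathbf{L}_2\mathbf{L}_1 \in \mathcal{L}$
2. If $\mathbf{L} \in \mathcal{L}$ then $\mathbf{L}^{-1} \in \mathcal{L}$
3. The identity $\mathbf{I} = \mathrm{diag}(1, 1, 1, 1) \in \mathcal{L}$
and
4. The matrix operators $\mathbf{L}$ obey associativity.

The set of all Lorentz transformations therefore forms a group.

3.5 The rotation group

Spatial rotations in two and three dimensions are Lorentz transformations in which the time-component remains unchanged. In Chapter 1 the geometrical properties of the rotation operators are discussed. In this section, we shall consider the algebraic structure of the operators.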
Let $\mathcal{R}$ be a real 3 × 3 matrix that is part of a Lorentz transformation with a constant time-component,

$$\mathbf{L} = \begin{pmatrix} 1 & 0 \\ 0 & \mathcal{R} \end{pmatrix}, \text{ a } 4 \times 4 \text{ matrix written in block form.} \tag{3.26}$$

In this case, the defining property of the Lorentz transformations leads to

$$\begin{pmatrix} 1 & 0 \\ 0 & \mathcal{R}^T \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \mathcal{R} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{3.27}$$

so that

$\mathcal{R}^T\mathcal{R} = \mathbf{I}$, the identity matrix, diag(1, 1, 1).

This is the defining property of a three-dimensional orthogonal matrix. (The related two-dimensional case is treated in Chapter 1).

If $\mathbf{x} = [x_1, x_2, x_3]$ is a three-vector that is transformed under $\mathcal{R}$ to give $\mathbf{x}'$ then

$$\mathbf{x}'^T\mathbf{x}' = \mathbf{x}^T\mathcal{R}^T\mathcal{R}\,\mathbf{x} = \mathbf{x}^T\mathbf{x} = x_1^2 + x_2^2 + x_3^2 = \text{invariant under } \mathcal{R}. \tag{3.28}$$

The action of $\mathcal{R}$ on any three-vector preserves length. The set of all 3 × 3 orthogonal matrices is denoted by O(3),

$O(3) = \{\mathcal{R} : \mathcal{R}\mathcal{R}^T = \mathbf{I},\ r_{ij} \in \text{Reals}\}.$

The elements of this set satisfy the four group axioms.

3.6 The relativity of simultaneity: time dilation and length contraction

In order to record the time and place of a sequence of events in a particular inertial reference frame, it is necessary to introduce an infinite set of adjacent "observers", located throughout the entire space. Each observer, at a known, fixed position in the reference frame, carries a clock to record the time and the characteristic property of every event in his immediate neighborhood. The observers are not concerned with non-local events. The clocks carried by the observers are synchronized: they all read the same time throughout the reference frame. The process of synchronization is discussed later. It is the job of the chief observer to collect the information concerning the time, place, and characteristic feature of the events recorded by all observers, and to construct the world line (a path in space-time), associated with a particular characteristic feature (the type of particle, for example).
Consider two sources of light, 1 and 2, and a point M midway between them. Let $E_1$ denote the event "flash of light leaves 1", and $E_2$ denote the event "flash of light leaves 2". The events $E_1$ and $E_2$ are simultaneous if the flashes of light from 1 and 2 reach M at the same time. The fact that the speed of light in free space is independent of the speed of the source means that simultaneity is relative.

The clocks of all the observers in a reference frame are synchronized by correcting them for the speed of light as follows. Consider a set of clocks located at $x_0, x_1, x_2, x_3, \ldots$ along the x-axis of a reference frame. Let $x_0$ be the chief's clock, and let a flash of light be sent from the clock at $x_0$ when it is reading $t_0$ (12 noon, say). At the instant that the light signal reaches the clock at $x_1$ it is set to read $t_0 + (x_1/c)$, at the instant that the light signal reaches the clock at $x_2$ it is set to read $t_0 + (x_2/c)$, and so on for every clock along the x-axis. All clocks in the reference frame then read the same time: they are synchronized. From the viewpoint of all other inertial observers, in their own reference frames, the set of clocks, synchronized using the above procedure, appears to be unsynchronized. It is the lack of symmetry in the synchronization of clocks in different reference frames that leads to two non-intuitive results, namely length contraction and time dilation.

Length contraction: an application of the Lorentz transformation

Consider a rigid rod at rest on the x′-axis of an inertial reference frame S′. Because it is at rest, it does not matter when its end-points $x_1'$ and $x_2'$ are measured to give the rest-length, or proper length, of the rod, $L_0 = x_2' - x_1'$. Consider the same rod observed in an inertial reference frame S that is moving with constant velocity $-V$ relative to S′, with its x-axis parallel to the x′-axis.
We wish to determine the length of the moving rod; we require the length $L = x_2 - x_1$ according to the observers in S. This means that the observers in S must measure $x_1$ and $x_2$ at the same time in their reference frame. The events in the two reference frames S and S′ are related by the spatial part of the Lorentz transformation:

$$x' = -\beta\gamma\, ct + \gamma x$$

and therefore

$$x_2' - x_1' = -\beta\gamma\, c(t_2 - t_1) + \gamma(x_2 - x_1),$$

where $\beta = V/c$ and $\gamma = 1/\sqrt{1 - \beta^{2}}$.

Since we require the length $(x_2 - x_1)$ in S to be measured at the same time in S, we must have $t_2 - t_1 = 0$, and therefore

$$L_0 = x_2' - x_1' = \gamma(x_2 - x_1),$$

or

$$L_0\ (\text{at rest}) = \gamma L\ (\text{moving}). \tag{3.29}$$

The length of a moving rod, $L$, is therefore less than the length of the same rod measured at rest, $L_0$, because $\gamma > 1$.

Time dilation

Consider a clock at rest at the origin of an inertial frame S′, and a set of synchronized clocks at $x_0, x_1, x_2, \ldots$ on the x-axis of another inertial frame S. Let S′ move at constant speed $V$ relative to S, along the common x-, x′-axis. Let the clocks at $x_0$ and $x_0'$ be synchronized to read $t_0$ and $t_0'$ at the instant that they coincide in space. A proper time interval is defined to be the time between two events measured in an inertial frame in which the two events occur at the same place. The time part of the Lorentz transformation can be used to relate an interval of time measured on the single clock in the S′ frame, and the same interval of time measured on the set of synchronized clocks at rest in the S frame. We have

$$ct = \gamma\, ct' + \beta\gamma\, x'$$

or

$$c(t_2 - t_1) = \gamma\, c(t_2' - t_1') + \beta\gamma\,(x_2' - x_1').$$

There is no separation between a single clock and itself, therefore $x_2' - x_1' = 0$, so that

$$c(t_2 - t_1)\ (\text{moving}) = \gamma\, c(t_2' - t_1')\ (\text{at rest}) \qquad (\gamma > 1). \tag{3.30}$$

A moving clock runs more slowly than a clock at rest.

In Chapter 1 it was shown that the general 2 × 2 matrix operator transforms rectangular coordinates into oblique coordinates.
The Lorentz transformation is a special case of the 2 × 2 matrices, and therefore its effect is to transform rectangular space-time coordinates into oblique space-time coordinates.

[Figure: The geometrical form of the Lorentz transformation. An event E[ct, x] or E[ct′, x′] is shown relative to the rectangular axes ct, x and the oblique axes ct′, x′; each primed axis is inclined to the corresponding unprimed axis at the angle $\tan^{-1}\beta$.]

The symmetry of space-time means that the transformed axes rotate through equal angles, $\tan^{-1}\beta$. The relativity of simultaneity is clearly exhibited on this diagram: two events that occur at the same time in the ct, x frame necessarily occur at different times in the oblique ct′, x′ frame.

3.7 The 4-velocity

A differential time interval, $dt$, cannot be used in a Lorentz-invariant way in kinematics. We must use the proper time differential interval, $d\tau$, defined by

$$(c\,dt)^{2} - dx^{2} = (c\,dt')^{2} - dx'^{2} \equiv (c\,d\tau)^{2}. \tag{3.31}$$

The Newtonian 3-velocity is

$$\mathbf{v}_N = [dx/dt,\ dy/dt,\ dz/dt],$$

and this must be replaced by the 4-velocity

$$V^{\mu} = [d(ct)/d\tau,\ dx/d\tau,\ dy/d\tau,\ dz/d\tau] = [d(ct)/dt,\ dx/dt,\ dy/dt,\ dz/dt]\,(dt/d\tau) = [\gamma c,\ \gamma\mathbf{v}_N]. \tag{3.32}$$

The scalar product is then (the transpose is understood)

$$V^{\mu}V_{\mu} = (\gamma c)^{2} - (\gamma v_N)^{2} = (\gamma c)^{2}\left(1 - (v_N/c)^{2}\right) = c^{2}. \tag{3.33}$$

The magnitude of the 4-velocity is therefore $|V^{\mu}| = c$, the invariant speed of light.

PROBLEMS

3-1 Two points, A and B, move in the plane with constant velocities $|\mathbf{v}_A| = \sqrt{2}$ m·s⁻¹ and $|\mathbf{v}_B| = 2\sqrt{2}$ m·s⁻¹. They move from their initial (t = 0) positions, A(0)[1, 1] and B(0)[6, 2], as shown:

[Figure: the points A(0) and B(0), the velocity vectors $\mathbf{v}_A$ and $\mathbf{v}_B$, and the initial separation $\mathbf{R}(0)$, plotted on axes x, m (0 to 8) and y, m (0 to 6).]

Show that the closest distance between the points is $|\mathbf{R}|_{\min}$ = 2.5298... meters, and that it occurs 1.40... seconds after they leave their initial positions. (Remember that all inertial frames are equivalent, therefore choose the most appropriate for dealing with this problem.)
3-2 Show that the set of all standard (motion along the common x-axis) Galilean transformations forms a group.

3-3 A flash of light is sent out from a point $x_1$ on the x-axis of an inertial frame S, and it is received at a point $x_2 = x_1 + l$. Consider another inertial frame, S′, moving with constant speed $V = \beta c$ along the x-axis; show that, in S′:

i) the separation between the point of emission and the point of reception of the light is $l' = l\{(1 - \beta)/(1 + \beta)\}^{1/2}$,

ii) the time interval between the emission and reception of the light is $\Delta t' = (l/c)\{(1 - \beta)/(1 + \beta)\}^{1/2}$.

3-4 The distance between two photons of light that travel along the x-axis of an inertial frame, S, is always $l$. Show that, in a second inertial frame, S′, moving at constant speed $V = \beta c$ along the x-axis, the separation between the two photons is $\Delta x' = l\{(1 + \beta)/(1 - \beta)\}^{1/2}$.

3-5 An event [ct, x] in an inertial frame, S, is transformed under a standard Lorentz transformation to [ct′, x′] in a standard primed frame, S′, that has a constant speed $V$ along the x-axis. Show that the velocity components of the point x, x′ are related by the equation

$$v_x = (v_x' + V)/(1 + (v_x' V/c^{2})).$$

3-6 An object called a K⁰-meson decays when at rest into two objects called π-mesons (π±), each with a speed of 0.8c. If the K⁰-meson has a measured speed of 0.9c when it decays, show that the greatest speed of one of the π-mesons is (85/86)c and that its least speed is (5/14)c.

4 NEWTONIAN DYNAMICS

Although our discussion of the geometry of motion has led to major advances in our understanding of measurements of space and time in different inertial systems, we have yet to come to the crux of the matter, namely a discussion of the effects of forces on the motion of two or more interacting particles. This key branch of Physics is called Dynamics. It was founded by Galileo and Newton and perfected by their followers, most notably Lagrange and Hamilton.
We shall see that the Newtonian concepts of momentum and kinetic energy require fundamental revisions in the light of Einstein's Special Theory of Relativity. The revised concepts come about as a result of Einstein's recognition of the crucial rôle of the Principle of Relativity in unifying the dynamics of all mechanical and optical phenomena. In spite of the conceptual difficulties inherent in the classical concepts (difficulties that will be discussed later), the subject of Newtonian dynamics represents one of the great triumphs of Natural Philosophy. The successes of the classical theory range from accurate descriptions of the dynamics of everyday objects to a detailed understanding of the motions of galaxies.

4.1 The law of inertia

Galileo (1564-1642) was the first to develop a quantitative approach to the study of motion. He addressed the question: what property of motion is related to force? Is it the position of the moving object? Is it the velocity of the moving object? Is it the rate of change of its velocity? The answer to the question can be obtained only from observations; this is a basic feature of Physics that sets it apart from Philosophy proper. Galileo observed that force influences the changes in velocity (accelerations) of an object and that, in the absence of external forces (e.g. friction), no force is needed to keep an object in motion that is travelling in a straight line with constant speed. This observationally based law is called the Law of Inertia. It is, perhaps, difficult for us to appreciate the impact of Galileo's new ideas concerning motion. The fact that an object resting on a horizontal surface remains at rest unless something we call force is applied to change its state of rest was, of course, well-known before Galileo's time.
However, the fact that the object continues to move after the force ceases to be applied caused considerable conceptual difficulties for the early Philosophers (see Feynman, The Character of Physical Law). The observation that, in practice, an object comes to rest due to frictional forces and air resistance was recognized by Galileo to be a side effect, and not germane to the fundamental question of motion. Aristotle, for example, believed that the true or natural state of motion is one of rest. It is instructive to consider Aristotle's conjecture from the viewpoint of the Principle of Relativity: is a natural state of rest consistent with this general Principle? According to the general Principle of Relativity, the laws of motion have the same form in all frames of reference that move with constant speed in straight lines with respect to each other. An observer in a reference frame moving with constant speed in a straight line with respect to the reference frame in which the object is at rest would conclude that the natural state of motion of the object is one of constant speed in a straight line, and not one of rest. All inertial observers, in an infinite number of frames of reference, would come to the same conclusion. We see, therefore, that Aristotle's conjecture is not consistent with this fundamental Principle.

4.2 Newton's laws of motion

During his early twenties, Newton postulated three Laws of Motion that form the basis of Classical Dynamics. He used them to solve a wide variety of problems including the dynamics of the planets. The Laws of Motion, first published in the Principia in 1687, play a fundamental rôle in Newton's Theory of Gravitation (Chapter 7); they are:

1. In the absence of an applied force, an object will remain at rest or in its present state of constant speed in a straight line (Galileo's Law of Inertia).

2. In the presence of an applied force, an object will be accelerated in the direction of the applied force, and the product of its mass multiplied by its acceleration is equal to the force.

3. If a body A exerts a force of magnitude $|\mathbf{F}_{AB}|$ on a body B, then B exerts a force of equal magnitude $|\mathbf{F}_{BA}|$ on A. The forces act in opposite directions so that $\mathbf{F}_{AB} = -\mathbf{F}_{BA}$.

In law number 2, the acceleration lasts only while the applied force lasts. The applied force need not, however, be constant in time; the law is true at all times during the motion. Law number 3 applies to "contact" interactions. If the bodies are separated, and the interaction takes a finite time to propagate between the bodies, the law must be modified to include the properties of the "field" between the bodies. This important point is discussed in Chapter 7.

4.3 Systems of many interacting particles: conservation of linear and angular momentum

Studies of the dynamics of two or more interacting particles form the basis of a key part of Physics. We shall deduce two fundamental principles from the Laws of Motion; they are:

1) The Conservation of Linear Momentum, which states that, if there is a direction in which the sum of the components of the external forces acting on a system is zero, then the linear momentum of the system in that direction is constant,

and

2) The Conservation of Angular Momentum, which states that, if the sum of the moments of the external forces about any fixed axis (or origin) is zero, then the angular momentum about that axis (or origin) is constant.

The new terms that appear in these statements will be defined later.

The first of these principles will be deduced by considering the dynamics of two interacting particles of masses $m_1$ and $m_2$ with instantaneous coordinates $[x_1, y_1]$ and $[x_2, y_2]$, respectively.
In Chapter 12 these principles will be deduced by considering the invariance of the Laws of Motion under translations and rotations of the coordinate systems. Let the external forces acting on the particles be $\mathbf{F}_1$ and $\mathbf{F}_2$ and let the mutual interactions be $\mathbf{F}_{21}$ and $\mathbf{F}_{12}$. The system is as shown:

[Figure: masses $m_1$ and $m_2$ in the x-y plane (origin O), acted on by the external forces $\mathbf{F}_1$, $\mathbf{F}_2$ and by the mutual interactions $\mathbf{F}_{21}$, $\mathbf{F}_{12}$.]

Resolving the forces into their x- and y-components gives

[Figure: the same system with each force resolved into its x- and y-components, $F_{x1}$, $F_{y1}$, $F_{x2}$, $F_{y2}$, $F_{x21}$, $F_{y21}$, $F_{x12}$, $F_{y12}$.]

a) The equations of motion

The equations of motion for each particle are

1) Resolving in the x-direction

$$F_{x1} + F_{x21} = m_1(d^{2}x_1/dt^{2}) \tag{4.1}$$

and

$$F_{x2} - F_{x12} = m_2(d^{2}x_2/dt^{2}). \tag{4.2}$$

Adding these equations gives

$$F_{x1} + F_{x2} + (F_{x21} - F_{x12}) = m_1(d^{2}x_1/dt^{2}) + m_2(d^{2}x_2/dt^{2}). \tag{4.3}$$

2) Resolving in the y-direction gives a similar equation, namely

$$F_{y1} + F_{y2} + (F_{y21} - F_{y12}) = m_1(d^{2}y_1/dt^{2}) + m_2(d^{2}y_2/dt^{2}). \tag{4.4}$$

b) The rôle of Newton's 3rd Law

For instantaneous mutual interactions, Newton's 3rd Law gives $|\mathbf{F}_{21}| = |\mathbf{F}_{12}|$ so that the x- and y-components of the internal forces are themselves equal and opposite, therefore the total equations of motion are

$$F_{x1} + F_{x2} = m_1(d^{2}x_1/dt^{2}) + m_2(d^{2}x_2/dt^{2}), \tag{4.5}$$

and

$$F_{y1} + F_{y2} = m_1(d^{2}y_1/dt^{2}) + m_2(d^{2}y_2/dt^{2}). \tag{4.6}$$

c) The conservation of linear momentum

If the sum of the external forces acting on the masses in the x-direction is zero, then

$$F_{x1} + F_{x2} = 0, \tag{4.7}$$

in which case,

$$0 = m_1(d^{2}x_1/dt^{2}) + m_2(d^{2}x_2/dt^{2})$$

or

$$0 = (d/dt)(m_1 v_{x1}) + (d/dt)(m_2 v_{x2}),$$

which, on integration, gives

$$\text{constant} = m_1 v_{x1} + m_2 v_{x2}. \tag{4.8}$$

The product (mass × velocity) is the linear momentum. We therefore see that if there is no resultant external force in the x-direction, the linear momentum of the two particles in the x-direction is conserved.
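The cancellation of the internal forces can be seen in a short numerical experiment. The sketch below is only an illustration under stated assumptions: the mutual interaction is modelled as a linear spring (a choice made here, not one taken from the text), there is no external force in the x-direction, and the function name `simulate` and all parameter values are hypothetical.

```python
def simulate(m1, m2, x1, x2, v1, v2, k=3.0, rest=1.0, dt=1e-3, steps=5000):
    """Step two masses joined by a spring along x, with no external force.

    The internal forces are equal and opposite at every step (Newton's
    3rd Law), so m1*v1 + m2*v2 should stay constant, as in Eq. (4.8).
    """
    for _ in range(steps):
        stretch = (x2 - x1) - rest
        f_on_1 = k * stretch     # pulls mass 1 toward mass 2 when stretched
        f_on_2 = -f_on_1         # equal and opposite internal force
        v1 += (f_on_1 / m1) * dt
        v2 += (f_on_2 / m2) * dt
        x1 += v1 * dt
        x2 += v2 * dt
    return v1, v2


m1, m2 = 1.0, 2.5
v1_0, v2_0 = 0.4, -0.1
p_initial = m1 * v1_0 + m2 * v2_0

v1_f, v2_f = simulate(m1, m2, 0.0, 1.5, v1_0, v2_0)
p_final = m1 * v1_f + m2 * v2_f
```

The individual velocities change substantially during the motion, while `p_final` agrees with `p_initial` to rounding error. Adding an external force to one of the masses in `simulate` would break this agreement, in line with condition (4.7).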
The above argument can be generalized so that we can state: the linear momentum of the two particles is constant in any direction in which there is no resultant external force.

4.3.1 Interaction of n-particles

The analysis given in 4.3 can be carried out for an arbitrary number of particles, n, with masses $m_1, m_2, \ldots, m_n$ and with instantaneous coordinates $[x_1, y_1], [x_2, y_2], \ldots, [x_n, y_n]$. The mutual interactions cancel in pairs so that the equations of motion of the n-particles are, in the x-direction

$$F_{x1} + F_{x2} + \ldots + F_{xn} = m_1\ddot{x}_1 + m_2\ddot{x}_2 + \ldots + m_n\ddot{x}_n, \tag{4.9}$$

where the left-hand side is the sum of the x-components of the external forces acting on the masses, and, in the y-direction

$$F_{y1} + F_{y2} + \ldots + F_{yn} = m_1\ddot{y}_1 + m_2\ddot{y}_2 + \ldots + m_n\ddot{y}_n, \tag{4.10}$$

where the left-hand side is the sum of the y-components of the external forces acting on the masses.

In this case, we see that if the sum of the components of the external forces acting on the system in a particular direction is zero, then the linear momentum of the system in that direction is constant. If, for example, the direction is the x-axis then

$$m_1 v_{x1} + m_2 v_{x2} + \ldots + m_n v_{xn} = \text{constant}. \tag{4.11}$$

4.3.2 Rotation of two interacting particles about a fixed point

We begin the discussion of the second fundamental conservation law by considering the motion of two interacting particles that move under the influence of external forces $\mathbf{F}_1$ and $\mathbf{F}_2$ and mutual interactions (internal forces) $\mathbf{F}_{21}$ and $\mathbf{F}_{12}$. We are interested in the motion of the two masses about a fixed point O that is chosen to be the origin of Cartesian coordinates. The system is illustrated in the following figure:

[Figure: masses $m_1$ and $m_2$ with the forces $\mathbf{F}_1$, $\mathbf{F}_2$, $\mathbf{F}_{12}$, $\mathbf{F}_{21}$; the perpendicular distances $R_1$, $R_2$ and $R$ from the origin O to the lines of action of the forces; the positive (counterclockwise) sense of the moment is marked.]

a) The moment of forces about a fixed origin

The total moment $\Gamma_{1,2}$ of the forces about the origin O is defined as

$$\Gamma_{1,2} = \underbrace{R_1F_1 + R_2F_2}_{\text{moment of external forces}} + \underbrace{(RF_{12} - RF_{21})}_{\text{moment of internal forces}}. \tag{4.12}$$

A positive moment acts in a counterclockwise sense.
Newton's 3rd Law gives $|\mathbf{F}_{21}| = |\mathbf{F}_{12}|$, therefore the moment of the internal forces about O is zero. (Their lines of action are the same.) The total effective moment about O is therefore due to the external forces alone. Writing the moment in terms of the x- and y-components of $\mathbf{F}_1$ and $\mathbf{F}_2$ we obtain

$$\Gamma_{1,2} = x_1F_{y1} + x_2F_{y2} - y_1F_{x1} - y_2F_{x2}. \tag{4.13}$$

b) The conservation of angular momentum

If the moment of the external forces about the origin O is zero then, by integration, we have

$$\text{constant} = x_1p_{y1} + x_2p_{y2} - y_1p_{x1} - y_2p_{x2},$$

where $p_{x1}$ is the x-component of the momentum of mass 1, etc. Rearranging gives

$$\text{constant} = (x_1p_{y1} - y_1p_{x1}) + (x_2p_{y2} - y_2p_{x2}). \tag{4.14}$$

The right-hand side of this equation is called the angular momentum of the two particles about the fixed origin, O.

Alternatively, we can discuss the conservation of angular momentum using vector analysis. Consider a non-relativistic particle of mass m and momentum $\mathbf{p}$ moving in the plane under the influence of an external force $\mathbf{F}$ about a fixed origin, O:

[Figure: a mass m at position $\mathbf{r}$ (polar angle $\phi$) relative to the origin O, with momentum $\mathbf{p}$ and external force $\mathbf{F}$.]

The angular momentum, $\mathbf{L}$, of m about O can be written in vector form

$$\mathbf{L} = \mathbf{r} \times \mathbf{p}. \tag{4.15}$$

The torque $\boldsymbol{\Gamma}$ associated with the external force $\mathbf{F}$ acting about O is

$$\boldsymbol{\Gamma} = \mathbf{r} \times \mathbf{F}. \tag{4.16}$$

The rate of change of the angular momentum with time is

$$d\mathbf{L}/dt = \mathbf{r} \times (d\mathbf{p}/dt) + (d\mathbf{r}/dt) \times \mathbf{p} \tag{4.17}$$

$$= \mathbf{r} \times m(d\mathbf{v}/dt) + \mathbf{v} \times m\mathbf{v} = \mathbf{r} \times \mathbf{F} = \boldsymbol{\Gamma} \qquad (\text{because } \mathbf{v} \times \mathbf{v} = 0).$$

If there is no external torque, $\boldsymbol{\Gamma} = 0$. We have, therefore

$$\boldsymbol{\Gamma} = d\mathbf{L}/dt = 0, \tag{4.18}$$

so that $\mathbf{L}$ is a constant of the motion.

4.3.3 Rotation of n-interacting particles about a fixed point

The analysis given in 4.3.2 can be extended to a system of n-interacting particles. The moments of the mutual interactions about the origin O cancel in pairs (Newton's 3rd Law) so that we are left with the moment of the external forces about O. The equation for the total moment is therefore

$$\Gamma_{1,2,\ldots,n} = \sum_{i=1}^{n}\left(x_i\, d(m_iv_{yi})/dt - y_i\, d(m_iv_{xi})/dt\right). \tag{4.19}$$

If the moment of the external forces about the fixed origin is zero then the total angular momentum of the system about O is a constant. This result follows directly by integrating the expression for $\Gamma_{1,2,\ldots,n} = 0$.

If the origin moves with constant velocity, the angular momentum of the system, relative to the new coordinate system, is constant if the external torque is zero.

4.4 Work and energy in Newtonian dynamics

4.4.1 The principle of work: kinetic energy and the work done by forces

Consider a mass m moving along a path in the [x, y]-plane under the influence of a resultant force $\mathbf{F}$ that is not necessarily constant. Let the components of the force be $F_x$ and $F_y$ when the mass is at the point P[x, y]. We wish to study the motion of m in moving from a point A$[x_A, y_A]$ where the force is $\mathbf{F}_A$ to a point B$[x_B, y_B]$ where the force is $\mathbf{F}_B$. The equations of motion are

$$m(d^{2}x/dt^{2}) = F_x \tag{4.20}$$

and

$$m(d^{2}y/dt^{2}) = F_y. \tag{4.21}$$

Multiplying these equations by dx/dt and dy/dt, respectively, and adding, we obtain

$$m(dx/dt)(d^{2}x/dt^{2}) + m(dy/dt)(d^{2}y/dt^{2}) = F_x(dx/dt) + F_y(dy/dt).$$

This equation now can be integrated with respect to t, so that

$$m\left((dx/dt)^{2} + (dy/dt)^{2}\right)/2 = \int(F_x\,dx + F_y\,dy)$$

or

$$mv^{2}/2 = \int(F_x\,dx + F_y\,dy), \tag{4.22}$$

where $v = ((dx/dt)^{2} + (dy/dt)^{2})^{1/2}$ is the speed of the particle at the point [x, y]. The term $mv^{2}/2$ is called the classical kinetic energy of the mass m. It is important to note that the kinetic energy is a scalar.

If the resultant forces acting on m are $\mathbf{F}_A$ at A$[x_A, y_A]$ at time $t_A$, and $\mathbf{F}_B$ at B$[x_B, y_B]$ at time $t_B$, then we have

$$mv_B^{2}/2 - mv_A^{2}/2 = \int_{x_A}^{x_B}F_x\,dx + \int_{y_A}^{y_B}F_y\,dy. \tag{4.23}$$

The terms on the right-hand side of this equation represent the work done by the resultant forces acting on the particle in moving it from A to B.
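Equation (4.23) can be illustrated numerically by following a particle under a force that varies with position and accumulating $F_x\,dx + F_y\,dy$ along the way. The sketch below is a rough check under stated assumptions: the force law in `force`, the velocity-Verlet stepping scheme, and all numerical values are illustrative choices, not taken from the text.

```python
def force(x, y):
    """Illustrative position-dependent resultant force (an assumption)."""
    return -2.0 * x, 1.0


def run(m=1.5, x=1.0, y=0.0, vx=0.0, vy=0.5, dt=1e-4, steps=20000):
    """Return (change in kinetic energy, work done) between A and B."""
    fx, fy = force(x, y)
    ke_start = 0.5 * m * (vx * vx + vy * vy)
    work = 0.0
    for _ in range(steps):
        # Velocity-Verlet step for the equations of motion (4.20), (4.21).
        x_new = x + vx * dt + 0.5 * (fx / m) * dt * dt
        y_new = y + vy * dt + 0.5 * (fy / m) * dt * dt
        fx_new, fy_new = force(x_new, y_new)
        vx += 0.5 * (fx + fx_new) / m * dt
        vy += 0.5 * (fy + fy_new) / m * dt
        # Trapezoidal estimate of F_x dx + F_y dy over this step.
        work += 0.5 * (fx + fx_new) * (x_new - x)
        work += 0.5 * (fy + fy_new) * (y_new - y)
        x, y, fx, fy = x_new, y_new, fx_new, fy_new
    ke_end = 0.5 * m * (vx * vx + vy * vy)
    return ke_end - ke_start, work


delta_ke, work_done = run()
```

With these settings `delta_ke` and `work_done` agree to within the discretization error of the stepping scheme, which shrinks as `dt` is reduced.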
Equation (4.23) is the mathematical form of the general Principle of Work: the change in the kinetic energy of a system in any interval of time is equal to the work done by the resultant forces acting on the system during that interval.

4.5 Potential energy

4.5.1 General features

Newtonian dynamics involves vector quantities: force, momentum, angular momentum, etc. There is, however, another form of dynamics that involves scalar quantities; a form that originated in the works of Huygens and Leibniz, in the 17th century. The scalar form relies upon the concept of energy, in its broadest sense. We have met the concept of kinetic energy in the previous section. We now meet a more abstract quantity called potential energy.

The work done, W, by a force, $\mathbf{F}$, in moving a mass m from a position $s_A$ to a position $s_B$ along a path s is, from section 4.4,

$$W = \int_{s_A}^{s_B}\mathbf{F}\cdot d\mathbf{s} = \text{the change in the kinetic energy during the motion}$$

$$= \int_{s_A}^{s_B}F\,ds\,\cos\alpha, \quad\text{where } \alpha \text{ is the angle between } \mathbf{F} \text{ and } d\mathbf{s}. \tag{4.24}$$

If the force is constant, we can write

$$W = F(s_B - s_A),$$

where $s_B - s_A$ is the arc length. If the motion is along the x-axis, and $F = F_x$ is constant, then

$$W = F_x(x_B - x_A), \quad\text{the force multiplied by the distance moved.} \tag{4.25}$$

Since W is the change in the kinetic energy, this equation can be rearranged as follows:

$$mv_{xB}^{2}/2 - F_xx_B = mv_{xA}^{2}/2 - F_xx_A. \tag{4.26}$$

This is a surprising result; the kinetic energy of the mass is not conserved during the motion whereas the quantity $(mv_x^{2}/2 - F_xx)$ is conserved during the motion. This means that the change in the kinetic energy is exactly balanced by the change in the quantity $-F_xx$. Since the quantity $mv^{2}/2$ has dimensions of energy, the quantity $F_xx$ must have dimensions of energy if the equation is to be dimensionally correct. The quantity $-F_xx$ is called the potential energy of the mass m, when at the position x, due to the influence of the force $F_x$. We shall denote the potential energy by V.
The negative sign that appears in the definition of the potential energy will be discussed later when explicit reference is made to the nature of the force (for example, gravitational or electromagnetic). The energy equation can therefore be written

$$T_B + V_B = T_A + V_A. \tag{4.27}$$

This is found to be a general result that holds in all cases in which a potential energy function can be found that depends only on the position of the object (or objects).

4.5.2 Conservative forces

Let $F_x$ and $F_y$ be the Cartesian components of the forces acting on a moving particle with coordinates [x, y]. The work done $W_{1\to 2}$ by the forces while the particle moves from the position $P_1[x_1, y_1]$ to another position $P_2[x_2, y_2]$ is

$$W_{1\to 2} = \int_{x_1}^{x_2}F_x\,dx + \int_{y_1}^{y_2}F_y\,dy = \int_{P_1}^{P_2}(F_x\,dx + F_y\,dy). \tag{4.28}$$

If the quantity $F_x\,dx + F_y\,dy$ is a perfect differential then a function $U = f(x, y)$ exists such that

$$F_x = \partial U/\partial x \quad\text{and}\quad F_y = \partial U/\partial y. \tag{4.29}$$

Now, the total differential of the function U is

$$dU = (\partial U/\partial x)\,dx + (\partial U/\partial y)\,dy = F_x\,dx + F_y\,dy. \tag{4.30}$$

In this case, we can write

$$\int dU = \int(F_x\,dx + F_y\,dy) = U = f(x, y).$$

The definite integral evaluated between $P_1[x_1, y_1]$ and $P_2[x_2, y_2]$ is

$$\int_{P_1}^{P_2}(F_x\,dx + F_y\,dy) = f(x_2, y_2) - f(x_1, y_1) = U_2 - U_1. \tag{4.31}$$

We see that in evaluating the work done by the forces during the motion, no mention is made of the actual path taken by the particle. If the forces are such that the function U(x, y) exists, then they are said to be conservative. The function U(x, y) is called the force function.

The above method of analysis can be applied to a system of many particles, n. The total work done by the resultant forces acting on the system in moving the particles from their initial configuration, i, to their final configuration, f, is

$$W_{i\to f} = \sum_{k=1}^{n}\int_{P_{k1}}^{P_{k2}}(F_{kx}\,dx_k + F_{ky}\,dy_k) = U_f - U_i, \tag{4.32}$$

a scalar quantity that is independent of the paths taken by the individual particles.
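The path independence expressed by Eq. (4.31) can be checked for a single particle with a short calculation. The sketch below assumes an arbitrary force function, $U = x^{2}y + 3y$, chosen only for illustration; the two paths and the helper names (`force_function`, `force`, `work_along`) are likewise hypothetical.

```python
def force_function(x, y):
    """U = f(x, y); an illustrative choice, not taken from the text."""
    return x * x * y + 3.0 * y


def force(x, y):
    """F_x = dU/dx and F_y = dU/dy, as in Eq. (4.29)."""
    return 2.0 * x * y, x * x + 3.0


def work_along(path, n=20000):
    """Midpoint-rule estimate of the integral of F_x dx + F_y dy.

    `path` maps a parameter s in [0, 1] to a point (x, y).
    """
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, n + 1):
        x1, y1 = path(i / n)
        fx, fy = force(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += fx * (x1 - x0) + fy * (y1 - y0)
        x0, y0 = x1, y1
    return total


def straight(s):
    """Straight line from P1 = (0, 0) to P2 = (2, 1)."""
    return 2.0 * s, s


def curved(s):
    """Parabolic arc between the same two end-points."""
    return 2.0 * s, s * s


w_straight = work_along(straight)
w_curved = work_along(curved)
delta_u = force_function(2.0, 1.0) - force_function(0.0, 0.0)
```

Both `w_straight` and `w_curved` agree with `delta_u` (here $U_2 - U_1 = 7$) to within the quadrature error. Replacing `force` by components that are not derived from a single function U would in general make the two path integrals differ.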
In Eq. (4.32), $P_{k1}[x_{k1}, y_{k1}]$ and $P_{k2}[x_{k2}, y_{k2}]$ are the initial and final coordinates of the kth particle.

The potential energy, V, of the system moving under the influence of conservative forces is defined in terms of the function U:

$$V \equiv -U.$$

Examples of interactions that take place via conservative forces are:

1) gravitational interactions,
2) electromagnetic interactions, and
3) interactions between particles of a system that, for every pair of particles, act along the line joining their centers, and that depend in some way on their distance apart. These are the so-called central interactions.

Frictional forces are examples of non-conservative forces.

There are two other major methods of solving dynamical problems that differ in fundamental ways from the method of Newtonian dynamics; they are Lagrangian dynamics and Hamiltonian dynamics. We shall delay a discussion of these more general methods until our study of the Calculus of Variations in Chapter 9.

4.6 Particle interactions

4.6.1 Elastic collisions

Studies of the collisions between objects, first made in the 17th century, led to the discovery of two basic laws of Nature: the conservation of linear momentum, and the conservation of kinetic energy associated with a special class of collisions called elastic collisions.

The conservation of linear momentum in an isolated system forms the basis for a quantitative discussion of all problems that involve the interactions between particles. The present discussion will be limited to an analysis of the elastic collision between two particles. A typical two-body collision, in which an object of mass $m_1$ and momentum $\mathbf{p}_1$ makes a grazing collision with another object of mass $m_2$ and momentum $\mathbf{p}_2$ ($p_2 < p_1$), is shown in the following diagram. (The coordinates are chosen so that the vectors $\mathbf{p}_1$ and $\mathbf{p}_2$ have the same directions.)
After the collision, the two objects move in directions characterized by the angles $\theta$ and $\phi$ with momenta $\mathbf{p}_1'$ and $\mathbf{p}_2'$.

[Figure: Before the collision, $m_1$ with momentum $\mathbf{p}_1$ and $m_2$ with momentum $\mathbf{p}_2$ move along the same direction. After the collision, $m_1$ moves with momentum $\mathbf{p}_1'$ at the angle $\theta$ and $m_2$ moves with momentum $\mathbf{p}_2'$ at the angle $\phi$.]

If there are no external forces acting on the particles so that the changes in their states of motion come about as a result of their mutual interactions alone, the total linear momentum of the system is conserved. We therefore have

$$\mathbf{p}_1 + \mathbf{p}_2 = \mathbf{p}_1' + \mathbf{p}_2' \tag{4.33}$$

or, rearranging to give the momentum transfer,

$$\mathbf{p}_1 - \mathbf{p}_1' = \mathbf{p}_2' - \mathbf{p}_2.$$

The kinetic energy of a particle, T, is related to the square of its momentum ($T = p^{2}/2m$); we therefore form the scalar product of the vector equation for the momentum transfer, to obtain

$$p_1^{2} - 2\,\mathbf{p}_1\cdot\mathbf{p}_1' + p_1'^{2} = p_2'^{2} - 2\,\mathbf{p}_2'\cdot\mathbf{p}_2 + p_2^{2}. \tag{4.34}$$

Introducing the scattering angles $\theta$ and $\phi$, we have

$$p_1^{2} - 2p_1p_1'\cos\theta + p_1'^{2} = p_2^{2} - 2p_2p_2'\cos\phi + p_2'^{2}.$$

This equation can be written

$$p_1'^{2}(x^{2} - 2x\cos\theta + 1) = p_2'^{2}(y^{2} - 2y\cos\phi + 1), \tag{4.35}$$

where $x = p_1/p_1'$ and $y = p_2/p_2'$.

If we choose a frame in which $p_2 = 0$ then $y = 0$ and we have

$$x^{2} - 2x\cos\theta + 1 = (p_2'/p_1')^{2}. \tag{4.36}$$

If the collision is elastic, the kinetic energy of the system is conserved, so that

$$T_1 + 0 = T_1' + T_2' \qquad (T_2 = 0 \text{ because } p_2 = 0). \tag{4.37}$$

Substituting $T_i = p_i^{2}/2m_i$ and rearranging gives

$$(p_2'/p_1')^{2} = (m_2/m_1)(x^{2} - 1).$$

We therefore obtain a quadratic equation in x:

$$x^{2} + 2x\left(\frac{m_1}{m_2 - m_1}\right)\cos\theta - \left[\frac{m_2 + m_1}{m_2 - m_1}\right] = 0.$$

The valid solution of this equation is

$$x = (T_1/T_1')^{1/2} = -\left(\frac{m_1}{m_2 - m_1}\right)\cos\theta + \left\{\left(\frac{m_1}{m_2 - m_1}\right)^{2}\cos^{2}\theta + \left[\frac{m_2 + m_1}{m_2 - m_1}\right]\right\}^{1/2}. \tag{4.38}$$

If $m_1 = m_2$, the solution is $x = 1/\cos\theta$, in which case

$$T_1' = T_1\cos^{2}\theta. \tag{4.39}$$

In the frame in which $p_2 = 0$, a geometrical analysis of the two-body collision is useful.
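Equation (4.38) can be tested by building the final state that it implies and checking that kinetic energy is then conserved. The sketch below is a consistency check under stated assumptions: the target is initially at rest, $m_2 \geq m_1$ so that the positive root is the relevant one, and the function names and sample values are hypothetical.

```python
import math


def momentum_ratio(m1, m2, theta):
    """x = p1/p1' from Eq. (4.38); assumes p2 = 0 and m2 >= m1."""
    if m1 == m2:
        return 1.0 / math.cos(theta)      # equal-mass limit
    a = m1 / (m2 - m1)
    b = (m2 + m1) / (m2 - m1)
    return -a * math.cos(theta) + math.sqrt((a * math.cos(theta)) ** 2 + b)


def energy_defect(m1, m2, p1, theta):
    """Kinetic energy after minus kinetic energy before the collision."""
    p1_after = p1 / momentum_ratio(m1, m2, theta)
    # Momentum conservation (4.33) fixes the target's final momentum.
    p2x = p1 - p1_after * math.cos(theta)
    p2y = -p1_after * math.sin(theta)
    t_before = p1 ** 2 / (2.0 * m1)
    t_after = p1_after ** 2 / (2.0 * m1) + (p2x ** 2 + p2y ** 2) / (2.0 * m2)
    return t_after - t_before


defect_unequal = energy_defect(1.0, 4.0, 2.0, 1.1)
defect_equal = energy_defect(1.0, 1.0, 2.0, 0.5)
head_on_ratio = momentum_ratio(1.0, 3.0, math.pi)
```

Both energy defects vanish to rounding error. For a projectile scattered straight back ($\theta = \pi$) the ratio reduces to $(m_1 + m_2)/(m_2 - m_1)$, which equals 2 for the masses used in `head_on_ratio`.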
With $\mathbf{p}_2 = 0$, momentum conservation gives

$$\mathbf{p}_1 + (-\mathbf{p}_1') = \mathbf{p}_2', \tag{4.40}$$

leading to

[Figure: the triangle of momenta formed by $\mathbf{p}_1$, $\mathbf{p}_1'$ and $\mathbf{p}_2'$, with the scattering angles $\theta$ and $\phi$ marked.]

If the masses are equal then $p_1' = p_1\cos\theta$. In this case, the two particles always emerge from the elastic collision at right angles to each other ($\theta + \phi = 90°$).

In the early 1930's, the measured angle between two outgoing high-speed nuclear particles of equal mass was shown to differ from 90°. Such experiments clearly demonstrated the breakdown of Newtonian dynamics in these interactions.

4.6.2 Inelastic collisions

Collisions between everyday objects are never perfectly elastic. An object that has an internal structure can undergo inelastic collisions involving changes in its structure. Inelastic collisions are found to obey two laws; they are

1) the conservation of linear momentum, and

2) an empirical law, due to Newton, that states that the relative velocity of the colliding objects, measured along their line of centers immediately after impact, is $-e$ times their relative velocity before impact.

The quantity e is called the coefficient of restitution. Its value depends on the nature of the materials of the colliding objects. For very hard substances such as steel, e is close to unity, whereas for very soft materials such as putty, e approaches zero.

Consider, in the simplest case, the impact of two deformable spheres with masses $m_1$ and $m_2$. Let their velocities be $v_1$ and $v_2$, and $v_1'$ and $v_2'$ (along their line of centers) before and after impact, respectively.
The linear momentum is conserved, therefore

$$m_1 v_1 + m_2 v_2 = m_1 v_1' + m_2 v_2'$$

and, using Newton's empirical law,

$$v_1' - v_2' = -e(v_1 - v_2). \tag{4.41}$$

Rearranging these equations, we can obtain the values $v_1'$ and $v_2'$ after impact in terms of their values before impact:

$$v_1' = [m_1 v_1 + m_2 v_2 - e m_2 (v_1 - v_2)]/(m_1 + m_2), \tag{4.42}$$

and

$$v_2' = [m_1 v_1 + m_2 v_2 + e m_1 (v_1 - v_2)]/(m_1 + m_2). \tag{4.43}$$

If the two spheres initially move in directions that are not collinear, the above method of analysis is still valid because the momenta can be resolved into components along and perpendicular to a chosen axis. The perpendicular components remain unchanged by the impact.

We shall find that the classical approach to a quantitative study of inelastic collisions must be radically altered when we treat the subject within the framework of Special Relativity. It will be shown that the combined mass $(m_1 + m_2)$ of the colliding objects is not conserved in an inelastic collision.

4.7 The motion of rigid bodies

Newton's Laws of Motion apply to every point-like mass in an object of finite size. The smallest objects of practical size contain very large numbers of microscopic particles: Avogadro's number is about $6 \times 10^{23}$ atoms per gram-atom. The motions of the individual microscopic particles in an extended object can be analyzed in terms of the motion of their equivalent total mass, located at the center of mass of the object.

4.7.1 The center of mass

For a system of discrete masses $m_i$, located at the vector positions $\mathbf{r}_i$, the position $\mathbf{r}_{CM}$ of the center of mass is defined as

$$\mathbf{r}_{CM} \equiv \sum_i m_i \mathbf{r}_i \Big/ \sum_i m_i = \sum_i m_i \mathbf{r}_i / M, \quad \text{where } M \text{ is the total mass.} \tag{4.44}$$

The center of mass (CM) of an (idealized) continuous distribution of mass with a density $\rho$ (mass/volume) can be obtained by considering an element of volume $dV$ with an elemental mass $dm$. We then have

$$dm = \rho\, dV. \tag{4.45}$$

The position of the CM is therefore

$$\mathbf{r}_{CM} = (1/M)\int \mathbf{r}\, dm = (1/M)\int \mathbf{r} \rho\, dV. \tag{4.46}$$

The Cartesian components of $\mathbf{r}_{CM}$ are

$$x_{i\,CM} = (1/M)\int x_i \rho\, dV. \tag{4.47}$$

In non-uniform materials, the density is a function of $\mathbf{r}$.

4.7.2 Kinetic energy of a rigid body in general motion

Consider a rigid body that has both translational and rotational motion in a plane. Let the angular velocity, $\omega$, be constant. At an arbitrary time $t$, we have

[Figure: a rigid body of total mass $M = \sum m$ shown in the fixed frame (origin O, axes $x$, $y$) and in the frame moving with the center of mass G (origin O′, axes $x'$, $y'$); an element of mass $m$ at radius $r'$ and angle $\phi$ has the velocity $v' = r'\omega$ relative to G; $\omega$ = constant.]

Let the coordinates of an element of mass $m$ of the body be $[x, y]$ in the fixed frame (origin O) and $[x', y']$ in the frame moving with the center of mass, G (origin O′), and let $u$ and $v$ be the components of velocity of G in the fixed frame. For constant angular velocity $\omega$, the instantaneous velocity of the element of mass $m$, relative to G, has a direction perpendicular to the radius vector $\mathbf{r}'$ and a magnitude

$$v' = r'\omega. \tag{4.48}$$

The components of the instantaneous velocity of G, relative to the fixed frame, are $u$ in the $x$-direction and $v$ in the $y$-direction. The velocity components of $m$ in the $[x, y]$-frame are therefore

$$u - r'\omega\sin\phi = u - y'\omega \quad \text{in the } x\text{-direction},$$

and

$$v + r'\omega\cos\phi = v + x'\omega \quad \text{in the } y\text{-direction}.$$

The kinetic energy, $E_K$, of the body of mass $M$ is therefore

$$E_K = (1/2)\sum m\{(u - y'\omega)^2 + (v + x'\omega)^2\} \tag{4.49}$$

$$= (1/2)M(u^2 + v^2) + (1/2)\omega^2 \sum m(x'^2 + y'^2) - u\omega \sum m y' + v\omega \sum m x'.$$

Therefore

$$E_K = (1/2)M v_G^2 + (1/2) I_G \omega^2 \tag{4.50}$$

where $v_G = (u^2 + v^2)^{1/2}$ is the speed of G relative to the fixed frame, $\sum m y' = \sum m x' = 0$ by definition of the center of mass, and

$$I_G = \sum m(x'^2 + y'^2) = \sum m r'^2$$

is called the moment of inertia of $M$ about an axis through G, perpendicular to the plane.

We see that the total kinetic energy of the moving object of mass $M$ is made up of two parts: 1) the kinetic energy of translation of the whole mass moving with the velocity of the center of mass, and 2) the kinetic energy of rotation of the whole mass about its center of mass.
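The separation of the kinetic energy into a translational part and a rotational part can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a "rigid body" made of three point masses with arbitrarily chosen positions and an arbitrarily chosen motion, and the function names are hypothetical. It compares the direct sum over the elements, Eq. (4.49), with the two-term form, Eq. (4.50).

```python
import math

def to_cm_frame(masses, points):
    """Shift coordinates so that sum(m x') = sum(m y') = 0 (center-of-mass frame)."""
    M = sum(masses)
    xc = sum(m * x for m, (x, y) in zip(masses, points)) / M
    yc = sum(m * y for m, (x, y) in zip(masses, points)) / M
    return [(x - xc, y - yc) for (x, y) in points]

def kinetic_energy_direct(masses, rel_points, u, v, omega):
    """Eq. (4.49): sum of (1/2) m [(u - y' w)^2 + (v + x' w)^2] over the elements."""
    total = 0.0
    for m, (xp, yp) in zip(masses, rel_points):
        vx = u - yp * omega  # x-component of the element's velocity
        vy = v + xp * omega  # y-component of the element's velocity
        total += 0.5 * m * (vx**2 + vy**2)
    return total

def kinetic_energy_split(masses, rel_points, u, v, omega):
    """Eq. (4.50): (1/2) M v_G^2 + (1/2) I_G w^2."""
    M = sum(masses)
    I_G = sum(m * (xp**2 + yp**2) for m, (xp, yp) in zip(masses, rel_points))
    return 0.5 * M * (u**2 + v**2) + 0.5 * I_G * omega**2

# Illustrative (assumed) data: three point masses and an arbitrary planar motion.
masses = [1.0, 2.0, 3.0]
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
rel = to_cm_frame(masses, points)

direct = kinetic_energy_direct(masses, rel, u=0.5, v=-1.0, omega=2.0)
split = kinetic_energy_split(masses, rel, u=0.5, v=-1.0, omega=2.0)
print(direct, split)
```

The two numbers agree only because the coordinates are measured from the center of mass; if `to_cm_frame` is skipped, the cross terms in $u\omega$ and $v\omega$ no longer vanish and the two results differ.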
4.8 Angular velocity and the instantaneous center of rotation

The angular velocity of a body is defined as the rate of increase of the angle between any line AB, fixed in the body, and any line fixed in the plane of the motion. If $\phi$ is the instantaneous angle between AB and an axis Oy, in the plane, then the angular velocity is $d\phi/dt$.

Consider a circular disc of radius $a$ that rolls without sliding in contact with a line Ox, and let $\phi$ be the instantaneous angle that the fixed line AB in the disc makes with the $y$-axis. At $t = 0$, the rolling begins with the point B touching the origin, O:

[Figure: disc of radius $a$ and center A rolling along Ox; the point B on the rim has coordinates $[x, y]$ and velocity components $v_x$, $v_y$; $\phi$ is the angle between AB and the $y$-axis; P is the point of contact with Ox (B coincides with the point of contact when $\phi = 0$).]

At time $t$ after the rolling begins, the coordinates of B$[x, y]$ are

$$x = OP - a\sin\phi = \overset{\frown}{BP} - a\sin\phi = a\phi - a\sin\phi = a(\phi - \sin\phi),$$

and

$$y = AP - a\cos\phi = a(1 - \cos\phi).$$

The components of the velocity of B are therefore

$$v_x = dx/dt = a(d\phi/dt)(1 - \cos\phi), \tag{4.51}$$

and

$$v_y = dy/dt = a(d\phi/dt)\sin\phi. \tag{4.52}$$

The components of the acceleration of B are

$$a_x = dv_x/dt = (d/dt)\big(a(d\phi/dt)(1 - \cos\phi)\big) \tag{4.53}$$

$$= a(d\phi/dt)^2\sin\phi + a(1 - \cos\phi)(d^2\phi/dt^2),$$

and

$$a_y = dv_y/dt = (d/dt)\big(a(d\phi/dt)\sin\phi\big) \tag{4.54}$$

$$= a(d\phi/dt)^2\cos\phi + a\sin\phi\,(d^2\phi/dt^2).$$

If $\phi = 0$, $dx/dt = 0$ and $dy/dt = 0$, which means that the point P has no instantaneous velocity. The point B is therefore instantaneously rotating about P with a velocity equal to $2a\sin(\phi/2)(d\phi/dt)$; P is a center of rotation. Also, at $\phi = 0$, $d^2x/dt^2 = 0$ and $d^2y/dt^2 = a(d\phi/dt)^2$: the point of contact only has an acceleration towards the center.

4.9 An application of the Newtonian method

The following example illustrates the use of some basic principles of classical dynamics, such as the conservation of linear momentum, the conservation of energy, and instantaneous rotation about a moving point.

Consider a perfectly smooth, straight horizontal rod with a ring of mass $M$ that can slide along the rod.
Attached to the ring is a straight, hinged rod of length $L$ and of negligible mass; it has a mass $m$ at its end. At time $t = 0$, the system is held in a horizontal position in the constant gravitational field of the Earth.

[Figure: at $t = 0$ the ring $M$ is at $x = 0$ and the light rod of length $L$, carrying the mass $m$, is horizontal; $g$ acts downward.]

At $t = 0$, the mass $m$ is released and falls under gravity. At time $t$, we have

[Figure: at time $t$ the ring has moved a distance $x$ from $x = 0$ and has the velocity $v_x$; the rod makes an angle $\phi$ with the horizontal; the mass $m$ has, in addition to $v_x$, the velocity $L(d\phi/dt)$ (the instantaneous velocity of $m$ about $M$), with components $L\sin\phi\,(d\phi/dt)$ and $L\cos\phi\,(d\phi/dt)$.]

There are no external forces acting on the system in the $x$-direction and therefore the horizontal momentum remains zero:

$$M(dx/dt) + m\big((dx/dt) - L\sin\phi\,(d\phi/dt)\big) = 0. \tag{4.56}$$

Integrating, we have

$$Mx + mx + mL\cos\phi = \text{constant}. \tag{4.57}$$

If $x = 0$ and $\phi = 0$ at $t = 0$, then

$$mL = \text{constant}, \tag{4.58}$$

therefore

$$(M + m)x + mL(\cos\phi - 1) = 0,$$

so that

$$x = mL(1 - \cos\phi)/(M + m). \tag{4.59}$$

We see that the instantaneous position $x(t)$ is obtained by integrating the momentum equation.

The equation of conservation of energy can now be used; it is

$$(M/2)v_x^2 + (m/2)\big(v_x - L\sin\phi\,(d\phi/dt)\big)^2 + (m/2)\big(L\cos\phi\,(d\phi/dt)\big)^2 = mgL\sin\phi.$$

(The change in kinetic energy is equal to the change in the potential energy.)

Rearranging gives

$$(M + m)v_x^2 - 2mL\sin\phi\, v_x (d\phi/dt) + \big(mL^2(d\phi/dt)^2 - 2mgL\sin\phi\big) = 0. \tag{4.60}$$

This is a quadratic in $v_x$ with a solution

$$(M + m)v_x = mL\sin\phi\,(d\phi/dt)\left[1 \pm \left\{1 - \frac{(M + m)\big(mL^2(d\phi/dt)^2 - 2mLg\sin\phi\big)}{m^2L^2(d\phi/dt)^2\sin^2\phi}\right\}^{1/2}\right].$$

The left-hand side of this equation is also given by the momentum equation:

$$(M + m)v_x = mL\sin\phi\,(d\phi/dt).$$

The quantity in braces must therefore vanish. We obtain, after substitution and rearrangement,

$$d\phi/dt = \{[2(M + m)g\sin\phi]/[L(M + m\cos^2\phi)]\}^{1/2}, \tag{4.61}$$

the angular velocity of the rod of length $L$ at time $t$.

PROBLEMS

4-1 A straight uniform rod of mass $m$ and length $2\ell$ is held at an angle $\theta_0$ to the vertical. Its lower end rests on a perfectly smooth horizontal surface. The rod is released and falls under gravity.
At time $t$ after the motion begins, we have

[Figure: rod of mass $m$ and length $2\ell$ on a smooth horizontal surface, shown in its initial position at the angle $\theta_0$ to the vertical and at a later time at the angle $\theta$; the weight $mg$ acts at its center; $g$ acts downward.]

If the moment of inertia of the rod about an axis through its center of mass, perpendicular to the plane of the motion, is $m\ell^2/3$, prove that the angular velocity of the rod when it makes an angle $\theta$ with the vertical is

$$d\theta/dt = \{6g(\cos\theta_0 - \cos\theta)/[\ell(1 + 3\sin^2\theta)]\}^{1/2}.$$

4-2 Show that the center of mass of a uniform solid hemisphere of radius $R$ is $3R/8$ above the center of its plane surface.

4-3 Show that the moment of inertia of a uniform solid sphere of radius $R$ and mass $M$ about a diameter is $2MR^2/5$.

4-4 A uniform solid sphere of radius $r$ can roll, under gravity, on the inner surface of a perfectly rough spherical surface of radius $R$. The motion is in a vertical plane. At time $t$ during the motion, we have:

[Figure: rolling sphere of radius $r$ and weight $mg$ on the inside of a spherical surface of radius $R$; $\theta$ is the angular position of the sphere and $\omega$ its angular velocity; $g$ acts downward.]

Show that

$$d^2\theta/dt^2 + [5g/(7(R - r))]\sin\theta = 0.$$

As a preliminary result, show that $r\omega = (R - r)(d\theta/dt)$ for rolling motion without slipping.

4-5 A particle of mass $m$ hangs on an inextensible string of length $\ell$ and negligible mass. The string is attached to a fixed point O. The mass oscillates in a vertical plane under gravity. At time $t$, we have

[Figure: simple pendulum of length $\ell$ suspended from O, at the angle $\theta$ to the vertical, with angular velocity $\omega = d\theta/dt$, string tension $T$ and weight $mg$.]

Show that

1) $d^2\theta/dt^2 + (g/\ell)\sin\theta = 0$.

2) $\omega^2 = (2g/\ell)[\cos\theta - \cos\theta_0]$, where $\theta_0$ is the initial angle of the string with respect to the vertical, so that $\omega = 0$ when $\theta = \theta_0$. This equation gives the angular velocity in any position.

4-6 Let $\ell_0$ be the natural length of an elastic string fixed at the point O. The string has a negligible mass. Let a mass $m$ be attached to the string, and let it stretch the string until the equilibrium position is reached. The tension in the string is given by Hooke's law:

Tension, $T = \lambda\,(\text{extension})/(\text{original length})$, where $\lambda$ is a constant for a given material.
The mass is displaced vertically from its equilibrium position, and oscillates under gravity. We have

[Figure: string fixed at O with natural length $\ell_0$; in the equilibrium position, at $y_E$, the tension $T_E$ balances the weight $mg$; in the general position, at $y(t)$, the tension is $T$ and the weight is $mg$; $g$ acts downward.]

Show that the mass oscillates about the equilibrium position with simple harmonic motion, and that

$$y(t) = \ell_0 + (mg\ell_0/\lambda)\{1 - \cos[t\sqrt{\lambda/(m\ell_0)}]\}$$

(the motion starts with zero velocity at $y(0) = \ell_0$).

4-7 A dynamical system is in stable equilibrium if the system tends to return to its original state when slightly displaced. A system is in a position of equilibrium when the height of its center of gravity is a maximum or a minimum. Consider a rod of mass $m$ with one end resting on a perfectly smooth vertical wall OA and the other end on a perfectly smooth inclined plane, OB. Show that, in the position of equilibrium,

$$\cot\theta = 2\tan\phi$$

where the angles are given in the diagram:

[Figure: rod resting against the vertical wall OA and the inclined plane OB; $\theta$ is the angle of the rod, $y(\theta)$ is the height of its center of gravity, and $\phi$ is the fixed angle of the plane OB; $g$ acts downward.]

Find $y = f(\theta)$, and show, by considering derivatives, that this is a state of unstable equilibrium.

4-8 A particle A of mass $m_A = 1$ unit scatters elastically from a stationary particle B of mass $m_B = 2$ units. If A scatters through an angle $\theta$, show that the ratio of the kinetic energies of A, before ($T_A$) and after ($T_A'$) scattering, is

$$(T_A/T_A') = 9/\big(\cos\theta + \sqrt{3 + \cos^2\theta}\big)^2.$$

Sketch the form of the variation of this ratio with angle in the range $0 \le \theta \le \pi$. (This problem is met in practice in low-energy neutron-deuteron scattering.)
5 INVARIANCE PRINCIPLES AND CONSERVATION LAWS

5.1 Invariance of the potential under translations and the conservation of linear momentum

The equation of motion of a Newtonian particle of mass $m$ moving along the $x$-axis under the influence of a force $F_x$ is

$$m\,d^2x/dt^2 = F_x. \tag{5.1}$$

If $F_x$ can be represented by a potential $V(x)$ then

$$m\,d^2x/dt^2 = -dV(x)/dx. \tag{5.2}$$

In the special case in which the potential is not a function of $x$, the equation of motion becomes

$$m\,d^2x/dt^2 = 0, \quad \text{or} \quad m\,d(v_x)/dt = 0. \tag{5.3}$$

Integrating this equation gives

$$mv_x = \text{constant}. \tag{5.4}$$

We see that the linear momentum of the particle is constant if the potential is independent of the position of the particle.

5.2 Invariance of the potential under rotations and the conservation of angular momentum

Let a Newtonian particle of mass $m$ move in the plane about a fixed origin, O, under the influence of a force $\mathbf{F}$. The equations of motion, in the $x$- and $y$-directions, are

$$m\,d^2x/dt^2 = F_x \quad \text{and} \quad m\,d^2y/dt^2 = F_y. \tag{5.5 a,b}$$

If the force can be represented by a potential $V(x, y)$ then we can write

$$m\,d^2x/dt^2 = -\partial V/\partial x \quad \text{and} \quad m\,d^2y/dt^2 = -\partial V/\partial y. \tag{5.6 a,b}$$

The total differential of the potential is

$$dV = (\partial V/\partial x)\,dx + (\partial V/\partial y)\,dy.$$

Let a transformation from Cartesian to polar coordinates be made using the standard equations

$$x = r\cos\phi \quad \text{and} \quad y = r\sin\phi.$$

The partial derivatives are

$$\partial x/\partial\phi = -r\sin\phi = -y, \quad \partial x/\partial r = \cos\phi, \quad \partial y/\partial\phi = r\cos\phi = x, \quad \text{and} \quad \partial y/\partial r = \sin\phi.$$

We therefore have

$$\partial V/\partial\phi = (\partial V/\partial x)(\partial x/\partial\phi) + (\partial V/\partial y)(\partial y/\partial\phi) \tag{5.7}$$

$$= (\partial V/\partial x)(-y) + (\partial V/\partial y)(x)$$

$$= yF_x + x(-F_y)$$

$$= m(ya_x - xa_y) \quad (a_x \text{ and } a_y \text{ are the components of acceleration})$$

$$= m(d/dt)(yv_x - xv_y) \quad (v_x \text{ and } v_y \text{ are the components of velocity}).$$

If the potential is independent of the angle $\phi$ then

$$\partial V/\partial\phi = 0, \tag{5.8}$$

in which case

$$m(d/dt)(yv_x - xv_y) = 0$$

and therefore

$$m(yv_x - xv_y) = \text{a constant}. \tag{5.9}$$

The quantity on the left-hand side of this equation is the angular momentum $(yp_x - xp_y)$ of the mass about the fixed origin (with this sign convention it is the negative of the usual $z$-component, $xp_y - yp_x$). We therefore see that if the potential is invariant under rotations about the origin (independent of the angle $\phi$), the angular momentum of the mass about the origin is conserved.

In Chapter 9 we shall treat the subject of invariance principles and conservation laws in a more general way, using arguments that involve the Lagrangians and Hamiltonians of dynamical systems.

6 EINSTEINIAN DYNAMICS

6.1 4-momentum and the energy-momentum invariant

In Classical Mechanics, the concept of momentum is important because of its role as an invariant in an isolated system. We therefore introduce the concept of 4-momentum in Relativistic Mechanics in order to find possible Lorentz invariants involving this new quantity. The contravariant 4-momentum is defined as

$$P^\mu = mV^\mu \tag{6.1}$$

where $m$ is the mass of the particle. (It is a Lorentz scalar: the mass measured in the rest frame of the particle.)

The scalar product is

$$P^\mu P_\mu = (mc)^2. \tag{6.2}$$

Now,

$$P^\mu = [m\gamma c, m\gamma\mathbf{v}_N] \tag{6.3}$$

therefore,

$$P^\mu P_\mu = (m\gamma c)^2 - (m\gamma\mathbf{v}_N)^2.$$

Writing $M = \gamma m$, the relativistic mass, we obtain

$$P^\mu P_\mu = (Mc)^2 - (M\mathbf{v}_N)^2 = (mc)^2. \tag{6.4}$$

Multiplying throughout by $c^2$ gives

$$M^2c^4 - M^2v_N^2c^2 = m^2c^4. \tag{6.5}$$

The quantity $Mc^2$ has dimensions of energy; we therefore write

$$E = Mc^2, \tag{6.6}$$

the total energy of a freely moving particle.

This leads to the fundamental invariant of dynamics

$$c^2P^\mu P_\mu = E^2 - (\mathbf{p}c)^2 = (E^0)^2 \tag{6.7}$$

where $E^0 = mc^2$ is the rest energy of the particle, and $\mathbf{p}$ is its relativistic 3-momentum.

The total energy can be written:

$$E = \gamma E^0 = E^0 + T, \tag{6.8}$$

where

$$T = E^0(\gamma - 1), \tag{6.9}$$

the relativistic kinetic energy.

The magnitude of the 4-momentum is a Lorentz invariant:

$$|P^\mu| = mc. \tag{6.10}$$

The 4-momentum transforms as follows:

$$P'^\mu = \mathbf{L}P^\mu. \tag{6.11}$$

6.2 The relativistic Doppler shift

For relative motion along the $x$-axis, the equation $P'^\mu = \mathbf{L}P^\mu$ is equivalent to the equations

$$E' = \gamma E - \beta\gamma cp_x \tag{6.12}$$

and

$$cp_x' = -\beta\gamma E + \gamma cp_x. \tag{6.13}$$

Using the Planck-Einstein equations $E = h\nu$ and $E = p_xc$ for photons, the energy equation becomes

$$\nu' = \gamma\nu - \beta\gamma\nu = \gamma\nu(1 - \beta) = \nu(1 - \beta)/(1 - \beta^2)^{1/2} = \nu\{(1 - \beta)/(1 + \beta)\}^{1/2}. \tag{6.14}$$

This is the relativistic Doppler shift for photons: the frequency $\nu'$ measured in one inertial frame (primed) in terms of the frequency $\nu$ measured in another inertial frame.

6.3 Relativistic collisions and the conservation of 4-momentum

Consider the interaction between two particles, 1 and 2, to form two particles, 3 and 4. (3 and 4 are not necessarily the same as 1 and 2.) The contravariant 4-momenta are $P_i^\mu$:

[Figure: before the interaction, particles 1 and 2 with 4-momenta $P_1^\mu$ and $P_2^\mu$; after the interaction, particles 3 and 4 with 4-momenta $P_3^\mu$ and $P_4^\mu$, emerging at the angles $\theta$ and $\phi$; $1 + 2 \to 3 + 4$.]

All experiments are consistent with the fact that the 4-momentum of the system is conserved.
We have, for the contravariant 4-momentum vectors of the interacting particles,

$$\underbrace{P_1^\mu + P_2^\mu}_{\text{initial free state}} = \underbrace{P_3^\mu + P_4^\mu}_{\text{final free state}} \tag{6.15}$$

and a similar equation for the covariant 4-momentum vectors,

$$P_{1\mu} + P_{2\mu} = P_{3\mu} + P_{4\mu}. \tag{6.16}$$

If we are interested in the change $P_1^\mu \to P_3^\mu$, then we require

$$P_1^\mu - P_3^\mu = P_4^\mu - P_2^\mu \tag{6.17}$$

and

$$P_{1\mu} - P_{3\mu} = P_{4\mu} - P_{2\mu}. \tag{6.18}$$

Forming the invariant scalar products, and using $P_i^\mu P_{i\mu} = (E_i^0/c)^2$, we obtain

$$(E_1^0/c)^2 - 2(E_1E_3/c^2 - \mathbf{p}_1\cdot\mathbf{p}_3) + (E_3^0/c)^2 = (E_4^0/c)^2 - 2(E_2E_4/c^2 - \mathbf{p}_2\cdot\mathbf{p}_4) + (E_2^0/c)^2. \tag{6.19}$$

Introducing the scattering angles, $\theta$ and $\phi$, this equation becomes

$$(E_1^0)^2 - 2(E_1E_3 - c^2p_1p_3\cos\theta) + (E_3^0)^2 = (E_2^0)^2 - 2(E_2E_4 - c^2p_2p_4\cos\phi) + (E_4^0)^2.$$

If we choose a reference frame in which particle 2 is at rest (the LAB frame), then $\mathbf{p}_2 = 0$ and $E_2 = E_2^0$, so that

$$(E_1^0)^2 - 2(E_1E_3 - c^2p_1p_3\cos\theta) + (E_3^0)^2 = (E_2^0)^2 - 2E_2^0E_4 + (E_4^0)^2. \tag{6.20}$$

The total energy of the system is conserved, therefore

$$E_1 + E_2 = E_3 + E_4 = E_1 + E_2^0 \tag{6.21}$$

or

$$E_4 = E_1 + E_2^0 - E_3.$$

Eliminating $E_4$ from the above scalar product equation gives

$$(E_1^0)^2 - 2(E_1E_3 - c^2p_1p_3\cos\theta) + (E_3^0)^2 = (E_4^0)^2 - (E_2^0)^2 - 2E_2^0(E_1 - E_3). \tag{6.22}$$

This is the basic equation for all interactions in which two relativistic entities in the initial state interact to give two relativistic entities in the final state. It applies equally well to interactions that involve massive and massless entities.

6.3.1 The Compton effect

The general method discussed in the previous section can be used to provide an exact analysis of Compton's famous experiment, in which the scattering of a photon by a stationary, free electron was studied.
In this example, we have

$E_1 = E_{ph}$ (the incident photon energy), $E_2 = E_e^0$ (the rest energy of the stationary electron, the target), $E_3 = E_{ph}'$ (the energy of the scattered photon), and $E_4 = E_e$ (the energy of the recoiling electron). The rest energy of the photon is zero:

[Figure: an incident photon of energy $E_{ph} = p_{ph}c$ strikes an electron at rest (rest energy $E_e^0$); the photon scatters through the angle $\theta$ with energy $E_{ph}'$, and the electron recoils with energy $E_e$.]

The general equation, written in the form (6.20) with $E_4$ eliminated, is now

$$0 - 2(E_{ph}E_{ph}' - E_{ph}E_{ph}'\cos\theta) = (E_e^0)^2 - 2E_e^0(E_{ph} + E_e^0 - E_{ph}') + (E_e^0)^2 \tag{6.23}$$

or

$$E_{ph}E_{ph}'(1 - \cos\theta) = E_e^0(E_{ph} - E_{ph}')$$

so that

$$E_{ph} - E_{ph}' = E_{ph}E_{ph}'(1 - \cos\theta)/E_e^0. \tag{6.24}$$

Compton measured the energy loss of the photon on scattering and its $\cos\theta$-dependence.

6.4 Relativistic inelastic collisions

We shall consider an inelastic collision between a particle 1 and a particle 2 (initially at rest) to form a composite particle 3. In such a collision, the 4-momentum is conserved (as it is in an elastic collision); however, the kinetic energy is not conserved. Part of the kinetic energy of particle 1 is transformed into excitation energy of the composite particle 3. This excitation energy can take many forms: heat energy, rotational energy, and the excitation of quantum states at the microscopic level.
The inelastic collision is as shown:

                    Before                          After
Particle:           1              2                3
Rest energy:        $E_1^0$        $E_2^0$          $E_3^0$
Total energy:       $E_1$          $E_2 = E_2^0$    $E_3$
3-momentum:         $\mathbf{p}_1$ $\mathbf{p}_2 = 0$ $\mathbf{p}_3$
Kinetic energy:     $T_1$          $T_2 = 0$        $T_3$

In this problem, we shall use the energy-momentum invariants associated with each particle, directly:

i) $E_1^2 - (\mathbf{p}_1c)^2 = (E_1^0)^2$ (6.25)

ii) $E_2^2 = (E_2^0)^2$ (6.26)

iii) $E_3^2 - (\mathbf{p}_3c)^2 = (E_3^0)^2$ (6.27)

The total energy is conserved, therefore

$$E_1 + E_2 = E_3 = E_1 + E_2^0. \tag{6.28}$$

Introducing the kinetic energies of the particles, we have

$$(T_1 + E_1^0) + E_2^0 = T_3 + E_3^0. \tag{6.29}$$

The 3-momentum is conserved, therefore

$$\mathbf{p}_1 + 0 = \mathbf{p}_3. \tag{6.30}$$

Using

$$(E_3^0)^2 = E_3^2 - (\mathbf{p}_3c)^2, \tag{6.31}$$

we obtain

$$(E_3^0)^2 = (E_1 + E_2^0)^2 - (\mathbf{p}_3c)^2$$

$$= E_1^2 + 2E_1E_2^0 + (E_2^0)^2 - (\mathbf{p}_1c)^2$$

$$= (E_1^0)^2 + 2E_1E_2^0 + (E_2^0)^2$$

$$= (E_1^0)^2 + (E_2^0)^2 + 2(T_1 + E_1^0)E_2^0 \tag{6.32}$$

or

$$(E_3^0)^2 = (E_1^0 + E_2^0)^2 + 2T_1E_2^0 \quad (E_3^0 > E_1^0 + E_2^0). \tag{6.33}$$

Using $T_1 = \gamma_1E_1^0 - E_1^0$, where $\gamma_1 = (1 - \beta_1^2)^{-1/2}$ and $\beta_1 = v_1/c$, we have

$$(E_3^0)^2 = (E_1^0)^2 + (E_2^0)^2 + 2\gamma_1E_1^0E_2^0. \tag{6.34}$$

If two identical particles make a completely inelastic collision then

$$(E_3^0)^2 = 2(\gamma_1 + 1)(E_1^0)^2. \tag{6.35}$$

6.5 The Mandelstam variables

In discussions of relativistic interactions it is often useful to introduce additional Lorentz invariants that are known as Mandelstam variables.
They are, for the special case of two particles in the initial and final states ($1 + 2 \to 3 + 4$):

$s = (P_1^\mu + P_2^\mu)[P_{1\mu} + P_{2\mu}]$, the total 4-momentum invariant,

$$= \big((E_1 + E_2)/c, (\mathbf{p}_1 + \mathbf{p}_2)\big)\big[(E_1 + E_2)/c, -(\mathbf{p}_1 + \mathbf{p}_2)\big]$$

$$= (E_1 + E_2)^2/c^2 - (\mathbf{p}_1 + \mathbf{p}_2)^2 \quad \text{(Lorentz invariant)}, \tag{6.36}$$

$t = (P_1^\mu - P_3^\mu)[P_{1\mu} - P_{3\mu}]$, the 4-momentum transfer ($1 \to 3$) invariant,

$$= (E_1 - E_3)^2/c^2 - (\mathbf{p}_1 - \mathbf{p}_3)^2 \quad \text{(Lorentz invariant)}, \tag{6.37}$$

and

$u = (P_1^\mu - P_4^\mu)[P_{1\mu} - P_{4\mu}]$, the 4-momentum transfer ($1 \to 4$) invariant,

$$= (E_1 - E_4)^2/c^2 - (\mathbf{p}_1 - \mathbf{p}_4)^2 \quad \text{(Lorentz invariant)}. \tag{6.38}$$

Now,

$$sc^2 = E_1^2 + 2E_1E_2 + E_2^2 - (p_1^2 + 2\mathbf{p}_1\cdot\mathbf{p}_2 + p_2^2)c^2$$

$$= (E_1^0)^2 + (E_2^0)^2 + 2E_1E_2 - 2\mathbf{p}_1\cdot\mathbf{p}_2c^2$$

$$= (E_1^0)^2 + (E_2^0)^2 + 2\underbrace{(E_1, \mathbf{p}_1c)[E_2, -\mathbf{p}_2c]}_{\text{Lorentz invariant}}. \tag{6.39}$$

The Mandelstam variable $sc^2$ has the same value in all inertial frames. We therefore evaluate it in the LAB frame, defined by the vectors

$$[E_1^L, \mathbf{p}_1^Lc] \quad \text{and} \quad [E_2^L = E_2^0, -\mathbf{p}_2^Lc = 0], \tag{6.40}$$

so that

$$2(E_1^LE_2^L - \mathbf{p}_1^L\cdot\mathbf{p}_2^Lc^2) = 2E_1^LE_2^0 \tag{6.41}$$

and

$$sc^2 = (E_1^0)^2 + (E_2^0)^2 + 2E_1^LE_2^0. \tag{6.42}$$

We can evaluate $sc^2$ in the center-of-mass (CM) frame, defined by the condition

$$\mathbf{p}_1^{CM} + \mathbf{p}_2^{CM} = 0 \quad \text{(the total 3-momentum is zero)}:$$

$$sc^2 = (E_1^{CM} + E_2^{CM})^2. \tag{6.43}$$

This is the square of the total CM energy of the system.

6.5.1 The total CM energy and the production of new particles

The quantity $c\sqrt{s}$ is the energy available for the production of new particles, or for exciting the internal structure of particles. We can now obtain the relation between the total CM energy and the LAB energy of the incident particle (1) and the target (2), as follows:

$$sc^2 = (E_1^0)^2 + (E_2^0)^2 + 2E_1^LE_2^0 = (E_1^{CM} + E_2^{CM})^2 = W^2, \text{ say}. \tag{6.44}$$

Here, we have evaluated the left-hand side in the LAB frame, and the right-hand side in the CM frame!
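Equation (6.44) is easy to evaluate numerically. The sketch below is an illustration only, not part of the original text: the function names are hypothetical, and the example assumes a proton striking a proton at rest, using the proton rest energy of 938.2723 MeV quoted in Problem 6-5. It also checks the invariance argument by computing $(E_1^L + E_2^0)^2 - (p_1^Lc)^2$ directly from the LAB-frame energy and momentum.

```python
import math

PROTON_REST_MEV = 938.2723  # proton rest energy, as quoted in Problem 6-5

def cm_energy(e1_lab, e1_rest, e2_rest):
    """Total CM energy W from Eq. (6.44): W^2 = E1_0^2 + E2_0^2 + 2 E1_L E2_0."""
    return math.sqrt(e1_rest**2 + e2_rest**2 + 2.0 * e1_lab * e2_rest)

def cm_energy_from_invariant(e1_lab, e1_rest, e2_rest):
    """Same quantity from (total energy)^2 - (total momentum * c)^2 in the LAB frame."""
    p1c_squared = e1_lab**2 - e1_rest**2      # (p1 c)^2 from the invariant (6.7)
    total_energy = e1_lab + e2_rest           # the target is at rest
    return math.sqrt(total_energy**2 - p1c_squared)

# Assumed example: incident proton with total LAB energy of ten rest energies.
e1_lab = 10.0 * PROTON_REST_MEV
w_formula = cm_energy(e1_lab, PROTON_REST_MEV, PROTON_REST_MEV)
w_invariant = cm_energy_from_invariant(e1_lab, PROTON_REST_MEV, PROTON_REST_MEV)

print(f"LAB energy of projectile: {e1_lab:.1f} MeV")
print(f"Total CM energy W:        {w_formula:.1f} MeV")
print(f"Energy above rest masses: {w_formula - 2.0 * PROTON_REST_MEV:.1f} MeV")
```

For equal rest energies the formula reduces to $W^2 = 2E^0(E^0 + E_1^L)$, so in this example $W = \sqrt{22}\,E^0$, far less than the LAB energy supplied.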
At very high energies, $c\sqrt{s} \gg E_1^0$ and $E_2^0$, the rest energies of the particles in the initial state, in which case

$$W^2 = sc^2 \approx 2E_1^LE_2^0. \tag{6.45}$$

The total CM energy, $W$, available for the production of new particles therefore depends on the square root of the incident laboratory energy. This result led to the development of colliding, or intersecting, beams of particles (such as protons and antiprotons) in order to produce sufficient energy to generate particles with rest masses typically 100 times the rest mass of the proton (~ $10^9$ eV).

6.6 Positron-electron annihilation-in-flight

A discussion of the annihilation-in-flight of a relativistic positron and a stationary electron provides a topical example of the use of relativistic conservation laws. This process, in which two photons are spontaneously generated, has been used as a source of nearly monoenergetic high-energy photons for the study of nuclear photo-disintegration since 1960. The general result for a $1 + 2 \to 3 + 4$ interaction, given in section 6.3, provides the basis for an exact calculation of this process; we have

$E_1 = E_{pos}$ (the incident positron energy), $E_2 = E_e^0$ (the rest energy of the stationary electron), $E_3 = E_{ph1}$ (the energy of the forward-going photon), and $E_4 = E_{ph2}$ (the energy of the backward-going photon). The rest energies of the positron and the electron are equal. The general equation (6.22) now reads

$$(E_e^0)^2 - 2\{E_{pos}E_{ph1} - cp_{pos}E_{ph1}\cos\theta\} + 0 = 0 - (E_e^0)^2 - 2E_e^0(E_{pos} - E_{ph1}) \tag{6.46}$$

therefore

$$E_{ph1}\{E_{pos} + E_e^0 - [E_{pos}^2 - (E_e^0)^2]^{1/2}\cos\theta\} = (E_{pos} + E_e^0)E_e^0$$

giving

$$E_{ph1} = E_e^0/(1 - k\cos\theta) \tag{6.47}$$

where

$$k = [(E_{pos} - E_e^0)/(E_{pos} + E_e^0)]^{1/2}.$$

The maximum energy of the photon, $E_{ph1}^{max}$, occurs when $\theta = 0$, corresponding to motion in the forward direction; its energy is

$$E_{ph1}^{max} = E_e^0/(1 - k). \tag{6.48}$$

If, for example, the incident total positron energy is 30 MeV, and $E_e^0 = 0.511$ MeV, then

$$E_{ph1}^{max} = 0.511/[1 - (29.489/30.511)^{1/2}]\ \text{MeV} = 30.25\ \text{MeV}.$$

The forward-going photon has an energy equal to the kinetic energy of the incident positron ($T_1 = 30 - 0.511$ MeV) plus about three-quarters of the total rest energy of the positron-electron pair ($2E_e^0 = 1.02$ MeV). Using the conservation of the total energy of the system, we see that the energy of the backward-going photon is about 0.25 MeV.

The method of positron-electron annihilation-in-flight provides one of the very few ways of generating nearly monoenergetic photons at high energies.

PROBLEMS

6-1 A particle of rest energy $E^0$ has a relativistic 3-momentum $\mathbf{p}$ and a relativistic kinetic energy $T$. Show that

1) $|\mathbf{p}| = (1/c)(2TE^0)^{1/2}\{1 + (T/2E^0)\}^{1/2}$, and

2) $|\mathbf{v}| = c\{1 + [(E^0)^2/T(T + 2E^0)]\}^{-1/2}$, where $\mathbf{v}$ is the 3-velocity.

6-2 Two similar relativistic particles, A and B, each with rest energy $E^0$, move towards each other in a straight line. The constant speed of each particle measured in the LAB frame is $V = \beta c$. Show that their total energy, measured in the rest frame of A, is

$$E^0(1 + \beta^2)/(1 - \beta^2).$$

6-3 An atom of rest energy $E_A^0$ is initially at rest. It completely absorbs a photon of energy $E_{ph}$, and the excited atom, of rest energy $E_A^{0*}$, recoils freely. If the excitation energy of the atom is given by

$$E_{ex} = E_A^{0*} - E_A^0,$$

show that

$$E_{ex} = -E_A^0 + E_A^0\{1 + (2E_{ph}/E_A^0)\}^{1/2}$$

exactly. If, as is often the case, $E_{ph} \ll E_A^0$, show that the recoil energy of the atom is

$$E_{recoil} \approx E_{ph}^2/2E_A^0.$$

Explain how this approximation can be deduced using a Newtonian-like analysis.

6-4 A completely inelastic collision occurs between particle 1 and particle 2 (initially at rest) to form a composite particle, 3. Show that the speed of 3 is

$$v_3 = v_1/\{1 + (E_2^0/E_1)\},$$

where $v_1$ and $E_1$ are the speed and the total energy of 1, and $E_2^0$ is the rest energy of 2.
6-5 Show that the minimum energy that a $\gamma$-ray must have to just break up a deuteron into a neutron and a proton is $\gamma_{min} \approx 2.23$ MeV, given

$E_{neut}^0 = 939.5656$ MeV, $E_{prot}^0 = 938.2723$ MeV, and $E_{deut}^0 = 1875.6134$ MeV.

6-6 In a general relativistic collision:

$$1 + 2 \to n \text{ particles: } (3 + 4 + \dots + m) + (m{+}1 + m{+}2 + \dots + n)$$

where the particles 3 to $m$ are observed, and the particles $m+1$ to $n$ are unobserved. We have

$$E_1 + E_2 = (E_3 + E_4 + \dots + E_m) + (E_{m+1} + E_{m+2} + \dots + E_n), \text{ the total energy},$$

$$= E_{obs} + E_{unobs}$$

and

$$\mathbf{p}_1 + \mathbf{p}_2 = \mathbf{p}_{obs} + \mathbf{p}_{unobs}.$$

If $W_{unobs}/c^2$ is the unobserved (missing) mass of the particles $m+1$ to $n$, show that, in the LAB frame,

$$(W_{unobs})^2 = \Big(E_1^L + E_2^0 - \sum_{i=3}^{m} E_i^L\Big)^2 - \Big(\mathbf{p}_1^Lc - c\sum_{i=3}^{m} \mathbf{p}_i^L\Big)^2.$$

This is the missing (energy)$^2$ in terms of the observed quantities. This is the principle behind the so-called missing-mass spectrometers used in Nuclear and Particle Physics.

6-7 If the contravariant 4-force is defined as $F^\mu = dP^\mu/d\tau = [f^0, \mathbf{f}]$, where $\tau$ is the proper time and $P^\mu$ is the contravariant 4-momentum, show that

$$F^\mu V_\mu = 0,$$

where $V_\mu$ is the covariant 4-velocity. (The 4-force and the 4-velocity are orthogonal.) Obtain $dE/dt$ in terms of $\gamma$, $\mathbf{v}$ and $\mathbf{f}$.

7 NEWTONIAN GRAVITATION

We come now to one of the highlights in the history of intellectual endeavor, namely Newton's Theory of Gravitation. This spectacular work ranks with a handful of masterpieces in Natural Philosophy: the Galileo-Newton Theory of Motion, the Carnot-Clausius-Kelvin Theory of Heat and Thermodynamics, Maxwell's Theory of Electromagnetism, the Maxwell-Boltzmann-Gibbs Theory of Statistical Mechanics, Einstein's Theories of Special and General Relativity, Planck's Quantum Theory of Radiation, and the Bohr-de Broglie-Schrödinger-Heisenberg Quantum Theory of Matter.

Newton's most significant ideas on Gravitation were developed in his early twenties at a time when the University of Cambridge closed down because of the Great Plague.
He returned to his home, a farm at Woolsthorpe-by-Colsterworth, in Lincolnshire. It is a part of England dominated by vast, changing skies; a region buffeted by the winds from the North Sea. The thoughts of the young Newton naturally turned skyward; there was little on the ground to stir his imagination except, perhaps, the proverbial apple tree and the falling apple.

Newton's work set us on a new course. Before discussing the details of the theory, it will be useful to give an overview using the simplest model, consistent with logical accuracy. In this way, we can appreciate Newton's radical ideas, and his development of the now standard Scientific Method in which a crucial interplay exists between the results of observations and mathematical models that best account for the observations. The great theories are often based upon relatively small numbers of observations. The uncovering of the Laws of Nature requires deep and imaginative thoughts that go far beyond the demonstration of mathematical prowess.

Newton's development of Differential Calculus in the late 1660s was strongly influenced by his attempts to understand, analytically, the empirical ideas concerning motion that had been put forward by Galileo. In particular, he investigated the analytical properties of motion in curved paths. These properties are required in his Theory of Gravitation. We shall consider motion in 2 dimensions.

7.1 Properties of motion along curved paths in the plane

The velocity of a point in the plane is a vector, drawn at the point, such that its component in any direction is given by the rate of change of the displacement in that direction.

Consider the following diagram:

[Figure: curved path AB in the $x$-$y$ plane; the moving point is at P$[x, y]$ at time $t$ and at Q$[x + \Delta x, y + \Delta y]$ at time $t + \Delta t$; the chord PQ has the components $\Delta x$ and $\Delta y$.]

P and Q are the positions of a point moving along the curved path AB. The coordinates are P$[x, y]$ at time $t$ and Q$[x + \Delta x, y + \Delta y]$ at time $t + \Delta t$.
The components of the velocity of the point are

$$\lim_{\Delta t \to 0} \Delta x/\Delta t = dx/dt = v_x$$

and

$$\lim_{\Delta t \to 0} \Delta y/\Delta t = dy/dt = v_y.$$

$\Delta x$ and $\Delta y$ are the components of the vector $\mathbf{PQ}$. The velocity is therefore

$$\lim_{\Delta t \to 0} \text{chord } PQ/\Delta t.$$

We have

$$\lim_{Q \to P} \text{chord } PQ/\Delta s = 1,$$

where $s$ is the length of the curve AP and $\Delta s$ is the length of the arc PQ. The velocity can be written

$$\lim_{\Delta t \to 0} (\text{chord } PQ/\Delta s)(\Delta s/\Delta t) = ds/dt. \tag{7.1}$$

The direction of the instantaneous velocity at P is along the tangent to the path at P.

The $x$- and $y$-components of the acceleration of P are

$$\lim_{\Delta t \to 0} \Delta v_x/\Delta t = dv_x/dt = d^2x/dt^2$$

and

$$\lim_{\Delta t \to 0} \Delta v_y/\Delta t = dv_y/dt = d^2y/dt^2.$$

The resultant acceleration is not directed along the tangent at P. Consider the motion of P along the curve APQB:

[Figure: curved path APQB in the $x$-$y$ plane; the velocity is $\mathbf{v}$ at P and $\mathbf{v} + \Delta\mathbf{v}$ at Q, and the tangent turns through the angle $\Delta\psi$ between P and Q.]

The change $\Delta\mathbf{v}$ in the vector $\mathbf{v}$ is shown in the diagram:

[Figure: vector triangle formed by $\mathbf{v}$, $\mathbf{v} + \Delta\mathbf{v}$ and $\Delta\mathbf{v}$, with the angle $\Delta\psi$ between $\mathbf{v}$ and $\mathbf{v} + \Delta\mathbf{v}$.]

The vector $\Delta\mathbf{v}$ can be written in terms of two components, $a$, perpendicular to the direction of $\mathbf{v}$, and $b$, along the direction of $\mathbf{v} + \Delta\mathbf{v}$.

The acceleration is

$$\lim_{\Delta t \to 0} \Delta\mathbf{v}/\Delta t.$$

The component along $a$ is

$$\lim_{\Delta t \to 0} \Delta a/\Delta t = \lim_{\Delta t \to 0} v\Delta\psi/\Delta t = \lim_{\Delta t \to 0} (v\Delta\psi/\Delta s)(\Delta s/\Delta t) = v^2(d\psi/ds) = v^2/\rho \tag{7.2}$$

where

$$\rho = ds/d\psi \text{ is the radius of curvature at P}. \tag{7.3}$$

The direction of this component of the acceleration is along the inward normal at P. If the particle moves in a circle of radius $R$ then its acceleration towards the center is $v^2/R$, a result first given by Newton.

The component of acceleration along the tangent at P is $dv/dt = v(dv/ds) = d^2s/dt^2$.

7.2 An overview of Newtonian gravitation

Newton considered the fundamental properties of motion, embodied in his three Laws, to be universal in character: the natural laws apply to all motions of all particles throughout all space, at all times. Such considerations form the basis of a Natural Philosophy.
In the Principia Newton wrote "...I began to think of gravity as extending to the orb of the Moon...". He reasoned that the Moon, in its steady orbit around the Earth, is always accelerating towards the Earth! He estimated the acceleration as follows:

If the orbit of the Moon is circular (a reasonable assumption), the dynamical problem is

[Figure: the Moon in a circular orbit of radius $R$ about the Earth, with velocity $\mathbf{v}$ tangent to the orbit and acceleration $\mathbf{a}_R$ directed towards the Earth.]

The acceleration of the Moon towards the Earth is

$$|\mathbf{a}_R| = v^2/R.$$

Newton calculated $v = 2\pi R/T$, where $R$ = 240,000 miles, and $T$ = 27.4 days, the period, so that

$$a_R = 4\pi^2R/T^2 \approx 0.009\ \text{ft/sec}^2. \tag{7.4}$$

He knew that all objects close to the surface of the Earth accelerate towards the Earth with a value determined by Galileo, namely $g \approx 32\ \text{ft/sec}^2$. He was therefore faced with the problem of explaining the origin of the very large difference between the value of the acceleration $a_R$, nearly a quarter of a million miles away from Earth, and the local value, $g$. He had previously formulated his 2nd Law, which relates force to acceleration, and therefore he reasoned that the difference between the accelerations, $a_R$ and $g$, must be associated with a property of the force acting between the Earth and the Moon: the force must decrease with distance in some unknown way.

Newton then introduced his conviction that the force of gravity between objects is a universal force; each planet in the solar system interacts with the Sun via the same basic force, and therefore undergoes a characteristic acceleration towards the Sun. He concluded that the answer to the problem of the nature of the gravitational force must be contained in the three empirical Laws of Planetary Motion announced by Kepler, a few decades before.
The three Laws are

1) The planets describe ellipses about the Sun as focus,
2) The line joining the planet to the Sun sweeps out equal areas in equal intervals of time, and
3) The period of a planet is proportional to the length of the semi-major axis of the orbit, raised to the power of 3/2.

These remarkable Laws were deduced after an exhaustive study of the motion of the planets, made over a period of about 50 years by Tycho Brahe and Kepler. The 3rd Law was of particular interest to Newton because it relates the square of the period to the cube of the radius for a circular orbit:

$$T^2 \propto R^3 \tag{7.5}$$

or $T^2 = CR^3$, where C is a constant. He replaced the specific value of $R/T^2$ that occurs in the expression for the acceleration of the Moon towards the Earth with the value obtained from Kepler's 3rd Law and obtained a value for the acceleration $a_R$:

$$a_R = \frac{v^2}{R} = \frac{4\pi^2 R}{T^2} \quad \text{(Newton)} \tag{7.6}$$

but

$$\frac{R}{T^2} = \frac{1}{CR^2} \quad \text{(Kepler)} \tag{7.7}$$

therefore

$$a_R = 4\pi^2\,\frac{R}{T^2} = \frac{4\pi^2}{C}\,\frac{1}{R^2} \quad \text{(Newton)}. \tag{7.8}$$

The acceleration of the Moon towards the Earth varies as the inverse square of the distance between them.

Newton was now prepared to develop a general theory of gravitation. If the acceleration of a planet towards the Sun depends on the inverse square of their separation, then the force between them can be written, using the 2nd Law of Motion, as follows

$$F = M_{planet}\,a_{planet} = M_{planet}\,\frac{4\pi^2}{C}\,\frac{1}{R^2}. \tag{7.9}$$

At this point, Newton introduced the first symmetry argument in Physics: if the planet experiences a force from the Sun then the Sun must experience the same force from the planet (the 3rd Law of Motion!). He therefore argued that the expression for the force between the planet and the Sun must contain, explicitly, the masses of the planet and the Sun. The gravitational force $F_G$ between them therefore has the form

$$F_G = \frac{G\,M_{Sun}\,M_{planet}}{R^2}, \tag{7.10}$$

where G is a constant.
Newton saw no reason to limit this form to the Sun-planet system, and therefore he announced that for any two spherical masses, $M_1$ and $M_2$, the gravitational force between them is given by

$$F_G = \frac{G M_1 M_2}{R^2}, \tag{7.11}$$

where G is a universal constant of Nature. All evidence points to the fact that the gravitational force between two masses is always attractive.

Returning to the Earth-Moon system, the force on the Moon (mass $M_M$) in orbit is

$$F_R = \frac{G M_E M_M}{R^2} = M_M\,a_R \tag{7.12}$$

so that $a_R = GM_E/R^2$, which is independent of $M_M$. (The cancellation of the mass $M_M$ in the expressions for $F_R$ involves an important point that is discussed later in Section 8.1.)

At the surface of the Earth, the acceleration, g, of a mass M is essentially constant. It does not depend on the value of the mass, M, thus

$$g = \frac{GM_E}{R_E^2}, \text{ where } R_E \text{ is the radius of the Earth.} \tag{7.13}$$

(It took Newton many years to prove that the entire mass of the Earth, $M_E$, is equivalent to a point mass $M_E$ located at the center of the Earth when calculating the Earth's gravitational interaction with a mass on its surface. This result depends on the exact $1/R^2$ nature of the force.)

The ratio of the accelerations, $a_R/g$, is therefore

$$\frac{a_R}{g} = \frac{GM_E/R^2}{GM_E/R_E^2} = \left(\frac{R_E}{R}\right)^2. \tag{7.14}$$

Newton knew from observations that the ratio of the radius of the Earth to the radius of the Moon's orbit is about 1/60, and therefore he obtained $a_R/g \approx (1/60)^2 = 1/3600$, so that

$$a_R = \frac{g}{3600} = \frac{32}{3600} \text{ ft/sec}^2 \approx 0.009 \text{ ft/sec}^2.$$

In one of the great understatements of analysis, Newton said, in comparing this result with the value for $a_R$ that he had deduced using $a_R = v^2/R$, "... that it agreed pretty nearly ...". The discrepancy came largely from the errors in the observed ratio of the radii.
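The comparison of the two estimates of the Moon's acceleration can be checked numerically. The following is a minimal sketch using the round values quoted in this section (R = 240,000 miles, T = 27.4 days, g = 32 ft/sec^2, R_E/R = 1/60); the unit-conversion factors and the function names are assumptions introduced for illustration only.

```python
import math

# Round values quoted in the text, converted to feet and seconds.
FEET_PER_MILE = 5280.0       # assumed conversion factor
SECONDS_PER_DAY = 86400.0    # assumed conversion factor

R_ORBIT_FT = 240_000 * FEET_PER_MILE   # radius of the Moon's orbit
T_ORBIT_S = 27.4 * SECONDS_PER_DAY     # period of the Moon's orbit
G_SURFACE = 32.0                       # g at the Earth's surface, ft/sec^2
RADIUS_RATIO = 1.0 / 60.0              # R_E / R


def centripetal_acceleration(radius, period):
    """a_R = 4 pi^2 R / T^2 for uniform circular motion, Eq. (7.4)."""
    return 4.0 * math.pi ** 2 * radius / period ** 2


def inverse_square_acceleration(g_surface, radius_ratio):
    """a_R = g (R_E / R)^2, Eq. (7.14)."""
    return g_surface * radius_ratio ** 2


a_kinematic = centripetal_acceleration(R_ORBIT_FT, T_ORBIT_S)
a_gravity = inverse_square_acceleration(G_SURFACE, RADIUS_RATIO)

print(f"a_R from the orbit (v^2/R):   {a_kinematic:.4f} ft/sec^2")
print(f"a_R from g and (R_E/R)^2:     {a_gravity:.4f} ft/sec^2")
print(f"ratio of the two estimates:   {a_kinematic / a_gravity:.3f}")
```

With these round inputs the two estimates agree to better than one percent, which is the sense in which the results "agreed pretty nearly".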
7.3 Gravitation: an example of a central force

Central forces, in which a particle moves under the influence of a force that is always directed towards a single point (the center of force), form an important class of problems. Let the center of force be chosen as the origin of coordinates:

[Figure: a particle of mass m at the point P[r, $\phi$] with velocity $\mathbf{v}$; the force $\mathbf{F}$ acts along r towards the center of force O, and $\phi$ is measured from the x-axis.]

The description of particle motion in terms of polar coordinates (Chapter 2) is well suited to the analysis of the central force problem. For general motion, the acceleration of a point P[r, $\phi$] moving in the plane has the following components in the r- and $\phi$-directions

$$\mathbf{a}_r = \mathbf{u}_r\left(\frac{d^2r}{dt^2} - r\left(\frac{d\phi}{dt}\right)^2\right), \tag{7.15}$$

and

$$\mathbf{a}_\phi = \mathbf{u}_\phi\left(r\,\frac{d^2\phi}{dt^2} + 2\,\frac{dr}{dt}\,\frac{d\phi}{dt}\right), \tag{7.16}$$

where $\mathbf{u}_r$ and $\mathbf{u}_\phi$ are unit vectors in the r- and $\phi$-directions.

In the central force problem, the force F is always directed towards O, and therefore the component $\mathbf{a}_\phi$ perpendicular to r is always zero:

$$\mathbf{a}_\phi = \mathbf{u}_\phi\left(r\,\frac{d^2\phi}{dt^2} + 2\,\frac{dr}{dt}\,\frac{d\phi}{dt}\right) = 0, \tag{7.17}$$

and therefore

$$r\,\frac{d^2\phi}{dt^2} + 2\,\frac{dr}{dt}\,\frac{d\phi}{dt} = 0. \tag{7.18}$$

This is the equation of motion of a particle moving under the influence of a central force, centered at O. If we take the Sun as the (fixed) center of force, the motion of a planet moving about the Sun is given by this equation. The differential equation can be solved by making the substitution

$$\omega = \frac{d\phi}{dt}, \tag{7.19}$$

giving

$$r\,\frac{d\omega}{dt} + 2\omega\,\frac{dr}{dt} = 0, \tag{7.20}$$

or $r\,d\omega = -2\omega\,dr$. Separating the variables, we obtain $d\omega/\omega = -2\,dr/r$. Integrating gives $\log_e \omega = -2\log_e r + C$ (constant), therefore $\log_e(\omega r^2) = C$. Taking antilogs gives

$$r^2\omega = r^2\,\frac{d\phi}{dt} = e^C = k, \text{ a constant.}$$
(7.21)

7.4 Motion under a central force and the conservation of angular momentum

The above solution of the equation of motion of a particle of mass m, moving under the influence of a central force at the origin, O, can be multiplied throughout by the mass m to give

$$mr^2\,\frac{d\phi}{dt} = mk \tag{7.22}$$

or

$$mr\left(r\,\frac{d\phi}{dt}\right) = K, \text{ a constant for a given mass.} \tag{7.23}$$

We note that $r\,(d\phi/dt) = v_\phi$, the component of velocity perpendicular to r, therefore

angular momentum of m about O = $r(mv_\phi) = K$, a constant of the motion for a central force.

7.5 Kepler's 2nd law explained

The equation $r^2(d\phi/dt)$ = constant, k, can be interpreted in terms of an element of area swept out by the radius vector, r, as follows

[Figure: the element of area $\Delta A$ swept out as the radius vector turns through $\Delta\phi$ from length r to length $r + \Delta r$; the bounding circular arcs have lengths $r\,\Delta\phi$ and $(r + \Delta r)\,\Delta\phi$.]

From the diagram, we see that the following inequality holds

$$\frac{r^2\,\Delta\phi}{2} < \Delta A < \frac{(r + \Delta r)^2\,\Delta\phi}{2} \quad\text{or}\quad \frac{r^2}{2} < \frac{\Delta A}{\Delta\phi} < \frac{(r + \Delta r)^2}{2}.$$

When $\Delta\phi \to 0$, $r + \Delta r \to r$, so that, in the limit, $dA/d\phi = r^2/2$. The element of area is therefore $dA = r^2\,d\phi/2$. Twice the time rate of change of this element is therefore

$$2\,\frac{dA}{dt} = r^2\,\frac{d\phi}{dt}. \tag{7.24}$$

We recognize that this expression is equal to k, the constant that occurs in the solution of the differential equation of motion for a central path. The radius vector r therefore sweeps out area at a constant rate. This is Kepler's 2nd Law of Planetary Motion; it is seen to be a direct consequence of the fact that the gravitational attraction between the Sun and a planet is a central force.

7.6 Central orbits

A central orbit must be a plane curve (there is no force out of the plane), and the moment of the velocity, $r^2(d\phi/dt)$, about the center of force must be a constant of the motion. The moment can be written in three equivalent ways:

[Figure: three views of a particle at P with the central force $F_c$ directed towards O, showing (1) the polar components dr/dt and $r\,d\phi/dt$ of the velocity, (2) the perpendicular p from O onto the direction of $\mathbf{v}$, and (3) the Cartesian components dx/dt and dy/dt.]

The moment of the velocity about O is then

$$r\left(r\,\frac{d\phi}{dt}\right) = pv = x\,\frac{dy}{dt} - y\,\frac{dx}{dt} = \text{a constant, } h, \text{ say.}$$
(7.25)

The result $r^2(d\phi/dt)$ = constant for a central force can be derived in the following alternative way. The time derivative of $r^2(d\phi/dt)$ is

$$\frac{d}{dt}\left(r^2\,\frac{d\phi}{dt}\right) = r^2\,\frac{d^2\phi}{dt^2} + \frac{d\phi}{dt}\,2r\,\frac{dr}{dt}. \tag{7.26}$$

If this equation is divided throughout by r then

$$\frac{1}{r}\,\frac{d}{dt}\left(r^2\,\frac{d\phi}{dt}\right) = r\,\frac{d^2\phi}{dt^2} + 2\,\frac{dr}{dt}\,\frac{d\phi}{dt} \tag{7.27}$$

$$= \text{the transverse acceleration} = 0 \text{ for a central force.} \tag{7.28}$$

Integrating then gives

$$r^2\,\frac{d\phi}{dt} = \text{constant for a central force.} \tag{7.29}$$

7.6.1 The law of force in [p, r] coordinates

There are advantages to be gained in using a new set of coordinates, [p, r] coordinates, in which a point P in the plane is defined in terms of the radial distance r from the origin, and the perpendicular distance p from the origin onto the tangent to the path at P. (See following diagram.)

Let a particle of unit mass move along a path under the influence of a central force directed towards a fixed point, O. Let $a_c$ be the central acceleration of the unit mass at P, let the perpendicular distance from O to the tangent at P be p, and let the instantaneous radius of curvature of the path at the point P be $\rho$:

[Figure: a central orbit with the particle at P[r, p] moving with velocity $\mathbf{v}$; the central acceleration $a_c$ points along r towards the center of force O, $\alpha$ is the angle between r and the tangent at P, p is the perpendicular from O onto the tangent, and $a_\perp$ is the component of acceleration along the inward normal at P.]

The component of the central acceleration along the inward normal at P is

$$a_\perp = a_c \sin\alpha = \frac{v^2}{\rho} = a_c\,\frac{p}{r}. \tag{7.30}$$

The instantaneous radius of curvature is given by

$$\rho = r\,\frac{dr}{dp}. \quad \text{(Show this!)} \tag{7.31}$$

For all central forces,

$$pv = \text{constant} = h, \tag{7.32}$$

therefore

$$a_\perp = \frac{v^2}{\rho} = \frac{h^2}{p^2}\,\frac{1}{r}\,\frac{dp}{dr} = a_c\,\frac{p}{r}, \tag{7.33}$$

so that

$$a_c = \frac{h^2}{p^3}\,\frac{dp}{dr}. \tag{7.34}$$

This differential equation is the law of force per unit mass given the orbit in [p, r] coordinates. (It is left as a problem to show that, given the orbit in polar coordinates, the law of force per unit mass is $a_c = h^2u^2\{u + d^2u/d\phi^2\}$, where $u = 1/r$.)
(7.35)

In order to find the law of force per unit mass (acceleration), given the [p, r] equation of the orbit, it is necessary to calculate dp/dr. For example, if the orbit is parabolic, the [p, r] equation can be obtained as follows

[Figure: a parabola with apex A on the x-axis and focus F; P is a point on the curve at distance r from F, Q is the foot of the perpendicular from F onto the tangent at P, and FQ = p.]

The triangles FAQ and FQP are similar, therefore

$$\frac{p}{a} = \frac{r}{p}, \text{ where } AF = a, \tag{7.36}$$

giving

$$\frac{1}{p^2} = \frac{1}{ar}, \text{ the p-r equation of a parabola.} \tag{7.37}$$

Differentiating this equation, we obtain

$$\frac{1}{p^3}\,\frac{dp}{dr} = \frac{1}{2ar^2}. \tag{7.38}$$

The law of acceleration for the parabolic central orbit is therefore

$$a_c = \frac{h^2}{p^3}\,\frac{dp}{dr} = \frac{h^2}{2a}\,\frac{1}{r^2} = \frac{\text{constant}}{r^2}. \tag{7.39}$$

The instantaneous speed of P is given by the equation $v = h/p$; we therefore find

$$v = \frac{h}{\sqrt{ar}}. \tag{7.40}$$

This approach can be taken in discussing central orbits with elliptic and hyperbolic forms. Consider the ellipse

[Figure: an ellipse with center O, semi-major axis a, semi-minor axis b and foci $F_1$, $F_2$; P is a point on the curve with focal distances $r_1$, $r_2$, and Q, R are the feet of the perpendiculars $p_1$, $p_2$ from $F_1$, $F_2$ onto the tangent at P.]

The foci are $F_1$ and $F_2$, the semi-major axis is a, the semi-minor axis is b, the radius vectors to the point P[r, $\phi$] are $r_1$ and $r_2$, and the perpendiculars from $F_1$ and $F_2$ onto the tangent at P are $p_1$ and $p_2$. Using standard results from analytic geometry, we have for the ellipse

1) $r_1 + r_2 = 2a$, (7.41 a-c)
2) $p_1p_2 = b^2$, and
3) angle $QPF_1$ = angle $RPF_2$.

The triangles $F_1QP$ and $F_2RP$ are similar, and therefore

$$\frac{p_1}{r_1} = \frac{p_2}{r_2} \tag{7.42}$$

or

$$\left(\frac{p_1p_2}{r_1r_2}\right)^{1/2} = \frac{b}{\{r_1(2a - r_1)\}^{1/2}} = \frac{p_1}{r_1}$$

so that

$$\frac{b^2}{p_1^2} = \frac{2a}{r_1} - 1. \tag{7.43}$$

This is the [p, r] equation of an ellipse. The [p, r] equation for the hyperbola can be obtained using a similar analysis. The standard results from analytical geometry that apply in this case are

1) $p_1p_2 = b^2$, (7.44 a-c)
2) $r_2 - r_1 = 2a$, and
3) the tangent at P bisects the angle between the focal distances.

($b^2 = a^2(e^2 - 1)$, where e is the eccentricity ($e^2 > 1$), and $2b^2/a$ is the latus rectum.) We therefore obtain

$$\frac{b^2}{p_1^2} = \frac{2a}{r_1} + 1. \tag{7.45}$$

This is the [p, r] equation of a hyperbola.
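The [p, r] equation of the ellipse can be checked numerically by computing the focal distances and the perpendiculars onto the tangent directly from the Cartesian geometry. The following is a minimal sketch, assuming the standard parametrization x = a cos t, y = b sin t with foci at (-c, 0) and (+c, 0); the function name `ellipse_pedal_data` and the sample values a = 5, b = 3 are hypothetical choices made for illustration.

```python
import math


def ellipse_pedal_data(a, b, t):
    """Return (r1, p1, r2, p2) at parameter t on the ellipse x = a cos t, y = b sin t.

    r1, r2 are the distances from the foci F1, F2 to the point P;
    p1, p2 are the perpendicular distances from F1, F2 onto the tangent at P.
    """
    c = math.sqrt(a * a - b * b)            # distance of each focus from the center
    x, y = a * math.cos(t), b * math.sin(t)

    # The tangent at P written in the form A*x + B*y = 1.
    A, B = math.cos(t) / a, math.sin(t) / b
    norm = math.hypot(A, B)

    def focal_distances(fx):
        r = math.hypot(x - fx, y)           # focus-to-point distance
        p = abs(A * fx - 1.0) / norm        # focus-to-tangent distance
        return r, p

    r1, p1 = focal_distances(-c)
    r2, p2 = focal_distances(+c)
    return r1, p1, r2, p2


a, b = 5.0, 3.0
for t in (0.3, 1.1, 2.0, 4.5):
    r1, p1, r2, p2 = ellipse_pedal_data(a, b, t)
    lhs = b * b / (p1 * p1)                 # left-hand side of Eq. (7.43)
    rhs = 2.0 * a / r1 - 1.0                # right-hand side of Eq. (7.43)
    print(f"t={t:.1f}  r1+r2={r1 + r2:.6f}  p1*p2={p1 * p2:.6f}  "
          f"b^2/p1^2={lhs:.6f}  2a/r1-1={rhs:.6f}")
```

At each sample point the sum r1 + r2 should equal 2a, the product p1 p2 should equal b^2, and the last two columns should coincide, in line with (7.41) and (7.43).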
7.7 Bound and unbound orbits

For a central force, we have the equation for the acceleration in [p, r] form

$$\frac{h^2}{p^3}\,\frac{dp}{dr} = a_c. \tag{7.46}$$

If the acceleration varies as $1/r^2$, so that $a_c = k/r^2$ where k is a constant, then the form of the orbit is given by separating the variables and integrating, thus

$$h^2\,\frac{dp}{p^3} = k\,\frac{dr}{r^2} \tag{7.47}$$

so that $-h^2/2p^2 = -k/r$ + constant, or

$$\frac{h^2}{p^2} = \frac{2k}{r} + C,$$

where the value of C depends on the form of the orbit. Comparing this form with the general form of the [p, r] equations of conic sections, we see that the orbit is an ellipse, parabola, or hyperbola depending on the value of C. If C is negative, the orbit is an ellipse; if C is zero, the orbit is a parabola; and if C is positive, the orbit is a hyperbola.

The speed of the particle in a central orbit is given by $v = h/p$. If, therefore, the particle is projected from a point at a distance $r_0$ from the center of force, O, with a speed $v_0$ then

$$\frac{h^2}{p^2} = v_0^2 = \frac{2k}{r_0} + C, \tag{7.48}$$

so that the orbit is

1) an ellipse if $v_0^2 < 2k/r_0$, (7.49 a-c)
2) a parabola if $v_0^2 = 2k/r_0$, or
3) a hyperbola if $v_0^2 > 2k/r_0$.

The escape velocity, the initial velocity required for the particle to go into an unbound orbit, is therefore given by

$$v_{escape}^2 = \frac{2k}{r_0} = \frac{2GM_E}{R_E}$$

for a particle launched from the surface of the Earth. This condition is, in fact, an energy equation

$$\tfrac{1}{2}(m = 1)\,v_{escape}^2 = \frac{GM_E\,(m = 1)}{R_E}, \tag{7.50}$$

in which the left-hand side is the kinetic energy and the right-hand side is the potential energy (in magnitude).

7.8 The concept of the gravitational field

Newton was well aware of the great difficulties that arise in any theory of the gravitational interaction between two masses not in direct contact with each other. In the Principia he assumes, in the absence of any experimental knowledge of the speed of propagation of the gravitational interaction, that the interaction takes place instantaneously.
However, in letters to other luminaries of his day, he postulated an intervening agent between two approaching masses, an agent that requires a finite time to react.

In the early 19th century, the problem of understanding the interaction between spatially separated objects appeared in a new guise, this time in discussions of the electromagnetic interaction between charged objects. Faraday introduced the idea of a field of force with dynamical properties. In the Faraday model, an accelerating electric charge acts as the source of a dynamical electromagnetic field that travels at a finite speed through spacetime, and interacts with a distant charge. Energy and momentum are thereby transferred from one charged object to another distant charged object. Maxwell developed Faraday's idea into a mathematical theory, the electromagnetic theory of light, in which the speed of propagation of light appears as a fundamental constant of Nature. His theory involves the differential equations of motion of the electric and magnetic field vectors; the equations are not invariant under the Galilean transformation but they are invariant under the Lorentz transformation. (The discovery of the transformation that leaves Maxwell's equations invariant for all inertial observers was made by Lorentz in 1897.)

We have previously discussed the development of the Special Theory of Relativity by Einstein, a theory in which there is but one universal constant, c, for the speed of propagation of a dynamical field in a vacuum. This means that c is not only the speed of light in free space but also the speed of the gravitational field in the void between interacting masses.

We can gain some insight into the dynamical properties associated with the interaction between distant masses by investigating the effect of a finite speed of propagation, c, of the gravitational interaction on Newton's Laws of Motion.
Consider a non-orbiting mass M, at a distance R from a mass $M_S$, simply falling from rest with an acceleration a(R) towards $M_S$. According to Newton's Theory of Gravitation, the magnitude of the force on the mass M is

$$F(R) = \frac{GM_SM}{R^2} = M\,a(R). \tag{7.51}$$

We therefore have

$$a(R) = \frac{GM_S}{R^2}. \tag{7.52}$$

Let $\Delta t$ be the time that it takes for the gravitational interaction to travel the distance R at the universal speed c, so that

$$\Delta t = \frac{R}{c}. \tag{7.53}$$

In the time interval $\Delta t$, the mass M moves a distance $\Delta R$ towards the mass $M_S$:

$$\Delta R = \frac{a\,\Delta t^2}{2} = \frac{GM_S}{R^2}\,\frac{\Delta t^2}{2} = \frac{GM_S}{R^2}\,\frac{(R/c)^2}{2}. \tag{7.54}$$

Consider the situation in which the mass M is in a circular orbit of radius R about the mass $M_S$. Let $\mathbf{v}(t)$ be the velocity of the mass M at time t, and $\mathbf{v}(t + \Delta t)$ its velocity at $t + \Delta t$, where $\Delta t$ is chosen to be the interaction travel time. Let us consider the motion of M if there were no mass $M_S$ present, and therefore no interaction; the mass M then would continue its motion with constant velocity $\mathbf{v}(t)$ in a straight line. We are interested in the difference in the positions of M at time $t + \Delta t$, with and without the mass $M_S$ in place. We have, to a good approximation:

[Figure: the mass M in a circular orbit of radius R about $M_S$, with velocity $\mathbf{v}(t)$ and force $\mathbf{F}(R)$ at time t; at time $t + \Delta t$ the orbiting mass has velocity $\mathbf{v}(t + \Delta t)$, while the extrapolated position (no mass $M_S$) lies a distance $\Delta R$ outside the orbit.]

The magnitude of the gravitational force, $F_{EX}$, at the extrapolated position, with $M_S$ in place, is

$$F_{EX} = \frac{GM_SM}{(R + \Delta R)^2} \tag{7.55}$$

$$= \frac{GM_SM}{R^2}\left(1 + \frac{\Delta R}{R}\right)^{-2} \approx \frac{GM_SM}{R^2}\left(1 - \frac{2\,\Delta R}{R}\right), \text{ for } \Delta R \ll R. \tag{7.56}$$

Substituting the value of $\Delta R$ obtained above, we find

$$F_{EX} \approx \frac{GM_SM}{R^2} - \left(\frac{GM_SM}{Rc^2}\right)\left(\frac{GM_S}{R^2}\right). \tag{7.57}$$

Newton's 3rd Law states that

$$\mathbf{F}_{M_S,\,M} = -\mathbf{F}_{M,\,M_S}. \tag{7.58}$$

This Law is true, however, for contact interactions only. For all interactions that take place between separated objects, there is a mismatch between the action and the reaction. It takes time for one particle to respond to the presence of the other!
In the present example, we obtain a good estimate of the mismatch by taking the difference between $F_{EX}(R + \Delta R)$ and F(R), namely

$$F_{EX}(R + \Delta R) - F(R) \approx -\left(\frac{GM_SM}{Rc^2}\right)\left(\frac{GM_S}{R^2}\right). \tag{7.59}$$

On the right-hand side of this equation, we note that the term $(GM_S/R^2)$ has dimensions of acceleration, and therefore the term $(GM_SM/Rc^2)$ must have dimensions of mass. We see that this term is an estimate of the mass associated with the interaction itself. The space between the interacting masses must be endowed with this effective mass if Newton's 3rd Law is to include non-contact interactions. The appearance of the term $c^2$ in the denominator of this effective mass term has a special significance. If we invoke Einstein's famous relation $E = Mc^2$ then $\Delta E = \Delta M c^2$, so that the effective mass of the gravitational interaction can be written as an effective energy:

$$\Delta E_{GRAV} \approx \frac{GM_SM}{R}. \tag{7.60}$$

This is the energy stored in the gravitational field between the two interacting masses. Note that it has a 1/R dependence, the correct form for the potential energy associated with a $1/R^2$ gravitational force. We see that the notion of a dynamical field of force is a necessary consequence of the finite propagation time of the interaction.

7.9 The gravitational potential

The concept of a gravitational potential has its origins in the work of Leibniz. The potential energy, $V(\mathbf{x})$, associated with n interacting particles, of masses $m_1, m_2, \ldots, m_n$, situated at $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, is related to the gravitational force on a mass M at $\mathbf{x}$ due to the n particles by the equation

$$\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x}). \tag{7.61}$$

The exact forms of $\mathbf{F}(\mathbf{x})$ and $V(\mathbf{x})$ are

$$\mathbf{F}(\mathbf{x}) = -GM \sum_{i=1}^{n} \frac{m_i\,(\mathbf{x} - \mathbf{x}_i)}{|\mathbf{x} - \mathbf{x}_i|^3} \tag{7.62}$$

and

$$V(\mathbf{x}) = -GM \sum_{i=1}^{n} \frac{m_i}{|\mathbf{x} - \mathbf{x}_i|}.$$

In upper-index notation, the components of the force are

$$F^k(\mathbf{x}) = -\frac{\partial V}{\partial x_k}, \quad k = 1, 2, 3.$$
(7.63)

The gravitational field, $\mathbf{g}(\mathbf{x})$, is the force per unit mass:

$$\mathbf{g}(\mathbf{x}) = \frac{\mathbf{F}(\mathbf{x})}{M}, \tag{7.64}$$

and the gravitational potential is defined as

$$\Phi(\mathbf{x}) = \frac{V(\mathbf{x})}{M} = -\sum_{i=1}^{n} \frac{Gm_i}{|\mathbf{x} - \mathbf{x}_i|}. \tag{7.65}$$

The sign of the potential is chosen to be negative because the gravitational force is always attractive. (This convention agrees with that used in Electrostatics.)

If the mass consists of a continuous distribution that can be described by a mass density $\rho(\mathbf{x})$, then the potential is

$$\Phi(\mathbf{x}) = -\int \frac{G\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,d^3x'. \tag{7.66}$$

It is left as an exercise to show that this form of $\Phi$ means that the potential obeys Poisson's equation

$$\nabla^2\Phi(\mathbf{x}) - 4\pi G\rho(\mathbf{x}) = 0.$$

We should note that the gravitational potential of a mass M has the form

$$V(r) = -\frac{GM}{r} \tag{7.67}$$

only around a mass distribution with spherical symmetry. For an arbitrary mass distribution, the potential can be written as a series of multipoles.

The potential of a circular disc at a point on its axis can be found as follows

[Figure: a circular disc with center O; P is a point on the axis at distance p from O, Q is a point on the elemental ring of radius r and width dr, and R is the distance from P to the circumference.]

Let the disc be divided into concentric circles. The potential at P, on the axis, due to the elemental ring of radius r and width dr is $-2\pi r\,dr\,G\sigma/PQ$, where $\sigma$ is the mass per unit area of the disc. The potential at P of the entire disc is therefore

$$V_P = -\int_0^a \frac{2\pi G\sigma\,r\,dr}{PQ}, \tag{7.68}$$

where a is the radius of the disc. Therefore,

$$V_P = -2\pi G\sigma \int_0^a \frac{r\,dr}{(r^2 + p^2)^{1/2}} = -2\pi G\sigma\left[(r^2 + p^2)^{1/2}\right]_0^a = -2\pi G\sigma\,(R - p), \tag{7.69}$$

where $R = (a^2 + p^2)^{1/2}$ is the distance of P from any point on the circumference.

PROBLEMS

7-1 Show that the gravitational potential of a thin spherical shell of radius R and mass M at a point P is 1) $-GM/d$, where d is the distance from P to the center of the shell, if d > R, and 2) $-GM/R$ if P is inside or on the shell.
7-2 If d is the distance from the center of a solid sphere (radius R and density $\rho$) to a point P inside the sphere, show that the gravitational potential at P is $\Phi_P = -2\pi G\rho\,(R^2 - d^2/3)$.

7-3 Show that the gravitational attraction of a circular disc of radius R and mass per unit area $\sigma$ at a point P distant p from the center of the disc, and on the axis, is $2\pi G\sigma\,\{[p/(p^2 + R^2)^{1/2}] - 1\}$.

7-4 A particle moves in an ellipse about a center of force at a focus. Prove that the instantaneous velocity v of the particle at any point in its orbit can be resolved into two components, each of constant magnitude: 1) of magnitude $ah/b^2$ perpendicular to the radius vector r at the point, and 2) of magnitude $ahe/b^2$ perpendicular to the major axis of the ellipse. Here, a and b are the semi-major and semi-minor axes, e is the eccentricity, and h = pv = constant for a central orbit.

7-5 A particle moves in an orbit under a central acceleration $a = k/r^2$ where k = constant. If the particle is projected with an initial velocity $v_0$ in a direction at right angles to the radius vector r when at a distance $r_0$ from the center of force (the origin), prove

$$\left(\frac{dr}{dt}\right)^2 = \left\{\frac{2k}{r_0} - v_0^2\left(1 + \frac{r_0}{r}\right)\right\}\left\{\frac{r_0}{r} - 1\right\}.$$

This problem involves the energy and momentum equations in r, $\phi$ coordinates.

7-6 A particle moves in a cardioidal orbit, $r = a(1 + \cos\phi)$, under the influence of a central force

[Figure: the cardioid $r = a(1 + \cos\phi)$ with the center of force at the origin O; the particle at P[r, $\phi$] has velocity $\mathbf{v}$, p is the perpendicular from O onto the tangent at P, and the curve meets the x-axis at a distance 2a from O.]

1) show that the p-r equation of the cardioid is $p^2 = r^3/2a$, and 2) show that the central acceleration is $3ah^2/r^4$, where h = pv = constant.

7-7 A planet moves in a circular orbit of radius r about the Sun as focus at the center.
If the gravitational constant G changes slowly with time, G(t), then show that the angular velocity, $\omega$, of the planet and the radius of the orbit change in time according to the equations

$$\frac{1}{\omega}\,\frac{d\omega}{dt} = \frac{2}{G}\,\frac{dG}{dt} \quad\text{and}\quad \frac{1}{r}\,\frac{dr}{dt} = -\frac{1}{G}\,\frac{dG}{dt}.$$

(This is a central force problem!)

7-8 A particle moves under a central acceleration $a = k(1/r^3)$ where k is a constant. If $k = h^2$, where $h = r^2(d\phi/dt) = pv$, then show that the path is $1/r = A\phi + B$, a reciprocal spiral, where A and B are constants.

8 EINSTEINIAN GRAVITATION: AN INTRODUCTION TO GENERAL RELATIVITY

8.1 The principle of equivalence

The term "mass" that appears in Newton's equation for the gravitational force between two interacting masses refers to gravitational mass, that property of matter that responds to the gravitational force. Newton's Law should indicate this property of matter:

$$F_G = \frac{GM_Gm_G}{r^2},$$

where $M_G$ and $m_G$ are the gravitational masses of the interacting objects, separated by a distance r.

The term "mass" that appears in Newton's equation of motion, F = ma, refers to the inertial mass, that property of matter that resists changes in its state of motion. Newton's equation of motion should indicate this property of matter:

$$F(r) = m_I\,a(r),$$

where $m_I$ is the inertial mass of the particle moving with an acceleration a(r) in the gravitational field of the mass $M_G$.

Newton showed by experiment that the inertial mass of an object is equal to its gravitational mass, $m_I = m_G$, to an accuracy of 1 part in $10^3$. Recent experiments have shown this equality to be true to an accuracy of 1 part in $10^{12}$.
Newton therefore took the equations

$$F(r) = \frac{GM_Gm_G}{r^2} = m_I\,a(r), \tag{8.1}$$

and used the condition $m_G = m_I$ to obtain

$$a(r) = \frac{GM_G}{r^2}. \tag{8.2}$$

Galileo had, of course, previously shown that objects made from different materials fall with the same acceleration in the gravitational field at the surface of the Earth, a result that implies $m_G \propto m_I$. This is the Newtonian Principle of Equivalence.

Einstein used this Principle as a basis for a new Theory of Gravitation! He extended the axioms of Special Relativity, which apply to field-free frames, to frames of reference in free fall. A freely falling frame must be in a state of unpowered motion in a uniform gravitational field. The field region must be sufficiently small for there to be no measurable variation in the field throughout the region. If a field gradient does exist in the region then so-called tidal effects are present, and these can, in principle, be determined (by distorting a liquid drop, for example). The results of all experiments carried out in ideal freely falling frames are therefore fully consistent with Special Relativity. All freely falling observers measure the speed of light to be c, its constant free-space value. It is not possible to carry out experiments in ideal freely falling frames that permit a distinction to be made between the acceleration of local, freely falling objects, and their motion in an equivalent external gravitational field.

As an immediate consequence of the extended Principle of Equivalence, Einstein showed that a beam of light would be observed to be deflected from its straight path in a close encounter with a sufficiently massive object. The observers would, themselves, be far removed from the gravitational field of the massive object causing the deflection.
Einstein's original calculation of the deflection of light from a distant star, grazing the Sun, as observed here on the Earth, included only those changes in time intervals that he had predicted would occur in the near field of the Sun. His result turned out to be in error by exactly a factor of two. He later obtained the correct value for the deflection by including in the calculation the changes in spatial intervals caused by the gravitational field. A plausible argument is given in Section 8.6 for introducing a non-intuitive concept, the refractive index of spacetime due to a gravitational field. This concept is, perhaps, the characteristic physical feature of Einstein's revolutionary General Theory of Relativity.

8.2 Time and length changes in a gravitational field

We have previously discussed the changes that occur in the measurement of length and time intervals in different inertial frames. These changes have their origin in the invariant speed of light and the necessary synchronization of clocks in a given inertial frame. Einstein showed that measurements of length and time intervals in a given gravitational potential are changed relative to the measurements made in a different gravitational potential. These field-dependent changes are not to be confused with the Special-Relativistic changes discussed in Section 3.6. Although an exact treatment of this topic requires the solution of the full Einstein gravitational field equations, we can obtain some of the key results of the theory by making approximations that are valid in the case of our solar system. These approximations are treated in the following sections.

8.3 The Schwarzschild line element

An observer in an ideal freely falling frame measures an invariant infinitesimal interval of the standard Special Relativistic form

$$ds^2 = (c\,dt)^2 - (dx^2 + dy^2 + dz^2).$$
(8.3)

It is advantageous to transform this form to spherical polar coordinates, using the equations

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad\text{and}\quad z = r\cos\theta.$$

We then have

[Figure: spherical polar coordinates (r, $\theta$, $\phi$) with an infinitesimal cube of sides dr, $r\,d\theta$ and $r\sin\theta\,d\phi$; dl is the diagonal of the cube.]

The square of the length of the diagonal of the infinitesimal cube is seen to be

$$dl^2 = dr^2 + (r\,d\theta)^2 + (r\sin\theta\,d\phi)^2. \tag{8.4}$$

The invariant interval can therefore be written

$$ds^2 = (c\,dt)^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{8.5}$$

The key question that now faces us is this: how do we introduce gravitation into the problem? We can solve the problem by introducing an energy equation into the argument.

Consider two observers O and O′, passing by one another in a state of free fall in a gravitational field due to a mass M, fixed at the origin of coordinates. Both observers measure a standard interval of spacetime, ds according to O, and ds′ according to O′, so that

$$ds^2 = ds'^2 = (c\,dt')^2 - dr'^2 - r'^2(d\theta'^2 + \sin^2\theta'\,d\phi'^2). \tag{8.6}$$

The situation is as shown

[Figure: the mass M (the source of the field) at the origin; at the radial distance r the observer O′ moves radially outward with speed $v_{O'}(r)$, while the observer O, at the same place, has $v_O(r) \approx 0$.]

Let the observer O just begin free fall towards M at the radial distance r, and let the observer O′, close to O, be freely falling away from the mass M. The observer O′ is in a state of unpowered motion with just the right amount of kinetic energy to escape to infinity. Since both observers are in states of free fall, we can, according to Einstein, treat them as if they were inertial observers. This means that they can relate their local spacetime measurements by a Lorentz transformation. In particular, they can relate their measurements of the squared intervals, $ds^2$ and $ds'^2$, in the standard way.
Since their relative motion is along the radial direction, r, time intervals and radial distances will be measured to be changed:

$$\Delta t = \gamma\,\Delta t' \quad\text{and}\quad \gamma\,\Delta r = \Delta r', \tag{8.7 a,b}$$

where $\gamma = 1/\{1 - (v/c)^2\}^{1/2}$, in which $v = v_{O'}(r)$ because $v_O(r) \approx 0$.

If O′ has just enough kinetic energy to escape to infinity, then we can equate the kinetic energy to the potential energy, so that

$$\frac{v_{O'}^2(r)}{2} = 1 \cdot \Phi(r) \text{ if the observer O′ has unit mass.} \tag{8.8}$$

$\Phi(r)$ is the gravitational potential (taken here in magnitude) at r due to the presence of the mass, M, at the origin. This procedure enables us to introduce the gravitational potential into the value of $\gamma$ in the Lorentz transformation. We have $v_{O'}^2 = 2\Phi(r) = v^2$ and therefore

$$\Delta t = \frac{\Delta t'}{\{1 - 2\Phi(r)/c^2\}^{1/2}} \tag{8.9}$$

and

$$\Delta r = \Delta r'\,\{1 - 2\Phi(r)/c^2\}^{1/2}. \tag{8.10}$$

Only lengths parallel to r change, therefore

$$r'^2(d\theta'^2 + \sin^2\theta'\,d\phi'^2) = r^2(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{8.11}$$

and therefore we obtain

$$ds^2 = ds'^2 = c^2\left(1 - \frac{2\Phi(r)}{c^2}\right)dt^2 - \frac{dr^2}{1 - 2\Phi(r)/c^2} - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{8.12}$$

If the potential is due to a mass M at the origin then $\Phi(r) = GM/r$ (r > R, the radius of the mass, M), therefore

$$ds^2 = c^2\left(1 - \frac{2GM}{rc^2}\right)dt^2 - \left(1 - \frac{2GM}{rc^2}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{8.13}$$

This is the famous Schwarzschild line element, originally obtained as an exact solution of the Einstein field equations. The present approach fortuitously gives the exact result!
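The link between the escape-speed argument and the coefficients of the line element can be illustrated numerically. The following is a minimal sketch that evaluates the escape speed at the surface of the Sun, forms the corresponding Lorentz factor, and compares it with the coefficients in (8.13). The rounded SI values of G, c, the solar mass and the solar radius, as well as the function names, are assumptions introduced here and are not taken from the text.

```python
import math

# Rounded physical constants in SI units (assumed values, not from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # mass of the Sun, kg
R_SUN = 6.96e8       # radius of the Sun, m


def escape_speed(mass, r):
    """Speed for which v^2/2 = GM/r, the energy condition of Eq. (8.8)."""
    return math.sqrt(2.0 * G * mass / r)


def lorentz_factor(v):
    """gamma = 1/sqrt(1 - (v/c)^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def line_element_coefficients(mass, r):
    """Coefficients multiplying (c dt)^2 and -dr^2 in Eq. (8.13)."""
    chi = 2.0 * G * mass / (r * C ** 2)
    return 1.0 - chi, 1.0 / (1.0 - chi)


v_esc = escape_speed(M_SUN, R_SUN)
gamma = lorentz_factor(v_esc)
coeff_t, coeff_r = line_element_coefficients(M_SUN, R_SUN)

print(f"escape speed at the solar surface: {v_esc:.3e} m/s")
print(f"gamma - 1:                         {gamma - 1.0:.3e}")
print(f"1/gamma^2 and coefficient of (c dt)^2: {1.0 / gamma ** 2:.9f}  {coeff_t:.9f}")
print(f"gamma^2 and coefficient of dr^2:       {gamma ** 2:.9f}  {coeff_r:.9f}")
```

The two pairs of numbers coincide because the factor 1 - 2GM/rc^2 is simply 1/gamma^2 evaluated at the escape speed, which is the substitution made in passing from (8.7) to (8.13).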
8.4 The metric in the presence of matter

In the absence of matter, the invariant interval of spacetime is

$$ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu \quad (\mu, \nu = 0, 1, 2, 3), \tag{8.14}$$

where

$$\eta_{\mu\nu} = \text{diag}(1, -1, -1, -1) \tag{8.15}$$

is the metric of Special Relativity; it lowers the indices

$$dx_\mu = \eta_{\mu\nu}\,dx^\nu. \tag{8.16}$$

The form of the Schwarzschild line element, $ds^2_{sch}$, shows that the metric $g_{\mu\nu}$ in the presence of matter differs from $\eta_{\mu\nu}$. We have

$$ds^2_{sch} = g_{\mu\nu}\,dx^\mu dx^\nu, \tag{8.17}$$

where $dx^0 = c\,dt$, $dx^1 = dr$, $dx^2 = r\,d\theta$, and $dx^3 = r\sin\theta\,d\phi$, and

$$g_{\mu\nu} = \text{diag}\big((1 - \chi),\ -(1 - \chi)^{-1},\ -1,\ -1\big),$$

in which $\chi = 2GM/rc^2$. The Schwarzschild metric lowers the indices

$$dx_\mu = g_{\mu\nu}\,dx^\nu, \tag{8.18}$$

so that

$$ds^2_{sch} = dx^\mu dx_\mu. \tag{8.19}$$

8.5 The weak field approximation

If $\chi = 2GM/rc^2 \ll 1$, the coefficient $(1 - \chi)^{-1}$ of $dr^2$ in the Schwarzschild line element can be replaced by the leading terms of its binomial expansion, $(1 + \chi + \ldots)$, to give the weak field line element:

$$ds^2_W = (1 - \chi)(c\,dt)^2 - (1 + \chi)\,dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{8.20}$$

At the surface of the Sun, the value of $\chi$ is $4.2 \times 10^{-6}$, so that the weak field approximation is valid in all gravitational phenomena in our solar system.

Consider a beam of light travelling radially in the weak field of a mass M, then

$$ds^2_W = 0 \text{ (a light-like interval) and } d\theta^2 + \sin^2\theta\,d\phi^2 = 0, \tag{8.21}$$

giving

$$0 = (1 - \chi)(c\,dt)^2 - (1 + \chi)\,dr^2. \tag{8.22}$$

The velocity of the light $v_L = dr/dt$, as determined by observers far from the gravitational influence of M, is therefore

$$v_L = c\left\{\frac{1 - \chi}{1 + \chi}\right\}^{1/2} \neq c \text{ if } \chi \neq 0\,! \tag{8.23}$$

(Observers in free fall near M always measure the speed of light to be c.) Expanding the term $\{(1 - \chi)/(1 + \chi)\}^{1/2}$ to first order in $\chi$ we obtain

$$\frac{v_L(r)}{c} \approx (1 - \chi/2 - \ldots)(1 - \chi/2 + \ldots) = (1 - \chi + \ldots). \tag{8.24}$$

Therefore

$$v_L(r) \approx c\left(1 - \frac{2GM}{rc^2} + \ldots\right), \tag{8.25}$$

so that $v_L(r) < c$ in the presence of a mass M according to observers far removed from M.
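The size of the weak-field parameter and the quality of the first-order expansion can be checked numerically. The following is a minimal sketch that evaluates chi = 2GM/rc^2 at the surface of the Sun and compares the weak-field speed (8.23) with its first-order form (8.25). The rounded SI constants and the function names are assumptions introduced for illustration.

```python
import math

# Rounded physical constants in SI units (assumed values, not from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # mass of the Sun, kg
R_SUN = 6.96e8       # radius of the Sun, m


def chi(mass, r):
    """Dimensionless field strength chi = 2GM / (r c^2)."""
    return 2.0 * G * mass / (r * C ** 2)


def light_speed_weak_field(mass, r):
    """Radial coordinate speed of light from Eq. (8.23)."""
    x = chi(mass, r)
    return C * math.sqrt((1.0 - x) / (1.0 + x))


def light_speed_first_order(mass, r):
    """First-order expansion of the same speed, Eq. (8.25)."""
    return C * (1.0 - chi(mass, r))


chi_sun = chi(M_SUN, R_SUN)
v_full = light_speed_weak_field(M_SUN, R_SUN)
v_first = light_speed_first_order(M_SUN, R_SUN)

print(f"chi at the solar surface:        {chi_sun:.3e}")
print(f"(c - v_L)/c from Eq. (8.23):     {(C - v_full) / C:.6e}")
print(f"(c - v_L)/c from Eq. (8.25):     {(C - v_first) / C:.6e}")
print(f"fractional difference of forms:  {abs(v_full - v_first) / C:.1e}")
```

The difference between the two forms is of order chi squared, far below the first-order term, which is why the expansion is adequate throughout the solar system.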
8.6 The refractive index of spacetime in the presence of mass

In Geometrical Optics, the refractive index, n, of a material is defined as

$n \equiv c/v_{medium}$, (8.26)

where $v_{medium}$ is the speed of light in the medium. We introduce the concept of the refractive index of spacetime, $n_G(r)$, at a point r in the gravitational field of a mass, M:

$n_G \equiv c/v_L(r) \approx 1/(1 - \chi) \approx 1 + \chi$ (to first order in $\chi$) $= 1 + 2GM/rc^2$. (8.27)

The value of $n_G$ increases as r decreases. This effect can be interpreted as an increase in the density of spacetime as M is approached.

8.7 The deflection of light grazing the sun

As a plane wave of light approaches a spherical mass, those parts of the wave front nearest the mass are slowed down more than those parts farthest from the mass. The speed of the wave front is no longer constant along its surface, and therefore the normal to the surface must be deflected:

[Figure: a plane wavefront passing a mass M, the source of the field. Far from the mass $v_L \approx c$; near the mass $v_L < c$, so the normal to the wavefront is turned through the deflection angle.]

The deflection of a plane wave of light by a spherical mass, M, as it travels through spacetime can be calculated in the weak field approximation. We choose coordinates as shown:

[Figure: coordinates for the calculation. A plane wave of light travels in the x-direction past a mass M (this includes the mass of its field) of radius R; r is the distance from M, the level y = 0 is marked, and the wavefront advances dx = v dt at y and dx' = v' dt at y + dy.]

We have shown that the speed of light (moving radially) in a gravitational field, measured by an observer far from the source of the field, depends on the distance, r, from the source:

$v(r) = c\,(1 - 2GM/rc^2)$, (8.28)

where c is the invariant speed of light as $r \to \infty$. We wish to compare dx with dx', the distances travelled in the x-direction by the wavefront at y and y + dy, in the interval dt.
We have

$r^2 = (y + R)^2 + x^2$, (8.29)

therefore $v(r) \to v(x, y)$, so that

$2r(\partial r/\partial y) = 2(y + R)$, and $\partial r/\partial y = (y + R)/r$. (8.30)

Very close to the surface of the mass M (radius R), the gradient is

$\partial r/\partial y\,|_{y \to 0} \to R/r$. (8.31)

Now,

$\partial v(r)/\partial y = (\partial/\partial r)(c(1 - 2GM/rc^2))(\partial r/\partial y) = (2GM/r^2c)(\partial r/\partial y)$. (8.32)

We therefore obtain

$\partial v(r)/\partial y\,|_{y \to 0} = (2GM/r^2c)(R/r) = 2GMR/r^3c$. (8.33)

Let the speed of the wavefront be v' at y + dy and v at y. The distances moved in the interval dt are therefore

$dx' = v'\,dt$ and $dx = v\,dt$. (8.34 a,b)

The first-order Taylor expansion of v' is $v' = v + (\partial v/\partial y)\,dy$, and therefore

$dx' - dx = (v + (\partial v/\partial y)\,dy)\,dt - v\,dt = (\partial v/\partial y)\,dy\,dt$. (8.35)

Let the corresponding angle of deflection of the normal to the wavefront be $d\alpha$; then

$d\alpha = (dx' - dx)/dy = (\partial v/\partial y)\,dt = (\partial v/\partial y)(dx/v)$. (8.36)

The total deflection of the normal to the plane wavefront is therefore

$\Delta\alpha = \int_{-\infty}^{\infty} (\partial v/\partial y)(dx/v)$ (8.37)

$\approx (1/c)\int_{-\infty}^{\infty} (\partial v/\partial y)\,dx$ ($v \cong c$ over most of the range of the integral).

The portion of the wavefront that grazes the surface of the mass M ($y \to 0$) therefore undergoes a total deflection

$\Delta\alpha \approx (1/c)\int_{-\infty}^{\infty} (2GMR/r^3c)\,dx$ (8.38)

$= (2GMR/c^2)\int_{-\infty}^{\infty} dx/(R^2 + x^2)^{3/2}$

$= (2GMR/c^2)\,[x/(R^2(R^2 + x^2)^{1/2})]_{-\infty}^{\infty}$

$= 2(GMR/c^2)(2/R^2)$,

so that

$\Delta\alpha = 4GM/Rc^2$.

This is Einstein's famous prediction; putting in the known values for G, M, R, and c gives

$\Delta\alpha = 1.75$ arcseconds. (8.39)

Measurements of this very small effect, made during total eclipses of the Sun at various times and places since 1919, are fully consistent with Einstein's prediction.

PROBLEMS

8-1 If a particle A is launched with a velocity $v_{0A}$ from a point P on the surface of the Earth at the same instant that a particle B is dropped from a point Q, use the Principle of Equivalence to show that if A and B are to collide then $v_{0A}$ must be directed along the line PQ.
[Figure for Problem 8-1: particle B is dropped from Q while particle A is launched from P with velocity $v_{0A}$; g is the gravitational acceleration.]

8-2 A satellite is in a circular orbit above the Earth. It carries a clock that is similar to a clock on the Earth. There are two effects that must be taken into account in comparing the rates of the two clocks: 1) the time shift due to their relative speeds (Special Relativity), and 2) the time shift due to their different gravitational potentials (General Relativity). Calculate the SR shift to second order in (v/c), where v is the orbital speed, and the GR shift to the same order. In calculating the difference in the potentials, integrate from the surface of the Earth to the orbit radius. The two effects differ in sign. Show that the total relative change in the frequency of the satellite clock compared with the Earth clock is

$(\Delta\nu/\nu_E) \approx (gR_E/c^2)\{1 - (3R_E/2r_S)\}$,

where $r_S$ is the radius of the satellite orbit (measured from the center of the Earth). We see that if the altitude of the satellite is > $R_E/2$ (~3200 km), $\Delta\nu$ is positive since the gravitational effect then predominates, whereas at altitudes less than ~3200 km, the Special Relativity effect predominates. At an altitude of ~3200 km, the clocks remain in synchronism.

9 AN INTRODUCTION TO THE CALCULUS OF VARIATIONS

9.1 The Euler equation

A frequent problem in Differential Calculus is to find the stationary values (maxima and minima) of a function y(x). The necessary condition for a stationary value at x = a is

$dy/dx\,|_{x=a} = 0$.

For a minimum, $d^2y/dx^2\,|_{x=a} > 0$, and for a maximum, $d^2y/dx^2\,|_{x=a} < 0$.

The Calculus of Variations is concerned with a related problem, namely that of finding a function y(x) such that a definite integral taken over a function of this function shall be a maximum or a minimum. This is clearly a more complicated problem than that of simply finding the stationary values of a function, y(x).
Explicitly, we wish to find that function y(x) that will cause the definite integral

$\int_{x_1}^{x_2} F(x, y, dy(x)/dx)\,dx$ (9.1)

to have a stationary value. The integrand F is a function of y(x) as well as of x and dy(x)/dx. The limits $x_1$ and $x_2$ are assumed to be fixed, as are the values $y(x_1)$ and $y(x_2)$. The integral has different values along different paths that connect $(x_1, y_1)$ and $(x_2, y_2)$. Let a path be Y(x), and let this be one of a set of paths that are adjacent to y(x). We take Y(x) - y(x) to be an infinitesimal for every value of x in the range of integration. Let the difference be defined as

$Y(x) - y(x) \equiv \delta y(x)$ (a first-order change), (9.2)

and

$F(x, Y(x), dY(x)/dx) - F(x, y(x), dy(x)/dx) \equiv \delta F$. (9.3)

The symbol $\delta$ is called a variation; it represents the change in the quantity to which it is applied as we go from y(x) to Y(x) at the same value of x. Note that $\delta x = 0$, and

$\delta(dy/dx) = dY(x)/dx - dy(x)/dx = (d/dx)(Y(x) - y(x)) = (d/dx)(\delta y(x))$.

The symbols $\delta$ and (d/dx) commute:

$\delta(d/dx) - (d/dx)\delta = 0$. (9.4)

Graphically, we have

[Figure: the true path y(x) and the varied path Y(x) between the fixed end points $(x_1, y_1)$ and $(x_2, y_2)$, separated by $\delta y$ at a given x.]

Using the definition of $\delta F$, we find

$\delta F = F(x, y + \delta y, dy/dx + \delta(dy/dx)) - F(x, y, dy/dx)$ (9.5)

(in which $y + \delta y = Y(x)$ and $dy/dx + \delta(dy/dx) = (d/dx)Y(x)$)

$= (\partial F/\partial y)\,\delta y + (\partial F/\partial y')\,\delta y'$ for fixed x. (Here, $dy/dx = y'$.)

The integral

$\int_{x_1}^{x_2} F(x, y, y')\,dx$ (9.6)

is stationary if, to first order, its value along the path y is the same as its value along the varied path, $y + \delta y = Y$. We therefore require

$\int_{x_1}^{x_2} \delta F(x, y, y')\,dx = 0$. (9.7)

This integral can be written

$\int_{x_1}^{x_2} \{(\partial F/\partial y)\,\delta y + (\partial F/\partial y')\,\delta y'\}\,dx = 0$. (9.8)

The second term in this integral can be evaluated by parts, giving

$[(\partial F/\partial y')\,\delta y]_{x_1}^{x_2} - \int_{x_1}^{x_2} (d/dx)(\partial F/\partial y')\,\delta y\,dx$. (9.9)

But $\delta y_1 = \delta y_2 = 0$ at the end points $x_1$ and $x_2$, therefore the term $[\ ]_{x_1}^{x_2} = 0$, so that the stationary condition becomes

$\int_{x_1}^{x_2} \{\partial F/\partial y - (d/dx)\,\partial F/\partial y'\}\,\delta y\,dx = 0$.
(9.10)

The infinitesimal quantity $\delta y$ is arbitrary; therefore the factor that multiplies it in the integrand must be zero:

$\partial F/\partial y - (d/dx)\,\partial F/\partial y' = 0$. (9.11)

This is known as Euler's equation.

9.2 The Lagrange equations

Lagrange, one of the greatest mathematicians of the 18th century, developed Euler's equation in order to treat the problem of particle dynamics within the framework of generalized coordinates. He made the transformation

$F(x, y, dy/dx) \to L(t, u, du/dt)$, (9.12)

where u is a generalized coordinate and du/dt is a generalized velocity. The Euler equation then becomes the Lagrange equation-of-motion:

$\partial L/\partial u - (d/dt)(\partial L/\partial \dot{u}) = 0$, where $\dot{u}$ is the generalized velocity. (9.13)

The Lagrangian $L(t; u, \dot{u})$ is defined in terms of the kinetic and potential energy of a particle, or system of particles:

$L \equiv T - V$. (9.14)

It is instructive to consider the Newtonian problem of the motion of a mass m, moving in the plane under the influence of an inverse-square-law force of attraction, using Lagrange's equations-of-motion. Let the center of force be at the origin of polar coordinates. The kinetic energy of m at $[r, \phi]$ is

$T = m((dr/dt)^2 + r^2(d\phi/dt)^2)/2$, (9.15)

and its potential energy is

$V = -k/r$, where k is a constant. (9.16)

The Lagrangian is therefore

$L = T - V = m((dr/dt)^2 + r^2(d\phi/dt)^2)/2 + k/r$. (9.17)

Put r = u and $\phi$ = v, the generalized coordinates. We have, for the u-equation,

$(d/dt)(\partial L/\partial \dot{u}) = (d/dt)(\partial L/\partial \dot{r}) = (d/dt)(m(dr/dt)) = m\,d^2r/dt^2$ (9.18)

and

$\partial L/\partial u = \partial L/\partial r = mr(d\phi/dt)^2 - k/r^2$. (9.19)

Using Lagrange's equation-of-motion for the u-coordinate, we have

$m(d^2r/dt^2) - mr(d\phi/dt)^2 + k/r^2 = 0$ (9.20)

or

$m(d^2r/dt^2 - r(d\phi/dt)^2) = -k/r^2$. (9.21)

This is, as it should be, the Newtonian equation

mass × acceleration in the r-direction = force in the r-direction.
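The two partial derivatives used above, (9.18) and (9.19), can be checked by differentiating the Lagrangian (9.17) numerically. This is a minimal sketch under stated assumptions: the values m = 1, k = 1 and the sample point are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
def lagrangian(r, r_dot, phi_dot, m=1.0, k=1.0):
    """L = T - V for the inverse-square attraction, Eq. (9.17)."""
    kinetic = 0.5 * m * (r_dot**2 + r**2 * phi_dot**2)
    potential = -k / r
    return kinetic - potential


def dL_dr(r, r_dot, phi_dot, h=1e-6):
    """Central-difference estimate of dL/dr at fixed velocities."""
    upper = lagrangian(r + h, r_dot, phi_dot)
    lower = lagrangian(r - h, r_dot, phi_dot)
    return (upper - lower) / (2.0 * h)


def dL_dr_dot(r, r_dot, phi_dot, h=1e-6):
    """Central-difference estimate of dL/d(r_dot), the radial momentum."""
    upper = lagrangian(r, r_dot + h, phi_dot)
    lower = lagrangian(r, r_dot - h, phi_dot)
    return (upper - lower) / (2.0 * h)


# Arbitrary sample point (an assumption for illustration), with m = k = 1.
r, r_dot, phi_dot = 2.0, 0.3, 0.5

numeric_force_term = dL_dr(r, r_dot, phi_dot)
analytic_force_term = r * phi_dot**2 - 1.0 / r**2      # Eq. (9.19)

numeric_momentum = dL_dr_dot(r, r_dot, phi_dot)
analytic_momentum = r_dot                              # m(dr/dt), Eq. (9.18)

print(numeric_force_term, analytic_force_term)
print(numeric_momentum, analytic_momentum)
```

Both pairs agree to within the accuracy of the finite differences, which is all the check is meant to show.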
Introducing a second generalized coordinate, we have, for the v-equation,

$(d/dt)(\partial L/\partial \dot{v}) = (d/dt)(\partial L/\partial \dot{\phi}) = (d/dt)(mr^2\dot{\phi})$ (9.22)

$= m(r^2\ddot{\phi} + \dot{\phi}\,2r\dot{r})$,

and

$\partial L/\partial v = \partial L/\partial \phi = 0$, (9.23)

therefore

$m(r^2\ddot{\phi} + 2r\dot{r}\dot{\phi}) = 0$, so that $(d/dt)(mr^2\dot{\phi}) = 0$. (9.24)

Integrating, we obtain

$mr^2\dot{\phi}$ = constant, (9.25)

showing, again, that the angular momentum is conserved.

The advantages of using the Lagrangian method to solve dynamical problems stem from the fact that L is a scalar function of generalized coordinates.

9.3 The Hamilton equations

The Lagrangian L is a function of the generalized coordinates and velocities, and the time:

$L = L(u, v, \dots; \dot{u}, \dot{v}, \dots; t)$. (9.26)

If the discussion is limited to two coordinates, u and v, the total differential of L is

$dL = (\partial L/\partial u)\,du + (\partial L/\partial v)\,dv + (\partial L/\partial \dot{u})\,d\dot{u} + (\partial L/\partial \dot{v})\,d\dot{v} + (\partial L/\partial t)\,dt$.

Consider the simplest case of a mass m moving along the x-axis in a potential, so that u = x and $\dot{u} = \dot{x} = v_x$; then

$L = T - V = mv_x^2/2 - V$ (9.27)

and

$\partial L/\partial v_x = mv_x = p_x$, the linear momentum. (9.28)

In general, it is found that terms of the form $\partial L/\partial \dot{u}$ and $\partial L/\partial \dot{v}$ are momentum terms; they are called generalized momenta, and are written

$\partial L/\partial \dot{u} = p_u$, $\partial L/\partial \dot{v} = p_v$, etc. (9.29)

Such forms are not limited to linear momenta.

The Lagrange equation

$(d/dt)(\partial L/\partial \dot{u}) - \partial L/\partial u = 0$ (9.30)

can be transformed, therefore, into an equation that involves the generalized momenta:

$(d/dt)(p_u) - \partial L/\partial u = 0$, or $\partial L/\partial u = \dot{p}_u$. (9.31)

The total differential of L is therefore

$dL = \dot{p}_u\,du + \dot{p}_v\,dv + p_u\,d\dot{u} + p_v\,d\dot{v} + (\partial L/\partial t)\,dt$. (9.32)

We now introduce an important function, the Hamiltonian function, H, defined by

$H \equiv p_u\dot{u} + p_v\dot{v} - L$, (9.33)

therefore

$dH = \{p_u\,d\dot{u} + \dot{u}\,dp_u + p_v\,d\dot{v} + \dot{v}\,dp_v\} - dL$. (9.34)

It is not by chance that H is defined in the way given above. The definition permits the cancellation of the terms in dH that involve $d\dot{u}$ and $d\dot{v}$, so that dH depends only on du, dv, $dp_u$ and $dp_v$ (and perhaps, t).
We can therefore write

$H = f(u, v, p_u, p_v; t)$ (limiting the discussion to the two coordinates u and v). (9.35)

The total differential of H is therefore

$dH = (\partial H/\partial u)\,du + (\partial H/\partial v)\,dv + (\partial H/\partial p_u)\,dp_u + (\partial H/\partial p_v)\,dp_v + (\partial H/\partial t)\,dt$. (9.36)

Comparing the two equations for dH, we obtain Hamilton's equations-of-motion:

$\partial H/\partial u = -\dot{p}_u$, $\partial H/\partial v = -\dot{p}_v$, (9.37)

$\partial H/\partial p_u = \dot{u}$, $\partial H/\partial p_v = \dot{v}$, (9.38)

and

$\partial H/\partial t = -\partial L/\partial t$. (9.39)

We see that

$H = p_u\dot{u} + p_v\dot{v} - (T - V)$. (9.40)

If we consider a mass m moving in the (x, y)-plane then

$H = (mv_x)v_x + (mv_y)v_y - T + V$ (9.41)

$= 2(mv_x^2/2 + mv_y^2/2) - T + V = T + V$, the total energy. (9.42)

In advanced treatments of Analytical Dynamics, this form of the Hamiltonian is shown to have general validity.

PROBLEMS

9-1 Studies of geodesics (the shortest distance between two points on a surface) form a natural part of the Calculus of Variations. Show that the straight line between two points in a plane is the shortest distance between them.

9-2 The surface generated by revolving the y-coordinate about the x-axis has an area $2\pi\int y\,ds$, where $ds = \{dx^2 + dy^2\}^{1/2}$. Use Euler's variational method to show that the surface of revolution is a minimum if

$(dy/dx) = \{(y^2/a^2) - 1\}^{1/2}$, where a = constant.

Hence show that the equation of the minimum surface is

$y = a\cosh\{(x/a) + b\}$, where b = constant, and $y \neq 0$.

9-3 The Principle of Least Time pre-dates the Calculus of Variations. The propagation of a ray of light in adjoining media that have different indices of refraction is found to be governed by this principle. A ray of light moves at constant speed $v_1$ in a medium (1) from a point A to a point $B_0$ on the x-axis. At $B_0$, its speed changes to a new constant value $v_2$ on entering medium (2). The ray continues until it reaches a point C in (2).
If the true path $A \to B_0 \to C$ is such that the total travel time of the light in going from A to C is a minimum, show that

$(v_1/v_2) = x_0\{[y_C^2 + (d - x_0)^2]/[y_A^2 + x_0^2]\}^{1/2}/(d - x_0)$, (Snell's law)

where the symbols are defined in the following diagram:

[Figure: medium 1 (speed $v_1$) lies above the x-axis and contains the point A at height $y_A$; medium 2 (speed $v_2$) lies below it and contains the point C at $(x_C = d, y_C)$. The true path meets the x-axis at $B_0$ ($x = x_0$) and a varied path meets it at B.]

The path $A \to B \to C$ is an arbitrarily varied path.

9-4 Hamilton's Principle states that when a system is moving under conservative forces the time integral of the Lagrange function is stationary. (It is possible to show that this Principle holds for non-conservative forces.) Apply Hamilton's Principle to the case of a projectile of mass m moving in a constant gravitational field, in the plane. Let the projectile be launched from the origin of Cartesian coordinates at time t = 0. The Lagrangian is

$L = m((dx/dt)^2 + (dy/dt)^2)/2 - mgy$.

Calculate $\delta\int_0^{t_1} L\,dt$, and obtain Newton's equations of motion

$d^2y/dt^2 + g = 0$ and $d^2x/dt^2 = 0$.

9-5 Reconsider the example discussed in section 9.2 from the point of view of the Hamiltonian of the system. Obtain $H(r, \phi, p_r, p_\phi)$, and solve Hamilton's equations of motion to obtain the results given in Eqs. 9.21 and 9.25.

10 CONSERVATION LAWS, AGAIN

10.1 The conservation of mechanical energy

If the Hamiltonian of a system does not depend explicitly on the time, we have

$H = H(u, v, \dots; p_u, p_v, \dots)$. (10.1)

In this case, the total differential dH is (for two generalized coordinates, u and v)

$dH = (\partial H/\partial u)\,du + (\partial H/\partial v)\,dv + (\partial H/\partial p_u)\,dp_u + (\partial H/\partial p_v)\,dp_v$. (10.2)

If the positions and the momenta of the particles in the system change with time under their mutual interactions, then the rate of change of H with time is

$dH/dt = (\partial H/\partial u)\,du/dt + (\partial H/\partial v)\,dv/dt + (\partial H/\partial p_u)\,dp_u/dt + (\partial H/\partial p_v)\,dp_v/dt$

$= (-\dot{p}_u\dot{u}) + (-\dot{p}_v\dot{v}) + (\dot{u}\dot{p}_u) + (\dot{v}\dot{p}_v)$ (10.3)

$= 0$, using Hamilton's equations-of-motion. (10.4)

Integration then gives H = constant.
(10.5)

In any system moving under the influence of conservative forces, a potential V exists. In such systems, the total mechanical energy is H = T + V, and we see that it is a constant of the motion.

10.2 The conservation of linear and angular momentum

If the Hamiltonian, H, does not depend explicitly on a given generalized coordinate then the generalized momentum associated with that coordinate is conserved. For example, if H contains no explicit reference to an angular coordinate then the angular momentum associated with that angle is conserved. Formally, we have

$dp_j/dt = -\partial H/\partial q_j$, where $p_j$ and $q_j$ are the generalized momenta and coordinates. (10.6)

Let an infinitesimal change in the jth coordinate $q_j$ be made, so that

$q_j \to q_j + \delta q_j$; (10.7)

then we have

$\delta H = (\partial H/\partial q_j)\,\delta q_j$. (10.8)

If the Hamiltonian is invariant under the infinitesimal displacement $\delta q_j$ then the generalized momentum $p_j$ is a constant of the motion. The conservation of linear momentum is therefore a consequence of the homogeneity of space, and the conservation of angular momentum is a consequence of the isotropy of space.

The observed conservation laws therefore imply that the choice of a point in space for the origin of coordinates, and the choice of an axis of orientation, play no part in the formulation of the physical laws; the Laws of Nature do not depend on an absolute space.

11 CHAOS

The behavior of many non-linear dynamical systems as a function of time is found to be chaotic. The characteristic feature of chaos is that the system never repeats its past behavior. Chaotic systems nonetheless obey classical laws of motion, which means that the equations of motion are deterministic.

Poincaré was the first to study the effects of small changes in the initial conditions on the evolution of chaotic systems that obey non-linear equations of motion.
In a chaotic system, the erratic behavior is due to the internal, or intrinsic, dynamics of the system.

Let a dynamical system be described by a set of first-order differential equations:

$dx_1/dt = f_1(x_1, x_2, x_3, \dots x_n)$ (11.1)
$dx_2/dt = f_2(x_1, x_2, x_3, \dots x_n)$
...
$dx_n/dt = f_n(x_1, x_2, x_3, \dots x_n)$

where the functions $f_1, \dots, f_n$ are functions of the n variables.

The necessary conditions for chaotic motion of the system are

1) the equations of motion must contain a non-linear term that couples several of the variables. A typical non-linear equation, in which two of the variables are coupled, is therefore

$dx_1/dt = ax_1 + bx_2 + cx_1x_2 + \dots + rx_n$ (a, b, c, ...r are constants) (11.2)

and

2) the number of independent variables, n, must be at least three.

The second condition is discussed later.

The non-linearity often makes the solution of the equations unstable for particular choices of the parameters. Numerical methods of solution must be adopted in all but a few standard cases.

11.1 The general motion of a damped, driven pendulum

The equation of a damped, driven pendulum is

$ml(d^2\theta/dt^2) + kml(d\theta/dt) + mg\sin\theta = A\cos(\omega_D t)$ (11.3)

or

$(d^2\theta/dt^2) + k(d\theta/dt) + (g/l)\sin\theta = (A/ml)\cos(\omega_D t)$, (11.4)

where $\theta$ is the angular displacement of the pendulum, l is its length, m is its mass, the resistance is proportional to the velocity (constant of proportionality, k), A is the amplitude and $\omega_D$ is the angular frequency of the driving force.

Baker and Gollub in Chaotic Dynamics (Cambridge, 1990) write this equation in the form

$(d^2\theta/dt^2) + (1/q)(d\theta/dt) + \sin\theta = C\cos(\omega_D t)$, (11.5)

where q is the damping factor. The low-amplitude natural angular frequency of the pendulum is unity, and time is dimensionless.
We can therefore write the equation in terms of three first-order differential equations:

$d\omega/dt = -(1/q)\,\omega - \sin\theta + C\cos(\phi)$, where $\phi$ is the phase, (11.6)

$d\theta/dt = \omega$, (11.7)

and

$d\phi/dt = \omega_D$. (11.8)

The three variables are $(\omega, \theta, \phi)$.

The onset of chaotic motion of the pendulum depends on the choice of the parameters q, C, and $\omega_D$.

The phase space of the oscillations is three-dimensional:

[Figure: the three-dimensional phase space with axes $\omega$, $\theta$ and $\phi$; the trajectory is a spiral with a pitch of $2\pi$ along the $\phi$-axis.]

The $\theta$-$\omega$ trajectories are projections of the spiral onto the $\theta$-$\omega$ plane.

The motion is sensitive to $\omega_D$ since the non-linear terms generate many new resonances that occur when $\omega_D/\omega_{natural}$ is a rational number. (Here, $\omega_{natural}$ is the angular frequency of the undamped linear oscillator.) For particular values of q and $\omega_D$, the forcing term produces a damped motion that is no longer periodic: the motion becomes chaotic. Periodic motion is characterized by closed orbits in the $(\theta, \omega)$ plane. If the damping is reduced considerably, the motion can become highly chaotic.

The system is sensitive to small changes in the initial conditions. The trajectories in phase space diverge from each other with exponential time-dependence. For chaotic motion, the projection of the trajectory in $(\theta, \omega, \phi)$ space onto the $(\theta, \omega)$ plane generates trajectories that intersect. However, in the full 3-space, a spiralling line along the $\phi$-axis never intersects itself. We therefore see that chaotic motion can exist only when the system has at least a 3-dimensional phase space. The path then converges towards the attractor without self-crossing.

Small changes in the initial conditions of a chaotic system may produce very different trajectories in phase space. These trajectories diverge, and their divergence increases exponentially with time. If the difference between trajectories as a function of time is d(t) then it is found that

$\log d(t) \sim \lambda t$ or $d(t) \sim e^{\lambda t}$, (11.9)

where $\lambda$, a positive quantity, is called the Lyapunov exponent.
In a weakly chaotic system $\lambda \ll 0.1$ whereas, in a strongly chaotic system, $\lambda \gg 0.1$.

11.2 The numerical solution of differential equations

A numerical method of solving differential equations that is suitable in the present case is known as the Runge-Kutta method. The algorithm for solving two equations that are functions of several variables is as follows. Let

$dy/dx = f(x, y, z)$ and $dz/dx = g(x, y, z)$. (11.10)

If $y = y_0$ and $z = z_0$ when $x = x_0$ then, for increments h in $x_0$, k in $y_0$ and l in $z_0$, the Runge-Kutta equations are

$k_1 = hf(x_0, y_0, z_0)$, $l_1 = hg(x_0, y_0, z_0)$,
$k_2 = hf(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2)$, $l_2 = hg(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2)$,
$k_3 = hf(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2)$, $l_3 = hg(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2)$,
$k_4 = hf(x_0 + h, y_0 + k_3, z_0 + l_3)$, $l_4 = hg(x_0 + h, y_0 + k_3, z_0 + l_3)$,
$k = (k_1 + 2k_2 + 2k_3 + k_4)/6$ and $l = (l_1 + 2l_2 + 2l_3 + l_4)/6$. (11.11)

The initial values are incremented, and successive values of x, y, and z are generated by iterations. It is often advantageous to use varying values of h to optimize the procedure. In the present case,

$f(x, y, z) \to f(t, \theta, \omega)$ and $g(x, y, z) \to g(\omega)$.

As a problem, develop an algorithm to solve the non-linear equation 11.5 using the Runge-Kutta method for three equations (11.6, 11.7, and 11.8). Write a program to calculate the necessary iterations. Choose increments in time that are small enough to reveal the details in the $\theta$-$\omega$ plane. Examples of non-chaotic and chaotic behavior are shown in the following two diagrams.

[Figure: points in the $\theta$-$\omega$ plane for a non-chaotic system.]

The parameters used to obtain this plot in the $\theta$-$\omega$ plane are: damping factor (1/q) = 1/5, amplitude (C) = 2, drive frequency ($\omega_D$) = 0.7, and time increment, $\Delta t$ = 0.05. All the initial values are zero.
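One possible way to organize the iterations for Eqs. (11.6) to (11.8) is sketched below. It is an illustration under stated assumptions rather than the program used to produce the plots: the function names are hypothetical, the damping is passed as the coefficient 1/q, and the slopes are stored without the factor h, so that the combination dt(k1 + 2k2 + 2k3 + k4)/6 plays the role of the increment k in (11.11).

```python
import math


def derivatives(state, inv_q, amplitude, omega_d):
    """Right-hand sides of Eqs. (11.6)-(11.8) for state = (omega, theta, phi)."""
    omega, theta, phi = state
    d_omega = -inv_q * omega - math.sin(theta) + amplitude * math.cos(phi)
    d_theta = omega
    d_phi = omega_d
    return (d_omega, d_theta, d_phi)


def rk4_step(state, dt, inv_q, amplitude, omega_d):
    """Advance the state by one time step dt with the classical Runge-Kutta scheme."""
    def shifted(base, slope, factor):
        # base + factor * slope, component by component
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = derivatives(state, inv_q, amplitude, omega_d)
    k2 = derivatives(shifted(state, k1, dt / 2.0), inv_q, amplitude, omega_d)
    k3 = derivatives(shifted(state, k2, dt / 2.0), inv_q, amplitude, omega_d)
    k4 = derivatives(shifted(state, k3, dt), inv_q, amplitude, omega_d)
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, steps, inv_q, amplitude, omega_d):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, inv_q, amplitude, omega_d)
        trajectory.append(state)
    return trajectory


# Parameters quoted in the text for the non-chaotic plot; the number of
# steps is an arbitrary choice made for this sketch.
trajectory = integrate((0.0, 0.0, 0.0), dt=0.05, steps=2000,
                       inv_q=0.2, amplitude=2.0, omega_d=0.7)
omega, theta, phi = trajectory[-1]
print(f"final omega = {omega:.4f}, theta = {theta:.4f}, phi = {phi:.4f}")
```

Plotting the (theta, omega) pairs of the trajectory gives the projection onto the $\theta$-$\omega$ plane; a fixed step is used here for simplicity, although the text notes that varying h can be advantageous.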
[Figure: points in the $\theta$-$\omega$ plane for a chaotic system.]

The parameters used to obtain this plot in the $\theta$-$\omega$ plane are: damping factor (1/q) = 1/2, amplitude (C) = 1.15, drive frequency ($\omega_D$) = 0.597, and time increment, $\Delta t$ = 1. The initial value of the time is 100.

12 WAVE MOTION

12.1 The basic form of a wave

Wave motion in a medium is a collective phenomenon that involves local interactions among the particles of the medium. Waves are characterized by:

1) a disturbance in space and time,
2) a transfer of energy from one place to another, and
3) a non-transfer of material of the medium.

(In a water wave, for example, the molecules move perpendicularly to the velocity vector of the wave.)

Consider a kink in a rope that propagates with a velocity V along the +x-axis, as shown:

[Figure: the displacement y of the rope plotted against x at time t; the waveform moves with velocity V in the +x direction.]

Assume that the shape of the kink does not change in moving a small distance $\Delta x$ in a short interval of time $\Delta t$. The speed of the kink is defined to be $V = \Delta x/\Delta t$. The displacement in the y-direction is a function of x and t,

y = f(x, t).

We wish to answer the question: what basic principles determine the form of the argument of the function, f?

For water waves, acoustical waves, waves along flexible strings, etc., the wave velocities are much less than c. Since y is a function of x and t, we see that all points on the waveform move in such a way that the Galilean transformation holds for all inertial observers of the waveform. Consider two inertial observers: observer #1 at rest on the x-axis, watching the wave move along the x-axis with constant speed, V, and a second observer #2, moving with the wave. If the observers synchronize their clocks so that $t_1 = t_2 = t_0 = 0$ at $x_1 = x_2 = 0$, then

$x_2 = x_1 - Vt$.
We therefore see that the functional form of the wave is determined by the form of the Galilean transformation, so that

$y(x, t) = f(x - Vt)$, (12.1)

where V is the wave velocity in the particular medium. No other functional form is possible! For example,

$y(x, t) = A\sin k(x - Vt)$ is permitted, whereas $y(x, t) = A(x^2 + V^2t)$ is not.

If the wave moves to the left (in the -x direction) then

$y(x, t) = f(x + Vt)$. (12.2)

We shall consider waves that superimpose linearly. If, for example, two waves move along a rope in opposite directions, we observe that they pass through each other.

If the wave is harmonic, the displacement measured as a function of time at the origin, x = 0, is also harmonic:

$y_0(0, t) = A\cos(\omega t)$,

where A is the maximum amplitude, and $\omega = 2\pi\nu$ is the angular frequency. The general form of y(x, t), consistent with the Galilean transformation, is

$y(x, t) = A\cos\{k(x - Vt)\}$,

where k is introduced to make the argument dimensionless (k has dimensions of 1/[length]). We then have

$y_0(0, t) = A\cos(kVt) = A\cos(\omega t)$.

Therefore,

$\omega = kV$, the angular frequency, (12.3)

or

$2\pi\nu = kV$,

so that

$k = 2\pi\nu/V = 2\pi/VT$, where $T = 1/\nu$ is the period. (12.4)

The general form is then

$y(x, t) = A\cos\{(2\pi/VT)(x - Vt)\}$
$= A\cos\{(2\pi/\lambda)(x - Vt)\}$, where $\lambda = VT$ is the wavelength,
$= A\cos\{(2\pi x/\lambda - 2\pi t/T)\}$
$= A\cos(kx - 2\pi t/T)$, where $k = 2\pi/\lambda$, the wavenumber,
$= A\cos(kx - \omega t)$
$= A\cos(\omega t - kx)$, because $\cos(-\theta) = \cos(\theta)$. (12.5)

For a wave moving in three dimensions, the displacement at a point x, y, z at time t has the form

$\psi(x, y, z, t) = A\cos(\omega t - \mathbf{k}\cdot\mathbf{r})$, (12.6)

where $|\mathbf{k}| = 2\pi/\lambda$ and $\mathbf{r} = [x, y, z]$.

12.2 The general wave equation

An arbitrary waveform in one space dimension can be written as the superposition of two waves, one travelling to the right (+x) and the other to the left (-x) of the origin. The displacement is then

$y(x, t) = f(x - Vt) + g(x + Vt)$.
(12.7)

Put

$u = f(x - Vt) = f(p)$, and $v = g(x + Vt) = g(q)$;

then

y = u + v.

Now,

$\partial y/\partial x = \partial u/\partial x + \partial v/\partial x = (du/dp)(\partial p/\partial x) + (dv/dq)(\partial q/\partial x) = f'(p)(\partial p/\partial x) + g'(q)(\partial q/\partial x)$.

Also,

$\partial^2 y/\partial x^2 = (\partial/\partial x)\{(du/dp)(\partial p/\partial x) + (dv/dq)(\partial q/\partial x)\}$
$= f'(p)(\partial^2 p/\partial x^2) + f''(p)(\partial p/\partial x)^2 + g'(q)(\partial^2 q/\partial x^2) + g''(q)(\partial q/\partial x)^2$.

We can obtain the second derivative of y with respect to time using a similar method:

$\partial^2 y/\partial t^2 = f'(p)(\partial^2 p/\partial t^2) + f''(p)(\partial p/\partial t)^2 + g'(q)(\partial^2 q/\partial t^2) + g''(q)(\partial q/\partial t)^2$.

Now, $\partial p/\partial x = 1$, $\partial q/\partial x = 1$, $\partial p/\partial t = -V$, and $\partial q/\partial t = V$, and all second derivatives are zero (V is a constant). We therefore obtain

$\partial^2 y/\partial x^2 = f''(p) + g''(q)$,

and

$\partial^2 y/\partial t^2 = f''(p)V^2 + g''(q)V^2$.

Therefore,

$\partial^2 y/\partial t^2 = V^2(\partial^2 y/\partial x^2)$,

or

$\partial^2 y/\partial x^2 - (1/V^2)(\partial^2 y/\partial t^2) = 0$. (12.8)

This is the wave equation in one-dimensional space. For a wave propagating in three-dimensional space, we have

$\nabla^2\psi - (1/V^2)(\partial^2\psi/\partial t^2) = 0$, (12.9)

the general form of the wave equation, in which $\psi(x, y, z, t)$ is the general amplitude function.

12.3 The Lorentz invariant phase of a wave and the relativistic Doppler shift

A wave propagating through space and time has a wave function

$\psi(x, y, z, t) = A\cos(\omega t - \mathbf{k}\cdot\mathbf{r})$,

where the symbols have the meanings given in 12.1. The argument of this function can be written as follows:

$\psi = A\cos\{(\omega/c)(ct) - \mathbf{k}\cdot\mathbf{r}\}$. (12.10)

It was not until de Broglie developed his revolutionary idea of particle-wave duality in 1923-24 that the Lorentz invariance of the argument of this function was fully appreciated! We have

$\psi = A\cos\{[\omega/c, \mathbf{k}]^T[ct, -\mathbf{r}]\} = A\cos\{K^\mu E_\mu\} = A\cos\phi$, where $\phi$ is the phase. (12.11)

de Broglie recognized that the phase $\phi$ is a Lorentz invariant formed from the 4-vectors

$K^\mu = [\omega/c, \mathbf{k}]$, the frequency-wavelength 4-vector, (12.12)

and

$E_\mu = [ct, -\mathbf{r}]$, the covariant event 4-vector.

de Broglie's discovery turned out to be of great importance in the development of Quantum Physics.
It also provides us with the basic equations for an exact derivation of the relativistic Doppler shift. The frequency-wavelength vector is a Lorentz 4-vector, which means that it transforms between inertial observers in the standard way:

$K'^\mu = L\,K^\mu$, (12.13)

or

$\begin{pmatrix} \omega'/c \\ k_x' \\ k_y' \\ k_z' \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \omega/c \\ k_x \\ k_y \\ k_z \end{pmatrix}$.

The transformation of the first element therefore gives

$\omega'/c = \gamma(\omega/c) - \beta\gamma k_x$, (12.14)

so that

$2\pi\nu' = \gamma\,2\pi\nu - \beta\gamma c(2\pi/\lambda)$

or

$\nu' = \gamma\nu - V\gamma(\nu/c)$ (where $\omega = 2\pi\nu$, $V/c = \beta$ and $c = \nu\lambda$),

therefore

$\nu' = \gamma\nu(1 - \beta)$

or

$\nu' = (\nu/(1 - \beta^2)^{1/2})(1 - \beta)$,

giving

$\nu' = \nu\{(1 - \beta)/(1 + \beta)\}^{1/2}$. (12.15)

This is the relativistic Doppler shift for the special case of photons; we have Lorentz invariance in action. This result was derived in section 6.2 using the Lorentz invariance of the energy-momentum 4-vector, and the Planck-Einstein result $E = h\nu$ for the relation between the energy E and the frequency $\nu$ of a photon. The present derivation of the relativistic Doppler shift is independent of the Planck-Einstein result, and therefore provides an independent verification of their fundamental equation $E = h\nu$ for a photon.

12.4 Plane harmonic waves

The one-dimensional wave equation (12.8) has the solution

$y(t, x) = A\cos(kx - \omega t)$,

where $\omega = kV$ and A is independent of both x and t. This form is readily shown to be a solution of (12.8) by direct calculation of its 2nd partial derivatives, and their substitution in the wave equation.

The three-dimensional wave equation (12.9) has the solution

$\psi(t, x, y, z) = \psi_0\cos\{(k_x x + k_y y + k_z z) - \omega t\}$,

where $\omega = |\mathbf{k}|V$, and $\mathbf{k} = [k_x, k_y, k_z]$, the wave vector.
The solution $\psi(t, x, y, z)$ is called a plane harmonic wave because constant values of the argument $(k_x x + k_y y + k_z z) - \omega t$ define a set of planes in space, the surfaces of constant phase:

[Figure: equiphase surfaces of a plane wave drawn in x, y, z axes with origin O; the wave vector $\mathbf{k}$ is normal to the wave surface.]

It is often useful to represent a plane harmonic wave as the real part of the remarkable Cotes-Euler equation

$e^{i\theta} = \cos\theta + i\sin\theta$, $i = \sqrt{-1}$,

so that

$\psi_0\cos((\mathbf{k}\cdot\mathbf{r}) - \omega t) = \mathrm{R.P.}\ \psi_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$.

The complex form is readily shown to be a solution of the three-dimensional wave equation.

12.5 Spherical waves

For given values of the radial coordinate, r, and the time, t, the functions $\cos(kr - \omega t)$ and $e^{i(kr - \omega t)}$ have constant values on a sphere of radius r. In order for the wave functions to represent expanding spherical waves we must modify their forms as follows:

$(1/r)\cos(kr - \omega t)$ and $(1/r)e^{i(kr - \omega t)}$ ($\mathbf{k}$ along $\mathbf{r}$). (12.16)

These changes are needed to ensure that the wave functions are solutions of the wave equation.

To demonstrate that the spherical wave $(1/r)\cos(kr - \omega t)$ is a solution of (12.9), we must transform the Laplacian operator from Cartesian to polar coordinates,

$\nabla^2(x, y, z) \to \nabla^2(r, \theta, \phi)$.

The transformation is

$\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2 \to (1/r^2)[(\partial/\partial r)(r^2(\partial/\partial r)) + (1/\sin\theta)(\partial/\partial\theta)(\sin\theta(\partial/\partial\theta)) + (1/\sin^2\theta)(\partial^2/\partial\phi^2)]$. (12.17)

This transformation is set as a problem.

If there is spherical symmetry, there is no angular dependence, in which case

$\nabla^2(r) = (1/r^2)(\partial/\partial r)(r^2(\partial/\partial r)) = \partial^2/\partial r^2 + (2/r)(\partial/\partial r)$. (12.18)

We can check that

$\psi = \psi_0(1/r)\cos(kr - \omega t)$

is a solution of the radial form of (12.9). Differentiating twice, we find

$\partial^2\psi/\partial r^2 = \psi_0[-(k^2/r)\cos u + (2k/r^2)\sin u + (2/r^3)\cos u]$, where $u = kr - \omega t$,

and

$\partial^2\psi/\partial t^2 = -\psi_0(\omega^2/r)\cos u$, $\omega = kV$,

from which we obtain

$(1/V^2)\,\partial^2\psi/\partial t^2 - [\partial^2\psi/\partial r^2 + (2/r)\,\partial\psi/\partial r] = 0$.
(12.19)

12.6 The superposition of harmonic waves

Consider two harmonic waves with the same amplitudes, $\psi_0$, travelling in the same direction, the x-axis. Let their angular frequencies be slightly different, $\omega \pm \delta\omega$, with corresponding wavenumbers $k \pm \delta k$. Their resultant, $\Psi$, is given by

$\Psi = \psi_0 e^{i\{(k + \delta k)x - (\omega + \delta\omega)t\}} + \psi_0 e^{i\{(k - \delta k)x - (\omega - \delta\omega)t\}}$
$= \psi_0 e^{i(kx - \omega t)}[e^{i(\delta k\,x - \delta\omega\,t)} + e^{-i(\delta k\,x - \delta\omega\,t)}]$
$= \psi_0 e^{i(kx - \omega t)}[2\cos(\delta k\,x - \delta\omega\,t)]$
$= A\cos\phi$, (12.20)

where $A = 2\psi_0 e^{i(kx - \omega t)}$, the resultant amplitude, and $\phi = \delta k\,x - \delta\omega\,t$, the phase of the modulation envelope.

The individual waves travel at a speed

$\omega/k = v_\phi$, the phase velocity, (12.21)

and the modulation envelope travels at a speed

$\delta\omega/\delta k = v_G$, the group velocity. (12.22)

In the limit of a very large number of waves, each differing slightly in frequency from that of a neighbor, $\delta k \to 0$, in which case

$d\omega/dk = v_G$.

For electromagnetic waves travelling through a vacuum, $v_G = v_\phi = c$, the speed of light.

We shall not, at this stage, deal with the problem of the superposition of an arbitrary number of harmonic waves.

12.7 Standing waves

The superposition of two waves of the same amplitudes and frequencies but travelling in opposite directions has the form

$\Psi = \psi_1 + \psi_2 = A\cos(kx - \omega t) + A\cos(kx + \omega t) = 2A\cos(kx)\cos(\omega t)$. (12.23)

This form describes a standing wave that pulsates with angular frequency $\omega$, associated with the time-dependent term $\cos\omega t$.

In a travelling wave, the amplitudes of oscillation of all particles in the medium are the same and their phases depend on position. In a standing wave, the amplitudes depend on position and the phases are the same.

For standing waves, the amplitudes are a maximum when $kx = 0, \pi, 2\pi, 3\pi, \dots$ and they are a minimum when $kx = \pi/2, 3\pi/2, 5\pi/2, \dots$ (the nodes).

PROBLEMS

The main treatment of wave motion, including interference and diffraction effects, takes place in the second semester (Part 2) in discussing Electromagnetism and Optics.
12-1 Ripples on the surface of water with wavelengths of about one centimeter are found to have a phase velocity $v_\phi = \sqrt{\alpha k}$, where k is the wave number and $\alpha$ is a constant characteristic of water. Show that their group velocity is $v_G = (3/2)v_\phi$.

12-2 Show that $y(x, t) = \exp\{x - vt\}$ represents a travelling wave but not a periodic wave.

12-3 Two plane waves have the same frequency and they oscillate in the z-direction; they have the forms

$$\psi(x, t) = 4\sin\{20t + (\pi x/3) + \pi\}, \quad \text{and} \quad \psi(y, t) = 2\sin\{20t + (\pi y/4) + \pi\}.$$

Show that their superposition at x = 5 and y = 2 is given by

$$\psi(t) = 2.48\sin\{20t - (\pi/5)\}.$$

12-4 Express the standing wave $y = A\sin(ax)\sin(bt)$, where a and b are constants, as a combination of travelling waves.

12-5 Perhaps the most important application of the relativistic Doppler shift has been, and continues to be, the measurement of the velocities of recession of distant galaxies relative to the Earth. The electromagnetic radiation associated with ionized calcium atoms that escape from a galaxy in Hydra has a measured wavelength of $4750 \times 10^{-10}$ m, and this is to be compared with a wavelength of $3940 \times 10^{-10}$ m for the same process measured for a stationary source on Earth. Show that the measured wavelengths indicate that the galaxy in Hydra is receding from the Earth with a speed v = 0.187c.

13 ORTHOGONAL FUNCTIONS AND FOURIER SERIES

13.1 Definitions

Two n-vectors $A_n = [a_1, a_2, \ldots, a_n]$ and $B_n = [b_1, b_2, \ldots, b_n]$ are said to be orthogonal if

$$\sum_{i=1}^{n} a_i b_i = 0. \tag{13.1}$$

(Their scalar product is zero.) Two functions A(x) and B(x) are orthogonal in the range x = a to x = b if

$$\int_a^b A(x)B(x)\,dx = 0. \tag{13.2}$$

The limits must be given in order to specify the range in which the functions A(x) and B(x) are defined.

The set of real, continuous functions $\{\phi_1(x), \phi_2(x), \ldots\}$ is orthogonal in [a, b] if

$$\int_a^b \phi_m(x)\phi_n(x)\,dx = 0 \quad \text{for } m \neq n.$$
(13.3)

If, in addition,

$$\int_a^b \phi_n^2(x)\,dx = 1 \quad \text{for all } n, \tag{13.4}$$

the set is normal, and therefore it is said to be orthonormal.

The infinite set

$$\{\cos 0x, \cos 1x, \cos 2x, \ldots, \sin 0x, \sin 1x, \sin 2x, \ldots\} \tag{13.5}$$

in the range $[-\pi, \pi]$ of x is an example of an orthogonal set. For example,

$$\int_{-\pi}^{\pi} \cos x\cos 2x\,dx = 0, \text{ etc.}, \tag{13.6}$$

and

$$\int_{-\pi}^{\pi} \cos^2 x\,dx \neq 0 \ (= \pi), \text{ etc.}$$

This set, which is orthogonal in any interval of x of length $2\pi$, is of interest in Mathematics because a large class of functions of x can be expressed as linear combinations of the members of the set in the interval $2\pi$. For example, we can often write

$$\begin{aligned}
f(x) &= c_1\phi_1 + c_2\phi_2 + \ldots \quad \text{(where the c's are constants)} \\
&= a_0\cos 0x + a_1\cos 1x + a_2\cos 2x + \ldots + b_0\sin 0x + b_1\sin 1x + b_2\sin 2x + \ldots
\end{aligned} \tag{13.7}$$

A large class of periodic functions of period $2\pi$ can be expressed in this way. When a function can be expressed as a linear combination of the orthogonal set

$$\{1, \cos 1x, \cos 2x, \ldots, 0, \sin 1x, \sin 2x, \ldots\}$$

it is said to be expanded in its Fourier series.

13.2 Some trigonometric identities and their Fourier series

Some of the familiar trigonometric identities involve Fourier series. For example,

$$\cos 2x = 1 - 2\sin^2 x \tag{13.8}$$

can be written

$$\sin^2 x = (1/2) - (1/2)\cos 2x,$$

and this can be written

$$\sin^2 x = \{(1/2)\cos 0x + 0\cos 1x - (1/2)\cos 2x + 0\cos 3x + \ldots\} + 0\{\sin 0x + \sin 1x + \sin 2x + \ldots\}, \tag{13.9}$$

the Fourier series of $\sin^2 x$.

The Fourier series of $\cos^2 x$ is

$$\cos^2 x = (1/2) + (1/2)\cos 2x. \tag{13.10}$$

More complicated trigonometric identities also can be expanded in their Fourier series. For example, the identity

$$\sin 3x = 3\sin x - 4\sin^3 x$$

can be written

$$\sin^3 x = (3/4)\sin x - (1/4)\sin 3x, \tag{13.11}$$

and this is the Fourier series of $\sin^3 x$. The terms in the series represent the harmonics of the function $\sin^3 x$.
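The orthogonality relations (13.6) and the expansion (13.11) can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule for the integrals; the helper names `integrate` and `sin_cubed_series` are illustrative and do not come from the text.

```python
import math

def integrate(func, a, b, n=20000):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Orthogonality on [-pi, pi]: distinct members integrate to zero (13.6),
# while the square of a member does not (here the integral equals pi).
cross = integrate(lambda x: math.cos(x) * math.cos(2 * x), -math.pi, math.pi)
norm = integrate(lambda x: math.cos(x) ** 2, -math.pi, math.pi)

def sin_cubed_series(x):
    """Right-hand side of (13.11): the two harmonics of sin^3 x."""
    return 0.75 * math.sin(x) - 0.25 * math.sin(3 * x)

# Largest pointwise difference between sin^3 x and its Fourier series.
xs = [-math.pi + 0.01 * i for i in range(629)]
max_error = max(abs(math.sin(x) ** 3 - sin_cubed_series(x)) for x in xs)

print(cross, norm, max_error)
```

The first number should be negligibly small, the second close to pi, and the third at the level of rounding error, since (13.11) is an exact identity rather than a truncated series.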
The harmonics of $\sin^3 x$ are shown in the following diagram.

[Figure: The Fourier components of sin^3 x, plotted from -180 to 180 degrees.]

In a similar fashion, we find that the identity

$$\cos 3x = 4\cos^3 x - 3\cos x$$

can be rearranged to give the Fourier series of $\cos^3 x$:

$$\cos^3 x = (3/4)\cos x + (1/4)\cos 3x. \tag{13.12}$$

In general, a combination of de Moivre's theorem and the binomial theorem can be used to write cos(nx) and sin(nx) (for n a positive integer) in terms of powers of sin x and cos x. We have

$$\cos(nx) + i\sin(nx) = (\cos x + i\sin x)^n \quad (i = \sqrt{-1}) \quad \text{(de Moivre)} \tag{13.13}$$

and

$$(a + b)^n = a^n + na^{n-1}b + \frac{n(n-1)}{2!}a^{n-2}b^2 + \ldots + b^n. \tag{13.14}$$

For example, if n = 4, we obtain

$$\cos^4 x = (1/8)\cos 4x + (1/2)\cos 2x + (3/8), \tag{13.15}$$

and

$$\sin^4 x = (1/8)\cos 4x - (1/2)\cos 2x + (3/8). \tag{13.16}$$

13.3 Determination of the Fourier coefficients of a function

If, in the interval [a, b], the function f(x) can be expanded in terms of the set $\{\phi_1(x), \phi_2(x), \ldots\}$, which means that

$$f(x) = \sum_{i=1}^{\infty} c_i\phi_i(x), \tag{13.17}$$

where $\{\phi_1(x), \phi_2(x), \ldots\}$ is orthogonal in [a, b], then the coefficients can be evaluated as follows: to determine the kth coefficient, $c_k$, multiply f(x) by $\phi_k(x)$, and integrate over the interval [a, b]:

$$\begin{aligned}
\int_a^b f(x)\phi_k(x)\,dx &= \int_a^b c_1\phi_1\phi_k\,dx + \ldots + \int_a^b c_k\phi_k^2\,dx + \ldots \\
&= 0 + 0 + \ldots + (\neq 0) + 0 + \ldots
\end{aligned} \tag{13.18}$$

The integrals of the products $\phi_m\phi_n$ in the range [a, b] are all zero except for the case that involves $\phi_k^2$. We therefore obtain the kth coefficient

$$c_k = \int_a^b f(x)\phi_k(x)\,dx \bigg/ \int_a^b \phi_k^2(x)\,dx, \quad k = 1, 2, 3, \ldots \tag{13.19}$$

13.4 The Fourier series of a periodic sawtooth waveform

In standard works on Fourier analysis it is proved that every periodic continuous function f(x) of period $2\pi$ can be expanded in terms of $\{1, \cos x, \cos 2x, \ldots, 0, \sin x, \sin 2x, \ldots\}$; this orthogonal set is said to be complete with respect to the set of periodic continuous functions f(x) in [a, b].
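The projection formula (13.19) can be tried on a function whose expansion is already known. The sketch below, which assumes a midpoint rule for the integrals and uses illustrative helper names, projects $\cos^4 x$ onto cos(kx) for k = 0 to 4 on $[-\pi, \pi]$; the results should reproduce the coefficients 3/8, 1/2 and 1/8 of (13.15).

```python
import math

def integrate(func, a, b, n=20000):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def fourier_coefficient(f, basis, a=-math.pi, b=math.pi):
    """c_k = (integral of f*phi_k) / (integral of phi_k^2), as in (13.19)."""
    numerator = integrate(lambda x: f(x) * basis(x), a, b)
    denominator = integrate(lambda x: basis(x) ** 2, a, b)
    return numerator / denominator

def f(x):
    return math.cos(x) ** 4

# Project onto cos(kx) for k = 0..4; k is bound as a default argument so
# that each lambda keeps its own value of k.
coeffs = {k: fourier_coefficient(f, lambda x, k=k: math.cos(k * x))
          for k in range(5)}

for k, c in coeffs.items():
    print(k, round(c, 6))
```

Dividing by the integral of $\phi_k^2$ takes care of the different normalization of the constant term (whose squared integral is $2\pi$) and of the other cosines (whose squared integrals are $\pi$).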
Let f(x) be a periodic sawtooth waveform with an amplitude of 1:

[Figure: The sawtooth waveform f(x), of amplitude 1, plotted against x.]

The function has the following forms in the three intervals:

$$f(x) = \begin{cases}
(-2/\pi)(x + \pi) & \text{for } -\pi \le x \le -\pi/2, \\
2x/\pi & \text{for } -\pi/2 \le x \le \pi/2, \\
(-2/\pi)(x - \pi) & \text{for } \pi/2 \le x \le \pi.
\end{cases}$$

The periodicity means that $f(x + 2\pi) = f(x)$.

The function f(x) can be represented as a linear combination of the series $\{1, \cos x, \cos 2x, \ldots, \sin x, \sin 2x, \ldots\}$:

$$f(x) = a_0\cos 0x + a_1\cos 1x + a_2\cos 2x + \ldots + a_k\cos kx + \ldots + b_0\sin 0x + b_1\sin 1x + b_2\sin 2x + \ldots + b_k\sin kx + \ldots \tag{13.20}$$

The coefficients are given by

$$a_k = \int_{-\pi}^{\pi} \cos kx\,f(x)\,dx \bigg/ \int_{-\pi}^{\pi} \cos^2 kx\,dx = 0 \tag{13.21}$$

(f(x) is odd, cos kx is even, and $[-\pi, \pi]$ is symmetric about 0), and

$$\begin{aligned}
b_k &= \int_{-\pi}^{\pi} \sin kx\,f(x)\,dx \bigg/ \int_{-\pi}^{\pi} \sin^2 kx\,dx \neq 0 \\
&= \frac{1}{\pi}\left\{\int_{-\pi}^{-\pi/2} (-2/\pi)(x + \pi)\sin kx\,dx + \int_{-\pi/2}^{\pi/2} (2x/\pi)\sin kx\,dx + \int_{\pi/2}^{\pi} (-2/\pi)(x - \pi)\sin kx\,dx\right\} \\
&= \{8/(\pi k)^2\}\sin(k\pi/2).
\end{aligned} \tag{13.22}$$

The Fourier series of f(x) is therefore

$$f(x) = (8/\pi^2)\{\sin x - (1/3^2)\sin 3x + (1/5^2)\sin 5x - (1/7^2)\sin 7x + \ldots\}.$$

[Figure: Sum of four Fourier components of a sawtooth waveform, plotted from 0 to 180 degrees.]

The above procedure can be generalized to include functions that are not periodic. The sum of discrete Fourier components then becomes an integral of the amplitude of the component of angular frequency $\omega = 2\pi\nu$ with respect to $\omega$. This is a subject covered in the more advanced treatments of Physics.

PROBLEMS

13-1 Use de Moivre's theorem and the binomial theorem to obtain the Fourier expansions:

1) $\cos^4 x = 3/8 + (1/2)\cos 2x + (1/8)\cos 4x$, and

2) $\sin^4 x = 3/8 - (1/2)\cos 2x + (1/8)\cos 4x$.

Plot these components (harmonics) and their sums for $-\pi \le x \le \pi$.

13-2 Use the method of integration of orthogonal functions to obtain the Fourier series of problem 13-1; you should obtain the same results as above!
13-3 Show that 1) if $f(-x) = -f(x)$, only sine functions occur in the Fourier series for f(x), and 2) if $f(-x) = f(x)$, only cosine functions occur in the Fourier series for f(x).

13-4 The Fourier series of a function f(t) that is a periodic repetition outside (-T, T) of the shape inside, with period 2T, is often written in the form

$$f(t) = (a_0/2) + \sum_{n=1}^{\infty}\{a_n\cos(n\pi t/T) + b_n\sin(n\pi t/T)\},$$

where

$$a_n = (1/T)\int_{-T}^{T} f(t)\cos(n\pi t/T)\,dt,$$

and

$$b_n = (1/T)\int_{-T}^{T} f(t)\sin(n\pi t/T)\,dt.$$

If f(t) is a periodic square wave:

f(t) = 3 for 0 < t < 5 μs
     = 0 for 5 < t < 10 μs, with period 2T = 10 μs

[Figure: The square wave f(t), of height 3, plotted for t from 0 to 10 μs.]

obtain the Fourier series:

$$f(t) = (3/2) + (3/\pi)\sum_{n=1}^{\infty}[(1 - \cos n\pi)/n]\sin(n\pi t/5).$$

Compute this series for n = 1 to 5 and -5 < t < 5, and compare the truncated series with the exact waveform.

13-5 It is interesting to note that the series in 13-4 converges to the exact value f(t) = 3 at the value t = 5/2 μs, so that

$$3 = (3/2) + (3/\pi)\sum_{n=1}^{\infty}[(1 - \cos n\pi)/n]\sin(n\pi/2).$$

Use this result to obtain the important Gregory-Leibniz infinite series:

$$\pi/4 = 1 - (1/3) + (1/5) - (1/7) + \ldots$$

Appendix A

Solving ordinary differential equations

Typical dynamical equations of Physics are

1) Force in the x-direction = mass × acceleration in the x-direction, with the mathematical form

$$F_x = ma_x = m\,d^2x/dt^2,$$

and

2) The amplitude y(x, t) of a wave at (x, t), travelling at constant speed V along the x-axis, with the mathematical form

$$(1/V^2)\,\partial^2 y/\partial t^2 - \partial^2 y/\partial x^2 = 0.$$
Such equations, which involve differential coefficients, are called differential equations.

An equation of the form

$$f(x, y(x), dy(x)/dx;\ a_r) = 0 \tag{A.1}$$

that contains

i) a variable y that depends on a single, independent variable x,

ii) a first derivative dy(x)/dx, and

iii) constants, $a_r$,

is called an ordinary (a single independent variable) differential equation of the first order (a first derivative, only).

An equation of the form

$$f\left(x_1, x_2, \ldots, x_n;\ y(x_1, x_2, \ldots, x_n);\ \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots, \frac{\partial y}{\partial x_n};\ \frac{\partial^2 y}{\partial x_1^2}, \frac{\partial^2 y}{\partial x_2^2}, \ldots, \frac{\partial^2 y}{\partial x_n^2};\ \ldots;\ \frac{\partial^n y}{\partial x_1^n}, \frac{\partial^n y}{\partial x_2^n}, \ldots, \frac{\partial^n y}{\partial x_n^n};\ a_1, a_2, \ldots, a_r\right) = 0 \tag{A.2}$$

that contains

i) a variable y that depends on n independent variables $x_1, x_2, \ldots, x_n$,

ii) the 1st, 2nd, ..., nth-order partial derivatives $\partial y/\partial x_1, \ldots, \partial^2 y/\partial x_1^2, \ldots, \partial^n y/\partial x_1^n, \ldots$, and

iii) r constants, $a_1, a_2, \ldots, a_r$,

is called a partial differential equation of the nth order.

Some of the techniques for solving ordinary linear differential equations are given in this appendix.

An ordinary differential equation is formed from a particular functional relation, $f(x, y; a_1, a_2, \ldots, a_n)$, that involves n arbitrary constants. Successive differentiations of f with respect to x yield n relationships involving x, y, and the first n derivatives of y with respect to x, and some (or possibly all) of the n constants. There are (n + 1) relationships from which the n constants can be eliminated. The result will involve $d^n y/dx^n$, differential coefficients of lower orders, together with x and y, and no arbitrary constants.

Consider, for example, the standard equation of a parabola:

$$y^2 - 4ax = 0,$$

where a is a constant. Differentiating gives

$$2y(dy/dx) - 4a = 0,$$

so that

$$y - 2x(dy/dx) = 0,$$

a differential equation that does not contain the constant a.

As another example, consider the equation

$$f(x, y, a, b, c) = x^2 + y^2 + ax + by + c = 0.$$
Differentiating three times successively, with respect to x, gives

1) $2x + 2y(dy/dx) + a + b(dy/dx) = 0$,

2) $2 + 2\{y(d^2y/dx^2) + (dy/dx)^2\} + b(d^2y/dx^2) = 0$, and

3) $2\{y(d^3y/dx^3) + (d^2y/dx^2)(dy/dx)\} + 4(dy/dx)(d^2y/dx^2) + b(d^3y/dx^3) = 0$.

Eliminating b from 2) and 3),

$$(d^3y/dx^3)\{1 + (dy/dx)^2\} = 3(dy/dx)(d^2y/dx^2)^2.$$

The most general solution of an ordinary differential equation of the nth order contains n arbitrary constants. The solution that contains all the arbitrary constants is called the complete primitive. If a solution is obtained from the complete primitive by giving definite values to the constants, then the (non-unique) solution is called a particular integral.

Equations of the 1st order and degree

The equation

$$M(x, y)(dy/dx) + N(x, y) = 0 \tag{A.3}$$

is separable if M/N can be reduced to the form $f_1(y)/f_2(x)$, where $f_1$ does not involve x, and $f_2$ does not involve y.

Specific cases that are met are:

i) y absent in M and N, so that M and N are functions of x only; Eq. (A.3) then can be written

$$(dy/dx) = -(N/M) = F(x),$$

therefore

$$y = \int F(x)\,dx + C,$$

where C is a constant of integration.

ii) x absent in M and N. Eq. (A.3) then becomes

$$(M/N)(dy/dx) = -1,$$

so that

$$F(y)(dy/dx) = -1 \quad (M/N = F(y)),$$

therefore

$$x = -\int F(y)\,dy + C.$$

iii) x and y present in M and N, but the variables are separable. Put M/N = f(y)/g(x); then Eq. (A.3) becomes

$$f(y)(dy/dx) + g(x) = 0.$$

Integrating over x,

$$\int f(y)(dy/dx)\,dx + \int g(x)\,dx = 0,$$

or

$$\int f(y)\,dy + \int g(x)\,dx = 0.$$

For example, consider the differential equation

$$x(dy/dx) + \cot y = 0.$$

This can be written

$$(\sin y/\cos y)(dy/dx) + 1/x = 0.$$

Integrating, and putting the constant of integration C = ln D,

$$\int (\sin y/\cos y)\,dy + \int (1/x)\,dx = \ln D,$$

so that

$$-\ln(\cos y) + \ln x = \ln D,$$

or

$$\ln(x/\cos y) = \ln D.$$

The solution is therefore

$$y = \cos^{-1}(x/D).$$
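The solution of the separable example can be verified by substituting it back into x(dy/dx) + cot y = 0. The following is a minimal numerical sketch, assuming an illustrative value D = 2 for the constant and a central-difference estimate of dy/dx; the function names are hypothetical.

```python
import math

D = 2.0  # illustrative value of the integration constant (an assumption)

def y(x):
    """Candidate solution y = arccos(x/D), valid for |x| < D."""
    return math.acos(x / D)

def residual(x, h=1e-6):
    """Left-hand side x*(dy/dx) + cot(y), with dy/dx from a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    cot_y = math.cos(y(x)) / math.sin(y(x))
    return x * dydx + cot_y

# The residual should vanish (to within the finite-difference error)
# at any x inside the domain 0 < x < D.
residuals = [residual(x) for x in (0.2, 0.5, 1.0, 1.5)]
print(residuals)
```

Analytically, dy/dx = -1/sqrt(D^2 - x^2) and cot y = x/sqrt(D^2 - x^2), so the two terms cancel exactly; the small numbers printed reflect only the finite-difference approximation.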
Exact equations

The equation y dx + x dy = 0 is said to be exact because it can be written as d(xy) = 0, or xy = constant.

Consider the non-exact equation

$$(\tan y)\,dx + (\tan x)\,dy = 0.$$

We see that it can be made exact by multiplying throughout by cos x cos y, giving

$$\sin y\cos x\,dx + \sin x\cos y\,dy = 0 \quad \text{(exact)},$$

so that

$$d(\sin y\sin x) = 0,$$

or

$$\sin y\sin x = \text{constant}.$$

The term cos x cos y is called an integrating factor.

Homogeneous differential equations

A homogeneous expression of the nth degree in x and y is such that the sum of the powers of x and y in every term is n. For example, $x^2y + 2xy^2 + 3y^3$ is a homogeneous expression of the third degree. If, in the differential equation

$$M(dy/dx) + N = 0,$$

the terms M and N are homogeneous functions of x and y, of the same degree, then we have a homogeneous differential equation of the 1st order and degree. The differential equation then reduces to

$$dy/dx = -(N/M) = F(y/x).$$

To find whether or not a function F(x, y) can be written F(y/x), put y = vx. If the result is F(v) (all x's cancel) then F is homogeneous. For example,

$$dy/dx = (x^2 + y^2)/2x^2 \to dy/dx = (1 + v^2)/2 = F(v),$$

therefore the equation is homogeneous.

Since $dy/dx \to F(v)$ by putting y = vx on the right-hand side of the equation, we make the same substitution on the left-hand side to obtain

$$v + x(dv/dx) = (1 + v^2)/2,$$

therefore

$$2x\,dv = (1 + v^2 - 2v)\,dx.$$

Separating the variables,

$$2\,dv/(v - 1)^2 = dx/x,$$

and this can be integrated.

Linear Equations

The equation

$$dy/dx + M(x)y = N(x)$$

is said to be linear and of the 1st order. An example of such an equation is

$$dy/dx + (1/x)y = x^2.$$

This equation can be solved by introducing the integrating factor, x, so that

$$x(dy/dx) + y = x^3,$$

therefore

$$(d/dx)(xy) = x^3,$$

giving

$$xy = x^4/4 + \text{constant}.$$
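The result xy = x^4/4 + constant can be compared with a direct numerical integration of dy/dx + (1/x)y = x^2. The sketch below assumes an illustrative initial condition y(1) = 1 and uses a standard fourth-order Runge-Kutta step written out by hand; neither the initial condition nor the function names come from the text.

```python
def rhs(x, y):
    """dy/dx obtained from dy/dx + (1/x)*y = x**2."""
    return x * x - y / x

def rk4(f, x0, y0, x1, steps=1000):
    """Integrate dy/dx = f(x, y) from x0 to x1 with classical Runge-Kutta."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def closed_form(x, constant):
    """y = x**3/4 + constant/x, from x*y = x**4/4 + constant."""
    return x ** 3 / 4 + constant / x

x0, y0 = 1.0, 1.0                 # illustrative initial condition (an assumption)
constant = x0 * y0 - x0 ** 4 / 4  # fixes the constant of integration
numeric = rk4(rhs, x0, y0, 2.0)
exact = closed_form(2.0, constant)
print(numeric, exact)
```

With this initial condition the constant is 3/4, and the two values at x = 2 should agree to many decimal places.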
In general, let R be an integrating factor, then

$$R(dy/dx) + RMy = RN,$$

in which case the left-hand side is the differential coefficient of some product with a first term R(dy/dx). The product must be Ry! Put, therefore,

$$R(dy/dx) + RMy = (d/dx)(Ry) = R(dy/dx) + y(dR/dx).$$

Now,

$$RMy = y(dR/dx),$$

which leads to

$$\int M(x)\,dx = \int dR/R = \ln R,$$

or

$$R = \exp\left\{\int M(x)\,dx\right\}.$$

We therefore have the following procedure: to solve the differential equation

$$(dy/dx) + M(x)y = N(x),$$

multiply each side by the integrating factor $\exp\{\int M(x)\,dx\}$, and integrate.

For example, let

$$(dy/dx) + (1/x)y = x^2,$$

so that

$$\int M(x)\,dx = \int (1/x)\,dx = \ln x,$$

and the integrating factor is exp{ln x} = x. We therefore obtain the equation

$$x(dy/dx) + y = x^3,$$

deduced previously on intuitive grounds.

Linear Equations with Constant Coefficients

Consider the 1st-order linear differential equation

$$p_0(dy/dx) + p_1y = 0,$$

where $p_0$, $p_1$ are constants. Writing this as

$$p_0(dy/y) + p_1\,dx = 0,$$

we can integrate term by term, so that

$$p_0\ln y + p_1x = \text{constant},$$

therefore

$$\ln y = -(p_1/p_0)x + \text{constant} = -(p_1/p_0)x + \ln A, \text{ say},$$

therefore

$$y = A\exp\{-(p_1/p_0)x\}.$$

Linear differential equations with constant coefficients of the 2nd order occur often in Physics. They are typified by the form

$$p_0(d^2y/dx^2) + p_1(dy/dx) + p_2y = 0.$$

The solution of an equation of this form is obtained by following the insight gained in solving the 1st-order equation. We try a solution of the type

$$y = A\exp\{mx\},$$

so that the equation is

$$A\exp\{mx\}(p_0m^2 + p_1m + p_2) = 0.$$

If m is a root of

$$p_0m^2 + p_1m + p_2 = 0,$$

then y = A exp{mx} is a solution of the original equation for all values of A.

Let the roots be $\alpha$ and $\beta$. If $\alpha \neq \beta$ there are two solutions

$$y = A\exp\{\alpha x\} \quad \text{and} \quad y = B\exp\{\beta x\}.$$
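The trial-solution argument can be illustrated numerically: find the roots of the quadratic in m and confirm that exp{mx} then satisfies the 2nd-order equation. This is a minimal sketch with illustrative coefficients $p_0 = 1$, $p_1 = 3$, $p_2 = 2$ (an assumption, chosen only because the roots are simple); the function names are hypothetical.

```python
import cmath

def quadratic_roots(p0, p1, p2):
    """Roots of p0*m**2 + p1*m + p2 = 0; complex arithmetic covers all cases."""
    disc = cmath.sqrt(p1 * p1 - 4 * p0 * p2)
    return (-p1 + disc) / (2 * p0), (-p1 - disc) / (2 * p0)

def residual(m, x, p0, p1, p2, A=1.0):
    """p0*y'' + p1*y' + p2*y for y = A*exp(m*x), using y' = m*y, y'' = m*m*y."""
    y = A * cmath.exp(m * x)
    return p0 * m * m * y + p1 * m * y + p2 * y

p0, p1, p2 = 1.0, 3.0, 2.0  # illustrative coefficients (an assumption)
alpha, beta = quadratic_roots(p0, p1, p2)

# Each root gives a solution: the residual vanishes for any x and any A.
checks = [abs(residual(m, x, p0, p1, p2)) for m in (alpha, beta)
          for x in (0.0, 0.5, 1.0)]
print(alpha, beta, checks)
```

For these coefficients the roots are -1 and -2. Using `cmath` rather than `math` means the same sketch also handles a negative discriminant, where the roots form a complex-conjugate pair.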
If we put

$$y = A\exp\{\alpha x\} + B\exp\{\beta x\}$$

in the original equation then

$$A\exp\{\alpha x\}(p_0\alpha^2 + p_1\alpha + p_2) + B\exp\{\beta x\}(p_0\beta^2 + p_1\beta + p_2) = 0,$$

which is true as $\alpha$ and $\beta$ are the roots of

$$p_0m^2 + p_1m + p_2 = 0$$

(called the auxiliary equation).

The original equation is linear, therefore the sum of the two solutions is, itself, a (third) solution. The third solution contains two arbitrary constants (the order of the equation), and it is therefore the general solution.

As an example of the method, consider solving the equation

$$2(d^2y/dx^2) + 5(dy/dx) + 2y = 0.$$

Put y = A exp{mx} as a trial solution, then

$$A\exp\{mx\}(2m^2 + 5m + 2) = 0,$$

so that m = -2 or -1/2, therefore the general solution is

$$y = A\exp\{-2x\} + B\exp\{-(1/2)x\}.$$

If the roots of the auxiliary equation are complex, then

$$y = A\exp\{(p + iq)x\} + B\exp\{(p - iq)x\},$$

where the roots are $p \pm iq$ ($p, q \in \mathbb{R}$).

In practice, we write

$$y = \exp\{px\}[E\cos qx + F\sin qx],$$

where E and F are arbitrary constants.

For example, consider the solution of the equation

$$d^2y/dx^2 - 6(dy/dx) + 13y = 0,$$

therefore

$$m^2 - 6m + 13 = 0,$$

so that

$$m = 3 \pm 2i.$$

We therefore have

$$y = A\exp\{(3 + 2i)x\} + B\exp\{(3 - 2i)x\} = \exp\{3x\}(E\cos 2x + F\sin 2x).$$

The general solution of a linear differential equation with constant coefficients is the sum of a particular integral and the complementary function (obtained by putting zero for the function of x that appears in the original equation).

BIBLIOGRAPHY

Those books that have had an important influence on the subject matter and the style of this book are recognized with the symbol *. I am indebted to the many authors for providing a source of fundamental knowledge that I have attempted to absorb in a process of continuing education over a period of fifty years.

General Physics

*Feynman, R. P., Leighton, R.
B., and Sands, M., The Feynman Lectures on Physics, 3 vols., Addison-Wesley Publishing Company, Reading, MA (1964).

*Joos, G., Theoretical Physics, Dover Publications, Inc., New York, 3rd edn (1986).

Lindsay, R. B., Concepts and Methods of Theoretical Physics, Van Nostrand Company, Inc., New York (1952).

Mathematics

Armstrong, M. A., Groups and Symmetry, Springer-Verlag, New York (1988).

*Caunt, G. W., An Introduction to Infinitesimal Calculus, The Clarendon Press, Oxford (1949).

*Courant, R., and John, F., Introduction to Calculus and Analysis, 2 vols., John Wiley & Sons, New York (1974).

Kline, M., Mathematical Thought from Ancient to Modern Times, Oxford University Press, Oxford (1972).

*Margenau, H., and Murphy, G. M., The Mathematics of Physics and Chemistry, Van Nostrand Company, Inc., New York, 2nd edn (1956).

Mirsky, L., An Introduction to Linear Algebra, Dover Publications, Inc., New York (1982).

*Piaggio, H. T. H., An Elementary Treatise on Differential Equations, G. Bell & Sons, Ltd., London (1952).

Samelson, H., An Introduction to Linear Algebra, John Wiley & Sons, New York (1974).

Stephenson, G., An Introduction to Matrices, Sets and Groups for Science Students, Dover Publications, Inc., New York (1986).

Yourgrau, W., and Mandelstam, S., Variational Principles in Dynamics and Quantum Theory, Dover Publications, Inc., New York (1979).

Dynamics

Becker, R. A., Introduction to Theoretical Mechanics, McGraw-Hill Book Company, Inc., New York (1954).

Byerly, W. E., An Introduction to the Use of Generalized Coordinates in Mechanics and Physics, Dover Publications, Inc., New York (1965).

Kilmister, C. W., Lagrangian Dynamics: an Introduction for Students, Plenum Press, New York (1967).

*Ramsey, A. S., Dynamics Part I, Cambridge University Press, Cambridge (1951).

*Routh, E. J., Dynamics of a System of Rigid Bodies, Dover Publications, Inc., New York (1960).

Whittaker, E.
T., A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Cambridge University Press, Cambridge (1961). This is a classic work that goes well beyond the level of the present book. It is, nonetheless, well worth consulting to see what lies ahead!

Relativity and Gravitation

*Einstein, A., The Principle of Relativity, Dover Publications, Inc., New York (1952). A collection of the original papers on the Special and General Theories of Relativity.

Dixon, W. G., Special Relativity, Cambridge University Press, Cambridge (1978).

French, A. P., Special Relativity, W. W. Norton & Company, Inc., New York (1968).

Kenyon, I. R., General Relativity, Oxford University Press, Oxford (1990).

Lucas, J. R., and Hodgson, P. E., Spacetime and Electromagnetism, Oxford University Press, Oxford (1990).

*Ohanian, H. C., Gravitation and Spacetime, W. W. Norton & Company, Inc., New York (1976).

*Rindler, W., Introduction to Special Relativity, Oxford University Press, Oxford, 2nd edn (1991).

Rosser, W. G. V., Introductory Relativity, Butterworth & Co. Ltd., London (1967).

Non-Linear Dynamics

*Baker, G. L., and Gollub, J. P., Chaotic Dynamics, Cambridge University Press, Cambridge (1991).

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., Numerical Recipes in C, Cambridge University Press, Cambridge, 2nd edn (1992).

Waves

Crawford, F. S., Waves (Berkeley Physics Series, vol. 3), McGraw-Hill Book Company, Inc., New York (1968).

French, A. P., Vibrations and Waves, W. W. Norton & Company, Inc., New York (1971).

General reading

Bronowski, J., The Ascent of Man, Little, Brown and Company, Boston (1973).

Calder, N., Einstein's Universe, The Viking Press, New York (1979).

Davies, P. C. W., Space and Time in the Modern Universe, Cambridge University Press, Cambridge (1977).

Schrier, E. W., and Allman, W. F., eds., Newton at the Bat, Charles Scribner's Sons, New York (1984).