Manifestly Covariant Lagrangians, Classical Particles with Spin, and the Origins of Gauge Invariance

Jacob A. Barandes1, ∗
1Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138
(Dated: September 7, 2020)

In this paper, we review a general technique for converting the standard Lagrangian description of a classical system into a formulation that puts time on an equal footing with the system's degrees of freedom. We show how the resulting framework anticipates key features of special relativity, including the signature of the Minkowski metric tensor and the special role played by theories that are invariant under a generalized notion of Lorentz transformations. We then use this technique to revisit a classification of classical particle-types that mirrors Wigner's classification of quantum particle-types in terms of irreducible representations of the Poincaré group, including the cases of massive particles, massless particles, and tachyons. Along the way, we see gauge invariance naturally emerge in the context of classical massless particles with nonzero spin, as well as study the massless limit of a massive particle and derive a classical-particle version of the Higgs mechanism.

I. INTRODUCTION

The Lagrangian formulation of classical physics provides an elegant and powerful set of techniques for analyzing the behavior of physical systems. For classical fields, it is customary to employ Lagrangians that make the symmetries of special relativity manifest, but textbook treatments of mechanical systems tend to treat time and energy very differently from degrees of freedom and momenta. In this paper, we cast new light on a technique for resolving this shortcoming. Among its useful features, we show that this framework anticipates key aspects of special relativity, like the signature of the Minkowski metric tensor and the special role played by classical systems that exhibit generalizations of Lorentz invariance.
Extending earlier work, including [1–3], we then present a fully classical version of Wigner's famous classification [4] of quantum particles into general types: massive, massless, and tachyonic. In close parallel with Wigner's construction, which is based on identifying the Hilbert spaces of quantum particles with irreducible representations of the Poincaré group, our classification of classical particle-types consists of identifying their phase spaces with "irreducible" (or, more properly, transitive) group actions of the Poincaré group. Our classical particles generically possess fixed total spin but without spin quantization, and therefore correspond to the limit of large spin quantum numbers.

Along the way, and as a case study in how kinematics can determine dynamics, we show that the structure of these phase spaces leads to a simple Lagrangian formulation that can handle both massive and massless particles and that neatly accommodates spin.

In addition, by paying careful attention to the compactness properties of these phase spaces at fixed energy, we show that physically acceptable massless particles with spin feature a classical point-particle manifestation of gauge invariance that is deeply connected to the gauge invariance of electromagnetism, meaning that this form of gauge invariance is not solely a property of classical field theory or of relativistic quantum mechanics. By studying the relationship between the massive and massless cases through the massless limit, we also derive a classical version of the Higgs mechanism.

∗ barandes@physics.harvard.edu

II. THE LAGRANGIAN FORMULATION

We start with a brief review of general classical systems and their standard Lagrangian formulation [5]. Afterward, we will turn to the development of a manifestly covariant approach.

A.
Classical Systems

In general, a classical system consists of a configuration space whose points denote the possible "snapshots" that the system can occupy, together with a list of rules or laws that determine how the system's instantaneous configuration is allowed to evolve.

If $q_\alpha$ are a collection of independent numerical coordinates that label the points in the system's configuration space, with $\alpha$ an index distinguishing the different coordinates, then we call $q_\alpha$ a set of degrees of freedom for the system. We will assume for simplicity that we can cover the entire configuration space with a single such coordinate system, apart from possible regions of measure zero where the coordinates are not well-defined.

A candidate trajectory of the system is an arbitrary continuous path through the system's configuration space, and is conveniently defined by specifying the system's degrees of freedom $q_\alpha(t)$ as functions of a real-valued parameter $t$ called the time. The system's rates of change are then denoted by $\dot{q}_\alpha(t)$, where dots denote derivatives with respect to $t$:
\[ \dot{q}_\alpha(t) \equiv \frac{dq_\alpha(t)}{dt}, \] (1)
\[ \ddot{q}_\alpha(t) \equiv \frac{d^2 q_\alpha(t)}{dt^2}, \] (2)
and so forth. Altogether, the system's configuration space, a choice of degrees of freedom $q_\alpha$, and all the system's candidate trajectories make up the system's kinematics.

On the other hand, the rules that govern which candidate trajectories are physical trajectories that the system can actually follow make up the system's dynamics. In the simplest cases, these rules take the form of first- or second-order differential equations of the form
\[ f_\alpha(q, \dot{q}, \ddot{q}) = 0, \] (3)
which are called the system's equations of motion.

As a simple example, consider a Newtonian particle of constant mass $m$ in an inertial reference frame in three spatial dimensions.
At the level of kinematics, the particle has a three-dimensional configuration space isomorphic to $\mathbb{R}^3$, and three degrees of freedom $q_x, q_y, q_z$ that make up the particle's position vector $\mathbf{X}$ in Cartesian coordinates:
\[ \mathbf{X} \equiv (X, Y, Z) \equiv (q_x, q_y, q_z). \] (4)
At the level of dynamics, we assume a given force vector
\[ \mathbf{F} \equiv (F_x, F_y, F_z), \] (5)
in which case the system's equations of motion make up the three components of Newton's second law,
\[ \mathbf{F} = m\mathbf{a}, \] (6)
where $\mathbf{a}$ is the system's acceleration vector:
\[ \mathbf{a} \equiv \ddot{\mathbf{X}} = (\ddot{X}, \ddot{Y}, \ddot{Z}). \] (7)

B. The Lagrangian Formulation

Returning again to the case of a general classical system, let $L(q, \dot{q}, t)$, assumed to have units of energy, be a function of the system's degrees of freedom $q_\alpha$, its rates of change $\dot{q}_\alpha$, and the time $t$, which are all independent variables if we do not specify a candidate trajectory.

On the other hand, if we are given a candidate trajectory $q_\alpha(t)$ from an arbitrary initial time $t_A$ to an arbitrary final time $t_B$, then the degrees of freedom $q_\alpha(t)$ and their rates of change $\dot{q}_\alpha(t)$ become functions of $t$, and we can define an integral of $L(q(t), \dot{q}(t), t)$ over time:
\[ S[q] \equiv \int_{t_A}^{t_B} dt\, L(q(t), \dot{q}(t), t). \] (8)
The bracketed argument $[q]$ in this notation indicates that $S[q]$ is a functional of the system's candidate trajectory, meaning that $S[q]$ depends on the infinite continuum of real numbers that make up the entire candidate trajectory $q_\alpha(t)$.

If we extremize $S[q]$ over all candidate trajectories that share the same initial and final conditions,
\[ \delta S[q] = 0, \quad \text{with } q_\alpha(t_A) \text{ and } q_\alpha(t_B) \text{ held fixed for all } \alpha, \] (9)
then, as we will review in detail, we obtain the Euler-Lagrange equations,
\[ \frac{\partial L}{\partial q_\alpha} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_\alpha}\right) = 0, \] (10)
which are typically second-order in the time $t$. If the Euler-Lagrange equations collectively turn out to be equivalent to the system's equations of motion (3), then we respectively call $L = L(q, \dot{q}, t)$ and $S[q]$ a Lagrangian and an action functional for the system, and we say that $S[q] \equiv \int dt\, L$ provides a Lagrangian formulation for the system.
(Note that $L$ and $S[q]$ are generally not unique.)

Deriving the Euler-Lagrange equations from the extremization condition (9), known as Hamilton's principle or the principle of least action, takes just a few steps, and will be an illustrative exercise before we generalize the construction later on.

We start by varying the system's candidate trajectory $q_\alpha(t)$ according to
\[ q_\alpha(t) \mapsto q_\alpha(t) + \delta q_\alpha(t), \] (11)
where the variations $\delta q_\alpha(t)$ are infinitesimal functions of the time $t$ that are assumed to vanish at the endpoints of the system's trajectory in keeping with (9),
\[ \delta q_\alpha(t_A) = 0, \qquad \delta q_\alpha(t_B) = 0, \] (12)
but are otherwise arbitrary and independent. Taking a time derivative of the variation rule (11) yields the corresponding variations in the system's rates of change $\dot{q}_\alpha(t)$:
\[ \dot{q}_\alpha(t) = \frac{dq_\alpha(t)}{dt} \mapsto \frac{d(q_\alpha(t) + \delta q_\alpha(t))}{dt} = \dot{q}_\alpha(t) + \frac{d}{dt}\delta q_\alpha(t). \] (13)
We infer that the induced variation in $\dot{q}_\alpha(t)$ is precisely the time derivative of the variation in $q_\alpha(t)$,
\[ \delta\dot{q}_\alpha(t) = \frac{d}{dt}\delta q_\alpha(t), \] (14)
so, loosely speaking, the variation operator $\delta$ "commutes" with the time derivative $d/dt$.

Applying the extremization condition (9), using the chain rule, taking an integration by parts, and dropping boundary terms that vanish by the assumption that the variations vanish at the initial and final times, we find
\[ \begin{aligned} \delta S[q] &\equiv \int dt\, L(q + \delta q, \dot{q} + \delta\dot{q}, t) - \int dt\, L(q, \dot{q}, t) \\ &= \int dt \sum_\alpha \left( \frac{\partial L}{\partial q_\alpha}\delta q_\alpha + \frac{\partial L}{\partial \dot{q}_\alpha}\delta\dot{q}_\alpha \right) \\ &= \int dt \sum_\alpha \left( \frac{\partial L}{\partial q_\alpha}\delta q_\alpha + \frac{\partial L}{\partial \dot{q}_\alpha}\frac{d}{dt}\delta q_\alpha \right) \\ &= \int dt \sum_\alpha \left( \frac{\partial L}{\partial q_\alpha} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_\alpha}\right) \right)\delta q_\alpha = 0. \end{aligned} \] (15)
Because the infinitesimal variations $\delta q_\alpha(t)$ are assumed to be arbitrary and independent within the domain of integration, we conclude that the factor in parentheses must be zero, so we end up with the Euler-Lagrange equations (10), as claimed.

As an example, consider a Newtonian particle of mass $m$ and position vector $\mathbf{X} \equiv (X, Y, Z)$ with kinetic energy
\[ T(\dot{\mathbf{X}}) = \tfrac{1}{2} m \dot{\mathbf{X}}^2 = \tfrac{1}{2} m (\dot{X}^2 + \dot{Y}^2 + \dot{Z}^2) \] (16)
and subject to a conservative force
\[ \mathbf{F} = -\nabla V = \left( -\frac{\partial V}{\partial X}, -\frac{\partial V}{\partial Y}, -\frac{\partial V}{\partial Z} \right) \] (17)
corresponding to a potential energy $V(\mathbf{X}) = V(X, Y, Z)$.
If we choose the Lagrangian
\[ L(\mathbf{X}, \dot{\mathbf{X}}) \equiv T - V = \tfrac{1}{2} m \dot{\mathbf{X}}^2 - V(\mathbf{X}), \] (18)
then the Euler-Lagrange equations (10) with $\mathbf{X} = (X, Y, Z) = (q_x, q_y, q_z)$ give
\[ \frac{\partial L}{\partial X_i} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{X}_i}\right) = -\frac{\partial V}{\partial X_i} - m\ddot{X}_i = 0, \]
which replicate the three components of Newton's second law (6), $\mathbf{F} = m\mathbf{a}$.

Notice also that the object's momentum
\[ \mathbf{p} \equiv (p_x, p_y, p_z) \equiv m\dot{\mathbf{X}} \] (19)
is related to the Lagrangian (18) by
\[ p_i = m\dot{X}_i = \frac{\partial L}{\partial \dot{X}_i}, \] (20)
and that the object's total mechanical energy
\[ E \equiv T + V \] (21)
is related to $\mathbf{p}$ and $L$ by
\[ E = \tfrac{1}{2} m \dot{\mathbf{X}}^2 + V(\mathbf{X}) = \frac{\mathbf{p}^2}{2m} + V(\mathbf{X}) = \mathbf{p}\cdot\dot{\mathbf{X}} - L. \] (22)

For a generic physical system that may not resemble a Newtonian object, we might not have an obvious choice for defining the system's momenta and energy. The formulas at the end of (20) and at the end of (22) have the virtue of being general and of leading to quantities $p_i$ and $E$ that, as we will see shortly, are respectively conserved if the system's action functional (8) is symmetric under translations in space, $X_i \mapsto X_i + (\text{constant})$, or under translations in time, $t \mapsto t + (\text{constant})$.

Given a generic system with a Lagrangian formulation, we are therefore motivated to define the system's canonical momenta $p_\alpha$ in terms of the system's Lagrangian $L$ as the partial derivative of $L$ with respect to the corresponding rates of change $\dot{q}_\alpha$:
\[ p_\alpha \equiv \frac{\partial L}{\partial \dot{q}_\alpha}. \] (23)
Recalling that the set of points labeled by particular values $q_\alpha$ of a system's degrees of freedom define the system's configuration space, the set of points $(q, p)$ labeled by particular values of the system's canonical variables $q_\alpha$ and $p_\alpha$ define the system's phase space.

If we can solve the definitions (23) for the rates of change $\dot{q}_\alpha$ as functions of the canonical variables $q_\alpha$ and $p_\alpha$, then the system's Hamiltonian $H(q, p, t)$, which is a function on the system's phase space and roughly describes the system's energy, is defined as
\[ H \equiv \sum_\alpha \frac{\partial L}{\partial \dot{q}_\alpha}\dot{q}_\alpha - L = \sum_\alpha p_\alpha \dot{q}_\alpha - L, \] (24)
which is known as a Legendre transformation of $L$.
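As a quick numerical illustration of the definitions (23) and (24), the following minimal sketch checks that the canonical momentum and the Legendre transformation reproduce $p = m\dot{X}$ and $E = p^2/2m + V$ for a one-dimensional particle. The harmonic potential, the parameter values, and the finite-difference derivative are assumptions made for this example only and are not part of the paper's construction.

```python
# Minimal sketch: a one-dimensional particle in a harmonic potential.
# The mass, spring constant, and sample state are illustrative assumptions.
m = 2.0   # mass (hypothetical value)
k = 3.0   # spring constant of the assumed potential V(x) = k x^2 / 2


def potential(x):
    return 0.5 * k * x**2


def lagrangian(x, v):
    """L = T - V, as in (18), restricted to one dimension."""
    return 0.5 * m * v**2 - potential(x)


def canonical_momentum(x, v, h=1e-6):
    """p = dL/dv, as in (23), estimated by a central finite difference."""
    return (lagrangian(x, v + h) - lagrangian(x, v - h)) / (2.0 * h)


def legendre_transform(x, v):
    """H = p v - L, as in (24), evaluated at position x and velocity v."""
    return canonical_momentum(x, v) * v - lagrangian(x, v)


x, v = 0.7, -1.3                      # arbitrary sample state
p = canonical_momentum(x, v)          # compare with m * v, as in (20)
H = legendre_transform(x, v)          # compare with p^2/(2m) + V, as in (22)
E = p**2 / (2.0 * m) + potential(x)

print("p   =", p, " m*v =", m * v)
print("H   =", H, " E   =", E)
```

In this sketch the Legendre transformation is evaluated as a function of position and velocity; expressing it as a function on phase space would additionally require solving (23) for the velocity, which here is simply $v = p/m$.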
In terms of the canonical momenta (23), we can recast the Euler-Lagrange equations (10) as
\[ \frac{dp_\alpha}{dt} = \frac{\partial L}{\partial q_\alpha}. \] (25)
One can also use the chain rule together with the Euler-Lagrange equations to show that the time derivative of the Hamiltonian (24) is given by
\[ \frac{dH}{dt} = -\frac{\partial L}{\partial t}. \] (26)
These two equalities look very similar, apart from an overall minus sign that we will eventually see is not an accident but has an important physical significance.

Moreover, we see right away from (25) that if the Lagrangian is invariant under constant translations along a specific degree of freedom, $q_\alpha \mapsto q_\alpha + (\text{constant})$, so that $\partial L/\partial q_\alpha = 0$, then the corresponding canonical momentum $p_\alpha$ is conserved, $dp_\alpha/dt = 0$. Similarly, we see from (26) that if the Lagrangian is invariant under constant translations in time, $t \mapsto t + (\text{constant})$, so that $\partial L/\partial t = 0$, then the Hamiltonian $H$ is conserved, $dH/dt = 0$. These results are both special cases of Noether's theorem, which establishes a general correspondence between continuous symmetries of a classical system's dynamics and quantities that are conserved when the system follows its equations of motion.

Taking partial derivatives of the Hamiltonian $H$ with respect to the canonical variables $q_\alpha$ and $p_\alpha$, now treated as independent variables, and regarding $\dot{q}_\alpha$ as a function of the canonical variables, it follows from a straightforward calculation that the Euler-Lagrange equations (10) imply the canonical equations of motion:
\[ \dot{q}_\alpha = \frac{\partial H}{\partial p_\alpha}, \qquad \dot{p}_\alpha = -\frac{\partial H}{\partial q_\alpha}. \] (27)
Going the other way, one can also show that the canonical equations of motion imply the Euler-Lagrange equations, so the two sets of equations are fully equivalent. The canonical equations of motion provide an alternative way to encode the system's dynamics, known as the Hamiltonian formulation.

C.
The Manifestly Covariant Lagrangian Formulation

The standard Lagrangian formulation of classical physics treats time and energy differently from space and momentum, in tension with the spirit of special relativity. Fortunately, we can recast the Lagrangian formulation in a more elegant way that puts time and degrees of freedom on the same footing, with the result that energy and momentum will naturally also end up on the same footing [6].

To begin, we turn again to the case of a general classical system with degrees of freedom $q_\alpha$, Lagrangian $L(q, \dot{q}, t)$, and action functional (8), $S[q] \equiv \int dt\, L(q, \dot{q}, t)$. We carry out a smooth, strictly monotonic change of integration variable from $t$ to a new parameter $\lambda$:
\[ t \mapsto t(\lambda). \] (28)
Letting dots now denote derivatives with respect to $\lambda$,
\[ \dot{f} \equiv \frac{df}{d\lambda}, \] (29)
we obtain the following differential relationships:
\[ dt = d\lambda\, \dot{t}, \qquad \frac{dq_\alpha}{dt} = \frac{\dot{q}_\alpha}{\dot{t}}. \] (30)
Our action functional then becomes
\[ S[q] \equiv \int d\lambda\, \dot{t}\, L(q, \dot{q}/\dot{t}, t). \] (31)
This formula for the system's action functional is reparametrization invariant, meaning that it would maintain its form if we were to carry out any subsequent smooth, strictly monotonic change of parametrization $\lambda \mapsto \lambda(\lambda')$:
\[ S[q] \equiv \int d\lambda'\, \frac{dt}{d\lambda'}\, L\!\left( q, \frac{dq}{d\lambda'} \Big/ \frac{dt}{d\lambda'}, t \right). \] (32)

Reparametrization invariance is an example of a gauge invariance, meaning a redefinition of the system's degrees of freedom that leaves all the system's physically observable features unchanged. A gauge invariance should be distinguished from a dynamical symmetry, which consists of transformations that alter the system's physical state but leave the system's dynamics unchanged.

We can formally regard the reparametrization-invariant formula (31) for the action functional as describing a system with an additional "degree of freedom" $t$ and a modified Lagrangian
\[ \mathscr{L}(q, \dot{q}, t, \dot{t}) \equiv \dot{t}\, L(q, \dot{q}/\dot{t}, t). \]
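(33)

The system's new canonical momenta (23), now computed from the modified Lagrangian $\mathscr{L}$ of (33), that are conjugate to our original degrees of freedom $q_\alpha$ are the same as before, $\mathscr{P}_\alpha = p_\alpha$, whereas the system's canonical momentum $\mathscr{P}_t$ conjugate to $t$ is equal to minus the system's original Hamiltonian $H$:
\[ \mathscr{P}_t \equiv \frac{\partial \mathscr{L}}{\partial \dot{t}} = -H, \qquad \mathscr{P}_\alpha \equiv \frac{\partial \mathscr{L}}{\partial \dot{q}_\alpha} = p_\alpha. \] (34)
These formulas motivate introducing "upper-index" and "lower-index" versions of our canonical variables by mimicking the analogous rules for the components of the four-vectors that are used in special relativity:
\[ \begin{aligned} q^t &\equiv c\,t, & q_t &\equiv -c\,t, & q^\alpha &\equiv q_\alpha, \\ p^t &\equiv H/c, & p_t &\equiv -H/c, & p^\alpha &\equiv p_\alpha. \end{aligned} \] (35)
To ensure that we are using the same units for $q^t$ and $q^\alpha$ and also the same units for $p^t$ and $p^\alpha$, we have introduced an arbitrary constant $c$ with units of energy divided by momentum. (The constant $c$ also has units of distance divided by time, or speed, but not all classical systems possess a notion of distance.) Note also that we have defined $p_t \equiv \mathscr{P}_t/c$.

Applying the extremization condition (9) to the action functional with respect to the new degrees of freedom $q^t$ and $q^\alpha$, we obtain a new set of Euler-Lagrange equations given by
\[ \frac{\partial \mathscr{L}}{\partial q^t} - \frac{d}{d\lambda}\left(\frac{\partial \mathscr{L}}{\partial \dot{q}^t}\right) = 0, \qquad \frac{\partial \mathscr{L}}{\partial q^\alpha} - \frac{d}{d\lambda}\left(\frac{\partial \mathscr{L}}{\partial \dot{q}^\alpha}\right) = 0. \] (36)
The Euler-Lagrange equations for the degrees of freedom $q^\alpha$ unsurprisingly give us back our original Euler-Lagrange equations (10),
\[ \frac{\partial L}{\partial q_\alpha} - \frac{d}{dt}\left(\frac{\partial L}{\partial (dq_\alpha/dt)}\right) = 0, \]
which, as we recall from (25), can be written more compactly as
\[ \frac{dp_\alpha}{dt} = \frac{\partial L}{\partial q_\alpha}. \]
Meanwhile, the Euler-Lagrange equation for $q^t$ replicates the equation (26) that relates the total time derivative of the system's original Hamiltonian $H$ to the partial time derivative of the system's original Lagrangian $L$,
\[ \frac{dH}{dt} = -\frac{\partial L}{\partial t}. \]
We can combine these results in terms of the raised-index versions $p^t$ and $p^\alpha$ of the canonical momenta defined in (35) as the symmetric-looking equations
\[ \frac{dp^t}{dt} = \frac{\partial L}{\partial q_t}, \qquad \frac{dp^\alpha}{dt} = \frac{\partial L}{\partial q_\alpha}, \] (37)
or, equivalently, in terms of $\mathscr{L}$ and derivatives with respect to $\lambda$ as
\[ \dot{p}^t \equiv \frac{dp^t}{d\lambda} = \frac{\partial \mathscr{L}}{\partial q_t}, \qquad \dot{p}^\alpha \equiv \frac{dp^\alpha}{d\lambda} = \frac{\partial \mathscr{L}}{\partial q_\alpha}. \]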
 (38) Furthermore, and rather remarkably, we can write our action functional (31) in a form that resembles a Lorentzinvariant dot product, despite the fact that we have not assumed that our system has anything to do with special relativity or four-dimensional spacetime: S[q] = ∫ dλL = ∫ dλ ( ptq t + ∑ α pαq α ) . (39) We therefore refer to this framework as the manifestly covariant Lagrangian formulation for our classical system. Introducing a square matrix η ≡ diag(−1, 1, . . . ) that naturally generalizes the Minkowski metric tensor from special relativity, η ≡ −1 0 00 1 0 0 0 . . .  , (40) we can write the system's action functional (39) in matrix form as S[q] = ∫ dλ ( pt pα ) η ( qt qα ) , (41) where pα and qα here are notational abbreviations for their whole lists indexed by α. This expression for S[q] immediately suggests the consideration of systems whose action functionals have a symmetry under rigid linear transformations of the form( qt qα ) 7→ Λ ( qt qα ) , ( pt pα ) 7→ Λ ( pt pα ) (42) for constant matrices Λ that preserve the generalized Minkowski metric tensor η in the sense that ΛTηΛ = η. (43) The matrices Λ therefore represent generalizations of Lorentz transformations. By comparison with the group O(N) of orthogonal N× N matrices R, meaning matrices that preserve the N×N identity matrix 1 ≡ diag(1, 1, . . . ), RTR = RT1R = 1, (44) we refer to the set of generalized Lorentz-transformation matrices Λ, which preserve the (N + 1)× (N + 1) matrix η ≡ diag(−1, 1, . . . ), as making up the group O(1, N), where N is the system's original number of degrees of freedom qα. The formula (39) for the action functional also implies that the new "Hamiltonian" H , defined in line with (24), trivially vanishes, and therefore (at least classically) does not hold any physical meaning: H ≡ ptqt + ∑ α pαq α −L = 0. 
(45)

This equation is closely related to the fact that arbitrary changes of parametrization represent a gauge invariance of the system and likewise do not have any physical meaning.

III. SPACETIME IN SPECIAL RELATIVITY

We now turn to a brief review of special relativity [7].

A. Spacetime and Four-Vectors

In special relativity, time $t$ and space $\mathbf{x} \equiv (x, y, z)$ join together to form four-dimensional spacetime coordinates,
\[ x^\mu \equiv (x^t, x^x, x^y, x^z)^\mu \equiv (c\,t, \mathbf{x})^\mu \equiv (c\,t, x, y, z)^\mu, \] (46)
where $c$ is the speed of light. We will use Greek letters $\alpha, \beta, \ldots, \mu, \nu, \ldots$ for Lorentz indices, which each run through the four possible values $t, x, y, z$, and we will use Latin indices $i, j, k, \ldots$ for the spatial values $x, y, z$, where we will consistently employ Cartesian coordinate systems.

Defining the (3+1)-dimensional Minkowski metric tensor by
\[ \eta_{\mu\nu} \equiv \eta^{\mu\nu} \equiv \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}_{\mu\nu}, \] (47)
and employing Einstein summation notation, we can raise and lower indices on the components of four-vectors according to $v_\mu \equiv \eta_{\mu\nu} v^\nu$ and $w^\mu \equiv \eta^{\mu\nu} w_\nu$, with the following results:
\[ v_t = -v^t, \qquad v_x = v^x, \qquad v_y = v^y, \qquad v_z = v^z. \] (48)

We let $\Lambda^\mu{}_\nu$ be a $4 \times 4$ Lorentz-transformation matrix, meaning that $\Lambda^\mu{}_\nu$ is an element of $O(1, 3)$ and therefore preserves the Minkowski metric tensor $\eta_{\mu\nu}$ in the sense that
\[ \Lambda^\mu{}_\rho\, \eta_{\mu\nu}\, \Lambda^\nu{}_\sigma = \eta_{\rho\sigma}, \] (49)
or, in matrix notation,
\[ \Lambda^{\mathrm{T}} \eta \Lambda = \eta. \] (50)
Then Lorentz transformations of four-vectors $v^\mu$, meaning linear transformations of the form
\[ v^\mu \mapsto \Lambda^\mu{}_\nu v^\nu, \] (51)
preserve four-dimensional dot products defined by
\[ v \cdot w \equiv v^\nu w_\nu = \eta_{\mu\nu} v^\mu w^\nu. \] (52)

Four-vectors $v^\mu$ are classified as timelike, null, or spacelike according to whether the dot product of $v^\mu$ with itself is respectively negative, zero, or positive:
\[ v^2 \equiv v \cdot v \quad \begin{cases} < 0 & \text{timelike}, \\ = 0 & \text{null}, \\ > 0 & \text{spacelike}. \end{cases} \] (53)
The Lorentz invariance of the dot product (52) ensures that this classification is invariant and therefore well-defined under Lorentz transformations.

B.
The Spacetime Transformation Groups

The collection $O(1, 3)$ of all possible Lorentz transformations (51), $v^\mu \mapsto \Lambda^\mu{}_\nu v^\nu$, is called the Lorentz group [8]. The largest subgroup that excludes parity transformations,
\[ \Lambda_{\text{parity}} = \mathrm{diag}(1, -1, -1, -1) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \] (54)
is called the proper Lorentz group and is denoted by $SO(1, 3)$, mirroring the notation $SO(N)$ for $N \times N$ rotation matrices $R$ that do not involve parity transformations. The largest subgroup of the Lorentz group that excludes time-reversal transformations,
\[ \Lambda_{\text{time-reversal}} = \mathrm{diag}(-1, 1, 1, 1) = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \] (55)
is called the orthochronous Lorentz group and is denoted by $O^+(1, 3)$ or $O^\uparrow(1, 3)$. The set of all Lorentz transformations that can be reduced smoothly to the identity transformation $\Lambda = 1$ cannot include parity or time-reversal transformations and is called the proper orthochronous Lorentz group $SO^+(1, 3)$ or $SO^\uparrow(1, 3)$.

A simple calculation shows that for timelike and null four-vectors $v^\mu$, the sign of the temporal component $v^t$ is invariant under orthochronous Lorentz transformations $v^\mu \mapsto \Lambda^\mu{}_\nu v^\nu$:
\[ v^2 \le 0 \implies \text{sign of } v^t \text{ is invariant under } O^+(1, 3). \] (56)
As a consequence, future-directed ($v^t > 0$) timelike and null four-vectors remain future-directed under orthochronous Lorentz transformations, with a similar statement for past-directed ($v^t < 0$) timelike and null four-vectors. These properties ensure that if the displacement between two spacetime points is timelike or null, then their chronological ordering is an invariant fact of nature. By contrast, the temporal components $v^t$ of spacelike four-vectors ($v^2 > 0$) can change sign under orthochronous Lorentz transformations, a behavior that is closely related to the breakdown of simultaneity in special relativity.

We can also consider additive shifts in the four-dimensional coordinates (46) by constants $a^\mu$:
\[ x^\mu \mapsto x^\mu + a^\mu. \]
(57)

These transformations make up the spacetime-translation group, which is isomorphic to $\mathbb{R}^4$ but is denoted by $\mathbb{R}^{1,3}$ to emphasize the mathematical and physical distinctions between time and space.

Combining spacetime translations with Lorentz transformations of the spacetime coordinates $x^\mu$ gives the Poincaré group:
\[ x^\mu \mapsto \Lambda^\mu{}_\nu x^\nu + a^\mu. \] (58)
Like the Lorentz group, the Poincaré group has proper and orthochronous subgroups that are respectively defined by dropping all Lorentz transformations that involve parity or time-reversal transformations [9].

IV. TRANSITIVE GROUP ACTIONS OF THE POINCARÉ GROUP

The set of all physical transformations $(q, p) \mapsto (q', p')$ that can be carried out on a system's state $(q, p)$ in its phase space are collectively called a group action on the system's phase space. If we include translations in time among these physical transformations, then by starting with a single convenient choice of reference state $(q_0, p_0)$, we can reach every other possible state that the system can occupy. The group action provided by the system's phase space is therefore "irreducible," or, more precisely, transitive, referring to the fact that no proper subset of the system's phase space can be dropped without violating the group action.

As we will show, the different possible transitive group actions of the Poincaré group turn out to provide a complete classification of the phase spaces of the different categories of particles in physics, in parallel with Wigner's method for classifying quantum particle-types by identifying their Hilbert spaces as irreducible representations of the Poincaré group [10].

A.
Systems Singled Out by the Poincaré Group

To start, we note that the Poincaré group (58) naturally singles out classical systems that have three physical degrees of freedom $(q_x, q_y, q_z) = \mathbf{X} \equiv (X, Y, Z)$ and therefore three corresponding canonical momenta $\mathbf{p} = (p_x, p_y, p_z)$, so the system's manifestly covariant Lagrangian formulation involves four spacetime degrees of freedom
\[ X^\mu \equiv (q^t, q^x, q^y, q^z)^\mu = (c\,T, X, Y, Z)^\mu \equiv (c\,T, \mathbf{X})^\mu \] (59)
and a canonical four-momentum
\[ p^\mu \equiv (p^t, p^x, p^y, p^z)^\mu \equiv (E/c, \mathbf{p})^\mu \] (60)
whose individual components, in lower-index form $p_\mu$, are defined in terms of the system's covariant Lagrangian $\mathscr{L}$ in accordance with (34),
\[ p_\mu \equiv \frac{\partial \mathscr{L}}{\partial \dot{X}^\mu}. \] (61)
Here dots denote derivatives with respect to the arbitrary worldline parameter $\lambda$,
\[ \dot{X}^\mu \equiv \frac{dX^\mu}{d\lambda}, \] (62)
we have identified the system's energy $E$ as
\[ E \equiv H \equiv p^t c, \] (63)
and candidate trajectories of the system are now called worldlines.

B. Angular Momentum and Spin

In analogy with the Newtonian definition $\mathbf{L} \equiv \mathbf{X} \times \mathbf{p}$ of an object's orbital angular momentum, whose individual components are
\[ L_k = X_i p_j - X_j p_i, \quad \text{with } (i, j, k) = (x, y, z),\ (z, x, y),\ \text{or } (y, z, x), \] (64)
we will find it convenient to introduce an antisymmetric tensor
\[ L^{\mu\nu} \equiv X^\mu p^\nu - X^\nu p^\mu = -L^{\nu\mu} \] (65)
whose spatial components $L^{ij}$ (that is, for $i, j$ each taking the values $x, y, z$) encode the components of $\mathbf{L}$. We will accordingly refer to $L^{\mu\nu}$ as the system's orbital angular-momentum tensor, although one should keep in mind that its temporal components $L^{ti}$ (for $i$ a spatial index) are not angular momenta. Indeed, if the system's energy (63) is nonzero, $E \equiv p^t c \neq 0$, then we can write these temporal components as
\[ L^{ti} = X^t p^i - X^i p^t = c\,T\,p^i - X^i E/c = -\frac{E}{c}\left( X^i - \frac{p^i c^2}{E}\,T \right). \] (66)
We will see later that the factor $\mathbf{p}c^2/E$, which has units of distance divided by time, will typically yield the system's three-dimensional physical propagation velocity $\mathbf{v} \equiv d\mathbf{X}/dt$ through space, so the quantity in parentheses will turn out to be related to the system's linear motion.
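The component formulas (64)–(66) are easy to check numerically. The sketch below builds $L^{\mu\nu}$ from sample values of $X^\mu$ and $p^\mu$ and compares its spatial components with the cross product $\mathbf{X} \times \mathbf{p}$ and its temporal components with the right-hand side of (66). The numerical values, the helper names, and the choice $c = 2$ are arbitrary assumptions for illustration.

```python
c = 2.0  # the constant c of (35); the value is an arbitrary choice for this check


def orbital_tensor(X4, p4):
    """L^{mu nu} = X^mu p^nu - X^nu p^mu, as in (65); index order (t, x, y, z)."""
    return [[X4[mu] * p4[nu] - X4[nu] * p4[mu] for nu in range(4)]
            for mu in range(4)]


def cross(a, b):
    """Ordinary three-dimensional cross product."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


# Hypothetical sample state: time T, position X3, energy E, momentum p3.
T, X3 = 1.5, [1.0, -0.5, 0.25]
E, p3 = 3.0, [0.4, 1.2, -0.7]

X4 = [c * T] + X3      # X^mu = (cT, X), as in (59)
p4 = [E / c] + p3      # p^mu = (E/c, p), as in (60)

L = orbital_tensor(X4, p4)

# Spatial components (L^{yz}, L^{zx}, L^{xy}) should reproduce X x p, as in (64).
L_spatial = [L[2][3], L[3][1], L[1][2]]
L_newton = cross(X3, p3)

# Temporal components L^{ti} should match the right-hand side of (66).
L_temporal = [L[0][i] for i in (1, 2, 3)]
rhs_66 = [-(E / c) * (X3[i] - p3[i] * c**2 / E * T) for i in range(3)]

print(L_spatial, L_newton)
print(L_temporal, rhs_66)
```

Choosing $c \neq 1$ here is deliberate, since it makes any misplaced factor of $c$ in (66) show up as a mismatch.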
To be as general as possible, we can also allow the system to possess an intrinsic notion of angular momentum, called spin, that does not involve the system's spacetime coordinates $X^\mu$ or its four-momentum $p^\mu$ and that can be encoded in an antisymmetric tensor
\[ S^{\mu\nu} = -S^{\nu\mu}, \] (67)
called the system's spin tensor. The system's total angular momentum is then contained in the antisymmetric tensor defined as the sum of the tensors representing the orbital and spin contributions:
\[ J^{\mu\nu} \equiv L^{\mu\nu} + S^{\mu\nu} = -J^{\nu\mu}. \] (68)
We will refer to $J^{\mu\nu}$ as the system's total angular-momentum tensor.

We can define the following three-vectors from the independent components of $J^{\mu\nu}$ and $S^{\mu\nu}$:
\[ \mathbf{J} \equiv (J_x, J_y, J_z) \equiv (J^{yz}, J^{zx}, J^{xy}), \] (69)
\[ \mathbf{K} \equiv (K_x, K_y, K_z) \equiv (J^{tx}, J^{ty}, J^{tz}), \] (70)
\[ \mathbf{S} \equiv (S_x, S_y, S_z) \equiv (S^{yz}, S^{zx}, S^{xy}), \] (71)
\[ \tilde{\mathbf{S}} \equiv (\tilde{S}_x, \tilde{S}_y, \tilde{S}_z) \equiv (S^{tx}, S^{ty}, S^{tz}). \] (72)
We will call $\mathbf{S}$ the system's spin three-vector and $\tilde{\mathbf{S}}$ its dual spin three-vector.

We can now write the system's total angular-momentum tensor $J^{\mu\nu}$ and its spin tensor $S^{\mu\nu}$ as
\[ J^{\mu\nu} \equiv \begin{pmatrix} 0 & K_x & K_y & K_z \\ -K_x & 0 & J_z & -J_y \\ -K_y & -J_z & 0 & J_x \\ -K_z & J_y & -J_x & 0 \end{pmatrix}^{\mu\nu}, \] (73)
\[ S^{\mu\nu} \equiv \begin{pmatrix} 0 & \tilde{S}_x & \tilde{S}_y & \tilde{S}_z \\ -\tilde{S}_x & 0 & S_z & -S_y \\ -\tilde{S}_y & -S_z & 0 & S_x \\ -\tilde{S}_z & S_y & -S_x & 0 \end{pmatrix}^{\mu\nu}. \] (74)
Note that if $\mathbf{S} = 0$, then $\mathbf{J} = \mathbf{L} = \mathbf{X} \times \mathbf{p}$ reduces to the usual Newtonian definition (64) of orbital angular momentum.

C. Defining a System by a Transitive Group Action of the Poincaré Group

The state of our system in its phase space is fully determined by knowing the values of the system's spacetime coordinates $X^\mu$, its four-momentum $p^\mu$, and its spin tensor $S^{\mu\nu}$, which together determine the orbital angular-momentum tensor $L^{\mu\nu}$ and the total angular-momentum tensor $J^{\mu\nu}$. We can therefore define a transitive group action of the Poincaré group on the system's phase space by defining what Poincaré transformations do to the values of $X^\mu$, $p^\mu$, and $S^{\mu\nu}$ that define the system's state $(X, p, S)$.
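Specifically, we define the action of Lorentz transformations on the system's state $(X, p, S)$ by generalizing the transformation rule (51) to the statement that every free upper Lorentz index on $X^\mu$, $p^\mu$, and $S^{\mu\nu}$ receives a linear factor of a shared Lorentz-transformation matrix $\Lambda$:
\[ X^\mu \mapsto \Lambda^\mu{}_\nu X^\nu, \] (75)
\[ p^\mu \mapsto \Lambda^\mu{}_\nu p^\nu, \] (76)
\[ S^{\mu\nu} \mapsto \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma S^{\rho\sigma} = \Lambda^\mu{}_\rho S^{\rho\sigma} (\Lambda^{\mathrm{T}})_\sigma{}^\nu. \] (77)
It follows from the definitions (65) of $L^{\mu\nu}$ and (68) of $J^{\mu\nu}$ that we have the subsidiary Lorentz-transformation rules
\[ L^{\mu\nu} \mapsto \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma L^{\rho\sigma} = \Lambda^\mu{}_\rho L^{\rho\sigma} (\Lambda^{\mathrm{T}})_\sigma{}^\nu, \] (78)
\[ J^{\mu\nu} \mapsto \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma J^{\rho\sigma} = \Lambda^\mu{}_\rho J^{\rho\sigma} (\Lambda^{\mathrm{T}})_\sigma{}^\nu. \] (79)

Meanwhile, we define the action of spacetime translations on the system's state $(X, p, S)$ solely as (57) for the spacetime coordinates $X^\mu$, with the system's four-momentum $p^\mu$ and spin tensor $S^{\mu\nu}$ unchanged:
\[ X^\mu \mapsto X^\mu + a^\mu, \] (80)
\[ p^\mu \mapsto p^\mu, \] (81)
\[ S^{\mu\nu} \mapsto S^{\mu\nu}. \] (82)
These definitions then determine the additional translation rules
\[ L^{\mu\nu} \mapsto L^{\mu\nu} + a^\mu p^\nu - a^\nu p^\mu, \] (83)
\[ J^{\mu\nu} \mapsto J^{\mu\nu} + a^\mu p^\nu - a^\nu p^\mu. \] (84)
We can then construct general Poincaré transformations from combinations of Lorentz transformations and spacetime translations.

One can check that the three-vectors $\mathbf{J}$, $\mathbf{K}$, $\mathbf{S}$, and $\tilde{\mathbf{S}}$ defined in (69)–(72) all indeed transform as three-vectors under proper rotations. One can also show that $\mathbf{K}$ and $\tilde{\mathbf{S}}$ transform as proper vectors under parity transformations (54),
\[ \mathbf{K} \mapsto -\mathbf{K}, \qquad \tilde{\mathbf{S}} \mapsto -\tilde{\mathbf{S}} \qquad \text{(parity)}, \] (85)
whereas $\mathbf{J}$ and $\mathbf{S}$ are pseudovectors (or axial vectors), meaning that they do not change sign under parity transformations:
\[ \mathbf{J} \mapsto \mathbf{J}, \qquad \mathbf{S} \mapsto \mathbf{S} \qquad \text{(parity)}. \] (86)

If the system's phase space provides a transitive group action of the Poincaré group, then, by construction, every state $(X, p, S)$ can be reached by starting with an arbitrary choice of reference state $(X_0, p_0, S_0)$ and acting with every possible Poincaré transformation $(a, \Lambda)$:
\[ (X, p, S) \equiv (\Lambda X_0 + a,\ \Lambda p_0,\ \Lambda S_0 \Lambda^{\mathrm{T}}). \] (87)
That is,
\[ X \equiv \Lambda X_0 + a, \] (88)
\[ p \equiv \Lambda p_0, \] (89)
\[ S \equiv \Lambda S_0 \Lambda^{\mathrm{T}}, \] (90)
or, displaying indices explicitly,
\[ X^\mu \equiv \Lambda^\mu{}_\nu X_0^\nu + a^\mu, \] (91)
\[ p^\mu \equiv \Lambda^\mu{}_\nu p_0^\nu, \] (92)
\[ S^{\mu\nu} \equiv \Lambda^\mu{}_\rho S_0^{\rho\sigma} (\Lambda^{\mathrm{T}})_\sigma{}^\nu. \]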
(93)

Without loss of generality, we will always take the reference value of the system's spacetime point to be at the origin:
\[ X_0^\mu \equiv 0. \] (94)
In light of (91), the system's spacetime point $X^\mu$ in any other state $(X, p, S)$ can then be identified with the translation-group four-vector $a^\mu$:
\[ X^\mu \equiv a^\mu. \] (95)
We will choose the reference values $p_0^\mu$ and $S_0^{\mu\nu}$ in (87) on a case-by-case basis later.

D. The Pauli-Lubanski Pseudovector

Introducing the totally antisymmetric, four-index Levi-Civita symbol,
\[ \epsilon_{\mu\nu\rho\sigma} \equiv \begin{cases} +1 & \text{for } \mu\nu\rho\sigma \text{ an even permutation of } txyz, \\ -1 & \text{for } \mu\nu\rho\sigma \text{ an odd permutation of } txyz, \\ 0 & \text{otherwise} \end{cases} \;=\; -\epsilon^{\mu\nu\rho\sigma}, \] (96)
we can form a convenient mathematical object, called the Pauli-Lubanski pseudovector $W^\mu$, by contracting the Lorentz indices of the system's four-momentum $p^\mu$ and the total angular-momentum tensor $J^{\mu\nu}$ with the indices of $\epsilon^{\mu\nu\rho\sigma}$ [11]:
\[ W^\mu \equiv -\tfrac{1}{2}\,\epsilon^{\mu\nu\rho\sigma} p_\nu J_{\rho\sigma}. \] (97)
Decomposing the angular-momentum tensor as in (68) into its orbital (65) and spin (67) contributions,
\[ J_{\rho\sigma} = L_{\rho\sigma} + S_{\rho\sigma} = X_\rho p_\sigma - X_\sigma p_\rho + S_{\rho\sigma}, \]
the contributions from the orbital angular-momentum tensor $L_{\rho\sigma}$ cancel out of the definition of $W^\mu$, so we can replace the total angular-momentum tensor $J_{\rho\sigma}$ with just its spin contribution $S_{\rho\sigma}$ in the formula for $W^\mu$:
\[ W^\mu = -\tfrac{1}{2}\,\epsilon^{\mu\nu\rho\sigma} p_\nu S_{\rho\sigma}. \] (98)
It follows from a straightforward calculation that we can express the Pauli-Lubanski pseudovector in terms of the spin three-vector $\mathbf{S}$ defined in (71), the dual spin three-vector $\tilde{\mathbf{S}}$ defined in (72), and the components of the system's four-momentum $p^\mu = (E/c, \mathbf{p})^\mu$ as
\[ W^\mu = \left( \mathbf{p}\cdot\mathbf{S},\ (E/c)\,\mathbf{S} - \mathbf{p}\times\tilde{\mathbf{S}} \right)^\mu. \] (99)

The formula (98) makes manifest that the Pauli-Lubanski pseudovector does not involve the spacetime coordinates $X^\mu$, so under translation transformations (80)–(84), it is invariant:
\[ W^\mu \mapsto W^\mu \quad \text{(spacetime translations)}. \] (100)
On the other hand, under Lorentz transformations of $p_\nu$ and $S_{\rho\sigma}$, $W^\mu$ transforms as
\[ W^\mu \mapsto \det(\Lambda)\,\Lambda^\mu{}_\nu W^\nu, \] (101)
where $\det(\Lambda)$ is the determinant of $\Lambda^\mu{}_\nu$.
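The step from (98) to (99), together with the transformation rule (101), can be spot-checked numerically. The sketch below evaluates the contraction (98) by brute force, compares it with the closed form (99), and then applies the parity matrix (54) to $p^\mu$ and $S^{\mu\nu}$ to compare the result with $\det(\Lambda)\,\Lambda^\mu{}_\nu W^\nu$. The sample values, the helper names, and the choice of units with $c = 1$ are assumptions of this illustration, and agreement at one sample point is a consistency check rather than a proof.

```python
from itertools import permutations

c = 1.0                        # units with c = 1 (an assumption of this sketch)
ETA = [-1.0, 1.0, 1.0, 1.0]    # diagonal entries of the Minkowski metric (47)


def perm_sign(perm):
    """Sign of a permutation of (0, 1, 2, 3), found by counting inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


# Lower-index symbol of (96): +1 on even permutations of (t, x, y, z).
EPS_LOWER = {perm: perm_sign(perm) for perm in permutations(range(4))}


def eps_upper(mu, nu, rho, sigma):
    """Upper-index symbol, equal to minus the lower-index one, as in (96)."""
    return -EPS_LOWER.get((mu, nu, rho, sigma), 0)


def spin_tensor(S, S_dual):
    """S^{mu nu} built from the three-vectors (71)-(72), laid out as in (74)."""
    Sx, Sy, Sz = S
    Dx, Dy, Dz = S_dual
    return [[0.0, Dx, Dy, Dz],
            [-Dx, 0.0, Sz, -Sy],
            [-Dy, -Sz, 0.0, Sx],
            [-Dz, Sy, -Sx, 0.0]]


def pauli_lubanski(p_up, S_up):
    """W^mu = -(1/2) eps^{mu nu rho sigma} p_nu S_{rho sigma}, as in (98)."""
    p_low = [ETA[n] * p_up[n] for n in range(4)]
    S_low = [[ETA[r] * ETA[s] * S_up[r][s] for s in range(4)] for r in range(4)]
    return [-0.5 * sum(eps_upper(mu, n, r, s) * p_low[n] * S_low[r][s]
                       for n in range(4) for r in range(4) for s in range(4))
            for mu in range(4)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


# Hypothetical sample values for the energy, momentum, and spin three-vectors.
E, p3 = 2.0, [0.3, -0.4, 1.1]
S, S_dual = [0.5, 0.2, -0.7], [-0.1, 0.6, 0.9]

p4 = [E / c] + p3
S4 = spin_tensor(S, S_dual)
W = pauli_lubanski(p4, S4)

# Closed form (99): W = (p . S, (E/c) S - p x S_dual).
p_cross_dual = cross(p3, S_dual)
W_closed = [sum(a * b for a, b in zip(p3, S))] + \
           [(E / c) * S[i] - p_cross_dual[i] for i in range(3)]

# Parity (54) is diagonal with determinant -1; transform p and S, then recompute W.
PARITY = [1.0, -1.0, -1.0, -1.0]
p4_parity = [PARITY[m] * p4[m] for m in range(4)]
S4_parity = [[PARITY[m] * PARITY[n] * S4[m][n] for n in range(4)] for m in range(4)]
W_parity = pauli_lubanski(p4_parity, S4_parity)
W_rule = [-1.0 * PARITY[m] * W[m] for m in range(4)]   # det(Lambda) * Lambda * W

print(W, W_closed)
print(W_parity, W_rule)
```

The final comparison applies rule (101), $W^\mu \mapsto \det(\Lambda)\,\Lambda^\mu{}_\nu W^\nu$, to the specific case of the parity matrix, for which $\det(\Lambda_{\text{parity}}) = -1$.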
Hence, under parity transformations $\Lambda_{\mathrm{parity}}$, for which $\det(\Lambda_{\mathrm{parity}}) = -1$, $W^\mu$ transforms oppositely to the way that ordinary four-vectors transform: $W^t \mapsto -W^t$, $W^i \mapsto W^i$ (parity). (102) It is because of this transformation behavior that $W^\mu$ is called a pseudovector.

E. Invariant Quantities of a Transitive Group Action of the Poincaré Group

Notice that the quantities $p^2 \equiv p_\mu p^\mu$, $W^2 \equiv W_\mu W^\mu$, and $S^2 \equiv S_{\mu\nu} S^{\mu\nu}$ are invariant under Poincaré transformations, meaning that they are invariant under all Lorentz transformations (whether or not parity and time-reversal transformations are involved) as well as under all spacetime translations. These quantities therefore each have a single, constant value for all states in any phase space that constitutes a transitive group action of the Poincaré group, and so, in particular, have constant values along the system's worldline [12]. We name these invariant quantities according to $p^2 \equiv p_\mu p^\mu \equiv -m^2 c^2$, (103) $W^2 \equiv W_\mu W^\mu \equiv w^2$, (104) $\frac{1}{2} S^2 \equiv \frac{1}{2} S_{\mu\nu} S^{\mu\nu} \equiv s^2$. (105) The scalar constant $m$ has units of momentum-squared divided by energy (that is, units of mass), the scalar constant $w$ has units of momentum multiplied by energy multiplied by time, and the scalar constant $s$ has units of energy multiplied by time (that is, units of angular momentum). Note that $w^2$ and $s^2$ having fixed values does not imply any sort of quantization, any more than $m^2$ being fixed implies quantization. In our classical context, we are essentially working in the limit of large quantum numbers in which $w^2$ and $s^2$ are invariant but are otherwise permitted to take on any one of a continuous range of possible real values. In terms of the spin three-vector $\mathbf{S}$ defined in (71) and the dual spin three-vector $\tilde{\mathbf{S}}$ defined in (72), we can write the invariant quantity $s^2$ as $s^2 \equiv \frac{1}{2} S_{\mu\nu} S^{\mu\nu} = \mathbf{S}^2 - \tilde{\mathbf{S}}^2$. (106) We can also contract two copies of the spin tensor $S^{\mu\nu}$ with the Levi-Civita symbol (96) to obtain another quantity with the same units as $s^2$: $\tilde{s}^2 \equiv \frac{1}{8} \epsilon_{\mu\nu\rho\sigma} S^{\mu\nu} S^{\rho\sigma} = \mathbf{S} \cdot \tilde{\mathbf{S}}$.
(107) This quantity is invariant under spacetime translations and also under proper orthochronous Lorentz transformations. However, in light of the transformation rules (85)–(86), $\tilde{s}^2$ changes by an overall sign under parity transformations, so it is called a pseudoscalar. As was true for the scalar invariant quantities $m^2$, $w^2$, and $s^2$, the pseudoscalar quantity $\tilde{s}^2$ cannot change in value under smooth evolution along the system's worldline. To understand why, observe that if $\tilde{s}^2 = 0$, then it is invariant under parity and time-reversal transformations, whereas if $\tilde{s}^2 \neq 0$, then our transitive group action of the Poincaré group can contain only the values $\pm\tilde{s}^2$, and no smooth evolution can take the system from $\tilde{s}^2 > 0$ to $\tilde{s}^2 < 0$ or vice versa. (In all our examples ahead, we will end up finding that $\tilde{s}^2 = 0$.) Classifying the possible systems whose phase spaces provide transitive group actions of the Poincaré group now reduces to selecting mutually consistent values for the invariant quantities $m^2$, $w^2$, $s^2$, and $\tilde{s}^2$, and then choosing a convenient reference state $(X_0, p_0, S_0)$ that is compatible with those fixed values. Note again that the constancy of $m^2$, $w^2$, $s^2$, and $\tilde{s}^2$ (including the constancy of the system's invariant spin-squared $s^2$) is entirely classical and has nothing to do with quantization or quantum theory. As an aside, observe that the only other candidate invariant quantities that are derivable from the system's phase-space variables are $p_\mu W^\mu = 0$, $p_\mu p_\nu S^{\mu\nu} = 0$, $W_\mu W_\nu S^{\mu\nu} = 0$, $W_\mu p_\nu S^{\mu\nu} = m^2 c^2 \tilde{s}^2$, $\epsilon_{\mu\nu\rho\sigma} W^\mu p^\nu S^{\rho\sigma} = -2w^2$. None of these expressions represents a fundamentally new quantity independent of $m^2$, $w^2$, $s^2$, and $\tilde{s}^2$, so we do not need to specify values for them as part of the definition of our transitive group action of the Poincaré group.

F.
The Generators of the Lorentz Group Observe that the system's phase space (87) is fully parametrized by the values aμ and Λμν that make up the Poincaré transformation (a,Λ), where aμ encodes the system's spacetime location and Λμν encodes the system's motion and angular orientation. Lorentz-transformation matrices are difficult to manipulate directly, due to the constraint ΛTηΛ = η from (50), so we will find it useful to decompose them into simpler ingredients [13]. We start by considering a Lorentz transformation Λ(ε) = 1 + ε that differs only infinitesimally from the identity 1: Λαβ(ε) = δ α β + ε α β . (108) Here εαβ represents a collection of infinitesimal parameters and δαβ is the four-dimensional Kronecker delta, δαβ ≡ { 1 if α = β, 0 if α 6= β, (109) which represents the components of the identity matrix. The constraint ΛTηΛ = η then yields the equation (δαβ + ε α β)ηαγ(δ γ δ + ε γ δ) = ηβδ. Working to first order, we see that the infinitesimal tensor εαβ obtained from εαβ by raising its second index using the Minkowski metric tensor is antisymmetric: εαβ = −εβα. (110) The tensor εαβ therefore has six independent components, with εyz, εzx, εxy respectively parametrizing rotations around the x, y, z axes and with εtx, εty, εtz respectively parametrizing Lorentz boosts in the x, y, z directions. We can write any two-index, antisymmetric Lorentz tensor Aαβ = −Aβα as Aαβ = 1 2 (Aαβ −Aβα) = 1 2 Aμν(δαμδ β ν − δβμδαν ), so the tensors defined by [σμν ] αβ ≡ −iδαμδβν + iδβμδαν (111) form a basis for all two-index, antisymmetric tensors: Aαβ = i 2 Aμν [σμν ] αβ . (112) We can therefore write our infinitesimal Lorentz transformation (108) as Λαβ(ε) = δ α β + i 2 εμν [σμν ] α β , (113) or, in matrix notation, with the free indices α and β suppressed, as Λ(ε) = 1 + i 2 εμνσμν . 
(114) The tensors [σμν ] α β are called the Lorentz generators and are obtained by lowering the β index in the definition (111) using the Minkowski metric tensor: [σμν ] α β = −iδαμηνβ + iημβδαν . (115) We will often suppress the "additional" α, β indices for notational economy. Note that with our overall sign convention for (115), the Lorentz generators describe active Lorentz transformations (114) in which four-vectors and Lorentz tensors are transformed and our coordinate axes remain fixed. If we instead wish to describe passive Lorentz transformations, then we could either replace σμν 7→ −σμν or εμν 7→ −εμν . By straightforward calculations, one can show that the Lorentz generators satisfy the commutation relations [σμν , σρσ] ≡ σμνσρσ − σρσσμν = iημρσνσ − iημσσνρ − iηνρσμσ + iηνσσμρ, (116) 11 and that the matrix product of two Lorentz generators σμν and σρσ on their additional α, β indices, traced over those additional indices, yields 1 2 Tr[σμνσρσ] ≡ 1 2 [σμν ]αβ [σρσ] β α = δμρ δ ν σ − δμσδνρ (117) = i[σρσ] μν . (118) This last formula implies that antisymmetric tensors Aμν satisfy the identity 1 2 Tr[σμνA] = iAμν . (119) Using this formalism, we can rewrite our system's spin tensor (93) as Sμν = − i 2 Tr[σμνS] = − i 2 Tr[σμνΛS0Λ −1]. (120) G. The Manifestly Covariant Action Functional In the absence of spin, the system's manifestly covariant action functional takes the form (39): Sno spin[X,Λ] = ∫ dλLno spin = ∫ dλ pμẊ μ. (121) Here Xμ(λ) and pμ(λ) are functions of the worldline parameter λ, and dots, as usual, denote derivatives with respect to λ. We will eventually see that this action functional is capable of accommodating particle types regardless of their mass-and, in particular, works just as well for massless particles as it does for particles with nonzero mass. 
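Since the spin term constructed below relies on the Lorentz generators, it may help to confirm their algebra numerically before proceeding. The following is a minimal sketch, assuming the index order $(t, x, y, z)$ and the metric signature $(-,+,+,+)$; the names `lorentz_generators` and `commutator_residual` are illustrative and not from the paper. It builds the matrices (115), checks the commutation relations (116) and the trace identity (119) with all indices lowered, and confirms that (113) reproduces (108).

```python
import itertools
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, index order (t, x, y, z)

def lorentz_generators():
    """SIGMA[m, n] is the 4x4 matrix [sigma_{mu nu}]^alpha_beta of (115)."""
    delta = np.eye(4)
    sigma = np.zeros((4, 4, 4, 4), dtype=complex)
    for m, n in itertools.product(range(4), repeat=2):
        # -i delta^alpha_mu eta_{nu beta} + i eta_{mu beta} delta^alpha_nu
        sigma[m, n] = (-1j * np.outer(delta[:, m], ETA[n])
                       + 1j * np.outer(delta[:, n], ETA[m]))
    return sigma

SIGMA = lorentz_generators()

def commutator_residual(sigma):
    """Largest entry of [sigma_mn, sigma_rs] minus the right-hand side of (116)."""
    worst = 0.0
    for m, n, r, s in itertools.product(range(4), repeat=4):
        lhs = sigma[m, n] @ sigma[r, s] - sigma[r, s] @ sigma[m, n]
        rhs = 1j * (ETA[m, r] * sigma[n, s] - ETA[m, s] * sigma[n, r]
                    - ETA[n, r] * sigma[m, s] + ETA[n, s] * sigma[m, r])
        worst = max(worst, float(np.abs(lhs - rhs).max()))
    return worst

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
A_lower = B - B.T        # arbitrary antisymmetric A_{mu nu}
A_mixed = ETA @ A_lower  # A^alpha_beta (eta is numerically its own inverse)

# Trace identity (119), lowered indices: (1/2) Tr[sigma_{mu nu} A] = i A_{mu nu}.
trace_side = 0.5 * np.einsum("mnab,ba->mn", SIGMA, A_mixed)

# (113) versus (108): (i/2) eps^{mu nu} sigma_{mu nu} should equal eps^alpha_beta.
eps_up = 1e-3 * (B - B.T)
generator_sum = 0.5j * np.einsum("mn,mnab->ab", eps_up, SIGMA)

print(commutator_residual(SIGMA) < 1e-12)
print(np.allclose(trace_side, 1j * A_lower))
print(np.allclose(generator_sum, eps_up @ ETA))
```

The last check also illustrates the reality remark made about (125): the explicit factors of $i$ cancel against those inside $\sigma_{\mu\nu}$, so `generator_sum` has vanishing imaginary part.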
In order to include spin in the system's action functional, we will need to develop a framework for taking derivatives of the variable Lorentz-transformation matrix $\Lambda^\mu{}_\nu(\lambda)$ with respect to the worldline parameter $\lambda$ in a manner that is consistent with the constraint $\Lambda^{\mathrm{T}} \eta \Lambda = \eta$. To this end, we examine what happens if we shift slightly forward along the system's worldline, so that $\lambda \to \lambda + d\lambda$. (122) Then using the fact that successive Lorentz transformations compose, $\Lambda'' = \Lambda' \Lambda$, and recalling the formula (114) for a Lorentz transformation that differs infinitesimally from the identity, with $d\theta^{\mu\nu} \equiv -\epsilon^{\mu\nu}$ corresponding to passive Lorentz-boost and angular parameters, we have $\Lambda(\lambda + d\lambda) = \Lambda(d\lambda)\Lambda(\lambda) = (1 - (i/2)\, d\theta^{\mu\nu}(\lambda)\, \sigma_{\mu\nu}) \Lambda(\lambda)$. (123) We can rearrange this formula to obtain the derivative of $\Lambda(\lambda)$ with respect to $\lambda$ in terms of $\dot{\theta}^{\mu\nu}(\lambda) \equiv d\theta^{\mu\nu}(\lambda)/d\lambda$: $\dot{\Lambda}(\lambda) \equiv \frac{\Lambda(\lambda + d\lambda) - \Lambda(\lambda)}{d\lambda} = -\frac{i}{2} \dot{\theta}^{\mu\nu}(\lambda)\, \sigma_{\mu\nu} \Lambda(\lambda)$. (124) Hence, $\dot{\Lambda}(\lambda) \Lambda^{-1}(\lambda) = -\frac{i}{2} \dot{\theta}^{\mu\nu}(\lambda)\, \sigma_{\mu\nu}$, and so, invoking the trace identity (119), we obtain an important formula for the rates of change $\dot{\theta}^{\mu\nu}(\lambda)$ in the Lorentz-transformation parameters: $\dot{\theta}^{\mu\nu}(\lambda) = \frac{i}{2} \mathrm{Tr}[\sigma^{\mu\nu} \dot{\Lambda}(\lambda) \Lambda^{-1}(\lambda)]$. (125) Despite the factor of $i$, this expression is purely real, due to the additional factor of $i$ in the definition (111) of $\sigma^{\mu\nu}$. We now look back at the manifestly covariant Lagrangian appearing as the integrand of our action functional (121): $L_{\text{no spin}} = p_\mu \dot{X}^\mu$. (126) Using the product rule in reverse (that is, "integration by parts" without an integration), we can move the derivative from $X^\mu(\lambda)$ to $p_\mu(\lambda)$ at the cost of an overall minus sign and an additive total derivative that does not affect the system's equations of motion. The result is $L_{\text{no spin}} = -X^\mu \dot{p}_\mu + (\text{total derivative})$.
Remembering that the system's four-momentum pμ(λ) here is fundamentally defined according to (92) in terms of its fixed reference value pμ0 and the variable Lorentztransformation matrix Λμν(λ), pμ(λ) ≡ Λμν(λ)pν0 , and relabeling indices for later convenience, we have Lno spin = −XαΛαγp γ 0 + (total derivative). Invoking (124) for the derivative of the Lorentztransformation matrix yields Lno spin = −Xα ( − i 2 θμν [σμν ] α βΛ β γ ) pγ0 + (total derivative) = 1 2 Xαi[σμν ] α βp β θμν + (total derivative). Recalling our formula (115) for the Lorentz generators [σμν ] α β , this expression simplifies to Lno spin = 1 2 Xα(δ α μηνβ − ημβδαν )pβ θμν + (total derivative) = 1 2 (Xμpν −Xνpμ)θμν + (total derivative). 12 The quantity in parentheses is precisely the system's orbital angular-momentum tensor Lμν , as defined in (65), so we end up with Lno spin = 1 2 Lμν θ μν + (total derivative). (127) The first term in (127) has precisely the form of a canonical momentum contracted with the rates of change of its corresponding canonical coordinates, where the factor of 1/2 naturally prevents the implicit summation from double-counting independent terms in the contraction of the two antisymmetric tensors Lμν = −Lνμ and θμν = −θνμ. It may seem surprising that we have managed to rewrite the system's kinetic Lagrangian Lno spin = pμẊμ in terms of what looks superficially like purely orbital angular momentum, but remember that the temporal components Lti of the orbital angular-momentum tensor are not angular momenta-in light of (66), they actually encode linear motion. Including the system's spin in the dynamics means generalizing the orbital angular-momentum tensor Lμν in (127) to the total angular-momentum tensor Jμν defined in (68), Lμν 7→ Jμν ≡ Lμν + Sμν , where Sμν is the system's spin tensor. The system's manifestly covariant Lagrangian correspondingly becomes Lno spin 7→ L ≡ 1 2 Jμν θ μν + (total derivative) = 1 2 Lμν θ μν + 1 2 Sμν θ μν + (total derivative). 
(128) At this point, we are free to recombine the first and last terms to get back the expression pμẊ μ that we started with. On the other hand, contracting both sides of our formula (125) for θμν with the system's spin tensor Sμν and using (i/2)Sμν [σ μν ]αβ = S α β from (112), we can write the second term in (128) as 1 2 Sμν(λ)θ μν(λ) = 1 2 Tr[S(λ)Λ(λ)Λ−1(λ)]. (129) Hence, as originally shown in [1, 14–16], the complete action functional for the system is S[X,Λ] = ∫ dλL = ∫ dλ ( pμẊ μ + 1 2 Tr[SΛΛ−1] ) . (130) In using the action functional (130), keep in mind that the four-momentum pμ(λ) and the spin tensor Sμν(λ) are given respectively by (92) and (120) in terms of their constant reference values pμ0 and S μν 0 together with the variable Lorentz-transformation matrix Λμν(λ): pμ(λ) ≡ Λμν(λ)pν0 , (131) Sμν(λ) ≡ Λμρ(λ)S ρσ 0 (Λ T) νσ (λ) = − i 2 Tr[σμνΛ(λ)S0Λ −1(λ)]. (132) Consequently, before the equations of motion are imposed, neither pμ(λ) nor Sμν(λ) depends on the spacetime degrees of freedom Xμ(λ). H. The Equations of Motion To obtain the system's equations of motion, we apply the extremization condition (9) by varying the action functional (130) with respect to its fundamental variables Xμ and Λμν . The spin term (1/2)Tr[SΛΛ −1] does not involve the spacetime coordinatesXμ, so varying the action functional with respect to Xμ yields δXS = ∫ dλ (pμδẊ μ + 0) = ∫ dλ pμ d dλ δXμ = − ∫ dλ ṗμδX μ, where we have dropped a boundary term. Setting this variation equal to zero for arbitrary δXμ leads to the system's first equation of motion, which we see describes conservation of energy-momentum: ṗμ = 0. (133) Notice that this equation of motion, by itself, does not determine the system's four-velocity Ẋμ ≡ dXμ/dλ, or even establish any sort of relationship between pμ and Ẋμ. We will return to this issue later. 
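Varying the action functional with respect to the variable Lorentz-transformation matrix $\Lambda^\mu{}_\nu$ is more complicated, due to its appearance in both terms in the integrand. As our first step, we find $\delta_\Lambda S = \int d\lambda \left( (\delta p^\mu) \dot{X}_\mu + \frac{1}{2} \mathrm{Tr}[\delta(S \dot{\Lambda} \Lambda^{-1})] \right)$. (134) Invoking our formula (131) for the four-momentum $p^\mu$ in terms of its reference value $p_0^\mu$ and the Lorentz-transformation matrix $\Lambda^\mu{}_\nu$, the first term in (134) gives $(\delta p^\mu) \dot{X}_\mu = (\delta \Lambda^\mu{}_\nu) p_0^\nu \dot{X}_\mu = (-(i/2)\, \delta\theta^{\rho\sigma} \sigma_{\rho\sigma} \Lambda)^\mu{}_\nu\, p_0^\nu \dot{X}_\mu = -\frac{i}{2} \delta\theta^{\rho\sigma} [\sigma_{\rho\sigma}]^\mu{}_\nu\, p^\nu \dot{X}_\mu = -\frac{i}{2} \delta\theta^{\rho\sigma} (-i \delta^\mu_\rho \eta_{\sigma\nu} + i \eta_{\rho\nu} \delta^\mu_\sigma) p^\nu \dot{X}_\mu = \frac{1}{2} (-\dot{X}_\rho p_\sigma + \dot{X}_\sigma p_\rho)\, \delta\theta^{\rho\sigma}$. Meanwhile, using $S^\alpha{}_\beta = (\Lambda S_0 \Lambda^{-1})^\alpha{}_\beta$, the second term in (134) gives $\frac{1}{2} \mathrm{Tr}[\delta(S \dot{\Lambda} \Lambda^{-1})] = \frac{1}{2} \mathrm{Tr}[S_0\, \delta(\Lambda^{-1} \dot{\Lambda})] = \frac{1}{2} \mathrm{Tr}[S_0 \Lambda^{-1} (-(i/2)\, \delta\dot{\theta}^{\rho\sigma} \sigma_{\rho\sigma}) \Lambda] = -\frac{i}{4} \mathrm{Tr}[S_0 \Lambda^{-1} \sigma_{\rho\sigma} \Lambda]\, \delta\dot{\theta}^{\rho\sigma} = \frac{1}{2} S_{\rho\sigma} \frac{d}{d\lambda} \delta\theta^{\rho\sigma}$, where we have invoked (132) in the last step. Thus, dropping a boundary term, we see that the overall variation (134) in the action functional reduces to $\delta_\Lambda S = \int d\lambda\, \frac{1}{2} (-\dot{X}_\rho p_\sigma + \dot{X}_\sigma p_\rho - \dot{S}_{\rho\sigma})\, \delta\theta^{\rho\sigma}$. Setting this variation equal to zero for arbitrary $\delta\theta^{\rho\sigma}$ leads to the system's second equation of motion: $\dot{S}^{\mu\nu} = -\dot{X}^\mu p^\nu + \dot{X}^\nu p^\mu$. (135) To provide an interpretation for this equation of motion, we recall again the definition (65) of the tensor $L^{\mu\nu}$ that encodes the system's orbital angular momentum: $L^{\mu\nu} \equiv X^\mu p^\nu - X^\nu p^\mu$. Because the system's four-momentum $p^\mu$ is conserved, (133), we see that the rate of change in $L^{\mu\nu}$ is given by $\dot{L}^{\mu\nu} = \dot{X}^\mu p^\nu - \dot{X}^\nu p^\mu$, (136) so we can recast the equation of motion (135) for the spin tensor $S^{\mu\nu}$ as the statement that the system's total angular momentum $J^{\mu\nu} \equiv L^{\mu\nu} + S^{\mu\nu}$ is conserved: $\dot{J}^{\mu\nu} = 0$. (137) Combining $\dot{p}^\mu = 0$ and $\dot{J}^{\mu\nu} = 0$, it follows immediately that the system's Pauli-Lubanski pseudovector (97) is likewise constant in time: $\dot{W}^\mu = 0$. (138) At a deeper level, the system's two equations of motion (133), $\dot{p}^\mu = 0$, and (137), $\dot{J}^{\mu\nu} = 0$, are consequences of Noether's theorem together with the fact that the system's action functional (130) has continuous symmetries under spacetime translations and Lorentz transformations.

I.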
Self-Consistency Conditions on the Phase Space Now that we know the system's equations of motion, we will need to ensure that they are consistent with the invariance of the fixed quantities m2, w2, s2, and s2 from (103)–(107). For our first check of self-consistency, we note that the invariance of p2 = −m2c2 is compatible with the equation of motion (133), ṗμ = 0: d dλ (p2) = 2pμṗ μ = 0. (139) Similarly, the constancy of W 2 = w2 is compatible with the constancy (138) of the Pauli-Lubanski pseudovector: d dλ (W 2) = 2WμẆ μ = 0. (140) On the other hand, the constancy of the spin-squared scalar (1/2)SμνS μν ≡ s2, combined with the equation of motion (135), Ṡμν = −Ẋμpν + Ẋνpμ, requires that d dλ ( 1 2 SμνS μν ) = Sμν Ṡ μν = 2ẊνpμSμν = 0. (141) Again, keep in mind that we have not yet established a definite relationship between the system's fourmomentum pμ and its four-velocity Ẋμ ≡ dXμ/dλ. In particular, it is not clear at this point whether or not pμ is proportional to Ẋμ, so the condition (141) is not trivial. Because the condition (141) must hold for all solution trajectories, it imposes an additional requirement on the system's phase space: The system's reference fourmomentum pμ0 and its reference spin tensor S μν 0 must satisfy p0,μS μν 0 = 0, (142) where because this contraction vanishes in one inertial reference frame, it remains zero under all Poincaré transformations and therefore represents a Poincaré-invariant statement about the system's phase space [17]: pμS μν = 0. (143) Our final self-consistency condition is that the derivative of the pseudoscalar invariant quantity (1/8)εμνρσS μνSρσ ≡ s2 must vanish: d dλ ( 1 8 εμνρσS μνSρσ ) = 1 4 εμνρσṠ μνSρσ = −1 2 εμνρσẊ μpνSρσ = ẊμWμ = 0. (144) We will need to verify in the explicit examples ahead that this condition is indeed satisfied. J. The Four-Velocity The self-consistency condition (143), pμS μν = 0, will play an important role in our work ahead. 
As we will now investigate, its implications include a general set of relationships between the system's four-momentum $p^\mu$ and its four-velocity $\dot{X}^\mu$. Taking a derivative of both sides of $p_\mu S^{\mu\nu} = 0$ with respect to the worldline parameter $\lambda$ and invoking the equations of motion (133), $\dot{p}^\mu = 0$, and (135), $\dot{S}^{\mu\nu} = -\dot{X}^\mu p^\nu + \dot{X}^\nu p^\mu$, we obtain $p_\mu \dot{S}^{\mu\nu} = -(p \cdot \dot{X}) p^\nu + (-m^2 c^2) \dot{X}^\nu = 0$, (145) which gives us an equation that relates $p^\mu$ and $\dot{X}^\mu$: $(p \cdot \dot{X}) p^\mu = (-m^2 c^2) \dot{X}^\mu$. (146) Contracting both sides with $\dot{X}_\mu$, we find $(p \cdot \dot{X})^2 = m^2 c^2 (-\dot{X}^2)$, (147) and thus we arrive at the following pair of equations: $p \cdot \dot{X} = \pm mc^2 \sqrt{-\dot{X}^2/c^2}$, (148) $m \sqrt{-\dot{X}^2/c^2}\, p^\mu = \mp m^2 \dot{X}^\mu$. (149)

V. CLASSIFICATION OF THE TRANSITIVE GROUP ACTIONS OF THE ORTHOCHRONOUS POINCARÉ GROUP

We are now ready to apply the preceding framework to classifying systems whose phase spaces provide transitive group actions of the Poincaré group. For simplicity, we will focus our attention on transitive group actions of the orthochronous Poincaré group, putting aside time-reversal transformations (55) until our paper's conclusion. Notice then that for $m^2 \geq 0$, the system's four-momentum $p^\mu$ is either timelike or null, $p^2 \leq 0$, and so (56) implies that the sign of $p^t$ is an invariant property of the system. When we consider transitive group actions having $m^2 \geq 0$, we will assume the positive-energy case $p^t > 0$ on physical grounds. We will address the "negative-energy" case $p^t < 0$ in our conclusion.

A. Massive, Positive-Energy Particles

As our first example, we consider a transitive group action of the orthochronous Poincaré group for which $m > 0$ is real and positive and the system's energy $E = p^t c > 0$ is likewise positive. Then $p^\mu$ is a timelike four-vector, so we know from (56) that the sign of $p^t$ is invariant under orthochronous Lorentz transformations and thus our choice of positive energy is well-defined.
Given that $p^2 = -m^2 c^2$ for $m > 0$ with positive $p^t$, we can express the system's energy $E = p^t c$ in terms of its three-dimensional momentum $\mathbf{p} = (p_x, p_y, p_z)$ as $E = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}$, (150) a formula known as the system's mass-shell relation because it takes the visual form of a hyperboloid (a "shell") when plotted in terms of the four variables $E$, $p_x$, $p_y$, $p_z$. Furthermore, there exists a state of the system in which the four-momentum $p^\mu$ takes the specific value $(mc, \mathbf{0})^\mu$, which we will choose to be its reference value: $p_0^\mu \equiv (mc, \mathbf{0})^\mu = mc\, \delta^\mu_t$. (151) Due to the condition $m > 0$, the four-momentum $p^\mu$ cannot vanish, and under our assumption of a strictly monotonic parametrization $X^\mu(\lambda)$, the four-velocity $\dot{X}^\mu$ cannot vanish either, so the relation (149), $m \sqrt{-\dot{X}^2/c^2}\, p^\mu = \mp m^2 \dot{X}^\mu$, implies that $\dot{X}^2 \neq 0$. We therefore have $p^\mu = m \dot{X}^\mu / \sqrt{-\dot{X}^2/c^2}$, where we have taken the positive sign by choosing our parametrization $X^\mu(\lambda)$ such that $\dot{X}^\mu$ is future-directed. We therefore learn that the system's four-momentum $p^\mu$ is given by $p^\mu = m u^\mu$, (152) where $u^\mu$ is the system's normalized four-velocity: $u^\mu \equiv \dot{X}^\mu / \sqrt{-\dot{X}^2/c^2}$, $u^2 = -c^2$. (153) We can interpret the equation (152) as supplying our definition of $\dot{X}^\mu$ (or $u^\mu$) in terms of $p^\mu$ and $m$. Furthermore, because $p^\mu$ is parallel to $u^\mu$ and $p^\mu W_\mu = 0$, we see that the self-consistency condition (144), $\dot{X}^\mu W_\mu = 0$, is satisfied. As a consequence of (153), we also see that when the system is in its reference state with $p^\mu = p_0^\mu = (mc, \mathbf{0})^\mu$, the four-velocity describes the system at rest, with $u_0^\mu = (c, \mathbf{0})^\mu = u_{\text{rest}}^\mu$. (154) For general states, the equation of motion (133) for the system's four-momentum, $\dot{p}^\mu = 0$, tells us that the system's normalized four-velocity is constant, $\dot{u}^\mu = 0$, (155) so the system describes a pointlike particle that travels along a straight, timelike path in spacetime.
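The relations (150)–(153) are easy to confirm with numbers. The following is a minimal sketch, assuming the signature $(-,+,+,+)$ and $X^t = cT$; the values of `c`, `m`, and `x_dot` are arbitrary illustrative choices rather than physical data, and the helper names are not from the paper. It checks that $u^\mu$ is insensitive to the choice of worldline parametrization, that $u^2 = -c^2$, and that $p^\mu = m u^\mu$ lies on the mass shell.

```python
import numpy as np

def minkowski_dot(a, b):
    """Lorentz dot product with signature (-,+,+,+)."""
    return -a[0] * b[0] + a[1:] @ b[1:]

def normalized_four_velocity(x_dot, c):
    """u^mu = Xdot^mu / sqrt(-Xdot^2 / c^2), as in (153); Xdot must be timelike."""
    return x_dot / np.sqrt(-minkowski_dot(x_dot, x_dot) / c**2)

c, m = 2.0, 3.0                              # illustrative units only
x_dot = np.array([c * 1.5, 0.6, 0.0, 0.8])   # future-directed and timelike

u = normalized_four_velocity(x_dot, c)
u_rescaled = normalized_four_velocity(5.0 * x_dot, c)  # a different parametrization

p = m * u                                    # four-momentum (152)
E = p[0] * c                                 # energy E = p^t c
E_mass_shell = np.sqrt(p[1:] @ p[1:] * c**2 + m**2 * c**4)  # right-hand side of (150)

print(np.allclose(u, u_rescaled))
print(np.isclose(minkowski_dot(u, u), -c**2))
print(np.isclose(minkowski_dot(p, p), -(m * c)**2))
print(np.isclose(E, E_mass_shell))
```

Rescaling `x_dot` by any positive constant leaves `u` unchanged, which is the sense in which (152) defines the four-velocity only up to reparametrization of the worldline.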
Defining the particle's three-dimensional velocity $\mathbf{v} = (v_x, v_y, v_z)$ as $\mathbf{v} \equiv \frac{d\mathbf{X}}{dt} = \frac{\dot{\mathbf{X}}}{\dot{T}}$, (156) and using (152), $p^\mu = m u^\mu$, together with $E = p^t c$ and the mass-shell relation (150) between $E$ and $\mathbf{p}$, we also obtain an important equation connecting the system's three-dimensional velocity $\mathbf{v}$ and its three-dimensional momentum $\mathbf{p}$: $\mathbf{v} = \frac{\mathbf{p} c^2}{E} = \frac{\mathbf{p}}{|\mathbf{p}|} \frac{c}{\sqrt{1 + m^2 c^2/\mathbf{p}^2}}$. (157) We see right away from this equation that the particle's speed $|\mathbf{v}|$ is always less than the speed of light $c$: $|\mathbf{v}| < c$. (158) Moreover, when the particle is in motion, its normalized four-velocity is $u^\mu = (\gamma c, \gamma \mathbf{v})^\mu$, (159) where the Lorentz factor $\gamma$ is defined by $\gamma \equiv \frac{1}{\sqrt{1 - \mathbf{v}^2/c^2}} \geq 1$. (160) We next examine the particle's orbital and spin angular momentum. The relation (152), $p^\mu = m u^\mu = m \dot{X}^\mu / \sqrt{-\dot{X}^2/c^2}$, immediately implies that the particle's orbital angular momentum (65) is conserved: $\dot{L}^{\mu\nu} = \dot{X}^\mu p^\nu - \dot{X}^\nu p^\mu = 0$. (161) Remembering our formula (66) for the temporal components $L^{ti}$ of the orbital angular-momentum tensor, $L^{ti} = -\frac{E}{c} \left( X^i - \frac{p^i c^2}{E} T \right)$, and invoking the constancy of $E$ and $p^i$ from the equation of motion (133) for $p^\mu$, we see that $\dot{L}^{ti} = 0$ gives the relation $\frac{\mathbf{p} c^2}{E} = \frac{\dot{\mathbf{X}}}{\dot{T}}$, which is just our earlier equation (157) connecting the particle's three-dimensional velocity $\mathbf{v}$ to its three-dimensional momentum $\mathbf{p}$ [18]. Combining the conservation equation (161) for the particle's orbital angular-momentum tensor $L^{\mu\nu}$ with the equation of motion (135) for the particle's spin tensor $S^{\mu\nu}$ tells us that the particle's spin is separately conserved: $\dot{S}^{\mu\nu} = 0$. (162) Furthermore, the condition (142), $p_{0,\mu} S_0^{\mu\nu} = 0$, becomes $mc\, S_0^{t\nu} = 0$, (163) so only the purely spatial components of the particle's reference spin tensor $S_0^{\mu\nu}$ are nonzero, $S_0^{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & S_{0,z} & -S_{0,y} \\ 0 & -S_{0,z} & 0 & S_{0,x} \\ 0 & S_{0,y} & -S_{0,x} & 0 \end{pmatrix}^{\mu\nu}$, (164) where the particle's spin three-vector $\mathbf{S} \equiv (S^{yz}, S^{zx}, S^{xy})$ was defined in (71).
Thus, the invariant quantity $s^2$ defined in (106) and characterizing the system's overall spin is non-negative: $s^2 = \mathbf{S}^2 - \tilde{\mathbf{S}}^2 = \mathbf{S}_0^2 = S_{0,x}^2 + S_{0,y}^2 + S_{0,z}^2 \geq 0$. (165) The corresponding reference value $W_0^\mu$ of the Pauli-Lubanski pseudovector (98) is then $W_0^\mu = (0, mc\, \mathbf{S}_0)^\mu$. (166) The Lorentz dot product of $W^\mu$ with itself therefore has the non-negative, Lorentz-invariant value $W^2 \equiv w^2 = m^2 c^2 s^2 \geq 0$. (167) Notice that the reference value of the particle's dual spin three-vector $\tilde{\mathbf{S}} \equiv (S^{tx}, S^{ty}, S^{tz})$, as defined in (72), vanishes in this case: $\tilde{\mathbf{S}}_0 = 0$. (168) It follows that the pseudoscalar invariant quantity $\tilde{s}^2$ defined in (107) likewise vanishes: $\tilde{s}^2 = \mathbf{S} \cdot \tilde{\mathbf{S}} = \mathbf{S}_0 \cdot \tilde{\mathbf{S}}_0 = 0$. (169) On physical grounds, a localized system at fixed energy should have a compact (that is, closed and bounded) set of states, because otherwise its Boltzmann entropy under any equitable choice of coarse-graining of the system's phase space would be infinite and thus the system would exhibit an infinite heat capacity [19]. The compactness of a system's phase space at fixed energy in any one inertial reference frame determines the compactness of the system's phase space in any other inertial reference frame at the correspondingly Lorentz-transformed energy, so it suffices to study the compactness of our particle's phase space at the fixed reference energy $E_0 = p_0^t c = mc^2$ corresponding to the reference value (151) of the particle's four-momentum. The size of this subset of the particle's phase space is determined by the set of all orthochronous Poincaré transformations that leave the particle's reference four-momentum $p_0^\mu \equiv (mc, \mathbf{0})^\mu$ fixed. This collection of transformations is called the little group of $p_0^\mu$.
In the present case, in which $p_0^\mu = (mc, \mathbf{0})^\mu$, this little group consists solely of the group O(3) of three-dimensional rotations and parity transformations, which collectively form a compact set, so we are assured that the particle's phase space at any fixed energy is likewise compact, as required. To summarize, we see that a transitive group action of the orthochronous Poincaré group for the case of a real and positive $m > 0$ and positive energy $E = p^t c > 0$ describes a massive pointlike particle of inertial mass $m$, non-negative spin-squared $s^2 = \mathbf{S}_0^2 \geq 0$, non-negative squared Pauli-Lubanski pseudovector $w^2 = m^2 c^2 s^2 \geq 0$, and timelike four-momentum $p^\mu = m u^\mu$. The particle moves along a straight worldline in spacetime characterized by a normalized four-velocity $u^\mu \equiv \dot{X}^\mu / \sqrt{-\dot{X}^2/c^2}$ and a three-dimensional velocity $\mathbf{v} = \mathbf{p} c^2/E$ whose magnitude is always less than the speed of light, $|\mathbf{v}| < c$, and the particle has a compact phase space at any fixed value of its energy $E$.

B. Massless, Positive-Energy Particles

As our second example, we consider the case of $m = 0$ and positive energy $E = p^t c > 0$. Because the system's four-momentum $p^\mu$ is therefore null, $p^2 = 0$, we again have from (56) that the condition $p^t > 0$ is invariant under orthochronous Lorentz transformations and thus our positivity condition on $E$ is well-defined. We can use $p^2 = 0$ to express the system's energy $E = p^t c$ in terms of its three-dimensional momentum $\mathbf{p}$ as the mass-shell relation $E = |\mathbf{p}| c$. (170) There exists a state in the system's phase space in which the four-momentum $p^\mu$ has no $x$ or $y$ components, and we take that value of the four-momentum to be its reference value: $p_0^\mu \equiv (E_0/c, 0, 0, E_0/c)^\mu = \frac{E_0}{c} (\delta^\mu_t + \delta^\mu_z)$. (171) The positive-energy condition $E > 0$ implies that the four-momentum $p^\mu$ cannot vanish, and under our assumption of a strictly monotonic parametrization $X^\mu(\lambda)$, the four-velocity $\dot{X}^\mu$ also cannot vanish. With $m = 0$, the relation (148) degenerates to $p \cdot \dot{X} = 0$.
We can therefore take the four-velocity $\dot{X}^\mu$ to be a null vector that is parallel to the four-momentum $p^\mu$, $p^\mu \propto \dot{X}^\mu$, (172) which then ensures that the self-consistency condition (144), $\dot{X}^\mu W_\mu = 0$, is satisfied. The equation of motion (133), $\dot{p}^\mu = 0$, implies that $p^\mu$ is constant along the system's worldline, so we can always choose our parametrization $X^\mu(\lambda)$ to make the proportionality factor in (172) equal to a constant: $p^\mu = (\text{const}) \dot{X}^\mu$. (173) We then have $\ddot{X}^\mu = 0$, (174) so we see that the system describes a pointlike particle that travels along a straight, null path in spacetime. In addition, invoking the mass-shell relation (170) between the particle's energy $E$ and its three-dimensional momentum $\mathbf{p}$, we see that the particle's three-dimensional velocity $\mathbf{v}$ is related to its three-dimensional momentum $\mathbf{p}$ according to $\mathbf{v} = \frac{d\mathbf{X}}{dt} = \frac{\dot{\mathbf{X}}}{\dot{T}} = \frac{\mathbf{p} c^2}{E} = \frac{\mathbf{p}}{|\mathbf{p}|} c$. (175) Hence, the particle's speed $|\mathbf{v}|$ is always equal to the speed of light $c$: $|\mathbf{v}| = c$. (176) Turning to the particle's spin, we will find a much more nuanced story than in the massive case. The proportionality relationship (172) together with the equation of motion (135) for the particle's spin tensor $S^{\mu\nu}$ again imply that the particle's orbital angular momentum (65) and the particle's spin are separately conserved, $\dot{L}^{\mu\nu} = \dot{X}^\mu p^\nu - \dot{X}^\nu p^\mu = 0$, (177) $\dot{S}^{\mu\nu} = 0$. (178) As in the massive case, the conservation law for $L^{ti}$ gives back the formula (175) relating the particle's three-dimensional velocity $\mathbf{v}$ to its three-dimensional momentum $\mathbf{p}$. However, the condition (142), $p_{0,\mu} S_0^{\mu\nu} = 0$, is more complicated than it was in the massive case: $-\frac{E_0}{c} S_0^{t\nu} + \frac{E_0}{c} S_0^{z\nu} = 0$. (179) This equation implies that $S_0^{t\nu} = S_0^{z\nu}$, (180) or, equivalently, that the quantities $A \equiv S_x + \tilde{S}_y$, $B \equiv S_y - \tilde{S}_x$, and $\tilde{S}_z$ all vanish in the particle's reference state: $A_0 \equiv S_{0,x} + \tilde{S}_{0,y} = 0$, $B_0 \equiv S_{0,y} - \tilde{S}_{0,x} = 0$, $\tilde{S}_{0,z} = 0$. (181) The reference value of the system's spin tensor is therefore $S_0^{\mu\nu} = \begin{pmatrix} 0 & S_{0,y} & -S_{0,x} & 0 \\ -S_{0,y} & 0 & S_{0,z} & -S_{0,y} \\ S_{0,x} & -S_{0,z} & 0 & S_{0,x} \\ 0 & S_{0,y} & -S_{0,x} & 0 \end{pmatrix}^{\mu\nu}$.
(182) In other words, the reference values of the particle's spin three-vector $\mathbf{S} \equiv (S^{yz}, S^{zx}, S^{xy})$, as defined in (71), and the reference value of the particle's dual spin three-vector $\tilde{\mathbf{S}} \equiv (S^{tx}, S^{ty}, S^{tz})$, as defined in (72), are mutually perpendicular and are related explicitly by $\tilde{\mathbf{S}}_0 = \mathbf{S}_0 \times \mathbf{e}_z$, (183) where $\mathbf{e}_z \equiv (0, 0, 1)$ is the Cartesian unit vector pointing along the positive $z$ axis. It follows that the pseudoscalar invariant quantity $\tilde{s}^2$ defined in (107) vanishes, as we also saw was true in the massive case: $\tilde{s}^2 = \mathbf{S} \cdot \tilde{\mathbf{S}} = \mathbf{S}_0 \cdot \tilde{\mathbf{S}}_0 = 0$. (184) Meanwhile, the invariant quantity $s^2$ defined in (106) is non-negative, as in the massive case, but is now determined solely by the $z$ component $S_{0,z}$ of the reference value of the particle's spin three-vector $\mathbf{S}_0$: $s^2 = \mathbf{S}^2 - \tilde{\mathbf{S}}^2 = S_{0,z}^2 \geq 0$. (185) In general, the projection of the particle's spin three-vector $\mathbf{S}$ onto the direction of the particle's three-dimensional momentum $\mathbf{p} \equiv (p_x, p_y, p_z)$ is called the particle's helicity $\sigma$: $\sigma \equiv \frac{\mathbf{p}}{|\mathbf{p}|} \cdot \mathbf{S}$. (186) The massless particle's helicity is insensitive to our reference choice of energy $E_0$ and is invariant under proper rotations, so we see that $\sigma$ represents a fundamental feature of the particle in the $m = 0$ case that can only change under parity transformations (54): $\sigma \mapsto -\sigma$ (parity). (187) We can use $\sigma$ to write our expression (185) for the invariant quantity $s^2$ as $s^2 = \sigma^2 \geq 0$. (188) The reference value $W_0^\mu$ of the particle's Pauli-Lubanski pseudovector (98) is parallel to the particle's reference four-momentum (171): $W_0^\mu = \left( S_{0,z} \frac{E_0}{c}, 0, 0, S_{0,z} \frac{E_0}{c} \right)^\mu = S_{0,z}\, p_0^\mu$. (189) More generally, $W^\mu$ is given in terms of the particle's helicity (186) by $W^\mu = \sigma p^\mu$. (190) As a consequence, we see that the invariant quantity $w^2$ defined in (104) vanishes: $W^2 \equiv w^2 = 0$. (191) As in the massive case, we will need to examine the compactness of the subset of the particle's phase space at the fixed reference energy $E_0 = p_0^t c$.
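The properties of the massless reference state collected in (179)–(191) can be verified directly from the matrix (182). The following is a minimal sketch, assuming the index order $(t, x, y, z)$, the signature $(-,+,+,+)$, and $\epsilon_{txyz} = +1$; the helper `spin_tensor` and the numerical values of `spin0` and `E0_over_c` are illustrative assumptions, not taken from the paper.

```python
import itertools
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, index order (t, x, y, z)

# Levi-Civita symbol eps_{mu nu rho sigma} with eps_{txyz} = +1.
EPS = np.zeros((4, 4, 4, 4))
for perm in itertools.permutations(range(4)):
    EPS[perm] = int(round(np.linalg.det(np.eye(4)[list(perm)])))

def spin_tensor(spin, dual_spin):
    """S^{mu nu} from S = (S^{yz}, S^{zx}, S^{xy}) and S~ = (S^{tx}, S^{ty}, S^{tz})."""
    sx, sy, sz = spin
    dx, dy, dz = dual_spin
    return np.array([[0.0,  dx,  dy,  dz],
                     [-dx, 0.0,  sz, -sy],
                     [-dy, -sz, 0.0,  sx],
                     [-dz,  sy, -sx, 0.0]])

E0_over_c = 2.0                                  # arbitrary reference energy scale
spin0 = np.array([0.7, -1.1, 0.4])               # arbitrary reference spin three-vector
dual0 = np.cross(spin0, [0.0, 0.0, 1.0])         # dual spin fixed by (183)
S0 = spin_tensor(spin0, dual0)                   # reproduces the matrix (182)
p0 = E0_over_c * np.array([1.0, 0.0, 0.0, 1.0])  # reference four-momentum (171)

transversality = (ETA @ p0) @ S0                        # p_{0,mu} S0^{mu nu}, see (142)
s_squared = 0.5 * np.sum((ETA @ S0 @ ETA) * S0)         # (1/2) S_{mu nu} S^{mu nu}
s_tilde_squared = np.einsum("mnrs,mn,rs->", EPS, S0, S0) / 8.0
helicity = p0[1:] @ spin0 / np.linalg.norm(p0[1:])      # (186)
W0 = ETA @ (-0.5 * np.einsum("mnrs,n,rs->m", EPS, p0, S0))  # (98), index raised

print(np.allclose(transversality, 0.0))
print(np.isclose(s_squared, helicity**2), np.isclose(s_tilde_squared, 0.0))
print(np.allclose(W0, helicity * p0))
```

Changing the transverse entries of `spin0` leaves `helicity`, `s_squared`, and `W0` untouched, which anticipates the role those components play in the discussion of the little group.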
Again, this subspace is determined by the little group of the particle's reference four-momentum (171), meaning the set of all orthochronous Poincaré transformations that leave $p_0^\mu \equiv (E_0/c, 0, 0, E_0/c)^\mu$ invariant. As a trick for finding these little-group transformations [20], let $\Lambda$ be a little-group transformation, so that $\Lambda p_0 = p_0$, and let $v^\mu \equiv (1, \mathbf{0})^\mu$ be a purely timelike four-vector. Then $(\Lambda v) \cdot p_0 = -(\Lambda v)^t \frac{E_0}{c} + (\Lambda v)^z \frac{E_0}{c}$ and also $(\Lambda v) \cdot p_0 = (\Lambda v) \cdot (\Lambda p_0) = v \cdot p_0 = -\frac{E_0}{c}$, from which we conclude that $(\Lambda v)^t = 1 + (\Lambda v)^z$ and thus that $(\Lambda v)^\mu$ has the form $(\Lambda v)^\mu = (1 + \zeta, \alpha, \beta, \zeta)^\mu$ for real-valued parameters $\alpha$, $\beta$, and $\zeta$, where the normalization condition $(\Lambda v)^2 = v^2 = -1$ implies that $\zeta = \frac{\alpha^2 + \beta^2}{2}$. The effect of the little-group Lorentz-transformation matrix $\Lambda$ on $v^\mu \equiv (1, \mathbf{0})^\mu$ fixes $\Lambda$ up to an overall three-dimensional rotation, and the little-group requirement $\Lambda p_0 = p_0$ further fixes $\Lambda$ up to a rotation specifically around the $z$ axis. Hence, the most general such Lorentz-transformation matrix $\Lambda$ has the form $\Lambda(\theta, \alpha, \beta) = R(\theta) L(\alpha, \beta)$, (192) where $R(\theta) \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$ (193) is a pure rotation by an angle $\theta$ around the $z$ axis and where $L(\alpha, \beta) \equiv \begin{pmatrix} 1 + \zeta & \alpha & \beta & -\zeta \\ \alpha & 1 & 0 & -\alpha \\ \beta & 0 & 1 & -\beta \\ \zeta & \alpha & \beta & 1 - \zeta \end{pmatrix}$ (194) is a complicated combination of Lorentz boosts and rotations satisfying the required condition $\Lambda^{\mathrm{T}} \eta \Lambda = \eta$ from (50). By straightforward calculations, one can show that $R(\theta_1) R(\theta_2) = R(\theta_1 + \theta_2)$, (195) $L(\alpha_1, \beta_1) L(\alpha_2, \beta_2) = L(\alpha_1 + \alpha_2, \beta_1 + \beta_2)$, (196) so rotations $R(\theta)$ around the $z$ axis and the Lorentz transformations $L(\alpha, \beta)$ respectively form a pair of commutative subgroups of the particle's little group. Furthermore, we have $R(\theta) L(\alpha, \beta) R^{-1}(\theta) = L(\alpha \cos\theta + \beta \sin\theta, -\alpha \sin\theta + \beta \cos\theta)$, (197) so we see that rotating $L(\alpha, \beta)$ itself around the $z$ axis has the effect of rotating the two-dimensional vector $(\alpha, \beta)$. The little group is therefore the group ISO(2) of translations and rotations in the two-dimensional Euclidean plane.
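The "straightforward calculations" behind (192)–(197) can be delegated to a few lines of linear algebra. The following is a minimal sketch, assuming the index order $(t, x, y, z)$ and the signature $(-,+,+,+)$; the function names `rotation_z` and `null_rotation` and the sample parameter values are illustrative, not from the paper.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, index order (t, x, y, z)

def rotation_z(theta):
    """R(theta) of (193): a rotation by theta around the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0,   c,   s, 0.0],
                     [0.0,  -s,   c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def null_rotation(alpha, beta):
    """L(alpha, beta) of (194), with zeta = (alpha^2 + beta^2) / 2."""
    zeta = 0.5 * (alpha**2 + beta**2)
    return np.array([[1.0 + zeta, alpha, beta, -zeta],
                     [alpha,        1.0,  0.0, -alpha],
                     [beta,         0.0,  1.0, -beta],
                     [zeta,       alpha, beta, 1.0 - zeta]])

p0 = np.array([1.0, 0.0, 0.0, 1.0])   # reference four-momentum (171) in units of E0/c
theta, a1, b1, a2, b2 = 0.9, 0.3, -0.8, 1.7, 0.5

R = rotation_z(theta)
L1, L2 = null_rotation(a1, b1), null_rotation(a2, b2)

is_lorentz = np.allclose(L1.T @ ETA @ L1, ETA) and np.allclose(R.T @ ETA @ R, ETA)
fixes_p0 = np.allclose(L1 @ p0, p0) and np.allclose(R @ p0, p0)
composes = np.allclose(L1 @ L2, null_rotation(a1 + a2, b1 + b2))        # (196)
conjugates = np.allclose(
    R @ L1 @ np.linalg.inv(R),
    null_rotation(a1 * np.cos(theta) + b1 * np.sin(theta),
                  -a1 * np.sin(theta) + b1 * np.cos(theta)),
)                                                                       # (197)

print(is_lorentz, fixes_p0, composes, conjugates)
```

The additive composition law makes the noncompactness explicit: `null_rotation(alpha, beta)` is a valid little-group element for arbitrarily large `alpha` and `beta`, whereas `rotation_z(theta)` is periodic in `theta`.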
The subgroup SO(2) consisting purely of rotations $R(\theta)$ in the two-dimensional plane is compact, but the subgroup $\mathbb{R}^2$ consisting of two-dimensional translations $L(\alpha, \beta)$ is noncompact. The consequence is that the particle's phase space at the fixed reference four-momentum $p_0^\mu$ would seem to be noncompact as well, leading to the thermodynamic problems that we discussed earlier, as well as various issues that arise in the corresponding quantum field theory, such as those that are explored in [21], for example [22]. The particle's reference spacetime coordinates $X_0^\mu \equiv 0$, four-momentum $p_0^\mu \equiv (E_0/c, 0, 0, E_0/c)^\mu$, helicity $\sigma = S_{0,z}$, and Pauli-Lubanski pseudovector $W_0^\mu = \sigma p_0^\mu$ are all invariant under the little group, and are therefore insensitive to the noncompact transformations $L(\alpha, \beta)$. But the particle's reference spin tensor (182) transforms nontrivially under the action of $L(\alpha, \beta)$: $L(\alpha, \beta)\, S_0\, L^{\mathrm{T}}(\alpha, \beta) = S_0 + \begin{pmatrix} 0 & -\beta S_{0,z} & \alpha S_{0,z} & 0 \\ \beta S_{0,z} & 0 & 0 & \beta S_{0,z} \\ -\alpha S_{0,z} & 0 & 0 & -\alpha S_{0,z} \\ 0 & -\beta S_{0,z} & \alpha S_{0,z} & 0 \end{pmatrix}$. (198) Notice that the discrepant spin components represented by the matrix in the second term are guaranteed to be perpendicular to the particle's reference three-momentum $\mathbf{p}_0 = (0, 0, E_0/c)$ by the invariance of the helicity $\sigma \equiv (\mathbf{p}/|\mathbf{p}|) \cdot \mathbf{S}$, as defined in (186). Hence, the only way to ensure that the particle's phase space at the fixed reference energy $E_0 = p_0^t c$ is compact is to institute an equivalence relation in which we declare that two states $(X_0, p_0, S_0)$ and $(X_0, p_0, S')$ that differ solely in their spin components are to be regarded as the same physical state: $(X_0, p_0, S_0) \cong (X_0, p_0, S')$.
(199) This equivalence relation immediately generalizes to arbitrary states as (X, p, S) ∼= (X, p, S′), (200) meaning that the two states have the same spacetime coordinates Xμ and four-momentum pμ.[23] The equivalence relation (200), another important example of a gauge invariance, is a new result, and it naturally extends to the particle's entire phase space by acting on it with orthochronous Poincaré transformations. A space with an equivalence relation is known as a quotient space, and so we see that the phase space of a massless m = 0 particle with nonzero spin s2 6= 0 is a quotient space under the gauge invariance (200). All physical observables must therefore be gauge invariant, as is indeed the case for the particle's spacetime coordinates Xμ, its four-momentum pμ, its helicity σ, and its Pauli-Lubanski pseudovector Wμ = σpμ. By contrast, components of the particle's spin tensor Sμν that are perpendicular to the particle's three-momentum p-such as Syz = Sx and Szx = Sy if p points along the z direction-are not gauge invariant, and are consequently not physical observables. As an aside, we note that in the counterpart quantum theory, spin components that are perpendicular to the particle's direction of motion correspond to linear polarizations that are longitudinal, meaning that they are parallel to the particle's direction of motion. Accordingly, spin components that are parallel to the particle's direction of motion correspond to transverse linear polarizations. So in the quantum version of this story, gaugeinvariant observables are those that are insensitive to the particle's longitudinal linear polarizations. Summarizing our results, we see that a transitive group action of the orthochronous Poincaré group with m = 0 and positive energy E = ptc > 0 describes the phase space of a massless particle with null four-momentum pμ, helicity σ, non-negative spin-squared s2 = σ2 ≥ 0, and a null Pauli-Lubanski pseudovector Wμ = σpμ. 
The particle moves at the speed of light $c$ along a null worldline in spacetime with null four-velocity $\dot{X}^\mu$, and the particle's spin tensor $S^{\mu\nu}$ is uniquely defined only up to gauge transformations $S^{\mu\nu} \mapsto S'^{\mu\nu}$ for which $S'^{\mu\nu}$ differs from $S^{\mu\nu}$ solely by components perpendicular to the particle's three-momentum $\mathbf{p}$.

This gauge invariance has nontrivial implications for interactions that the particle can have with other systems, as any such interactions must be insensitive to quantities that are not gauge invariant. Interaction terms involving the particle's four-momentum $p^\mu$ or Pauli-Lubanski pseudovector $W^\mu = \sigma p^\mu$ would both be permitted, although they become weak at small momentum, corresponding in quantum mechanics to large distances. We therefore anticipate that massless particles with classically large total spin $s \gg \hbar$ cannot mediate long-range interactions, and, indeed, a quantum version of our classification of particle-types suggests that long-range interactions are mediated only by massless particles with total spin less than or equal to $2\hbar$ [24].

C. The Massless Limit

It is an enlightening exercise to re-examine the massless case $m = 0$ from the perspective of the massive case $m > 0$ in the limit $m \to 0$. Along the way, we will provide a deeper explanation for the emergence of gauge invariance, as well as derive a classical version of the Higgs mechanism.

To start, notice that our original choice (151) of reference four-momentum in the massive case, $p_0^\mu \equiv (mc, \mathbf{0})^\mu$, does not have an appropriate massless limit. But our choice of reference four-momentum is entirely arbitrary apart from the condition that $p^2 = -m^2c^2$ from (103), so we can instead choose it to be

\begin{equation}
p^\mu \equiv (p^t, 0, 0, p^z)^\mu = \left(\sqrt{(p^z)^2 + m^2c^2}, 0, 0, p^z\right)^\mu. \tag{201}
\end{equation}

The massless limit $m \to 0$ of this reference four-momentum replicates the reference four-momentum (171) that we chose for the case of a massless particle:

\begin{equation}
\lim_{m \to 0} p^\mu = (E_0/c, 0, 0, E_0/c)^\mu, \quad E_0 \equiv p^z c. \tag{202}
\end{equation}

Moreover, the choice (201) is related to our original reference four-momentum (151), $p_0^\mu = (mc, \mathbf{0})^\mu$, by a simple Lorentz boost $\Lambda$ along the $z$ direction,

\begin{equation}
p^\mu = \Lambda^\mu{}_\nu \, p_0^\nu, \tag{203}
\end{equation}

where

\begin{equation}
\Lambda \equiv \begin{pmatrix} \frac{p^t}{mc} & 0 & 0 & \frac{p^z}{mc} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \frac{p^z}{mc} & 0 & 0 & \frac{p^t}{mc} \end{pmatrix}. \tag{204}
\end{equation}

It follows that the new reference value $S^{\mu\nu}$ of the massive particle's spin tensor is related to its old reference value $S_0^{\mu\nu}$ from (164) according to

\begin{equation}
S^{\mu\nu} \equiv (\Lambda S_0 \Lambda^{\mathrm{T}})^{\mu\nu} = \begin{pmatrix} 0 & \frac{p^z}{mc} S_{0,y} & -\frac{p^z}{mc} S_{0,x} & 0 \\ -\frac{p^z}{mc} S_{0,y} & 0 & S_{0,z} & -\frac{p^t}{mc} S_{0,y} \\ \frac{p^z}{mc} S_{0,x} & -S_{0,z} & 0 & \frac{p^t}{mc} S_{0,x} \\ 0 & \frac{p^t}{mc} S_{0,y} & -\frac{p^t}{mc} S_{0,x} & 0 \end{pmatrix}^{\mu\nu}. \tag{205}
\end{equation}

Both $p^t$ and $p^z$ approach the finite, nonzero value $E_0/c > 0$ in the massless limit $m \to 0$, so the components of $S^{\mu\nu}$ that involve factors of $p^t/mc$ or $p^z/mc$ diverge in that limit. Furthermore, the particle's spin-squared scalar $s^2$ continues to have its invariant value (165), which, despite remaining well-defined in the limit $m \to 0$, does not end up agreeing with the corresponding massless particle's spin-squared scalar (185):

\begin{equation}
s^2 = S_{0,x}^2 + S_{0,y}^2 + S_{0,z}^2 \ \text{(massive)} \ \neq \ S_{0,z}^2 \ \text{(massless)}. \tag{206}
\end{equation}

On the other hand, the new reference value $W^\mu$ of the particle's Pauli-Lubanski pseudovector is related to its old reference value $W_0^\mu \equiv (0, mc\,\mathbf{S}_0)^\mu$ from (166) according to

\begin{equation}
W^\mu = \Lambda^\mu{}_\nu W_0^\nu = (p^z S_{0,z}, \, mc\, S_{0,x}, \, mc\, S_{0,y}, \, p^t S_{0,z})^\mu. \tag{207}
\end{equation}

This expression has a well-defined massless limit that precisely agrees with the reference value (189) of the Pauli-Lubanski pseudovector for a massless particle:

\begin{equation}
\lim_{m \to 0} W^\mu = \left(S_{0,z} \frac{E_0}{c}, 0, 0, S_{0,z} \frac{E_0}{c}\right)^\mu. \tag{208}
\end{equation}

To make contact with the massless case, we can therefore focus our efforts on the spin tensor (205). An important hint is the discrete discrepancy (206) between the spin-squared scalar $s^2$ in the massive and massless cases, signaling that the massive case features spin degrees of freedom that need to be removed before taking the massless limit.
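The contrast between the divergent spin tensor (205) and the well-behaved pseudovector (207) can be made concrete with a short numerical experiment. The sketch below is written in plain Python and assumes units with $c = 1$, the index ordering $(t, x, y, z)$, the rest-frame form $S_0^{ij} = \epsilon^{ijk} S_{0,k}$ with $S_0^{t\nu} = 0$ (as implied by (205) at $p^z = 0$), and the definition $s^2 = \tfrac{1}{2} S_{\mu\nu} S^{\mu\nu}$, which is consistent with (206) but is not restated in this section. All function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def boost_z(m, pz, c=1.0):
    """Boost (204), taking (mc, 0, 0, 0) to (pt, 0, 0, pz) as in (201)."""
    pt = math.sqrt(pz**2 + (m * c)**2)
    g, gb = pt / (m * c), pz / (m * c)
    return [[g, 0, 0, gb], [0, 1, 0, 0], [0, 0, 1, 0], [gb, 0, 0, g]]

def rest_spin_tensor(sx, sy, sz):
    """Assumed rest-frame spin tensor: S^{ij} = eps^{ijk} S_k, S^{t nu} = 0."""
    return [[0, 0, 0, 0],
            [0, 0, sz, -sy],
            [0, -sz, 0, sx],
            [0, sy, -sx, 0]]

def boosted_spin_tensor(m, pz, spin, c=1.0):
    """Lambda S_0 Lambda^T, the left-hand side of (205)."""
    lam = boost_z(m, pz, c)
    return matmul(lam, matmul(rest_spin_tensor(*spin), transpose(lam)))

def spin_squared(s):
    """s^2 = (1/2) S_{mu nu} S^{mu nu} with signature (-, +, +, +)."""
    sign = [-1, 1, 1, 1]
    return 0.5 * sum(sign[i] * sign[j] * s[i][j]**2
                     for i in range(4) for j in range(4))

def boosted_pauli_lubanski(m, pz, spin, c=1.0):
    """Lambda acting on W_0 = (0, m c S_0), as in (207)."""
    lam = boost_z(m, pz, c)
    w0 = [0.0] + [m * c * comp for comp in spin]
    return [sum(lam[i][k] * w0[k] for k in range(4)) for i in range(4)]

if __name__ == "__main__":
    spin, pz = (0.3, 0.4, 0.5), 2.0
    for m in (1.0, 1e-2, 1e-4):
        s = boosted_spin_tensor(m, pz, spin)
        w = boosted_pauli_lubanski(m, pz, spin)
        # S^{yz} grows like pt/(mc) while s^2 and W stay finite.
        print(m, s[2][3], spin_squared(s), w)
```

As $m$ decreases at fixed $p^z$, the printed component $S^{yz}$ grows without bound, $s^2$ stays at its rest-frame value, and the transverse components of $W^\mu$ shrink in proportion to $m$.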
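As we will see, removing these extraneous spin degrees of freedom will require formally enlarging our massive particle's phase space while simultaneously introducing a compensating equivalence relation to ensure that we are not adding any physically new states to the system, in close correspondence with an analogous construction in quantum field theory whose origins go back to the work of Stueckelberg in [25]. We will then be able to isolate and eliminate the extraneous spin degrees of freedom, and we will end up finding that the equivalence relation will become the gauge invariance (200) in the massless limit.

We begin by redefining the $x$ and $y$ components of the reference value $\mathbf{S} = (S_x, S_y, S_z)$ of the massive particle's spin three-vector according to

\begin{equation}
\begin{pmatrix} S_x \\ S_y \end{pmatrix} \mapsto \frac{mc}{p^t} \begin{pmatrix} S_x + p^t \varphi_x \\ S_y + p^t \varphi_y \end{pmatrix} = \frac{mc}{p^t} \begin{pmatrix} S_x \\ S_y \end{pmatrix} + mc \begin{pmatrix} \varphi_x \\ \varphi_y \end{pmatrix}, \tag{209}
\end{equation}

where $\varphi_x(\lambda)$ and $\varphi_y(\lambda)$ are arbitrary new functions on the particle's worldline. The particle's spin tensor (205) is then

\begin{equation}
S^{\mu\nu} = \begin{pmatrix} 0 & \frac{p^z}{p^t} S_{0,y} & -\frac{p^z}{p^t} S_{0,x} & 0 \\ -\frac{p^z}{p^t} S_{0,y} & 0 & S_{0,z} & -S_{0,y} \\ \frac{p^z}{p^t} S_{0,x} & -S_{0,z} & 0 & S_{0,x} \\ 0 & S_{0,y} & -S_{0,x} & 0 \end{pmatrix}^{\mu\nu} + \begin{pmatrix} 0 & p^z \varphi_y & -p^z \varphi_x & 0 \\ -p^z \varphi_y & 0 & 0 & -p^t \varphi_y \\ p^z \varphi_x & 0 & 0 & p^t \varphi_x \\ 0 & p^t \varphi_y & -p^t \varphi_x & 0 \end{pmatrix}^{\mu\nu}, \tag{210}
\end{equation}

where the various factors of $m$, $c$, $p^t$, and $p^z$ have been chosen in the redefinition (209) to ensure that the two tensors appearing in (210) separately satisfy the fundamental condition $p_\mu (\cdots)^{\mu\nu} = 0$ from (143). The particle's spin-squared scalar $s^2$ now becomes

\begin{equation}
s^2 = \left(1 - \left(\frac{p^z}{p^t}\right)^2\right) \left( (S_{0,x} + p^t \varphi_x)^2 + (S_{0,y} + p^t \varphi_y)^2 \right) + S_{0,z}^2. \tag{211}
\end{equation}

Notice that the particle's spin tensor (210) is invariant under the simultaneous transformations

\begin{align}
\begin{pmatrix} S_x \\ S_y \end{pmatrix} &\mapsto \begin{pmatrix} S_x \\ S_y \end{pmatrix} - p^t \begin{pmatrix} f_x \\ f_y \end{pmatrix}, \tag{212} \\
\begin{pmatrix} \varphi_x \\ \varphi_y \end{pmatrix} &\mapsto \begin{pmatrix} \varphi_x \\ \varphi_y \end{pmatrix} + \begin{pmatrix} f_x \\ f_y \end{pmatrix}, \tag{213}
\end{align}

where $f_x(\lambda)$, $f_y(\lambda)$ are arbitrary functions on the particle's worldline.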
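The three claims just made about the redefined spin tensor (its invariance under (212) and (213), the separate transversality of its two pieces, and the value (211) of $s^2$) lend themselves to a quick numerical check. The following is a minimal sketch in plain Python, assuming the index ordering $(t, x, y, z)$, lowered components $p_\mu = (-p^t, 0, 0, p^z)$, and the definition $s^2 = \tfrac{1}{2} S_{\mu\nu} S^{\mu\nu}$, which matches (211) but is not restated in this section. The function names are illustrative, and the components $S_{0,x}$, $S_{0,y}$ in (210) are identified with $S_x$, $S_y$ in (212).

```python
def spin_parts(pt, pz, spin, phi):
    """Return the two tensors appearing in (210) as separate 4x4 lists."""
    sx, sy, sz = spin
    fx, fy = phi
    r = pz / pt
    first = [[0, r * sy, -r * sx, 0],
             [-r * sy, 0, sz, -sy],
             [r * sx, -sz, 0, sx],
             [0, sy, -sx, 0]]
    second = [[0, pz * fy, -pz * fx, 0],
              [-pz * fy, 0, 0, -pt * fy],
              [pz * fx, 0, 0, pt * fx],
              [0, pt * fy, -pt * fx, 0]]
    return first, second

def spin_tensor(pt, pz, spin, phi):
    """Full spin tensor (210): the sum of the two parts."""
    first, second = spin_parts(pt, pz, spin, phi)
    return [[first[i][j] + second[i][j] for j in range(4)] for i in range(4)]

def shift(pt, spin, phi, f):
    """Simultaneous transformations (212) and (213)."""
    sx, sy, sz = spin
    new_spin = (sx - pt * f[0], sy - pt * f[1], sz)
    new_phi = (phi[0] + f[0], phi[1] + f[1])
    return new_spin, new_phi

def contract_with_p(pt, pz, s):
    """p_mu S^{mu nu} with lowered components p_mu = (-pt, 0, 0, pz)."""
    p_lower = [-pt, 0.0, 0.0, pz]
    return [sum(p_lower[i] * s[i][j] for i in range(4)) for j in range(4)]

def spin_squared(s):
    """s^2 = (1/2) S_{mu nu} S^{mu nu} with signature (-, +, +, +)."""
    sign = [-1, 1, 1, 1]
    return 0.5 * sum(sign[i] * sign[j] * s[i][j]**2
                     for i in range(4) for j in range(4))

def spin_squared_formula(pt, pz, spin, phi):
    """Right-hand side of (211)."""
    sx, sy, sz = spin
    transverse = (sx + pt * phi[0])**2 + (sy + pt * phi[1])**2
    return (1 - (pz / pt)**2) * transverse + sz**2

if __name__ == "__main__":
    pt, pz = 3.0, 2.0
    spin, phi, f = (0.3, 0.4, 0.5), (0.1, -0.2), (0.7, -1.3)
    before = spin_tensor(pt, pz, spin, phi)
    after = spin_tensor(pt, pz, *shift(pt, spin, phi, f))
    print(max(abs(before[i][j] - after[i][j])
              for i in range(4) for j in range(4)))
    print(contract_with_p(pt, pz, before))
    print(spin_squared(before), spin_squared_formula(pt, pz, spin, phi))
```

Note that the transversality check holds for any values of $p^t$ and $p^z$, not only those on the mass shell, because each row of the contraction cancels term by term.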
We claim that our massive particle's original phase space, with states denoted by $(X, p, S)$, is equivalent to a formally enlarged phase space consisting of states $(X, p, S, \varphi)$ under the equivalence relation $(X, p, S, \varphi) \cong (X, p, S - p^t f, \varphi + f)$, suitably generalized from the reference state $(X, p, S, \varphi)$ to general states $(X, p, S, \varphi)$ of the system. To see why, observe that the specific choice

\begin{equation}
\begin{pmatrix} f_x \\ f_y \end{pmatrix} \equiv -\begin{pmatrix} \varphi_x \\ \varphi_y \end{pmatrix} \tag{214}
\end{equation}

makes clear that the state $(X, p, S, \varphi)$ is equivalent to the state $(X, p, S + p^t \varphi, 0)$, which gives us back the state $(X, p, S)$ after undoing the redefinition (209) of $S^{\mu\nu}$.

The system's redefined spin tensor (210) now has a nice massless limit,

\begin{equation}
\lim_{m \to 0} S^{\mu\nu} = \begin{pmatrix} 0 & S_{0,y} & -S_{0,x} & 0 \\ -S_{0,y} & 0 & S_{0,z} & -S_{0,y} \\ S_{0,x} & -S_{0,z} & 0 & S_{0,x} \\ 0 & S_{0,y} & -S_{0,x} & 0 \end{pmatrix}^{\mu\nu} + \frac{E}{c} \begin{pmatrix} 0 & \varphi_y & -\varphi_x & 0 \\ -\varphi_y & 0 & 0 & -\varphi_y \\ \varphi_x & 0 & 0 & \varphi_x \\ 0 & \varphi_y & -\varphi_x & 0 \end{pmatrix}^{\mu\nu}, \tag{215}
\end{equation}

as does the particle's spin-squared scalar (211),

\begin{equation}
\lim_{m \to 0} s^2 = S_{0,z}^2. \tag{216}
\end{equation}

Our system fundamentally has the same number of degrees of freedom as it had before we took the massless limit, but we see that the degrees of freedom describing spin components perpendicular to the particle's reference three-momentum $\mathbf{p}$ no longer contribute to the particle's spin-squared scalar $s^2$, which agrees with the spin-squared scalar (185) of the massless case. If we now remove the spin degrees of freedom $\varphi_x$, $\varphi_y$ by setting them equal to zero, then the particle's spin tensor (215) reduces to the reference value of the massless spin tensor (182), and our equivalence relation (212) reduces to the gauge invariance (200).

Notice that if we run all the arguments of this section in reverse, then we can convert a massless particle with spin into a massive particle by introducing additional spin degrees of freedom. We therefore obtain a classical version of the celebrated Higgs mechanism.

D. Tachyons

The case $m^2 < 0$ is also interesting. The invariant quantity $m$ is now purely imaginary and is therefore of the form $m = i\mu$ for a real constant $\mu$.
The system's four-momentum $p^\mu$ is spacelike, $p^2 = \mu^2c^2 > 0$, so its temporal component $p^t$ does not have a definite sign under orthochronous Lorentz transformations and we cannot impose a positivity condition on the system's energy. We can use $p^2 = -(E/c)^2 + \mathbf{p}^2 = \mu^2c^2$ to express the system's energy $E = p^t c$ in terms of its three-dimensional momentum $\mathbf{p}$ as the mass-shell relation

\begin{equation}
E = \sqrt{\mathbf{p}^2c^2 - \mu^2c^4}. \tag{217}
\end{equation}

For convenience, we will take the system's reference four-momentum to be

\begin{equation}
p_0^\mu \equiv (0, 0, 0, \mu c)^\mu = \mu c \, \delta^\mu_z. \tag{218}
\end{equation}

Once again, the four-momentum $p^\mu$ and the four-velocity $\dot{X}^\mu$ are non-vanishing, and so the relation (149), $m \sqrt{-\dot{X}^2/c^2} \, p^\mu = \mp m^2 \dot{X}^\mu$, becomes

\begin{equation}
\sqrt{-\dot{X}^2/c^2} \, p^\mu = \mp i \mu \dot{X}^\mu. \tag{219}
\end{equation}

Because the right-hand side is imaginary, this equality implies that $\dot{X}^2 > 0$, so the four-velocity $\dot{X}^\mu$ is likewise spacelike and is related to the four-momentum $p^\mu$ by

\begin{equation}
p^\mu = \mu \frac{\dot{X}^\mu}{\sqrt{\dot{X}^2/c^2}}, \tag{220}
\end{equation}

where we have taken the positive sign by assuming that our parametrization $X^\mu(\lambda)$ points in the positive direction along $p^\mu$. This relation between $p^\mu$ and $\dot{X}^\mu$ again ensures that the self-consistency condition (144), $\dot{X}^\mu W_\mu = 0$, is satisfied.

The equation of motion (133) for the system's four-momentum, $\dot{p}^\mu = 0$, then tells us that the system's path has a fixed, spacelike direction in spacetime, and a calculation of the system's three-dimensional velocity $\mathbf{v}$ using the mass-shell relation (217) yields the result

\begin{equation}
\mathbf{v} = \frac{d\mathbf{X}}{dt} = \frac{\dot{\mathbf{X}}}{\dot{T}} = \frac{\mathbf{p}c^2}{E} = \frac{\mathbf{p}}{|\mathbf{p}|} \frac{c}{\sqrt{1 - \mu^2c^2/\mathbf{p}^2}}. \tag{221}
\end{equation}

Hence, the system's speed $|\mathbf{v}|$ is always greater than the speed of light $c$:

\begin{equation}
|\mathbf{v}| > c. \tag{222}
\end{equation}

Such a system is appropriately called a tachyon, from the Greek for "swift."

By the same reasoning as in the massive and massless cases, a tachyon's orbital and spin angular momenta are separately conserved,

\begin{align}
\dot{L}^{\mu\nu} &= 0, \tag{223} \\
\dot{S}^{\mu\nu} &= 0. \tag{224}
\end{align}

The condition (142), $p_{0,\mu} S_0^{\mu\nu} = 0$, now gives

\begin{equation}
\mu c \, S_0^{z\nu} = 0, \tag{225}
\end{equation}

so the reference value of the system's spin tensor is

\begin{equation}
S_0^{\mu\nu} = \begin{pmatrix} 0 & S_{0,x} & S_{0,y} & 0 \\ -S_{0,x} & 0 & S_{0,z} & 0 \\ -S_{0,y} & -S_{0,z} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}^{\mu\nu}. \tag{226}
\end{equation}

The system's spin-squared scalar (106) and spin-squared pseudoscalar (107) have respective values

\begin{align}
s^2 &= S_{0,z}^2 - S_{0,x}^2 - S_{0,y}^2, \tag{227} \\
\tilde{s}^2 &= 0, \tag{228}
\end{align}

and the reference value of the system's Pauli-Lubanski pseudovector (98) is

\begin{equation}
W_0^\mu = \mu c \, (S_{0,z}, S_{0,y}, -S_{0,x}, 0)^\mu. \tag{229}
\end{equation}

The little group of orthochronous Poincaré transformations that preserve the value of the reference four-momentum (218), $p_0^\mu \equiv (0, 0, 0, \mu c)^\mu$, and that therefore describes the set of all states sharing that same four-momentum, includes rotations around the $z$ axis as well as Lorentz boosts along the $x$ and $y$ directions. If the system is to have a compact set of states at any fixed four-momentum, then its spin tensor (226) and Pauli-Lubanski pseudovector (229) must be invariant under these noncompact Lorentz transformations. However, we see right away that $W_0^\mu$ transforms nontrivially under Lorentz boosts along the $x$ or $y$ directions if any of its components are nonzero, so our system's phase space at fixed four-momentum can be compact only if all the components of $W_0^\mu$ vanish:

\begin{equation}
\left. \begin{aligned} S_{0,z} &= 0, \\ S_{0,x} &= 0, \\ S_{0,y} &= 0. \end{aligned} \right\} \tag{230}
\end{equation}

The tachyon's spin tensor and Pauli-Lubanski pseudovector therefore vanish identically,

\begin{equation}
\left. \begin{aligned} S^{\mu\nu} &= 0, \\ W^\mu &= 0, \end{aligned} \right\} \tag{231}
\end{equation}

so

\begin{equation}
\left. \begin{aligned} s^2 &= 0, \\ W^2 \equiv w^2 &= 0, \end{aligned} \right\} \tag{232}
\end{equation}

and we see that a tachyon cannot have any intrinsic spin at all.

E. The Vacuum

Finally, we consider the case in which $p_0^\mu = 0$, in which case the system's four-momentum vanishes for all the system's possible states:

\begin{equation}
p^\mu = 0. \tag{233}
\end{equation}

The system then has no energy or momentum. The kinetic term $p_\mu \dot{X}^\mu$ in the system's action functional (130) vanishes, and we do not get a meaningful equation describing the behavior of $X^\mu(\lambda)$. The system's orbital angular momentum vanishes,

\begin{equation}
L^{\mu\nu} = 0, \tag{234}
\end{equation}

and its spin angular momentum is conserved,

\begin{equation}
\dot{S}^{\mu\nu} = 0. \tag{235}
\end{equation}

The little group of Poincaré transformations that leave $p_0^\mu = 0$ invariant consists of all Poincaré transformations, and so the only way to obtain a compact phase space at fixed four-momentum is for the spin tensor to vanish for all the system's states:

\begin{equation}
S^{\mu\nu} = 0. \tag{236}
\end{equation}

We conclude that our system is entirely devoid of energy, momentum, and angular momentum, and therefore describes an empty vacuum.

VI. CONCLUSION

In this paper, we reviewed a general method for making the standard Lagrangian formulation manifestly covariant. We employed this framework to develop a classical counterpart of Wigner's classification of quantum particle-types in terms of the structure of the orthochronous Poincaré group. We also showed that classical massless particles with spin exhibit a novel manifestation of gauge invariance, and used the massless limit to derive a classical version of the Higgs mechanism.

An interesting way to extend our approach is to consider phase spaces that provide transitive group actions of the full Poincaré group, including time-reversal transformations (55). This generalization does not affect our analysis of tachyons or of the vacuum, which do not feature a definite sign for $p^t$. But in the case of a system with non-negative mass, $m \geq 0$, enlarging the system's phase space so that it provides a transitive action of the full Poincaré group means doubling the phase space to include "negative-energy" states with $p^t < 0$.

Because the four-momentum $p^\mu$ is timelike or null when $m \geq 0$, we know from (56) that the sign of $p^t$ is invariant under all physically realizable Lorentz transformations, which are smoothly connected with the identity transformation and therefore do not include time-reversal transformations. Hence, a system with $m \geq 0$ cannot evolve from states with $p^t > 0$ to states with $p^t < 0$ or vice versa.
We are therefore free to define the physical energy of the additional $p^t < 0$ states to be $E \equiv -p^t c > 0$, and regard them as states not of our original particle, but of its corresponding antiparticle. In this way, we can classically unify particles with their antiparticles.

ACKNOWLEDGMENTS

J. A. B. has benefited tremendously from personal communications with Howard Georgi, Andrew Strominger, David Griffiths, David Kagan, David Morin, Logan McCarty, Monica Pate, and Alex Lupsasca.

[1] A. P. Balachandran, G. Marmo, B.-S. Skagerstam, and A. Stern, Gauge Symmetries and Fibre Bundles: Applications to Particle Dynamics, 1st ed. (Springer-Verlag Berlin Heidelberg, 1983), arXiv:1702.08910 [quant-ph].
[2] J.-M. Souriau, Structure of Dynamical Systems, 1st ed. (Birkhäuser, 1997).
[3] M. Rivas, Kinematical Theory of Spinning Particles (Springer, 2002).
[4] E. P. Wigner, Annals of Mathematics 40, 149 (1939).
[5] For a more extensive introduction, see [26].
[6] For an early example of this formalism, see [27]. For more modern reviews, see [2, 28].
[7] For a more extensive introduction, see the opening chapters of [29].
[8] For a comprehensive presentation of the group theory underlying special relativity, see [30].
[9] As a mathematical aside, the Poincaré group is formally denoted by the semi-direct product $\mathbb{R}^{1,3} \rtimes O(1,3)$, which generalizes the notion of a direct product $G = H_1 \times H_2$ to the case in which the second factor $H_2$ is not necessarily a normal subgroup of the overall group $G$.
[10] For alternative classical approaches to this classification problem, see [1–3].
[11] The minus sign in this definition is a reflection of our metric sign conventions.
[12] Quantities that have fixed values in a transitive group action or an irreducible representation of a given transformation group are formally called Casimir invariants.
[13] For a more extensive review of the mathematical details ahead, see [30].
[14] A. J. Hanson and T. Regge, Annals of Physics 87, 498 (1974).
[15] B.-S. Skagerstam and A. Stern, Physica Scripta 24, 493 (1981).
[16] A. Frydryszak, "Lagrangian Models of Particles with Spin: The First Seventy Years," in From Field Theory to Quantum Groups (World Scientific, 1996) pp. 151–172, arXiv:hep-th/9601020 [hep-th].
[17] This condition, which is also introduced in [15], is closely related to the momentum-space version of the Lorenz equation $\partial_\mu A^\mu = 0$ that appears both in the Proca theory of a massive spin-one bosonic field and as the condition for Lorenz gauge in electromagnetism. Like the Lorenz equation in those field theories, we will eventually see that the condition (143) ends up eliminating unphysical spin states.
[18] More generally, for a system of multiple particles labeled by $\alpha = 1, 2, \ldots$, the spatial components $L^{ti}$ generalize to the system's center-of-mass-energy $X^i_{\mathrm{CM}} = \sum_\alpha E_\alpha X^i_{\mathrm{initial},\alpha} / E_{\mathrm{total}}$, and so their conservation implies the constancy of $\mathbf{X}_{\mathrm{CM}}$.
[19] For related arguments, see [30, 31].
[20] See, for example, [30].
[21] L. F. Abbott, Physical Review D 13, 2291 (1976).
[22] For an optimistic alternative perspective, see [32].
[23] Note that if we permit parity transformations, which map $\sigma \mapsto -\sigma$, then we must require that the equivalence relation (200) hold only for states that share the same helicity $\sigma$.
[24] Again, for an alternative point of view, see [32].
[25] E. C. G. Stueckelberg, Helvetica Physica Acta 11, 225 (1938).
[26] H. Goldstein, J. L. Safko, and C. P. Poole Jr., Classical Mechanics, 3rd ed. (Pearson, 2001).
[27] P. A. M. Dirac, Proceedings of the Royal Society A 111, 405 (1926).
[28] A. Deriglazov and B. Rizzuti, American Journal of Physics 79, 882 (2011), arXiv:1105.0313 [math-ph].
[29] B. Schutz, A First Course in General Relativity, 2nd ed. (Cambridge University Press, 2009).
[30] S. Weinberg, The Quantum Theory of Fields, 1st ed., Vol. 1 (Cambridge University Press, 1996).
[31] E. P. Wigner, "Invariant Quantum Mechanical Equations of Motion," in Theoretical Physics (International Atomic Energy Agency, Vienna, Austria, 1963) pp. 161–184.
[32] P. Schuster and N. Toro, Journal of High Energy Physics 2013 (2013), 10.1007/JHEP09(2013)104, arXiv:1302.1198 [hep-th].