1. Quaternion Algebra
Quaternions are introduced as an extension of complex numbers where, instead of one imaginary unit
, there are three of them, denoted
and
; we relate them to
,
and
directions of a three-dimensional space respectively. The square of each of these units equals to −1, while any two of them anti-commute, e.g.
etc.; furthermore
, the small circle indicating quaternion multiplication (an associative operation) [1]. A quaternion is thus a quantity with four components, namely
(1)
where
is its scalar part and
its vector part (we use the blackboard type for full quaternions and boldface for vectors). Adding two quaternions is a component-wise operation, while multiplication is carried out (consistently with the previous rules) by
(2)
where
and
denote the dot and vector product, respectively; quaternion multiplication is associative (as stated already) but not commutative. The proof of both statements is routine; it is based on the following two identities
(3)
Quaternion conjugation is defined (and denoted) by
(4)
implying that
yields the length of vector
Note that
(5)
and that
yields the sum of squares of
’s four components (we may then drop the
symbol and use
or
to emphasize that the result is a scalar and the order irrelevant). Using the conjugate, we then build the inverse of
by
(6)
so that
.
Finally
, where
is a positive integer, implies
multiplied by itself
times (calling the result the
power of
); this means that we can now define a function of a quaternion, based on its Maclaurin expansion. The most useful of these is
(7)
where
is the unit direction of
; note that powers of
follow the
cycle, due to
.
Rotation
Fixed rotation: Let
be a quaternion of unit magnitude, i.e.
where
(equivalent to
). This further implies that
can be uniquely written as
(8)
where
is a vector whose direction is
and whose magnitude meets
and
. When
represents a point in a three-dimensional (3D) space (using an inertial frame for its coordinates), it is easy to show that
(9)
where
represents
rotated by angle
around the axis whose direction is
. Here
is the component of
parallel to
, while
is perpendicular to it. Note that
commutes with
(and therefore with
) while
anti-commutes (implying that
).
When
is a parametric description of a 3D object (such as an ellipse), the same operation similarly rotates the whole object, without changing its size or shape.
Uniform rotation: Changing
to
(10)
(where
is interpreted as time) makes the rotation time dependent;
then becomes the corresponding (constant) angular velocity.
Time-dependent rotation: Assuming that all components of
are now arbitrary functions of
(yet preserving the
property), we now find the corresponding instantaneous angular velocity at time
; it follows from
(11)
since
(showing that
is a vector) and since
; this implies that
is the resulting velocity.
Kepler frame: Letting the original coordinate frame itself rotate with
, and expressing the vector of the instantaneous angular velocity of the
rotation in this new (no longer inertial, but very convenient) Kepler frame, we get
(12)
Euler angles: The usual way of parametrizing
is done [2] by
(13)
i.e. first rotating with respect to the
axis by angle
, then rotating with respect to the (original)
axis by angle
, and finally rotating with respect to the
axis again by angle
; the three angles are referred to as Euler angles [3].
Using this form of
and letting each of the three angles be a function of
, the angular velocity (12), i.e. its Kepler-frame representation, becomes
(14)
We can then express each of the three time derivatives in terms of components of
by solving the corresponding three equations, thus getting
(15)
2. Kepler Problem
To find the motion of a satellite (of negligible mass) orbiting a primary of gravitational mass
requires solving the following differential equation
(16)
where
is a vector with three components (
,
and
), each a function of time
,
is the length of
, the double dot implies taking a second derivative of each of the
components with respect to
, and
is a perturbing force per unit mass of the satellite, assumed to be relatively small (i.e.
) [3]; the factor
then appears in all quantities proportional to
; higher powers of ε are thus used to indicate the degree of smallness. When building a solution to (16), it will be important to distinguish between terms of the
type (solving the equation with no perturbing force), those proportional to
, and finally terms proportional to second or higher power of
(ultimately to be ignored, as explained shortly).
The article considers only autonomous perturbing forces, meaning that
is a function of
but not explicitly of time.
To utilize quaternion algebra for solving (16), we introduce a new dependent variable
(a quaternion) and a new independent variable
(a scalar) called modified time [4] and [5]. These are related to the old variables by
(17a)
(17b)
where
is assumed (by subsequent proof) to be an arbitrary scalar function of
, and
. We also define the following scalar function of
(18)
where
indicates differentiating with respect to
. Note that Γ is thus twice the scalar part of
(as an alternate definition).
One can then show that, using the new variables, (16) reads
(19)
Proof:
(20)
due to (17b). Post-multiplying by
results in
(21)
Differentiating each side with respect to
yields
(22)
which can be further simplified (note that
) to get
(23)
which agrees with (19).
Note that post-multiplying
by
, where
is any scalar function of
, does not change the resulting
and
; this implies that
must solve (19) whenever
does (as can be explicitly verified). Nevertheless, the new solution does change the value of Γ to
(24)
This implies that we can always find a solution which meets
at all values of
by a proper choice of
(referred to as gauge). Having done that, (19) is thus reduced to
(25)
while meeting
. Since (25) and the last condition imply that
is also identically equal to 0 (just post-multiply each term of (25) by
to show that
has zero scalar part), it is then sufficient to make Γ equal to 0 initially; solving (25) then assures that Γ remains constant, thus maintaining the
condition automatically at any future time
.
Unperturbed solution: When
, we choose
to be a positive constant and, assuming the solution to start with
, the equation to solve is then
(26)
where
(27)
is a constant of motion, proportional to the total (kinetic and gravitational) energy of the satellite [3].
Proof: Differentiating the LHS of (27) yields
(28)
due to (26) and confirming that is identically zero. Furthermore
(29)
verifies the rest.
Assuming that
is negative (to make the satellite orbit the primary), we may now choose
so that
. The equation to solve is then
(30)
(
denoting a zero quaternion) while
(31)
Since there is no normal (perpendicular to the plane defined by the initial
and
directions) force, it is obvious that the motion, and the corresponding solution to (16), must be planar. It is thus sufficient to start with a solution that keeps
in the x-y plane; this solution then can be transformed (by applying an
fixed rotation to it) to any other specific attitude (the transformation preserves the
condition and the value of
).
To construct the most general solution to (30) [6], we use the following notation
(32)
and
(33)
where
is a constant. From now on, quantities of this type (i.e. having only scalar and
components) are called complex (with
representing the purely-imaginary unit) and can be multiplied using rules of complex algebra (no need for the
operation, unless one of the factors is a quaternion); their complex nature is emphasized by using a different font.
The most general form of
that keeps (17a) in the
-
plane and meets (30) is then
(34)
with
and
being real constants; this implies that
(35)
(36)
and
(37)
further implying, due to (31), that
(38)
It is easy to see that then has only
and
components, implying that
and confirming our original assumption.
The fully general solution is then
(39)
where
is an arbitrary rotation. This implies that
(40)
We already know that
; introducing
, the same solution can be written as
(41)
which is easily recognized as an equation of an ellipse with semi-major axis equal to
and eccentricity equal to
. The remaining parameters (called orbital elements) of the general solution are the three Euler angles of the ellipse’s attitude and
, the value of
at apocenter (
with the largest magnitude).
Finally, since
(42)
we get, based on (17b),
(43)
relating real time
to
; the latter being a modified version (after subtracting
) of so called eccentric anomaly.
3. Perturbed Equation
When
is nonzero, finding a solution becomes substantially more challenging; we can achieve it only to
accuracy (when terms proportional to higher powers of
are ignored, as done from now on). Nevertheless, once such first-order solution is found, we can use the same formulas to build a
-accurate (and higher) solution, by iterating.
To find
-accurate solution [6], we must abandon the
gauge and return to solving (19) directly; to make the solution unique, a different gauge will offer itself in the process. We must also allow all six orbital elements of the unperturbed solution
(44)
where
(a more convenient parametrization) to be slowly varying (their derivatives proportional to
) functions of
(this implies that their second derivatives are
-proportional and therefore ignored). The 0 subscript of
indicates that (i) the corresponding quantity is in Kepler’s frame and (ii) it is evaluated to
(i.e. unperturbed-solution) accuracy. Note that now
(45)
(recall that
multiplies all
-small quantities, while
-small and higher order terms are discarded).
Finally,
itself needs to be extended to (and replaced by)
(46)
where
is a Laurent-series function of
[7], with missing
and
terms, i.e.
(47)
where the
coefficients are (yet to be solved for) complex quantities.
Similarly,
is another such complex function which must furthermore meet the condition (our new gauge), and is missing
(consequently
) and
terms. The latter one is actually needed to complete our solution, but is kept separate from
(we have denoted it
, where
is real). Note that
,
and
are thus complex quantities as well. The solution we hope to construct should then results in unique formulas for
,
,
and for the
derivatives of six orbital elements; we now proceed to find these by substituting the proposed solution (namely
) into (19), and matching its two sides.
Once done, (17a) would then convert the solution to
(as a function of
), while (17b) then relates
to real time
; this is a routine final step of the procedure.
3.1. Further Simplification
We start by simplifying the solution’s first two
derivatives, getting
(48)
and
(49)
We then simplify the resulting , which turns out to be nonzero and
-small. More specifically
(50)
where , and a bar now implies complex conjugation. Note that contributions from the
part of the solution have cancelled out, the individual terms are complex, but the final Γ is real.
Proof:
(51)
Adding the corresponding complex conjugate removes the first three terms, as contributes the unperturbed Γ (which equals to 0). and conjugating the next two term reverses their sign. The last two terms (plus their conjugates) then yield the first four terms of (50). Similarly
(52)
Adding its conjugate, namely
(53)
then verifies the remaining terms of (50).
We are now ready to substitute
into (19); matching the equation’s two sides then yields formulas for the
-derivatives of orbital elements and for the coefficients of
and
. The resulting set of first-order differential equations for orbital elements (still truly autonomous, implying no need for averaging, and unaffected by one-orbit oscillations) can then be integrated, yielding invaluable insights into their long-range behaviour. On the other hand,
and
and
components of the solution provide parallel and perpendicular (to Kepler frame) distortions, respectively, of a single orbit. Since no approximation is used to build an
- accurate solution, the same procedure can be used (iteratively) to construct
-accurate (and higher-order) solutions, thus achieving arbitrary accuracy. The main advantage over similar techniques such as Lie-Deprit transformation [8] and Hamiltonian canonical perturbation theory [9] is in relative simplicity of our (yet to be derived) formulas. Admittedly, this derivation is rather involved (as seen shortly), but once found, the same formulas can be routinely applied to any autonomous perturbation and carried out to any order of
accuracy. The only mathematical tool needed is already mentioned Laurent-series expansion of complex functions.
To find explicit formulas for all unknowns of the proposed solution, we first pre-multiply each term of (19) by
; this means that all quantities are transformed into the orbit’s Kepler frame, correspondingly simplifying the
solution and its two
derivatives (48) and (49), while the RHS becomes
(54)
where
and
are evaluated using the unperturbed solution
. We now proceed to de-couple the resulting equation by converting it into two complex equations (easier to deal with, as multiplication becomes commutative).
3.2. Building a Solution
To derive first of these equations, we start with the LHS of the Kepler-frame version of (19), namely (expressed to the
accuracy, and utilizing the fact that Γ,
and
are
small)
(55)
Post-multiplying by and keeping complex components of each term only, we get (to the same
accuracy)
(56)
(where
), while the RHS becomes
(57)
(the cx subscript implies keeping only scalar and
components), and
is correspondingly defined Laurent series. The Mathematica program of Figure 1 completes the task of converting the LHS (short of the
factor) into a function of
and of
derivatives of four orbital elements (highlighted); note that
terms have cancelled out, as expected. The code is reasonably self-explanatory, even to people not familiar with Mathematica.
Explicit formulas for the corresponding unknowns of the perturbed solution are then found (see Figure 2) by matching coefficients of all powers of
in (57). The program first solves for the
coefficient of the
expansion while ignoring the
,
,
,
terms (contributing to
,
and
powers only). This requires simultaneously solving for
,
and their complex conjugates, while considering them to be algebraically independent (they must and do turn out to be consistent with each other). To build the first of these four equations, we collect (the first line of code) coefficients of
which
contributes to the LHS of Figure 1, e.g. contributes
, contributes
, etc., and subtract the coefficient of
after expanding the RHS of (57); the second equation is just the
counterpart of the first one, further including complex conjugates of both. The four equations are then expanded in powers of
and solved in an iterative manner; a careful choice of the expression multiplying
on the RHS of (57) has made the iterative solution terminate upon reaching
-proportional terms (see the first part of the program, ending with a highlighted
solution). Note that the formula does not solve for
and
, implying that
(58)
Figure 1. (56) further simplified.
Figure 2. Solving (56)=(57).
This makes
match coefficients of all powers of
, with the exception of
,
and
; these are then used to solve for
,
,
and
; this is further simplified by solving, separately (yet another helpful de-coupling) the real (yielding
and
) and purely imaginary (yielding
and
) parts of the corresponding three equations. Note that in both cases the equations are linearly dependent, thus allowing for a unique solution of each pair of unknowns (highlighted at the end of Figure 3).
To find
,
,
and
, we now keep only the
and
terms of (55), thus getting
Post-multiplying by results in
Figure 3. LHS of (59).
(59)
where (note that each term of the equation has this property). We then proceed to simplify the LHS (short of a
factor) of (59) by Mathematica program of Figure 4.
And then to find
(now much easier, as only one equation needs to be solved), followed by deriving formulas for
,
and
(by matching coefficients of
and
); this is done in Figure 5.
We now have a complete set of formulas needed to find (based on the perturbing force) all ingredients of our solution. Using this (
-accurate) solution, we can then evaluate both sides of (19) to the
accuracy and use the same formulas to construct
-accurate solution, and so on. Post-multiplying the final solution by
, where
is a properly chosen constant, we can then impose the
gauge, if desired.
3.3. Oblateness Example
To illustrate how to apply these formulas, we consider the perturbing force experienced by an artificial satellites due to Earth’s oblateness, given by (using Kepler’s frame)
Figure 4. Solving (59).
Figure 5. Orbital-element derivatives under oblateness perturbations (Part 1).
(60)
where
is the second zonal-harmonic coefficient,
is the Earth’s equatorial radius, and
(equal to
, by our choice of coordinates) is the unit direction of the Earth’s axis. To convert
to Kepler’s frame, we post-multiply it by
of (13) and pre-multiply by
, thus getting
(61)
The key quantities needed to build the corresponding perturbed solution are
(62)
and
(63)
Converting these into derivatives of the orbital elements is done by Mathematical programs of Figure 5 and Figure 6; while both
and
turn out to be equal to zero, further application of (15) results in the following classical formulas
(64)
Figure 6. Orbital-element derivatives under oblateness perturbation (Part 2).
which translate into a slow, uniform precession of the orbital plane around the Earth’s axis (with angular speed of
) and a similar rotation of the orbit’s perigee within the orbital plane with angular speed of
, while keeping the inclination angle fixed. Note that the perigee rotation ceases when
, called the critical angle; the
-accurate solutions reveals that, instead of fully stopping, the rotation turns into a libration around the critical angle [10] and [11], but pursuing this goes beyond the scope of this article (consult the two references for further details). The first of these also derives expressions for resulting distortions of the Kepler’s frame
; we do not quote them here due to their complexity.
4. Conclusions
In this final section we want to mention a few potential extensions of the new technique (excluded from our basic presentation). Firstly, the technique is easily capable of dealing with more than one autonomous force at a time and (at the same time) reaching an arbitrarily accurate solution by iterating; this simply requires attaching a different small parameter (
) to each individual perturbing force and applying existing formulas to individual terms of the corresponding expansion. A more difficult task (even in a single-force situation) is to establish the largest value of
which forces the iteration process to converge; this remains an open question. But luckily, in typical applications, perturbing forces are small enough to guarantee fast convergence.
The most obvious modification of our formulas is clearly needed when the perturbing force is no longer autonomous; this has been done in [12]. The solution can no longer avoid introducing fast (i.e. one orbit) oscillations into the resulting differential equations for orbital elements, thus making it more difficult to maintain high accuracy when exploring their long-term (millions of years) behavior. Nevertheless, in these cases, getting the correct qualitative picture is often the main goal, amply met by the extended formulas.
When all parameters of the perturbing force are numerical, it becomes beneficial (when constructing higher-order solution) to modify the iteration process not just to find the next-order solution, but also to update the existing one (till no changes are observed); this has been demonstrated in [13] and offers a way of properly dealing with small divisors.
The final challenge to solving a perturbed Kepler problem is due to so called resonances, happening when a ratio of the satellite’s (used in a generic sense) orbital period to the period of a (cyclic) perturbing force becomes a simple fraction (such as 2:1, 3:2 etc.); the new technique is well equipped to elucidate main features of the corresponding solution, as corroborated by [14] and several related articles.
The new technique has thus a large range of applications (well beyond what we could cover in this article) due to its ability to resolve several potential issues adversely affecting many traditional methods of solution. Yet, further advancement is still possible, but requires a solid background in the technique’s fundamentals; it is hoped that our article has provided it.