Solving Perturbed Kepler Problem by Quaternion Algebra

Jan Vrbik

doi:10.4236/am.2026.176021

Applied Mathematics > Vol.17 No.6, June 2026

Jan Vrbik
Department of Mathematics, Brock University, St. Catharines, Canada.

Abstract

In this article, we use quaternions together with the Kustaanheimo-Stiefel transformation to solve the Kepler problem, first its unperturbed version and then with a small perturbing force added. We assume that the perturbing force is autonomous, which enables us to derive a set of first-order differential equations for the corresponding orbital elements, while keeping them fully autonomous as well. This is achieved without compromising the resulting accuracy; there is no need for averaging out fast oscillations or involving a fast-angle variable. As a consequence, the equations can be solved to arbitrary accuracy by iterative and routine application of the new formulas.

Keywords

Quaternions, Kustaanheimo-Stiefel Transformation, Kepler Problem, Perturbation Theory, Orbital Elements

Share and Cite:

Vrbik, J. (2026) Solving Perturbed Kepler Problem by Quaternion Algebra. Applied Mathematics, 17, 337-353. doi: 10.4236/am.2026.176021.

1. Quaternion Algebra

Quaternions are introduced as an extension of complex numbers where, instead of one imaginary unit $i$ , there are three of them, denoted $i$ , $j$ and $k$ ; we relate them to the $x$ , $y$ and $z$ directions of three-dimensional space, respectively. The square of each of these units equals −1, while any two of them anti-commute, e.g. $i \circ j = - j \circ i$ ; furthermore $i \circ j = k$ , the small circle indicating quaternion multiplication (an associative operation) [1]. A quaternion is thus a quantity with four components, namely

$\mathbb{A} : = A + a_{1} i + a_{2} j + a_{3} k = A + \mathbf{a}$ (1)

where $A$ is its scalar part and $a$ its vector part (we use the blackboard type for full quaternions and boldface for vectors). Adding two quaternions is a component-wise operation, while multiplication is carried out (consistently with the previous rules) by

$(A + a) \circ (B + b) = A B - a \cdot b + A b + B a + a \times b$ (2)

where $\cdot$ and $\times$ denote the dot and vector product, respectively; quaternion multiplication is associative (as stated already) but not commutative. The proof of both statements is routine; it is based on the following two identities

$(a \times b) \cdot c = a \cdot (b \times c)$ (3)

$(a \times b) \times c = (a \cdot c) b - (b \cdot c) a$
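As a small numerical illustration of rule (2), the following Python sketch multiplies quaternions stored as `(scalar, x, y, z)` tuples. The helper names `dot`, `cross` and `qmul`, as well as the sample values, are our own choices for illustration and are not part of the article.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector (cross) product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def qmul(p, q):
    """Quaternion product following Eq. (2); a quaternion is (scalar, x, y, z)."""
    A, a = p[0], p[1:]
    B, b = q[0], q[1:]
    axb = cross(a, b)
    scalar = A * B - dot(a, b)
    vector = tuple(A * bi + B * ai + ci for ai, bi, ci in zip(a, b, axb))
    return (scalar,) + vector

# The three imaginary units
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# Arbitrary sample quaternions used to test associativity
P, Q, S = (1.0, 2.0, -1.0, 0.5), (0.3, -0.7, 2.0, 1.1), (-1.2, 0.4, 0.9, -2.0)
left = qmul(qmul(P, Q), S)
right = qmul(P, qmul(Q, S))
assoc_error = max(abs(x - y) for x, y in zip(left, right))

print(qmul(I, J), qmul(J, I), assoc_error)
```

The first two printed products show $i \circ j = k$ and $j \circ i = - k$ , while `assoc_error` stays at round-off level, in line with the associativity claim.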

Quaternion conjugation is defined (and denoted) by

$\bar{\mathbb{A}} : = A - \mathbf{a}$ (4)

implying that $a : = \sqrt{\mathbf{a} \circ \bar{\mathbf{a}}}$ yields the length of vector $\mathbf{a}$ . Note that

$\bar{A \circ B} = \bar{B} \circ \bar{A}$ (5)

and that $A \circ \bar{A} = \bar{A} \circ A$ yields the sum of squares of $A$ ’s four components (we may then drop the $\circ$ symbol and use $A \bar{A}$ or $\bar{A} A$ to emphasize that the result is a scalar and the order irrelevant). Using the conjugate, we then build the inverse of $A$ by

$A^{- 1} : = \frac{\bar{A}}{A \bar{A}}$ (6)

so that $A \circ A^{- 1} = A^{- 1} \circ A = 1$ .

Finally $A^{n}$ , where $n$ is a positive integer, implies $A$ multiplied by itself $n$ times (calling the result the $n^{th}$ power of $A$ ); this means that we can now define a function of a quaternion, based on its Maclaurin expansion. The most useful of these is

$exp (\mathbb{A}) = exp (A) exp (\mathbf{a}) = exp (A) exp (a \hat{a}) = exp (A) (\cos a + \hat{a} \sin a)$ (7)

where $\hat{a} : = \frac{\mathbf{a}}{a}$ is the unit direction of $\mathbf{a}$ ; note that powers of $\hat{a}$ follow the $\hat{a}, - 1, - \hat{a}, 1, \hat{a}, \dots$ cycle, due to $\mathbf{a} \circ \mathbf{a} = - a^{2}$ .
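The closed form (7) can be compared against a truncated Maclaurin series. The sketch below is only an illustration: the function names `qmul`, `qexp` and `qexp_series`, the number of series terms and the sample quaternion are our assumptions.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (scalar, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qexp(q):
    """Closed form of Eq. (7): exp(A) (cos a + a_hat sin a)."""
    A, v = q[0], q[1:]
    a = math.sqrt(sum(x * x for x in v))
    if a == 0.0:
        return (math.exp(A), 0.0, 0.0, 0.0)
    f = math.exp(A) * math.sin(a) / a
    return (math.exp(A) * math.cos(a),) + tuple(f * x for x in v)

def qexp_series(q, terms=40):
    """Truncated Maclaurin series: sum of q^n / n! for n < terms."""
    total = (1.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(1, terms):
        term = tuple(x / n for x in qmul(term, q))
        total = tuple(t + x for t, x in zip(total, term))
    return total

sample = (0.3, 0.5, -1.2, 0.8)  # arbitrary test quaternion
exp_error = max(abs(x - y) for x, y in zip(qexp(sample), qexp_series(sample)))
print(exp_error)
```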

Rotation

Fixed rotation: Let $ℝ$ be a quaternion of unit magnitude, i.e. $ℝ : = R + r \hat{r}$ where $R^{2} + r^{2} = 1$ (equivalent to $ℝ \bar{ℝ} = 1$ ). This further implies that $ℝ$ can be uniquely written as

$ℝ = exp (\frac{w}{2})$ (8)

where $w$ is a vector whose direction is $\hat{r}$ and whose magnitude meets $cos (\frac{w}{2}) = R$ and $sin (\frac{w}{2}) = r$ . When $x$ represents a point in a three-dimensional (3D) space (using an inertial frame for its coordinates), it is easy to show that

$\begin{array}{l} x_{R} : = ℝ \circ x \circ \bar{ℝ} = exp (\frac{w}{2}) \circ x \circ exp (- \frac{w}{2}) \\ = x_{∥} + (\cos w + \hat{w} \sin w) \circ x_{⊥} = x_{∥} + x_{⊥} \cos w + \hat{w} \times x_{⊥} \sin w \end{array}$ (9)

where $x_{R}$ represents $x$ rotated by angle $w$ around the axis whose direction is $\hat{w}$ . Here $x_{∥} : = (x \cdot \hat{w}) \cdot \hat{w}$ is the component of $x$ parallel to $\hat{w}$ , while $x_{⊥} : = x - (x \cdot \hat{w}) \cdot \hat{w}$ is perpendicular to it. Note that $x_{∥}$ commutes with $w$ (and therefore with $ℝ$ ) while $x_{⊥}$ anti-commutes (implying that $x_{⊥} \circ \bar{ℝ} = ℝ \circ x_{⊥}$ ).

When $x (p)$ is a parametric description of a 3D object (such as an ellipse), the same operation similarly rotates the whole object, without changing its size or shape.
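Formula (9) can be checked numerically by comparing the quaternion sandwich $ℝ \circ x \circ \bar{ℝ}$ with its right-hand side. In this sketch the rotation angle, the axis and the point are arbitrary sample values, and the helper names are ours.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (scalar, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

angle = 0.9                      # rotation angle w (sample value)
axis = (1 / 3, 2 / 3, 2 / 3)     # unit direction w_hat (sample value)
x = (0.3, -1.0, 2.0)             # point to rotate (sample value)

# Left-hand side of (9): R o x o R_bar with R = exp(w/2), Eq. (8)
R = (math.cos(angle / 2),) + tuple(math.sin(angle / 2) * c for c in axis)
x_rot = qmul(qmul(R, (0.0,) + x), conj(R))

# Right-hand side of (9): x_par + x_perp cos w + w_hat x x_perp sin w
proj = sum(xi * wi for xi, wi in zip(x, axis))
x_par = tuple(proj * wi for wi in axis)
x_perp = tuple(xi - pi for xi, pi in zip(x, x_par))
w_cross = cross(axis, x_perp)
expected = tuple(p + q * math.cos(angle) + c * math.sin(angle)
                 for p, q, c in zip(x_par, x_perp, w_cross))

rot_error = max(abs(x_rot[n + 1] - expected[n]) for n in range(3))
print(x_rot[0], rot_error)
```

The scalar part of the result vanishes (the rotated point is again a pure vector) and the two sides of (9) agree to round-off.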

Uniform rotation: Changing $ℝ$ to

$ℝ = exp (\frac{w}{2} s)$ (10)

(where $s$ is interpreted as time) makes the rotation time dependent; $w$ then becomes the corresponding (constant) angular velocity.

Time-dependent rotation: Assuming that all components of $ℝ$ are now arbitrary functions of $s$ (yet preserving the $ℝ \bar{ℝ} = 1$ property), we now find the corresponding instantaneous angular velocity at time $s$ ; it follows from

$\begin{matrix} {x^{'}}_{R} (s) = ℝ^{'} \circ x \circ \bar{ℝ} + ℝ \circ x \circ {\bar{ℝ}}^{'} \\ = ℝ^{'} \circ \bar{ℝ} \circ x_{R} (s) + x_{R} (s) \circ ℝ \circ {\bar{ℝ}}^{'} \\ = 2 (ℝ^{'} \circ \bar{ℝ}) \times x_{R} (s) \end{matrix}$ (11)

since $ℝ \circ {\bar{ℝ}}^{'} = - ℝ^{'} \circ \bar{ℝ}$ (showing that $ℝ^{'} \circ \bar{ℝ}$ is a vector) and since $a \circ b - b \circ a = 2 a \times b$ ; this implies that $2 ℝ^{'} \circ \bar{ℝ}$ is the resulting angular velocity.
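The first of these two facts follows from differentiating the normalization condition; a short worked version of this step (our own filling-in, using only $ℝ \bar{ℝ} = 1$ and (5)) reads

$ℝ \circ \bar{ℝ} = 1 \Rightarrow ℝ^{'} \circ \bar{ℝ} + ℝ \circ {\bar{ℝ}}^{'} = 0 , \qquad \bar{ℝ^{'} \circ \bar{ℝ}} = ℝ \circ {\bar{ℝ}}^{'} = - ℝ^{'} \circ \bar{ℝ}$

so that $ℝ^{'} \circ \bar{ℝ}$ equals minus its own conjugate and therefore has no scalar part.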

Kepler frame: Letting the original coordinate frame itself rotate with $ℝ$ , and expressing the vector of the instantaneous angular velocity of the $ℝ$ rotation in this new (no longer inertial, but very convenient) Kepler frame, we get

$Z : = 2 \bar{ℝ} \circ (ℝ^{'} \circ \bar{ℝ}) \circ ℝ = 2 \bar{ℝ} \circ ℝ^{'}$ (12)

Euler angles: The usual way of parametrizing $ℝ$ is done [2] by

$ℝ = exp (k \frac{ϕ}{2}) \circ exp (i \frac{θ}{2}) \circ exp (k \frac{ψ}{2})$ (13)

i.e. first rotating with respect to the $z$ axis by angle $ψ$ , then rotating with respect to the (original) $x$ axis by angle $θ$ , and finally rotating with respect to the $z$ axis again by angle $ϕ$ ; the three angles are referred to as Euler angles [3].

Using this form of $ℝ$ and letting each of the three angles be a function of $s$ , the angular velocity (12), i.e. its Kepler-frame representation, becomes

$Z = i (θ^{'} \cos ψ + ϕ^{'} \sin ψ \sin θ) + j (ϕ^{'} \cos ψ \sin θ - θ^{'} \sin ψ) + k (ψ^{'} + ϕ^{'} \cos θ)$ (14)

We can then express each of the three time derivatives in terms of components of $Z$ by solving the corresponding three equations, thus getting

$ϕ^{'} = \frac{Z_{1} \sin ψ + Z_{2} \cos ψ}{\sin θ}$ (15)

$θ^{'} = Z_{1} \cos ψ - Z_{2} \sin ψ$

$ψ^{'} = Z_{3} - ϕ^{'} \cos θ$
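Formulas (14) and (15) can be verified numerically by differentiating (13) with a central finite difference and evaluating (12). The sketch below assumes arbitrary smooth test functions for the three Euler angles; all function names, the step size `h` and the sample point `s0` are ours.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (scalar, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def axis_exp(axis, half_angle):
    """exp(axis * half_angle) for a unit axis."""
    return (math.cos(half_angle),) + tuple(math.sin(half_angle) * c for c in axis)

I_AXIS, K_AXIS = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

def angles(s):
    """Sample Euler angles (phi, theta, psi) as functions of s."""
    return (0.3 + 0.2 * s, 0.8 + 0.1 * math.sin(s), -0.4 + 0.5 * s * s)

def rates(s):
    """Exact s-derivatives of the sample angles."""
    return (0.2, 0.1 * math.cos(s), s)

def rotation(s):
    """Eq. (13)."""
    phi, theta, psi = angles(s)
    return qmul(qmul(axis_exp(K_AXIS, phi / 2), axis_exp(I_AXIS, theta / 2)),
                axis_exp(K_AXIS, psi / 2))

s0, h = 0.7, 1e-6
dR = tuple((p - m) / (2 * h) for p, m in zip(rotation(s0 + h), rotation(s0 - h)))
Z = tuple(2 * c for c in qmul(conj(rotation(s0)), dR))        # Eq. (12)

phi, theta, psi = angles(s0)
dphi, dtheta, dpsi = rates(s0)
Z_formula = (dtheta * math.cos(psi) + dphi * math.sin(psi) * math.sin(theta),
             dphi * math.cos(psi) * math.sin(theta) - dtheta * math.sin(psi),
             dpsi + dphi * math.cos(theta))                   # Eq. (14)

# Inverting with Eq. (15) should recover the exact rates
dphi_rec = (Z[1] * math.sin(psi) + Z[2] * math.cos(psi)) / math.sin(theta)
dtheta_rec = Z[1] * math.cos(psi) - Z[2] * math.sin(psi)
dpsi_rec = Z[3] - dphi_rec * math.cos(theta)
print(Z, Z_formula, (dphi_rec, dtheta_rec, dpsi_rec))
```

Note that (15) divides by $\sin θ$ , so the check (like the formula itself) is meaningful only away from $θ = 0$ and $θ = π$ .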

2. Kepler Problem

To find the motion of a satellite (of negligible mass) orbiting a primary of gravitational mass $μ$ requires solving the following differential equation

$\ddot{r} + μ \frac{r}{r^{3}} = ε f$ (16)

where $\mathbf{r}$ is a vector with three components ( $x$ , $y$ and $z$ ), each a function of time $t$ , $r$ is the length of $\mathbf{r}$ , the double dot implies taking a second derivative of each of the $\mathbf{r}$ components with respect to $t$ , and $ε f$ is a perturbing force per unit mass of the satellite, assumed to be relatively small (i.e. $ε ≪ \frac{μ}{r^{2}}$ ) [3]; the factor $ε$ is kept explicitly in all quantities of first-order smallness, and higher powers of $ε$ indicate correspondingly higher degrees of smallness. When building a solution to (16), it will be important to distinguish between terms of the $ε^{0}$ type (solving the equation with no perturbing force), those proportional to $ε$ , and finally terms proportional to the second or higher power of $ε$ (ultimately to be ignored, as explained shortly).

The article considers only autonomous perturbing forces, meaning that $f$ is a function of $r$ but not explicitly of time.

To utilize quaternion algebra for solving (16), we introduce a new dependent variable $U$ (a quaternion) and a new independent variable $s$ (a scalar) called modified time [4] and [5]. These are related to the old variables by

$r : = U \circ i \circ \bar{U}$ (17a)

$\frac{d t}{d s} : = 2 r \sqrt{\frac{a}{μ}}$ (17b)

where $a > 0$ can be (as the subsequent proof shows) an arbitrary scalar function of $s$ , and $r = \sqrt{- U \circ i \circ \bar{U} \circ U \circ i \circ \bar{U}} = U \bar{U}$ . We also define the following scalar function of $s$

$Γ : = U^{'} \circ i \circ \bar{U} - U \circ i \circ {\bar{U}}^{'}$ (18)

where $'$ indicates differentiating with respect to $s$ . Note that Γ is thus twice the scalar part of $U^{'} \circ i \circ \bar{U}$ (as an alternate definition).

One can then show that, using the new variables, (16) reads

$\begin{array}{l} 2 U^{″} - \frac{2 U^{'} {\bar{U}}^{'} - 4 a}{r} U - \frac{a^{'}}{a} (U^{'} + \frac{Γ}{2 r} U \circ i) + \frac{2 Γ}{r} U^{'} \circ i + {(\frac{Γ}{r})}^{'} U \circ i \\ = - 4 r \frac{a}{μ} ε f \circ U \circ i \end{array}$ (19)

Proof:

$\dot{r} = \frac{2 U^{'} \circ i \circ \bar{U} - Γ}{2 r \sqrt{\frac{a}{μ}}} = \frac{2 U \circ i \circ {\bar{U}}^{'} + Γ}{2 r \sqrt{\frac{a}{μ}}}$ (20)

due to (17b). Post-multiplying by $2 \sqrt{\frac{a}{μ}} U \circ i$ results in

$2 \sqrt{\frac{a}{μ}} \dot{r} \circ U \circ i = - 2 U^{'} - \frac{Γ}{r} U \circ i$ (21)

Differentiating each side with respect to $s$ yields

$\begin{array}{l} \frac{a^{'}}{a} \sqrt{\frac{a}{μ}} \dot{r} \circ U \circ i + 2 \sqrt{\frac{a}{μ}} \dot{r} \circ U^{'} \circ i + 4 r \frac{a}{μ} (ε f - μ \frac{r}{r^{3}}) \circ U \circ i \\ = - 2 U^{″} - {(\frac{Γ}{r})}^{'} U \circ i - \frac{Γ}{r} U^{'} \circ i \end{array}$ (22)

which can be further simplified (note that $\mathbf{r} \circ U \circ i = \frac{\mathbf{r} \circ U \circ i \circ \bar{U} \circ U}{r} = \frac{\mathbf{r} \circ \mathbf{r} \circ U}{r} = - r U$ ) to get

$\begin{array}{l} - \frac{a^{'}}{a} (U^{'} + \frac{Γ}{2 r} U \circ i) - \frac{2}{r} {\bar{U}}^{'} U^{'} \circ U + \frac{Γ}{r} U^{'} \circ i + 4 r \frac{a}{μ} ε f \circ U \circ i + \frac{4 a}{r} U \\ = - 2 U^{″} - {(\frac{Γ}{r})}^{'} U \circ i - \frac{Γ}{r} U^{'} \circ i \end{array}$ (23)

which agrees with (19). $■$

Note that post-multiplying $U$ by $exp (i δ)$ , where $δ$ is any scalar function of $s$ , does not change the resulting $\mathbf{r}$ or its length $r$ ; this implies that $U \circ exp (i δ)$ must solve (19) whenever $U$ does (as can be explicitly verified). Nevertheless, the new solution does change the value of Γ to

$Γ + U \circ i δ^{'} \circ i \circ \bar{U} - U \circ i \circ (- i δ^{'}) \circ \bar{U} = Γ - 2 r δ^{'}$ (24)

This implies that we can always find a solution which meets $Γ = 0$ at all values of $s$ by a proper choice of $δ$ (referred to as gauge). Having done that, (19) is thus reduced to

$2 U^{″} - \frac{2 U^{'} {\bar{U}}^{'} - 4 a}{r} U - \frac{a^{'}}{a} U^{'} = - 4 r \frac{a}{μ} ε f \circ U \circ i$ (25)

while meeting $Γ = 0$ . Since (25) and the last condition imply that $Γ^{'}$ is also identically equal to 0 (just post-multiply each term of (25) by $i \circ \bar{U}$ to show that $2 U^{″} \circ i \circ \bar{U}$ has zero scalar part), it is then sufficient to make Γ equal to 0 initially; solving (25) then assures that Γ remains constant, thus maintaining the $Γ = 0$ condition automatically at any future time $s$ .
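The transformation (17a) and the gauge freedom discussed above can be illustrated numerically: $U \circ i \circ \bar{U}$ is a pure vector of length $U \bar{U}$ , and post-multiplying $U$ by $exp (i δ)$ leaves it unchanged. The sample quaternion `U`, the angle `delta` and the helper names below are arbitrary choices of ours.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (scalar, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

I_UNIT = (0.0, 1.0, 0.0, 0.0)

def ks_position(U):
    """Eq. (17a): r = U o i o U_bar."""
    return qmul(qmul(U, I_UNIT), conj(U))

U = (0.7, -0.2, 1.1, 0.4)              # arbitrary sample quaternion
r_vec = ks_position(U)
norm_sq = sum(c * c for c in U)        # U U_bar
length = math.sqrt(sum(c * c for c in r_vec[1:]))

delta = 0.83                           # arbitrary gauge angle
gauge = (math.cos(delta), math.sin(delta), 0.0, 0.0)   # exp(i delta)
r_gauged = ks_position(qmul(U, gauge))
gauge_error = max(abs(x - y) for x, y in zip(r_vec, r_gauged))

print(r_vec, length, norm_sq, gauge_error)
```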

Unperturbed solution: When $f = 0$ , we choose $a$ to be a positive constant and, assuming the solution to start with $Γ = 0$ , the equation to solve is then

$U^{″} = \frac{U^{'} {\bar{U}}^{'} - 2 a}{r} U$ (26)

where

$\frac{U^{'} {\bar{U}}^{'} - 2 a}{r} : = E = \frac{2 a}{μ} (\frac{\dot{r} \cdot \dot{r}}{2} - \frac{μ}{r})$ (27)

is a constant of motion, proportional to the total (kinetic and gravitational) energy of the satellite [3].

Proof: Differentiating the LHS of (27) yields

$\begin{array}{l} \frac{U^{″} \circ {\bar{U}}^{'} + U^{'} \circ {\bar{U}}^{″}}{r} - \frac{E}{r} r^{'} \\ = \frac{E U \circ {\bar{U}}^{'} + U^{'} \circ E \bar{U}}{r} - \frac{E}{r} (U^{'} \circ \bar{U} + U \circ {\bar{U}}^{'}) = 0 \end{array}$ (28)

due to (26) and confirming that $E'$ is identically zero. Furthermore

$\dot{r} \cdot \dot{r} = - \frac{U^{'} \circ i \circ \bar{U}}{r \sqrt{\frac{a}{μ}}} \circ \frac{U \circ i \circ {\bar{U}}^{'}}{r \sqrt{\frac{a}{μ}}} = \frac{μ}{a} \frac{U^{'} {\bar{U}}^{'}}{r}$ (29)

verifies the rest. $■$

Assuming that $E$ is negative (to make the satellite orbit the primary), we may now choose $a$ so that $E = - 1$ . The equation to solve is then

$U^{″} + U = O$ (30)

( $O$ denoting a zero quaternion) while

$a = \frac{U^{'} {\bar{U}}^{'} + r}{2}$ (31)

Since there is no normal (perpendicular to the plane defined by the initial $r$ and $\dot{r}$ directions) force, it is obvious that the motion, and the corresponding solution to (16), must be planar. It is thus sufficient to start with a solution that keeps $r$ in the x-y plane; this solution then can be transformed (by applying an $ℝ = exp (i \frac{θ}{2}) \circ exp (k \frac{ϕ}{2})$ fixed rotation to it) to any other specific attitude (the transformation preserves the $Γ = 0$ condition and the value of $E$ ).

To construct the most general solution to (30) [6], we use the following notation

$q : = exp (k (s - σ))$ (32)

and

$z : = q^{2} = exp (2 k (s - σ))$ (33)

where $σ$ is a constant. From now on, quantities of this type (i.e. having only scalar and $k$ components) are called complex (with $k$ representing the purely-imaginary unit) and can be multiplied using rules of complex algebra (no need for the $\circ$ operation, unless one of the factors is a quaternion); their complex nature is emphasized by using a different font.

The most general form of $U$ that keeps (17a) in the $x$ - $y$ plane and meets (30) is then

$U = exp (k \frac{ψ}{2}) (A q + B q^{- 1})$ (34)

with $A, B, σ$ and $ψ$ being real constants; this implies that

$r = A^{2} + B^{2} + A B (z + z^{- 1})$ (35)

$U^{'} = k \, exp (k \frac{ψ}{2}) (A q - B q^{- 1})$ (36)

and

$U^{'} {\bar{U}}^{'} = A^{2} + B^{2} - A B (z + z^{- 1})$ (37)

further implying, due to (31), that

$a = A^{2} + B^{2}$ (38)

It is easy to see that $U^{'} \circ i \circ \bar{U}$ then has only $i$ and $j$ components, implying that $Γ = 0$ and confirming our original assumption.
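Because the planar solution involves only scalar and $k$ components, it can be checked with ordinary complex arithmetic (with Python's `1j` playing the role of $k$ ). The sketch below evaluates (34) to (38) for arbitrary sample constants; the variable and function names are ours.

```python
import cmath

A, B, sigma, psi = 1.2, 0.5, 0.3, 0.7     # arbitrary sample constants

def U(s):
    q = cmath.exp(1j * (s - sigma))                        # Eq. (32)
    return cmath.exp(1j * psi / 2) * (A * q + B / q)       # Eq. (34)

def U_prime(s):
    q = cmath.exp(1j * (s - sigma))
    return 1j * cmath.exp(1j * psi / 2) * (A * q - B / q)  # Eq. (36)

s = 1.1
z = cmath.exp(2j * (s - sigma))                            # Eq. (33)
r = abs(U(s)) ** 2                                         # r = U U_bar
speed_sq = abs(U_prime(s)) ** 2                            # U' U_bar'
a = (speed_sq + r) / 2                                     # Eq. (31)

r_formula = A**2 + B**2 + A * B * (z + 1 / z).real         # Eq. (35)
speed_formula = A**2 + B**2 - A * B * (z + 1 / z).real     # Eq. (37)

# Central finite difference as an independent check of Eq. (36)
h = 1e-6
fd_error = abs((U(s + h) - U(s - h)) / (2 * h) - U_prime(s))

print(r - r_formula, speed_sq - speed_formula, a - (A**2 + B**2), fd_error)
```

All four printed differences are at round-off (or finite-difference) level, consistent with $a = A^{2} + B^{2}$ of (38).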

The fully general solution is then

$U = ℝ \circ (A q + B q^{- 1})$ (39)

where $ℝ$ is an arbitrary rotation. This implies that

$\begin{matrix} r = ℝ \circ {(A q + B q^{- 1})}^{2} \circ i \circ \bar{ℝ} \\ = ℝ \circ (A^{2} z + B^{2} z^{- 1} + 2 A B) \circ i \circ \bar{ℝ} \\ = ℝ \circ (((A^{2} + B^{2}) cos (2 s - 2 σ) + 2 A B) i + (A^{2} - B^{2}) sin (2 s - 2 σ) j) \circ \bar{ℝ} \end{matrix}$ (40)

We already know that $A^{2} + B^{2} = a$ ; introducing $e : = \frac{2 A B}{A^{2} + B^{2}}$ , the same solution can be written as

$r = ℝ \circ a ((cos (2 s - 2 σ) + e) i + \sqrt{1 - e^{2}} \sin (2 s - 2 σ) j) \circ \bar{ℝ}$ (41)

which is easily recognized as an equation of an ellipse with semi-major axis equal to $a$ and eccentricity equal to $e$ . The remaining parameters (called orbital elements) of the general solution are the three Euler angles of the ellipse’s attitude and $σ$ , the value of $s$ at apocenter ( $r$ with the largest magnitude).

Finally, since

$r = a (1 + e \cos (2 s - 2 σ))$ (42)

we get, based on (17b),

$t = \frac{a^{3 / 2}}{\sqrt{μ}} (2 s - 2 σ + e \sin (2 s - 2 σ))$ (43)

relating real time $t$ (measured from the apocenter passage) to $2 s - 2 σ$ , the latter being a modified version (after subtracting $π$ ) of the so-called eccentric anomaly.
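The relation between modified and real time can be checked by integrating (17b) numerically, with $r$ taken from (42), and comparing the result with the antiderivative $\frac{a^{3 / 2}}{\sqrt{μ}} (u + e \sin u)$ , where $u = 2 s - 2 σ$ . The parameter values, the midpoint rule and the function names in this sketch are our own choices.

```python
import math

a, mu, e, sigma = 1.5, 2.0, 0.4, 0.2      # arbitrary sample values

def dt_ds(s):
    r = a * (1 + e * math.cos(2 * s - 2 * sigma))   # Eq. (42)
    return 2 * r * math.sqrt(a / mu)                # Eq. (17b)

def t_closed(s):
    """Antiderivative of dt/ds, with t = 0 at apocenter (s = sigma)."""
    u = 2 * s - 2 * sigma
    return a ** 1.5 / math.sqrt(mu) * (u + e * math.sin(u))

def t_numeric(s_end, n=20000):
    """Midpoint-rule integral of dt/ds from s = sigma to s = s_end."""
    h = (s_end - sigma) / n
    return h * sum(dt_ds(sigma + (m + 0.5) * h) for m in range(n))

s_end = sigma + 1.0
time_error = abs(t_numeric(s_end) - t_closed(s_end))

# One full orbit corresponds to s increasing by pi
period = t_closed(sigma + math.pi) - t_closed(sigma)
kepler_period = 2 * math.pi * a ** 1.5 / math.sqrt(mu)
print(time_error, period, kepler_period)
```

The computed period agrees with Kepler's third law, $2 π \sqrt{a^{3} / μ}$ , which serves as an independent sanity check on the normalization.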

3. Perturbed Equation

When $ε f$ is nonzero, finding a solution becomes substantially more challenging; we can achieve it only to $ε$ accuracy (terms proportional to higher powers of $ε$ are ignored from now on). Nevertheless, once such a first-order solution is found, we can use the same formulas to build an $ε^{2}$ -accurate (and higher) solution by iterating.

To find an $ε$ -accurate solution [6], we must abandon the $Γ = 0$ gauge and return to solving (19) directly; to make the solution unique, a different gauge will offer itself in the process. We must also allow all six orbital elements of the unperturbed solution

$U = ℝ \circ \sqrt{\frac{a}{1 + β^{2}}} (q + β q^{- 1}) : = ℝ \circ U_{0}$ (44)

where $β : = \frac{B}{A}$ (a more convenient parametrization) to be slowly varying (their derivatives proportional to $ε$ ) functions of $s$ (this implies that their second derivatives are $ε^{2}$ -proportional and therefore ignored). The 0 subscript of $U_{0}$ indicates that (i) the corresponding quantity is in Kepler’s frame and (ii) it is evaluated to $ε^{0}$ (i.e. unperturbed-solution) accuracy. Note that now

$q' = k q (1 - ε σ')$ (45)

$z' = 2 k z (1 - ε σ')$

(recall that $ε$ multiplies all $ε^{1}$ -small quantities, while $ε^{2}$ -small and higher order terms are discarded).

Finally, $U_{0}$ itself needs to be extended to (and replaced by)

$U_{p} : = U_{0} + ε U_{D} + ε U_{S} \circ i : = \sqrt{\frac{a}{1 + β^{2}}} (q + β q^{- 1} + q ε D (z) + q \frac{ε S (z) + k ε b}{1 + β z} \circ i)$ (46)

where $D (z)$ is a Laurent-series function of $z$ [7], with missing $D_{- 1} z^{- 1}$ and $D_{0} z^{0}$ terms, i.e.

$D (z) : = \dots + D_{- 4} z^{- 4} + D_{- 3} z^{- 3} + D_{- 2} z^{- 2} + D_{1} z + D_{2} z^{2} + D_{3} z^{3} + D_{4} z^{4} + \dots$ (47)

where the $D_{j}$ coefficients are (yet to be solved for) complex quantities.

Similarly, $S (z)$ is another such complex function which must furthermore meet the $S (z) = - \bar{S (z)}$ condition (our new gauge), and is missing the $S_{1} z$ (and consequently $S_{- 1} z^{- 1}$ ) and $S_{0}$ terms. The latter is actually needed to complete our solution, but is kept separate from $S$ (we have denoted it $k b$ , where $b$ is real). Note that $U_{0}$ , $U_{D}$ and $U_{S}$ are thus complex quantities as well. The solution we hope to construct should then result in unique formulas for $D_{j}$ , $S_{j}$ , $b$ and for the $s$ derivatives of the six orbital elements; we now proceed to find these by substituting the proposed solution (namely $ℝ \circ U_{p}$ ) into (19), and matching its two sides.

Once done, (17a) would then convert the solution to $r$ (as a function of $s$ ), while (17b) then relates $s$ to real time $t$ ; this is a routine final step of the procedure.

3.1. Further Simplification

We start by simplifying the solution’s first two $s$ derivatives, getting

${(ℝ \circ U_{p})}^{'} = ℝ \circ {U^{'}}_{p} + ℝ^{'} \circ U_{p} = ℝ \circ ({U^{'}}_{p} + \frac{ε Z}{2} \circ U_{0})$ (48)

and

${(ℝ \circ U_{p})}^{″} = ℝ \circ ({U^{″}}_{p} + ε Z \circ {U^{'}}_{0})$ (49)

We then simplify the resulting $Γ = {U^{'}}_{p} \circ i \circ \bar{U_{p}} - U_{p} \circ i \circ \bar{{U^{'}}_{p}}$ , which turns out to be nonzero and $ε$ -small. More specifically

$\begin{matrix} Γ = {U^{'}}_{p} \circ i \circ \bar{U_{p}} - U_{p} \circ i \circ \bar{{U^{'}}_{p}} + ε \frac{Z \circ r_{0} + r_{0} \circ Z}{2} \\ = ε ({U^{'}}_{0} \bar{U_{S}} - U_{0} \bar{{U^{'}}_{S}} - {U^{'}}_{S} \bar{U_{0}} + U_{S} \bar{{U^{'}}_{0}} - \frac{Z_{1} + k Z_{2}}{2} {\bar{U_{0}}}^{2} - \frac{Z_{1} - k Z_{2}}{2} U_{0}^{2}) \end{matrix}$ (50)

where $r_{0} : = U_{0} \circ i \circ \bar{U_{0}}$ , and a bar now implies complex conjugation. Note that contributions from the $U_{D}$ part of the solution have cancelled out, the individual terms are complex, but the final Γ is real.

Proof:

$\begin{matrix} {U^{'}}_{p} \circ i \circ \bar{U_{p}} = ({U^{'}}_{0} + ε {U^{'}}_{D} + ε {U^{'}}_{S} \circ i) \circ i \circ (\bar{U_{0}} + ε \bar{U_{D}} - i \circ ε \bar{U_{S}}) \\ = {U^{'}}_{0} \circ i \circ \bar{U_{0}} + ε {U^{'}}_{0} U_{D} \circ i + ε {U^{'}}_{D} U_{0} \circ i + ε {U^{'}}_{0} \bar{U_{S}} - ε {U^{'}}_{S} \bar{U_{0}} \end{matrix}$ (51)

Adding the corresponding conjugate removes the first three terms, as ${U^{'}}_{0} \circ i \circ \bar{U_{0}}$ contributes the unperturbed Γ (which equals 0), and conjugating the next two terms reverses their sign. The last two terms (plus their conjugates) then yield the first four terms of (50). Similarly

$\begin{matrix} Z \circ r_{0} = ((Z_{1} + k Z_{2}) \circ i + k Z_{3}) \circ U_{0} \circ i \circ \bar{U_{0}} \\ = - (Z_{1} + k Z_{2}) {\bar{U_{0}}}^{2} + j \circ Z_{3} {\bar{U_{0}}}^{2} \end{matrix}$ (52)

Adding its conjugate, namely

$- (Z_{1} - k Z_{2}) U_{0}^{2} - Z_{3} U_{0}^{2} \circ j = - (Z_{1} - k Z_{2}) U_{0}^{2} - j \circ Z_{3} {\bar{U_{0}}}^{2}$ (53)

then verifies the remaining terms of (50). $■$

We are now ready to substitute $ℝ \circ U_{p}$ into (19); matching the equation’s two sides then yields formulas for the $s$ -derivatives of orbital elements and for the coefficients of $D$ and $S$ . The resulting set of first-order differential equations for orbital elements (still truly autonomous, implying no need for averaging, and unaffected by one-orbit oscillations) can then be integrated, yielding invaluable insights into their long-range behaviour. On the other hand, the $D$ , $S$ and $b$ components of the solution provide parallel and perpendicular (to the Kepler frame) distortions, respectively, of a single orbit. Since no approximation is used to build an $ε$ -accurate solution, the same procedure can be used (iteratively) to construct $ε^{2}$ -accurate (and higher-order) solutions, thus achieving arbitrary accuracy. The main advantage over similar techniques such as the Lie-Deprit transformation [8] and Hamiltonian canonical perturbation theory [9] is the relative simplicity of our (yet to be derived) formulas. Admittedly, this derivation is rather involved (as seen shortly), but once found, the same formulas can be routinely applied to any autonomous perturbation and carried out to any order of $ε$ accuracy. The only mathematical tool needed is the already mentioned Laurent-series expansion of complex functions.

To find explicit formulas for all unknowns of the proposed solution, we first pre-multiply each term of (19) by $\bar{ℝ}$ ; this means that all quantities are transformed into the orbit’s Kepler frame, correspondingly simplifying the $ℝ \circ U_{p}$ solution and its two $s$ derivatives (48) and (49), while the RHS becomes

$- 4 r_{0} \frac{a}{μ} ε f_{0} \circ U_{0} \circ i$ (54)

where $r_{0}$ and $f_{0} : = \bar{ℝ} \circ f \circ ℝ$ are evaluated using the unperturbed solution $U_{0}$ . We now proceed to de-couple the resulting equation by converting it into two complex equations (easier to deal with, as multiplication becomes commutative).

3.2. Building a Solution

To derive first of these equations, we start with the LHS of the Kepler-frame version of (19), namely (expressed to the $ε^{1}$ accuracy, and utilizing the fact that Γ, $Γ^{'}$ and $a^{'}$ are $ε$ small)

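$2 ({U^{″}}_{p} + ε Z \circ {U^{'}}_{0}) - \frac{2 ({U^{'}}_{p} + \frac{ε Z}{2} \circ U_{0}) \circ (\bar{{U^{'}}_{p}} - \bar{U_{0}} \circ \frac{ε Z}{2}) - 4 a}{r} U_{p}$

$- ε \frac{a^{'}}{a} {U^{'}}_{0} + 2 ε \frac{Γ}{r_{0}} {U^{'}}_{0} \circ i + ε {(\frac{Γ}{r_{0}})}^{'} U_{0} \circ i$ (55)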

Post-multiplying by $\bar{U_{p}}$ and keeping complex components of each term only, we get (to the same $ε^{1}$ accuracy)

$2 ({U^{″}}_{1} + k ε Z_{3} {U^{'}}_{0}) \bar{U_{1}} - 2 ({U^{'}}_{1} + \frac{k ε Z_{3}}{2} U_{0}) (\bar{{U^{'}}_{1}} - \frac{k ε Z_{3}}{2} \bar{U_{0}}) + 4 a - ε \frac{a^{'}}{a} {U^{'}}_{0} \bar{U_{0}}$ (56)

(where $U_{1} : = U_{0} + ε U_{D}$ ), while the RHS becomes

$- 4 r_{0} \frac{a}{μ} {(ε f_{0} \circ r_{0})}_{cx} : = - 4 ε r_{0} (1 + β z) Q (z)$ (57)

(the cx subscript implies keeping only scalar and $k$ components), and $Q (z)$ is the correspondingly defined Laurent series. The Mathematica program of Figure 1 completes the task of converting the LHS (short of the $\frac{a}{1 + β^{2}}$ factor) into a function of $D (z)$ and of $s$ derivatives of four orbital elements (highlighted); note that $ε^{0}$ terms have cancelled out, as expected. The code is reasonably self-explanatory, even to people not familiar with Mathematica.

Explicit formulas for the corresponding unknowns of the perturbed solution are then found (see Figure 2) by matching coefficients of all powers of $z$ in (57). The program first solves for the $D_{n}$ coefficient of the $D (z)$ expansion while ignoring the $a^{'}$ , $β^{'}$ , $σ^{'}$ , $Z_{3}$ terms (contributing to $z^{- 1}$ , $z^{0}$ and $z^{1}$ powers only). This requires simultaneously solving for $D_{n}$ , $D_{- n}$ and their complex conjugates, while considering them to be algebraically independent (they must and do turn out to be consistent with each other). To build the first of these four equations, we collect (the first line of code) coefficients of $z^{n}$ which $D (z)$ contributes to the LHS of Figure 1, e.g. $\bar{D (z)}$ contributes $D_{- n}$ , $2 z^{2} (1 + β z) D^{″} (z)$ contributes $2 n (n - 1) D_{n} + 2 β (n - 1) (n - 2) D_{n - 1}$ , etc., and subtract the coefficient of $z^{n}$ after expanding the RHS of (57); the second equation is just the $n \to - n$ counterpart of the first one, further including complex conjugates of both. The four equations are then expanded in powers of $β$ and solved in an iterative manner; a careful choice of the expression multiplying $Q (z)$ on the RHS of (57) has made the iterative solution terminate upon reaching $β^{2}$ -proportional terms (see the first part of the program, ending with a highlighted $D_{n}$ solution). Note that the formula does not solve for $D_{- 1}$ and $D_{0}$ , implying that

$D (z) : = \sum_{n = - \infty}^{- 2} D_{n} z^{n} + \sum_{n = 1}^{\infty} D_{n} z^{n}$ (58)
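The coefficient bookkeeping used in this matching step can be illustrated on the single contribution quoted above, namely that $2 z^{2} (1 + β z) D^{″} (z)$ contributes $2 n (n - 1) D_{n} + 2 β (n - 1) (n - 2) D_{n - 1}$ to the coefficient of $z^{n}$ . The sketch below is not the article's Mathematica program; it stores a truncated Laurent series as a Python dictionary (a representation of our choosing) with arbitrary sample coefficients.

```python
import cmath

def apply_operator(D, beta):
    """Laurent coefficients of 2 z^2 (1 + beta z) D''(z); D maps power -> coefficient."""
    out = {}
    for n, c in D.items():
        # z^2 D'' keeps the power n, while beta z^3 D'' raises it to n + 1
        out[n] = out.get(n, 0.0) + 2 * n * (n - 1) * c
        out[n + 1] = out.get(n + 1, 0.0) + 2 * beta * n * (n - 1) * c
    return out

def quoted_coefficient(D, beta, n):
    """Coefficient of z^n as quoted in the text."""
    return (2 * n * (n - 1) * D.get(n, 0.0)
            + 2 * beta * (n - 1) * (n - 2) * D.get(n - 1, 0.0))

# Sample series with no z^-1 and z^0 terms, as in Eq. (58)
D = {-3: 0.5, -2: -1.0, 1: 2.0, 2: 0.25}
beta = 0.3
coeffs = apply_operator(D, beta)

# Independent check: evaluate both forms at a point on the unit circle
z0 = cmath.exp(0.4j)
second_derivative = sum(n * (n - 1) * c * z0 ** (n - 2) for n, c in D.items())
direct = 2 * z0 ** 2 * (1 + beta * z0) * second_derivative
from_coeffs = sum(c * z0 ** n for n, c in coeffs.items())

print(coeffs, abs(direct - from_coeffs))
```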

Figure 1. (56) further simplified.

Figure 2. Solving (56)=(57).

This makes $D (z)$ match coefficients of all powers of $z$ , with the exception of $z^{- 1}$ , $z^{0}$ and $z^{1}$ ; these are then used to solve for $a^{'}$ , $β^{'}$ , $σ^{'}$ and $Z_{3}$ ; this is further simplified by solving, separately (yet another helpful de-coupling) the real (yielding $Z_{3}$ and $σ^{'}$ ) and purely imaginary (yielding $a^{'}$ and $β^{'}$ ) parts of the corresponding three equations. Note that in both cases the equations are linearly dependent, thus allowing for a unique solution of each pair of unknowns (highlighted at the end of Figure 3).

To find $S (z)$ , $Z_{1}$ , $Z_{2}$ and $b$ , we now keep only the $i$ and $j$ terms of (55), thus getting

$2 ε {U^{″}}_{S} \circ i + 2 ε (Z_{1} i + Z_{2} j) \circ {U^{'}}_{0} + 2 ε U_{S} \circ i + 2 ε \frac{Γ}{r_{0}} {U^{'}}_{0} \circ i + ε {(\frac{Γ}{r_{0}})}^{'} U_{0} \circ i$

Post-multiplying by $i \circ \bar{U_{0}}$ results in

Figure 3. LHS of (59).

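$- 2 ε ({U^{″}}_{S} + U_{S}) \bar{U_{0}} - 2 ε (Z_{1} + k Z_{2}) \bar{{U^{'}}_{0}} \bar{U_{0}}$

$- 2 ε \frac{Γ}{r_{0}} {U^{'}}_{0} \bar{U_{0}} - ε {(\frac{Γ}{r_{0}})}^{'} r_{0} = 4 k r_{0}^{2} \frac{a}{μ} {(ε f_{0})}_{3} : = 4 ε r_{0} W (z)$ (59)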

where $\bar{W (z)} = - W (z)$ (note that each term of the equation has this property). We then proceed to simplify the LHS (short of a $- \frac{a}{1 + β^{2}}$ factor) of (59) by the Mathematica program of Figure 4.

We then find $S_{n}$ (now much easier, as only one equation needs to be solved), followed by deriving formulas for $Z_{1}$ , $Z_{2}$ and $b$ (by matching coefficients of $z^{0}$ and $z^{1}$ ); this is done in Figure 5.

We now have a complete set of formulas needed to find (based on the perturbing force) all ingredients of our solution. Using this ( $ε$ -accurate) solution, we can then evaluate both sides of (19) to the $ε^{2}$ accuracy and use the same formulas to construct an $ε^{2}$ -accurate solution, and so on. Post-multiplying the final solution by $exp (i δ \cdot s)$ , where $δ$ is a properly chosen constant, we can then impose the $Γ = 0$ gauge, if desired.

3.3. Oblateness Example

To illustrate how to apply these formulas, we consider the perturbing force experienced by an artificial satellite due to Earth’s oblateness, given by (using Kepler’s frame)

Figure 4. Solving (59).

Figure 5. Orbital-element derivatives under oblateness perturbations (Part 1).

$f_{0} : = - \frac{ε μ R^{2}}{r_{0}^{5}} (\frac{3}{2} r_{0} - \frac{15}{2} \frac{{(r_{0} \cdot u_{0})}^{2} r_{0}}{r_{0}^{2}} + 3 (r_{0} \cdot u_{0}) u_{0})$ (60)

where $ε$ is the second zonal-harmonic coefficient, $R$ is the Earth’s equatorial radius, and $u$ (equal to $k$ , by our choice of coordinates) is the unit direction of the Earth’s axis. To convert $u$ to Kepler’s frame, we post-multiply it by $ℝ$ of (13) and pre-multiply by $\bar{ℝ}$ , thus getting

$u_{0} = i \sin θ \sin ψ + j \sin θ \cos ψ + k \cos θ$ (61)
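Formula (61) can be confirmed numerically by building $ℝ$ from (13) and evaluating $\bar{ℝ} \circ k \circ ℝ$ directly. The Euler-angle values and helper names in this sketch are arbitrary choices of ours.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (scalar, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def axis_exp(axis, half_angle):
    """exp(axis * half_angle) for a unit axis."""
    return (math.cos(half_angle),) + tuple(math.sin(half_angle) * c for c in axis)

phi, theta, psi = 0.4, 1.1, -0.6          # arbitrary sample Euler angles
I_AXIS, K_AXIS = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# Eq. (13)
R = qmul(qmul(axis_exp(K_AXIS, phi / 2), axis_exp(I_AXIS, theta / 2)),
         axis_exp(K_AXIS, psi / 2))

# Earth's axis u = k expressed in the Kepler frame: R_bar o k o R
u0 = qmul(qmul(conj(R), (0.0, 0.0, 0.0, 1.0)), R)

# Eq. (61)
u0_formula = (0.0,
              math.sin(theta) * math.sin(psi),
              math.sin(theta) * math.cos(psi),
              math.cos(theta))

u0_error = max(abs(x - y) for x, y in zip(u0, u0_formula))
print(u0, u0_error)
```

As expected, the result does not depend on $ϕ$ , since a rotation around the Earth's axis leaves that axis unchanged.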

The key quantities needed to build the corresponding perturbed solution are

$Q (z) = \frac{ε a R^{2}}{r_{0}^{5} (1 + β z)} (\frac{3}{2} r_{0}^{2} - \frac{9}{2} {(r_{0} \cdot u_{0})}^{2} - 3 k (r_{0} \cdot u_{0}) {(u_{0} \times r_{0})}_{3})$ (62)

and

$W (z) = - \frac{ε a R^{2}}{r_{0}^{6}} 3 k (r_{0} \cdot u_{0}) \cos θ$ (63)

Converting these into derivatives of the orbital elements is done by the Mathematica programs of Figure 5 and Figure 6; while both $a^{'}$ and $β^{'}$ turn out to be equal to zero, further application of (15) results in the following classical formulas

$ϕ^{'} = - \frac{2 ε}{a^{2}} \frac{{(1 + β^{2})}^{4}}{{(1 - β^{2})}^{4}} \cos θ$ (64)

$θ^{'} = 0$

$ψ^{'} = \frac{ε}{a^{2}} \frac{{(1 + β^{2})}^{4}}{{(1 - β^{2})}^{4}} (5 \cos^{2} θ - 1)$

Figure 6. Orbital-element derivatives under oblateness perturbation (Part 2).

which translate into a slow, uniform precession of the orbital plane around the Earth’s axis (with angular speed of $ϕ^{'}$ ) and a similar rotation of the orbit’s perigee within the orbital plane with angular speed of $ψ^{'}$ , while keeping the inclination angle fixed. Note that the perigee rotation ceases when $θ = arcsin \frac{2}{\sqrt{5}}$ (equivalent to $5 \cos^{2} θ = 1$ ), called the critical angle; the $ε^{2}$ -accurate solution reveals that, instead of fully stopping, the rotation turns into a libration around the critical angle [10] and [11], but pursuing this goes beyond the scope of this article (consult the two references for further details). The first of these also derives expressions for the resulting distortions of the Kepler-frame $r$ ; we do not quote them here due to their complexity.
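The critical angle quoted above is easy to evaluate. The sketch below codes the $ψ^{'}$ formula of (64) as a hypothetical helper `psi_rate` (the function name and the sample arguments are ours) and confirms that it vanishes at $θ = arcsin \frac{2}{\sqrt{5}}$ .

```python
import math

def psi_rate(eps, a, beta, theta):
    """Perigee rotation rate of Eq. (64), per unit of modified time s."""
    shape = ((1 + beta ** 2) / (1 - beta ** 2)) ** 4
    return eps / a ** 2 * shape * (5 * math.cos(theta) ** 2 - 1)

theta_c = math.asin(2 / math.sqrt(5))
theta_c_degrees = math.degrees(theta_c)
rate_at_critical = psi_rate(1e-3, 1.0, 0.1, theta_c)   # sample eps, a, beta

print(theta_c_degrees)    # about 63.43 degrees
print(rate_at_critical)   # zero up to round-off
```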

4. Conclusions

In this final section we want to mention a few potential extensions of the new technique (excluded from our basic presentation). Firstly, the technique is easily capable of dealing with more than one autonomous force at a time and (at the same time) reaching an arbitrarily accurate solution by iterating; this simply requires attaching a different small parameter ( $ε_{1}, ε_{2}, \dots$ ) to each individual perturbing force and applying existing formulas to individual terms of the corresponding expansion. A more difficult task (even in a single-force situation) is to establish the largest value of $ε$ which forces the iteration process to converge; this remains an open question. But luckily, in typical applications, perturbing forces are small enough to guarantee fast convergence.

The most obvious modification of our formulas is clearly needed when the perturbing force is no longer autonomous; this has been done in [12]. The solution can no longer avoid introducing fast (i.e. one orbit) oscillations into the resulting differential equations for orbital elements, thus making it more difficult to maintain high accuracy when exploring their long-term (millions of years) behavior. Nevertheless, in these cases, getting the correct qualitative picture is often the main goal, amply met by the extended formulas.

When all parameters of the perturbing force are numerical, it becomes beneficial (when constructing a higher-order solution) to modify the iteration process not just to find the next-order solution, but also to update the existing one (until no changes are observed); this has been demonstrated in [13] and offers a way of properly dealing with small divisors.

The final challenge to solving a perturbed Kepler problem is due to so-called resonances, which occur when the ratio of the satellite’s (used in a generic sense) orbital period to the period of a (cyclic) perturbing force becomes a simple fraction (such as 2:1, 3:2 etc.); the new technique is well equipped to elucidate the main features of the corresponding solution, as corroborated by [14] and several related articles.

The new technique has thus a large range of applications (well beyond what we could cover in this article) due to its ability to resolve several potential issues adversely affecting many traditional methods of solution. Yet, further advancement is still possible, but requires a solid background in the technique’s fundamentals; it is hoped that our article has provided it.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1]	Serre, J.P. (1973) A Course in Arithmetic, Graduate Texts in Mathematics. Springer.
[2]	Chelnokov, Y.N. (2022) Quaternion Methods and Models of Regular Celestial Mechanics and Astrodynamics. Applied Mathematics and Mechanics, 43, 21-80.
[3]	Goldstein, H. (1980) Classical Mechanics. 2nd Edition, Addison-Wesley.
[4]	Kustaanheimo, P., Schinzel, A., Davenport, H. and Stiefel, E. (1965) Perturbation Theory of Kepler Motion Based on Spinor Regularization. Journal für die reine und angewandte Mathematik, 1965, 204-219.
[5]	Stiefel, E.L. and Scheifele, G. (1971) Linear and Regular Celestial Mechanics. Springer-Verlag.
[6]	Vrbik, J. (2023) New Methods of Celestial Mechanics. Bentham Science Publishers.
[7]	Rudin, W. (1987) Real and Complex Analysis. 3rd Edition, McGraw-Hill.
[8]	Cary, J.R. (1981) Lie Transform Perturbation Theory for Hamiltonian Systems. Physics Reports, 79, 129-159.
[9]	Boccaletti, D. and Pucacco, G. (1999) Theory of Orbits. Volume 2: Perturbative and Geometrical Methods. Springer-Verlag.
[10]	Vrbik, J. (1997) Oblateness Perturbations to Fourth Order. Monthly Notices of the Royal Astronomical Society, 291, 65-70.
[11]	Vrbik, J. (2009) Second Erratum: Oblateness Perturbations to Fourth Order. Monthly Notices of the Royal Astronomical Society, 399, 1088.
[12]	Vrbik, J. (1995) Perturbed Kepler Problem in Quaternionic Form. Journal of Physics A: Mathematical and General, 28, 6245-6252.
[13]	Vrbik, J. (2001) Quaternionic Processor. Celestial Mechanics and Dynamical Astronomy, 80, 111-118.
[14]	Vrbik, J. (1996) Resonance Formation of Kirkwood Gaps and Asteroid Clusters. Journal of Physics A: Mathematical and General, 29, 3311-3316.
