David's Blog

Living a quiet life in Coquitlam, B.C.

Name:
Location: Coquitlam, British Columbia, Canada

Friday, March 08, 2019

Multi-dimensional Root-finder - Two Nonlinear Equations in Two Unknowns


The multi-dimensional root-finder to which the title of this post links seeks a root of a system of two non-linear equations in two unknowns:

f1(x, y) = 0
f2(x, y ) = 0


Specifically,


where the aij and ci are constant coefficients.

The interface for the program offers fields for the input of up to sixteen coefficients (eight for each function), although it is unlikely that non-zero values for all coefficients will be needed for a given problem. Sixteen fields are included to make the program applicable to a wider range of problems than if, say, only three fields for each were provided. In other words, instead of being restricted to a function definition such as
f1 = a00 x + a01 xy + a02 x², users are given the capability to form many more definitions of f1. The same goes for f2. In fact, the system presented is just one example of the type of problem that can be solved by the underlying computer routine. The underlying routine can be used to seek the roots of many different systems of N Non-linear Equations in N Unknowns.

The routine underlying the program is a translation of the Fortran routine SNSQE, written by K. L. Hiebert (SNLA), which is posted on the SLATEC site as SNSQE. SNSQE was translated into C++ and then into JavaScript for execution on a web page. It should be noted that this version is not a verbatim transliteration of Fortran into C++. Perhaps the most significant difference between the two programs is the definition of the Jacobian Matrix: the Fortran program uses a Jacobian Matrix whose elements are defined as Jij = ∂fi / ∂xj whereas this C++ version defines the elements as Jij = ∂fj / ∂xi (i.e., the transpose. More on that below.)

In addition, several loops were removed, operations taken out of loops, and several variables eliminated (e.g., SING, NJEV, ML, MU, MODE, FACTOR, NPRINT, DIAG.) SING is a Boolean variable that is set to TRUE if a particular array element is zero--but it is never used. It neither influences program flow nor is it passed out of the routine as a parameter. So it was removed from this C++ version of the program and all references to it deleted. NJEV keeps track of how many times the Jacobian matrix is computed, but it is not used so it, too, was eliminated. And SNSQE passes many variables into sub-routines with the same--unchanging--values, so these values were hard-coded into the program and the variables, themselves, deleted. Overall, I tried to tighten up the code.

My C++ translation of SNSQE is posted on GitHub, incorporated into a main program for a similar system.

Solving Systems of Non-linear Equations

A few previous blog posts have examined root-finding for single non-linear equations in one variable, for example, Kepler's Equation of Elliptical Motion, or solving the equations cos(x) = x or tan(x) = x, to name a few. The equations are rearranged such that all terms are on one side of the equation and equated to 0 (i.e., tan(x) - x = 0.) The task then is to find a zero, or root, of the equation. Note that in some cases, like
tan(x) - x = 0, multiple solutions exist. Because the function tan(x) is periodic, its plot intersects the line f(x) = x an infinite number of times. So, if a particular point of intersection is sought, a specific interval of examination needs to be input to the computer program--or it may converge to the wrong zero.

The same concepts that are employed to solve non-linear equations in one dimension are also applicable to solving systems of non-linear equations. And the same caveats apply:

      (i) the program may not find a solution, or

      (ii) if the system has multiple solutions, the program might converge to one that is not being sought. Generally, if a user knows approximately where the desired root exists, a point close by is supplied to the computer program as the starting point with the intent that the program will converge to the desired root.

The process of solving multi-dimensional systems is rather complicated and involves many steps; the entire process will not be reviewed in this post. However, I would like to briefly elaborate on the computation of the Jacobian Matrix. Algorithms for solving non-linear equations in one dimension often use the derivative of the function. To solve multi-dimensional non-linear systems, the Jacobian Matrix, which is made up of the partial derivatives of the functions with respect to (w.r.t.) each variable, plays an analogous role. In the akiti.ca program, the Jacobian matrix is computed as Jij = ∂fj / ∂xi:


This definition of the Jacobian Matrix is the transpose of most other definitions, which define it as Jij = ∂fi / ∂xj. However, C++ and JavaScript use row-major memory allocation, so defining the Jacobian in this way allowed certain blocks of code in the program to be simplified.

Also note that, for human readers, systems are often described in terms of x, y, z, w, etc. However, when converting the systems to matrix formats, especially for use within computer programs, the independent variables are instead referred to as x1, x2, x3, x4, etc. and saved in an array, [x]T = [x1, x2, x3, x4, ...].

In general, there are two approaches for computing the Jacobian Matrix:
(i) differentiating the functions to yield analytic expressions, and
(ii) computing discrete partial derivatives.

Option 1 - Differentiating the Functions

Consider an example. Say a system is described by the following equations:

0.3 * exp(0.2 * x) + 0.5 * x * sin(2.0 * y) + 0.75 * cos (z) = 0.739
0.333 * x2 * z - cos(0.95 * y) - 1.5 * exp(-0.25 * z) = 1.5
tanh(x) - 0.5 * sin(y) * z + exp(-0.5 * z) = 4.5

This system would be rearranged so that each equation is equal to 0:

f1 = 0.3 * exp(0.2 * x) + 0.5 * x * sin(2.0 * y) + 0.75 * cos (z) - 0.739 = 0
f2 = 0.333 * x2 * z - cos(0.95 * y) - 1.5 * exp(-0.25 * z) - 1.5 = 0
f3 = tanh(x) - 0.5 * sin(y) * z + exp(-0.5 * z) - 4.5 = 0

The system now represents a problem of finding a root of three non-linear equations in three unknowns.

The Jacobian Matrix for this system is formed by differentiating each function w.r.t. each variable:

J11 = ∂f1 / ∂x1 = ∂f1 / ∂x = 0.3 * 0.2 * exp(0.2 * x) + 0.5 * sin(2.0 * y)

J12 = ∂f2 / ∂x1 = ∂f2 / ∂x = 2 * 0.333 * x * z

J13 = ∂f3 / ∂x1 = ∂f3 / ∂x = sech2(x)

-----------------

J21 = ∂f1 / ∂x2 = ∂f1 / ∂y = 2.0 * 0.5 * x * cos(2.0 * y)

J22 = ∂f2 / ∂x2 = ∂f2 / ∂y = 0.95 * sin(0.95 * y)

J23 = ∂f3 / ∂x2 = ∂f3 / ∂y = -0.5 * cos(y) * z

-----------------
J31 = ∂f1 / ∂x3 = ∂f1 / ∂z = -0.75 * sin(z)

J32 = ∂f2 / ∂x3 = ∂f2 / ∂z = 0.333 * x2 + 1.5 * 0.25 * exp(-0.25 * z)

J33 = ∂f3 / ∂x3 = ∂f3 / ∂z = -0.5 * sin(y) - 0.5 * exp(-0.5 * z)

In this example, explicit expressions for the partial derivatives of each function w.r.t. each independent variable are easily found. Writing a small sub-routine that computes each Jij at a given (x, y, z) point to form the complete Jacobian Matrix would be simple.

However, some systems may be comprised of functions whose partial derivatives are not found easily, if at all. Or the partial derivatives may be so expensive, computationally, to calculate that it is easier to compute discrete partial derivatives instead.

Option 2 - Computing Discrete Partial Derivatives

Continuing with the present example, say we have values for the present xi points and the corresponding values of the functions, fi:

[x0]T = [x0, y0, z0]
[F0]T = [f10, f20, f30]

To compute discrete partial derivatives, we basically ask the question, how does a small change of each independent variable xi affect each function fi? In this case, we will compute the discrete partial derivatives by Forward Difference Approximation. (There are other methods for computing discrete derivatives.)

To compute the first row of the Jacobian Matrix, the x independent variable is incremented a small amount Δx, the new values of the functions are calculated, and then the slopes between the new points and the original points are computed.
(Δx is usually scaled according to x0:   Δx = abs(x0) * √DBL_EPSILON.)

xnew = x0 + Δx

f1new = f1(xnew, y0, z0)

J11 = (f1new - f10) / Δx

This process is repeated for f2 and f3:

f2new = f2(xnew, y0, z0)

J12 = (f2new - f20) / Δx

f3new = f3(xnew, y0, z0)

J13 = (f3new - f30) / Δx

At the point [x0, y0, z0], we now have an approximate rate of change w.r.t. x for each function. The process is then repeated for y and z, to yield the complete Jacobian Matrix.

Similar to Option 1, the block of code controlling this process can be written into a sub-routine that loops through each variable xi, makes the appropriate increment Δxi, calls the sub-routine that evaluates each fi, computes the slopes, and returns the full Jacobian. Note that this technique requires the computation of the system N times (assuming a System of N Non-linear Equations in N Unknowns) each time the Jacobian Matrix is computed. In general, Option 1 is more accurate and executes faster. However, Option 2 offers the convenience of being more general (i.e., if you don't know what the function is ahead of time, or the program uses one of several functions.)

Labels: , , , , , ,

Sunday, September 17, 2017

Cholesky Decomposition Program


A Cholesky Factorization of a real, symmetric, positive-definite, matrix A is the decomposition of the matrix into either
(i) a lower triangular matrix, L, such that A = L * LT, or
(ii) an upper triangular matrix, U, such that A = UT * U.

(Note that the terms matrix factorization and matrix decomposition are interchangeable.)

Computing the Lower Triangular Matrix, L

Consider a symmetric 3 x 3 matrix, A, on which we would like to perform a Cholesky decomposition, finding the lower triangular matrix. i.e.,


Expanding the right hand side (RHS) of this equation, it becomes


Working through several of the components, a couple patterns can be detected.

l11 = √ a11

l-components in the same column and below l11 are found by scaling the a-components by l11:

l21 = a21 / l11
l31 = a31 / l11

Moving down to the diagonal in the second row,

l22 = √ (a22 - l212)

The component below l22 is

l32 = (a32 - l31 l21) / l22

Finally, moving down to the third row:

l33 = √ (a33 - (l322 + l312))

From this point, a rough algorithm can be developed.
Given an N x N symmetric matrix, A, work through A from the first row (k = 1) to the last row (k = N), first computing the diagonal of that row, and then elements of the column under it.

1) Compute the diagonal element of the k-th row according to the following formula:


2) Compute elements below the diagonal for the k-th column according to the following formula (where i > k):


Computing the Upper Triangular Matrix, U

Repeating the process to find the upper triangular matrix:


Expanding the RHS of this equation, it becomes


Again,

u11 = √ a11

u-components in the same row and to the right of u11 are found by scaling the a-components by u11:

u12 = a12 / u11
u13 = a13 / u11

Moving down to the diagonal in the second row,

u22 = √ (a22 - u122)

The component to the right of u22 is

u23 = (a23 - u12 u13) / u22

Finally, moving down to the third row

u33 = √ (a33 - (u132 + u232))

Again, from this point, a rough algorithm can be developed (but with different summations).
Given an N x N symmetric matrix, A, work through A from the first row (k = 1) to the last row (k = N), first computing the diagonal of that row, and then elements of the row to the right of it.

1) Compute the diagonal element of the k-th row according to the following formula:


2) Compute elements to the right of the diagonal for the k-th row according to the following formula (where i > k):


Whichever approach is taken for the decomposition, computing the L matrix or the U matrix, the numbers are the same; the transpose of L is the same as U, and vice versa.

Significance of the Cholesky Decomposition

Among its several applications, a Cholesky Factorization can be used to determine if a symmetric matrix is positive-definite. (The positive-definiteness of a matrix could be checked by computing its eigenvalues: if they are all greater than zero, the matrix is positive-definite. However, computing the eigenvalues of a matrix is computationally a much more expensive process than a Cholesky decomposition.) Given a symmetric matrix, A, we initially assume A is positive-definite and pass it into a Cholesky Decomposition routine. If the routine fails, the initial assumption is proven wrong. On the other hand, if the decomposition succeeds, the positive-definiteness of A is confirmed.

Knowing whether a matrix is positive-definite or not is an important bit of information to have when working in the field of Quadratic Programming (minimizing a specific type of function, F.) When F is stated as a matrix-vector equation, the matrix involved is a symmetric matrix, called the Hessian matrix. If the Hessian matrix is not only symmetric, but also positive-definite, F has a global minimum. If the Hessian matrix is not positive-definite, F does not have a global minimum. Because Quadratic Programming problems can be extremely difficult to solve, knowledge about the existence of a global minimum-- before starting the task of solving for it-- would indicate if efforts would be better spent directed toward a different goal. In other words, if a function is known to not have a global minimum, but does have local minima, it might be desirable to direct the search program to find a local minimum. Parameters and inputs to the Quadratic Programming routine could be modified so that the algorithm is not seeking a global minimum that does not exist.

The Cholesky Decomposition program to which the title of this post hot-links computes U. Note that entry of the full symmetric matrix, A, is expected, even though only the upper triangular half of it is used by the decomposition routine; the strictly lower triangular half of A is not referenced. The symmetry of A is confirmed by the main program before being passed into the decomposition routine. This JavaScript program is a translation of the LAPACK routine DPOTRF.F, which is posted on the NETLIB website. DPOTRF.F includes a multitude of options to customize the program: whether the code is to be blocked or unblocked, whether to compute the upper triangular U or lower triangular L matrix, whether to use loop unrolling, and whether loop increments are the same, to name a few. This JavaScript translation offers none of these options. It computes U. That's it.

A C++ translation of this program is available on GitHub.

Labels: , , , , ,

Monday, February 20, 2017

Multi-dimensional Optimization Routine


The multi-dimensional optimization routine to which the title of this post links seeks a minimum of a multivariate polynomial of four independent variables with no constraints.

F = F(w, x, y, z)


The interface for the routine offers fields for the input of up to twenty coefficients, although it is unlikely that non-zero values for all coefficients will be needed for a given problem. Twenty fields are included to make the program applicable to a wider range of problems than if, say, only five fields were provided. In other words, instead of being restricted to a function definition such as F = A w² + B x² + C y² + D z² + E wxyz, users are given the capability to form many more definitions of F. In fact, the function presented is just one example of the type of problem that can be solved by the underlying computer sub-routine. The underlying routine can be used to minimize a function with multiple independent variables defined by polynomials, transcendental operators, a combination of the two, or defined in yet some other way. The routine can be applied to any problem that fits the program's purpose: to seek a minimum of a multi-dimensional function with no constraints.

The routine underlying the program is a translation of the FORTRAN sub-routine UNCMND, written by Stephen Nash. UNCMND was translated into C++ and then into JavaScript for execution on a web page. Many test cases were run, with different numbers of independent variables, and a variety of functions, and the program returned correct results. My C++ translation of UNCMND is posted on GitHub, incorporated into a main program for which the Objective Function is the General Rosenbrock Function for N = 10. The General Rosenbrock Function is a well-known function that is often used to test optimization routines. It is generally defined as


The program correctly finds the optimal solution x* = [1, 1, 1, ... , 1]T, at which point the function value is F(x*) = 1.

Some Optimization Background

In the field of mathematics and computer programming, optimization refers to the minimization or maximization of a function, F, often called the Objective Function.

In practice, most computer algorithms designed for this task seek a minimum of a function; however, these programs are equally effective for seeking a maximum of a function because maximizing F is the same as minimizing -F.

In the case of one independent variable (i.e., one dimension), many techniques have been developed. In the simplest cases, techniques of calculus may be applied: the function may be differentiated with respect to the independent variable, the derivative set to zero, and the resulting equation solved for the solution.

Consider an example. Say f = -2 x ² + 3 x + 4
A portion of this function is shown below:



Differentiating this function: df/dx = -4 x + 3
Setting the derivative equal to 0:   4x = 3
Solving yields the result: x = 0.75

So, this function takes its maximum value at x = 0.75.
At this point, f = f(0.75) = 5.125
We have now determined where this function takes its maximum value, and its value at this point.

Most optimization problems cannot be solved so easily. An analytic derivative of the function may not exist. Or, if it does, it may not be possible to analytically solve it for a zero. So, many sophisticated techniques have been developed to numerically find the minimum of a function using iterative methods.

When a function is defined by more than one independent variable (i.e., multi-dimensional), the problem becomes more complicated. Some of the basic concepts from the one dimensional case can be transferred over to multi-dimensional cases, but other factors come into play.

Much information is available on the topic of optimization, so won't be repeated here, but I will mention two of my favorite first visits when seeking information on mathematics topics: Wikipedia and MathWorld. Instead, I will highlight some differences between the original FORTRAN version of UNCMND and my versions.

Changes:

In addition to translating UNCMND source code from FORTRAN to C++ and JavaScript, several changes were made to the structure of the program itself.

1) The options for providing an analytic Gradient Vector and Hessian Matrix have been removed.

Say the Objective Function is one of three dimensions, e.g.,

f = f(x1, x2, x3) = x²1 + 2 x1x2 - 0.5 x13 - 1.6 x²2x3

In this case, partial derivatives of this function could be found analytically by the standard definition of the derivative of f:


The Gradient Vector would be

Gradient Vector = [∂f/x1, ∂f/x2, ∂f/x3,]T

with each gradient component found as follows:

∂f/x1 = 2x1 + 2 x2 - 0.5 x²3
∂f/x2 = 2 x1 - 2 * 1.6 x2x3
∂f/x3 = -2 * 0.5 x1 x3 - 1.6 x²2

However, I decided that the underlying routine will rarely be applied to functions with analytic derivatives available--and at low computational cost. In general, the Objective Function is not known ahead of time; it may or may not have an analytic derivative. If it does, a new sub-routine for that particular function would have to be supplied by the user: written and debugged. To keep the routine as generic as possible, the option of computing the Gradient Vector by analytic derivatives was removed; instead, it is computed by finite difference only.

Likewise, the Hessian Matrix is defined by


Again, instead of requiring users to input another sub-routine (if an analytic definition were to exist), the Hessian Matrix is computed by secant updates.

Note that the JavaScript program uses different notation for the independent variables: w, x, y, z instead of x1, x2, x3, x4. This change was made to further clarify terms.

2) In the original FORTRAN code, one of the possible values the error code INFO can return with is 0: "INFO = 0: Optimal solution found." However, I saw no way for the algorithm to actually return with INFO having a zero value, so I removed mention of the INFO = 0 option.

3) A few Boolean variables were added to the program to avoid some computations and speed up execution. For example, one Boolean variable keeps track of whether program execution is in the first loop or not. Since the Hessian Matrix is computed by secant updates, and initialized to the Identity Matrix, the first time through the main loop several vectors can be assigned explicitly; computations through a Forward Solve and Backward Solve can be skipped.

4) Some computations were taken out of loops. New constants were declared and passed into routines to avoid computing the same quantities repeatedly.

Labels: , , , , , ,

Monday, April 27, 2015

Discrete Fourier Transform

The Discrete Fourier Transform (DFT) transforms a data sequence--representing motion that presumably includes periodic components,  and sampled at uniform intervals--into a sequence of cyclical functions. Specifically, given a sequence of samples, g0, g1, g2, g3, . . . gN - 1, taken at uniform time intervals, t0, t1, t2, t3, . . . tN - 1, the DFT computes coefficients a0, a1, a2, a3, . . . a[N/2] and b1, b2, b3, . . . b[N/2] that best satisfy the following equation:


for j = 0, 1, 2, 3, ... N - 1

where the upper limit [N/2] refers to the largest integer in N:
if N is even, [N/2] refers to N/2;
if N is odd, [N/2] refers to (N - 1)/2.

Note that the a-coefficients include an a0 term; the b-coefficients do not. The a0 term is simply the average value of the data points, and is called the DC component:

The balance of the coefficients are computed by taking one of two paths, depending upon whether N is odd or even:

i) N is odd. The next two equations apply for 1 ≤ k ≤ [N/2]:



ii) N is even. The preceding two equations apply for 1 ≤ k < N/2. The coefficients for k = N/2 are not included.
Instead,


and
bN/2 = 0

When N is even, the value for aN/2 is computed via a simpler expression and the computation for bN/2 is eliminated altogether, being replaced by a simple assignment.

The Sample Data

The sample data used on the akiti.ca web page comes from elliptical motion: eccentricity, e = 0.7861513777574233 and T = 1. Elliptical motion is a fundamental motion of nature, it is periodic, and it is sine-like, so it should serve as an informative example. The motion is divided into a specified number of intervals, and the sample data represents values of sin(E) at the interval endpoints.

Note from the example that the a0 term is zero, as expected--elliptical motion does not have a DC component.

Consider another data sequence. Again, sin(E) values over one complete period are used, but this time the motion is divided into twenty intervals, giving twenty-one data points (i.e., N = 21.)

The sin(E) values:
0
0.817269695391509
0.986037450748215
0.988903757460498
0.918078911674818
0.805917329559989
0.667985913253395
0.513259356200768
0.34762268970459
0.175375726392183
0
-0.175375726392183
-0.347622689704591
-0.513259356200768
-0.667985913253395
-0.805917329559989
-0.918078911674817
-0.988903757460498
-0.986037450748215
-0.817269695391509
0

Some notes:
i) The two endpoints are 0, as they should be:
sin(E) = 0 at t = 0 and t = T.
ii) sin(E) = 0 at the mid-point (t = T/2), as it should be.
iii) t5 is T/4 (T/20 * 5 = T/4), so the sixth value in this data set represents sin(ET/4).

Performing a DFT on this sequence yields the following results:

The ak values follow:
a0 = 0
0.137715757665909
0.0774652995660697
0.0390981194174445
0.0103228847004814
-0.0123006115818896
-0.0301489143722025
-0.0439017247009599
-0.053947159831484
-0.0605247240627844
-0.063778926800584

The bk values follow:
0.913684352303341
0.251136333078314
0.081188081536064
0.0151408910090107
-0.0132569064357189
-0.0240429569155258
-0.0253466725739878
-0.0211727000178036
-0.0138143733065952
-0.00477956981523944

This time, since N is odd, b[N/2] is non-zero.

Complex Variable Form

The results of Fourier Transforms are often stated in complex variable form: z = u + i v, where i indicates the imaginary component. This format can be derived if the original transform definition is expressed in complex format:


for j = 0, 1, 2, 3, ... N - 1.

The Gk are computed as


for k = 0, 1, 2, 3, ... N - 1.

The G, a, and b values are related as follows:

G0 = a0.

Gk = (ak - i bk)/ 2   for 1 ≤ k ≤ [N/2] except if N is even, in which case,
GN/2 = aN/2.

For [N/2] < k < N, the Gk are complex conjugates of the first half.

Using data from the example above:

G0 = 0
0.0688578788329544 - 0.45684217615167i
0.0387326497830349 - 0.125568166539157i
0.0195490597087223 - 0.040594040768032i
0.00516144235024068 - 0.00757044550450537i
-0.00615030579094482 + 0.00662845321785944i
-0.0150744571861012 + 0.0120214784577629i
-0.0219508623504799 + 0.0126733362869939i
-0.026973579915742 + 0.0105863500089018i
-0.0302623620313922 + 0.00690718665329759i
-0.031889463400292 + 0.00238978490761972i
-0.031889463400292 - 0.00238978490761972i
-0.0302623620313922 - 0.00690718665329759i
-0.026973579915742 - 0.0105863500089018i
-0.0219508623504799 - 0.0126733362869939i
-0.0150744571861012 - 0.0120214784577629i
-0.00615030579094482 - 0.00662845321785944i
0.00516144235024068 + 0.00757044550450537i
0.0195490597087223 + 0.040594040768032i
0.0387326497830349 + 0.125568166539157i
0.0688578788329544 + 0.45684217615167i


Comparing Results Between Various Programs

Different programs scale the Fourier coefficients differently, so results will differ. For example, Mathcad and Mathematica use a definition of the transform such that the coefficients are computed as follows:


for k = 0, 1, 2, 3, ... N - 1.

Note that the ck coefficients are scaled by a factor of 1/√ N instead of 1/N, so the ak and bk values must be multiplied by a factor of √ N to make the numerical values of the coefficients the same. Furthermore, the definition of the ck coefficients does not include a negative factor in the exponential; multiplying the bk coefficients by -1 corroborates the ck values output by Mathcad for 0 ≤ [N/2] (and the balance are the complex conjugates):

0
0.3155464419461 + 2.0935138528634i
0.177495299497 + 0.5754256280425i
0.0895850458804 + 0.1860252645836i
0.0236527002651 + 0.0346921395689i
-0.0281842418341 - 0.0303753886113i
-0.0690798411157 - 0.055089334998i
-0.1005914882906 - 0.0580765228428i
-0.1236084717278 - 0.0485127502491i
-0.138679564717 - 0.0316527056779i
-0.1461358799034 - 0.0109513702338i
-0.1461358799034 + 0.0109513702338i
-0.138679564717 + 0.0316527056779i
-0.1236084717278 + 0.0485127502491i
-0.1005914882906 + 0.0580765228428i
-0.0690798411157 + 0.055089334998i
-0.0281842418341 + 0.0303753886113i
0.0236527002651 - 0.0346921395689i
0.0895850458804 - 0.1860252645836i
0.177495299497 - 0.5754256280425i
0.3155464419461 - 2.0935138528634i


In fact, the bk coefficients do not have to be multiplied by -1 to yield a valid result; either the first set of coefficients or their complex conjugates are valid results, so the Gk values for 0 < k < [N/2] can just as well be compared to the ck values for [N/2] < k < N in the first place (and eliminate the need to multiply by -1.)

In this example, the two definitions of the transform are known; the differences in scaling can be taken into account, and the results are seen to verify each other. However, the definition of the transform used internally by some programs may not be known, so it may not be straightforward to corroborate results among different programs. In such cases, the coefficient values must be plugged back into the original definition of the transform to confirm that the original data points are returned. In fact, this is the only way to truly put your mind at ease that a program works. (Always a good idea in any case.)
 

Labels: , , , , , , ,

Saturday, April 07, 2012

Kepler’s Equation of Elliptical Motion Solver

Kepler’s Equation of Elliptical Motion is a famous example of an equation for which an exact, analytical, solution is not possible. In its most succinct form, it is stated as follows:

E - e sin(E) = M

where e is the eccentricity of the ellipse described by the motion
(0 < e < 1);
E is the eccentric anomaly; and
M is the mean anomaly, 2 π /T (t is the time since periapse, and T is the period of the motion).
Note that there is no way to express E explicitly (i.e. – E = a simple expression). Solutions are found either (i) by making approximations, or (ii) computing numerical solutions to a desired degree of accuracy. This post considers some aspects of the second option: computing numerical solutions.

The task of computing numerical solutions to Kepler’s Equation of Elliptical Motion (KE) falls into the general category of root-finding. The equation is arranged as follows

E - e sin(E) – M = 0

which is of the general form f(x) = 0.
It is now in the form applicable to the general task of root-finding (i.e. – finding values of x that make the function f(x) equal to 0, called the roots).

The task of root-finding is a well-developed field of mathematics and computer science. As far as I know, all root-finding algorithms take an iterative approach to computing a solution to a desired degree of accuracy: first, an initial guess for the root is made and checked, then a closer solution is approximated and checked, and this process is repeated until the desired accuracy is obtained.

A number of numerical techniques exist for computing solutions to KE. Some are rather simple approaches, such as the bisection method; others are more sophisticated. In fact, the computing power of today’s computers provides the capability for even less-than-optimum techniques to compute solutions in the blink of an eye. However, motivated by a desire to be more pedantic, a few aspects of motion defined by KE can be highlighted, and their influence on the approaches taken to KE’s solution can be observed. First, consider the two figures below, which plot the motion of two variables defined by KE over one period, T. Figure 1 is a graph of E and M versus t; E is the solid red line and M is the dashed blue line. Figure 2 is a graph of sin(E) and sin(M) versus t; sin(E) is the solid red line and sin(M) is the dashed blue line.

 

 
Three factors come into play for optimizing an approach for solving KE:
1) the size of the interval;
2) the algorithm;
3) the initial guess for a root.

The Size of the Interval

The interval in which a root exists is usually input to a root-finding algorithm. If the interval over which a solution will be sought is small, the computer program has less work to do. The smaller the interval is to begin with, the faster it will be narrowed to an acceptable result.

Rearranging KE, we get
E = M + e sin(E)

Because sin(E) alternates between 1 and -1, the magnitude of the second term on the right side of the equation is restricted to e. In other words, E will never be larger than
E = M + e
and never be smaller than
E = M - e

In fact, in most real-world situations, e is quite small; the eccentricities of the orbits of most of the planets around the Sun are much smaller than 1. For example, the eccentricity of the orbit of Mars, which has a comparatively large eccentricity, is about 0.093. In other words, to compute a point in the orbit of Mars using Kepler’s Equation, the first guess would already be accurate to one decimal point because the interval itself is already less than 0.1. Not much more can be done to reduce the size of the interval for the purpose of hastening a solution of KE.

The Algorithm

Picking an appropriate algorithm is another factor that determines how fast a solution to KE is computed. As can be seen from the graphs above, the motion is smooth and there are no irregularities; first--and higher-order--derivatives exist and are continuous. This characteristic makes algorithms that use these derivatives applicable to the task of solving
KE--qualifying it for a wide range of fast-converging root-finding algorithms. The Newton-Raphson method, which uses the function and its first derivative, is often used for this purpose. However, care must be taken to ensure the algorithm does not diverge away from the solution; otherwise, it converges to a solution at a quadratic rate. Furthermore, other algorithms offer quick convergence, for example, Halley's Method converges at a cubic rate and yet other algorithms converge at a quartic rate. These faster-converging algorithms offer these capabilities by including higher-order terms of the Taylor series when forming the approximation for successive iterations.

(Note that the utility to which the title of this post links uses a combination of the bisection and secant method instead of one of these more sophisticated methods. When the utility was first posted, I planned to write other utilities and wanted an algorithm general enough to use for other root-finding problems; not one that was particularly well-suited to a specific problem.)

The Initial Guess for a Solution

The initial estimate for a root could be anywhere within the interval; however, the closer the first guess is to the actual root, the fewer iterations the algorithm has to make to find the actual root (i.e. - the quicker the program will execute).
Consider the first half of the period:     0 < t < T/2.
For this portion of the motion, sin(E) > 0 and M <= E <= M + e. So the initial estimate for E, Eo, should be in this interval.

One often-recommended initial value for Eo is the following:


Another approach for picking Eo would be to compute some points of E and save them in a table. An interpolating function could then be created and used to provide Eo values for future computations of E. This approach would be especially applicable if E values for a particular ellipse, or over a certain range only, were to be computed.

Other schemes for picking for Eo exist, but many of them get more complicated than I would like to examine in this post.

To summarize, consideration of the characteristics of motion defined by KE yields approaches for its solution that are much more efficient than a simple approach that, on the contrary, takes no account of KE’s characteristics. By (i) minimizing the initial size of the interval over which a solution is to be sought, (ii) applying an appropriate algorithm, and (iii) making an excellent initial estimation for a root, solutions can be computed extremely fast. A variety of methods--the simple Bisection Method, the Newton-Raphson Method, Halley’s Method, and even faster converging iterative approaches--are available to suit a user’s needs, or simply to satisfy a desire to be more pedantic. In conclusion, although an exact solution of Kepler’s Equation does not yet exist, many powerful tools and technique are available that allow numerical solutions to be computed quickly and accurately.

This post presented a brief overview of existing information on the topic of solving KE. For the sake of intellectual curiosity, I do enjoy exploring KE; any work that could be considered research-ish (i.e. – any work done toward the goal of discovering new information) will be posted on my other blog: RocketScientists.ca.

Edit: 20 October 2012
My WordPress blog on rocketscientists.ca was hacked so I deleted the entire site, had my host reset the account, and started over. The main site is back up, but I do not know when--or if--the blog side of the site will ever be re-posted, so I will copy some of those posts over to this blog. And the hotlink given in the paragraph above has been edited to point to the home page.
 

Labels: , , , ,

Sunday, July 18, 2010

Eigenvalue and Eigenvector Calculator

The web page to which the title of this post is hot linked is the most recent utility I have added for computing the eigenvalues and eigenvectors of a real general matrix. It accepts square matrices of up to size twelve by twelve (this limit was arbitrarily picked by me, because that was the largest size matrices with which I worked during my undergraduate engineering studies).

As cited on the page itself, this eigenvalue and eigenvector solver is a translation of routines that come from the EISPACK collection of routines, specifically, balanc, hqr2, elmhes, eltran, and balbak.

Note that the results of the computation may be complex, but the inputs are all assumed to be real entries.

Also note that this utility can solve symmetric matrices, but using a program specifically written for symmetric matrices would be more efficient.

Labels: , , , , , , ,

Sunday, June 06, 2010

Online Matrix Calculator

The web page to which the title of this post is hot-linked is the most recent utility I have added for solving simultaneous equations. It features a few changes and additions compared to the simultaneous equation solvers already posted on the site.

First of all, as mentioned in a previous post, the method of accepting input has been changed; instead of accepting input from an individual text box for each entry, data is now input all at once via a textarea box. Similarly, instead of outputting results to individual text boxes, results are output to the body of the web page. These changes are meant to make it more convenient to copy data to and from the utility, thereby reducing the possibility of introducing errors into the input and output. Note that the output can look confusing for larger matrices; lines wrap to stay within the body of the web page, breaking up rows of the matrix. For example, in the case of a 12 x 12 matrix, on my computer each row of the matrix is output on three lines. However, if the entire output section is copied to a plaintext document, the line wraps go back to where they belong (i.e. - the matrix displays as expected: twelve lines of twelve entries each).

The program, itself, offers some additional capabilities compared to the other simultaneous equation solvers on this site: in addition to being able to solve larger systems (up to twelve equations in twelve unknowns), this utility computes the determinant, the LU-Decomposition, and the inverse of the [A] matrix. In fact, much of this additional capability is included in the existing programs, but not used. The algorithm underlying the program for solving N Equations in N Unknowns performs an LU-Decomposition as one of the steps toward solving the system, but the results of the LU-Decomposition are not output by the other Javascript pages. Furthermore, once an LU-Decomposition has been performed on a matrix, computing the determinant and inverse of that matrix is quite simple. These matrix properties, too, are not output by the other Javascript pages presently on site. This latest utility takes advantage of the fact that, since most of the work is already done (the LU-Decomposition), computing and outputting a few additional properties of the [A] matrix is quick and relatively easy. So, this Javascript utility does compute and output those properties:
1) the [P], [L], and [U] matrices are output, where [P] is the permutation matrix (which keeps track of the permutations that were involved in the decomposition), [L] is the Lower Matrix, and [U] is the Upper Matrix;
2) the determinant of the [A] matrix is computed and output; and
3) the inverse of the [A] matrix is computed and output, [A]-1.

Note that this utility may be used to compute these properties of the [A] matrix without solving a valid system of simultaneous equations. For example, say a user only wants to compute the matrix determinant. In this case, the user would enter a dummy b vector. I use (1 1 1 . . . )T, but it doesn’t matter what the b vector is, as long as it has the appropriate number of entries. The utility would then be executed and the results for the x vector ignored. The properties of [A] are not effected by b, so entering a fake b vector doesn’t change the determinant, LU-Decomposition, or inverse of [A].

Labels: , , , , , , , , , , ,

Sunday, July 19, 2009

Free Math Tools for Science and Engineering Students

On the page in the title of this post are presented links to several numerical mathematics tools, programs written in Javascript for performing specific numerical tasks. I have been working on these math programs for several years but never explained why I do it. So, here goes . . .

When I was an engineering undergraduate, commercial mathematics software was not provided by my school. When the need arose for computer programs that solved math problems, we either searched for existing code or wrote the programs ourselves. At the time, I found good-quality source code written in FORTRAN but had a “C” compiler. So I translated the programs from FORTRAN into “C”. Then, instead of letting the programs sit on a floppy disk in the closet doing nothing, I translated them from “C” into Javascript. (In fact, I have received several requests for “C” source code, so that is slowly getting posted too.)

Posting the routines online as Javascript programs transforms them from passive documents into dynamic, immediate-for-use, tools accessible to anybody with an Internet connection. A few other aspects of Javascript are worth mentioning:

1) Javascript syntax is very similar to that of “C”, so the translation from “C” to Javascript is pretty simple.
2) Javascript programs incorporated in a web page can be run in any browser, regardless of whether the page is being viewed on a computer running Windows, Linux, Mac, etc.
3) A separate executable does not have to be downloaded nor does a plug-in have to be installed.
4) The pages can be saved and the programs can be used in a browser offline.

Over the years, the number of routines posted on the site has grown significantly: there are now utilities for computing the roots of polynomial equations, solving simultaneous equations, computing eigenvalues and eigenvectors, linear least-square data-fitting, cubic spline interpolation, minimization/maximization, numerical integration, and more. In addition, more are planned.

Finally, one short note before I conclude this post. The interface for many of the utilities is getting changed slightly. Instead of requiring all the inputs to be typed in individually, which can get quite tedious and increases the chances of entering typos, more use is being made of textarea boxes, in which all input data can be entered at once. This approach gives users the capability to copy data from a plain-text document and then pasted into the form--a much quicker process and one that eliminates the potential for typos in the input data. In other words, regardless of where the data may come from, as long as it is first copied into a plain-text document and arranged in the appropriate format, it can then quickly and easily be copied into the data input box of the utility.

Labels: , , , , , , , , , , , , , , , ,