Conditioning#
Previously, we discussed how roundoff and truncation error effectively introduce uncertainty into numbers stored on a computer. Herein, we discuss the impact of this uncertainty.
The conditioning of a general function,
\(f(\vec{x}) = \vec{b}\)
describes how uncertainty in the function output, \(\vec{b}\), relates to uncertainty in the inputs, \(\vec{x}\), in the worst case. It is a key aspect of uncertainty propagation, and is a property of the function \(f\) itself, not of a particular input/output pair (although for nonlinear functions the behaviour may be localized to certain ranges of \(\vec{x}\)).
The conditioning of a function describes the distortion between the input uncertainty and the output uncertainty. A common misconception is that it is an amplification, but this would imply that the inverse function reduces uncertainty, which is not the case. Rather, the conditioning of the forward function and of its inverse (if the inverse exists) is quantifiably the same.
Well-conditioned functions have output uncertainty that varies roughly in proportion to the input uncertainty. Ill-conditioned functions are those for which small changes in the inputs \(\vec{x}\) can produce large changes in the outputs \(\vec{b}\), and vice versa.
Numerical methods carry inherent uncertainty in every number due to finite-precision representation. Ill conditioning can therefore severely degrade the accuracy of a computed solution \(\vec{x}\).
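To make this concrete, here is a minimal sketch (assuming NumPy is available; the near-singular matrix below is purely illustrative) of how a tiny perturbation in \(\vec{b}\) can produce a large change in the computed \(\vec{x}\):

```python
import numpy as np

# An illustrative nearly singular (ill-conditioned) matrix: its rows are almost parallel.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

b = np.array([2.0, 2.0001])
x = np.linalg.solve(A, b)          # exact solution is [1, 1]

# Perturb b by a tiny amount and solve again.
db = np.array([0.0, 1e-4])
x_pert = np.linalg.solve(A, b + db)

print(x)        # [1. 1.]
print(x_pert)   # roughly [0., 2.] -- a huge relative change in x
```

The relative change in \(\vec{b}\) is on the order of \(10^{-5}\), yet the relative change in \(\vec{x}\) is of order one.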
Condition number#
The condition number is the maximum ratio of the relative error in \(\vec{x}\) to that of \(\vec{b}\). For linear functions, which can be represented as \(f(\vec{x}) = A\vec{x} = \vec{b}\), it can be readily derived.
Derivation#
Perturb the inputs and outputs by errors \(\vec{\delta x}\) and \(\vec{\delta b}\) respectively:
\(\vec{x} \rightarrow \vec{x} + \vec{\delta x}\)
\(\vec{b} \rightarrow \vec{b} + \vec{\delta b}\)
The matrix equation becomes,
\(A \left[ \vec{x} + \vec{\delta x} \right] = \vec{b}+\vec{\delta b}\)
\(A \vec{x} + A \vec{\delta x} = \vec{b}+\vec{\delta b}\)
and since we know \(A \vec{x} = \vec{b}\), the uncertainties are directly related:
\(A \vec{\delta x} = \vec{\delta b}\)
or
\(\vec{\delta x} = A^{-1}\vec{\delta b}\)
From this we can calculate the relative error in the input considering the norm (magnitude) of the vector,
\(\frac{||\vec{\delta x}||}{||\vec{x}||} = \frac{||A^{-1}\vec{\delta b}||}{||\vec{x}||}\),
and that of the output,
\(\frac{||\vec{\delta b}||}{||\vec{b}||} = \frac{||\vec{\delta b}||}{|| A \vec{x}||}\).
The ratio of the relative errors is,
\(\frac{||A^{-1}\vec{\delta b}||}{||\vec{x}||} / \frac{||\vec{\delta b}||}{||A \vec{x}||}\)
\(= \frac{||A^{-1}\vec{\delta b}||}{||\vec{x}||} \frac{||A \vec{x}||}{||\vec{\delta b}||} \)
\(= \frac{||A^{-1}\vec{\delta b}||}{||\vec{\delta b}||} \frac{||A \vec{x}||}{||\vec{x}||} \)
The derivation proceeds using the norm of a matrix, the precise definition of which is deferred to the matrix norms section below. For a suitably defined matrix norm, \(||A \vec{x} || \leq ||A||\cdot ||\vec{x}||\). The maximum ratio of relative errors is defined as the condition number \(\kappa\),
\(\kappa = \max\left( \frac{||A^{-1}\vec{\delta b}||}{||\vec{\delta b}||} \frac{||A \vec{x}||}{||\vec{x}||} \right) \)
Substituting \(\vec{x} = A^{-1}\vec{b}\) and maximizing over \(\vec{\delta b}\) and \(\vec{b}\) independently,
\(=\max_{\vec{\delta b}}{\frac{||A^{-1}\vec{\delta b}||}{||\vec{\delta b}||}} \; \max_{\vec{b}}{\frac{||\vec{b}||}{||A^{-1} \vec{b}||}} \)
Each factor is, by the definition of the induced matrix norm, \(||A^{-1}||\) and \(||A||\) respectively (for the second factor, write \(\vec{b} = A\vec{x}\)), so
\(\kappa =||A^{-1}|| \cdot ||A||\)
from which it is clear that the condition number is the same for the forward and backward operations.
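As a quick sanity check (again assuming NumPy; the matrix is the illustrative example from above), the product \(||A||\cdot||A^{-1}||\) in the 2-norm matches the condition number NumPy reports:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

# Condition number as the product of induced 2-norms of A and its inverse.
kappa_product = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# NumPy's built-in condition number (2-norm by default).
kappa_builtin = np.linalg.cond(A)

print(kappa_product, kappa_builtin)   # both ~4.0e4
```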
Application#
The condition number implies,
\(\frac{||\vec{\delta b}||}{||\vec{b}||} \leq \kappa \frac{||\vec{\delta x}||}{||\vec{x}||}\),
but also, surprisingly,
\(\frac{||\vec{\delta x}||}{||\vec{x}||} \leq \kappa \frac{||\vec{\delta b}||}{||\vec{b}||}\)!
This may appear inconsistent, but note the role of the inequality: these are worst-case bounds, not equalities. The condition number characterizes the distortion of the transformation, which is reversible and treats uncertainty in \(\vec{x}\) and \(\vec{b}\) equivalently.
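A short numerical check of both bounds, as a sketch assuming NumPy: for an arbitrary perturbation, neither ratio of relative errors exceeds \(\kappa\).

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
kappa = np.linalg.cond(A)

x = rng.standard_normal(2)
b = A @ x

dx = 1e-6 * rng.standard_normal(2)   # perturbation of the input
db = A @ dx                          # corresponding output perturbation

rel_x = np.linalg.norm(dx) / np.linalg.norm(x)
rel_b = np.linalg.norm(db) / np.linalg.norm(b)

# Both ratios are bounded by the condition number.
print(rel_b / rel_x <= kappa, rel_x / rel_b <= kappa)   # True True
```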
Actual calculation through eigenvalues (singular values)#
The formal definition of the condition number requires the inverse of the matrix. We would rather obtain the condition number before inverting the matrix / solving the linear system, so as to predict the stability of the solution process.
The maximal stretching / distortion of a matrix is (by definition) its largest eigenvalue in magnitude. The maximal stretching of the inverse corresponds to the smallest eigenvalue of the original. Therefore, by estimating the largest and smallest eigenvalues (which can be obtained relatively efficiently), we can calculate,
\(\kappa = \frac{|\lambda^{max}|}{|\lambda^{min}|}\)
which is an analytical result for the Euclidean (\(L^2\)) matrix norm.
NB: In the literature one may encounter the term singular value, which is a generalization of eigenvalues to non-square matrices.
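As a sketch (assuming NumPy), the ratio of the largest to smallest singular value reproduces the 2-norm condition number without ever forming \(A^{-1}\):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

# Singular values are returned in descending order.
sigma = np.linalg.svd(A, compute_uv=False)
kappa_svd = sigma[0] / sigma[-1]

print(kappa_svd, np.linalg.cond(A))   # identical (2-norm condition number)
```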
Matrix norms#
For vectors we know the magnitude (called the norm) is: \(||v|| = \sqrt{\sum_i v_i^2} = \sqrt{v\cdot v}\).
In fact this is just one of a family of vector norms. In general, \(||v||_p = \left(\sum_i |v_i|^p\right) ^{1/p}\). The \(p=2\) norm is the ‘Euclidean norm’; for \(p\rightarrow \infty\), \(||v||_{\infty} = \max_i |v_i|\).
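For concreteness, a small sketch (assuming NumPy) of the common vector norms; np.linalg.norm takes the order \(p\) as its second argument:

```python
import numpy as np

v = np.array([3.0, -4.0, 12.0])

print(np.linalg.norm(v, 1))        # 19.0  (sum of absolute values)
print(np.linalg.norm(v, 2))        # 13.0  (Euclidean norm)
print(np.linalg.norm(v, np.inf))   # 12.0  (largest absolute entry)
```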
The Frobenius norm (the matrix analogue of the Euclidean vector norm) is simply,
\(||A||_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |A_{ij}|^2}\)
\(||A||_F = \sqrt{A:A}\)
The infinity norm (maximum row-sum norm) is,
\(||A||_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |A_{ij}|\)
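The corresponding matrix norms can be computed the same way (a sketch assuming NumPy; 'fro' selects the Frobenius norm and np.inf the maximum row sum):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

print(np.linalg.norm(A, 'fro'))     # sqrt(1 + 4 + 9 + 16) = sqrt(30)
print(np.linalg.norm(A, np.inf))    # max(|1| + |-2|, |3| + |4|) = 7.0
print(np.linalg.norm(A, 2))         # induced 2-norm (largest singular value)
```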