‘Global’ convergence#
There are several options to modify the Newton-Raphson method in order to enhance the robustness of root finding, but the improvement in robustness has to be weighed against the computational expense.
We still have to assume our initial guess is reasonable; the goal of these modifications is to ensure the iteration doesn’t wander away from it.
Line search#
If we trust that \(\Delta \vec{x}\) is pointing in the right direction, then we should make sure it doesn’t step too far.
These approaches are called line search algorithms since we are moving along the direction prescribed by \(\Delta \vec{x}\).
Damped Newton-Raphson#
The easiest modification is adding a damping term. Calculate \(\Delta \vec{x}\) as usual from \(J \Delta \vec{x} = - \vec{f}\), but scale the update by a scalar factor \(\alpha\):

$$ \vec{x}^{i+1} = \vec{x}^i + \alpha \Delta \vec{x} $$
For \(\alpha = 1\) we recover the standard Newton-Raphson method.
For \(0 \lt \alpha \lt 1\) the method is underdamped, and the solver is forced to take small steps, keeping it closer to the guess but at the cost of convergence speed.
For \(\alpha \gt 1\) the method is overdamped, and the solver's steps are exaggerated. This may seem like a good idea to speed things up, but if you are not careful you will constantly overshoot your root or produce frenetic behaviour!
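As a concrete illustration, here is a minimal sketch of a damped Newton iteration in Python. The example system, the fixed value of \(\alpha\), and the tolerance are arbitrary choices for demonstration, not part of the text above:

```python
import numpy as np

def damped_newton(f, jac, x0, alpha=0.8, tol=1e-10, max_iter=50):
    """Damped Newton-Raphson: scale each Newton step by a fixed factor alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        # solve J dx = -f for the full Newton step, then take only alpha of it
        dx = np.linalg.solve(jac(x), -fx)
        x = x + alpha * dx
    return x

# illustrative system: x^2 + y^2 = 5, x*y = 2 (roots include (1, 2) and (2, 1))
f = lambda v: np.array([v[0]**2 + v[1]**2 - 5.0, v[0]*v[1] - 2.0])
jac = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(damped_newton(f, jac, x0=[3.0, 0.5]))
```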
Optimal step size#
From the damped approach, we can also attempt to find a ‘good’ step size algorithmically. Near the root, we know we want \(\alpha=1\) to get quadratic convergence, so this is often a starting point.
There are several approaches which leverage the information we have:
\(\vec{f}(\vec{x})\)
\(J(\vec{x})\)
\(\vec{f}(\vec{x}+\Delta \vec{x})\)
Fit a quadratic#
With these three pieces of information, we can fit a quadratic along the search direction and choose its minimum for the update.
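A minimal sketch of this idea, assuming the merit function \(g(\alpha) = \tfrac{1}{2}\|\vec{f}(\vec{x}+\alpha \Delta \vec{x})\|^2\); the function name and the clamping bounds are illustrative choices:

```python
import numpy as np

def quadratic_fit_step(f, x, dx, fx):
    """Fit a quadratic to g(alpha) = 0.5*||f(x + alpha*dx)||^2 and return its minimizer.

    Uses g(0), g'(0) and g(1); since J*dx = -f, the slope g'(0) = -||f(x)||^2.
    """
    g0 = 0.5 * np.dot(fx, fx)           # g(0)
    gp0 = -2.0 * g0                     # g'(0) = f . (J dx) = -||f||^2
    g1 = 0.5 * np.sum(f(x + dx)**2)     # g(1): the residual at the full Newton step
    c = g1 - g0 - gp0                   # curvature of the fitted quadratic
    if c <= 0:                          # no interior minimum: keep the full step
        return 1.0
    alpha = -gp0 / (2.0 * c)            # minimizer of g0 + gp0*a + c*a^2
    return min(max(alpha, 0.1), 1.0)    # keep alpha in a sensible range
```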
Backtracking#
The backtracking routine starts with \(\alpha=1\) and then repeatedly shrinks it until the step actually reduces the residual (or satisfies a stricter sufficient-decrease condition).
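A minimal sketch of a backtracking loop, using the simple criterion that the residual norm must decrease; the shrink factor and the iteration cap are illustrative choices:

```python
import numpy as np

def backtracking_step(f, x, dx, fx, rho=0.5, max_halvings=10):
    """Shrink alpha from 1 by a factor rho until the residual norm decreases."""
    norm0 = np.linalg.norm(fx)
    alpha = 1.0
    for _ in range(max_halvings):
        if np.linalg.norm(f(x + alpha * dx)) < norm0:
            return alpha                # accepted step length
        alpha *= rho                    # backtrack: try a smaller step
    return alpha                        # give up and return the smallest alpha tried
```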
Bounded minimization#
Find \(\alpha\) in the range (0,1] that minimizes some measure of residual, e.g. \(\| \vec{f}(\vec{x}+\alpha \Delta \vec{x}) \|\).
This is a 1D minimization problem (similar to our 1D root finding) and can use similar tools. The tradeoff here is the efficiency of this minimization vs just taking another Newton step.
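One way to do this is to hand the 1D problem to a library minimizer. A sketch using `scipy.optimize.minimize_scalar`; the small lower bound (used to keep \(\alpha\) strictly positive) is an arbitrary choice:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def optimal_alpha(f, x, dx):
    """Pick alpha in (0, 1] by minimizing the residual norm along the Newton direction."""
    g = lambda alpha: np.linalg.norm(f(x + alpha * dx))
    res = minimize_scalar(g, bounds=(1e-4, 1.0), method='bounded')
    return res.x
```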
Trust region methods#
A newer method for root finding adapts the concept of trust regions from optimization. Essentially one defines a region around the current iterate where the function is trusted to behave like a quadratic. This region is updated (similar to backtracking) as the solution proceeds according to some criteria.
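In practice one rarely codes a trust region solver by hand. As a sketch, SciPy's nonlinear least-squares interface exposes a trust-region reflective method (`method='trf'`) that can be pointed at the residual vector; the example system below is reused from the damped-Newton sketch and is not from the text:

```python
import numpy as np
from scipy.optimize import least_squares

# same illustrative system: x^2 + y^2 = 5, x*y = 2
f = lambda v: np.array([v[0]**2 + v[1]**2 - 5.0, v[0]*v[1] - 2.0])
jac = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])

# 'trf' is SciPy's Trust Region Reflective solver; it drives the residuals to zero
sol = least_squares(f, x0=[3.0, 0.5], jac=jac, method='trf')
print(sol.x, sol.cost)
```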