
This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training.

Creative Commons License
This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1


Non-linear Optimisation

Mathematical formulation

Optimisation aims to find the minimum (or equivalently the maximum) of some objective, or loss, function $f$, given a set of $n$ parameters $\theta$:

$$\min_{\theta \in \mathcal{R}^n} f(\theta)$$
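As a concrete sketch of this unconstrained formulation, the snippet below minimises a simple quadratic bowl using SciPy. The objective function and starting point are illustrative choices, not part of the course material.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative objective: a quadratic bowl with its minimum at theta = (1, 2)
def f(theta):
    return (theta[0] - 1.0) ** 2 + (theta[1] - 2.0) ** 2

# Start the local optimiser from the origin; SciPy's default (BFGS)
# approximates the gradient by finite differences
result = minimize(f, x0=np.array([0.0, 0.0]))
print(result.x)  # close to [1, 2]
```

The returned object also carries diagnostic fields such as `result.success` and the number of function evaluations, which are worth inspecting for harder problems.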

We might also have a set of constraints; for example, a parameter might be required to be non-negative (e.g. a concentration or population number). These are often written as a set of equality $\mathcal{E}$ and inequality $\mathcal{I}$ constraints:

$$\min_{\theta \in \mathcal{R}^n} f(\theta) \text{ subject to } \begin{cases} c_i(\theta) = 0, & i \in \mathcal{E} \\ c_i(\theta) \ge 0, & i \in \mathcal{I} \end{cases}$$

Or these might simply be defined as bounds in parameter space that restrict the minimisation to a given domain $\Omega \subseteq \mathcal{R}^n$:

$$\min_{\theta \in \Omega} f(\theta)$$

Useful terms

Modelling is the process of defining the objective function $f$, the parameters of interest $\theta$, and the constraints. The algorithms for performing the minimisation fall under the field of optimisation. Sub-fields of this are concerned with the minimisation of discrete functions, often called integer programming. Confusingly, it is common to see the terms "optimisation" and "programming" used interchangeably, as the latter term was coined in the 1940s and does not refer to computer software programming at all.

If the function $f$ is linear, then there are specific algorithms for this class of problem that fall under the topic of linear programming, or linear optimisation. The more general problem of a non-linear $f$ is termed non-linear programming, or non-linear optimisation. If a set of equality and/or inequality constraints is needed, then algorithms that deal with these fall under the topic of constrained optimisation.

An important distinction when looking at optimisation problems is the notion of global versus local optimisation. The latter finds a point in parameter space $\theta_m$ whose function value $f(\theta_m)$ is lower than that of the surrounding points, but which might not be the global minimum. These algorithms are often initialised to a point near the minimum of interest. The more general problem of global optimisation is significantly more difficult, as it requires that the optimisation be robust to finding and rejecting such local minima. For a convex function, local and global minimisation are the same, which is very advantageous since local minimisation algorithms are often faster and tend to have stronger guarantees of convergence. The function $f$ is a convex function if its domain $\Omega$ is a convex set and, for any two points $\theta_x$ and $\theta_y$ in $\Omega$:

$$f(\alpha \theta_x + (1 - \alpha) \theta_y) \le \alpha f(\theta_x) + (1 - \alpha) f(\theta_y), \text{ for all } \alpha \in [0, 1].$$
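The convexity inequality can be spot-checked numerically. The sketch below tests it at many random point pairs for $f(\theta) = \|\theta\|^2$, a function known to be convex; the choice of test function and sample count are illustrative.

```python
import numpy as np

# Convex test function: f(theta) = ||theta||^2
def f(theta):
    return np.sum(theta ** 2)

rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    alpha = rng.uniform()
    lhs = f(alpha * x + (1 - alpha) * y)
    rhs = alpha * f(x) + (1 - alpha) * f(y)
    # Small tolerance guards against floating-point rounding
    ok = ok and (lhs <= rhs + 1e-12)

print(ok)  # True: the inequality holds at every sampled pair
```

A non-convex function such as $f(\theta) = \cos(\theta_1)$ would fail this check for some pairs, which is one quick way to falsify a convexity claim (though passing the check does not prove convexity).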

The term convex programming is used to describe the case of constrained optimisation where $f$ is convex, the equality constraints are linear, and the inequality constraints are concave.

Non-linear optimisation and Root-Finding

Non-linear optimisation is closely related to finding the roots, or zeros, of a function. This can be seen easily by considering the fact that at each local minimum or maximum of the function the gradient of $f$ is zero, i.e. $\nabla f = 0$. Therefore finding a local minimum or maximum of $f$ corresponds to finding the zeros of the function $g = \nabla f$.
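This equivalence can be demonstrated in one dimension: below, the same minimum is found both by root-finding on the derivative $g = f'$ and by direct minimisation. The test function is an illustrative choice, not from the course material.

```python
from scipy.optimize import brentq, minimize_scalar

# f(x) = (x - 3)^2 + 1 has a single minimum at x = 3;
# its derivative g(x) = f'(x) = 2*(x - 3) has a root there
def g(x):
    return 2.0 * (x - 3.0)

# Approach 1: find the zero of the gradient with a bracketing root-finder
root = brentq(g, a=-10.0, b=10.0)

# Approach 2: minimise f directly
direct = minimize_scalar(lambda x: (x - 3.0) ** 2 + 1.0).x

print(root, direct)  # both close to 3
```

Note the caveat that $\nabla f = 0$ is only a necessary condition: a zero of the gradient may also be a maximum or a saddle point, so in practice the result should be checked, e.g. via second derivatives.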

Other reading

  • Nocedal and Wright, Numerical Optimization, 2/e, Springer

  • Bazaraa, Sherali, and Shetty, Nonlinear Optimization, 2/e, Wiley

  • Griva, Nash, and Sofer, Linear and Nonlinear Optimization, 2/e, SIAM Press

  • Luenberger, Linear and Nonlinear Programming, 3/e, Springer

  • Bertsekas, Nonlinear Programming, Athena

  • Ruszczynski, Nonlinear Optimization, Princeton University Press

  • Broyden, C. G. (1972). "Quasi-Newton Methods". In Murray, W. (ed.). Numerical Methods for Unconstrained Optimization. London: Academic Press. pp. 87–106. ISBN 0-12-512250-0.