Line search methods are central to unconstrained optimization: at each iteration they choose a search direction and then a step size along it that moves the current point toward a minimum. These methods balance computational cost against convergence speed, making them practical for many real-world problems.
Steepest descent and Newton's method are two key line search approaches. Quasi-Newton methods, like BFGS, offer a middle ground: they approximate the Hessian to avoid its costly computation and inversion while still converging quickly in most cases.
Search Direction and Step Size
Fundamentals of Line Search Methods
- Search direction determines the path of optimization in multidimensional space
- Step size controls the magnitude of movement along the search direction
- Line search methods iterate between choosing a direction and determining an appropriate step size (see the loop sketch after this list)
- Optimization process continues until convergence criteria are met (gradient norm below threshold)
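A minimal sketch of this loop, assuming the caller supplies a method-specific direction rule and a line search routine (the callback names `direction_fn` and `step_fn` are illustrative placeholders, not from any particular library):

```python
import numpy as np

def line_search_minimize(f, grad, x0, direction_fn, step_fn, tol=1e-6, max_iter=1000):
    """Generic line search loop: pick a direction, pick a step size, repeat."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:      # convergence test: gradient norm below threshold
            break
        d = direction_fn(x, g)           # search direction (method-specific)
        alpha = step_fn(f, grad, x, d)   # step size from a line search procedure
        x = x + alpha * d                # move along the search direction
    return x
```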
Steepest Descent and Newton's Method
- Steepest descent method uses negative gradient as search direction
- Computationally inexpensive but can be slow to converge
- Formula: $d_k = -\nabla f(x_k)$
- Performs well for functions with circular contours
- Newton's method utilizes both gradient and Hessian information
- Faster convergence rate, especially near the optimum
- Search direction given by: $d_k = -H_k^{-1} \nabla f(x_k)$, where $H_k = \nabla^2 f(x_k)$ is the Hessian
- Requires computation and inversion of Hessian matrix
- Both methods update the current point using: $x_{k+1} = x_k + \alpha_k d_k$ (see the sketch after this list)
- $\alpha_k$ represents the step size
- Determined through line search procedures (exact or inexact)
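A sketch comparing the two search directions on a small quadratic (the test function, starting point, and fixed step size below are chosen purely for illustration):

```python
import numpy as np

# Illustrative quadratic f(x) = 0.5 * x^T A x - b^T x with known gradient and Hessian
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])

f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
hess = lambda x: A  # constant Hessian for a quadratic

x = np.array([5.0, 5.0])

# Steepest descent direction: d_k = -grad f(x_k)
d_sd = -grad(x)

# Newton direction: d_k = -H_k^{-1} grad f(x_k); solve the linear system instead of inverting
d_newton = np.linalg.solve(hess(x), -grad(x))

# Both methods update via x_{k+1} = x_k + alpha_k * d_k, with alpha_k from a line search
alpha = 1.0                      # Newton's natural step; steepest descent usually needs a searched alpha
x_newton = x + alpha * d_newton  # for this quadratic, the full Newton step lands on the exact minimizer
```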
Quasi-Newton Methods
Principles and Advantages
- Quasi-Newton methods approximate the Hessian matrix or its inverse
- Balance between computational efficiency and convergence speed
- Avoid explicit calculation and inversion of the Hessian matrix
- Update formula maintains positive definiteness of approximation
- Superlinear convergence rate in many practical applications
BFGS Method Implementation
- Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm widely used in practice
- Maintains an approximation of the Hessian matrix (an equivalent form of the update works with the inverse Hessian directly, avoiding linear solves)
- BFGS update formula: $B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}$ (sketched after this list)
- $B_k$ represents the current approximation of the Hessian
- $s_k = x_{k+1} - x_k$ denotes the step taken
- $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$ measures the gradient change
- Limited-memory BFGS (L-BFGS) variant for large-scale problems
- Stores only a fixed number of vector pairs
- Reduces memory requirements for high-dimensional optimization tasks
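A sketch of the update formula above applied to a Hessian approximation; the skipped update when $y_k^T s_k$ is nearly zero is a common practical safeguard assumed here, not part of the formula itself:

```python
import numpy as np

def bfgs_update(B, s, y, eps=1e-10):
    """One BFGS update of the Hessian approximation B from step s and gradient change y.

    B_{k+1} = B_k + (y y^T)/(y^T s) - (B s s^T B)/(s^T B s)
    """
    ys = y @ s
    if ys <= eps:            # y^T s > 0 is needed to keep B positive definite
        return B             # skip the update rather than break positive definiteness
    Bs = B @ s
    return B + np.outer(y, y) / ys - np.outer(Bs, Bs) / (s @ Bs)

# Usage with quantities from two consecutive iterates:
# s = x_next - x                          # step taken
# y = grad(x_next) - grad(x)              # gradient change
# B = bfgs_update(B, s, y)
# d = np.linalg.solve(B, -grad(x_next))   # quasi-Newton search direction
```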
Line Search Conditions
Armijo Rule and Sufficient Decrease
- Armijo rule ensures sufficient decrease in objective function value
- Condition: $f(x_k + \alpha_k d_k) \leq f(x_k) + c_1 \alpha_k \nabla f(x_k)^T d_k$
- $c_1$ typically set to a small value (0.0001 to 0.1)
- Prevents excessively large steps that may increase function value
- Often combined with backtracking to find a suitable step size (see the sketch after this list)
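A minimal backtracking sketch that enforces the Armijo condition; the initial step $\alpha_0 = 1$, shrink factor $\rho = 0.5$, and $c_1 = 10^{-4}$ are common defaults assumed for illustration:

```python
def backtracking_armijo(f, grad, x, d, alpha0=1.0, rho=0.5, c1=1e-4, max_shrinks=50):
    """Shrink alpha until f(x + alpha d) <= f(x) + c1 * alpha * grad(x)^T d."""
    alpha = alpha0
    fx = f(x)
    slope = grad(x) @ d                 # directional derivative; negative for a descent direction
    for _ in range(max_shrinks):
        if f(x + alpha * d) <= fx + c1 * alpha * slope:
            return alpha                # sufficient decrease achieved
        alpha *= rho                    # otherwise shrink the step and try again
    return alpha
```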
Wolfe Conditions for Step Size Selection
- Wolfe conditions comprise two inequalities for step size selection
- Sufficient decrease condition (Armijo rule)
- Curvature condition: $\nabla f(x_k + \alpha_k d_k)^T d_k \geq c_2 \nabla f(x_k)^T d_k$ (see the check sketched after this list)
- $c_2$ typically chosen between 0.1 and 0.9
- Strong Wolfe conditions use absolute value in curvature condition
- Ensure reasonable progress and avoid excessively small steps
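A sketch that checks whether a candidate step satisfies the (strong) Wolfe conditions, assuming the typical constants $c_1 = 10^{-4}$ and $c_2 = 0.9$:

```python
def satisfies_wolfe(f, grad, x, d, alpha, c1=1e-4, c2=0.9, strong=False):
    """Return True if alpha satisfies the (strong) Wolfe conditions along direction d."""
    slope0 = grad(x) @ d                                      # grad f(x_k)^T d_k
    x_new = x + alpha * d
    armijo = f(x_new) <= f(x) + c1 * alpha * slope0           # sufficient decrease (Armijo)
    slope_new = grad(x_new) @ d                               # grad f(x_k + alpha d_k)^T d_k
    if strong:
        curvature = abs(slope_new) <= c2 * abs(slope0)        # strong Wolfe: absolute values
    else:
        curvature = slope_new >= c2 * slope0                  # standard curvature condition
    return armijo and curvature
```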
Backtracking Line Search Algorithm