Quasi-Newton methods are clever tricks to speed up optimization without the heavy lifting of Newton's method. Instead of computing and inverting the Hessian, they build an approximation to the inverse Hessian from gradient information gathered at previous steps, refining it as the iterations proceed.
BFGS and DFP are two popular Quasi-Newton methods. They use different formulas to update the approximation, but both aim for superlinear convergence. BFGS is generally more robust and efficient in practice.
Quasi-Newton Methods Motivation
Approximating the Inverse Hessian
DFP update: $B_{k+1} = B_k - \frac{B_k y_k y_k^T B_k}{y_k^T B_k y_k} + \frac{s_k s_k^T}{y_k^T s_k}$
Where $B_k$ is the current inverse Hessian approximation, $s_k = x_{k+1} - x_k$ is the step vector, and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$ is the difference in gradients
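As a minimal sketch (NumPy; the function and variable names are illustrative, not from the original), this rank-two update can be applied to the current approximation like so:

```python
import numpy as np

def dfp_update(B, s, y):
    """One DFP update of the inverse Hessian approximation B.

    B: current inverse Hessian approximation (symmetric n x n array)
    s: step vector s_k = x_{k+1} - x_k
    y: gradient difference y_k = grad f(x_{k+1}) - grad f(x_k)
    """
    By = B @ y
    # Subtract the B y y^T B / (y^T B y) term, add the s s^T / (y^T s) term.
    return B - np.outer(By, By) / (y @ By) + np.outer(s, s) / (y @ s)
```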
Implementing BFGS and DFP
Algorithm Structure
Initialize inverse Hessian approximation (typically identity matrix or scaled version)
Compute search direction: $p_k = -B_k \nabla f(x_k)$
Perform line search to determine step size $\alpha_k$
Update solution: $x_{k+1} = x_k + \alpha_k p_k$
Update inverse Hessian approximation using BFGS or DFP formula
Efficient BFGS implementations use the Sherman-Morrison-Woodbury formula to update the inverse Hessian approximation directly (see the sketch after this list)
DFP update implemented directly on inverse Hessian approximation
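Putting these steps together, here is a minimal sketch of the BFGS loop (Python/NumPy, with a simple backtracking Armijo line search standing in for a full Wolfe line search; `f`, `grad`, and the tolerances are placeholders, not part of the original text):

```python
import numpy as np

def bfgs_minimize(f, grad, x0, tol=1e-6, max_iter=200):
    """Minimal BFGS sketch maintaining an inverse Hessian approximation B."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    B = np.eye(n)                        # initialize with the identity matrix
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:      # gradient-norm termination test
            break
        p = -B @ g                       # search direction p_k = -B_k grad f(x_k)
        alpha = 1.0                      # backtracking (Armijo) line search
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5
        x_new = x + alpha * p
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-10:                   # curvature condition keeps B positive definite
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, y)
            # Sherman-Morrison-Woodbury form of the BFGS inverse update:
            # B <- (I - rho s y^T) B (I - rho y s^T) + rho s s^T
            B = V @ B @ V.T + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x
```

Swapping the update line for the DFP formula above turns the same loop into a DFP method.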
Implementation Considerations
Careful implementation of line search procedures (e.g., enforcing the Wolfe conditions) ensures sufficient decrease in the objective function
Maintain positive definiteness of the approximation by applying the update only when the curvature condition $y_k^T s_k > 0$ holds
Guard against division by zero by skipping the update when a denominator such as $y_k^T s_k$ is near zero
Maintain symmetry of the approximation (e.g., by re-symmetrizing after each update) for robust implementations; a safeguarded update sketch follows this list
Termination criteria involve checking norm of gradient, change in function value, or step size against predefined tolerances
Example termination condition: $\|\nabla f(x_k)\| < \epsilon$ or $|f(x_k) - f(x_{k-1})| < \delta$
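One way to fold these safeguards into the update itself is a guarded version of the DFP formula (again a sketch with hypothetical names, not a definitive implementation):

```python
import numpy as np

def safeguarded_dfp_update(B, s, y, eps=1e-10):
    """DFP update with basic numerical safeguards."""
    By = B @ y
    sy = y @ s
    yBy = y @ By
    # Skip the update if either denominator is tiny: this avoids division by
    # zero and preserves the positive definiteness of the current approximation.
    if sy <= eps or yBy <= eps:
        return B
    B_new = B - np.outer(By, By) / yBy + np.outer(s, s) / sy
    # Re-symmetrize to counter accumulated floating-point asymmetry.
    return 0.5 * (B_new + B_new.T)
```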
BFGS vs DFP Performance
Convergence Properties
BFGS and DFP exhibit superlinear convergence under standard conditions (a smooth objective with a positive definite Hessian at the minimizer and a line search satisfying the Wolfe conditions)
BFGS generally converges faster and more reliably than DFP in practice
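As an illustration (not from the original text), SciPy's general-purpose minimizer includes a BFGS implementation; a quick run on the Rosenbrock test function shows the typical behavior:

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

x0 = np.array([-1.2, 1.0])                  # classic Rosenbrock starting point
result = minimize(rosen, x0, jac=rosen_der, method="BFGS")
print(result.x, result.nit)                 # approximately [1. 1.], the minimizer
```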