« Previous: Lecture 25 Summary | Next: Lecture 27 Summary » |
Simplified discussion of MMA algorithm considering only the linear/quadratic CCSA models from the paper, not the actual MMA model functions. Covered conservative approximations, inner/outer iterations, and trust-region and penalty updating. Because the CCSA approximations are convex, we can use ideas from convex optimization to perform solve the trust-region subproblem: duality.
Started by reviewing the basic idea of Lagrange multipliers to find an extremum of one function f0(x) and one equality constraint h1(x)=0. We instead find an extremum of L(x,ν1)=f0(x)+ν1h1(x) over x and the Lagrange multiplier ν1. The ν1 partial derivative of L ensures h1(x)=0, in which case L=f0 and the remaining derivatives extremize f0 along the constraint surface. Noted that ∇L=0 then enforces ∇f0=0 in the direction parallel to the constraint, whereas perpendicular to the constraint ν1 represents a "force" that prevents x from leaving the h1(x)=0 constraint surface.
Generalized to the Lagrangian L(x,λ,ν) of the general optimization problem (the "primal" problem) with both inequality and equality constraints, following chapter 5 of the Boyd and Vandenberghe book (see below) (section 5.1.1). Defined the Lagrange dual function g(λ,ν) (Boyd, section 5.1.2) and proved weak duality of the dual problem (sections 5.1.3, 5.2, 5.2.2). Commented on strong duality (5.2.3), which is mostly true/useful in convex problems (Slater's condition). Defined the notation x* etcetera for the optimum, as in Boyd.
Described the KKT conditions for a (local) optimum/extremum (Boyd, section 5.5.3). These are true in problems with strong duality, as pointed out by Boyd, but they are actually true in much more general conditions. For example, they hold under the "LICQ" condition in which the gradients of all the active constraints are linearly independents. Gave a simple graphical example to illustrate why violating LICQ requires a fairly weird optimum, at a cusp of two constraints.