Subgradient methods are iterative methods for solving convex minimization problems. Originally developed by Naum Z. Shor and others in the 1960s and 1970s, subgradient methods converge even when applied to non-differentiable objective functions. When the objective function is differentiable, subgradient methods for unconstrained problems use the same search direction as the method of steepest descent.
Subgradient methods are slower than Newton's method when used to minimize twice continuously differentiable convex functions. However, Newton's method fails to converge on problems that have non-differentiable kinks.
In recent years, some interior-point methods have been proposed for convex minimization problems, but subgradient projection methods and related bundle methods of descent remain competitive. For convex minimization problems with a very large number of dimensions, subgradient projection methods are suitable, because they require little storage.
Subgradient projection methods are often applied to large-scale problems with decomposition techniques. Such decomposition methods often admit a simple distributed implementation.
Classical subgradient rules
Let f : \mathbb{R}^n \to \mathbb{R} be a convex function with domain \mathbb{R}^n. The classical subgradient method iterates
- x^{(k+1)} = x^{(k)} - \alpha_k g^{(k)}
where g^{(k)} denotes any subgradient of f at x^{(k)}, and x^{(k)} is the k-th iterate of the variable x. If f is differentiable, then its only subgradient is the gradient vector \nabla f. It may happen that -g^{(k)} is not a descent direction for f at x^{(k)}. We therefore maintain a value f_{\rm best}^{(k)} that keeps track of the smallest objective value found so far, that is,
- f_{\rm best}^{(k)} = \min\{ f_{\rm best}^{(k-1)}, f(x^{(k)}) \}
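As a concrete illustration, the iteration and the book-keeping of the best value found so far can be sketched as follows. This is a minimal sketch, not a reference implementation; the function names and the example objective f(x) = |x - 3| are illustrative choices.

```python
import numpy as np

def subgradient_method(f, subgrad, x0, step, iters=1000):
    """Classical iteration x^(k+1) = x^(k) - alpha_k g^(k),
    keeping track of the best objective value seen so far."""
    x = np.asarray(x0, dtype=float)
    f_best, x_best = f(x), x.copy()
    for k in range(1, iters + 1):
        g = subgrad(x)               # any subgradient of f at x^(k)
        x = x - step(k) * g          # step size chosen "in advance"
        if f(x) < f_best:            # -g need not be a descent direction,
            f_best, x_best = f(x), x.copy()   # so remember the best iterate
    return x_best, f_best

# Example: minimize f(x) = |x - 3|, non-differentiable at its minimizer.
f = lambda x: abs(x[0] - 3.0)
subgrad = lambda x: np.array([np.sign(x[0] - 3.0)])  # sign(0) = 0 is a valid subgradient
x_best, f_best = subgradient_method(f, subgrad, [0.0], step=lambda k: 1.0 / k)
```

With the diminishing step size 1/k, the iterates oscillate around the minimizer x = 3 with ever-smaller amplitude, so the best value found approaches the optimum.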
Step Size Rules
Subgradient methods use many different step-size rules. Here we mention five classical rules for which convergence proofs are known:
- Constant step size, \alpha_k = \alpha.
- Constant step length, \alpha_k = \gamma / \|g^{(k)}\|_2, which gives \|x^{(k+1)} - x^{(k)}\|_2 = \gamma.
- Square summable but not summable step size, that is, any step sizes satisfying \alpha_k \ge 0, \sum_{k=1}^\infty \alpha_k^2 < \infty, \sum_{k=1}^\infty \alpha_k = \infty.
- Nonsummable diminishing step size, that is, any step sizes satisfying \alpha_k \ge 0, \lim_{k\to\infty} \alpha_k = 0, \sum_{k=1}^\infty \alpha_k = \infty.
- Nonsummable diminishing step lengths, that is, \alpha_k = \gamma_k / \|g^{(k)}\|_2, where \gamma_k \ge 0, \lim_{k\to\infty} \gamma_k = 0, \sum_{k=1}^\infty \gamma_k = \infty.
For all five rules, the step sizes are determined "in advance", before the method is run; they do not depend on preceding iterations. This "in advance" property of subgradient methods differs from the "on-line" step-size rules used in methods for differentiable functions: many methods for minimizing differentiable functions satisfy Wolfe's conditions for convergence, where the step sizes depend on the current point and the current search direction. An extensive discussion of step-size rules for subgradient methods, including incremental versions, is given in the book of Bertsekas [1] and in the book of Bertsekas, Nedic, and Ozdaglar [2] .
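The "in advance" character of these rules can be made concrete by writing them as functions of the iteration counter alone. The constants a and gamma below are illustrative choices, not prescribed by the method.

```python
import math

# Illustrative "in advance" schedules: each depends only on the iteration
# counter k (and, for step lengths, on ||g^(k)||_2 at the current point),
# never on the history of previous iterates.
def constant_step_size(k, a=0.1):
    return a                      # rule 1: alpha_k = a

def constant_step_length(k, norm_g, gamma=0.1):
    return gamma / norm_g         # rule 2: gives ||x^(k+1) - x^(k)||_2 = gamma

def square_summable(k, a=1.0):
    return a / k                  # rule 3: sum alpha_k^2 < inf, sum alpha_k = inf

def nonsummable_diminishing(k, a=1.0):
    return a / math.sqrt(k)       # rule 4: alpha_k -> 0, sum alpha_k = inf
```

Any of these can be passed as the `step` argument of a subgradient iteration; the convergence guarantees differ from rule to rule.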
Convergence
For constant step length and scaled subgradients having Euclidean norm equal to one, the subgradient method converges to an arbitrarily close approximation of the minimum value, that is,
- \lim_{k\to\infty} f_{\rm best}^{(k)} - f^* < \epsilon
by a result of Shor. [3]
Classical subgradient methods have poor convergence properties and are no longer recommended for general use [4] [5] . However, they are still used in specialized applications, because they are simple and can easily be adapted to exploit the special structure of the problem at hand.
Subgradient projection and bundle methods
During the 1970s, Claude Lemaréchal and Philip Wolfe proposed "bundle methods" of descent for problems of convex minimization [6] . The meaning of the term "bundle methods" has changed significantly since then. Modern versions and a full convergence analysis were provided by Kiwiel [7] . Contemporary bundle methods often use "level control" rules for choosing step sizes, developing techniques from the "subgradient projection" method of Boris T. Polyak (1969). However, there are problems on which bundle methods offer little advantage over subgradient projection methods [4] [5] .
Constrained optimization
Subgradient Projection Method
One extension of subgradient methods is the subgradient projection method , which solves the constrained optimization problem
- minimize f(x) subject to x \in \mathcal{C},
where \mathcal{C} is a convex set . The subgradient projection method iterates
- x^{(k+1)} = P\left(x^{(k)} - \alpha_k g^{(k)}\right),
where P is projection onto \mathcal{C} and g^{(k)} is any subgradient of f at the point x^{(k)}.
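A minimal sketch of this iteration follows; the set C, the objective, and all names are illustrative choices. Here C = {x : x_i >= 1}, whose Euclidean projection is a simple componentwise clamp.

```python
import numpy as np

def projected_subgradient(subgrad, project, x0, step, iters=500):
    """Iterate x^(k+1) = P(x^(k) - alpha_k g^(k)):
    a subgradient step followed by projection onto C."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, iters + 1):
        g = subgrad(x)
        x = project(x - step(k) * g)   # projection restores feasibility
    return x

# Example: minimize f(x) = ||x||_1 over C = {x : x_i >= 1}.
subgrad = lambda x: np.sign(x)          # a subgradient of the 1-norm
project = lambda x: np.maximum(x, 1.0)  # Euclidean projection onto C
x = projected_subgradient(subgrad, project, [5.0, -2.0], step=lambda k: 1.0 / k)
```

For this example the iterates are driven to the boundary point (1, 1), the constrained minimizer.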
General constraints
The subgradient method can be extended to solve the problem with inequality constraints
- minimize f_0(x) subject to f_i(x) \le 0, \quad i = 1, \dots, m,
where the functions f_i are convex. The algorithm takes the same form as in the unconstrained case,
- x^{(k+1)} = x^{(k)} - \alpha_k g^{(k)},
where \alpha_k is a step size and g^{(k)} is a subgradient of the objective function or of one of the constraint functions at the point x^{(k)}. Here
- g^{(k)} \in \begin{cases} \partial f_0(x) & \text{if } f_i(x) \le 0 \text{ for all } i, \\ \partial f_j(x) & \text{for some } j \text{ such that } f_j(x) > 0, \end{cases}
where \partial f denotes the subdifferential of f. If the current point is feasible, the algorithm uses a subgradient of the objective function; if the point is infeasible, the algorithm chooses a subgradient of any violated constraint.
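This alternation between objective and constraint subgradients can be sketched on a toy one-dimensional problem. The names and the specific functions below are illustrative choices, not part of the method's definition.

```python
import numpy as np

def constrained_subgradient(f0, g0, constraints, x0, step, iters=2000):
    """If x is feasible, step along a subgradient of the objective;
    otherwise step along a subgradient of a violated constraint.
    Returns the best feasible point found."""
    x = np.asarray(x0, dtype=float)
    best = None
    for k in range(1, iters + 1):
        violated = [(fi, gi) for fi, gi in constraints if fi(x) > 0]
        if violated:
            g = violated[0][1](x)          # subgradient of a violated constraint
        else:
            g = g0(x)                      # feasible: subgradient of the objective
            if best is None or f0(x) < f0(best):
                best = x.copy()            # track the best feasible iterate
        x = x - step(k) * g
    return best

# Example: minimize f0(x) = (x - 5)^2 subject to f1(x) = x - 2 <= 0,
# whose constrained optimum is x* = 2.
f0 = lambda x: (x[0] - 5.0) ** 2
g0 = lambda x: np.array([2.0 * (x[0] - 5.0)])
f1 = lambda x: x[0] - 2.0
g1 = lambda x: np.array([1.0])
x_best = constrained_subgradient(f0, g0, [(f1, g1)], [0.0], step=lambda k: 0.5 / k)
```

The iterates overshoot into the infeasible region, are pushed back by the constraint subgradient, and settle around the boundary point x = 2.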
Notes
- ↑ Bertsekas, 2015 .
- ↑ Bertsekas, Nedic, Ozdaglar, 2003 .
- ↑ The convergence of the subgradient method with a constant (scaled) step size is stated in Exercise 6.3.14(a) of Bertsekas (1999, p. 636) ( Bertsekas 1999 ), who attributes the result to Shor ( Shor 1985 ).
- ↑ 1 2 Lemaréchal, 2001 , p. 112–156.
- ↑ 1 2 Kiwiel, Larsson, Lindberg, 2007 , p. 669–686.
- ↑ Bertsekas, 1999 .
- ↑ Kiwiel, 1985 , p. 362.
Literature
- Dimitri P. Bertsekas. Convex Optimization Algorithms. - 2nd ed. - Belmont, MA: Athena Scientific, 2015. - ISBN 978-1-886529-28-1.
- Dimitri P. Bertsekas, Angelia Nedic, Asuman Ozdaglar. Convex Analysis and Optimization. - 2nd ed. - Belmont, MA: Athena Scientific, 2003. - ISBN 1-886529-45-0.
- Naum Z. Shor. Minimization Methods for Non-differentiable Functions. - Springer-Verlag, 1985. - ISBN 0-387-12763-1.
- Dimitri P. Bertsekas. Nonlinear Programming. - 2nd ed. - Cambridge, MA: Athena Scientific, 1999. - ISBN 1-886529-00-0.
- Krzysztof Kiwiel. Methods of Descent for Nondifferentiable Optimization. - Berlin: Springer-Verlag, 1985. - ISBN 978-3540156420.
- Claude Lemaréchal. Lagrangian relaxation // Computational Combinatorial Optimization: Papers from the Spring School held in Schloß Dagstuhl, May 15–19, 2000. - Berlin: Springer-Verlag, 2001. - Vol. 2241. - (Lecture Notes in Computer Science). - ISBN 3-540-42877-1. - DOI: 10.1007/3-540-45586-8_4.
- Krzysztof C. Kiwiel, Torbjörn Larsson, P. O. Lindberg. Lagrangian relaxation via ballstep subgradient methods // Mathematics of Operations Research. - 2007. - August (Vol. 32, No. 3). - pp. 669–686. - DOI: 10.1287/moor.1070.0261.
Further Reading
- Andrzej Piotr Ruszczyński. Nonlinear Optimization. - Princeton, NJ: Princeton University Press, 2006. - pp. xii+454. - ISBN 978-0691119151.