The conjugate gradient method is a method for finding a local extremum of a function based on information about its values and its gradient. For a quadratic function in $\mathbb{R}^n$ the minimum is found in no more than $n$ steps.
Key Concepts
Define the terminology:
Let $x \in \mathbb{R}^n$.

We introduce on $\mathbb{R}^n$ a target function

$f(x)\colon \mathbb{R}^n \to \mathbb{R}$.

Vectors $S_i$ and $S_j$ are called conjugate if

$S_i^T H S_j = 0, \quad i \neq j,$

where $H$ is the Hessian matrix of $f(x)$.
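As a quick numerical illustration (the matrix $H$ and the vectors below are a made-up example, not part of the original text), one can construct a vector conjugate to a given one and verify the condition $S_i^T H S_j = 0$:

```python
import numpy as np

# A made-up symmetric positive definite matrix playing the role of the Hessian H
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])

S0 = np.array([1.0, 0.0])

# Build S1 conjugate to S0: start from e2 and subtract its H-component along S0
e2 = np.array([0.0, 1.0])
S1 = e2 - (e2 @ H @ S0) / (S0 @ H @ S0) * S0

print(S0 @ H @ S1)  # ~0, i.e. S0 and S1 are conjugate with respect to H
```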
| Theorem (on existence). There exists at least one system of $n$ conjugate directions, since the eigenvectors of the $n \times n$ matrix $H$ form such a system. |
Justification of the method
Zero iteration
Let

$S_0 = -\nabla f(x_0). \quad (1)$

Then

$x_1 = x_0 - \lambda_0 \nabla f(x_0) = x_0 + \lambda_0 S_0. \quad (2)$

Define the direction

$S_1 = -\nabla f(x_1) + \omega_1 S_0$

so that it is conjugate to $S_0$:

$S_1^T H S_0 = 0. \quad (3)$

Expand $\nabla f(x_1)$ in a neighborhood of $x_0$ and substitute $x_1 - x_0 = \lambda_0 S_0$:

$\nabla f(x_1) = \nabla f(x_0) + \lambda_0 H^T S_0.$

Transpose the resulting expression and multiply by $S_1$ on the right:

$\nabla f(x_1)^T S_1 = \nabla f(x_0)^T S_1 + \lambda_0 S_0^T H S_1.$

Due to the continuity of the second partial derivatives, $H^T = H$, so $S_0^T H S_1 = S_1^T H S_0$. Then:

$S_1^T H S_0 = \frac{(\nabla f(x_1) - \nabla f(x_0))^T S_1}{\lambda_0}.$

Substitute the resulting expression into (3):

$(\nabla f(x_1) - \nabla f(x_0))^T S_1 = 0.$

Then, using (1) and (2):

$\omega_1 = \frac{\nabla f(x_1)^T \nabla f(x_1) - \nabla f(x_0)^T \nabla f(x_1)}{\nabla f(x_0)^T \nabla f(x_0) - \nabla f(x_1)^T \nabla f(x_0)}. \quad (4)$

If $\lambda_0$ minimizes $f$ along $S_0$, then the gradient at the point $x_1$ is perpendicular to the gradient at the point $x_0$; then, by the rules of the scalar product of vectors:

$\nabla f(x_1)^T \nabla f(x_0) = 0.$

Taking the latter into account, we obtain from expression (4) the final formula for calculating $\omega_1$:

$\omega_1 = \frac{\nabla f(x_1)^T \nabla f(x_1)}{\nabla f(x_0)^T \nabla f(x_0)}.$
The k-th iteration

At the $k$-th iteration we have the set of conjugate directions $\{S_0, S_1, \dots, S_{k-1}\}$.

Then the next direction is calculated by the formula:

$S_k = -\nabla f(x_k) + \sum_{i=0}^{k-1} \frac{\nabla f(x_k)^T H S_i}{S_i^T H S_i}\, S_i.$

This expression can be rewritten in a more convenient iterative form:

$S_k = -\nabla f(x_k) + \omega_k S_{k-1}, \qquad \omega_k = \frac{\nabla f(x_k)^T \nabla f(x_k)}{\nabla f(x_{k-1})^T \nabla f(x_{k-1})},$

where $\omega_k$ is calculated directly at the $k$-th iteration.
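The derivation above can be sanity-checked on a small quadratic. In the sketch below the matrix $H$ and vector $b$ are our own illustrative choice (so that $\nabla f(x) = Hx - b$ and the line minimum has a closed form):

```python
import numpy as np

# Made-up quadratic target: f(x) = 1/2 x^T H x - b^T x, so grad f(x) = H x - b
H = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0])
grad = lambda x: H @ x - b

x0 = np.array([0.0, 0.0])
S0 = -grad(x0)                          # (1)
lam0 = (S0 @ S0) / (S0 @ H @ S0)        # exact line search for a quadratic
x1 = x0 + lam0 * S0                     # (2)

g0, g1 = grad(x0), grad(x1)
print(g1 @ g0)                          # ~0: successive gradients are orthogonal

omega1 = (g1 @ g1) / (g0 @ g0)          # the final formula for omega_1
S1 = -g1 + omega1 * S0
print(S1 @ H @ S0)                      # ~0: S1 is conjugate to S0, i.e. (3) holds
```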
Algorithm
- Let $x_0$ be the starting point, $S_0 = -\nabla f(x_0)$ the direction of the anti-gradient, and suppose we are trying to find the minimum of the function $f$. Set $n = 0$ and find the minimum of $f$ along the direction $S_0$. Denote the minimum point by $x_1$.
- Suppose that at some step we are at the point $x_n$, and $-\nabla f(x_n)$ is the direction of the anti-gradient. Set $S_n = -\nabla f(x_n) + \omega_n S_{n-1}$, where $\omega_n$ is chosen either as $\omega_n = \dfrac{\nabla f(x_n)^T \nabla f(x_n)}{\nabla f(x_{n-1})^T \nabla f(x_{n-1})}$ (the standard Fletcher–Reeves algorithm, exact for quadratic functions with positive definite $H$), or as $\omega_n = \dfrac{\nabla f(x_n)^T (\nabla f(x_n) - \nabla f(x_{n-1}))}{\nabla f(x_{n-1})^T \nabla f(x_{n-1})}$ (the Polak–Ribière algorithm). Then find the minimum of $f$ in the direction $S_n$ and denote the minimum point by $x_{n+1}$. If the function does not decrease along the computed direction, forget the previous direction by setting $\omega_n = 0$ and repeating the step.
Formalization
1. Set the initial approximation and the error tolerance: $x_0$, $\varepsilon$, $k = 0$.
2. Calculate the initial direction: $S_k = -\nabla f(x_k)$.
3. $x_{k+1} = x_k + \lambda_k S_k$, where $\lambda_k = \arg\min_{\lambda} f(x_k + \lambda S_k)$.
4. If $\|S_k\| < \varepsilon$ or $\|x_{k+1} - x_k\| < \varepsilon$, then $x = x_{k+1}$ and stop.
5. Otherwise:
   - if $k + 1 < n$, then $\omega_{k+1} = \dfrac{\nabla f(x_{k+1})^T \nabla f(x_{k+1})}{\nabla f(x_k)^T \nabla f(x_k)}$, $S_{k+1} = -\nabla f(x_{k+1}) + \omega_{k+1} S_k$, $k = k + 1$, and go to 3;
   - otherwise $x_0 = x_{k+1}$, $k = 0$, and go to 2.
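The formalized steps can be sketched in Python roughly as follows. All names are ours, not the source's, and the one-dimensional line search here is a crude ternary search, just one of many possible choices:

```python
import numpy as np

def line_min(phi, lo=0.0, hi=1.0, iters=100):
    """Crude 1-D minimizer (ternary search); assumes phi is unimodal on the bracket."""
    while phi(hi) < phi(lo):      # expand the bracket while still descending
        hi *= 2.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phi(m1) < phi(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def conjugate_gradient(f, grad, x0, eps=1e-8, max_iter=1000):
    """Fletcher-Reeves conjugate gradients with a restart every n steps,
    following steps 1-5 above."""
    x = np.asarray(x0, dtype=float)   # step 1: initial approximation
    n = x.size
    k = 0
    S = -grad(x)                      # step 2: initial direction
    for _ in range(max_iter):
        lam = line_min(lambda t: f(x + t * S))          # step 3: line search
        x_new = x + lam * S
        if np.linalg.norm(S) < eps or np.linalg.norm(x_new - x) < eps:
            return x_new              # step 4: stop
        if k + 1 < n:                 # step 5: next conjugate direction
            g_new, g = grad(x_new), grad(x)
            omega = (g_new @ g_new) / (g @ g)           # Fletcher-Reeves
            S = -g_new + omega * S
            x, k = x_new, k + 1
        else:                         # step 5: restart from the anti-gradient
            x, k = x_new, 0
            S = -grad(x)
    return x

# Usage on a made-up smooth function with minimum at (1, -2)
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
df = lambda x: np.array([2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)])
x_min = conjugate_gradient(f, df, [0.0, 0.0])
print(x_min)  # close to [1, -2]
```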
The case of a quadratic function
| Theorem. If conjugate directions are used to find the minimum of a quadratic function, then this function can be minimized in $n$ steps, one along each direction, and the order of the directions is immaterial. |
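Assuming exact line searches, the theorem can be observed numerically. The $4 \times 4$ positive definite quadratic below is a made-up example; the residual $\|Hx - b\|$ drops to machine precision after exactly $n = 4$ steps:

```python
import numpy as np

# Made-up 4x4 positive definite quadratic: f(x) = 1/2 x^T H x - b^T x
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
H = M @ M.T + 4.0 * np.eye(4)     # symmetric positive definite
b = rng.standard_normal(4)

n = 4
x = np.zeros(n)
g = H @ x - b
S = -g
for _ in range(n):                # exactly n steps, one per conjugate direction
    lam = -(g @ S) / (S @ H @ S)  # exact line search for a quadratic
    x = x + lam * S
    g_new = H @ x - b
    S = -g_new + (g_new @ g_new) / (g @ g) * S   # Fletcher-Reeves update
    g = g_new

print(np.linalg.norm(H @ x - b))  # ~0: the minimum is reached after n steps
```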