Fixed-point iteration based algorithm for a class of nonlinear programming problems
Ashok D. Belegundu
Journal of Applied Mathematics and Computational Mechanics 
Department of Mechanical and Nuclear Engineering, The Pennsylvania State University, University Park, PA 16802 USA
adb3@psu.edu
Received: 9 February 2017; accepted: 22 May 2017
Abstract. A fixed-point algorithm is presented for a class of singly constrained nonlinear programming (NLP) problems with bounds. Setting the gradient of the Lagrangian equal to zero yields a set of optimality conditions. However, a direct solution on general problems may yield non-KKT points. Under the assumptions that the gradient of the objective function is negative, the gradient of the constraint function is positive, and the variables are positive, it is shown that the fixed-point iterations can converge to a KKT point. An active set strategy is used to handle lower and upper bounds. While fixed-point iteration algorithms can be found in the structural optimization literature, they are presented without clearly stating the assumptions under which convergence may be achieved. They are also problem specific as opposed to working with general functions f, g. Here, the algorithm targets general functions which satisfy the stated assumptions. Further, within this general context, the fixed-point variable update formula is given physical significance. Unlike NLP descent methods, no line search, with its many function calls or simulations, is needed to determine a step size. Thus, the resulting algorithm is vastly superior for the subclass of problems considered. Moreover, the number of function evaluations remains independent of the number of variables, allowing the efficient solution of problems with a large number of variables. Applications and numerical examples are presented.
MSC 2010: 47N10, 49M05, 90C30, 90C06, 74P05
Keywords: fixed-point iteration, resource allocation, nonlinear programming, optimality criteria
1. Introduction
A fixed-point algorithm is presented for a class of singly constrained NLP problems with bounds. Setting the gradient of the Lagrangian equal to zero yields a set of optimality conditions. However, a direct solution may yield non-KKT points. Under the assumptions that the gradient of the objective function is negative, the gradient of the constraint function is positive, and the variables are positive, it is shown that the fixed-point iteration can be made to converge to a KKT point. Opposite signs on the derivatives, viz. a positive objective function gradient with a negative constraint function gradient, can be handled via inverse variables. An active set strategy is used to handle lower and upper bounds. While fixed-point iteration algorithms can be found in the structural optimization literature [1-8], these are presented without clearly stating the assumptions under which convergence may be achieved, and in fact have been observed not to converge in certain situations without an explanation [5]. On the other hand, these algorithms have built-in bells and whistles relating to step size control and Lagrange multiplier updates that render them efficient on a wider variety of problems than targeted herein. Here, the algorithm targets general functions f and g within a subspace of functions, as opposed to problem specific applications in structural optimization. Further, within this general context, the fixed-point variable update formula is given physical significance.
For the subclass of problems considered, the algorithm presented is vastly superior to descent based NLP methods such as sequential quadratic programming, the generalized reduced gradient method or feasible directions. In the latter category of methods, a line search needs to be performed along a search direction, which is computationally expensive as it involves multiple simulations in order to evaluate the functions. Further, in fixed-point iterative methods, the number of function evaluations involved in reaching the optimum is only very weakly dependent on the number of variables [7], allowing an efficient solution of problems with a large number of variables. Fixed-point iterations have been applied extensively in the area of equation solving and in market equilibrium in economics, while hardly at all in nonlinear programming. The algorithm may be applied to a subclass of problems in resource allocation and inventory models, in addition to allocation of material in structures [9, 10].
2. Fixed-point iteration recurrence formulas
Fixed-point iterations have been used widely in equation solving. Two prevalent recurrence formulas will be illustrated in finding the root of a one variable equation, x = g(x). This will pave the way for the discussion of fixed-point iterations for solving optimization problems in the next section. Convergence is established by the following theorem [11, 12]:
Theorem 2.1. Assume α is a solution of x = g(x) and suppose that g(x) is continuously differentiable in some neighboring interval about α with λ ≡ |g'(α)| < 1. Then, provided the starting point x^{0} is chosen sufficiently close to α, the iterates x^{k+1} = g(x^{k}), k ≥ 0, converge to α, with lim_{k→∞} (x^{k+1} − α)/(x^{k} − α) = g'(α).
When x^{0} is sufficiently close, there exists an interval I containing α where g(I) ⊂ I, i.e. g is a contraction on the interval. For systems of equations with n variables, the above holds true with α and x being vectors and with the condition that all eigenvalues of the Jacobian are less than one in magnitude [12]. Two recurrence formulas are now discussed with the help of an example. Consider the solution of x = 10 cos x. Rather than generating a sequence as x^{k+1} = 10 cos(x^{k}), which is unstable, we use one of two techniques. Treating 10 cos(x^{k}) as the 'new guess' and x^{k} as the current value, we write

x^{k+1} = ω g(x^{k}) + (1 − ω) x^{k} = 10 ω cos(x^{k}) + (1 − ω) x^{k}    (1)

where ω = 1/2 represents an average of the two. The recurrence relation in Eq. (1) converges to the root 1.4276 provided ω ≤ 1/11, for a sufficiently close starting guess, using the theorem above. Generally, ω ∈ (0,1) and is reduced if the iterations oscillate. If convergence is stable but slow, then ω may be increased. ω = 0.25 has been taken as the default value for the examples in this paper.
Alternatively, we may multiply both sides of x = 10 cos x by x^{p−1} and take the pth root, to obtain the recurrence relation

x^{k+1} = [(x^{k})^{p−1} · 10 cos(x^{k})]^{1/p}    (2)

which, for this example, converges for p ≥ 6 provided the starting guess is close enough. The formulas in Eqs. (1) and (2) will be used for optimization problems as discussed subsequently.
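The two recurrences can be checked in a few lines (a sketch; the starting point, damping factor and iteration counts are arbitrary choices, not values from the text):

```python
import math

def damped_iteration(x, w=0.05, iters=200):
    """Eq. (1): x_{k+1} = w * 10*cos(x_k) + (1 - w) * x_k."""
    for _ in range(iters):
        x = w * 10 * math.cos(x) + (1 - w) * x
    return x

def pth_root_iteration(x, p=6, iters=200):
    """Eq. (2): x_{k+1} = (x_k**(p-1) * 10*cos(x_k)) ** (1/p)."""
    for _ in range(iters):
        x = (x ** (p - 1) * 10 * math.cos(x)) ** (1.0 / p)
    return x

# Both recurrences settle on the root 1.4276 from a nearby start,
# whereas the undamped iteration x_{k+1} = 10*cos(x_k) oscillates wildly.
```

Starting from x = 1.0, both sequences converge to 1.4276, consistent with the conditions ω ≤ 1/11 and p ≥ 6 stated above.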
3. Assumptions and algorithm
The assumptions made, algorithm development, procedural steps, extensions and a physical interpretation of the fixed-point scheme are discussed in this section.
3.1. Assumptions
The subclass of NLP problems considered here is of the form

minimize f(x)
subject to g(x) ≤ 0 (single constraint) and x ≥ x^{L} > 0    (3)
where x^{L} are lower bounds. Descent based methods like sequential quadratic programming and the others referred to above update the variables as

x^{k+1} = x^{k} + β_k d^{k}    (4)
where β_k is a step size chosen to ensure reduction of a descent or merit function, d^{k} is a direction vector and k is an iteration index. In fixed-point methods, on the other hand, iterations are based on the form

x^{k+1} = φ(x^{k})    (5)

where φ : R^{n} → R^{n}. Three assumptions are made:
Assumption 3.1. All functions are C^{1} continuous, i.e. continuously differentiable.
Assumption 3.2. ∂f/∂x_j < 0 and ∂g/∂x_j > 0 for all j = 1,...,n.
Assumption 3.3. x_j > 0, j = 1,...,n. That is, variables are real and positive.
Generally speaking, Assumption 3.2 restricts the class of problems considered to those where increasing the available resource reduces the objective function value. For instance, more material reduces deflection (but not necessarily stress), or greater effort increases the probability of finding the treasure or target. Problems where ∂f/∂x_r > 0 and ∂g/∂x_r < 0 for some r can be handled by working with the reciprocal variable y_r = 1/x_r.
3.2. Development of the fixed-point algorithm
The optimality conditions reduce to minimizing the Lagrangian L = f + μ g subject to the lower bounds on x_j. For x_j > x_j^{L} we solve the optimality condition

∂f/∂x_j + μ ∂g/∂x_j = 0

The main issue in obtaining a minimum point is ensuring μ ≥ 0. Here, our assumptions on the signs of the derivatives (i.e. opposite monotonicity of the functions) guarantee, from the optimality condition above, that μ = −(∂f/∂x_j)/(∂g/∂x_j) ≥ 0. With lower bounds, the same result is true, since at points where x_j = x_j^{L} stationarity only requires ∂f/∂x_j + μ ∂g/∂x_j ≥ 0, which with the sign assumptions again yields μ ≥ 0. Further, it readily follows that the single constraint is active (or essential) at the optimum, since increasing a variable x_j will reduce the objective while increasing the constraint value until it reaches its limit. Thus, with the stated assumptions, the solution of (3) using optimality conditions yields a KKT point.
The fixed-point iteration is derived as follows. The constraint g(x) ≤ 0 is linearized as c^{T}x ≤ c_0, where c_j = ∂g/∂x_j and c_0 = c^{T}x^{k} − g(x^{k}), with the derivatives evaluated at the current point x^{k}. Working with the linearized constraint, the local optimization problem may be stated as {min. f(x) : c^{T}x ≤ c_0, x ≥ x^{L}}. We define a Lagrangian function L = f + μ (c^{T}x − c_0), and an active set I = {j : x_j = x_j^{L}}. The optimal point x is obtainable from ∂L/∂x_j = 0 for the free variables. The optimality conditions can be stated as

j ∈ I : x_j = x_j^{L},    j ∉ I : ∂f/∂x_j + μ c_j = 0    (6)
From Eq. (6) we have

j ∈ I : x_j = x_j^{L}
j ∉ I : x_j = (−x_j ∂f/∂x_j) / (μ c_j)    (7)
Substituting Eq. (7) into c^{T}x = c_0 gives an expression for μ, which when substituted back into Eq. (7) gives

j ∈ I : x_j = x_j^{L}
j ∉ I : x_j = E_j (c_0 − Σ_{i∈I} c_i x_i^{L}) / c_j    (8)
where

E_j = (−x_j ∂f/∂x_j) / Σ_{i∉I} (−x_i ∂f/∂x_i)    (9)
A starting point x^{0} > x^{L} is used, which may be feasible or infeasible with respect to the constraint. We use the fixed-point recurrence relation or 'resizing' formula, with all quantities on the right evaluated at x^{k}:

j ∈ I : x_j^{trial} = x_j^{L}
j ∉ I : x_j^{trial} = E_j (c_0 − Σ_{i∈I} c_i x_i^{L}) / c_j    (10)
As per Eq. (1), we use for j ∉ I

x_j^{k+1} = ω x_j^{trial} + (1 − ω) x_j^{k},    ω ∈ (0,1)    (11a)

Or, as per Eq. (2), we use for j ∉ I

x_j^{k+1} = (x_j^{k})^{(p−1)/p} (x_j^{trial})^{1/p}    (11b)
Basic steps are given in the algorithm below.
3.3. Algorithm steps
Input Data:
x^{0}, x^{L}, ω (or p), g_max, move limit Δ, iteration limit, tol_abs, tol_rel
Set the initial active set I_0 = ∅ (the empty set)
Typical Iteration:
i. Evaluate the derivatives ∂f/∂x_j and ∂g/∂x_j, and then evaluate c, c_0, E_j.
ii. Compute x^{trial} from Eq. (10) and then x^{k+1} from Eq. (11a) or (11b).
iii. Reset x^{k+1} to satisfy the move limit Δ as well as to satisfy the bounds.
iv. Based on x^{k+1}, update the active set from I_k to I_{k+1}. A free variable that has reached a bound is included in the active set depending on the sign of ∂L/∂x_j for that variable. Also, variables in I_k are tested to determine whether they should become free.
v. Stopping criteria are based on (i) an iteration limit (equal to the number of function calls), and (ii) on whether

|f^{k+1} − f^{k}| ≤ tol_abs + tol_rel |f^{k}|  and  g(x^{k+1}) ≤ g_max

which should hold for, say, 5 consecutive iterations.
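The steps above can be sketched as follows. This is a minimal illustration under the stated assumptions, not the author's code: the active-set test is reduced to clamping at the lower bound, move limits are omitted, and stopping is based on the step size alone.

```python
import numpy as np

def fixed_point_solve(grad_f, g, grad_g, x0, xL, w=0.25, max_iter=200, tol=1e-9):
    """Fixed-point 'resizing' iteration (Eqs. 9-11a) for
    min f(x) s.t. g(x) <= 0, x >= xL > 0,
    assuming df/dx_j < 0 and dg/dx_j > 0 everywhere (Assumptions 3.2-3.3)."""
    x = np.array(x0, dtype=float)
    xL = np.asarray(xL, dtype=float)
    for _ in range(max_iter):
        c = grad_g(x)                   # linearized constraint: c^T x <= c0
        c0 = c @ x - g(x)
        active = np.isclose(x, xL)      # variables held at their lower bound
        e = -x * grad_f(x)              # positive by Assumption 3.2
        e[active] = 0.0
        E = e / e.sum()                 # Eq. (9)
        avail = c0 - c[active] @ xL[active]             # resource left for free vars
        trial = np.where(active, xL, E * avail / c)     # Eq. (10)
        x_new = np.maximum(w * trial + (1 - w) * x, xL) # Eq. (11a) + bounds
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x
```

For instance, with f = 1/x_1 + 1/x_2 and g = 2x_1 + x_2 − 1, the iterates settle at x ≈ (0.293, 0.414), the KKT point of that problem.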
3.4. Extensions of the algorithm
The aforementioned fixed-point algorithm can be applied to problems which are extensions of problem (3), as discussed below.
If the objective is to minimize f where ∂f/∂x_j > 0 for all j, subject to g(x) ≤ 0 where ∂g/∂x_j < 0, then we define reciprocal variables y_i = 1/x_i, i = 1,...,n. Defining f(x) → f(1/y) ≡ f^{new}(y) and similarly g^{new}(y), we obtain the problem:

minimize f^{new}(y)
subject to g^{new}(y) ≤ 0, and y > 0
The switch in the signs of the derivatives allows the algorithm to be applied. For example, the algorithm can be applied to minimize the compliance subject to a mass limit, or to minimize the mass subject to a compliance limit with reciprocal variables.
Upper bounds are present in some problems. In topology optimization, for example, the bounds 0 ≤ x_i ≤ 1 are imposed on the pseudo-densities. The interpretation in Section 3.5 is generalized, viz. c_0 − Σ_{i∈I} c_i x_i now represents the available resource after accounting for variables held at either their lower or upper bounds. If a variable x_i ≡ θ represents an angle and crosses zero during the iterations, it can be substituted by a shifted variable that remains positive.
3.5. Physical interpretation of variable update formula
Importantly, the fixed-point variable update in Eq. (10) can be physically interpreted as follows. Firstly, the derivative ∂f/∂x_j has often been termed a 'sensitivity coefficient' since it represents, to within a linear approximation, the change in f due to a unit change in x_j. Similarly, the term x_j ∂f/∂x_j represents the change in f due to a unit fractional change in x_j, again within a linear approximation. Thus, as x → x^{*}, the contribution of the jth term (c_j x_j) in the constraint

c_1 x_1 + c_2 x_2 + ... + (c_j x_j) + ... + c_n x_n = c_0

is 'strengthened' or 'trimmed' based on the relative impact, or 'relative return on investment', the variable has on the objective. In structural optimization, this update is referred to as a 'resizing' since the {x_j} refer to member sizes. Here, the contribution to the literature is that a physical interpretation has been provided for general functions f and g, as opposed to specific functions pertaining to structural optimization applications. Alternatively, dividing through by c_0, we may interpret this as splitting unity into different parts, each part reflecting the impact on the objective. Lastly, c_0 − Σ_{i∈I} c_i x_i^{L} represents the available resource after accounting for variables at their bounds; the variable updates for the free x_j thus compete only for the resource that is available. The resizing of variables can be expressed as "reallocate term (c_j x_j) = (fractional reduction in objective) × (available resource)". Interestingly, the term x_j ∂f/∂x_j has only the units of f and is independent of the units of x_j. Also, the iterations are driven by sensitivity information only, whereas in NLP based descent methods the function f is also monitored during the line search phase.
4. Applications
4.1. Searching for a missing vessel
Consider the problem of searching for a lost object, where it is to be determined how a resource b, measured in time units, must be spent to find, with the largest probability, an object located in one of n subdivisions of the area [9]. A standard formulation is

maximize Σ_{j=1}^{n} p_j (1 − e^{−a_j x_j})
subject to Σ_{j=1}^{n} x_j ≤ b, x > 0

where p_j is the probability that the object lies in subdivision j and a_j is a detection-rate parameter. Defining f as the negative of the objective for minimization, we have ∂f/∂x_j = −p_j a_j e^{−a_j x_j} < 0, and the derivative of the constraint function g ≤ 0 with respect to each x_j is 1. Further, the variables are positive. The fixed-point algorithm can thus be applied.
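A small numerical check of this application (the data p, a, b below are hypothetical, chosen only for illustration): since the constraint is linear with c_j = 1 and c_0 = b, the resizing loop is only a few lines, and at convergence the marginal returns p_j a_j e^{−a_j x_j} equalize across subdivisions, as the KKT conditions require.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # assumed prior probabilities per subdivision
a = np.array([1.0, 2.0, 4.0])   # assumed detection-rate parameters
b = 10.0                        # total search time available

w, x = 0.25, np.full(3, b / 3)  # start from a uniform split
for _ in range(500):
    df = -p * a * np.exp(-a * x)        # df/dx_j < 0, as Assumption 3.2 requires
    E = (-x * df) / np.sum(-x * df)     # Eq. (9); here c_j = 1 and c0 = b
    x = w * (E * b) + (1 - w) * x       # Eq. (11a)

# At the optimum the marginal returns equalize across subdivisions
marginal = p * a * np.exp(-a * x)
```

With this data the iterates settle at x ≈ (5.60, 2.89, 1.52), which matches the analytical solution of the equal-marginal-return conditions.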
4.2. Optimum allocation in stratified sampling
Consider the problem of determining the average of a certain quantity in a large population M. The population is stratified into n strata, each having a population M_j, with x_j the number of samples chosen from stratum j. The total sample is to be allocated to the strata so as to obtain the minimum variance of the global estimate [9]:

minimize V(x) = Σ_{j=1}^{n} W_j^{2} s_j^{2} / x_j
subject to Σ_{j=1}^{n} x_j ≤ b, x > 0

where W_j = M_j/M, s_j^{2} is an appropriate estimate of the variance in each stratum, and b is the total sample size. We have ∂V/∂x_j = −W_j^{2} s_j^{2}/x_j^{2} < 0, and the derivative of the constraint function g ≤ 0 is 1. Further, the variables are positive. The fixed-point algorithm can thus be applied.
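Again with hypothetical data (the W, s and b values below are assumed), the same resizing loop recovers the classical Neyman allocation x_j ∝ W_j s_j, which is the known closed-form optimum of this problem:

```python
import numpy as np

W = np.array([0.5, 0.3, 0.2])   # stratum weights W_j = M_j / M (assumed)
s = np.array([2.0, 1.0, 3.0])   # stratum standard deviations (assumed)
b = 100.0                       # total sample size

w, x = 0.25, np.full(3, b / 3)  # start from a uniform allocation
for _ in range(300):
    u = (W * s) ** 2 / x        # -x_j dV/dx_j for V = sum W_j^2 s_j^2 / x_j
    x = w * (b * u / u.sum()) + (1 - w) * x   # Eqs. (9)-(11a), c_j = 1, c0 = b

neyman = b * W * s / (W * s).sum()   # closed-form optimum for comparison
```

The fixed point of the map satisfies x_j^2 ∝ W_j^2 s_j^2, i.e. exactly the Neyman rule, illustrating how the KKT point of the problem appears as a fixed point of the update.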
4.3. Structural optimization for minimum compliance
The problem of allocating material within a structural domain to minimize compliance (i.e. maximize stiffness) satisfies the assumptions required to use the fixed-point algorithm. Specifically, f(x) = F^{T}U, with K(x)U = F. Here, K = global stiffness matrix, U = global displacement vector, F = global force vector, and x_i represents either a cross-sectional area or an element density. Using the adjoint method [13], it can be shown that if the element stiffness matrix is of the form k_i = x_i^{r} k̄_i with the parameter r > 0, then ∂f/∂x_i = −r e_i / x_i, where e_i = q^{T} k q is the element strain energy. Note that q = the element displacement vector and k = the element stiffness matrix. Since e_i ≥ 0 owing to k being positive semidefinite, we have ∂f/∂x_i ≤ 0. Further, the assumption ∂g/∂x_i > 0 readily follows when g represents the total volume of material. In view of −x_i ∂f/∂x_i = r e_i, the fraction E_j in (9) takes the form

E_j = e_j / Σ_{i∉I} e_i    (13)

That is, the variable update or resizing is governed by the energy in the jth element as a fraction of the total energy. Thus, the many references in structural optimization to energy based optimality criteria methods are shown here to be special cases stemming from a general formula.
4.4. Real time allocation of resources
In view of the assumptions behind this algorithm, it can be expected to be useful in allocating a fixed amount of resources relating to security. This follows from the assumption that more security than needed in certain areas will not be harmful. Similarly, allocating a fixed amount of funds to school districts is a possible application. Since the algorithm is fast, a dynamic allocation is possible based on, say, 30-day moving averages of the data. These applications need to be explored.
5. Numerical examples
Examples are presented to show that the algorithm is far superior to a general nonlinear programming method for problems that satisfy Assumptions 3.1-3.3. The fixed-point algorithm is compared to MATLAB's constrained optimizers (the sequential quadratic programming and interior point optimizers under 'fmincon').
5.1. Cantilever beam
Consider the simple problem of allocating material to minimize the tip deflection of a cantilever beam with two length segments. With certain assumed problem data, the tip deflection takes the form f(x) = a_1/x_1 + a_2/x_2 with a_1, a_2 > 0, and the problem is:

minimize f(x) = a_1/x_1 + a_2/x_2

subject to 2 x_1 + x_2 ≤ 1
x > 0

From Eq. (9), E_j = (a_j/x_j) / (a_1/x_1 + a_2/x_2). The recurrence formula in Eq. (11a) becomes

x_j^{k+1} = ω E_j c_0/c_j + (1 − ω) x_j^{k}

where {c_i} = [2 1], c_0 = 1. The optimum is x^{*} = [0.426, 0.169]. Based on Theorem 2.1, it can be readily shown that the fixed-point iteration converges for ω ∈ (0,1).
5.2. Searching for a missing vessel
Table 1
Searching for a missing vessel

n         NFE^{1}, fixed-point algorithm^{2}   NFE, NLP algorithm^{2}   f^{*} fixed-point, f^{*} NLP
10        37                                   209                      −2.338, −2.339
100       67                                   4,041                    −5.733, −5.750
1000^{3}  93                                   20,021^{3}               −9.378, −9.240

^{1} NFE = Number of Function Evaluations
^{2} max. constraint violation = 0.001, max NFE = 100, in the fixed-point code
^{3} NLP took 19 iterations and 20,021 function calls
5.3. Randomized test problem
A set of randomized test problems of the form:

minimize f(x) = Σ_{i=1}^{t_0} c_{0i} Π_{j=1}^{n} x_j^{a_{0ij}}

subject to g_1(x) ≡ Σ_{i=1}^{t_1} c_{1i} Π_{j=1}^{n} x_j^{a_{1ij}} ≤ 1

0 < x^{L} ≤ x ≤ x^{U}

with the restriction that all coefficients c_{ki} > 0, all objective exponents a_{0ij} ≤ 0, and all constraint exponents a_{1ij} ≥ 0. Note that c_{ki} = coefficient of the kth function, ith term, while a_{kij} = exponent corresponding to the kth function, ith term, and jth variable. Derivatives are calculated analytically.
Data: x^{L} = 10^{−6}, x^{U} = 1, t_0 = t_k = 8 terms, x^{0} = 0.5, with the coefficients and exponents generated randomly within fixed ranges. The random data are scaled to ensure that the constraint is active at the optimum.
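A generator for one such random instance can be sketched as follows (the sampling ranges and seed are assumptions, since the paper's exact ranges are not reproduced here); the sign pattern of the exponents enforces Assumption 3.2:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t0, t1 = 10, 8, 8                      # t_0 = t_k = 8 terms, as in the data above

c_obj = rng.uniform(0.1, 1.0, t0)         # positive coefficients (assumed range)
a_obj = -rng.uniform(0.1, 1.0, (t0, n))   # a_0ij < 0  =>  df/dx_j < 0
c_con = rng.uniform(0.1, 1.0, t1)
a_con = rng.uniform(0.1, 1.0, (t1, n))    # a_1ij > 0  =>  dg1/dx_j > 0

def f(x):
    """Objective: sum of t0 posynomial-like terms with negative exponents."""
    return np.sum(c_obj * np.prod(x ** a_obj, axis=1))

def g1(x):
    """Constraint g1(x) <= 0, with positive exponents throughout."""
    return np.sum(c_con * np.prod(x ** a_con, axis=1)) - 1.0
```

A quick finite-difference check at x^{0} = 0.5 confirms the required monotonicity: perturbing any variable upward decreases f and increases g_1.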
Table 2 shows results for various randomly generated trials. The fixed-point iteration method far outperforms the NLP routine for the class of problems considered. The total computation in NLP methods increases with n, while fixed-point iteration methods are insensitive to it, as also noted in the structural optimization literature.
Table 2
Randomized test problem

n       NFE^{4}, fixed-point algorithm^{1}   NFE, NLP algorithm^{2}   f^{*} fixed-point / f^{*} NLP^{2}
10      50                                   363                      1.027
20      50                                   1,131                    1.030
40^{3}  50                                   3,008                    1.003

^{1} fixed value
^{2} averaged over 5 random trials
^{3} larger values of n cannot be solved in reasonable time by the NLP routine
^{4} NFE = Number of Function Evaluations
6. Conclusions
In this paper, an optimality criteria based fixed-point iteration is developed for a class of nonlinear programming problems. The class of problems requires that the variables are positive, the derivatives of f are negative, and the derivatives of g are positive. Certain extensions of this subclass of problems have been given. Hitherto, fixed-point methods were developed only in a problem-specific manner in the field of structural optimization. Convergence aspects are discussed. The fixed-point iteration algorithms to be found in the structural optimization literature, albeit targeting more general problems than considered here, are presented without clearly stating the assumptions under which convergence may be achieved, and in fact have been observed not to converge in certain situations. Moreover, these have been developed for problem specific applications in structural optimization and have not targeted general functions f and g even within a subspace of functions. Importantly, the fixed-point update, within this general context, is given physical significance in this paper.
Results show that for the subclass of problems considered, fixed-point iterations far outperform an NLP method. This is a general result because the fixed-point iteration method does not involve a line search, a main step in NLP methods, which requires computationally expensive multiple function evaluations even with the use of approximations. A fixed-point algorithm is insensitive to n, whereas in NLP methods the number of iterations increases significantly with n. In the algorithm presented, the value of ω (or p) requires, at most, a one-time adjustment based on a simple rule: if oscillations in the maximum constraint violation are noticed during the iterations, then ω is reduced (or p is increased). A smaller value of ω than needed also works, except that more iterations are required, as there is more emphasis on the previous point. A default value of ω = 0.25 has worked well on the examples here. New applications of the algorithm need to be explored.
References
[1] Venkayya V.B., Design of optimum structures, Computers and Structures 1971, 1, 265-309.
[2] Berke L., An efficient approach to the minimum weight design of deflection limited structures, AFFDL-TM-70-4-FDTR, Flight Dynamics Laboratory, Wright-Patterson AFB, OH 1970.
[3] Khot N.S., Berke L., Venkayya V.B., Comparison of optimality criteria algorithms for minimum weight design of structures, AIAA Journal 1979, 17(2), 182-190.
[4] Dobbs M.W., Nelson R.B., Application of optimality criteria to automated structural design, AIAA Journal 1975, 14(10), 1436-1443.
[5] Khan M.R., Willmert K.D., Thornton W.A., Optimality criterion method for large-scale structures, AIAA Journal 1979, 17(7), 753-761.
[6] McGee O.G., Phan K.F., A robust optimality criteria procedure for cross-sectional optimization of frame structures with multiple frequency limits, Computers and Structures 1991, 38(5/6), 485-500.
[7] Yin L., Yang W., Optimality criteria method for topology optimization under multiple constraints, Computers and Structures 2001, 79, 1839-1850.
[8] Belegundu A.D., A general optimality criteria algorithm for a class of engineering optimization problems, Engineering Optimization 2015, 47(5), 674-688.
[9] Patriksson M., A survey on the continuous nonlinear resource allocation problem, European Journal of Operational Research 2008, 185(1), 1-46.
[10] Jose J.A., Klein C.A., A note on multi-item inventory systems with limited capacity, Operations Research Letters 1988, 7(2), 71-75.
[11] Atkinson K.E., An Introduction to Numerical Analysis, John Wiley, 1978.
[12] Bryant V., Metric Spaces: Iteration and Application, Cambridge University Press, 1985.
[13] Belegundu A.D., Chandrupatla T.R., Optimization Concepts and Applications in Engineering, 2nd edition, Cambridge University Press, 2011.