Channel: Yet Another Math Programming Consultant

Solution methods for linear bilevel problems

For some classes of problems, a useful strategy is to form explicit KKT optimality conditions and convert the resulting complementarity conditions to big-M constraints. If the original problem is linear (or in some cases quadratic), the resulting model is a MIP model. One such application is solving non-convex QP models [1]. Another application is a bilevel LP, where we replace the inner problem by its optimality conditions. In [2], such a problem is given:



bilevel LP
\[\begin{align}\max\> &\color{darkred}z = \color{darkred}x+\color{darkred}y \\ & \color{darkred}x \in [0,2] \\ & \min \>\color{darkred}y\\ & \qquad \color{darkred}y \ge 0 \perp \color{darkred} \lambda_1 \\ & \qquad \color{darkred}x - 0.01\color{darkred}y \le 1 \perp \color{darkred}\lambda_2 \end{align} \]

The notation \(\perp\) indicates the name of the dual variable for the inner constraints. After the inner problem is converted to optimality conditions, we have:

Inner LP as optimality conditions
\[\begin{align}\max\> & \color{darkred}z = \color{darkred}x+\color{darkred}y \\ & \color{darkred}x \in [0,2]\\ & 1 - \color{darkred}\lambda_1 - 0.01 \color{darkred}\lambda_2 = 0 \\ & \color{darkred}y \ge 0 \\ & \color{darkred}x - 0.01\color{darkred}y \le 1\\ & \color{darkred}\lambda_1 \cdot \color{darkred}y = 0 \\ & \color{darkred}\lambda_2 \cdot \left( \color{darkred}x -0.01\color{darkred}y - 1 \right) = 0 \\ & \color{darkred}\lambda_1,\color{darkred}\lambda_2\ge 0 \end{align} \]
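To make these conditions concrete, here is a minimal Python sketch that evaluates them at a candidate point. The function names and the tolerance are my own choices for illustration; nothing here comes from [2].

```python
def kkt_residuals(x, y, lam1, lam2):
    """Residuals of the inner-LP optimality conditions at a candidate point.

    Every entry is zero (up to rounding) when the point satisfies them.
    """
    return {
        "stationarity": 1.0 - lam1 - 0.01 * lam2,
        "primal_y": min(y, 0.0),                          # y >= 0
        "primal_con": max(x - 0.01 * y - 1.0, 0.0),       # x - 0.01*y <= 1
        "dual_sign": min(lam1, lam2, 0.0),                # lambda >= 0
        "compl_1": lam1 * y,                              # lambda1 * y = 0
        "compl_2": lam2 * (x - 0.01 * y - 1.0),           # lambda2 * (...) = 0
    }


def satisfies_kkt(x, y, lam1, lam2, tol=1e-6):
    """True if all residuals are within the (assumed) tolerance."""
    return all(abs(r) <= tol for r in kkt_residuals(x, y, lam1, lam2).values())
```

For example, `satisfies_kkt(2, 100, 0, 100)` and `satisfies_kkt(1, 0, 1, 0)` both hold, while `satisfies_kkt(1.5, 100, 0, 100)` fails on the second complementarity condition.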

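Each product can be interpreted as an "or" condition: for example, \(\color{darkred}\lambda_1 \cdot \color{darkred}y = 0\) means \(\color{darkred}\lambda_1 = 0\) or \(\color{darkred}y = 0\). These conditions can be linearized using binary variables and big-M constraints: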

Linearized problem
\[\begin{align}\max\> & \color{darkred}z = \color{darkred}x+\color{darkred}y \\ & \color{darkred}x \in [0,2]\\ & 1 - \color{darkred}\lambda_1 - 0.01 \color{darkred}\lambda_2 = 0 \\ & \color{darkred}y \ge 0 \\ & \color{darkred}x - 0.01\color{darkred}y \le 1\\ & \color{darkred}\lambda_1 \le \color{darkblue} M_1^d \color{darkred}\delta_1 \\ & \color{darkred}y \le \color{darkblue} M_1^p (1-\color{darkred}\delta_1) \\ & \color{darkred}\lambda_2 \le \color{darkblue} M_2^d \color{darkred}\delta_2 \\ & -\color{darkred}x +0.01\color{darkred}y + 1 \le \color{darkblue} M_2^p (1-\color{darkred}\delta_2) \\ & \color{darkred}\lambda_1,\color{darkred}\lambda_2\ge 0 \\ & \color{darkred}\delta_1,\color{darkred}\delta_2 \in \{0,1\} \end{align} \]

The letters \(p\) and \(d\) in the big-M constants indicate primal and dual. Note that we flipped the sign of the second constraint of the inner problem, writing it as \(-\color{darkred}x + 0.01\color{darkred}y + 1 \ge 0\). This was to make \(\color{darkred}\lambda_2\) a non-negative variable.

Choosing big-M values


One big issue is choosing appropriate values for the big-M constants. The values \(\color{darkblue} M_1^d = \color{darkblue} M_1^p = \color{darkblue} M_2^d = \color{darkblue} M_2^p = 200\) are valid bounds on \(\color{darkred}\lambda_1\), \(\color{darkred}y\), \(\color{darkred}\lambda_2\), and \(-\color{darkred}x +0.01\color{darkred}y + 1\), respectively. This gives the solution:

                     LOWER     LEVEL     UPPER

---- VAR x               .    2.0000    2.0000
---- VAR y               .  100.0000      +INF
---- VAR lambda1         .         .      +INF
---- VAR lambda2         .  100.0000      +INF
---- VAR delta1          .         .    1.0000
---- VAR delta2          .    1.0000    1.0000
---- VAR z            -INF  102.0000      +INF
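
As a quick check that 200 is large enough, the bounds can be derived by hand. This is my own derivation, not taken from [2], and it assumes we only need the bounds to hold at points that satisfy the optimality conditions with \(\color{darkred}x \in [0,2]\):

\[\begin{align} & \color{darkred}\lambda_1 = 1 - 0.01\color{darkred}\lambda_2 \le 1 \\ & \color{darkred}\lambda_2 = 100\,(1 - \color{darkred}\lambda_1) \le 100 \\ & \color{darkred}y > 0 \Rightarrow \color{darkred}\lambda_1 = 0 \Rightarrow \color{darkred}\lambda_2 = 100 \Rightarrow \color{darkred}x - 0.01\color{darkred}y = 1 \Rightarrow \color{darkred}y = 100\,(\color{darkred}x - 1) \le 100 \\ & -\color{darkred}x + 0.01\color{darkred}y + 1 \le 0 + 0.01 \cdot 100 + 1 = 2 \end{align} \]

Under these assumptions, any value of at least 100 would do for all four constants, so 200 is safe.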


Choosing the right big-M values is important. Values that are too small can cut off optimal solutions. Values that are too large cause other problems, such as numerical issues and trickle flow. In [2] it is argued that many modelers use the following algorithm to detect whether a big-M value is too small:

  1. Select initial values for \( \color{darkblue} M_j^d,  \color{darkblue} M_j^p \).
  2. Solve the linearized MIP model.
  3. Look for binding big-M constraints that should be inactivated. If not found, stop: we have found the optimal solution.
  4. Increase the corresponding big-M value.
  5. Go to step 2.
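
The steps above can be sketched in Python as follows. This is only an illustration under assumptions of mine: `solve` is a hypothetical callback that solves the linearized MIP for given big-M values, the layout of the dictionary it returns is invented here, and doubling is an arbitrary choice for step 4 (the steps do not say by how much to increase).

```python
def increase_until_not_binding(solve, M_d, M_p, factor=2.0, max_iter=20, tol=1e-6):
    """Sketch of steps 1-5.

    `solve(M_d, M_p)` is a hypothetical callback returning a dict with lists
    'lam' (duals), 'primal' (the quantities bounded by M_p) and 'delta'.
    """
    M_d, M_p = list(M_d), list(M_p)            # step 1: initial values
    for _ in range(max_iter):
        sol = solve(M_d, M_p)                  # step 2: solve the MIP
        # Step 3: a big-M constraint is binding when the bounded quantity
        # sits at M on the side the binary variable has relaxed.
        tight_d = [j for j, d in enumerate(sol["delta"])
                   if d == 1 and abs(sol["lam"][j] - M_d[j]) <= tol]
        tight_p = [j for j, d in enumerate(sol["delta"])
                   if d == 0 and abs(sol["primal"][j] - M_p[j]) <= tol]
        if not tight_d and not tight_p:
            return sol, M_d, M_p               # stop: accepted as optimal
        for j in tight_d:                      # step 4: enlarge binding M's
            M_d[j] *= factor
        for j in tight_p:
            M_p[j] *= factor
    raise RuntimeError("big-M values still binding after max_iter rounds")
```

Note that the loop stops as soon as no big-M constraint is tight; it has no other way to tell whether the returned solution is truly optimal.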

Unfortunately, this algorithm is not guaranteed to work. [2] gives an example: use as starting values \(\color{darkblue} M_j^d = 50, \color{darkblue} M_j^p = 200\). The solution looks like:


                     LOWER     LEVEL     UPPER

---- VAR x               .    2.0000    2.0000
---- VAR y               .         .      +INF
---- VAR lambda1         .    1.0000      +INF
---- VAR lambda2         .         .      +INF
---- VAR delta1          .    1.0000    1.0000
---- VAR delta2          .    1.0000    1.0000
---- VAR z            -INF    2.0000      +INF

The values \(\color{darkred}\delta_j=1\) indicate we should look at the constraints: \(\color{darkred}\lambda_j \le \color{darkblue}M_j^d\). As \(\color{darkred}\lambda_1=1, \color{darkred}\lambda_2=0\) these are not binding. Hence, the erroneous conclusion is that this solution is optimal. 
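
To see why the true optimum is lost here, consider the point \(\color{darkred}x=2, \color{darkred}y=100\) found earlier with all big-M values set to 200. A short derivation from the optimality conditions (my own working, not quoted from [2]) gives:

\[\begin{align} & \color{darkred}y = 100 > 0 \Rightarrow \color{darkred}\lambda_1 = 0 \\ & 1 - \color{darkred}\lambda_1 - 0.01\color{darkred}\lambda_2 = 0 \Rightarrow \color{darkred}\lambda_2 = 100 > \color{darkblue}M_2^d = 50 \end{align} \]

So the constraint \(\color{darkred}\lambda_2 \le \color{darkblue}M_2^d \color{darkred}\delta_2\) makes that point infeasible, while in the solution the MIP does return, no dual big-M constraint is tight.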

Our conclusion is: we cannot reliably detect that big-M values are chosen too small by searching for binding constraints.

Alternatives


In [3], it is argued that SOS1 (special ordered sets of type 1) variables should be used instead of binary variables. For our example, this means we would replace our big-M constraints by: 


SOS1 representation
\[\begin{align} & \color{darkred}\lambda_1, \color{darkred}y \in SOS1 \\ & \color{darkred}\lambda_2, \color{darkred}s \in SOS1 \\ &\color{darkred}s = -\color{darkred}x+0.01\color{darkred}y+1 \\ & \color{darkred}s \ge 0 \end{align} \]


Many MIP solvers (but not all) support SOS1 variables. Binary variables can sometimes be faster, but SOS1 variables have the advantage that no big-M constants are needed. The same can be said for indicator constraints. A formulation using indicator constraints can look like:
 

Indicator constraints
\[\begin{align} & \color{darkred} \delta_1 = 0 \Rightarrow \color{darkred}\lambda_1 =0 \\ & \color{darkred} \delta_1 = 1\Rightarrow \color{darkred}y = 0 \\ & \color{darkred} \delta_2 = 0 \Rightarrow \color{darkred}\lambda_2 = 0 \\ & \color{darkred} \delta_2 = 1 \Rightarrow \color{darkred}x-0.01\color{darkred}y = 1 \\ & \color{darkred}x-0.01\color{darkred}y \le 1 \\ & \color{darkred} \delta_1,\color{darkred}\delta_2 \in \{0,1\} \end{align} \]
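
Both formulations encode the same either-or logic without any big-M constant. As a rough illustration, the Python sketch below enumerates the four cases by hand and solves the small LP in \((x,y)\) that remains in each case. It is a toy substitute for what a MIP solver does by branching, not a description of any solver's implementation. The helper names are mine, and the vertex enumeration assumes each case has a bounded feasible region (true for this example).

```python
from itertools import combinations, product


def max_x_plus_y(cons, tol=1e-9):
    """Maximize x + y subject to a*x + b*y <= c for each (a, b, c) in cons.

    Tiny vertex enumeration; assumes the feasible region is bounded.
    Returns (z, x, y), or None if no feasible vertex is found.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel constraints: no unique intersection
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-7 for a, b, c in cons):
            if best is None or x + y > best[0]:
                best = (x + y, x, y)
    return best


# Constraints valid in every case: 0 <= x <= 2, y >= 0, x - 0.01*y <= 1.
BASE = [(-1, 0, 0), (1, 0, 2), (0, -1, 0), (1, -0.01, 1)]


def solve_by_cases():
    """Enumerate which member of each complementary pair is fixed to zero."""
    best = None
    for pair1, pair2 in product(("y=0", "lambda1=0"), ("s=0", "lambda2=0")):
        if pair1 == "lambda1=0" and pair2 == "lambda2=0":
            continue  # 1 - lambda1 - 0.01*lambda2 = 0 cannot hold
        cons = list(BASE)
        if pair1 == "y=0":
            cons.append((0, 1, 0))       # y <= 0, combined with y >= 0
        if pair2 == "s=0":
            cons.append((-1, 0.01, -1))  # x - 0.01*y >= 1, so slack s = 0
        sol = max_x_plus_y(cons)
        if sol is not None and (best is None or sol[0] > best[0]):
            best = sol
    return best
```

Calling `solve_by_cases()` should recover \(z=102\) at \(x=2, y=100\), in agreement with the big-M solution obtained with all constants set to 200.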


We can also solve the quadratic model directly using global solvers like Antigone, Baron, Couenne, or Gurobi. Sometimes reasonable bounds are needed to help global solvers.

It is noted that the GAMS EMP tool can generate the quadratic model for you. I still had to specify to use a global solver for it to find the optimal solution.

Conclusion


Instead of using big-M constraints to handle complementarity conditions in linear bilevel optimization problems, there are a few alternatives:
  • SOS1 variables
  • Indicator constraints
  • Global solvers for non-convex quadratic problems

References


  1. Solving non-convex QP problems as a MIP, https://yetanothermathprogrammingconsultant.blogspot.com/2016/06/solving-non-convex-qp-problems-as-mip.html
  2. Salvador Pineda and Juan Miguel Morales, Solving Linear Bilevel Problems Using Big-Ms: Not All That Glitters Is Gold, September 2018, https://arxiv.org/pdf/1809.10448.pdf
  3. Thomas Kleinert and Martin Schmidt, Why there is no need to use a big-M in linear bilevel optimization: A computational study of two ready-to-use approaches, October 2020, http://www.optimization-online.org/DB_FILE/2020/10/8065.pdf
