Yet Another Math Programming Consultant

Sparsest solution: a difficult MIP

The problem of finding a solution vector that is as sparse as possible can be formulated as a MIP.

I was looking at solving an underdetermined system \[Ax=b\] i.e. the number of rows \(m\) is less than the number of columns \(n\). A basic solution \(x\) will in general have \(m\) nonzero elements.

The actual system was: \[Ax\approx b\] or to be more precise \[-\epsilon \le Ax-b\le \epsilon \] By adding slack variables \(s_i\) we can write this as: \[\begin{align}&Ax=b+s\\&-\epsilon\le s \le \epsilon\end{align} \] Now we have more wiggle room, and can try to find the vector \(x\) that has as many zero elements as possible.
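The slack reformulation is easy to check numerically. Below is a small NumPy sketch (random data, a hypothetical instance with the same sizes as in the post) verifying that a candidate \(x\) is feasible for the relaxed system exactly when the implied slack \(s = Ax - b\) stays within \([-\epsilon,\epsilon]\):

```python
import numpy as np

rng = np.random.default_rng(123)
m, n, eps = 20, 40, 0.05         # sizes and tolerance from the post
A = rng.normal(size=(m, n))      # random instance (hypothetical data)
b = rng.normal(size=m)

# Minimum-norm solution of Ax = b (A has full row rank almost surely)
x = np.linalg.lstsq(A, b, rcond=None)[0]

s = A @ x - b                    # implied slack
feasible = np.all(np.abs(s) <= eps)
print(feasible)                  # True: Ax = b holds up to rounding, so s is tiny
```

This solution is dense, of course; the whole point of the MIP formulations below is to trade the slack budget \(\epsilon\) for zeros in \(x\).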

Big-M formulation


The MIP model seems obvious: \[\begin{align} \min & \sum_j \delta_j \\ & Ax=b+s \\ & -M\delta_j \le x_j \le M\delta_j \\ & s_i \in [-\epsilon,\epsilon]\\ & \delta_j \in \{0,1\}\end{align}\] This turns out to be a problematic formulation. First, we have no good a priori value for \(M\). I used \(M=10{,}000\), but that gives some issues. Furthermore, the performance is horrible. For a very small problem with \(m=20, n=40\) we already see a solution time of about 20 minutes. With just 40 binary variables I expected something like a minute or two. The results with Cplex look like:


----     82 VARIABLE x.L  

j3 -0.026, j4 0.576, j7 0.638, j11 0.040, j12 -0.747, j14 0.039, j15 -0.169, j19 -0.088
j23 0.509, j31 -0.475, j34 -0.750, j35 -0.509


---- 82 VARIABLE delta.L

j3 2.602265E-6, j4 1.000, j7 1.000, j11 4.012925E-6, j12 1.000, j14 3.921840E-6
j15 1.000, j19 8.834778E-6, j23 1.000, j31 1.000, j34 1.000, j35 1.000


---- 82 PARAMETER statistics

m 20.000, n 40.000, epsilon 0.050, big M 10000.000
iterations 5.443661E+7, nodes 5990913.000, seconds 1004.422, nz(x) 12.000
obj 8.000, gap EPS

Well, we have some problems here. The optimal objective is 8, so 8 of the \(\delta_j\)'s are equal to one. But at the same time the number of nonzero values in \(x\) is 12. The reason is that a few \(\delta_j\)'s are very small: Cplex considers them as being zero (they are within its integer feasibility tolerance), while the model exploits these small values for what is sometimes called "trickle flow". These results are just not reliable. The underlying cause is a relatively large value for \(M\) combined with a Cplex integer feasibility tolerance that is large enough to create leaks.
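The leak is easy to quantify. A \(\delta_j\) on the order of the integer feasibility tolerance (typically around \(10^{-5}\)) is accepted as integer-feasible, and the big-M constraint then still allows a sizable \(x_j\):

```python
M = 10_000.0
delta = 4e-6            # one of the "almost zero" deltas in the listing above
x_allowed = M * delta   # big-M constraint: -M*delta <= x_j <= M*delta
print(x_allowed)        # ≈ 0.04: a clearly nonzero x_j at essentially zero cost
```

That 0.04 is well above any reasonable zero tolerance for \(x\), which is exactly why the reported objective (8) and the actual nonzero count (12) disagree.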

There are a few things we can do to repair this:

  • reduce the value of \(M\)
  • tighten the integer feasibility tolerance
  • use SOS1 sets instead of big-M's 
  • use indicator constraints instead of big-M's
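The first two remedies interact: the size of a possible leak is roughly \(M\) times the integrality tolerance, so shrinking either factor shrinks the leak. A quick back-of-the-envelope comparison (illustrative numbers, not solver output):

```python
def max_leak(M, tol):
    """Largest |x_j| allowed when delta_j = tol is accepted as 'zero'."""
    return M * tol

print(max_leak(10_000, 1e-5))  # ≈ 0.1: a clearly nonzero x_j can sneak in
print(max_leak(100, 1e-5))     # ≈ 0.001: reducing M shrinks the leak
print(max_leak(10_000, 1e-9))  # tightening the tolerance works as well
```

The last two remedies avoid the issue altogether by removing \(M\) from the model; they are discussed next.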

SOS1 formulation


We can get rid of the big-M problem by using SOS1 variables: \[\begin{align} \max & \sum_j y_j \\ & Ax=b+s \\ & (x_j,y_j) \in \text{SOS1} \\ & s_i \in [-\epsilon,\epsilon]\\ & y_j \in [0,1]\end{align}\] Each SOS1 set says: \(x_j = 0\) or \(y_j=0\) (or both); in other words, at most one of \(x_j,y_j\) can be nonzero. The objective tries to make as many elements of \(y\) equal to one as possible. The corresponding elements of \(x\) will be zero, making the \(x\) vector sparser. This SOS1 approach is more reliable but can be slower. When we try this on our small problem, we see:


----    119 VARIABLE x.L  

j4 0.601, j7 0.650, j11 0.072, j12 -0.780, j15 -0.170, j19 -0.117, j23 0.515, j31 -0.483
j34 -0.750, j35 -0.528, j36 -0.031


---- 119 PARAMETER statistics

m 20.000, n 40.000, epsilon 0.050, iterations 4.787142E+7
nodes 1.229474E+7, seconds 3600.156, nz(x) 11.000, obj 29.000
gap 0.138

This model could not be solved to optimality within one hour! We stopped on a time limit of 3600 seconds. Remember, we only have 40 discrete structures in this model. The gap is still 13.8% after running for an hour. Otherwise, the results make sense: the objective of 29 corresponds to 11 nonzero values in \(x\).
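The bookkeeping between the two objectives is simple: every \(y_j=1\) forces \(x_j=0\), so maximizing \(\sum_j y_j\) minimizes the number of nonzeros, and the two counts add up to \(n\):

```python
n = 40
obj_sos1 = 29           # reported objective of the SOS1 model (max sum of y_j)
nonzeros = n - obj_sos1 # each y_j = 1 corresponds to a zero x_j
print(nonzeros)         # 11, matching nz(x) in the listing above
```

Contrast this with the big-M run, where the reported objective (8) and the nonzero count (12) did not match: the SOS1 model cannot leak, so its numbers are consistent.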

Indicator constraint formulation


Finally, a formulation using indicator constraints can look like: \[\begin{align} \min & \sum_j \delta_j \\ & Ax=b+s \\ & \delta_j = 0 \Rightarrow x_j=0 \\ & s_i \in [-\epsilon,\epsilon]\\ & \delta_j \in \{0,1\}\end{align}\]
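Indicator constraints are leak-free by construction: the solver enforces the implication \(\delta_j=0\Rightarrow x_j=0\) by branching, without any \(M\). A small sketch of a checker for this semantics (hypothetical helper, not solver code):

```python
def indicator_consistent(delta, x, tol=1e-9):
    """Check delta_j = 0  =>  x_j = 0 for all j."""
    return all(d == 1 or abs(xj) <= tol for d, xj in zip(delta, x))

# With indicators, every nonzero x_j must have delta_j = 1:
print(indicator_consistent([1, 0, 1], [0.5, 0.0, -0.3]))   # True
# A big-M "trickle" solution would fail this check:
print(indicator_consistent([1, 0, 1], [0.5, 0.04, -0.3]))  # False
```

So the trickle-flow issue disappears; performance, however, is a separate matter.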

This does not help:


----    178 VARIABLE x.L  

j1 0.158, j2 -0.459, j13 -0.155, j14 0.470, j17 -0.490, j22 0.269, j23 0.211, j32 1.164
j33 0.147, j38 -0.604, j39 -0.563


---- 178 VARIABLE delta.L

j1 1.000, j2 1.000, j13 1.000, j14 1.000, j17 1.000, j22 1.000, j23 1.000, j32 1.000
j33 1.000, j38 1.000, j39 1.000


---- 178 PARAMETER statistics

m 20.000, n 40.000, epsilon 0.050, iterations 7.127289E+7
nodes 1.523042E+7, seconds 3600.203, nz(x) 11.000, obj 11.000
gap 0.273

We are hitting our time limit. The gap is 27.3%.

Optimal solution


The optimal solution is indeed 11 nonzero elements in \(x\). It took me 4 hours to prove this. Here are the MIP bounds:

[Figure: MIP bounds over time — incumbent objective (blue) and best possible bound (red)]
The optimal solution was found quite early, but proving optimality took a long time. The final jump in the best bound (red line: bound on the best possible integer solution) can be explained as follows. The objective (blue line) can only assume integer values, so it jumps by one; you can see this happening early on when it drops from 12 to 11. It also means we are proven optimal as soon as the best bound (red line) reaches a value larger than 10: at that point we know 11 is the optimal objective value.
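The stopping argument uses integrality of the objective: with an incumbent of 11, the search is over as soon as the best bound exceeds 10, because no integer value fits strictly between the bound and 11. In code (the bound value is hypothetical, for illustration):

```python
import math

incumbent = 11      # best integer solution found (number of nonzeros)
best_bound = 10.2   # hypothetical best bound after the final jump
# Round the bound up to the nearest integer (with a small tolerance guard);
# if it reaches the incumbent, optimality is proven.
proved_optimal = math.ceil(best_bound - 1e-9) >= incumbent
print(proved_optimal)  # True: ceil(10.2) = 11 matches the incumbent
```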

Conclusion


Here is a very small MIP model with only 40 binary variables that is just very difficult to solve to optimality: we cannot prove optimality within an hour of computation time. To be more precise: the optimal solution was found early but not proven; proving optimality takes about 4 hours. This is not the usual situation: for most models with 40 binary variables we can expect a very quick turnaround. This model is unusually difficult.

In addition, we saw the dangers of big-M formulations: even for modest values like \(M=10{,}000\), we can get into trouble.


