Yet Another Math Programming Consultant

Sparsest solution: a difficult MIP

The problem of finding a solution vector that is as sparse as possible can be formulated as a MIP.

I was looking at solving an underdetermined system \[Ax=b\] i.e. the number of rows \(m\) is less than the number of columns \(n\). A basic solution \(x\) will in general have \(m\) nonzero elements.

The actual system was: \[Ax\approx b\] or to be more precise \[-\epsilon \le Ax-b\le \epsilon \] By adding slack variables \(s_i\) we can write this as: \[\begin{align}&Ax=b+s\\&-\epsilon\le s \le \epsilon\end{align} \] Now we have more wiggle room, and can try to find the vector \(x\) that has as many zero elements as possible.
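The slack reformulation is easy to check numerically. Below is a small NumPy sketch (random data, a hypothetical instance with the same sizes as in the post) verifying that a candidate \(x\) is feasible for the relaxed system exactly when the implied slack \(s = Ax - b\) stays within \([-\epsilon,\epsilon]\):

```python
import numpy as np

rng = np.random.default_rng(123)
m, n, eps = 20, 40, 0.05         # sizes and tolerance from the post
A = rng.normal(size=(m, n))      # random instance (hypothetical data)
b = rng.normal(size=m)

# Minimum-norm solution of Ax = b (A has full row rank almost surely)
x = np.linalg.lstsq(A, b, rcond=None)[0]

s = A @ x - b                    # implied slack
feasible = np.all(np.abs(s) <= eps)
print(feasible)                  # True: Ax = b holds up to rounding, so s is tiny
```

This solution is dense, of course; the whole point of the MIP formulations below is to trade the slack budget \(\epsilon\) for zeros in \(x\).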

Big-M formulation


The MIP model seems obvious: \[\begin{align} \min & \sum_j \delta_j \\ & Ax=b+s \\ & -M\delta_j \le x_j \le M\delta_j \\ & s_i \in [-\epsilon,\epsilon]\\ & \delta_j \in \{0,1\}\end{align}\] This turns out to be a problematic formulation. First, we have no good a priori value for \(M\). I used \(M=10{,}000\), but that gives some issues. Furthermore, the performance is horrible. For a very small problem with \(m=20, n=40\) we already see a solution time of about 20 minutes. With just 40 binary variables I expected something like a minute or two. The results with Cplex look like:


----     82 VARIABLE x.L  

j3 -0.026, j4 0.576, j7 0.638, j11 0.040, j12 -0.747, j14 0.039, j15 -0.169, j19 -0.088
j23 0.509, j31 -0.475, j34 -0.750, j35 -0.509


---- 82 VARIABLE delta.L

j3 2.602265E-6, j4 1.000, j7 1.000, j11 4.012925E-6, j12 1.000, j14 3.921840E-6
j15 1.000, j19 8.834778E-6, j23 1.000, j31 1.000, j34 1.000, j35 1.000


---- 82 PARAMETER statistics

m 20.000, n 40.000, epsilon 0.050, big M 10000.000
iterations 5.443661E+7, nodes 5990913.000, seconds 1004.422, nz(x) 12.000
obj 8.000, gap EPS

Well, we have some problems here. The optimal objective is 8, so 8 of the \(\delta_j\)'s are equal to one. But at the same time the number of nonzero values in \(x\) is 12. The reason is that a few \(\delta_j\)'s are very small: Cplex considers them as being zero (they are within its integer feasibility tolerance), while the model exploits these small values for what is sometimes called "trickle flow". These results are just not reliable. The underlying cause is a relatively large value for \(M\) combined with a Cplex integer feasibility tolerance that is large enough to create leaks.
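The leak is easy to quantify. A \(\delta_j\) on the order of the integer feasibility tolerance (typically around \(10^{-5}\)) is accepted as integer-feasible, and the big-M constraint then still allows a sizable \(x_j\):

```python
M = 10_000.0
delta = 4e-6            # one of the "almost zero" deltas in the listing above
x_allowed = M * delta   # big-M constraint: -M*delta <= x_j <= M*delta
print(x_allowed)        # ≈ 0.04: a clearly nonzero x_j at essentially zero cost
```

That 0.04 is well above any reasonable zero tolerance for \(x\), which is exactly why the reported objective (8) and the actual nonzero count (12) disagree.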

There are a few things we can do to repair this:

  • reduce the value of \(M\)
  • tighten the integer feasibility tolerance
  • use SOS1 sets instead of big-M's 
  • use indicator constraints instead of big-M's
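The first two remedies interact: the size of a possible leak is roughly \(M\) times the integrality tolerance, so shrinking either factor shrinks the leak. A quick back-of-the-envelope comparison (illustrative numbers, not solver output):

```python
def max_leak(M, tol):
    """Largest |x_j| allowed when delta_j = tol is accepted as 'zero'."""
    return M * tol

print(max_leak(10_000, 1e-5))  # ≈ 0.1: a clearly nonzero x_j can sneak in
print(max_leak(100, 1e-5))     # ≈ 0.001: reducing M shrinks the leak
print(max_leak(10_000, 1e-9))  # tightening the tolerance works as well
```

The last two remedies avoid the issue altogether by removing \(M\) from the model; they are discussed next.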

SOS1 formulation


We can get rid of the big-M problem by using SOS1 variables: \[\begin{align} \max & \sum_j y_j \\ & Ax=b+s \\ & (x_j,y_j) \in \text{SOS1} \\ & s_i \in [-\epsilon,\epsilon]\\ & y_j \in [0,1]\end{align}\] Each SOS1 set says: \(x_j = 0\) or \(y_j=0\) (or both); in other words, at most one of \(x_j,y_j\) can be nonzero. The objective tries to make as many elements of \(y\) equal to one as possible. The corresponding elements of \(x\) will be zero, making the \(x\) vector sparser. This SOS1 approach is more reliable but can be slower. When we try this on our small problem, we see:


----    119 VARIABLE x.L  

j4 0.601, j7 0.650, j11 0.072, j12 -0.780, j15 -0.170, j19 -0.117, j23 0.515, j31 -0.483
j34 -0.750, j35 -0.528, j36 -0.031


---- 119 PARAMETER statistics

m 20.000, n 40.000, epsilon 0.050, iterations 4.787142E+7
nodes 1.229474E+7, seconds 3600.156, nz(x) 11.000, obj 29.000
gap 0.138

This model could not be solved to optimality within one hour! We stopped on a time limit of 3600 seconds. Remember, we only have 40 discrete structures in this model. The gap is still 13.8% after running for an hour. Otherwise, the results make sense: the objective of 29 corresponds to 11 nonzero values in \(x\).
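The bookkeeping between the two objectives is simple: every \(y_j=1\) forces \(x_j=0\), so maximizing \(\sum_j y_j\) minimizes the number of nonzeros, and the two counts add up to \(n\):

```python
n = 40
obj_sos1 = 29           # reported objective of the SOS1 model (max sum of y_j)
nonzeros = n - obj_sos1 # each y_j = 1 corresponds to a zero x_j
print(nonzeros)         # 11, matching nz(x) in the listing above
```

Contrast this with the big-M run, where the reported objective (8) and the nonzero count (12) did not match: the SOS1 model cannot leak, so its numbers are consistent.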

Indicator constraint formulation


Finally, a formulation using indicator constraints can look like: \[\begin{align} \min & \sum_j \delta_j \\ & Ax=b+s \\ & \delta_j = 0 \Rightarrow x_j=0 \\ & s_i \in [-\epsilon,\epsilon]\\ & \delta_j \in \{0,1\}\end{align}\]
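Indicator constraints are leak-free by construction: the solver enforces the implication \(\delta_j=0\Rightarrow x_j=0\) by branching, without any \(M\). A small sketch of a checker for this semantics (hypothetical helper, not solver code):

```python
def indicator_consistent(delta, x, tol=1e-9):
    """Check delta_j = 0  =>  x_j = 0 for all j."""
    return all(d == 1 or abs(xj) <= tol for d, xj in zip(delta, x))

# With indicators, every nonzero x_j must have delta_j = 1:
print(indicator_consistent([1, 0, 1], [0.5, 0.0, -0.3]))   # True
# A big-M "trickle" solution would fail this check:
print(indicator_consistent([1, 0, 1], [0.5, 0.04, -0.3]))  # False
```

So the trickle-flow issue disappears; performance, however, is a separate matter.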

This does not help:


----    178 VARIABLE x.L  

j1 0.158, j2 -0.459, j13 -0.155, j14 0.470, j17 -0.490, j22 0.269, j23 0.211, j32 1.164
j33 0.147, j38 -0.604, j39 -0.563


---- 178 VARIABLE delta.L

j1 1.000, j2 1.000, j13 1.000, j14 1.000, j17 1.000, j22 1.000, j23 1.000, j32 1.000
j33 1.000, j38 1.000, j39 1.000


---- 178 PARAMETER statistics

m 20.000, n 40.000, epsilon 0.050, iterations 7.127289E+7
nodes 1.523042E+7, seconds 3600.203, nz(x) 11.000, obj 11.000
gap 0.273

We are hitting our time limit. The gap is 27.3%.

Optimal solution


The optimal solution is indeed 11 nonzero elements in \(x\). It took me 4 hours to prove this. Here are the MIP bounds:

[Figure: MIP bounds over time — incumbent objective (blue) and best possible bound (red)]
The optimal solution was found quite early, but proving optimality took a long time. The final jump in the best bound (red line: bound on the best possible integer solution) can be explained as follows. The objective (blue line) can only assume integer values, so it jumps by one; you can see this happening early on when it drops from 12 to 11. It also means we are proven optimal as soon as the best bound (red line) reaches a value larger than 10: at that point we know 11 is the optimal objective value.
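The stopping argument uses integrality of the objective: with an incumbent of 11, the search is over as soon as the best bound exceeds 10, because no integer value fits strictly between the bound and 11. In code (the bound value is hypothetical, for illustration):

```python
import math

incumbent = 11      # best integer solution found (number of nonzeros)
best_bound = 10.2   # hypothetical best bound after the final jump
# Round the bound up to the nearest integer (with a small tolerance guard);
# if it reaches the incumbent, optimality is proven.
proved_optimal = math.ceil(best_bound - 1e-9) >= incumbent
print(proved_optimal)  # True: ceil(10.2) = 11 matches the incumbent
```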

Conclusion


Here is a very small MIP model with only 40 binary variables that is just very difficult to solve to optimality: we cannot prove optimality within an hour of computation time. To be more precise: the optimal solution was found early but not proven; proving optimality takes about 4 hours. This is not the usual situation: for most models with 40 binary variables we can expect a very quick turnaround. This model is unusually difficult.

In addition, we saw the dangers of big-M formulations: even for modest values like \(M=10{,}000\), we can get into trouble.


