For some models, we can apply a certain amount of "presolving" ourselves, at the modeling level. Presolvers are part of LP/MIP solvers. Their main task is to inspect the model and look for opportunities to make it smaller. Some of this is done by simple operations, such as changing a singleton constraint into a bound or removing variables that are not needed [2,3].
The question I am exploring here: how many reductions should we apply at the modeling level, as opposed to leaving them to the solver? If a solver is able to reduce the size by a large amount (loosely defined), I always feel I did not do a good job as a modeler. I just did not pay attention.
The model below demonstrates how we can apply different reduction levels to the model. The model becomes smaller, but at the expense of more complex modeling. What is the right level to choose? Of course there is no objective answer to this. Your "optimal level" may be different from mine.
Problem description
The problem is from [1]. Consider the board:
We need to fill the board with integers: \(x_{i,j} \in Z\). The following rules apply:
- Green cells must contain strictly positive values, \(x_{i,j}\ge 1\).
- Red cells must contain strictly negative values, \(x_{i,j}\le -1\).
- White and blue cells have a value of zero, \(x_{i,j}= 0\).
- Skew-symmetry or Anti-symmetry: we have the restriction \(x_{i,j} = -x_{j,i}\). Putting it differently: \(X^T = -X\).
- Row and column sums are equal to zero: \[ \begin{align}&\sum_i x_{i,j} =0 & \forall j\\& \sum_j x_{i,j} = 0 & \forall i\end{align}\]
There are multiple solutions. Among them, we choose the solution with the smallest sum of green values:\[\min \sum_{\mathit{Green}(i,j)} x_{i,j}\]
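The rules above can be checked mechanically for any candidate board. The following is a minimal sketch, not part of the original model: the function name `check_board` and the tiny 3x3 example are hypothetical, and cells are indexed from zero.

```python
def check_board(x, green, red):
    """Return a list of rule violations for an n-by-n integer board x.

    green and red are sets of (i, j) index pairs; every other cell
    (white or blue) must be zero.
    """
    n = len(x)
    problems = []
    for i in range(n):
        for j in range(n):
            v = x[i][j]
            if (i, j) in green and v < 1:
                problems.append(f"green cell {(i, j)} is not positive")
            elif (i, j) in red and v > -1:
                problems.append(f"red cell {(i, j)} is not negative")
            elif (i, j) not in green and (i, j) not in red and v != 0:
                problems.append(f"white/blue cell {(i, j)} is not zero")
            if v != -x[j][i]:
                problems.append(f"skew-symmetry fails at {(i, j)}")
    for k in range(n):
        if sum(x[k]) != 0:
            problems.append(f"row {k} does not sum to zero")
        if sum(x[i][k] for i in range(n)) != 0:
            problems.append(f"column {k} does not sum to zero")
    return problems


# Hypothetical 3x3 board: the green cells form the cycle 0 -> 1 -> 2 -> 0.
green = {(0, 1), (1, 2), (2, 0)}
red = {(j, i) for (i, j) in green}
board = [[0, 1, -1],
         [-1, 0, 1],
         [1, -1, 0]]
print(check_board(board, green, red))   # an empty list means all rules hold
```

A checker like this is also handy later on, when the more aggressive reductions require reconstructing the full board from a partial solution.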
In the board above we have the following statistics:
Cell type | Count |
---|---|
Green cells | 57 |
Red cells | 57 |
Blue cells | 20 |
White cells | 266 |
Total | 400 |
Presolve level 0
A direct formulation for all \(x_{i,j}\) is: \[\begin{align}\min\> & z=\sum_{\mathit{Green}(i,j)} x_{i,j}\\ & x_{i,j}\ge 1 & \mathit{Green}(i,j)\\ & x_{i,j} \le -1 & \mathit{Red}(i,j)\\ & x_{i,j} =0 & \mathit{WhiteBlue}(i,j)\\ & x_{i,j} = -x_{j,i} & \forall i,j\\ &\sum_i x_{i,j} =0 & \forall j\\&\sum_j x_{i,j} =0 & \forall i \\ & x_{i,j} \in Z\end{align}\]
This model, with all equations stated as explicit constraints, has the following sizes:
Model Size | Count |
---|---|
rows | 840 |
columns | 400 |
nonzero elements | 1980 |
The counts here exclude the objective function. Although the solver will automatically convert singleton constraints into bounds, I never write them as explicit constraints: I prefer to specify them directly as bounds.
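The level 0 sizes can be reproduced with a little arithmetic. This sketch assumes a 20 x 20 board (inferred from the total of 400 cells) and assumes that a diagonal skew-symmetry equation \(x_{i,i} = -x_{i,i}\) is counted with a single nonzero coefficient.

```python
n = 20                      # assumed board size: 20 x 20 = 400 cells
green, red = 57, 57         # counts from the statistics table
white_blue = n * n - green - red   # 286 cells fixed to zero

# Rows: one singleton constraint per cell, one skew-symmetry
# constraint per (i, j) pair, plus n column sums and n row sums.
singleton_rows = green + red + white_blue
skew_rows = n * n
sum_rows = 2 * n
rows = singleton_rows + skew_rows + sum_rows

# Nonzeros: singletons have one coefficient each; x[i,j] + x[j,i] = 0
# has two coefficients off the diagonal and one (2*x[i,i] = 0) on it;
# every variable appears once in a row sum and once in a column sum.
singleton_nz = singleton_rows
skew_nz = 2 * (n * n - n) + n
sum_nz = 2 * n * n
nonzeros = singleton_nz + skew_nz + sum_nz

print(rows, n * n, nonzeros)   # 840 400 1980
```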
Presolve level 1
The first three constraints can be implemented as bounds: \[\ell_{i,j} = \begin{cases} 1 & \mathit{Green}(i,j)\\ -\infty & \mathit{Red}(i,j)\\ 0 & \text{otherwise}\end{cases}\] and \[u_{i,j} = \begin{cases} \infty & \mathit{Green}(i,j)\\ -1& \mathit{Red}(i,j)\\ 0 & \text{otherwise}\end{cases}\] Now the model can read: \[\begin{align}\min \> & z=\sum_{\mathit{Green}(i,j)} x_{i,j}\\ & x_{i,j} = -x_{j,i} & \forall i\lt j\\ &\sum_i x_{i,j} =0 & \forall j\\&\sum_j x_{i,j} =0 & \forall i\\ & x_{i,j} \in [\ell_{i,j}, u_{i,j}] \\& x_{i,j} \in Z\end{align}\] I also reduced the number of skew-symmetry constraints \(x_{i,j}=-x_{j,i}\): we only need these for \(i\lt j\). This reduces the model size to:
Model Size | Count |
---|---|
rows | 230 |
columns | 400 |
nonzero elements | 1180 |
All singleton equations have been formulated as bounds. This model has a large number of variables fixed to zero (all variables corresponding to blue and white cells). The solver will presolve those variables away, but I prefer to do this myself.
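As an illustration of how the bounds \(\ell_{i,j}\) and \(u_{i,j}\) could be set up as data, here is a small sketch. The helper name `make_bounds` and the 3x3 example are assumptions for illustration only; how bounds are passed to a solver depends on the modeling tool.

```python
import math

def make_bounds(n, green, red):
    """Return (lo, up) as n-by-n lists following the case definitions."""
    lo = [[0] * n for _ in range(n)]      # default: cell fixed to zero
    up = [[0] * n for _ in range(n)]
    for (i, j) in green:
        lo[i][j], up[i][j] = 1, math.inf      # strictly positive integer
    for (i, j) in red:
        lo[i][j], up[i][j] = -math.inf, -1    # strictly negative integer
    return lo, up


# Hypothetical 3x3 example with three green and three red cells.
green = {(0, 1), (1, 2), (2, 0)}
red = {(j, i) for (i, j) in green}
lo, up = make_bounds(3, green, red)

# Variables with lo == up are fixed; here only the diagonal remains fixed.
fixed = sum(lo[i][j] == up[i][j] for i in range(3) for j in range(3))
print(fixed)   # 3
```

Counting the fixed variables this way shows how many columns the next level can drop from the model.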
Presolve level 2
The next level is to remove all \(x_{i,j}\) that are known to be zero from the model. \[\begin{align}\min \> & z= \sum_{\mathit{Green}(i,j)} x_{i,j}\\ & x_{i,j} = -x_{j,i} & \forall \mathit{Green}(i,j)\\ &\sum_{i|\mathit{GreenRed}(i,j)} x_{i,j} =0 & \forall j\\&\sum_{j|\mathit{GreenRed}(i,j)} x_{i,j} =0 & \forall i\\ & x_{i,j} \in [\ell_{i,j}, u_{i,j}] & \forall \mathit{GreenRed}(i,j)\\& x_{i,j} \in Z\end{align}\] The only cells we model here are the red and green ones. Our counts are:
Model Size | Count |
---|---|
rows | 97 |
columns | 114 |
nonzero elements | 342 |
This was my first actual implementation. However, we can go further and apply some more reductions. Here the model starts to become less intuitive.
Presolve level 3
We can deal with the red cells implicitly: each red cell \(x_{i,j}\) equals \(-x_{j,i}\), where \((j,i)\) is a green cell, so the red variables can be substituted out. \[\begin{align}\min \> & z=\sum_{\mathit{Green}(i,j)} x_{i,j}\\ & \sum_{i|\mathit{Green}(i,j)} x_{i,j} - \sum_{i|\mathit{Green}(j,i)}x_{j,i} =0 & \forall j\\&\sum_{j|\mathit{Green}(i,j)} x_{i,j} - \sum_{j|\mathit{Green}(j,i)} x_{j,i} =0 & \forall i\\ & x_{i,j} \ge 1 & \forall \mathit{Green}(i,j)\\& x_{i,j} \in Z\end{align}\] We only solve for the green cells here. The values of the red cells can be recovered afterwards.
Model Size | Count |
---|---|
rows | 40 |
columns | 57 |
nonzero elements | 228 |
The row and column sums now become more difficult to recognize. In addition, we need to add some code to recalculate the values of the red cells after the solve.
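The post-solve step is short. A minimal sketch, assuming the solver returns the green values as a dictionary keyed by zero-based `(i, j)` pairs (the function name `recover_board` is hypothetical):

```python
def recover_board(n, green_values):
    """Expand solved green cells {(i, j): value} into a full n-by-n board."""
    x = [[0] * n for _ in range(n)]   # white and blue cells stay zero
    for (i, j), v in green_values.items():
        x[i][j] = v          # green cell, as returned by the solver
        x[j][i] = -v         # matching red cell via x[j,i] = -x[i,j]
    return x


# Hypothetical 3x3 solution with all green cells equal to 1.
full = recover_board(3, {(0, 1): 1, (1, 2): 1, (2, 0): 1})
print(full)   # [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
```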
Presolve level 4
Finally, we can also remove one of the summations because of symmetry: under skew-symmetry, the sum of row \(k\) is the negative of the sum of column \(k\), so the row-sum constraints are redundant. We end up with: \[\begin{align}\min \> & z=\sum_{\mathit{Green}(i,j)} x_{i,j}\\ & \sum_{i|\mathit{Green}(i,j)} x_{i,j} - \sum_{i|\mathit{Green}(j,i)}x_{j,i} =0 & \forall j\\ & x_{i,j} \ge 1 & \forall \mathit{Green}(i,j)\\& x_{i,j} \in Z\end{align}\]
Model Size | Count |
---|---|
rows | 20 |
columns | 57 |
nonzero elements | 114 |
Again, the values of the red cells need to be calculated after the solve. This model is now very compact, but we moved away from the original problem statement. When reading this model, we would not immediately see the correspondence with the problem.
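That dropping the row sums loses nothing can be checked by building the level 3 constraint coefficients and comparing them. The sketch below uses a hypothetical 3x3 set of green cells and a hypothetical helper `balance_rows`; it stores each constraint as a dictionary from green variable to coefficient.

```python
def balance_rows(n, green):
    """Build the level 3 constraints as {variable: coefficient} dicts."""
    col = [dict() for _ in range(n)]   # column-sum constraint for each j
    row = [dict() for _ in range(n)]   # row-sum constraint for each i
    for (i, j) in green:
        # Column sums: green x[i,j] sits in column j; its red mirror
        # x[j,i] = -x[i,j] sits in column i.
        col[j][(i, j)] = 1
        col[i][(i, j)] = -1
        # Row sums: green x[i,j] sits in row i; its red mirror in row j.
        row[i][(i, j)] = 1
        row[j][(i, j)] = -1
    return col, row


green = {(0, 1), (1, 2), (2, 0)}
col, row = balance_rows(3, green)

# Each row-sum constraint is exactly the negated column-sum constraint.
redundant = all(row[k] == {v: -c for v, c in col[k].items()}
                for k in range(3))
nonzeros = sum(len(c) for c in col)   # two coefficients per green cell
print(redundant, nonzeros)   # True 6
```

Each green variable contributes two coefficients to the column constraints, which matches the 114 nonzeros reported for the 57 green cells.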
Does it make a difference?
No. The first model shows:
Presolved: 13 rows, 48 columns, 96 nonzeros
Variable types: 0 continuous, 48 integer (0 binary)
The last model gives:
Presolved: 13 rows, 48 columns, 96 nonzeros
Variable types: 0 continuous, 48 integer (0 binary)
So should we even worry? I still like to generate models that are reasonably small. How far I take this is determined by readability and understandability: the limit is the point where the formulation becomes less obvious and readability starts to suffer.
Solution
The solution looks like:
The minimum sum of the green cells is 87.
References
- [1] Antisymmetric Table Puzzle where the rows/columns sum to zero. https://math.stackexchange.com/questions/2794165/antisymmetric-table-puzzle-where-the-rows-columns-sum-to-zero
- [2] A.L. Brearley, G. Mitra, H.P. Williams, Analysis of mathematical programming problems prior to applying the simplex algorithm, Mathematical Programming, 8 (1975), pp. 54-83.
- [3] E.D. Andersen, K.D. Andersen, Presolving in linear programming, Mathematical Programming, 71 (1995), pp. 221-235.