Problem Description
This is adapted from [1].
Assume we have data \(a_i\) and \(b_j\):
---- 11 PARAMETER a
i1 0.172, i2 0.843, i3 0.550, i4 0.301, i5 0.292, i6 0.224, i7 0.350, i8 0.856, i9 0.067
i10 0.500
---- 11 PARAMETER b
j1 0.998, j2 0.621, j3 0.992, j4 0.786, j5 0.218, j6 0.676, j7 0.244, j8 0.325, j9 0.702
j10 0.492, j11 0.424, j12 0.416, j13 0.218, j14 0.235, j15 0.630
Try to find an assignment of each \(a_i\) to a unique \(b_j\) such that the quantities \[ \frac{a_i}{b_j}\] are as close to each other as possible.
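The models in this post are stated in math/GAMS, but as a quick language-agnostic sanity check on the data, here is a small pure-Python sketch (no solver needed) that builds the ratio matrix \(a_i/b_j\) and reports its range. The extreme ratios will also come in handy later when picking big-\(M\) values.

```python
# Ratio matrix a(i)/b(j) for the data above (pure Python, standard library only).
a = {"i1": 0.172, "i2": 0.843, "i3": 0.550, "i4": 0.301, "i5": 0.292,
     "i6": 0.224, "i7": 0.350, "i8": 0.856, "i9": 0.067, "i10": 0.500}
b = {"j1": 0.998, "j2": 0.621, "j3": 0.992, "j4": 0.786, "j5": 0.218,
     "j6": 0.676, "j7": 0.244, "j8": 0.325, "j9": 0.702, "j10": 0.492,
     "j11": 0.424, "j12": 0.416, "j13": 0.218, "j14": 0.235, "j15": 0.630}

ratio = {(i, j): a[i] / b[j] for i in a for j in b}
print(min(ratio.values()), max(ratio.values()))   # range of a(i)/b(j)
```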
A first model
We first introduce assignment variables: \[x_{i,j} = \begin{cases} 1 & \text{if $a_i$ is assigned to $b_j$} \\ 0 & \text{otherwise}\end{cases}\] The standard assignment constraints apply \[\begin{align}&\sum_i x_{i,j}\le 1&&\forall j \\&\sum_j x_{i,j} = 1&&\forall i \end{align}\] This is an "unbalanced" assignment.
To model "all about equal", we introduce a continuous variable \(c\) that represents the common value. I.e. \[\frac{a_i}{b_j} \approx c \>\>\text{for all $(i,j)$ such that $x_{i,j}=1$}\]
The objective is \[\min \sum_{i,j} \left[ x_{i,j} \left( \frac{a_i}{b_j} - c\right) \right]^2 \] or \[\min \sum_{i,j} x_{i,j} \left( \frac{a_i}{b_j} - c\right)^2 \] For binary \(x_{i,j}\) these two formulations are identical, as \(x_{i,j}^2 = x_{i,j}\).
A complete model can look like:
| MINLP Model |
|---|
| \[\begin{align} \min & \sum_{i,j} \color{darkred}x_{i,j} \left( \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - \color{darkred}c\right)^2 \\ & \sum_j \color{darkred} x_{i,j} = 1 &&\forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 &&\forall j\\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}c \text{ free} \end{align}\] |
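A useful observation: for any fixed assignment, the inner minimization over \(c\) has a closed-form solution. Setting the derivative of \(\sum_k (r_k - c)^2\) to zero gives \(c = \frac{1}{n}\sum_k r_k\), the mean of the selected ratios. A small Python sketch illustrating this (the identity assignment \(i_k \rightarrow j_k\) used here is arbitrary, purely for illustration, not the optimal one):

```python
# For a fixed assignment, sum((r_k - c)^2) over the selected ratios r_k
# is minimized at c = mean(r_k). Check on an arbitrary assignment.
a = [0.172, 0.843, 0.550, 0.301, 0.292, 0.224, 0.350, 0.856, 0.067, 0.500]
b = [0.998, 0.621, 0.992, 0.786, 0.218, 0.676, 0.244, 0.325, 0.702, 0.492]

r = [ai / bj for ai, bj in zip(a, b)]   # selected ratios a(i)/b(j)
c_star = sum(r) / len(r)                # closed-form minimizer: the mean

def sumsq(c):
    return sum((rk - c) ** 2 for rk in r)

# c_star beats any nearby perturbation
assert all(sumsq(c_star) <= sumsq(c_star + d) for d in (-0.1, -0.01, 0.01, 0.1))
```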
When we run this model we see:
---- 33 VARIABLE x.L assignment
j1 j3 j4 j5 j7 j8 j9 j10 j11 j12
i1 1.000
i2 1.000
i3 1.000
i4 1.000
i5 1.000
i6 1.000
i7 1.000
i8 1.000
i9 1.000
i10 1.000
---- 33 VARIABLE c.L = 0.695 common value
VARIABLE z.L = 0.201 objective
The assignment can be depicted as follows.
| Solution. Matrix values are a(i)/b(j). |
A linear model
We can approximate the quadratic weighting of the residuals by absolute values: \[\sum_{i,j} \left| x_{i,j} \left( \frac{a_i}{b_j} - c \right) \right| \] The first thing to do is to linearize the term \(x_{i,j} c\). We introduce variables \[y_{i,j} = x_{i,j} c\] We can form the implications: \[\begin{align} & x_{i,j} = 0 \Rightarrow y_{i,j} = 0\\ &x_{i,j} = 1 \Rightarrow y_{i,j} = c\end{align}\] Most solvers support indicator constraints, which allow us to implement these implications directly. If our solver does not support them, we can formulate this as a bunch of big-\(M\) constraints: \[\begin{align} & - M x_{i,j} \le y_{i,j} \le M x_{i,j} \\ & c-M(1-x_{i,j}) \le y_{i,j} \le c+M(1-x_{i,j})\end{align}\] From the data we can assume \(a_{i}\ge 0\) and \(b_{j}\gt 0\). We can exploit this: \[\begin{align} & 0 \le y_{i,j} \le M_1 x_{i,j} \\ & c-M_2(1-x_{i,j}) \le y_{i,j} \le c+M_2(1-x_{i,j})\end{align}\] We can think a bit about good values for \(M_1\) and \(M_2\). I suggest: \[M_1 = M_2 = \max c = \max_{i,j} \frac{a_i}{b_j}\]
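A quick way to convince ourselves that these big-\(M\) constraints really pin \(y_{i,j}\) to \(x_{i,j} c\) is to intersect the bounds for each case. A pure-Python sketch (the helper name is illustrative; \(M\) is set to the maximum ratio from the data, assuming \(0 \le c \le M\)):

```python
# Verify that the big-M constraints force y = x*c for 0 <= c <= M.
# Bounds:  0 <= y <= M1*x   and   c - M2*(1-x) <= y <= c + M2*(1-x)
M = 0.856 / 0.218   # M1 = M2 = max ratio a(i)/b(j) from the data

def y_bounds(x, c, M1=M, M2=M):
    lo = max(0.0, c - M2 * (1 - x))
    hi = min(M1 * x, c + M2 * (1 - x))
    return lo, hi

for c in (0.0, 0.3, 0.711, M):
    assert y_bounds(1, c) == (c, c)       # x = 1  ->  y pinned to c
    assert y_bounds(0, c) == (0.0, 0.0)   # x = 0  ->  y pinned to 0
```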
| MIP Model |
|---|
| \[\begin{align} \min & \sum_{i,j} \color{darkred}r_{i,j} \\ & \sum_j \color{darkred} x_{i,j} = 1 && \forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 && \forall j\\ & -\color{darkred}r_{i,j} \le \color{darkred} x_{i,j} \frac{\color{darkblue} a_i}{\color{darkblue} b_j} - \color{darkred} y_{i,j} \le \color{darkred}r_{i,j} &&\forall i,j\\ & \color{darkred} y_{i,j} \le \color{darkblue} M_1 \color{darkred} x_{i,j} &&\forall i,j\\ & \color{darkred} c - \color{darkblue} M_2 (1- \color{darkred} x_{i,j}) \le \color{darkred} y_{i,j} \le \color{darkred} c + \color{darkblue} M_2 (1- \color{darkred} x_{i,j})&&\forall i,j \\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}c \ge 0 \\ & \color{darkred}y_{i,j} \ge 0\\ & \color{darkred}r_{i,j} \ge 0\\ \end{align}\] |
The results look like:
---- 72 VARIABLE x.L assignment
j1 j3 j4 j5 j7 j8 j9 j10 j11 j12
i1 1.000
i2 1.000
i3 1.000
i4 1.000
i5 1.000
i6 1.000
i7 1.000
i8 1.000
i9 1.000
i10 1.000
---- 72 VARIABLE c.L = 0.711 common value
---- 72 VARIABLE y.L c*x(i,j)
j1 j3 j4 j5 j7 j8 j9 j10 j11 j12
i1 0.711
i2 0.711
i3 0.711
i4 0.711
i5 0.711
i6 0.711
i7 0.711
i8 0.711
i9 0.711
i10 0.711
---- 72 VARIABLE r.L abs(residuals)
j1 j3 j4 j5 j7 j8 j9 j10 j12
i1 0.006
i2 0.139
i3 0.010
i5 0.009
i6 0.021
i7 6.137042E-4
i8 0.147
i9 0.402
i10 0.002
The results of the two models are very close. The assignments are identical; only the \(c\) value and the resulting sum of squared or absolute residuals differ somewhat.
---- 77 PARAMETER sumdev  sum of squared/absolute deviations

                   dev^2    |dev|        c
minlp model        0.201    0.784    0.695
mip model          0.204    0.737    0.711
A different approach
We can look at the problem in a different way. Instead of minimizing the spread around a central value \(c\), we can minimize the bandwidth directly. I.e. we can write: \[\begin{align} \min\> & \mathit{maxv} - \mathit{minv} \\ & \mathit{minv} \le \frac{a_i}{b_j} && \forall x_{i,j}=1\\& \mathit{maxv} \ge \frac{a_i}{b_j} && \forall x_{i,j}=1 \end{align}\] These constraints can be implemented as indicator constraints \[\begin{align} & x_{i,j}=1 \Rightarrow \mathit{minv} \le \frac{a_i}{b_j}\\ & x_{i,j}=1 \Rightarrow \mathit{maxv} \ge \frac{a_i}{b_j}\end{align}\] If we don't have indicator constraints available, we need to resort to big-\(M\) constraints:
| Alternative MIP Model |
|---|
| \[\begin{align} \min\> & \color{darkred}{\mathit{maxv}} - \color{darkred}{\mathit{minv}} \\ & \sum_j \color{darkred} x_{i,j} = 1 && \forall i \\ & \sum_i \color{darkred} x_{i,j} \le 1 && \forall j \\ & \color{darkred}{\mathit{minv}} \le \frac{\color{darkblue}a_i}{\color{darkblue}b_j} + M (1-\color{darkred} x_{i,j}) && \forall i,j \\ & \color{darkred}{\mathit{maxv}} \ge \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - M (1-\color{darkred} x_{i,j}) && \forall i,j \\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{maxv}},\color{darkred}{\mathit{minv}} \text{ free} \end{align}\] |
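For a small instance, this bandwidth-minimizing formulation can be sanity-checked by brute force. A Python sketch over a reduced instance (first five \(a_i\), first six \(b_j\), 720 assignments; names are illustrative, no solver involved):

```python
# Brute-force the minimum-bandwidth assignment on a small instance:
# first 5 a's and first 6 b's from the data.
from itertools import permutations

a = [0.172, 0.843, 0.550, 0.301, 0.292]
b = [0.998, 0.621, 0.992, 0.786, 0.218, 0.676]

def bandwidth(assign):   # assign[i] = index of the b assigned to a[i]
    r = [a[i] / b[j] for i, j in enumerate(assign)]
    return max(r) - min(r)

best = min(permutations(range(len(b)), len(a)), key=bandwidth)
print(best, bandwidth(best))
```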
The solution looks like:
---- 103 VARIABLE x.L assignment
j1 j2 j3 j4 j5 j6 j9 j11 j14 j15
i1 1.000
i2 1.000
i3 1.000
i4 1.000
i5 1.000
i6 1.000
i7 1.000
i8 1.000
i9 1.000
i10 1.000
---- 103 VARIABLE minv.L = 0.308 minimum value
VARIABLE maxv.L = 0.858 maximum value
A picture of this solution:
| Minimize bandwidth solution |
The disadvantage of this method is that it does not distinguish between assignments as long as the selected ratios fall inside the optimal bandwidth. We can see this if we recompute an optimal central \(c\) value afterwards. The results with this reconstructed \(c\) value:
---- 114 PARAMETER sumdev  sum of squared/absolute deviations

                   dev^2    |dev|        c
minlp model        0.201    0.784    0.695
mip model          0.204    0.737    0.711
mip model 2        0.313    1.548    0.617
Indeed, we see that the total sum of squared or absolute deviations is not as good as we saw before. The same thing can be seen by just calculating the standard deviation for the selected ratios \(a_i/b_j\). This gives:
---- 130 PARAMETER stdev

minlp model 0.149, mip model 0.149, mip model 2 0.186
Again, we see that this last approach leads to more variability.
Non-unique \(b_j\)
If we allow the \(b_j\) to be reused, only a small change in the models is needed. We just need to drop the constraint \[\sum_i x_{i,j} \le 1 \] When we do this, we may see a solution like:
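In a brute-force check of the bandwidth objective, dropping this constraint corresponds to enumerating with `itertools.product` instead of `itertools.permutations`. A self-contained Python sketch on a reduced instance (first five \(a_i\), first six \(b_j\); illustrative only), which also confirms that relaxing the constraint can only improve the objective:

```python
# Reuse of b(j) allowed: drop sum_i x(i,j) <= 1. In brute force this means
# itertools.product (repeats allowed) instead of itertools.permutations.
from itertools import permutations, product

a = [0.172, 0.843, 0.550, 0.301, 0.292]
b = [0.998, 0.621, 0.992, 0.786, 0.218, 0.676]

def bandwidth(assign):   # assign[i] = index of the b assigned to a[i]
    r = [a[i] / b[j] for i, j in enumerate(assign)]
    return max(r) - min(r)

unique = min(permutations(range(len(b)), len(a)), key=bandwidth)
reuse  = min(product(range(len(b)), repeat=len(a)), key=bandwidth)

# The feasible set only grows, so the relaxed model is at least as good.
assert bandwidth(reuse) <= bandwidth(unique)
```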
| Solution. Allow a column to be selected multiple times. |