Select n items but optimize for the best k items

Problem Statement

This is a strange one [1]. Consider a standard objective function: \[\max\>\sum_j p_j\cdot x_j\] where \(p\) are coefficients and \(x\) are binary decision variables. This will look at all \(n\) objective coefficients. Now the question is:

Can I optimize just for the \(k\) best ones?

This means in effect: \[\begin{align} \max & \sum_j \delta_j \cdot p_j\cdot x_j\\ & \sum_j \delta_j = k \\ & \delta_j \in \{0,1\}\end{align}\] We introduced extra binary decision variables \(\delta\). But also: we made the objective non-linear.

The question is: how can we model this while keeping the model linear.

Example problem

Here is a small multi-dimensional knapsack-like problem:

Base MIP Model
\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}\]

The randomly generated data looks like:

----     16 PARAMETER p  objective coefficients

j1  0.172,    j2  0.843,    j3  0.550,    j4  0.301,    j5  0.292,    j6  0.224,    j7  0.350,    j8  0.856
j9  0.067,    j10 0.500


----     16 PARAMETER a  matrix coefficients

            j1          j2          j3          j4          j5          j6          j7          j8          j9

i1       0.9980.5790.9910.7620.1310.6400.1600.2500.669
i2       0.3600.3510.1310.1500.5890.8310.2310.6660.776
i3       0.1100.5020.1600.8720.2650.2860.5940.7230.628
i4       0.4130.1180.3140.0470.3390.1820.6460.5610.770
i5       0.6610.7560.6270.2840.0860.1030.6410.5450.032

 +         j10

i1       0.435
i2       0.304
i3       0.464
i4       0.298
i5       0.792


----     16 PARAMETER b  rhs values

i1 2.500,    i2 2.500,    i3 2.500,    i4 2.500,    i5 2.500


----     16 PARAMETER m                    =        5.000  number of x to select

To see the "best" \(x_j\), I reformulated the problem to:

Base MIP Model with y variables
\[ \begin{align} \max& \sum_j \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}\]

When we solve this problem we see:

----     41 VARIABLE x.L  binary decision variables

j3 1.000,    j5 1.000,    j6 1.000,    j7 1.000,    j8 1.000


----     41 VARIABLE y.L  auxiliary variables (y=p*x)

j3 0.550,    j5 0.292,    j6 0.224,    j7 0.350,    j8 0.856


----     41 VARIABLE z.L                   =        2.273  objective

----     61 PARAMETER yordered  ordered version of y

k1.j8 0.856
k2.j3 0.550
k3.j7 0.350
k4.j5 0.292
k5.j6 0.224

The model says to select items 3,5,6,7 and 8.

The three "best" items are j8 (0.856), j3 (0.550), and j7 (0.350). Let's optimize for the best three. We change the model to:

MIQP Model
\[ \begin{align} \max& \sum_j \color{darkred}\delta_j \cdot \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\]

MIQP Model

\[ \begin{align} \max& \sum_j \color{darkred}\delta_j \cdot \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\]

We can solve this with a non-convex solver (or a solver smart enough to reformulate things automatically). The results look like:

----     75 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000


----     75 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000


----     75 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500


----     75 VARIABLE z.L                   =        1.907  objective

----     89 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067

This has changed the best three. The third item has changed from j7 (with contribution 0.350) to j10 (0.5). The selected items become: 3,5,8,9, and 10. This is different than before.

There are few linearizations we can consider:

Create ordered variables for \(y\) inside the model. Once we know these, just add the first three to the objective. Sorting variables inside a MIP is not completely trivial.
Better is to linearize \(\delta_j \cdot y_j = p_j \cdot \delta_j \cdot x_j\) directly.
Even better is to note that \(\delta_j = 1 \Rightarrow x_j=1\) or \(x_j\ge \delta_j\). This is actually the only thing we need. So we end up with:

Linear MIP Model
\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}\delta_j \\ & \color{darkred}x_j \ge \color{darkred}\delta_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\]

Linear MIP Model

\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}\delta_j \\ & \color{darkred}x_j \ge \color{darkred}\delta_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\]

The solution looks like:

----     52 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000


----     52 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000


----     52 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500


----     52 VARIABLE z.L                   =        1.907  objective

----     72 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067

This is the same solution as obtained by the MIQP model.

Conclusion

It turns out that linearizing the problem of finding the "best three" is rather easy. A little bit surprising to me: I expected to have to do some complicated sorting. Obviously, we exploited here that the decision variables \(x_j\) are binary variables.

References

With R and lpsolve is there a way of optimising for the best 11 elements but still pick 15 elements in total? https://stackoverflow.com/questions/62511975/with-r-and-lpsolve-is-there-a-way-of-optimising-for-the-best-11-elements-but-sti