The problem from [1], slightly generalized, is easy to describe:
Given a data vector \(\color{darkblue}v_i\) of length \(n\), select \(k\lt n\) elements that are closest to each other, i.e., whose range (largest minus smallest selected value) is as small as possible.
Of course, we can state the problem as a formal optimization model:
| Optimization Model |
|---|
| \[\begin{align}\min\>&\color{darkred}U-\color{darkred}L \\ & \color{darkred}\delta_i=1 \Rightarrow \color{darkblue}v_i \in [\color{darkred}L,\color{darkred}U] \\ & \sum_i \color{darkred}\delta_i=\color{darkblue}k \\ & \color{darkred}\delta_i \in \{0,1\} \end{align}\] |
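To make the problem statement concrete, here is a minimal brute-force sketch in Python that enumerates every subset of size \(k\) and keeps the one with the smallest range. The function name `closest_k_bruteforce` and the tiny data set are illustrative assumptions, not part of the original experiments; the approach is only practical for very small \(n\).

```python
from itertools import combinations


def closest_k_bruteforce(v, k):
    """Return (range, subset) with minimal max - min over all k-subsets of v.

    Hypothetical reference implementation: it checks all C(n, k) subsets,
    so it is only usable for tiny inputs.
    """
    best = None
    for subset in combinations(v, k):
        r = max(subset) - min(subset)  # this is U - L for the chosen points
        if best is None or r < best[0]:
            best = (r, subset)
    return best


# Small made-up data set (not the data used in the article).
v = [10.0, 1.0, 4.0, 11.5, 12.0, 7.0]
print(closest_k_bruteforce(v, 3))
```

The number of subsets grows combinatorially, which is why a formal model or a smarter algorithm is needed for \(n=1000\).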
We can model this as a standard Mixed-Integer Programming (MIP) problem as follows:
| MIP Model |
|---|
| \[\begin{align}\min\>& \color{darkred}U-\color{darkred}L\\ & \color{darkblue}v_i \ge \color{darkred}L - (1-\color{darkred}\delta_i)\cdot\color{darkblue}M \\ & \color{darkblue}v_i \le \color{darkred}U + (1-\color{darkred}\delta_i)\cdot \color{darkblue}M \\ & \sum_i \color{darkred}\delta_i=\color{darkblue}k \\ & \color{darkred}\delta_i \in \{0,1\} \\ & \color{darkred}L,\color{darkred}U \in [ \min_i \color{darkblue}v_i, \max_i \color{darkblue}v_i] \end{align}\] |
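The model does not say how to choose the big-M constant \(\color{darkblue}M\). One valid choice follows from the bounds on \(\color{darkred}L\) and \(\color{darkred}U\); this is a sketch of the reasoning, not necessarily the value used in the experiments. When \(\color{darkred}\delta_i=0\), the two constraints reduce to \(\color{darkred}L-\color{darkblue}v_i\le\color{darkblue}M\) and \(\color{darkblue}v_i-\color{darkred}U\le\color{darkblue}M\). Since \(\color{darkred}L\le\max_j \color{darkblue}v_j\) and \(\color{darkred}U\ge\min_j \color{darkblue}v_j\), both hold for

\[\begin{align} & \color{darkblue}M = \max_j \color{darkblue}v_j - \min_j \color{darkblue}v_j \end{align}\]

Slightly tighter, element-specific constants (the names \(\color{darkblue}M^L_i\) and \(\color{darkblue}M^U_i\) are introduced here for illustration) are:

\[\begin{align} & \color{darkblue}M^L_i = \max_j \color{darkblue}v_j - \color{darkblue}v_i && \text{for } \color{darkblue}v_i \ge \color{darkred}L - (1-\color{darkred}\delta_i)\cdot\color{darkblue}M^L_i \\ & \color{darkblue}M^U_i = \color{darkblue}v_i - \min_j \color{darkblue}v_j && \text{for } \color{darkblue}v_i \le \color{darkred}U + (1-\color{darkred}\delta_i)\cdot\color{darkblue}M^U_i \end{align}\]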
Data
Performance
| Improved MIP Model |
|---|
| \[\begin{align}\min\>& \color{darkred}R \\ & \color{darkred}R = \color{darkred}U-\color{darkred}L\\ & \color{darkblue}v_i \ge \color{darkred}L - (1-\color{darkred}\delta_i)\cdot\color{darkblue}M \\ & \color{darkblue}v_i \le \color{darkred}U + (1-\color{darkred}\delta_i)\cdot \color{darkblue}M \\ & \sum_i \color{darkred}\delta_i=\color{darkblue}k \\ & \color{darkred}\delta_i \in \{0,1\} \\ & \color{darkred}L,\color{darkred}U \in [ \min_i \color{darkblue}v_i, \max_i \color{darkblue}v_i] \\ & \color{darkred}R \ge 0\end{align}\] |
---- 114 PARAMETER results

| | MIP | IMPROVED |
|---|---|---|
| points | 1000.000 | 1000.000 |
| vars | 1003.000 | 1003.000 |
| discr | 1000.000 | 1000.000 |
| equs | 2002.000 | 2002.000 |
| status | Optimal | Optimal |
| obj | 15.923 | 15.923 |
| time | 75.203 | 2.578 |
| nodes | 53387.000 | 6445.000 |
| iterations | 355004.000 | 88794.000 |
Algorithm
- Sort the vector \(\color{darkblue}v\).
- Calculate the ranges \(r_j := \color{darkblue}v_{j+k-1} - \color{darkblue}v_j \) for \(j=1,\dots,n-k+1\).
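- Pick the smallest \(r_j\); the selected elements are \(\color{darkblue}v_j,\dots,\color{darkblue}v_{j+k-1}\) in the sorted vector.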
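The steps above can be sketched in a few lines of Python. This is an illustrative version, not the GAMS code used for the timings; the function name `closest_k_sorted` and the sample data are assumptions. It relies on the observation that, after sorting, an optimal selection can always be taken as \(k\) consecutive elements, so only \(n-k+1\) windows need to be examined.

```python
def closest_k_sorted(v, k):
    """Return (range, window) for the k closest elements of v.

    Sketch of the sorting approach: sort, then slide a window of k
    consecutive elements and keep the one with the smallest range.
    """
    if not 1 <= k <= len(v):
        raise ValueError("k must satisfy 1 <= k <= len(v)")
    s = sorted(v)                      # step 1: sort ascending
    n = len(s)
    # steps 2 and 3: r_j = s[j+k-1] - s[j] (0-based j), pick the smallest
    best_j = min(range(n - k + 1), key=lambda j: s[j + k - 1] - s[j])
    window = s[best_j:best_j + k]
    return window[-1] - window[0], window


# Small made-up data set (not the data used in the article).
v = [10.0, 1.0, 4.0, 11.5, 12.0, 7.0]
print(closest_k_sorted(v, 3))
```

The cost is dominated by the sort, so the whole procedure runs in \(O(n \log n)\) time.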
---- 170 PARAMETER results

| | MIP | IMPROVED | SORTING |
|---|---|---|---|
| points | 1000.000 | 1000.000 | 1000.000 |
| vars | 1003.000 | 1003.000 | |
| discr | 1000.000 | 1000.000 | |
| equs | 2002.000 | 2002.000 | |
| status | Optimal | Optimal | |
| obj | 15.923 | 15.923 | 15.923 |
| time | 75.203 | 2.578 | 0.085 |
| nodes | 53387.000 | 6445.000 | |
| iterations | 355004.000 | 88794.000 | |
Conclusion
References
1. Find n-1 closest values based on criteria in a dataframe in R, https://stackoverflow.com/questions/73337722/find-n-1-closest-values-based-on-criteria-in-a-dataframe-in-r
Appendix: GAMS code
$ontext |