In [1], the following problem was posted:
I have multiple sets of data points. For example, set 1 contains 5 data points, set 2 contains 1 data point, set 3 contains 10, etc. I need to select one data point from each set so that distances between these selected points is minimal. Any Python based functions to be used will be very helpful
This can be stated as a Mixed-Integer Programming (MIP) problem. Writing down the model is a useful exercise: besides giving us something to solve, it forces us to define the problem a bit more precisely than a typical "word problem" does. So here is my suggestion for a simple MIP model:
| MIP Model |
|---|
| \[\begin{align}\min&\sum_{i,j|\color{darkblue}ok(i,j)}\color{darkblue}{\mathit{dist}}_{i,j}\cdot\color{darkred}{\mathit{pair}}_{i,j} \\ & \sum_{i|\color{darkblue}group(i,g)}\color{darkred}x_i = 1 && \forall g\\ & \color{darkred}{\mathit{pair}}_{i,j} \ge \color{darkred}x_i + \color{darkred}x_j - 1 && \forall i,j|\color{darkblue}ok(i,j)\\ & \color{darkred}x_i \in \{0,1\}\\ & \color{darkred}{\mathit{pair}}_{i,j} \in \{0,1\} \end{align}\] |
- The set \(\color{darkblue}ok(i,j)\) contains the pairs \((i,j)\) for which points \(i\) and \(j\) are in different groups and \(i\lt j\). These are the allowed pairs. The purpose of the restriction \(i\lt j\) is to prevent double counting.
- The first constraint says: we want to select exactly one point in each group.
- The second constraint implements the implication: \[\color{darkred}x_i = \color{darkred}x_j = 1 \Rightarrow \color{darkred}{\mathit{pair}}_{i,j}=1\]
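- \( \color{darkred}{\mathit{pair}}_{i,j}\) may be relaxed to be continuous between 0 and 1. Because we minimize and the distances are nonnegative, there is always an optimal solution in which each \(\color{darkred}{\mathit{pair}}_{i,j}\) sits at its lower bound \(\max\{0, \color{darkred}x_i + \color{darkred}x_j - 1\}\), which is automatically 0 or 1 when the \(\color{darkred}x_i\) are binary.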
- I used as objective: minimize the sum of the distances between all selected points (making sure we don't do any double counting). Another objective could be: minimize the size of the square that contains all selected points. However, that would make the model non-linear (non-convex quadratic objective).
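To make the roles of \(ok(i,j)\), the pair variables, and the objective concrete, here is a minimal pure-Python sketch. It is not the MIP model itself: it simply enumerates all "one point per group" selections and evaluates the same objective. The tiny data set, the point names, and the use of Euclidean distance are assumptions made for illustration only. Brute force is only viable for very small instances, which is exactly why a MIP solver is attractive here.

```python
from itertools import combinations, product
from math import dist

# Hypothetical toy data (NOT the example data of this post):
# point name -> (group, (x, y))
points = {
    "a1": ("g1", (0.0, 0.0)),
    "a2": ("g1", (5.0, 5.0)),
    "b1": ("g2", (1.0, 0.0)),
    "b2": ("g2", (9.0, 9.0)),
    "c1": ("g3", (0.0, 1.0)),
    "c2": ("g3", (5.0, 6.0)),
}
names = list(points)  # fixed ordering: "i < j" means i comes before j


def group(i):
    return points[i][0]


def loc(i):
    return points[i][1]


# ok(i,j): i before j (no double counting) and different groups
ok = [(i, j) for i, j in combinations(names, 2) if group(i) != group(j)]


def objective(selected):
    """Sum of dist[i,j] * pair[i,j] over the allowed pairs."""
    x = {i: int(i in selected) for i in names}
    total = 0.0
    for i, j in ok:
        # Smallest value allowed by pair >= x_i + x_j - 1 and pair >= 0
        pair = max(0, x[i] + x[j] - 1)
        assert pair == x[i] * x[j]  # the linearization reproduces the product
        total += dist(loc(i), loc(j)) * pair  # Euclidean distance (assumed)
    return total


# Enumerate every selection with exactly one point from each group
groups = sorted({group(i) for i in names})
members = {g: [i for i in names if group(i) == g] for g in groups}
best = min(product(*(members[g] for g in groups)),
           key=lambda s: objective(set(s)))
best_value = objective(set(best))

print(best, round(best_value, 3))
```

On this toy data the three mutually closest points `a1`, `b1`, `c1` are selected, with objective \(1 + 1 + \sqrt{2} \approx 3.414\).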
Example data
---- 29 SET i  points
i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11, i12, i13, i14, i15
i16, i17, i18, i19, i20, i21, i22, i23, i24, i25, i26, i27, i28, i29, i30
i31, i32, i33, i34, i35, i36, i37, i38, i39, i40, i41, i42, i43, i44, i45
i46, i47, i48, i49, i50

---- 29 SET g  groups
g1, g2, g3, g4, g5

---- 29 SET ig  group membership

Each point i1, ..., i50 has exactly one YES entry under one of the columns g1, ..., g5, so every point belongs to exactly one group. The column position of the individual YES entries is not preserved in this listing.

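---- 50 PARAMETER loc  coordinates of locations

| | x | y |
|---|---|---|
| i1 | 66.111 | 75.582 |
| i2 | 62.745 | 28.386 |
| i3 | 8.642 | 10.251 |
| i4 | 64.125 | 54.531 |
| i5 | 3.152 | 79.236 |
| i6 | 7.277 | 17.566 |
| i7 | 52.563 | 75.021 |
| i8 | 17.812 | 3.414 |
| i9 | 58.513 | 62.123 |
| i10 | 38.936 | 35.871 |
| i11 | 24.303 | 24.642 |
| i12 | 13.050 | 93.345 |
| i13 | 37.994 | 78.340 |
| i14 | 30.003 | 12.548 |
| i15 | 74.887 | 6.923 |
| i16 | 20.202 | 0.507 |
| i17 | 26.961 | 49.985 |
| i18 | 15.129 | 17.417 |
| i19 | 33.064 | 31.691 |
| i20 | 32.209 | 96.398 |
| i21 | 99.360 | 36.990 |
| i22 | 37.289 | 77.198 |
| i23 | 39.668 | 91.310 |
| i24 | 11.958 | 73.548 |
| i25 | 5.542 | 57.630 |
| i26 | 5.141 | 0.601 |
| i27 | 40.123 | 51.988 |
| i28 | 62.888 | 22.575 |
| i29 | 39.612 | 27.601 |
| i30 | 15.237 | 93.632 |
| i31 | 42.266 | 13.466 |
| i32 | 38.606 | 37.463 |
| i33 | 26.848 | 94.837 |
| i34 | 18.894 | 29.751 |
| i35 | 7.455 | 40.135 |
| i36 | 10.169 | 38.389 |
| i37 | 32.409 | 19.213 |
| i38 | 11.237 | 59.656 |
| i39 | 51.145 | 4.507 |
| i40 | 78.310 | 94.575 |
| i41 | 59.646 | 60.734 |
| i42 | 36.251 | 59.407 |
| i43 | 67.985 | 50.659 |
| i44 | 15.925 | 65.689 |
| i45 | 52.388 | 12.440 |
| i46 | 98.672 | 22.812 |
| i47 | 67.565 | 77.678 |
| i48 | 93.245 | 20.124 |
| i49 | 29.714 | 19.723 |
| i50 | 24.635 | 64.648 |
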
---- 83 VARIABLE x.L  select points
i11 1.000, i18 1.000, i31 1.000, i37 1.000, i49 1.000
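
As a quick sanity check on the reported selection, the sketch below recomputes the objective for the five selected points from their coordinates in the `loc` listing above. It assumes that the distances in the model are Euclidean, which the model statement itself does not spell out; with a different distance measure the numbers would change.

```python
from itertools import combinations
from math import dist

# Coordinates copied from the loc listing for the points with x.L = 1
selected = {
    "i11": (24.303, 24.642),
    "i18": (15.129, 17.417),
    "i31": (42.266, 13.466),
    "i37": (32.409, 19.213),
    "i49": (29.714, 19.723),
}

# Each unordered pair is visited once, mirroring the i < j restriction
pair_dist = {
    (i, j): dist(selected[i], selected[j])  # Euclidean distance (assumed)
    for i, j in combinations(selected, 2)
}
total = sum(pair_dist.values())

for (i, j), d in pair_dist.items():
    print(f"{i:>4} {j:>4} {d:8.3f}")
print(f"sum of pairwise distances: {total:.3f}")
```

Five selected points give ten pairs. All five lie in the lower-left part of the 100 by 100 area, which is what we would expect from an objective that pulls the selected points together.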
References
- [1] Find common data point in multiple sets, https://stackoverflow.com/questions/73904007/find-common-data-point-in-multiple-sets
Appendix: GAMS model
$onText |