Arranging points on a line

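The problem stated in [1] is: given $n$ points, each with its own range, adjust all points to maximize the minimum distance between adjacent points.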

In the original problem, the poster talked about the distance between neighbors. But we don't know in advance what the neighboring points are. Of course, we can just generalize and talk about any two points.
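Nothing is lost by this generalization: for points on a line, the smallest distance over all pairs is always attained by two points that are neighbors in sorted order. Here is a minimal Python sketch to illustrate this (the function names and the small test vector are mine, just for illustration):

```python
from itertools import combinations

def min_dist_all_pairs(x):
    """Smallest |x[i]-x[j]| over all pairs i<j."""
    return min(abs(a - b) for a, b in combinations(x, 2))

def min_dist_neighbors(x):
    """Smallest gap between consecutive points after sorting."""
    s = sorted(x)
    return min(b - a for a, b in zip(s, s[1:]))

pts = [4.0, 1.0, 9.5, 3.0, 7.0]   # made-up locations
print(min_dist_all_pairs(pts), min_dist_neighbors(pts))
```

Both functions return the same value, so a model over all pairs $i \lt j$ optimizes the same objective as one over neighbors only.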

Data

To do some experiments, I generated a data set with 50 points. Their ranges are:

----     15 PARAMETER bounds  ranges

i1 .lo 13.740,    i1 .up 23.656,    i2 .lo 67.461,    i2 .up 78.799,    i3 .lo 44.030,    i3 .up 53.442
i4 .lo 24.091,    i4 .up 28.349,    i5 .lo 23.377,    i5 .up 24.673,    i6 .lo 17.924,    i6 .up 19.462
i7 .lo 27.986,    i7 .up 37.605,    i8 .lo 68.502,    i8 .up 76.681,    i9 .lo  5.369,    i9 .up  5.842
i10.lo 40.017,    i10.up 51.902,    i11.lo 79.849,    i11.up 80.941,    i12.lo 46.299,    i12.up 48.934
i13.lo 79.291,    i13.up 87.175,    i14.lo 60.980,    i14.up 72.233,    i15.lo 10.455,    i15.up 13.127
i16.lo 51.178,    i16.up 51.690,    i17.lo 12.761,    i17.up 21.538,    i18.lo 20.006,    i18.up 29.325
i19.lo 53.514,    i19.up 59.355,    i20.lo 34.829,    i20.up 40.209,    i21.lo 28.776,    i21.up 32.422
i22.lo 28.115,    i22.up 31.812,    i23.lo 10.519,    i23.up 12.477,    i24.lo 12.008,    i24.up 26.010
i25.lo 47.129,    i25.up 52.828,    i26.lo 66.471,    i26.up 78.222,    i27.lo 18.465,    i27.up 22.966
i28.lo 53.259,    i28.up 55.141,    i29.lo 62.069,    i29.up 73.302,    i30.lo 24.293,    i30.up 25.331
i31.lo  8.839,    i31.up 11.870,    i32.lo 40.191,    i32.up 40.267,    i33.lo 12.814,    i33.up 16.858
i34.lo 69.797,    i34.up 77.295,    i35.lo 21.209,    i35.up 23.478,    i36.lo 22.865,    i36.up 25.478
i37.lo 47.516,    i37.up 52.476,    i38.lo 57.818,    i38.up 62.571,    i39.lo 50.260,    i39.up 55.091
i40.lo 37.104,    i40.up 51.563,    i41.lo 33.065,    i41.up 47.969,    i42.lo  9.416,    i42.up 14.964
i43.lo 25.137,    i43.up 30.730,    i44.lo  3.724,    i44.up 15.304,    i45.lo 27.084,    i45.up 33.034
i46.lo 14.568,    i46.up 28.264,    i47.lo 51.658,    i47.up 53.452,    i48.lo 44.860,    i48.up 55.892
i49.lo 61.597,    i49.up 62.428,    i50.lo 23.824,    i50.up 32.469

The data looks like:

High-level model

A high-level model that defines our problem can look like:

High-level Model
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \|\color{darkred}x_i - \color{darkred}x_j\| && \forall i \lt j \\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \end{align}\]

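We only need to compare points $i$ and $j$ if $i\lt j$ (otherwise we would be checking each pair twice). Because the points live on a line, the norm is simply the absolute value $|\color{darkred}x_i - \color{darkred}x_j|$, and modeling that absolute value is the interesting part.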

MINLP Model

In order to be able to use a standard MINLP solver, we can formulate:

MINLP Model
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \color{darkred}\delta_{i,j} (\color{darkred}x_i-\color{darkred}x_j) + (1-\color{darkred}\delta_{i,j})(\color{darkred}x_j-\color{darkred}x_i) && \forall i\lt j\\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \\ & \color{darkred}\delta_{i,j} \in \{0,1\}\end{align}\]
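One way to read the binary variables: a consistent assignment of all $\delta_{i,j}$ fixes the left-to-right order of the points. For a tiny instance, we can therefore enumerate all orders and, for each order, find the best $z$ by bisection combined with a greedy leftmost placement. This is my own illustrative sketch, not part of the experiments reported here, and the helper names are hypothetical. With $n!$ orders it is only usable as a sanity check for very small $n$.

```python
from itertools import permutations

def feasible(order, lo, up, z):
    """Greedy check: can the points, visited left to right in `order`,
    be placed inside their ranges with consecutive gaps of at least z?"""
    pos = float("-inf")
    for k in order:
        pos = max(lo[k], pos + z)   # leftmost admissible position for point k
        if pos > up[k]:
            return False
    return True

def best_gap_for_order(order, lo, up, tol=1e-9):
    """Bisection on z for one fixed order; None if the order is impossible."""
    if not feasible(order, lo, up, 0.0):
        return None
    a, b = 0.0, max(up) - min(lo)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if feasible(order, lo, up, mid):
            a = mid
        else:
            b = mid
    return a

def brute_force(lo, up):
    """Best max-min distance over all orders (tiny instances only)."""
    gaps = (best_gap_for_order(p, lo, up) for p in permutations(range(len(lo))))
    return max(g for g in gaps if g is not None)

print(brute_force([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))   # three points in [0,1]
```

For three points that may each move in $[0,1]$, the result is (up to the bisection tolerance) 0.5: the points end up at 0, 0.5 and 1.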

I have some thoughts about this model. First, in some cases, the algorithm may initially have a negative objective. This is the case when it just chooses $\color{darkred}\delta_{i,j}=1$ for cases where range $i$ is to the left of range $j$. In that case, $\color{darkred}x_i-\color{darkred}x_j$ is negative. (Or the other way around: it chooses $\color{darkred}\delta_{i,j}=0$ for cases where range $i$ is to the right of range $j$.) One simple fix is to add the lower bound: \[\color{darkred}z \ge 0\]

Another, more impactful change can be devised by observing that if two ranges $i$ and $j$ don't overlap, we already know which branch to choose. So we can add the bounds: \[\begin{align} & \color{darkblue}\ell_j \gt \color{darkblue}u_i\Rightarrow \color{darkred}\delta_{i,j}=0 && \forall i\lt j\\ & \color{darkblue}\ell_i \gt \color{darkblue}u_j\Rightarrow \color{darkred}\delta_{i,j}=1 && \forall i\lt j \end{align} \] Note that this is just fixing some of the binary variables ahead of solving the problem. So, using our data set, how many binary variables could be fixed due to non-overlapping ranges? I kept track:

----     77 PARAMETER fixed  statistics on fixed variables

total    1225,    fixed(0)  529,    fixed(1)  493,    unfixed   203

This means that of a total of 1225 binary variables, we could fix 529 to zero and 493 to one. Only 203 binary variables are left. The question is of course: will this help the model, or are solvers smart enough that we don't need to do this fixing? We have to try this out.
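The counting itself is simple. Below is a small Python sketch of the fixing rule, using the $i \lt j$ convention of the formulas above. The function name `classify_pairs` is made up for this illustration, and I only use the first five ranges of the data set, so the counts differ from the full 50-point statistics.

```python
def classify_pairs(lo, up):
    """Count pairs i<j whose delta can be fixed because the ranges do not overlap."""
    n = len(lo)
    stats = {"total": 0, "fixed(0)": 0, "fixed(1)": 0, "unfixed": 0}
    for i in range(n):
        for j in range(i + 1, n):
            stats["total"] += 1
            if lo[j] > up[i]:        # range i entirely left of range j: delta = 0
                stats["fixed(0)"] += 1
            elif lo[i] > up[j]:      # range i entirely right of range j: delta = 1
                stats["fixed(1)"] += 1
            else:                    # overlapping ranges: order not known in advance
                stats["unfixed"] += 1
    return stats

# ranges of i1..i5 from the data set
lo = [13.740, 67.461, 44.030, 24.091, 23.377]
up = [23.656, 78.799, 53.442, 28.349, 24.673]
print(classify_pairs(lo, up))
```

For these five ranges, 8 of the 10 pairs can be fixed (3 to zero, 5 to one). Only the pairs (i1,i5) and (i4,i5) overlap.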

MIP Models

We can linearize the absolute value. Admittedly this requires a bit of effort.

The constraint $\color{darkred}y=|\color{darkred}x|$ can be interpreted as:\[\begin{align} & \color{darkred}y\ge \color{darkred}x \>{\bf and}\>\color{darkred}y \ge -\color{darkred}x \\ & \color{darkred}y\le \color{darkred}x \>{\bf or}\>\color{darkred}y \le -\color{darkred}x \end{align}\] This can be linearized as: \[\begin{align} & \color{darkred}y \ge \color{darkred}x \\ & \color{darkred}y \ge -\color{darkred}x \\ &\color{darkred}y \le \color{darkred}x + \color{darkblue}M\cdot \color{darkred}\delta \\ &\color{darkred}y \le -\color{darkred}x + \color{darkblue}M\cdot (1-\color{darkred}\delta) \\ & \color{darkred}y \ge 0 \\ & \color{darkred}\delta \in \{0,1\}\end{align}\]
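To convince ourselves that this linearization really pins $y$ to $|x|$, we can check it numerically. The following Python sketch tests, for a given $x$, which values of $y$ on a grid are feasible for at least one choice of $\delta$. The function names and the value $M=20$ are assumptions for this check ($M$ just needs to be at least $2|x|$ for the values of $x$ tried).

```python
def satisfies(x, y, delta, M):
    """True if (x, y, delta) meets all constraints of the big-M formulation."""
    return (y >= x and y >= -x
            and y <= x + M * delta
            and y <= -x + M * (1 - delta)
            and y >= 0 and delta in (0, 1))

def feasible_y(x, M, candidates):
    """All candidate y values that are feasible for some delta in {0,1}."""
    return [y for y in candidates
            if any(satisfies(x, y, d, M) for d in (0, 1))]

M = 20.0                                # assumed big-M, valid for |x| <= 10
grid = [k / 2 for k in range(0, 21)]    # y = 0, 0.5, ..., 10
print(feasible_y(-3.0, M, grid))
print(feasible_y(4.5, M, grid))
```

In both cases, only a single grid point survives: $y=3$ for $x=-3$ (with $\delta=1$) and $y=4.5$ for $x=4.5$ (with $\delta=0$).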

When we apply this to our model we can do the following:

MIP Model with binary variables
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \color{darkred}d_{i,j} && \forall i \lt j \\ & \color{darkred}d_{i,j} \ge \color{darkred}x_i - \color{darkred}x_j && \forall i \lt j \\ & \color{darkred}d_{i,j}\ge \color{darkred}x_j-\color{darkred}x_i && \forall i \lt j \\ & \color{darkred}d_{i,j} \le \color{darkred}x_i - \color{darkred}x_j + \color{darkblue}M \cdot (1-\color{darkred}\delta_{i,j}) && \forall i \lt j \\ & \color{darkred}d_{i,j}\le \color{darkred}x_j-\color{darkred}x_i + \color{darkblue}M \cdot \color{darkred}\delta_{i,j} && \forall i \lt j \\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \\ & \color{darkred}\delta_{i,j}\in \{0,1\} \\ & \color{darkred}d_{i,j} \ge 0 \end{align}\]

I calculated \[\color{darkblue}M_{i,j} = 2 \left[\max\{\color{darkblue}u_i,\color{darkblue}u_j\}-\min\{\color{darkblue}\ell_i,\color{darkblue}\ell_j\} \right]\] The factor 2 is present as we have to bridge the gap between $-|x|$ and $+|x|$. We can use the same fixing rule as before.
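The factor 2 can be motivated as follows (my own short derivation, not a formal proof). Suppose $\color{darkred}\delta_{i,j}=0$, so that $\color{darkred}x_j \ge \color{darkred}x_i$ and the active constraints force $\color{darkred}d_{i,j} = \color{darkred}x_j - \color{darkred}x_i$. The relaxed constraint must then not be binding: \[\begin{align} & \color{darkred}x_j - \color{darkred}x_i \le \color{darkred}x_i - \color{darkred}x_j + \color{darkblue}M_{i,j} \iff \color{darkblue}M_{i,j} \ge 2(\color{darkred}x_j - \color{darkred}x_i) \\ & \color{darkred}x_j - \color{darkred}x_i \le \color{darkblue}u_j - \color{darkblue}\ell_i \le \max\{\color{darkblue}u_i,\color{darkblue}u_j\}-\min\{\color{darkblue}\ell_i,\color{darkblue}\ell_j\} \end{align}\] The case $\color{darkred}\delta_{i,j}=1$ is symmetric, with the roles of $i$ and $j$ swapped. So twice the width of the combined range is always large enough.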

If we don't have good bounds (well, we do), we can use SOS1 variables or indicator variables. Here is an SOS1-based model:

MIP Model with SOS1 sets
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \color{darkred}d_{i,j} && \forall i \lt j \\ & \color{darkred}d_{i,j} = \color{darkred}x_i - \color{darkred}x_j + \color{darkred}v_{i,j,1} && \forall i \lt j \\ & \color{darkred}d_{i,j} = \color{darkred}x_j-\color{darkred}x_i + \color{darkred}v_{i,j,2} && \forall i \lt j \\ &\color{darkred}v_{i,j,1}, \color{darkred}v_{i,j,2} \in {\bf SOS1} && \forall i \lt j\\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \\ & \color{darkred}v_{i,j,k}\ge 0 \end{align}\]

We can also fix $\color{darkred}v_{i,j,k}=0$ where appropriate.
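To see how the SOS1 construction recovers the absolute value without any big-M, consider a single pair with given locations. At most one of $v_{i,j,1}, v_{i,j,2}$ may be nonzero, so we can enumerate the two cases and keep those where the other slack is nonnegative. The sketch below does that in Python. The function name is hypothetical, and the locations are just made-up numbers.

```python
def sos1_solutions(xi, xj):
    """Enumerate the two SOS1 cases (v1 = 0 or v2 = 0) for one pair
    and keep those where the remaining slack is nonnegative."""
    sols = []
    # case v1 = 0: d = xi - xj, so v2 = d - (xj - xi) = 2*(xi - xj)
    v2 = 2 * (xi - xj)
    if v2 >= 0:
        sols.append({"d": xi - xj, "v1": 0.0, "v2": v2})
    # case v2 = 0: d = xj - xi, so v1 = d - (xi - xj) = 2*(xj - xi)
    v1 = 2 * (xj - xi)
    if v1 >= 0:
        sols.append({"d": xj - xi, "v1": v1, "v2": 0.0})
    return sols

print(sos1_solutions(2.0, 5.5))
```

For $x_i=2$ and $x_j=5.5$, only the second case is feasible, giving $d_{i,j}=3.5$ and $v_{i,j,1}=7$. The slack again spans twice the distance, but here the solver handles it through branching on the SOS1 set instead of through a big-M constant.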

Results

The results look like:

----    192 PARAMETER results  

                 MINLP    MINLP/FX     MIP/BIN  MIP/BIN/FX     MIP/SOS  MIP/SOS/FX

points          50.000      50.000      50.000      50.000      50.000      50.000
vars          1276.000    1276.000    2501.000    2501.000    3726.000    3726.000
  discr       1225.000    1225.000    1225.000    1225.000
  fixed                   1022.000                1022.000                1022.000
equs          1225.000    1225.000    6125.000    6125.000    3675.000    3675.000
status       TimeLimit   TimeLimit     Optimal     Optimal   TimeLimit   TimeLimit
obj              1.130       1.130       1.130       1.130       1.130       1.130
time          1000.010    1000.000     101.734     103.407    1009.687    1000.625
nodes          405.000     440.000  813120.000  813120.000 1.937801E+7 1.940324E+7
iterations                         8989768.000 8989768.000 5.076019E+7 3.436852E+7
gap%             8.362       8.362                               4.082       0.609

All models find the optimal solution of 1.130, but only the MIP model with binary variables was able to prove optimality within the allotted time of 1,000 seconds (it needed a little over 100 seconds). The fixing does not make much difference except in the case of SOS1 sets, where the remaining gap drops from 4.082% to 0.609%. Why this is the case, I don't know.

For the MIP model, indeed Cplex recognizes it can remove a lot of binary variables. The logs indicate this. For both versions (without and with fixing), Cplex shows:

Reduced MIP has 203 binaries, 0 generals, 0 SOSs, and 0 indicators.

This is the number of binary variables we expect to see after removing all fixed binary variables. For the SOS1 models we see:

Reduced MIP has 0 binaries, 0 generals, 203 SOSs, and 0 indicators.

Genetic algorithm

A quick experiment using the ga function in R leads to poor results. Using default settings I find a best objective of 0.35 which is way below the optimal value of 1.130. The advantage is that we can use the high-level model directly. But these results indicate this method may have trouble beating the MIP model.

> res <- ga(type="real-valued",fitness=obj,lower=df$lo,upper=df$up,monitor=T)
GA | iter = 1 | Mean = 0.02192014 | Best = 0.08627789
GA | iter = 2 | Mean = 0.03742875 | Best = 0.12054808
GA | iter = 3 | Mean = 0.04143667 | Best = 0.17856289
GA | iter = 4 | Mean = 0.04257377 | Best = 0.17856289
GA | iter = 5 | Mean = 0.04524743 | Best = 0.17856289
GA | iter = 6 | Mean = 0.04677386 | Best = 0.21788714
GA | iter = 7 | Mean = 0.05426804 | Best = 0.21788714
GA | iter = 8 | Mean = 0.04921423 | Best = 0.21788714
GA | iter = 9 | Mean = 0.04905014 | Best = 0.21788714
 . . .
GA | iter = 90 | Mean = 0.2727138 | Best = 0.3488972
GA | iter = 91 | Mean = 0.2973201 | Best = 0.3488972
GA | iter = 92 | Mean = 0.2946026 | Best = 0.3488972
GA | iter = 93 | Mean = 0.2891295 | Best = 0.3488972
GA | iter = 94 | Mean = 0.2928266 | Best = 0.3488972
GA | iter = 95 | Mean = 0.3196950 | Best = 0.3488972
GA | iter = 96 | Mean = 0.3192331 | Best = 0.3488972
GA | iter = 97 | Mean = 0.3014952 | Best = 0.3488972
GA | iter = 98 | Mean = 0.3015511 | Best = 0.3488972
GA | iter = 99 | Mean = 0.3093204 | Best = 0.3488972
GA | iter = 100 | Mean = 0.2887380 | Best = 0.3488972

Conclusion

This is a fascinating little model, and we can attack it in different ways. A linearized MIP model using binary variables seems the best of the things I tried. It looks like spending effort on fixing as many binary variables as possible may not be needed with a good MIP solver: the presolver will take care of that (interestingly, not for the SOS1 model). A meta-heuristic, in our case a genetic algorithm, has trouble finding really good, close-to-optimal solutions.

References

[1] Given n points where each point has its own range, adjust all points to maximize the minimum distance of adjacent points, https://stackoverflow.com/questions/68180974/given-n-points-where-each-point-has-its-own-range-adjust-all-points-to-maximize

Appendix 1. GAMS models

$ontext

   Arrange points on a line such that minimum distance between points
   is maximized. Each point has a valid interval where it can be placed.
   These intervals may overlap.

   Random data set with 50 points.

   Implement three different models:
     1. MINLP model
     2. MIP model using binary variables
     3. MIP model using SOS1 variables

   All models are solved in two ways: without and with fixing variables
   for pairs of non-overlapping ranges

   erwin@amsterdamoptimization.com

$offtext

*--------------------------------------------------
* data
*--------------------------------------------------

set
i 'points'/i1*i50/
b 'bounds'/lo,up/
;

parameter bounds(i,b) 'ranges';
bounds(i,'lo') = uniform(0,80);
bounds(i,'up') = bounds(i,'lo') + uniform(0,15);

option bounds:3:0:6;
display bounds;

*-------------------------------------------------------
* reporting macros
*-------------------------------------------------------

acronym TimeLimit;
acronym Optimal;
acronym Error;

parameter results(*,*);
$macro report(m,label,isfixed) \
    results('points',label) = card(i); \
    results('vars',label) = m.numVar; \
    results(' discr',label) = m.numDVar; \
    results(' fixed',label)$isfixed = card(fix0)+card(fix1); \
    results('equs',label) = m.numEqu; \
    results('status',label) = Error; \
    results('status',label)$(m.solvestat=1) = Optimal; \
    results('status',label)$(m.solvestat=3) = TimeLimit; \
    results('obj',label) = z.l; \
    results('time',label) = m.resusd; \
    results('nodes',label) = m.nodusd; \
    results('iterations',label) = m.iterusd; \
    results('gap%',label)$(m.solvestat=3) = 100*abs(m.objest - m.objval)/abs(m.objest);

*--------------------------------------------------
* MINLP model
*--------------------------------------------------

alias(i,j);
variable
   x(i)         'location'
   delta(i,j)   'binary variable'
   z
;
binaryvariable delta;

x.lo(i) = bounds(i,'lo');
x.up(i) = bounds(i,'up');
z.lo = 0;

* what can we fix?
sets
   gt(i,j)
   fix0(i,j)
   fix1(i,j)
;
gt(i,j) = ord(i)>ord(j);
fix0(gt(i,j)) = bounds(j,'lo')>bounds(i,'up');
fix1(gt(i,j)) = bounds(i,'lo')>bounds(j,'up');

parameter fixed(*) 'statistics on fixed variables';
fixed('total') = card(gt);
fixed('fixed(0)') = card(fix0);
fixed('fixed(1)') = card(fix1);
fixed('unfixed') = card(gt)-card(fix0)-card(fix1);
option fixed:0;
display fixed;

equations dist(i,j);

dist(gt(i,j)).. z =l= delta(i,j)*(x(i)-x(j)) + (1-delta(i,j))*(x(j)-x(i));

model m1 /dist/;
option optcr=0, minlp=baron, threads=8, reslim=1000;
solve m1 maximizing z using minlp;

report(m1,'MINLP',0)
display results;

* remove solution
delta.l(gt) = 0;
x.l(i) = 0;

* and fix variables
delta.fx(fix0) = 0;
delta.fx(fix1) = 1;

solve m1 maximizing z using minlp;

report(m1,'MINLP/FX',1)
display results;

*unfix for next model
delta.lo(gt) = 0;
delta.up(gt) = 1;

*--------------------------------------------------
* MIP model
*--------------------------------------------------

parameter M(i,j) 'big-M';
M(gt(i,j)) = 2*(max(bounds(i,'up'),bounds(j,'up')) - min(bounds(i,'lo'),bounds(j,'lo')));
* the factor 2 is to bridge from -abs() to +abs()

positivevariable d(i,j) 'absolute distance between points i and j';

equations
   dist1(i,j), dist2(i,j), dist3(i,j), dist4(i,j)
   smallest(i,j) 'bound on d(i,j)'
;

dist1(gt(i,j)).. d(i,j) =g= x(i)-x(j);
dist2(gt(i,j)).. d(i,j) =g= x(j)-x(i);
dist3(gt(i,j)).. d(i,j) =l= x(i)-x(j) + M(i,j)*(1-delta(i,j));
dist4(gt(i,j)).. d(i,j) =l= x(j)-x(i) + M(i,j)*delta(i,j);
smallest(gt(i,j)).. z =l= d(i,j);

model m2 /dist1,dist2,dist3,dist4,smallest/;
solve m2 maximizing z using mip;

report(m2,'MIP/BIN',0)
display results;

* fix variables
delta.fx(fix0) = 0;
delta.fx(fix1) = 1;

solve m2 maximizing z using mip;

report(m2,'MIP/BIN/FX',1)
display results;

*--------------------------------------------------
* MIP model using SOS1 variables
*--------------------------------------------------

set k /case1,case2/;

sos1variable v(i,j,k);

equations
distA(i,j), distB(i,j)
;

distA(gt(i,j)).. d(i,j) =e= x(i)-x(j) + v(i,j,'case1');
distB(gt(i,j)).. d(i,j) =e= x(j)-x(i) + v(i,j,'case2');

model m3 /distA,distB,smallest/;
solve m3 maximizing z using mip;

report(m3,'MIP/SOS',0)
display results;

* fix variables
v.fx(fix0,'case2') = 0;
v.fx(fix1,'case1') = 0;

solve m3 maximizing z using mip;

report(m3,'MIP/SOS/FX',1)
display results;

Appendix 2: R code for GA model

df <- read.table(text="
id     lo          up
i1     13.73977    23.65636
i2     67.46134    78.79866
i3     44.03003    53.44174
i4     24.09103    28.34900
i5     23.37697    24.67334
i6     17.92423    19.46195
i7     27.98644    37.60521
i8     68.50163    76.68127
i9      5.36910     5.84197
i10    40.01685    51.90226
i11    79.84941    80.94092
i12    46.29867    48.93359
i13    79.29064    87.17513
i14    60.98004    72.23315
i15    10.45540    13.12725
i16    51.17750    51.68962
i17    12.76143    21.53840
i18    20.00644    29.32489
i19    53.51429    59.35472
i20    34.82851    40.20922
i21    28.77602    32.42154
i22    28.11531    31.81163
i23    10.51933    12.47687
i24    12.00814    26.00989
i25    47.12909    52.82816
i26    66.47142    78.22243
i27    18.46526    22.96577
i28    53.25876    55.14101
i29    62.06861    73.30172
i30    24.29268    25.33117
i31     8.83938    11.86962
i32    40.19079    40.26678
i33    12.81382    16.85802
i34    69.79698    77.29476
i35    21.20916    23.47845
i36    22.86515    25.47769
i37    47.51647    52.47604
i38    57.81753    62.57112
i39    50.25989    55.09120
i40    37.10383    51.56348
i41    33.06456    47.96859
i42     9.41563    14.96417
i43    25.13698    30.73031
i44     3.72412    15.30380
i45    27.08402    33.03428
i46    14.56797    28.26441
i47    51.65817    53.45184
i48    44.85964    55.89183
i49    61.59694    62.42821
i50    23.82447    32.46897
",header=T)

# fitness function: smallest distance between any two points
obj <- function(x) {
  smallest <- Inf
  n <- length(x)
  for (i in 1:n)
    if (i < n)
      for (j in (i+1):n) {
        smallest <- min(abs(x[i]-x[j]), smallest)
      }
  smallest
}

library(GA);
res <- ga(type="real-valued",fitness=obj,lower=df$lo,upper=df$up,monitor=T)
