Channel: Yet Another Math Programming Consultant

Continuous max sum rectangle: MIQP vs GA


This is a follow-up on the post on the max sum submatrix problem [1].  There we looked at the problem of finding a contiguous submatrix that maximizes the sum of the values in this submatrix.


A little bit more complicated is the following continuous version of this problem. 

Assume we have \(n\) points with an \(x\)- and \(y\)-coordinate and a value. Find the rectangle such that the sum of the values of all points inside is maximized.
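To make the objective concrete, here is a small Python sketch (my own illustration, not code from the post) that evaluates a candidate rectangle against a list of points:

```python
def rect_sum(points, xmin, xmax, ymin, ymax):
    """Sum the values of all points inside [xmin, xmax] x [ymin, ymax]."""
    return sum(v for x, y, v in points
               if xmin <= x <= xmax and ymin <= y <= ymax)

# tiny made-up example: three points (x, y, value)
pts = [(1.0, 1.0, 5.0), (2.0, 3.0, -2.0), (4.0, 4.0, 3.0)]
print(rect_sum(pts, 0, 3, 0, 5))  # the first two points are inside: 5 - 2 = 3.0
```

The difficulty is that this objective is a step function of the four corner coordinates, so it is not something we can hand to a smooth optimizer directly.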


Data set


I generated a random data set with \(n=100\) points:


----     10 PARAMETER p  points

                x        y    value

i1         17.175    5.141    8.655
i2         84.327    0.601   -3.025
i3         55.038   40.123   -9.834
i4         30.114   51.988    8.977
i5         29.221   62.888    1.438
i6         22.405   22.575   -3.327
i7         34.983   39.612    9.675
i8         85.627   27.601    5.329
i9          6.711   15.237   -7.798
i10        50.021   93.632    9.896
i11        99.812   42.266    1.606
i12        57.873   13.466   -6.672
i13        99.113   38.606    2.867
i14        76.225   37.463   -3.114
i15        13.069   26.848    8.247
i16        63.972   94.837    8.001
i17        15.952   18.894   -9.675
i18        25.008   29.751   -2.627
i19        66.893    7.455    3.288
i20        43.536   40.135    1.868
i21        35.970   10.169   -9.309
i22        35.144   38.389    6.836
i23        13.149   32.409    8.642
i24        15.010   19.213    0.159
i25        58.911   11.237   -4.008
i26        83.089   59.656   -0.068
i27        23.082   51.145   -9.101
i28        66.573    4.507    5.474
i29        77.586   78.310    0.659
i30        30.366   94.575    4.935
i31        11.049   59.646    4.401
i32        50.238   60.734    2.632
i33        16.017   36.251   -7.702
i34        87.246   59.407    9.423
i35        26.511   67.985    4.135
i36        28.581   50.659    9.725
i37        59.396   15.925    7.096
i38        72.272   65.689    2.429
i39        62.825   52.388    4.026
i40        46.380   12.440    4.018
i41        41.331   98.672    5.814
i42        11.770   22.812    2.204
i43        31.421   67.565   -8.914
i44         4.655   77.678   -0.296
i45        33.855   93.245   -8.949
i46        18.210   20.124    3.972
i47        64.573   29.714   -6.104
i48        56.075   19.723   -5.479
i49        76.996   24.635    6.273
i50        29.781   64.648    9.835
i51        66.111   73.497    5.013
i52        75.582    8.544    4.367
i53        62.745   15.035   -9.988
i54        28.386   43.419   -4.723
i55         8.642   18.694    6.476
i56        10.251   69.269    6.391
i57        64.125   76.297    7.208
i58        54.531   15.481   -5.746
i59         3.152   38.938   -0.864
i60        79.236   69.543   -9.233
i61         7.277   84.581   -3.540
i62        17.566   61.272   -1.202
i63        52.563   97.597   -3.693
i64        75.021    2.689   -7.305
i65        17.812   18.745    6.219
i66         3.414    8.712   -1.664
i67        58.513   54.040   -7.164
i68        62.123   12.686   -0.689
i69        38.936   73.400   -4.340
i70        35.871   11.323    7.914
i71        24.303   48.835   -8.712
i72        24.642   79.560   -1.708
i73        13.050   49.205   -3.168
i74        93.345   53.356   -0.634
i75        37.994    1.062    2.853
i76        78.340   54.387    2.872
i77        30.003   45.113   -3.248
i78        12.548   97.533   -7.984
i79        74.887   18.385    8.117
i80         6.923   16.353   -5.653
i81        20.202    2.463    8.377
i82         0.507   17.782   -0.965
i83        26.961    6.132   -8.201
i84        49.985    1.664   -2.516
i85        15.129   83.565   -1.700
i86        17.417   60.166   -1.916
i87        33.064    2.702   -7.767
i88        31.691   19.609    5.023
i89        32.209   95.071    6.068
i90        96.398   33.554   -9.527
i91        99.360   59.426   -0.382
i92        36.990   25.919   -4.428
i93        37.289   64.063    8.032
i94        77.198   15.525   -9.648
i95        39.668   46.002    3.621
i96        91.310   39.334    9.018
i97        11.958   80.546    8.004
i98        73.548   54.099    7.976
i99         5.542   39.072    7.489
i100       57.630   55.782   -2.180





High-level model


Let's denote our points by: \(\color{darkblue}p_{i,a}\), where \(a=\{x,y,{\mathit{value}}\}\). Our decision variables are the corner points of the rectangle: \(\color{darkred}r_{c,q}\) where \(c=\{x,y\}\) and \(q=\{min,max\}\). These variables are continuous between \(\color{darkblue}L=0\) and  \(\color{darkblue}U=100\).

High-level model
\[\begin{align}\max & \sum_{\substack{\color{darkred}r_{x,min} \le \color{darkblue}x_i \le \color{darkred}r_{x,max}\\ \color{darkred}r_{y,min} \le \color{darkblue}y_i \le \color{darkred}r_{y,max}}}\color{darkblue}p_{i,{\mathit{value}}} \\ & \color{darkred}r_{x,min} \le \color{darkred}r_{x,max} \\ & \color{darkred}r_{y,min} \le \color{darkred}r_{y,max} \\ &\color{darkred}r_{c,q} \in [\color{darkblue}L,\color{darkblue}U]\end{align}\]

This looks easy. Well, not so fast...


Development


The first thing to do is to introduce binary variables that indicate whether a point \(i\) lies between \(\color{darkred}r_{c,min}\) and \(\color{darkred}r_{c,max}\) for both \(c \in \{x,y\}\). It is actually simpler to consider three cases: the point is to the left of the rectangle (\(\color{darkblue}p_{i,x} \le \color{darkred}r_{x,min}\)), inside its \(x\)-range, or to the right of it (\(\color{darkblue}p_{i,x} \ge \color{darkred}r_{x,max}\)).


So, we need to implement: \[\begin{align} &\color{darkred}\delta_{i,x,1}=1 \Rightarrow \color{darkblue}L \le \color{darkblue} p_{i,x} \le \color{darkred}r_{x,min} \\ & \color{darkred}\delta_{i,x,2}=1 \Rightarrow \color{darkred}r_{x,min} \le  \color{darkblue}p_{i,x} \le \color{darkred}r_{x,max} \\ &\color{darkred}\delta_{i,x,3}=1\Rightarrow \color{darkred}r_{x,max} \le \color{darkblue} p_{i,x}  \le \color{darkblue}U \\ & \sum_k \color{darkred}\delta_{i,x,k}= 1\end{align}\] We can formulate the implications as the following inequalities: \[\begin{align} &\color{darkred}r_{x,min} \ge \color{darkblue}p_{i,x} \cdot \color{darkred}\delta_{i,x,1} + \color{darkblue}L \cdot (1-\color{darkred}\delta_{i,x,1}) \\ & \color{darkred}r_{x,min} \le \color{darkblue} p_{i,x} \cdot \color{darkred}\delta_{i,x,2} +  \color{darkblue}U \cdot (1-\color{darkred}\delta_{i,x,2}) \\& \color{darkred}r_{x,max} \ge \color{darkblue}p_{i,x} \cdot \color{darkred}\delta_{i,x,2} + \color{darkblue}L \cdot (1-\color{darkred}\delta_{i,x,2}) \\ & \color{darkred}r_{x,max} \le \color{darkblue} p_{i,x} \cdot \color{darkred}\delta_{i,x,3} + \color{darkblue}U \cdot (1-\color{darkred}\delta_{i,x,3})\end{align}\]
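These inequalities can be sanity-checked numerically. The following Python sketch (my own; the helper name `feasible` is not from the post) verifies that for a fixed rectangle and a given point coordinate, exactly one of the three \(\color{darkred}\delta\) patterns satisfies all four inequalities (boundary ties aside):

```python
L, U = 0.0, 100.0  # variable bounds from the model

def feasible(px, rmin, rmax, d1, d2, d3):
    """Check the four indicator inequalities for one point and one delta pattern."""
    return (rmin >= px * d1 + L * (1 - d1) and
            rmin <= px * d2 + U * (1 - d2) and
            rmax >= px * d2 + L * (1 - d2) and
            rmax <= px * d3 + U * (1 - d3))

rmin, rmax = 20.0, 60.0
for px in (10.0, 40.0, 90.0):  # left of, inside, right of [20, 60]
    ok = [p for p in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
          if feasible(px, rmin, rmax, *p)]
    print(px, "->", ok)  # exactly one feasible pattern per point
```

So the constraint \(\sum_k \color{darkred}\delta_{i,x,k}=1\) is consistent: the inequalities leave exactly one case open for each point.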

Of course, we need to repeat this for the \(y\) direction. Once we have all variables \(\color{darkred}\delta_{i,x,k}\) and \(\color{darkred}\delta_{i,y,k}\), we still need to combine them. A point \(i\) is inside the rectangle \(\color{darkred}r\) if and only if \(\color{darkred}\delta_{i,x,2}\cdot \color{darkred}\delta_{i,y,2} = 1\). 
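The product of two binaries in the objective can be handled with the standard AND-reformulation (a sketch of the technique referenced in [1]; the auxiliary variable \(\color{darkred}w_i\) is my own naming): \[\begin{align} & \color{darkred}w_i \le \color{darkred}\delta_{i,x,2}\\ & \color{darkred}w_i \le \color{darkred}\delta_{i,y,2}\\ & \color{darkred}w_i \ge \color{darkred}\delta_{i,x,2}+\color{darkred}\delta_{i,y,2}-1 \\ & \color{darkred}w_i \in [0,1]\end{align}\] with \(\color{darkred}w_i\) replacing \(\color{darkred}\delta_{i,x,2}\cdot\color{darkred}\delta_{i,y,2}\) in the objective. Note that both the \(\le\) and \(\ge\) constraints are needed here, since the point values can be negative.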

With this, our model can look like: 


MIQP Model
\[\begin{align}\max & \sum_i \color{darkblue}p_{i,{\mathit{value}}} \cdot  \color{darkred}\delta_{i,x,2} \cdot  \color{darkred}\delta_{i,y,2}\\ & \color{darkred}r_{c,min} \ge \color{darkblue} p_{i,c} \cdot \color{darkred}\delta_{i,c,1} + \color{darkblue}L \cdot (1-\color{darkred}\delta_{i,c,1}) && \forall i,c\\ & \color{darkred}r_{c,{\mathit{min}}} \le \color{darkblue} p_{i,c} \cdot \color{darkred}\delta_{i,c,2} +  \color{darkblue}U \cdot (1-\color{darkred}\delta_{i,c,2}) && \forall i,c\\& \color{darkred}r_{c,{\mathit{max}}} \ge \color{darkblue} p_{i,c} \cdot \color{darkred}\delta_{i,c,2} + \color{darkblue}L \cdot (1-\color{darkred}\delta_{i,c,2}) && \forall i,c \\ & \color{darkred}r_{c,{\mathit{max}}} \le \color{darkblue} p_{i,c} \cdot \color{darkred}\delta_{i,c,3} + \color{darkblue}U \cdot (1-\color{darkred}\delta_{i,c,3}) &&\forall i,c \\ & \sum_k  \color{darkred}\delta_{i,c,k}=1 && \forall i,c \\ & \color{darkred}r_{c,min} \le \color{darkred}r_{c,max} && \forall c  \\ &  \color{darkred}\delta_{i,c,k} \in \{0,1\} \\ &\color{darkred}r_{c,q} \in [\color{darkblue}L,\color{darkblue}U] \end{align}\]


This model can easily be linearized using the techniques demonstrated in [1]. Here we will assume the solver is linearizing this automatically. This model solves very quickly, and the results are:

----     62 VARIABLE z.L                   =      102.234  objective

----     62 VARIABLE r.L  rectangle

          min      max

x       7.277   93.345
y      15.525   97.533



The solver (Cplex in this case) linearized the problem for us and solved it as a linear MIP. It found and proved the globally optimal solution in 0 nodes (i.e., all work was done during preprocessing) and 1,249 iterations. The solution time was 2 seconds.

Genetic algorithm


Here we try to solve the problem using R's GA package. The fitness function is basically the same as our objective in the high-level model. The code (and output) can look like:


> library(GA)
>
> # data is stored in a data frame
> str(df)
'data.frame':   100 obs. of  4 variables:
 $ point: chr  "i1" "i2" "i3" "i4" ...
 $ x    : num  17.2 84.3 55 30.1 29.2 ...
 $ y    : num  5.141 0.601 40.123 51.988 62.888 ...
 $ value: num  8.66 -3.02 -9.83 8.98 1.44 ...
>
> # fitness function
> f <- function(x) {
+   xmin <- min(x[1], x[2])
+   xmax <- max(x[1], x[2])
+   ymin <- min(x[3], x[4])
+   ymax <- max(x[3], x[4])
+   ok <- (df$x <= xmax) & (df$x >= xmin) & (df$y <= ymax) & (df$y >= ymin)
+   sum(ok * df$value)
+ }
>
> # call the ga solver
> system.time(result <- ga(type = "real-valued",
+                          fitness = f,
+                          lower = rep(0, 4), upper = rep(100, 4),
+                          popSize = 100, maxiter = 500, monitor = T,
+                          seed = 12345))
GA | iter = 1 | Mean = 8.058819 | Best = 48.175182
GA | iter = 2 | Mean = 6.413503 | Best = 48.175182
GA | iter = 3 | Mean = 8.234065 | Best = 48.175182
GA | iter = 4 | Mean = 5.209888 | Best = 48.175182
GA | iter = 5 | Mean = 7.171262 | Best = 48.175182
GA | iter = 6 | Mean = 9.767468 | Best = 48.175182
GA | iter = 7 | Mean = 11.41823 | Best = 48.17518
GA | iter = 8 | Mean = 9.89723 | Best = 48.20186
GA | iter = 9 | Mean = 14.19131 | Best = 53.02362
GA | iter = 10 | Mean = 20.83307 | Best = 53.61161
. . .
GA | iter = 490 | Mean = 87.13611 | Best = 102.23405
GA | iter = 491 | Mean = 83.61628 | Best = 102.23405
GA | iter = 492 | Mean = 87.16728 | Best = 102.23405
GA | iter = 493 | Mean = 86.52578 | Best = 102.23405
GA | iter = 494 | Mean = 83.02602 | Best = 102.23405
GA | iter = 495 | Mean = 86.58533 | Best = 102.23405
GA | iter = 496 | Mean = 89.53308 | Best = 102.23405
GA | iter = 497 | Mean = 89.4938 | Best = 102.2341
GA | iter = 498 | Mean = 88.29249 | Best = 102.23405
GA | iter = 499 | Mean = 88.39676 | Best = 102.23405
GA | iter = 500 | Mean = 85.33931 | Best = 102.23405
user system elapsed
6.78 0.27 7.32


Note that I did not specify the constraints \(\color{darkred}{\mathit{xmin}} \le \color{darkred}{\mathit{xmax}}\) and \(\color{darkred}{\mathit{ymin}} \le \color{darkred}{\mathit{ymax}}\). Instead, when the fitness function receives the vector \(x\), it simply orders each coordinate pair by taking the min and max. This is somewhat of a trick, but it allows us to work with an unconstrained optimization problem, which is usually a big win for heuristic solvers like this.
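This repair step can be sketched on its own (a hypothetical Python version of the min/max decoding done inside the R fitness function):

```python
def decode(x):
    """Map a raw 4-vector to an ordered rectangle (xmin, xmax, ymin, ymax)."""
    xmin, xmax = sorted(x[:2])  # any order of x[1], x[2] yields a valid x-range
    ymin, ymax = sorted(x[2:])
    return xmin, xmax, ymin, ymax

print(decode([91.8, 8.1, 95.5, 15.9]))  # -> (8.1, 91.8, 15.9, 95.5)
```

Every point in \([0,100]^4\) now decodes to a feasible rectangle, so the heuristic never has to reject or penalize candidates.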

Because we know the optimal solution, we can indeed verify that this heuristic also finds the optimal solution.
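For small instances there is a third check available: only the set of enclosed points matters, so it suffices to try rectangle edges at the point coordinates themselves. A brute-force sketch (my own code, far too slow for large \(n\), but a handy debugging aid on tiny data sets):

```python
def brute_force(points):
    """Exhaustively try all rectangles with edges at point coordinates."""
    xs = sorted({x for x, _, _ in points})
    ys = sorted({y for _, y, _ in points})
    best = 0.0  # the empty rectangle has sum 0
    for xmin in xs:
        for xmax in (x for x in xs if x >= xmin):
            for ymin in ys:
                for ymax in (y for y in ys if y >= ymin):
                    s = sum(v for x, y, v in points
                            if xmin <= x <= xmax and ymin <= y <= ymax)
                    best = max(best, s)
    return best

pts = [(1.0, 1.0, 5.0), (2.0, 3.0, -2.0), (4.0, 4.0, 3.0)]
print(brute_force(pts))  # best rectangle covers all three points: 5 - 2 + 3 = 6.0
```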

A nice feature is to plot the results:

[plot: the points and the best rectangle found by the GA]
It finds 5 solutions with the best objective:


> result@solution
           x1       x2       x3       x4
[1,] 8.058577 91.76226 15.88715 95.49885
[2,] 8.072476 91.76382 15.87492 95.50330
[3,] 8.086532 91.76143 15.91187 95.52999
[4,] 8.071193 91.76518 15.88455 95.50384
[5,] 7.978493 92.05050 15.92354 95.49153


These are essentially the same.

Larger data set


When using a larger random data set with \(n=1,000\) points, we end up with a large MIQP model. It has 10k rows and 6k columns. After Cplex reformulates this into a linear model, we have 11.5k rows and 7k columns. This solves to optimality in 2,000 seconds with a proven optimal objective of 209.426.


----     63 VARIABLE z.L                   =      209.426  objective

----     63 VARIABLE r.L  rectangle

          min      max

x       7.045   97.298
y       2.812   58.167


When I feed this into the GA solver, with a limit of 5,000 iterations, we see:


-- Genetic Algorithm -------------------

GA settings:
Type                  = real-valued
Population size       = 10
Number of generations = 5000
Elitism               = 1
Crossover probability = 0.8
Mutation probability  = 0.1
Search domain =
       x1  x2  x3  x4
lower   0   0   0   0
upper 100 100 100 100

GA results:
Iterations = 5000
Fitness function value = 205.203
Solutions =
           x1       x2      x3       x4
[1,] 80.91125 7.089162 58.0342 11.90685
[2,] 80.91125 7.089162 58.0342 11.90685
[3,] 80.91125 7.089162 58.0342 11.90685
[4,] 80.91125 7.089162 58.0342 11.90685


We see that the \(x\) part of the rectangle is close, but \(ymin=11.9\) is a bit off compared to the optimal MIQP solution. Indeed, the objective of 205.203 is a little worse than the 209.426 we found earlier. On the other hand, the GA needed only about 10 seconds to find this solution.

Conclusions


  • Formulating this problem as a non-convex MIQP model requires some thought (and some work).
  • But it can provide proven optimal solutions (or good solutions if optimality is too costly).
  • Using a Genetic Algorithm based heuristic makes the modeling quite easy. One small trick (ordering the coordinate pairs inside the fitness function) gives us an unconstrained problem. With that in place, the GA solver finds the optimal solution fairly quickly. Of course, it does not prove optimality; we only know the solution is optimal because we also solved the mathematical programming model.
  • When developing heuristics for a large and difficult problem, I also like to implement a mathematical programming model. This allows us to compare solutions. First, this is a good debugging aid. It can also give us some feedback on the quality of the solutions found with the heuristic, even if only for smaller data sets.

References

[1] Max sum submatrix problem, earlier post on this blog.