
Maximum Correlation, Global MINLP vs GA

In [1], the following question is posed:


Suppose we have data like:

worker  weight  height  weight2  height2
1       120     60      125      60
2       152     55      156      66
3       222     55      100      20

For each worker we can pick either (weight, height) or (weight2, height2). The problem is to choose one of these observation pairs for each row such that the correlation between weight and height is maximized.


Data 


First, let's invent some data that we can use for some experiments. 
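The post does not show how the data was generated. A hypothetical GAMS fragment that produces data of this flavor could look as follows; the distributions and coefficients are my own guesses, not the author's actual scheme:

* hypothetical data generation (my guess at the scheme, not the original code)
set i 'cases' /i1*i25/;
parameters height1(i), weight1(i), height2(i), weight2(i);

* heights in inches, weights in pounds; weight depends linearly on height plus noise
height1(i) = normal(68, 4);
weight1(i) = 60 + 1.4*height1(i) + normal(0, 4);
* the second observation is a perturbed version of the first
height2(i) = height1(i) + normal(0, 2);
weight2(i) = 60 + 1.4*height2(i) + normal(0, 4);

display height1, weight1, height2, weight2;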

----     27 PARAMETER data  

        height1     weight1     height2     weight2

i1    67.433285  168.262871   67.445523  163.692389
i2    70.638374  174.437750   68.649190  160.084811
i3    71.317794  159.909672   69.503911  164.720010
i4    59.850261  145.704159   61.175728  142.708300
i5    65.341938  155.586984   68.483909  165.564991
i6    64.142009  154.335001   68.568683  166.169507
i7    67.030368  158.768813   65.780803  153.721717
i8    73.672863  175.126951   73.236515  164.704340
i9    65.203516  157.593587   63.279277  149.784500
i10   69.001848  160.063428   68.786656  162.278007
i11   64.455422  159.039195   63.930208  152.827710
i12   70.719334  164.885704   69.666096  157.356595
i13   65.688428  151.223468   63.614565  150.071072
i14   66.569252  160.978671   70.533320  160.722483
i15   78.417676  172.298652   80.070076  172.695207
i16   65.396154  158.234709   67.404942  158.310596
i17   62.504967  150.899428   61.000439  154.094647
i18   62.122630  150.024298   63.634554  153.644324
i19   70.598400  165.086523   72.999194  166.771223
i20   74.935107  170.820610   76.622182  169.013550
i21   63.233956  154.331546   60.372876  149.152520
i22   72.550105  173.961915   76.748649  167.462369
i23   74.086553  168.190867   75.433331  171.773607
i24   65.379648  163.577697   65.717553  160.134888
i25   64.003038  155.357607   67.301426  158.713710


The optimal solution, i.e., the selection with the highest correlation coefficient between height and weight, is:


----     92 PARAMETER result  optimal selected observations

        height1     weight1     height2     weight2

i1                             67.445523  163.692389
i2                             68.649190  160.084811
i3                             69.503911  164.720010
i4    59.850261  145.704159
i5    65.341938  155.586984
i6    64.142009  154.335001
i7    67.030368  158.768813
i8                             73.236515  164.704340
i9                             63.279277  149.784500
i10   69.001848  160.063428
i11                            63.930208  152.827710
i12   70.719334  164.885704
i13                            63.614565  150.071072
i14                            70.533320  160.722483
i15   78.417676  172.298652
i16                            67.404942  158.310596
i17   62.504967  150.899428
i18   62.122630  150.024298
i19                            72.999194  166.771223
i20   74.935107  170.820610
i21                            60.372876  149.152520
i22                            76.748649  167.462369
i23                            75.433331  171.773607
i24                            65.717553  160.134888
i25                            67.301426  158.713710

Below we shall see how we arrive at this conclusion.

MINLP Model


A high-level model is simply:


MINLP Model
\[ \begin{align}\max\> &\mathbf{cor}(\color{darkred}h,\color{darkred}w) \\ & \color{darkred}h_i = \color{darkblue}{\mathit{height1}}_i\cdot(1-\color{darkred}x_i)+ \color{darkblue}{\mathit{height2}}_i\cdot\color{darkred}x_i\\ & \color{darkred}w_i = \color{darkblue}{\mathit{weight1}}_i\cdot(1-\color{darkred}x_i)+ \color{darkblue}{\mathit{weight2}}_i\cdot\color{darkred}x_i \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]


Here \(\mathbf{cor}(h,w)\) denotes the (Pearson) correlation between the vectors \(h\) and \(w\). Note that height and weight are positively correlated, so maximizing makes sense. Of course, GAMS does not know about correlations, so we implement this model as:

 
Implementation of MINLP Model
\[ \begin{align}\max\> & \color{darkred}z = \frac{\displaystyle\sum_i (\color{darkred}h_i-\bar{\color{darkred}h})(\color{darkred}w_i-\bar{\color{darkred}w})}{\sqrt{\displaystyle\sum_i(\color{darkred}h_i-\bar{\color{darkred}h})^2}\cdot \sqrt{\displaystyle\sum_i(\color{darkred}w_i-\bar{\color{darkred}w})^2}} \\ & \color{darkred}h_i = \color{darkblue}{\mathit{height1}}_i\cdot(1-\color{darkred}x_i)+ \color{darkblue}{\mathit{height2}}_i\cdot\color{darkred}x_i \\ & \color{darkred}w_i = \color{darkblue}{\mathit{weight1}}_i\cdot(1-\color{darkred}x_i)+ \color{darkblue}{\mathit{weight2}}_i\cdot\color{darkred}x_i \\ & \bar{\color{darkred}h} = \frac{1}{n}\displaystyle\sum_i \color{darkred}h_i \\ & \bar{\color{darkred}w} = \frac{1}{n}\displaystyle\sum_i \color{darkred}w_i \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]
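The post displays the math but not the GAMS source. A minimal sketch of this implementation could look as follows (identifier names and option settings are my own choices; set i and the data parameters are as declared in the data fragment above):

scalar n 'number of cases';
n = card(i);

binary variable x(i) 'select observation 1 or 2';
variables
   h(i)  'selected height'
   w(i)  'selected weight'
   hbar  'mean height'
   wbar  'mean weight'
   z     'correlation (objective)';

equations defh(i), defw(i), defhbar, defwbar, defz;

defh(i)..  h(i) =e= height1(i)*(1-x(i)) + height2(i)*x(i);
defw(i)..  w(i) =e= weight1(i)*(1-x(i)) + weight2(i)*x(i);
defhbar..  hbar =e= sum(i, h(i))/n;
defwbar..  wbar =e= sum(i, w(i))/n;
* note: this division is exactly what the reformulation below removes
defz..     z =e= sum(i, (h(i)-hbar)*(w(i)-wbar)) /
                 ( sqrt(sum(i, sqr(h(i)-hbar))) * sqrt(sum(i, sqr(w(i)-wbar))) );

model maxcor /all/;
option minlp = baron, optcr = 0;
solve maxcor maximizing z using minlp;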



We need to be careful when using divisions. I typically like to reformulate divisions as multiplications: \[\begin{align} \max \>&  \color{darkred}z\\ &\color{darkred}z\cdot \sqrt{\displaystyle\sum_i(\color{darkred}h_i-\bar{\color{darkred}h})^2}\cdot \sqrt{\displaystyle\sum_i(\color{darkred}w_i-\bar{\color{darkred}w})^2} = \displaystyle\sum_i (\color{darkred}h_i-\bar{\color{darkred}h})(\color{darkred}w_i-\bar{\color{darkred}w})\end{align}\]

The advantage of this formulation is that it protects against division by zero. A disadvantage is that the non-linearities are moved into the constraints. Many NLP solvers are happier when the constraints are linear and only the objective is non-linear. My advice: experiment with formulations.
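In GAMS terms, the multiplied-out version of the objective-defining equation could be written as (a sketch, with the same identifiers as in the fragment above):

* division-free variant of defz
defz..  z * sqrt(sum(i, sqr(h(i)-hbar))) * sqrt(sum(i, sqr(w(i)-wbar)))
           =e= sum(i, (h(i)-hbar)*(w(i)-wbar));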


When we solve this model with GAMS/Baron, we see:


----     79 VARIABLE x.L  select 1 or 2

i1 1.000000, i2 1.000000, i3 1.000000, i8 1.000000, i9 1.000000, i11 1.000000, i13 1.000000
i14 1.000000, i16 1.000000, i19 1.000000, i21 1.000000, i22 1.000000, i23 1.000000, i24 1.000000
i25 1.000000


----     79 VARIABLE z.L                   =     0.956452  objective

----     83 PARAMETER corr

all1 0.868691, all2 0.894532, optimal 0.956452


The parameter corr shows the correlations for three cases:
  • all \(x_i=0\), i.e., we compare height1 vs weight1,
  • all \(x_i=1\), i.e., we compare height2 vs weight2,
  • an optimal solution for \(x\).
At least for this data set, cherry-picking the data can improve the correlation coefficient significantly.
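For reference, the corr parameter can be assembled as follows (a sketch; the parameter and scalar names are mine, and n, z are from the fragments above):

parameter corr(*) 'correlations for different selections';
scalars hb, wb;

* all x(i)=0: height1 vs weight1
hb = sum(i, height1(i))/n;
wb = sum(i, weight1(i))/n;
corr('all1') = sum(i, (height1(i)-hb)*(weight1(i)-wb)) /
               ( sqrt(sum(i, sqr(height1(i)-hb))) * sqrt(sum(i, sqr(weight1(i)-wb))) );

* all x(i)=1: height2 vs weight2
hb = sum(i, height2(i))/n;
wb = sum(i, weight2(i))/n;
corr('all2') = sum(i, (height2(i)-hb)*(weight2(i)-wb)) /
               ( sqrt(sum(i, sqr(height2(i)-hb))) * sqrt(sum(i, sqr(weight2(i)-wb))) );

* optimal selection: objective value from the MINLP solve
corr('optimal') = z.l;
display corr;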

Nonconvex MIQCP

 
Gurobi can solve non-convex quadratic models quite efficiently. To try this out, I reformulated the MINLP model into a MIQCP (Mixed-Integer Quadratically Constrained Programming) model. We need to add some variables and constraints to make this happen; see the sketch below.
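The post does not show the reformulated model. One way to make all non-linearities quadratic is to introduce auxiliary variables for the two norms and their product; this is my reconstruction (the log below reports 7 quadratic constraints, so the actual model differed in some details):

* auxiliary variables: sh, sw are the two norms, u their product
variables sh, sw, u;
sh.lo = 0.01;  sw.lo = 0.01;

equations defsh, defsw, defu, defzq;

defsh..  sqr(sh) =e= sum(i, sqr(h(i)-hbar));
defsw..  sqr(sw) =e= sum(i, sqr(w(i)-wbar));
defu..   u   =e= sh*sw;
* bilinear version of the correlation definition
defzq..  z*u =e= sum(i, (h(i)-hbar)*(w(i)-wbar));

All constraints are now linear or quadratic/bilinear, so Gurobi's non-convex MIQCP algorithm applies. The final model is still quite small, but I encountered severe problems: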

Gurobi Optimizer version 9.0.1 build v9.0.1rc0 (win64)
Optimize a model with 52 rows, 84 columns and 152 nonzeros
Model fingerprint: 0xab3fb310
Model has 7 quadratic constraints
Variable types: 59 continuous, 25 integer (25 binary)
Coefficient statistics:
Matrix range [1e-02, 1e+01]
QMatrix range [1e+00, 2e+01]
QLMatrix range [1e+00, 1e+00]
Objective range [1e+00, 1e+00]
Bounds range [1e+00, 2e+02]
RHS range [6e+01, 2e+02]
Presolve removed 50 rows and 50 columns
Presolve time: 0.00s
Presolved: 179 rows, 88 columns, 637 nonzeros
Presolved model has 7 bilinear constraint(s)
Variable types: 63 continuous, 25 integer (25 binary)

Root relaxation: unbounded, 0 iterations, 0.00 seconds

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     2  postponed    0                 -        -      -      -    0s
 71283 64203  postponed  891                 -        -      -    0.0    5s
134459 127181 postponed 1772                 -        -      -    0.0   10s
214323 207261 postponed 1772                 -        -      -    0.0   15s
292995 285775 postponed 2630                 -        -      -    0.0   20s
372859 365589 postponed 3534                 -        -      -    0.0   25s
452723 445507 postponed 4414                 -        -      -    0.0   30s
531395 524129 postponed 4414                 -        -      -    0.0   35s
611259 604451 postponed 5145                 -        -      -    0.0   40s
691123 683851 postponed 5296                 -        -      -    0.0   45s
770987 763725 postponed 6168                 -        -      -    0.0   50s
849659 842391 postponed 6177                 -        -      -    0.0   55s
929523 922367 postponed 7058                 -        -      -    0.0   60s
...
42595883 42589647 postponed 9111             -        -      -    0.0 2695s
42674555 42668119 postponed 9402             -        -      -    0.0 2700s
42754419 42747901 postponed 8827             -        -      -    0.0 2705s
42833091 42826421 postponed 9649             -        -      -    0.0 2710s
42910571 42903743 postponed 9221             -        -      -    0.0 2715s
 

This was really bad: not even a feasible solution after 2,700 seconds. Compare this to Baron, which could solve the MINLP model in about 300 seconds (on a much slower machine).

Genetic Algorithm


It is interesting to see how a simple meta-heuristic would do on this problem. Here are the results using the ga function from the GA package [2]. This was actually quite easy to implement.


> df <- read.table(text="
+ id height1 weight1 height2 weight2
+ i1 67.433285 168.262871 67.445523 163.692389
+ i2 70.638374 174.437750 68.649190 160.084811
+ i3 71.317794 159.909672 69.503911 164.720010
+ i4 59.850261 145.704159 61.175728 142.708300
+ i5 65.341938 155.586984 68.483909 165.564991
+ i6 64.142009 154.335001 68.568683 166.169507
+ i7 67.030368 158.768813 65.780803 153.721717
+ i8 73.672863 175.126951 73.236515 164.704340
+ i9 65.203516 157.593587 63.279277 149.784500
+ i10 69.001848 160.063428 68.786656 162.278007
+ i11 64.455422 159.039195 63.930208 152.827710
+ i12 70.719334 164.885704 69.666096 157.356595
+ i13 65.688428 151.223468 63.614565 150.071072
+ i14 66.569252 160.978671 70.533320 160.722483
+ i15 78.417676 172.298652 80.070076 172.695207
+ i16 65.396154 158.234709 67.404942 158.310596
+ i17 62.504967 150.899428 61.000439 154.094647
+ i18 62.122630 150.024298 63.634554 153.644324
+ i19 70.598400 165.086523 72.999194 166.771223
+ i20 74.935107 170.820610 76.622182 169.013550
+ i21 63.233956 154.331546 60.372876 149.152520
+ i22 72.550105 173.961915 76.748649 167.462369
+ i23 74.086553 168.190867 75.433331 171.773607
+ i24 65.379648 163.577697 65.717553 160.134888
+ i25 64.003038 155.357607 67.301426 158.713710
+ "
, header=T)
>
>#
># print obvious cases
>#
> cor(df$weight1,df$height1)
[1] 0.8686908
> cor(df$weight2,df$height2)
[1] 0.894532
>
>#
># fitness function
>#
>f<-function(x) {
+ w <- df$weight1*(1-x) + df$weight2*x
+ h <- df$height1*(1-x) + df$height2*x
+ cor(w,h)
+ }
>
> library(GA)
> res <- ga(type=c("binary"),fitness=f,nBits=25,seed=123)
GA |iter=1|Mean=0.8709318|Best=0.9237155
GA |iter=2|Mean=0.8742004|Best=0.9237155
GA |iter=3|Mean=0.8736450|Best=0.9237155
GA |iter=4|Mean=0.8742228|Best=0.9384788
GA |iter=5|Mean=0.8746517|Best=0.9384788
GA |iter=6|Mean=0.8792048|Best=0.9486227
GA |iter=7|Mean=0.8844841|Best=0.9486227
GA |iter=8|Mean=0.8816874|Best=0.9486227
GA |iter=9|Mean=0.8805522|Best=0.9486227
GA |iter=10|Mean=0.8820974|Best=0.9486227
GA |iter=11|Mean=0.8859074|Best=0.9486227
GA |iter=12|Mean=0.8956467|Best=0.9486227
GA |iter=13|Mean=0.8989140|Best=0.9486227
GA |iter=14|Mean=0.9069327|Best=0.9486227
GA |iter=15|Mean=0.9078787|Best=0.9486227
GA |iter=16|Mean=0.9069163|Best=0.9489443
GA |iter=17|Mean=0.9104712|Best=0.9489443
GA |iter=18|Mean=0.9169900|Best=0.9489443
GA |iter=19|Mean=0.9175285|Best=0.9489443
GA |iter=20|Mean=0.9207076|Best=0.9489443
GA |iter=21|Mean=0.9210288|Best=0.9489443
GA |iter=22|Mean=0.9206928|Best=0.9489443
GA |iter=23|Mean=0.9210399|Best=0.9489443
GA |iter=24|Mean=0.9208985|Best=0.9489443
GA |iter=25|Mean=0.9183778|Best=0.9511446
GA |iter=26|Mean=0.9217391|Best=0.9511446
GA |iter=27|Mean=0.9274271|Best=0.9522764
GA |iter=28|Mean=0.9271156|Best=0.9522764
GA |iter=29|Mean=0.9275347|Best=0.9522764
GA |iter=30|Mean=0.9278315|Best=0.9522764
GA |iter=31|Mean=0.9300289|Best=0.9522764
GA |iter=32|Mean=0.9306409|Best=0.9528777
GA |iter=33|Mean=0.9309087|Best=0.9528777
GA |iter=34|Mean=0.9327691|Best=0.9528777
GA |iter=35|Mean=0.9309344|Best=0.9549574
GA |iter=36|Mean=0.9341977|Best=0.9549574
GA |iter=37|Mean=0.9374437|Best=0.9559043
GA |iter=38|Mean=0.9394410|Best=0.9559043
GA |iter=39|Mean=0.9405482|Best=0.9559043
GA |iter=40|Mean=0.9432749|Best=0.9564515
GA |iter=41|Mean=0.9441814|Best=0.9564515
GA |iter=42|Mean=0.9458232|Best=0.9564515
GA |iter=43|Mean=0.9469625|Best=0.9564515
GA |iter=44|Mean=0.9462313|Best=0.9564515
GA |iter=45|Mean=0.9449716|Best=0.9564515
GA |iter=46|Mean=0.9444071|Best=0.9564515
GA |iter=47|Mean=0.9437149|Best=0.9564515
GA |iter=48|Mean=0.9446355|Best=0.9564515
GA |iter=49|Mean=0.9455424|Best=0.9564515
GA |iter=50|Mean=0.9456497|Best=0.9564515
GA |iter=51|Mean=0.9461382|Best=0.9564515
GA |iter=52|Mean=0.9444960|Best=0.9564515
GA |iter=53|Mean=0.9434671|Best=0.9564515
GA |iter=54|Mean=0.9451851|Best=0.9564515
GA |iter=55|Mean=0.9481903|Best=0.9564515
GA |iter=56|Mean=0.9477778|Best=0.9564515
GA |iter=57|Mean=0.9481829|Best=0.9564515
GA |iter=58|Mean=0.9490952|Best=0.9564515
GA |iter=59|Mean=0.9505670|Best=0.9564515
GA |iter=60|Mean=0.9499329|Best=0.9564515
GA |iter=61|Mean=0.9509299|Best=0.9564515
GA |iter=62|Mean=0.9505341|Best=0.9564515
GA |iter=63|Mean=0.9519624|Best=0.9564515
GA |iter=64|Mean=0.9518618|Best=0.9564515
GA |iter=65|Mean=0.9523598|Best=0.9564515
GA |iter=66|Mean=0.9516766|Best=0.9564515
GA |iter=67|Mean=0.9521926|Best=0.9564515
GA |iter=68|Mean=0.9524419|Best=0.9564515
GA |iter=69|Mean=0.9532865|Best=0.9564515
GA |iter=70|Mean=0.9535871|Best=0.9564515
GA |iter=71|Mean=0.9536049|Best=0.9564515
GA |iter=72|Mean=0.9534035|Best=0.9564515
GA |iter=73|Mean=0.9532859|Best=0.9564515
GA |iter=74|Mean=0.9521064|Best=0.9564515
GA |iter=75|Mean=0.9534997|Best=0.9564515
GA |iter=76|Mean=0.9539987|Best=0.9564515
GA |iter=77|Mean=0.9536670|Best=0.9564515
GA |iter=78|Mean=0.9526224|Best=0.9564515
GA |iter=79|Mean=0.9531871|Best=0.9564515
GA |iter=80|Mean=0.9527495|Best=0.9564515
GA |iter=81|Mean=0.9526061|Best=0.9564515
GA |iter=82|Mean=0.9525577|Best=0.9564515
GA |iter=83|Mean=0.9525084|Best=0.9564515
GA |iter=84|Mean=0.9519052|Best=0.9564515
GA |iter=85|Mean=0.9518549|Best=0.9564515
GA |iter=86|Mean=0.9511299|Best=0.9564515
GA |iter=87|Mean=0.9505129|Best=0.9564515
GA |iter=88|Mean=0.9518203|Best=0.9564515
GA |iter=89|Mean=0.9537234|Best=0.9564515
GA |iter=90|Mean=0.9531017|Best=0.9564515
GA |iter=91|Mean=0.9514525|Best=0.9564515
GA |iter=92|Mean=0.9505517|Best=0.9564515
GA |iter=93|Mean=0.9524752|Best=0.9564515
GA |iter=94|Mean=0.9533879|Best=0.9564515
GA |iter=95|Mean=0.9519166|Best=0.9564515
GA |iter=96|Mean=0.9524416|Best=0.9564515
GA |iter=97|Mean=0.9526676|Best=0.9564515
GA |iter=98|Mean=0.9523745|Best=0.9564515
GA |iter=99|Mean=0.9523710|Best=0.9564515
GA |iter=100|Mean=0.9519255|Best=0.9564515
> res@solution
     x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25
[1,]  1  1  1  0  0  0  0  1  1   0   1   0   1   1   0   1   0   0   1   0   1   1   1   1   1
> res@fitnessValue
[1] 0.9564515

Comparing with our proven optimal Baron solution, we see that the ga function found the optimal solution. Of course, we would not have known this had we not solved the model with a global solver like Baron.


References

[1] …
[2] Luca Scrucca, GA: A Package for Genetic Algorithms in R, Journal of Statistical Software, 53(4), 2013.
