Consider the following data:
---- 14 PARAMETER xdata
i1 17.175, i2 84.327, i3 55.038, i4 30.114, i5 29.221, i6 22.405, i7 34.983, i8 85.627
i9 6.711, i10 50.021, i11 99.812, i12 57.873, i13 99.113, i14 76.225, i15 13.069, i16 63.972
i17 15.952, i18 25.008, i19 66.893, i20 43.536, i21 35.970, i22 35.144, i23 13.149, i24 15.010
i25 58.911, i26 83.089, i27 23.082, i28 66.573, i29 77.586, i30 30.366, i31 11.049, i32 50.238
i33 16.017, i34 87.246, i35 26.511, i36 28.581, i37 59.396, i38 72.272, i39 62.825, i40 46.380
i41 41.331, i42 11.770, i43 31.421, i44 4.655, i45 33.855, i46 18.210, i47 64.573, i48 56.075
i49 76.996, i50 29.781
---- 14 PARAMETER ydata
i1 32.320, i2 238.299, i3 87.081, i4 17.825, i5 -10.154, i6 -35.813, i7 14.506
i8 204.903, i9 -103.014, i10 79.521, i11 -153.679, i12 -47.971, i13 277.483, i14 160.084
i15 28.687, i16 133.650, i17 -63.820, i18 4.028, i19 145.434, i20 48.326, i21 49.295
i22 35.478, i23 -58.602, i24 34.463, i25 -50.666, i26 -115.522, i27 -19.540, i28 134.829
i29 198.074, i30 -7.351, i31 40.786, i32 82.107, i33 27.661, i34 -82.669, i35 5.081
i36 -1.278, i37 118.345, i38 172.905, i39 -57.943, i40 56.981, i41 -45.248, i42 34.517
i43 11.780, i44 -98.506, i45 -1.064, i46 33.323, i47 139.050, i48 -35.595, i49 -102.580
i50 3.912
When we plot this data (scatter plot not reproduced here), we get a clear impression that we actually have two different regression problems. As we only have one data set, let's see if we can develop an optimization model that assigns each point \((\color{darkblue}x_i,\color{darkblue}y_i)\) to one of the regression lines, and finds the least-squares fit for both lines at the same time.
For this, we introduce a binary variable \[\color{darkred}\delta_i = \begin{cases} 1 & \text{if point $i$ belongs to regression line 1} \\ 0 & \text{if point $i$ belongs to regression line 2}\end{cases}\]
A first model to perform this combined task can look like this:
| Non-convex MIQCP model A |
|---|
| \[\begin{align}\min& \sum_i \left(\color{darkred}r_{1,i}^2 + \color{darkred}r_{2,i}^2 \right)\\ & \color{darkred}r_{1,i} = \left(\color{darkblue}y_i - \color{darkred}\alpha_0 - \color{darkred}\alpha_1 \cdot \color{darkblue}x_i \right)\cdot \color{darkred}\delta_i \\& \color{darkred}r_{2,i} = \left(\color{darkblue}y_i - \color{darkred}\beta_0 - \color{darkred}\beta_1 \cdot \color{darkblue}x_i \right)\cdot(1-\color{darkred}\delta_i) \\ & \color{darkred}\delta_i \in \{0,1\}\end{align} \] |
Here we set the residual to zero if the point belongs to the other regression line. For all models here, variables without explicit restrictions are assumed to be free (i.e. their bounds are minus infinity and plus infinity).
A somewhat more compact formulation is:
| Non-convex MIQCP model B |
|---|
| \[\begin{align}\min& \sum_i \color{darkred}r_i^2 \\ & \color{darkred}r_i = \color{darkblue}y_i - \left[\color{darkred}\alpha_0\cdot \color{darkred}\delta_i + \color{darkred}\beta_0\cdot(1-\color{darkred}\delta_i)\right] - \left[\color{darkred}\alpha_1\cdot \color{darkred}\delta_i+ \color{darkred}\beta_1\cdot(1-\color{darkred}\delta_i)\right] \cdot \color{darkblue}x_i \\ & \color{darkred}\delta_i \in \{0,1\}\end{align} \] |
Although this model has fewer variables and constraints, it may be more difficult to solve.
Instead of using the \(\color{darkred}\delta\) variables in the constraints, we can also move them to the objective:
| Non-convex MINLP model C |
|---|
| \[\begin{align}\min& \sum_i \left( \color{darkred}r_{1,i}^2\cdot\color{darkred}\delta_i +\color{darkred}r_{2,i}^2\cdot (1-\color{darkred}\delta_i) \right)\\ & \color{darkred}r_{1,i} = \color{darkblue}y_i - \color{darkred}\alpha_0 - \color{darkred}\alpha_1\cdot \color{darkblue}x_i\\ & \color{darkred}r_{2,i} = \color{darkblue}y_i - \color{darkred}\beta_0 - \color{darkred}\beta_1\cdot \color{darkblue}x_i \\ & \color{darkred}\delta_i \in \{0,1\}\end{align} \] |
This MINLP model is no longer quadratic: the objective contains products of a binary variable and a squared residual. We can restore convexity by keeping the constraints linear. If indicator constraints are available (this depends on the solver), we can write:
| Convex MIQP model D |
|---|
| \[\begin{align}\min& \sum_i \left( \color{darkred}r_{1,i}^2 +\color{darkred}r_{2,i}^2 \right)\\ & \color{darkred}\delta_i=1 \Rightarrow\color{darkred}r_{1,i} = \color{darkblue}y_i - \color{darkred}\alpha_0 - \color{darkred}\alpha_1\cdot \color{darkblue}x_i\\ &\color{darkred}\delta_i=0 \Rightarrow\color{darkred}r_{2,i} = \color{darkblue}y_i - \color{darkred}\beta_0 - \color{darkred}\beta_1\cdot \color{darkblue}x_i \\ & \color{darkred}\delta_i \in \{0,1\}\end{align} \] |
We can make this model a little bit less nonlinear, using a single residual variable per point, as follows:
| Convex MIQP model E |
|---|
| \[\begin{align}\min& \sum_i \color{darkred}r_i^2 \\ & \color{darkred}\delta_i=1 \Rightarrow\color{darkred}r_i = \color{darkblue}y_i - \color{darkred}\alpha_0 - \color{darkred}\alpha_1\cdot \color{darkblue}x_i\\ &\color{darkred}\delta_i=0 \Rightarrow\color{darkred}r_i = \color{darkblue}y_i - \color{darkred}\beta_0 - \color{darkred}\beta_1\cdot \color{darkblue}x_i \\ & \color{darkred}\delta_i \in \{0,1\}\end{align} \] |
Let's do a big-M formulation:
| Convex MIQP model F |
|---|
| \[\begin{align}\min& \sum_i \color{darkred}r_i^2 \\ &\color{darkred}r_i = \color{darkblue}y_i - \color{darkred}\alpha_0 - \color{darkred}\alpha_1\cdot \color{darkblue}x_i +\color{darkred}s_{1,i}\\ &\color{darkred}r_i = \color{darkblue}y_i - \color{darkred}\beta_0 - \color{darkred}\beta_1\cdot \color{darkblue}x_i +\color{darkred}s_{2,i} \\ & -\color{darkblue}M\cdot \color{darkred}\delta_i \le \color{darkred}s_{1,i} \le \color{darkblue}M\cdot \color{darkred}\delta_i\\ & -\color{darkblue}M\cdot (1-\color{darkred}\delta_i )\le \color{darkred}s_{2,i} \le \color{darkblue}M\cdot (1-\color{darkred}\delta_i)\\ & \color{darkred}\delta_i \in \{0,1\}\end{align} \] |
and finally a SOS1 formulation:
| Convex MIQP model G |
|---|
| \[\begin{align}\min& \sum_i \color{darkred}r_i^2 \\ &\color{darkred}r_i = \color{darkblue}y_i - \color{darkred}\alpha_0 - \color{darkred}\alpha_1\cdot \color{darkblue}x_i +\color{darkred}s_{1,i}\\ &\color{darkred}r_i = \color{darkblue}y_i - \color{darkred}\beta_0 - \color{darkred}\beta_1\cdot \color{darkblue}x_i +\color{darkred}s_{2,i} \\ & (\color{darkred}s_{1,i}, \color{darkred}s_{2,i}) \in \textbf{SOS1}\end{align} \] |
There are a few formulations I left out, but the point should be clear: even a small problem like this allows for a large number of different formulations.
I have not done extensive computational testing, but the big-M formulation seems to be working quite well. Of course, this assumes we have some reasonable bounds (\(-1000\) and \(+1000\) in my example). Some of the other formulations may be more appropriate if you are not sure about bounds.
Results
After solving any of these models, we have the following result:
This problem is known as the multiple-line fitting problem, and there are other approaches to it [2,3,4]. Some of these methods are heuristic in nature and may require tuning some parameters. This may be a statistical problem where MIP/MIQP tools have something to offer: with a proper formulation we can solve a problem like this within a few seconds to proven optimality, and without any tuning parameters.
References
1. Fitting two lines to a set of 2d points, https://stackoverflow.com/questions/66349148/fitting-two-lines-to-a-set-of-2d-points
2. Yan Guo, A Multiple-Line Fitting Algorithm Without Initialization, https://cse.sc.edu/~songwang/CourseProj/proj2005/report/guo.pdf
3. Lara-Alvarez, C., Romero, L. & Gomez, C., Multiple straight-line fitting using a Bayes factor, Adv Data Anal Classif 11, 205–218 (2017)
4. Random sample consensus, https://en.wikipedia.org/wiki/Random_sample_consensus

