Finding the central point in a point cloud

Problem

The problem is easy to describe: given the coordinates of points \(p_i\), find the central point \(x\) that minimizes the sum of the distances from \(x\) to \(p_i\) [1].

Data

We just use some random data on the unit square \([0,1]\times[0,1]\):

----11 PARAMETER p  points (randomly generated)

                  x           y

point1        0.1720.843
point2        0.5500.301
point3        0.2920.224
point4        0.3500.856
point5        0.0670.500
point6        0.9980.579
point7        0.9910.762
point8        0.1310.640
point9        0.1600.250
point10       0.6690.435
point11       0.3600.351
point12       0.1310.150
point13       0.5890.831
point14       0.2310.666
point15       0.7760.304
point16       0.1100.502
point17       0.1600.872
point18       0.2650.286
point19       0.5940.723
point20       0.6280.464
point21       0.4130.118
point22       0.3140.047
point23       0.3390.182
point24       0.6460.561
point25       0.7700.298
point26       0.6610.756
point27       0.6270.284
point28       0.0860.103
point29       0.6410.545
point30       0.0320.792
point31       0.0730.176
point32       0.5260.750
point33       0.1780.034
point34       0.5850.621
point35       0.3890.359
point36       0.2430.246
point37       0.1310.933
point38       0.3800.783
point39       0.3000.125
point40       0.7490.069
point41       0.2020.005
point42       0.2700.500
point43       0.1510.174
point44       0.3310.317
point45       0.3220.964
point46       0.9940.370
point47       0.3730.772
point48       0.3970.913
point49       0.1200.735
point50       0.0550.576
point51       0.0510.006
point52       0.4010.520
point53       0.6290.226
point54       0.3960.276
point55       0.1520.936
point56       0.4230.135
point57       0.3860.375
point58       0.2680.948
point59       0.1890.298
mean          0.3800.464

The last entry labeled mean is just the average of the \(x\)- and \(y\)-coordinates.

NLP Model

A straightforward non-linear programming model can look like:

Unconstrained NLP Model
\[\min \sum_i \sqrt{\sum_c (\color{darkred}x_c-\color{darkblue}p_{i,c})^2 }\]

We use \(c = \{x,y\}\), i.e. we have \(x\) and \(y\)-coordinates. Note that we use \(x\) in two different contexts: element of set \(c\), being the \(x\)-coordinate, and the decision variable \(x_c\).

We can use the mean as a very good starting point to help the NLP solver. I.e. \[x_c := \frac{\displaystyle\sum_i^n p_{i,c}}{n}\]

The picture below shows why the mean is such a good starting point:

Optimal center point is close to mean point

The numeric values are here:

----45 PARAMETER results  x(center) vs mean

               x           y     sumdist

mean  0.390427090.4729904218.02414861
x     0.370482980.4385757217.96891985

The sumdist column shows the objective values for these two points.

This is an easy NLP problem. Most NLP solvers just need a few iterations. With a system like GAMS or AMPL we get exact gradients automatically. That is much preferable to finite differences which seems the prevalent method people use in an R or Python environment.

Cone programming I

The above problem can also be written as a cone programming problem. This will allow us to use a different class of solvers to work on this problem. Here we use CVXPY [2] to express the model. The Python code can look like:

importcvxpyascp

x = cp.Variable(2)   # center point

obj = cp.Minimize( cp.sum( [ cp.norm(x-p[i,:]) for i inrange(N) ] ) )
prob = cp.Problem(obj)

objval = prob.solve(solver=cp.SCS, verbose=True)

This is very much a straight translation of our unconstrained NLP model. Although we only declared two \(x\) variables, behind the scenes the model is blown up to a rather large one. We can see from the log:

----------------------------------------------------------------------------
    SCS v2.1.1- Splitting Conic Solver
    (c) Brendan O'Donoghue, Stanford University, 2012
----------------------------------------------------------------------------
Lin-sys: sparse-direct, nnz in A = 150
eps = 1.00e-04, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00
acceleration_lookback = 0, rho_x = 1.00e-03
Variables n = 52, constraints m = 150
Cones:  soc vars: 150, soc blks: 50

The generated SOCP (second order cone programming) model is larger, but also very sparse. The solver has no problem solving it very quickly. With SOCP solvers we usually don't worry about an initial point like we used in the NLP model.

Cone programming II

If we don't use a modeling tool that can do these transformations automatically, we can use a DIY approach. Second-order cone constraints can be stated as: \[||A_i x+b_i||_2 \le c_i^Tx + d_i \>\>\forall i\] This would imply that we can write our model as:

SOCP Model Attempt
\[\begin{align} \min & \sum_i \color{darkred} d_i \\ & \color{darkred} d_i^2 \ge \sum_c (\color{darkred}x_c-\color{darkblue}p_{i,c})^2 && \forall i \\ &\color{darkred} d_i \ge 0, \color{darkred} x_c \text{ free}\end{align} \]

Unfortunately this will yield the message: CPLEX Error 5002: 'e(point1)' is not convex. We can repair this as follows:

Repaired SOCP Model
\[\begin{align} \min & \sum_i \color{darkred} d_i \\ & \color{darkred} d_i^2 \ge \sum_c \color{darkred}y_{i,c}^2 && \forall i \\ & \color{darkred} y_{i,c} = \color{darkred} x_c -\color{darkblue}p_{i,c} && \forall i,c \\ &\color{darkred} d_i \ge 0, \color{darkred} x_c \text{ free}, \color{darkred} y_{i,c} \text{ free}\end{align} \]

This now solves quickly. We can now understand that CVXPY did quite a few steps before passing the model to the solver. As argued in [3], it is much better if the modeling system takes care of these reformulations. Some of them are not immediate obvious, and hand-crafted reformulations can be error-prone.

Conclusion

The min sum distance problem has a simple NLP formulation which can be improved by using a good initial point. It can also be formulated as a SOCP problem. Using high-level modeling tools this is not difficult. Without automatic reformulations things become a bit less obvious.

References

The point that minimizes the sum of euclidean distances to a set of n points, https://stackoverflow.com/questions/57277247/the-point-that-minimizes-the-sum-of-euclidean-distances-to-a-set-of-n-points
https://www.cvxpy.org/
Victor Zverovich, Robert Fourer, Automatic Reformulation of Second-Order Cone Programming Problems, https://ampl.com/MEETINGS/TALKS/2015_01_Richmond_2E.2.pdf

Finding the central point in a point cloud

Problem

Data

NLP Model

Cone programming I

Cone programming II

Conclusion

References

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112