Channel: Yet Another Math Programming Consultant

A facility location problem

The following problem is from [1]:

I am trying to solve an uncapacitated facility location problem in which I would like
to find the location to open at most \(k\) facilities, and assign \(n\) clients to those
facilities, each with cost \(c_i\) (proportionate to the distance of the client from
the facility it's assigned to) for assignment with a budget \(B\). So I would like to
find the locations of the facilities, and assign clients to those facilities
such that the maximum number of clients are serviced within the budget \(B\).
I haven't found a variant like this in which there is an overall budget;
I've only seen versions where we want to minimize the overall cost while
assigning all the clients to facilities.

Let's give this a shot.

Example Data


A good way to familiarize yourself with a problem is to invent some example data. Here is the random test data I used for the models below:


----     35 SET i  clients

client1 , client2 , client3 , client4 , client5 , client6 , client7 , client8 , client9
client10, client11, client12, client13, client14, client15, client16, client17, client18
client19, client20, client21, client22, client23, client24, client25


---- 35 SET j facilities

facility1, facility2, facility3, facility4


---- 35 SET k xy-coordinates

x, y


---- 35 PARAMETER U = 100.000 box size for generating random data
PARAMETER maxdist = 141.421 maximum distance (big-M)
PARAMETER budget = 500.000 available budget

---- 35 PARAMETER client location and cost data

                 x         y      cost

client1     17.175    84.327     6.611
client2     55.038    30.114     7.558
client3     29.221    22.405     6.274
client4     34.983    85.627     2.839
client5      6.711    50.021     0.864
client6     99.812    57.873     1.025
client7     99.113    76.225     6.413
client8     13.069    63.972     5.453
client9     15.952    25.008     0.315
client10    66.893    43.536     7.924
client11    35.970    35.144     0.728
client12    13.149    15.010     1.757
client13    58.911    83.089     5.256
client14    23.082    66.573     7.502
client15    77.586    30.366     1.781
client16    11.049    50.238     0.341
client17    16.017    87.246     5.851
client18    26.511    28.581     6.212
client19    59.396    72.272     3.894
client20    62.825    46.380     3.587
client21    41.331    11.770     2.430
client22    31.421     4.655     2.464
client23    33.855    18.210     1.305
client24    64.573    56.075     9.334
client25    76.996    29.781     3.799

The clients are uniformly distributed over the \([0,100]\times[0,100]\) square.
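A minimal sketch of how comparable test data could be generated, written in Python rather than the GAMS that produced the listing above. The seed, the unit cost range \([0,10]\) and the function name are assumptions for illustration only, so the numbers will not match the table.

```python
import math
import random

def make_clients(n, box, max_cost, seed):
    """Draw n clients uniformly in [0, box] x [0, box], each with a random unit cost."""
    rng = random.Random(seed)
    return {
        f"client{i}": {
            "x": rng.uniform(0, box),
            "y": rng.uniform(0, box),
            "cost": rng.uniform(0, max_cost),  # assumed cost range, not from the post
        }
        for i in range(1, n + 1)
    }

U = 100.0
clients = make_clients(n=25, box=U, max_cost=10.0, seed=123)

# The longest possible distance inside the box is its diagonal (used as big-M later).
maxdist = math.sqrt(2) * U
print(f"maxdist = {maxdist:.3f}")  # maxdist = 141.421
```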

[Figure: client locations]

1. High-level MINLP Formulation


The first step is to translate the "word-problem" into a high-level Mixed-Integer Nonlinear Programming (MINLP) model. The goal is not to achieve the best performance possible, but rather to have a "reference model": a precise mathematical description of the problem.

We use the following naming:


Names

Indices
  \(i\): Clients
  \(j\): Facilities
  \(k\): x or y coordinate

Data
  \(\color{darkblue}{\mathit{Cost}}_i\): Unit cost
  \(\color{darkblue}{\mathit{Client}}_{i,k}\): Client locations
  \(\color{darkblue}{\mathit{Budget}}\): Available budget
  \(\color{darkblue}U\): Size of box containing client locations

Variables
  \(\color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \): Assignment of client to facility
  \(\color{darkred}{\mathit{ClientUsed}}_{i} \in \{0,1\} \): Client is assigned to a facility
  \(\color{darkred}{\mathit{Facility}}_{j,k} \in [0,\color{darkblue}U]\): Facility locations
  \(\color{darkred}{\mathit{Dist}}_{i,j} \in [0,\sqrt{2}\color{darkblue}U] \): Distance between client and facility (zero if not assigned)

With this we can start developing our model:


MINLP Problem
\[\begin{align}\max & \sum_i \color{darkred}{\mathit{ClientUsed}}_i\\ & \color{darkred}{\mathit{ClientUsed}}_i = \sum_j \color{darkred}{\mathit{Assign}}_{i,j} \\ &\color{darkred}{\mathit{Dist}}_{i,j} = \color{darkred}{\mathit{Assign}}_{i,j} \cdot \sqrt{\sum_k \left(\color{darkblue}{\mathit{Client}}_{i,k}-\color{darkred}{\mathit{Facility}}_{j,k} \right)^2} \\ & \sum_{i,j} \color{darkblue}{\mathit{Cost}}_i \color{darkred}{\mathit{Dist}}_{i,j} \le \color{darkblue}{\mathit{Budget}} \end{align}\]


We don't need to add an assignment constraint like \[\sum_j {\mathit{Assign}}_{i,j}  \le 1 \>\>\forall i \] because it is implied: \(\mathit{ClientUsed}_i\) is binary, so the constraint \[\mathit{ClientUsed}_i = \sum_j {\mathit{Assign}}_{i,j}\] already limits each client to at most one facility.

The distance constraint can be interpreted as \[{\mathit{Dist}}_{i,j} = \begin{cases} \sqrt{\displaystyle\sum_k \left({\mathit{Client}}_{i,k}-{\mathit{Facility}}_{j,k} \right)^2} & \text{if $\mathit{Assign}_{i,j}=1$} \\ 0 & \text{if $\mathit{Assign}_{i,j}=0$}\end{cases}\]

We can even use a global MINLP solver to get optimal solutions for smaller data sets (or "good" solutions for slightly larger problems). This will give us more confidence that the model is correctly representing the problem.
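A reference model is also handy for checking candidate solutions by hand. The sketch below evaluates the objective and the budget constraint for a given assignment; the function name, the data layout and the toy instance are hypothetical and not taken from the post.

```python
import math

def evaluate(clients, facilities, assign, budget):
    """Return (clients served, total cost, within budget) for a candidate solution.

    clients:    {name: (x, y, unit_cost)}
    facilities: {name: (x, y)}
    assign:     {client name: facility name}; unassigned clients are simply absent,
                so each client has at most one facility by construction
    """
    total = 0.0
    for c, f in assign.items():
        cx, cy, unit_cost = clients[c]
        fx, fy = facilities[f]
        total += unit_cost * math.hypot(cx - fx, cy - fy)  # Cost_i * Dist_ij
    return len(assign), total, total <= budget

# Hypothetical toy instance, not the data from the post.
clients = {"a": (0.0, 0.0, 2.0), "b": (3.0, 4.0, 1.0), "c": (90.0, 90.0, 5.0)}
facilities = {"f1": (0.0, 0.0)}
served, cost, ok = evaluate(clients, facilities, {"a": "f1", "b": "f1"}, budget=10.0)
print(served, cost, ok)  # 2 5.0 True
```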

2. Mixed Integer Quadratically Constrained Model


The second step is to get rid of some of the non-linearities so that a quadratic model remains. We need to work on the distance constraint. By making it a big-M constraint, we can make everything quadratic. We get rid of the square root and the multiplication by the Assign variable.


MIQCP Problem
\[\begin{align}\max & \sum_i \color{darkred}{\mathit{ClientUsed}}_i\\ & \color{darkred}{\mathit{ClientUsed}}_i = \sum_j \color{darkred}{\mathit{Assign}}_{i,j} \\ &\color{darkred}{\mathit{Dist}}_{i,j}^2 \ge \sum_k \left(\color{darkblue}{\mathit{Client}}_{i,k}-\color{darkred}{\mathit{Facility}}_{j,k} \right)^2 - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) \\ & \sum_{i,j} \color{darkblue}{\mathit{Cost}}_i \color{darkred}{\mathit{Dist}}_{i,j} \le \color{darkblue}{\mathit{Budget}} \\ &\color{darkblue} M = 2\color{darkblue}U^2 \end{align}\]


If we run this with Cplex we see:

CPLEX Error  5002: 'QCP_row_for_distance2(client1.facility1)' is not convex.

The new version Gurobi 9 can solve this as a nonconvex MIQCP. Of course we can also solve this with a global MINLP solver.
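To illustrate the non-convexity, here is a small numeric check on the continuous relaxation of the distance constraint for a single client and a fixed facility location. It is only an illustration under assumed data (client at the origin, facility at \((96,72)\)); it is not a description of the test Cplex performs internally. Two feasible points are averaged, and the midpoint violates the constraint.

```python
U = 100.0
M = 2 * U**2  # big-M from the MIQCP model

def feasible(dist, assign, client, facility):
    """Check Dist^2 >= sum_k (Client_k - Facility_k)^2 - M*(1 - Assign)."""
    sq = sum((c - f) ** 2 for c, f in zip(client, facility))
    return dist**2 >= sq - M * (1 - assign)

client, facility = (0.0, 0.0), (96.0, 72.0)  # assumed data; true distance is 120

a = (0.0, 0.0)    # (Dist, Assign): not assigned, zero distance
b = (120.0, 1.0)  # assigned, Dist equal to the true distance
mid = tuple((p + q) / 2 for p, q in zip(a, b))  # (60.0, 0.5)

print(feasible(*a, client, facility))    # True
print(feasible(*b, client, facility))    # True
print(feasible(*mid, client, facility))  # False
```

A convex feasible region would contain the midpoint of any two feasible points, so this relaxation cannot be convex.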

The problem would be easier if the costs increased proportionally with the squared distance instead of with the distance itself. That would yield a simpler, convex MIQCP problem [2].

3. Mixed Integer Second Order Cone Formulation


The constraint \[y^2 \ge \sum_i x_i^2 \>\>\text{where $y\ge 0$}\] is a Second Order Cone constraint. Cplex supports this type of constraint, but only if it is specified (almost) exactly in this form, i.e. with no additional linear terms in the constraint. The mechanics are not complicated, just a bit tedious. We need to split our distance constraint into parts to shoehorn it into an SOCP framework:

MISOCP Problem
\[\begin{align}\max & \sum_i \color{darkred}{\mathit{ClientUsed}}_i\\ & \color{darkred}{\mathit{ClientUsed}}_i = \sum_j \color{darkred}{\mathit{Assign}}_{i,j} \\ & \color{darkred}{\mathit{Dist}}_{i,j} \ge \color{darkred}\Delta_{i,j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j})  \\ &\color{darkred}{\mathit{\Delta}}_{i,j}^2 \ge  \sum_k  \color{darkred} D_{i,j,k}^2\\ & \color{darkred} D_{i,j,k} = \color{darkblue}{\mathit{Client}}_{i,k}-\color{darkred}{\mathit{Facility}}_{j,k}  \\  & \sum_{i,j} \color{darkblue}{\mathit{Cost}}_i \color{darkred}{\mathit{Dist}}_{i,j} \le \color{darkblue}{\mathit{Budget}} \\ & \color{darkred}\Delta_{i,j} \ge 0 \\ & \color{darkred}D_{i,j,k} \>\text{free} \\ & \color{darkblue} M = \sqrt{2}\color{darkblue}U\end{align}\]


This is now a convex problem. It can be solved with MISOCP solvers like Cplex, Gurobi and Mosek.
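A quick way to convince yourself that the split constraints still encode the original distance logic is to compute the smallest value of \(\mathit{Dist}_{i,j}\) they allow. The sketch below does this for a single client and facility; the helper name and the sample coordinates are assumptions for illustration.

```python
import math

U = 100.0
M = math.sqrt(2) * U  # big-M from the MISOCP model

def min_dist(client, facility, assign):
    """Smallest Dist >= 0 with Dist >= Delta - M*(1 - Assign),
    where Delta sits on the cone boundary: Delta = sqrt(sum_k D_k^2)."""
    d = [c - f for c, f in zip(client, facility)]  # D[i,j,k]
    delta = math.sqrt(sum(v * v for v in d))       # smallest feasible Delta[i,j]
    return max(0.0, delta - M * (1 - assign))

client, facility = (0.0, 0.0), (96.0, 72.0)  # assumed sample points
print(min_dist(client, facility, 1))  # 120.0
print(min_dist(client, facility, 0))  # 0.0
```

When the client is assigned, the lower bound is the true Euclidean distance; when it is not, the big-M term relaxes the bound to zero.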

This is just a reformulation of the previous MIQCP model. Solvers do not typically recognize that form as a second order cone problem. So often the task is left to the modeler to help the solver and produce a convex model. Of course we would like to see this automated [3] so we don't need to rely on remembering a bag of reformulation tricks. Another issue is that this reformulation makes the model less intuitive.

This is a small problem, but Cplex does not have an easy time solving it. Interestingly, it found the optimal solution only at the very end of the search.


    Node     Left     Objective  IInf  Best Integer    Best Bound     ItCnt     Gap

 6939104  1303772       20.7395    24       19.0000       20.8047  2.53e+08   9.50%
 6971225  1302201       20.0000    20       19.0000       20.7915  2.54e+08   9.43%
 7005795  1298106    infeasible             19.0000       20.7779  2.55e+08   9.36%
 7039089  1292724       20.0000    24       19.0000       20.7573  2.56e+08   9.25%
 7072254  1283943    infeasible             19.0000       20.7430  2.58e+08   9.17%
 7107348  1281824    infeasible             19.0000       20.7274  2.58e+08   9.09%
 7143017  1262286    infeasible             19.0000       20.7002  2.60e+08   8.95%
 7179356  1248370       20.0000    29       19.0000       20.6153  2.61e+08   8.50%
 7218255  1222703    infeasible             19.0000       20.2398  2.62e+08   6.53%
*7227772  1220913      integral     0       20.0000       20.1541  2.62e+08   0.77%
Found incumbent of value 19.999999 after 1932.55 sec. (890057.09 ticks)
*7247324   926397      integral     0       20.0000       20.0058  2.63e+08   0.03%
Cone: 48
Found incumbent of value 20.000000 after 1940.95 sec. (892272.37 ticks)
 7252826   926392    infeasible             20.0000       20.0058  2.63e+08   0.03%
Elapsed time = 1945.70 sec. (892962.40 ticks, tree = 766.57 MB, solutions = 28)
 7280114    64647    infeasible             20.0000       20.0002  2.64e+08   0.00%
 7303087    40831        cutoff             20.0000       20.0000  2.64e+08   0.00%
 7323676    15673    infeasible             20.0000       20.0000  2.66e+08   0.00%

Cover cuts applied: 1
Mixed integer rounding cuts applied: 20
Gomory fractional cuts applied: 10
Cone linearizations applied: 3328

Root node processing (before b&c):
Real time = 0.25 sec. (50.77 ticks)
Parallel b&c, 16 threads:
Real time = 1995.13 sec. (905620.31 ticks)
Sync time (average) = 265.77 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 1995.38 sec. (905671.08 ticks)
MIQCP status(101): integer optimal solution


The solution looks like:


----    156 VARIABLE assign.L  client to facility

facility1 facility2 facility3 facility4

client1 1.000
client3 1.000
client4 1.000
client5 1.000
client6 1.000
client8 1.000
client9 1.000
client10 1.000
client11 1.000
client12 1.000
client14 1.000
client15 1.000
client16 1.000
client17 1.000
client18 1.000
client20 1.000
client21 1.000
client22 1.000
client23 1.000
client25 1.000


---- 156 VARIABLE facility.L coordinates

x y

facility1    17.146    84.398
facility2    23.055    66.584
facility3    29.228    22.420
facility4    66.892    43.536


---- 156 VARIABLE clientUsed.L client is assigned to facility

client1 1.000, client3 1.000, client4 1.000, client5 1.000
client6 1.000, client8 1.000, client9 1.000, client10 1.000
client11 1.000, client12 1.000, client14 1.000, client15 1.000
client16 1.000, client17 1.000, client18 1.000, client20 1.000
client21 1.000, client22 1.000, client23 1.000, client25 1.000


---- 156 PARAMETER DistanceReport distances for assignments

facility1 facility2 facility3 facility4

client1 0.084
client3 0.029
client4 22.499
client5 23.346
client6 35.973
client8 10.337
client9 13.634
client10 0.020
client11 14.488
client12 17.746
client14 0.047
client15 17.008
client16 20.391
client17 3.082
client18 6.750
client20 4.991
client21 16.154
client22 17.930
client23 6.298
client25 17.089


---- 156 PARAMETER CostAllocation cost = unit cost * distance for assignments

facility1 facility2 facility3 facility4

client1 0.557
client3 0.180
client4 63.867
client5 20.177
client6 36.877
client8 56.367
client9 4.298
client10 0.155
client11 10.542
client12 31.173
client14 0.352
client15 30.295
client16 6.962
client17 18.036
client18 41.934
client20 17.903
client21 39.260
client22 44.184
client23 8.219
client25 64.929


---- 156 PARAMETER CostReport total cost vs budget

total 496.267, limit 500.000

In short: we can cover 20 clients with the given budget and 4 facilities. We recalculated the cost after the solve, as the optimization model may overestimate the distance and hence the cost: it only requires the total cost to stay below the budget, so nothing forces \(\mathit{Dist}_{i,j}\) down to the true distance.
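As a sanity check, the individual entries of the CostAllocation listing above can be summed and compared with the reported total and the budget. The values below are copied from that listing; the variable names are mine.

```python
# CostAllocation values copied from the listing above (MISOCP solution).
costs = [0.557, 0.180, 63.867, 20.177, 36.877, 56.367, 4.298, 0.155, 10.542, 31.173,
         0.352, 30.295, 6.962, 18.036, 41.934, 17.903, 39.260, 44.184, 8.219, 64.929]
budget = 500.0

total = sum(costs)
print(f"clients served: {len(costs)}")                          # clients served: 20
print(f"total cost: {total:.3f}, slack: {budget - total:.3f}")  # total cost: 496.267, slack: 3.733
```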

[Figure: solution]

One thing we can see is that being close to a facility does not guarantee that a client is selected: serving it can still be too costly because of a high unit cost. We should also note that this is not a min-cost solution, just a solution that obeys the budget constraint.

4. Approximation


In the previous models, we calculated two quantities:

  1. \(x,y\) location of each facility (continuous variables)
  2. Assignment of clients to a facility (binary variable) 

This turned out to be a difficult model to solve. The usual approach is: if a model is too difficult to solve, solve a different problem. If we set up a grid, and allow a facility only to be placed on the grid points, we get a much simpler model. We can pre-compute the distance and cost between each client location and each possible facility location. The end result is just a linear, but large MIP model.

The model looks like:

MIP Problem
\[\begin{align}\max & \sum_i \color{darkred}{\mathit{ClientUsed}}_i\\ & \color{darkred}{\mathit{ClientUsed}}_i = \sum_p \color{darkred}{\mathit{Assign}}_{i,p} \\ & \sum_p \color{darkred}{\mathit{PointUsed}}_p = \color{darkblue}{\mathit{NumFacilities}}\\ & \color{darkred}{\mathit{Assign}}_{i,p} \le \color{darkblue}{\mathit{NumClients}} \cdot\color{darkred}{\mathit{PointUsed}}_p\\ & \sum_{i,p} \color{darkblue}{\mathit{Cost}}_{i,p} \cdot \color{darkred}{\mathit{Assign}}_{i,p} \le \color{darkblue}{\mathit{Budget}} \\ & \color{darkred}{\mathit{Assign}}_{i,p} \in \{0,1\} \\ & \color{darkred}{\mathit{PointUsed}}_p \in \{0,1\} \end{align}\]

Here \(p\) are the grid points. As we have a box of \([0,100]\times[0,100]\) for the client locations, I used \(101 \times 101 = 10201\) grid points. This is a fine grid, so we expect results similar to those of the models where facilities can be located anywhere on the \(x\)-\(y\) plane. The parameter \(\mathit{Cost}_{i,p}\) is precalculated. The resulting model is very large (considering the original data set was small):
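The precalculation of \(\mathit{Cost}_{i,p}\) is straightforward. Below is a minimal Python sketch, assuming integer grid points \((0..100)\times(0..100)\) and using two clients from the example data; the function name and the dictionary layout are illustrative choices, not the author's GAMS code.

```python
import math

def grid_costs(clients, size=100):
    """Return the grid points and {(client, (gx, gy)): unit_cost * distance}."""
    points = [(gx, gy) for gx in range(size + 1) for gy in range(size + 1)]
    cost = {
        (name, p): unit_cost * math.hypot(x - p[0], y - p[1])
        for name, (x, y, unit_cost) in clients.items()
        for p in points
    }
    return points, cost

# Two clients from the example data: (x, y, unit cost).
clients = {"client1": (17.175, 84.327, 6.611), "client10": (66.893, 43.536, 7.924)}
points, cost = grid_costs(clients)
print(len(points))                          # 10201
print(round(cost["client1", (17, 85)], 3))  # cost of serving client1 from grid point (17, 85)
```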


MODEL STATISTICS

BLOCKS OF EQUATIONS 5 SINGLE EQUATIONS 255,053
BLOCKS OF VARIABLES 4 SINGLE VARIABLES 265,252
NON ZERO ELEMENTS 1,030,352 DISCRETE VARIABLES 265,251
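These sizes can be reproduced by simple counting. The breakdown below is my own inference (one linking constraint per client and grid point, plus an objective variable and its defining equation), but it matches the reported statistics exactly.

```python
num_clients = 25
num_points = 101 * 101
pairs = num_clients * num_points  # one Assign variable per (client, grid point)

# Variables: Assign, PointUsed, ClientUsed, plus one objective variable.
variables = pairs + num_points + num_clients + 1
# Equations: ClientUsed definition, facility count, linking, budget, objective.
equations = num_clients + 1 + pairs + 1 + 1
# Nonzeros, assuming every Cost[i,p] is nonzero.
nonzeros = (
    num_clients * (1 + num_points)  # ClientUsed definition
    + num_points                    # facility count
    + 2 * pairs                     # linking Assign to PointUsed
    + pairs                         # budget
    + num_clients + 1               # objective definition
)
print(equations, variables, nonzeros)  # 255053 265252 1030352
```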

However, it solves very fast (less than a minute). The results are:

----    121 VARIABLE assign.L  client to facility

gp1803 gp2953 gp6043 gp6811

client1 1.000
client3 1.000
client4 1.000
client5 1.000
client6 1.000
client9 1.000
client10 1.000
client11 1.000
client12 1.000
client13 1.000
client15 1.000
client16 1.000
client17 1.000
client18 1.000
client19 1.000
client20 1.000
client21 1.000
client22 1.000
client23 1.000
client25 1.000


---- 121 VARIABLE clientUsed.L client is assigned to gridpoint

client1 1.000, client3 1.000, client4 1.000, client5 1.000, client6 1.000, client9 1.000
client10 1.000, client11 1.000, client12 1.000, client13 1.000, client15 1.000, client16 1.000
client17 1.000, client18 1.000, client19 1.000, client20 1.000, client21 1.000, client22 1.000
client23 1.000, client25 1.000


---- 121 VARIABLE numClients.L = 20.000 objective variable

---- 121 PARAMETER DistanceReport distances for assignments

gp1803 gp2953 gp6043 gp6811

client1 0.696
client3 0.635
client4 17.994
client5 35.027
client6 36.025
client9 13.202
client10 0.546
client11 14.002
client12 17.751
client13 0.126
client15 16.483
client16 32.622
client17 2.452
client18 6.111
client19 10.735
client20 5.372
client21 16.678
client22 18.504
client23 6.820
client25 16.573


---- 121 PARAMETER CostAllocation cost = unit cost * distance for assignments

gp1803 gp2953 gp6043 gp6811

client1 4.599
client3 3.981
client4 51.078
client5 30.272
client6 36.931
client9 4.162
client10 4.328
client11 10.189
client12 31.181
client13 0.661
client15 29.360
client16 11.137
client17 14.346
client18 37.964
client19 41.799
client20 19.269
client21 40.534
client22 45.598
client23 8.901
client25 62.968


---- 121 PARAMETER CostReport total cost vs budget

total 489.260, limit 500.000


We see again that we can cover 20 clients with the given budget.
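The grid point labels can be translated back to coordinates. The numbering used below, \(N = 101x + y + 1\) for label gpN, is not stated in the post; it is an assumption that I inferred because it reproduces the reported distances. Each client is paired here with the selected grid point nearest to it, which is also an inference since the listing above does not show the columns.

```python
import math

def grid_point(label, size=100):
    """Decode 'gpN' into (x, y), assuming N = (size + 1) * x + y + 1."""
    n = int(label[2:]) - 1
    return divmod(n, size + 1)

# (client location, nearest selected grid point, reported distance)
checks = [
    ((17.175, 84.327), "gp1803", 0.696),  # client1
    ((29.221, 22.405), "gp2953", 0.635),  # client3
    ((58.911, 83.089), "gp6043", 0.126),  # client13
    ((66.893, 43.536), "gp6811", 0.546),  # client10
]
for (x, y), label, reported in checks:
    gx, gy = grid_point(label)
    dist = math.hypot(x - gx, y - gy)
    print(label, (gx, gy), f"computed {dist:.3f}", f"reported {reported:.3f}")
```

Under this assumption the four selected locations are \((17,85)\), \((29,23)\), \((59,83)\) and \((67,43)\), and the computed distances agree with the reported ones up to the rounding of the printed client coordinates.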

Conclusion


A variant of the facility location problem seeks to find how many clients we can serve for a given budget. The costs are expressed as \(c_i \times \mathbf{dist}(i,j)\), where \(c_i\) is the unit cost for client \(i\) and \(\mathbf{dist}(i,j)\) is the distance between client \(i\) and facility \(j\). 

We formulated the model in four different ways:

  • An MINLP model,
  • an MIQCP model,
  • an MISOCP model,
  • and an approximation: facilities are only to be placed on grid points instead of on any point \((x,y)\). This yielded a large but simple MIP model.

The MIP model was by far the easiest to solve.

References


  1. Uncapacitated facility location problem with at most k facilities and budget b, https://stackoverflow.com/questions/59327296/uncapacitated-facility-location-problem-with-at-most-k-facilities-and-budget-b
  2. Solving a facility location problem as an MIQCP, https://yetanothermathprogrammingconsultant.blogspot.com/2018/01/solving-facility-location-problem-as.html
  3. Jared Erickson, Robert Fourer, Detection and Transformation of Second-Order Cone Programming Problems in a General-Purpose Algebraic Modeling Language, 2019,  http://www.optimization-online.org/DB_HTML/2019/05/7194.html

