Quantcast
Channel: Yet Another Math Programming Consultant
Viewing all articles
Browse latest Browse all 809

Reallocate people: a very small but interesting LP

$
0
0

U-Haul Moving Van [link]


This is a strange little problem [1]. We have individuals (of different categories) and regions with a given capacity. The initial set of assigned persons fits within the capacity of each region. Now a new batch of individuals arrives with their chosen zones. Unfortunately, the total demand (old + new individuals) does not fit anymore in all cases. This means we need to reallocate people. We can do this at a price (the costs are given). What is the optimal (least cost) new allocation?


Data


The data we need to consider is:


----     35 PARAMETER popdatapopulation data

initial added final

catA.zone1 11523138
catA.zone2 12124145
catA.zone3 11222134
catA.zone4 761591
catB.zone1 702999
catB.zone2 592483
catB.zone3 8635121
catB.zone4 13957196
catC.zone1 14218160
catC.zone2 72981
catC.zone3 29433
catC.zone4 58866
catD.zone1 222547
catD.zone2 232649
catD.zone3 161834
catD.zone4 455196


---- 44 PARAMETER capacity total capacity per zone

zone1 465, zone2 393, zone3 500, zone4 331


---- 66 PARAMETER costcost moving to target zone

zone1 zone2 zone3 zone4

catA 0.10.10.11.3
catB 16.238.11.50.1
catC 0.112.797.746.3
catD 25.37.767.30.1


---- 80 PARAMETER countssummary data

zone1 zone2 zone3 zone4 total

initial 3492752433181185
final 4443583224491573
capacity 4653935003311689


We can see our initial set of people fitted within the capacity of each zone. However, the final set does not (zone 4 is out of room). We can assume that the total capacity is enough to accommodate all people. The optimization problem is: how to move people around at minimum cost, such that all capacity constraints are met.


Model


The LP model I suggest has two variables: \[\begin{align}\color{darkred}{\mathit{move}}_{c,z,z'}&:\>\text{number of people of category $c$ moved from zone $z$ to $z'$}\\ \color{darkred}{\mathit{alloc}}_{c,z}&:\>\text{new allocation of people of category $c$ to zone $z$}  \end{align}\] With these, the LP model can look like: 

LP model
\[\begin{align}\min&\sum_{c,z,z'} \color{darkblue}{\mathit{cost}}_{c,z'}\cdot\color{darkred}{\mathit{move}}_{c,z,z'} \\ & \color{darkred}{\mathit{alloc}}_{c,z} = \color{darkblue}{\mathit{demand}}_{c,z} + \sum_{z'}\color{darkred}{\mathit{move}}_{c,z',z} - \sum_{z'}\color{darkred}{\mathit{move}}_{c,z,z'} && \forall c,z \\ & \sum_c \color{darkred}{\mathit{alloc}}_{c,z} \le \color{darkblue}{\mathit{capacity}}_{z} && \forall z \\ & \color{darkred}{\mathit{move}}_{c,z,z}= 0 && \forall c,z \\ & \color{darkred}{\mathit{move}}_{c,z',z} \ge 0 \\ &\color{darkred}{\mathit{alloc}}_{c,z} \ge 0 \end{align}\]

Notes:
  • The first constraint says: \[\text{allocation$=$demand $+$ inflow $-$ outflow}\] This is just proper bookkeeping.
  • The model seems to deliver integer solutions, which makes it easier to interpret results. Not sure if this is guaranteed.
  • We cannot substitute out the variable \(\color{darkred}{\mathit{alloc}}_{c,z}\) as it is non-negative. I'll show below what happens when we do that anyway.
  • We can drop the constraint \(\color{darkred}{\mathit{move}}_{c,z,z}= 0\). This will automatically hold in the optimal solution as we are minimizing cost. If we choose to keep it, we can use fixing (setting the upper bound to zero) instead of writing a full-blown equation. For large models, it may be a good strategy to drop \(\color{darkred}{\mathit{move}}_{c,z,z} \) from the model altogether.
    In any case, a good presolver will remove \(\color{darkred}{\mathit{move}}_{c,z,z} \) from the model. These variables cancel out in the first constraint, so they only appear in the objective.
  • Here we assumed that all people can be moved. Later on, we see how things change when we only allow the new, additional individuals are allowed to be reassigned.

The solution looks like:


----    122 VARIABLE move.L  moves needed to meet capacity

zone1 zone2 zone3

catA.zone1 6
catA.zone4 2962
catC.zone4 27


---- 122 VARIABLE alloc.L new allocation

zone1 zone2 zone3 zone4

catA 132180196
catB 9983121196
catC 187813339
catD 47493496


---- 122 VARIABLE totcost.L = 12.400total cost

---- 126 PARAMETER countssummary data

zone1 zone2 zone3 zone4 total

initial 3492752433181185
final 4443583224491573
capacity 4653935003311689
newalloc 4653933843311573


This is an interesting solution. Obviously, we need to move people away from zone 4. The model first moves a few persons from zone 1 to zone 2 to make room in zone 1. I.e., we actually move two people here to move 1 person away from zone 4. Of course, this highly depends on the cost structure. 


Alternative 1: substitute out the variable \(\color{darkred}{\mathit{alloc}}_{c,z}\) 


It may be tempting to get rid of the variable \(\color{darkred}{\mathit{alloc}}_{c,z}\) and write the model as \[\begin{align}\min&\sum_{c,z,z'} \color{darkblue}{\mathit{cost}}_{c,z'}\cdot\color{darkred}{\mathit{move}}_{c,z,z'} \\ & \sum_c \left[\color{darkblue}{\mathit{demand}}_{c,z} + \sum_{z'}\color{darkred}{\mathit{move}}_{c,z',z} - \sum_{z'}\color{darkred}{\mathit{move}}_{c,z,z'} \right] \le  \color{darkblue}{\mathit{capacity}}_{z} &&\forall z \end{align}\]

This makes the model smaller. So what is not to like. Here are the results:


----    144 VARIABLE move.L  moves needed to meet capacity

zone2 zone3

catA.zone4 3583


---- 144 VARIABLE totcost.L = 11.800total cost

---- 144 PARAMETER alloc2 recalculate allocations

zone1 zone2 zone3 zone4

catA 138.000180.000217.000 -27.000
catB 99.00083.000121.000196.000
catC 160.00081.00033.00066.000
catD 47.00049.00034.00096.000


The objective improves (from 12.4 to 11.8)! Without further analysis, it is not immediately clear why this happens. So I added a post solution calculation of the new allocations: \[\color{darkblue}{\mathit{alloc2}}_{c,z} := \color{darkblue}{\mathit{demand}}_{c,z} + \sum_{z'}\color{darkred}{\mathit{move}}^*_{c,z',z} - \sum_{z'}\color{darkred}{\mathit{move}}^*_{c,z,z'}\] This shows we have a negative allocation for category 4 and zone 4. The reason is of course that with substituting out \(\color{darkred}{\mathit{alloc}}_{c,z}\) we also removed the non-negativity restriction \(\color{darkred}{\mathit{alloc}}_{c,z} \ge 0\). As a result we get an illegal solution. 

Conclusion: we cannot do this.


Alternative 2: only move new people


In the first model, we allowed both initial and added persons to be moved. We can easily change this to: allow only new persons to be moved and keep the initial persons at their location. We do this by: \[\begin{align} & \color{darkblue}{\mathit{demand}}_{c,z}:=\color{darkblue}{\mathit{popdata}}_{c,z,'\color{darkgreen}{\mathit{Added}}\:'} \\ & \color{darkblue}{\mathit{capacity}}_{z} := \color{darkblue}{\mathit{capacity}}_{z} - \color{darkblue}{\mathit{popdata}}_{c,z,'\color{darkgreen}{\mathit{Initial}}\:'}\end{align}\] and then just run the model again. We would expect a larger cost because we removed some flexibility from the model.




---    158 VARIABLE move.L  moves needed to meet capacity

zone1 zone2 zone3

catA.zone2 3
catA.zone4 132
catB.zone4 57
catC.zone4 8
catD.zone4 38


---- 158 VARIABLE alloc.L new allocation

zone1 zone2 zone3 zone4

catA 362127
catB 292492
catC 2694
catD 25641813


---- 158 VARIABLE totcost.L = 380.700 total cost

---- 162 PARAMETER counts summary data

zone1 zone2 zone3 zone4 total

initial 3492752433181185
final 4443583224491573
capacity 4653935003311689
newalloc 4653933843311573


Indeed the cost of moving just from the pool of added persons is much higher.


Network formulation


In the comments, Rob Pratt uses a network interpretation to argue that the model should indeed deliver integer solutions. A second comment further helped in simplifying the network model.

This is a good example to show how to set up a network LP in GAMS. The idea is to form a network as follows:


Network representation

The dashed arrows indicate fixed, exogenous in- or outflow. The inflow into the first layer is demand. The second layer contains the zones with the arcs indicating the new allocation after moving people. The final layer collects the total number of persons. The main use for that last layer is to provide capacitated arcs and to have a single outflow. The costs are placed on the arcs between the first and second layer.

To implement this in GAMS, I first create a 2-dimensional set with nodes. The dimensions are (category, zone). I use a 'all' when there is no category, and 'sink' for the aggregated zones. So the sink is identified by ('all','sink'). This means, our arcs have 4 dimensions. These two sets can look like:


----     98 SET nnodes

zone1 zone2 zone3 zone4 sink

catA YES YES YES YES
catB YES YES YES YES
catC YES YES YES YES
catD YES YES YES YES
all YES YES YES YES YES


---- 98 SET aarcs

all.zone1 all.zone2 all.zone3 all.zone4 all.sink

catA.zone1 YES YES YES YES
catA.zone2 YES YES YES YES
catA.zone3 YES YES YES YES
catA.zone4 YES YES YES YES
catB.zone1 YES YES YES YES
catB.zone2 YES YES YES YES
catB.zone3 YES YES YES YES
catB.zone4 YES YES YES YES
catC.zone1 YES YES YES YES
catC.zone2 YES YES YES YES
catC.zone3 YES YES YES YES
catC.zone4 YES YES YES YES
catD.zone1 YES YES YES YES
catD.zone2 YES YES YES YES
catD.zone3 YES YES YES YES
catD.zone4 YES YES YES YES
all .zone1 YES
all .zone2 YES
all .zone3 YES
all .zone4 YES


We also have some capacities, costs, and exogenous inflows. They are:


----    112 PARAMETER acost  cost of an arc

all.zone1 all.zone2 all.zone3 all.zone4

catA.zone1 0.10.11.3
catA.zone2 0.10.11.3
catA.zone3 0.10.11.3
catA.zone4 0.10.10.1
catB.zone1 38.11.50.1
catB.zone2 16.21.50.1
catB.zone3 16.238.10.1
catB.zone4 16.238.11.5
catC.zone1 12.797.746.3
catC.zone2 0.197.746.3
catC.zone3 0.112.746.3
catC.zone4 0.112.797.7
catD.zone1 7.767.30.1
catD.zone2 25.367.30.1
catD.zone3 25.37.70.1
catD.zone4 25.37.767.3


---- 112 PARAMETER inflow exogenous in- or outflow

zone1 zone2 zone3 zone4 sink

catA 13814513491
catB 9983121196
catC 160813366
catD 47493496
all -1573


---- 112 PARAMETER capcapacity of an arc

all.sink

all.zone1 465
all.zone2 393
all.zone3 500
all.zone4 331


With this we can form a standard min-cost-flow LP model:

Min-Cost-Flow Network LP Model
\[\begin{align}\min&\sum_{a} \color{darkblue}c_{a} \cdot \color{darkred}f_{a} \\ & \sum_{a(j,i)} \color{darkred}f_{j,i} + \color{darkblue}{\mathit{inflow}}_i = \sum_{a(i,j)} \color{darkred}f_{i,j} && \forall i \\ & \color{darkred}f_a \in [0,\color{darkblue}{\mathit{cap}}_a] \end{align}\] 

The constraint is just flow-conservation: what goes into node \(i\) should also go out. When I solve this, I get the same solution as before:


----    141 VARIABLE flow.L  flows along arcs

all.zone1 all.zone2 all.zone3 all.zone4 all.sink

catA.zone1 1326
catA.zone2 145
catA.zone3 134
catA.zone4 2962
catB.zone1 99
catB.zone2 83
catB.zone3 121
catB.zone4 196
catC.zone1 160
catC.zone2 81
catC.zone3 33
catC.zone4 2739
catD.zone1 47
catD.zone2 49
catD.zone3 34
catD.zone4 96
all .zone1 465
all .zone2 393
all .zone3 384
all .zone4 331


---- 141 VARIABLE totcost.L = 12.400total cost

---- 146 PARAMETER moves computed from flows

zone1 zone2 zone3

catA.zone1 6
catA.zone4 2962
catC.zone4 27


The GAMS code is shown below in appendix B.

Notes:

  • Although we formulate the problem as a network, I still solve it as an LP. For very large problems it is better to use a specialized network solver.
  • The arc costs need a bit of attention. It is important to give the arcs from layer 'demand' to 'zone' a zero cost when the zone doesn't change. In the original LP model we did not care, as we modeled this slightly differently.
  • When we want to move only newly added persons, we can preprocess the data, as was demonstrated for the LP formulation.
  • My first attempt had an extra layer in the graph. As a commenter noted, this could be simplified. My description and the network topology reflect this simplification.

Conclusion


This is a small but interesting model. In [1] an R/LPSolve implementation is given. When comparing this with my GAMS model (reproduced in the appendix below), I have to conclude that using R/LpSolve is slightly masochistic. The code is unwieldy and not very close to the original problem. This makes reasoning about the problem much more difficult.

Setting up the alternative network model is a very different exercise than the earlier LP models. The min-cost-flow model is quite standard, so we don't have to think long about the model equations. But we have to pay attention to the network topology: that is where the complexity is. It is interesting to compare the two modeling approaches.


References


  1. Minimise cost of reallocating individuals, https://stackoverflow.com/questions/69046256/minimise-cost-of-reallocating-individuals


Appendix A: GAMS LP model


set
   dummy
'for ordering (printing)'/initial,added,final/
   c
'category'/catA*catD/
   z
'zone'     /zone1*zone4/
;

*----------------------------------------------------------------
* data
*----------------------------------------------------------------

table popdata(c,z,*) 'population data'
                   
initial   final
   
catA.zone1         115     138
   
catA.zone2         121     145
   
catA.zone3         112     134
   
catA.zone4          76      91
   
catB.zone1          70      99
   
catB.zone2          59      83
   
catB.zone3          86     121
   
catB.zone4         139     196
   
catC.zone1         142     160
   
catC.zone2          72      81
   
catC.zone3          29      33
   
catC.zone4          58      66
   
catD.zone1          22      47
   
catD.zone2          23      49
   
catD.zone3          16      34
   
catD.zone4          45      96
;
popdata(c,z,
'added') = popdata(c,z,'final')-popdata(c,z,'initial');
option popdata:0;
display popdata;

parameter capacity(z) 'total capacity per zone'/
   
zone1   465
   
zone2   393
   
zone3   500
   
zone4   331
/;
option capacity:0;
display capacity;


parameter cost(c,z) 'cost moving to target zone'/
       
catA.zone1   0.1
       
catA.zone2   0.1
       
catA.zone3   0.1
       
catA.zone4   1.3
       
catB.zone1  16.2
       
catB.zone2  38.1
       
catB.zone3   1.5
       
catB.zone4   0.1
       
catC.zone1   0.1
       
catC.zone2  12.7
       
catC.zone3  97.7
       
catC.zone4  46.3
       
catD.zone1  25.3
       
catD.zone2   7.7
       
catD.zone3  67.3
       
catD.zone4   0.1
/;
option cost:1;
display cost;

*----------------------------------------------------------------
* summary data
*----------------------------------------------------------------

parameter counts(*,*) 'summary data';
counts(
'initial',z) = sum(c,popdata(c,z,'initial'));
counts(
'initial','total') = sum(z,counts('initial',z));
counts(
'final',z) = sum(c,popdata(c,z,'final'));
counts(
'final','total') = sum(z,counts('final',z));
counts(
'capacity',z) = capacity(z);
counts(
'capacity','total') = sum(z,capacity(z));
option counts:0;
display counts;


*----------------------------------------------------------------
* optimal reallocation
*----------------------------------------------------------------


alias(z,z1,z2);

parameter
   demand(c,z)
'current demand'
   cap(z)     
'capacity (net)'
;

demand(c,z) = popdata(c,z,
'final');
cap(z) = capacity(z);


positivevariable move(c,z1,z2) 'moves needed to meet capacity';
positivevariable alloc(c,z)  'new allocation';
variable totcost 'total cost';

equations
 eobj            
'objective'
 ealloc(c,z)      
'allocation bookkeeping'
 ecap            
'capacity limits'
;



eobj.. totcost =e=
sum((c,z1,z2),move(c,z1,z2)*cost(c,z2));
ealloc(c,z).. alloc(c,z) =e= demand(c,z) +
sum(z1,move(c,z1,z)) - sum(z2,move(c,z,z2));
ecap(z)..
sum(c,alloc(c,z)) =l= cap(z);

* not needed
move.fx(c,z,z) = 0;

model m /all/;
solve m minimizing totcost using lp;

option move:0, alloc:0;
display move.l,alloc.l,totcost.l;

counts(
'newalloc',z) = sum(c,alloc.l(c,z));
counts(
'newalloc','total') = sum(z,counts('newalloc',z));
display counts;

*----------------------------------------------------------------
* this is wrong: substitute out alloc
*----------------------------------------------------------------

equations
  ecap2(z)     
'capacity constraint (WRONG!)'
;

ecap2(z)..
sum(c, demand(c,z) + sum(z1,move(c,z1,z)) - sum(z2,move(c,z,z2))) =l= cap(z);

model m2 /eobj,ecap2/;
solve m2 minimizing totcost using lp;

parameter alloc2(c,z) 'recalculate allocations';
alloc2(c,z) = demand(c,z) +
sum(z1,move.l(c,z1,z)) - sum(z2,move.l(c,z,z2))

display move.l,totcost.l,alloc2;


*----------------------------------------------------------------
* optimal reallocation while keeping initial population where
* they were.
*----------------------------------------------------------------

demand(c,z) = popdata(c,z,
'added');
cap(z) = capacity(z) -
sum(c,popdata(c,z,'initial'));

solve m minimizing totcost using lp;

option move:0, alloc:0;
display move.l,alloc.l,totcost.l;

counts(
'newalloc',z) = counts('initial',z) + sum(c,alloc.l(c,z));
counts(
'newalloc','total') = sum(z,counts('newalloc',z));
display counts;


Appendix B: GAMS Network LP model


$ontext

  
Small LP about reallocating persons
  
Formulated as a network model

  
https://yetanothermathprogrammingconsultant.blogspot.com/2021/09/reallocate-people-very-small-but.html

  
erwin@amsterdamoptimization.com


$offtext


set
   dummy 
'for ordering (printing)'/initial,added,final/
   c0    
'category superset'       /catA*catD, all/
   c(c0) 
'category'                /catA*catD/
   z0    
'zone superset'           /zone1*zone4, sink/
   z(z0) 
'zone'                    /zone1*zone4/
;

*----------------------------------------------------------------
* data
*----------------------------------------------------------------

table popdata(c,z,*) 'population data'
                   
initial   final
   
catA.zone1         115     138
   
catA.zone2         121     145
   
catA.zone3         112     134
   
catA.zone4          76      91
   
catB.zone1          70      99
   
catB.zone2          59      83
   
catB.zone3          86     121
   
catB.zone4         139     196
   
catC.zone1         142     160
   
catC.zone2          72      81
   
catC.zone3          29      33
   
catC.zone4          58      66
   
catD.zone1          22      47
   
catD.zone2          23      49
   
catD.zone3          16      34
   
catD.zone4          45      96
;
popdata(c,z,
'added') = popdata(c,z,'final')-popdata(c,z,'initial');
option popdata:0;
display popdata;

parameter capacity(z) 'total capacity per zone'/
   
zone1   465
   
zone2   393
   
zone3   500
   
zone4   331
/;
option capacity:0;
display capacity;


parameter cost(c,z) 'cost moving to target zone'/
       
catA.zone1   0.1
       
catA.zone2   0.1
       
catA.zone3   0.1
       
catA.zone4   1.3
       
catB.zone1  16.2
       
catB.zone2  38.1
       
catB.zone3   1.5
       
catB.zone4   0.1
       
catC.zone1   0.1
       
catC.zone2  12.7
       
catC.zone3  97.7
       
catC.zone4  46.3
       
catD.zone1  25.3
       
catD.zone2   7.7
       
catD.zone3  67.3
       
catD.zone4   0.1
/;
option cost:1;
display cost;

*----------------------------------------------------------------
* network topology
*----------------------------------------------------------------

alias (z,z1,z2),(c,c1,c2);

set
  n(c0,z0)
'nodes'/
      
(catA*catD,all).(zone1*zone4)
      
all.sink
  
/
   a(c0,z0,c0,z0)
'arcs'
;

a(c,z1,
'all',z2) = yes;
a(
'all',z,'all','sink') = yes;

option a:0:2:2;
display n,a;

parameter
   acost(c0,z0,c0,z0)    
'cost of an arc'
   inflow(c0,z0)         
'exogenous in- or outflow'
   cap(c0,z0,c0,z0)      
'capacity of an arc'
;

acost(c,z1,
'all',z2)$(not sameas(z1,z2)) = cost(c,z2);
inflow(c,z) = popdata(c,z,
'final');
inflow(
'all','sink') = -sum((c,z),popdata(c,z,'final'));
cap(
'all',z,'all','sink') = capacity(z);

option acost:1:2:2,inflow:0,cap:0:2:2;
display acost,inflow,cap;


*----------------------------------------------------------------
* min cost flow network model
*----------------------------------------------------------------

positivevariable flow(c0,z0,c0,z0) 'flows along arcs';
variable totcost 'total cost';

equations
   nodbal(c0,z0)
'node balance'
   objective
;

alias(n,n1,n2);

objective.. totcost =e=
sum(a,acost(a)*flow(a));

nodbal(n1).. 
sum(a(n2,n1),flow(a)) + inflow(n1) =e= sum(a(n1,n2),flow(a));

flow.up(a)$cap(a) = cap(a);


model mincostflow /all/;
solve mincostflow minimizing totcost using lp;


option flow:0:2:2;
display flow.l, totcost.l;

parameter moves(c,z,z) 'computed from flows';
moves(c,z1,z2)$(
not sameas(z1,z2)) = flow.l(c,z1,'all',z2);
option moves:0;
display moves;




Viewing all articles
Browse latest Browse all 809

Trending Articles