![]() |
| U-Haul Moving Van [link] |
This is a strange little problem [1]. We have individuals (of different categories) and regions with a given capacity. The initial set of assigned persons fits within the capacity of each region. Now a new batch of individuals arrives with their chosen zones. Unfortunately, the total demand (old + new individuals) does not fit anymore in all cases. This means we need to reallocate people. We can do this at a price (the costs are given). What is the optimal (least cost) new allocation?
Data
The data we need to consider is:
---- 35 PARAMETER popdatapopulation data
initial added final
catA.zone1 11523138
catA.zone2 12124145
catA.zone3 11222134
catA.zone4 761591
catB.zone1 702999
catB.zone2 592483
catB.zone3 8635121
catB.zone4 13957196
catC.zone1 14218160
catC.zone2 72981
catC.zone3 29433
catC.zone4 58866
catD.zone1 222547
catD.zone2 232649
catD.zone3 161834
catD.zone4 455196
---- 44 PARAMETER capacity total capacity per zone
zone1 465, zone2 393, zone3 500, zone4 331
---- 66 PARAMETER costcost moving to target zone
zone1 zone2 zone3 zone4
catA 0.10.10.11.3
catB 16.238.11.50.1
catC 0.112.797.746.3
catD 25.37.767.30.1
---- 80 PARAMETER countssummary data
zone1 zone2 zone3 zone4 total
initial 3492752433181185
final 4443583224491573
capacity 4653935003311689
Model
| LP model |
|---|
| \[\begin{align}\min&\sum_{c,z,z'} \color{darkblue}{\mathit{cost}}_{c,z'}\cdot\color{darkred}{\mathit{move}}_{c,z,z'} \\ & \color{darkred}{\mathit{alloc}}_{c,z} = \color{darkblue}{\mathit{demand}}_{c,z} + \sum_{z'}\color{darkred}{\mathit{move}}_{c,z',z} - \sum_{z'}\color{darkred}{\mathit{move}}_{c,z,z'} && \forall c,z \\ & \sum_c \color{darkred}{\mathit{alloc}}_{c,z} \le \color{darkblue}{\mathit{capacity}}_{z} && \forall z \\ & \color{darkred}{\mathit{move}}_{c,z,z}= 0 && \forall c,z \\ & \color{darkred}{\mathit{move}}_{c,z',z} \ge 0 \\ &\color{darkred}{\mathit{alloc}}_{c,z} \ge 0 \end{align}\] |
Notes:
- The first constraint says: \[\text{allocation$=$demand $+$ inflow $-$ outflow}\] This is just proper bookkeeping.
- The model seems to deliver integer solutions, which makes it easier to interpret results. Not sure if this is guaranteed.
- We cannot substitute out the variable \(\color{darkred}{\mathit{alloc}}_{c,z}\) as it is non-negative. I'll show below what happens when we do that anyway.
- We can drop the constraint \(\color{darkred}{\mathit{move}}_{c,z,z}= 0\). This will automatically hold in the optimal solution as we are minimizing cost. If we choose to keep it, we can use fixing (setting the upper bound to zero) instead of writing a full-blown equation. For large models, it may be a good strategy to drop \(\color{darkred}{\mathit{move}}_{c,z,z} \) from the model altogether.
In any case, a good presolver will remove \(\color{darkred}{\mathit{move}}_{c,z,z} \) from the model. These variables cancel out in the first constraint, so they only appear in the objective. - Here we assumed that all people can be moved. Later on, we see how things change when we only allow the new, additional individuals are allowed to be reassigned.
---- 122 VARIABLE move.L moves needed to meet capacity
zone1 zone2 zone3
catA.zone1 6
catA.zone4 2962
catC.zone4 27
---- 122 VARIABLE alloc.L new allocation
zone1 zone2 zone3 zone4
catA 132180196
catB 9983121196
catC 187813339
catD 47493496
---- 122 VARIABLE totcost.L = 12.400total cost
---- 126 PARAMETER countssummary data
zone1 zone2 zone3 zone4 total
initial 3492752433181185
final 4443583224491573
capacity 4653935003311689
newalloc 4653933843311573
This is an interesting solution. Obviously, we need to move people away from zone 4. The model first moves a few persons from zone 1 to zone 2 to make room in zone 1. I.e., we actually move two people here to move 1 person away from zone 4. Of course, this highly depends on the cost structure.
Alternative 1: substitute out the variable \(\color{darkred}{\mathit{alloc}}_{c,z}\)
This makes the model smaller. So what is not to like. Here are the results:
---- 144 VARIABLE move.L moves needed to meet capacity
zone2 zone3
catA.zone4 3583
---- 144 VARIABLE totcost.L = 11.800total cost
---- 144 PARAMETER alloc2 recalculate allocations
zone1 zone2 zone3 zone4
catA 138.000180.000217.000 -27.000
catB 99.00083.000121.000196.000
catC 160.00081.00033.00066.000
catD 47.00049.00034.00096.000
The objective improves (from 12.4 to 11.8)! Without further analysis, it is not immediately clear why this happens. So I added a post solution calculation of the new allocations: \[\color{darkblue}{\mathit{alloc2}}_{c,z} := \color{darkblue}{\mathit{demand}}_{c,z} + \sum_{z'}\color{darkred}{\mathit{move}}^*_{c,z',z} - \sum_{z'}\color{darkred}{\mathit{move}}^*_{c,z,z'}\] This shows we have a negative allocation for category 4 and zone 4. The reason is of course that with substituting out \(\color{darkred}{\mathit{alloc}}_{c,z}\) we also removed the non-negativity restriction \(\color{darkred}{\mathit{alloc}}_{c,z} \ge 0\). As a result we get an illegal solution.
Conclusion: we cannot do this.
Alternative 2: only move new people
--- 158 VARIABLE move.L moves needed to meet capacity
zone1 zone2 zone3
catA.zone2 3
catA.zone4 132
catB.zone4 57
catC.zone4 8
catD.zone4 38
---- 158 VARIABLE alloc.L new allocation
zone1 zone2 zone3 zone4
catA 362127
catB 292492
catC 2694
catD 25641813
---- 158 VARIABLE totcost.L = 380.700 total cost
---- 162 PARAMETER counts summary data
zone1 zone2 zone3 zone4 total
initial 3492752433181185
final 4443583224491573
capacity 4653935003311689
newalloc 4653933843311573
Indeed the cost of moving just from the pool of added persons is much higher.
Network formulation
In the comments, Rob Pratt uses a network interpretation to argue that the model should indeed deliver integer solutions. A second comment further helped in simplifying the network model.
This is a good example to show how to set up a network LP in GAMS. The idea is to form a network as follows:
![]() |
| Network representation |
The dashed arrows indicate fixed, exogenous in- or outflow. The inflow into the first layer is demand. The second layer contains the zones with the arcs indicating the new allocation after moving people. The final layer collects the total number of persons. The main use for that last layer is to provide capacitated arcs and to have a single outflow. The costs are placed on the arcs between the first and second layer.
To implement this in GAMS, I first create a 2-dimensional set with nodes. The dimensions are (category, zone). I use a 'all' when there is no category, and 'sink' for the aggregated zones. So the sink is identified by ('all','sink'). This means, our arcs have 4 dimensions. These two sets can look like:
---- 98 SET nnodes
zone1 zone2 zone3 zone4 sink
catA YES YES YES YES
catB YES YES YES YES
catC YES YES YES YES
catD YES YES YES YES
all YES YES YES YES YES
---- 98 SET aarcs
all.zone1 all.zone2 all.zone3 all.zone4 all.sink
catA.zone1 YES YES YES YES
catA.zone2 YES YES YES YES
catA.zone3 YES YES YES YES
catA.zone4 YES YES YES YES
catB.zone1 YES YES YES YES
catB.zone2 YES YES YES YES
catB.zone3 YES YES YES YES
catB.zone4 YES YES YES YES
catC.zone1 YES YES YES YES
catC.zone2 YES YES YES YES
catC.zone3 YES YES YES YES
catC.zone4 YES YES YES YES
catD.zone1 YES YES YES YES
catD.zone2 YES YES YES YES
catD.zone3 YES YES YES YES
catD.zone4 YES YES YES YES
all .zone1 YES
all .zone2 YES
all .zone3 YES
all .zone4 YES
We also have some capacities, costs, and exogenous inflows. They are:
---- 112 PARAMETER acost cost of an arc
all.zone1 all.zone2 all.zone3 all.zone4
catA.zone1 0.10.11.3
catA.zone2 0.10.11.3
catA.zone3 0.10.11.3
catA.zone4 0.10.10.1
catB.zone1 38.11.50.1
catB.zone2 16.21.50.1
catB.zone3 16.238.10.1
catB.zone4 16.238.11.5
catC.zone1 12.797.746.3
catC.zone2 0.197.746.3
catC.zone3 0.112.746.3
catC.zone4 0.112.797.7
catD.zone1 7.767.30.1
catD.zone2 25.367.30.1
catD.zone3 25.37.70.1
catD.zone4 25.37.767.3
---- 112 PARAMETER inflow exogenous in- or outflow
zone1 zone2 zone3 zone4 sink
catA 13814513491
catB 9983121196
catC 160813366
catD 47493496
all -1573
---- 112 PARAMETER capcapacity of an arc
all.sink
all.zone1 465
all.zone2 393
all.zone3 500
all.zone4 331
With this we can form a standard min-cost-flow LP model:
| Min-Cost-Flow Network LP Model |
|---|
| \[\begin{align}\min&\sum_{a} \color{darkblue}c_{a} \cdot \color{darkred}f_{a} \\ & \sum_{a(j,i)} \color{darkred}f_{j,i} + \color{darkblue}{\mathit{inflow}}_i = \sum_{a(i,j)} \color{darkred}f_{i,j} && \forall i \\ & \color{darkred}f_a \in [0,\color{darkblue}{\mathit{cap}}_a] \end{align}\] |
---- 141 VARIABLE flow.L flows along arcs
all.zone1 all.zone2 all.zone3 all.zone4 all.sink
catA.zone1 1326
catA.zone2 145
catA.zone3 134
catA.zone4 2962
catB.zone1 99
catB.zone2 83
catB.zone3 121
catB.zone4 196
catC.zone1 160
catC.zone2 81
catC.zone3 33
catC.zone4 2739
catD.zone1 47
catD.zone2 49
catD.zone3 34
catD.zone4 96
all .zone1 465
all .zone2 393
all .zone3 384
all .zone4 331
---- 141 VARIABLE totcost.L = 12.400total cost
---- 146 PARAMETER moves computed from flows
zone1 zone2 zone3
catA.zone1 6
catA.zone4 2962
catC.zone4 27
Notes:
- Although we formulate the problem as a network, I still solve it as an LP. For very large problems it is better to use a specialized network solver.
- The arc costs need a bit of attention. It is important to give the arcs from layer 'demand' to 'zone' a zero cost when the zone doesn't change. In the original LP model we did not care, as we modeled this slightly differently.
- When we want to move only newly added persons, we can preprocess the data, as was demonstrated for the LP formulation.
- My first attempt had an extra layer in the graph. As a commenter noted, this could be simplified. My description and the network topology reflect this simplification.
Conclusion
This is a small but interesting model. In [1] an R/LPSolve implementation is given. When comparing this with my GAMS model (reproduced in the appendix below), I have to conclude that using R/LpSolve is slightly masochistic. The code is unwieldy and not very close to the original problem. This makes reasoning about the problem much more difficult.
Setting up the alternative network model is a very different exercise than the earlier LP models. The min-cost-flow model is quite standard, so we don't have to think long about the model equations. But we have to pay attention to the network topology: that is where the complexity is. It is interesting to compare the two modeling approaches.
References
- Minimise cost of reallocating individuals, https://stackoverflow.com/questions/69046256/minimise-cost-of-reallocating-individuals
Appendix A: GAMS LP model
set |
Appendix B: GAMS Network LP model
$ontext |

