Many practical optimization models contain some network structures. In virtually all cases, these networks are modeled as directed graphs. I.e., between nodes \(i\) and \(j\), we consider arcs \(i \rightarrow j\) and \(j \rightarrow i\). In graph theory, it is very customary to talk about undirected graphs. In this case, there is just one link between nodes \(i\) and \(j\). Why am I using directed graphs, with two uni-directional arcs for each pair of nodes? I will try to answer this in this post.
As an example, let's consider the shortest path problem. This is really the same as a min-cost problem. I took the following data (from [1]):
---- 36 SET i nodes
node0, node1, node2, node3, node4, node5, node6, node7, node8
---- 36 SET u undirected arcs
node1 node2 node3 node4 node5 node6 node7 node8
node0 YES YES
node1 YES YES
node2 YES YES YES
node3 YES YES
node4 YES
node5 YES
node6 YES YES
node7 YES
---- 36 PARAMETER ucost undirected cost
node1 node2 node3 node4 node5 node6 node7 node8
node0 4.0008.000
node1 8.00011.000
node2 7.0004.0002.000
node3 9.00014.000
node4 10.000
node5 2.000
node6 1.0006.000
node7 7.000
---- 36 SET a directed arcs
node0 node1 node2 node3 node4 node5 node6 node7 node8
node0 YES YES
node1 YES YES YES
node2 YES YES YES YES
node3 YES YES YES
node4 YES YES
node5 YES YES YES YES
node6 YES YES YES
node7 YES YES YES YES
node8 YES YES YES
---- 36 PARAMETER acost directed cost
node0 node1 node2 node3 node4 node5 node6 node7 node8
node0 4.0008.000
node1 4.0008.00011.000
node2 8.0007.0004.0002.000
node3 7.0009.00014.000
node4 9.00010.000
node5 4.00014.00010.0002.000
node6 2.0001.0006.000
node7 8.00011.0001.0007.000
node8 2.0006.0007.000
From this, it looks like that the undirected version is preferable: half of the arcs and half of the data. As verification:
---- 91 PARAMETER count number of arcs
undirected 14.000, directed 28.000
In addition, we need to know the source node, and the sink node. Here we model this as a sparse vector, with an exogenous supply of 1 (for the source node) and demand of 1 (for the sink node).
---- 43 PARAMETER ex exogenous supply/demand
node0 1.000, node4 -1.000
With this, a standard shortest path model with directed arcs can be formulated as:
| Shortest Path Model: Directed Arcs |
|---|
| \[\begin{align}\min\>&\sum_{(i,j)\in \color{darkblue}A} \mathit{\color{darkblue}{acost}}_{i,j} \cdot \color{darkred}f_{i,j}\\ & \sum_{(i,k)\in \color{darkblue}A} \color{darkred}f_{i,k} + \mathit{\color{darkblue}{ex}}_{k} = \sum_{(k,j)\in \color{darkblue}A} \color{darkred}f_{k,j} && \forall k\\ &\color{darkred}f_{i,j} \in [0,1] \end{align}\] |
The constraint says: the inflow for node \(k\) is equal to the outflow. The flows are positive, and we have an arc capacity of 1 (no need to ship more around). The results are:
---- 62 VARIABLE f.L flow
node4 node5 node6 node7
node0 1.000
node5 1.000
node6 1.000
node7 1.000
---- 62 VARIABLE z.L = 21.000 total cost (objective)
As expected, we see integer flows, even though we solved the problem as an LP. Pure networks yield integer solutions automatically.
So the question is how do we handle this with an undirected network. The first thing is that we need to allow for negative flows. E.g. a unit flow from node 5 to 4, will be seen as \(\color{darkred}f_{4,5}=-1\). I.e. our capacity restriction for all links with \((i,j)\in \color{darkblue}U\) becomes \(|\color{darkred}f_{i,j}| \le 1\) or \(\color{darkred}f_{i,j} \in [-1,1]\).
The flow conservation constraint does not change much, although the interpretation is a bit different. On the left-hand side, we no longer have pure inflow. We may see a negative flow \(\color{darkred}f_{i,k} \lt 0\). We could call it net inflow from nodes \(i\) with \((i,k)\in U\). The right-hand side has net-outflow to nodes \(j\) with \((k,j)\in U\) (again, we may see negative terms \(\color{darkred}f_{k,j} \lt 0\)).
The objective is more problematic. We can expect flows to be \(-1\), \(0\) or \(+1\). This means the objective becomes \[\min\sum_{(i,j)\in \color{darkblue}U} \mathit{\color{darkblue}{ucost}}_{i,j} \cdot |\color{darkred}f_{i,j}|\] This can be linearized, which leads to:
| Shortest Path Model: Undirected Arcs |
|---|
| \[\begin{align}\min\>&\sum_{(i,j)\in \color{darkblue}U} \mathit{\color{darkblue}{ucost}}_{i,j} \cdot \mathit{\color{darkred}{fabs}}_{i,j} \\ & \color{darkred}f_{i,j} \le \mathit{\color{darkred}{fabs}}_{i,j} && \forall (i,j) \in U\\ &\color{darkred}f_{i,j} \ge -\mathit{\color{darkred}{fabs}}_{i,j} && \forall (i,j) \in U \\ & \sum_{(i,k)\in \color{darkblue}U} \color{darkred}f_{i,k} + \mathit{\color{darkblue}{ex}}_{k} = \sum_{(k,j)\in \color{darkblue}U} \color{darkred}f_{k,j} && \forall k\\ &\color{darkred}f_{i,j} \in [-1,1] \\ &\mathit{\color{darkred}{fabs}}_{i,j} \in [0,1] \end{align}\] |
The results look like:
---- 85 VARIABLE f.L flow
node5 node6 node7
node0 1.000
node4 -1.000
node5 -1.000
node6 -1.000
---- 85 VARIABLE z.L = 21.000 total cost (objective)
Clearly, in this model, we have lost the advantage of the undirected graph to be smaller. The linearization of the absolute values, make this an unattractive approach.
Conclusion
Using directed networks is often easier when we need to model network flows in the mathematical programming model.