Channel: Yet Another Math Programming Consultant

MIP vs greedy search


Problem


Assume we have \(N\) points (with \(N\) even). Find \(N/2\) pairs of points, such that every point belongs to exactly one pair and the sum of the lengths of the line segments joining the paired points is minimized. This looks like an assignment problem where we do not know in advance how the nodes are partitioned into two equally sized sets.

A picture is probably better than my arduous description:

Optimal MIP solution: sum of the lengths is minimized

Let's try a MIP model.

MIP Formulation


The first thing to do is to calculate a distance matrix between points \(i\) and \(j\). As this matrix is symmetric, we only need to store the upper-triangular part (with \(i\lt j\)). That means we only need to store a little bit less than half the number of entries: \[\mathit{ndist}=\frac{1}{2}N(N-1)\] 
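As a minimal sketch of this step, the upper-triangular distances can be stored in a dictionary keyed by \((i,j)\) with \(i \lt j\). The point generation (uniform in a 100 by 100 square, fixed seed) and the name `upper_triangular_distances` are assumptions for illustration. The post does not say how the points were generated, and the indices here are 0-based.

```python
import math
import random

def upper_triangular_distances(points):
    """Return {(i, j): Euclidean distance} for all pairs with i < j (0-based)."""
    n = len(points)
    return {
        (i, j): math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    }

random.seed(0)  # arbitrary seed, only for reproducibility (assumption)
N = 50
# Hypothetical data: N random points in a 100 x 100 square
points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(N)]
d = upper_triangular_distances(points)

print(len(d), N * (N - 1) // 2)  # 1225 1225
```

For \(N=50\) this stores 1,225 distances instead of the 2,500 entries of the full matrix.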

If we use as variables \[x_{i,j} = \begin{cases} 1 & \text{if nodes $i$ and $j$ are connected}\\ 0 & \text{otherwise}\end{cases}\] we need to think a bit about symmetry. If nodes 1 and 2 are connected, then we only want to see \(x_{1,2}=1\) while ignoring \(x_{2,1}=1\). So again, we only consider the variable \(x_{i,j}\) when \(i \lt j\). Again, this saves about half the number of variables we otherwise would use.

The model can look like: 

MIP Model
\[\begin{align}\min& \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The above picture is the optimal solution from this MIP model.

The constraint looks a bit strange. It basically says: every node \(i\) is either a start- or an end-point of a single segment. Somewhat surprising, at first sight, is that this is enough to characterize our graph.
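One way to convince ourselves is to enumerate, for a tiny \(N\), all 0/1 assignments to the variables \(x_{i,j}\) with \(i \lt j\) and keep those that satisfy the constraint. The sketch below does this by brute force. It is only an illustration (the function name `feasible_solutions` is made up, and node indices are 0-based), not how a MIP solver works.

```python
from itertools import combinations, product

def feasible_solutions(n):
    """Enumerate all 0/1 vectors x[i,j], i < j, satisfying the constraint."""
    pairs = list(combinations(range(n), 2))  # all (i, j) with i < j
    solutions = []
    for values in product((0, 1), repeat=len(pairs)):
        x = dict(zip(pairs, values))
        # Constraint for node i: sum_{j>i} x[i,j] + sum_{j<i} x[j,i] == 1
        ok = all(
            sum(x[i, j] for j in range(i + 1, n))
            + sum(x[j, i] for j in range(i)) == 1
            for i in range(n)
        )
        if ok:
            solutions.append([p for p in pairs if x[p] == 1])
    return solutions

for sol in feasible_solutions(4):
    print(sol)
```

For \(N=4\) exactly three assignments survive, and each one is a set of two segments that together touch every node once. These are precisely the three ways to pair up four points.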

Sidenote

We can see in a picture how the constraint for \(i=4\) is formed:


The blue zone must contain exactly one 1. If we pick \(j=7\), i.e. \(x_{4,7}=1\), the picture becomes:
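Written out, the constraint for \(i=4\) collects the variables in column 4 above the diagonal and in row 4 to the right of the diagonal. The expansion below assumes \(N \ge 8\), which is only needed to make the dots meaningful:

\[\underbrace{x_{1,4}+x_{2,4}+x_{3,4}}_{\sum_{j|j \lt 4} x_{j,4}} + \underbrace{x_{4,5}+x_{4,6}+x_{4,7}+\dots+x_{4,N}}_{\sum_{j|j \gt 4} x_{4,j}} = 1\]

Setting \(x_{4,7}=1\) forces every other variable in this sum to zero. The same variable also appears in the constraint for \(i=7\),

\[x_{1,7}+\dots+x_{4,7}+\dots+x_{6,7} + x_{7,8}+\dots+x_{7,N} = 1\]

so node 7 cannot be part of any other segment either.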


It turns out this MIP model is fairly easy to solve. The example problem with \(N=50\) points solves in a matter of seconds. For \(N=500\) points, we get a large MIP with 124,750 binary variables and 500 constraints. Cplex solves this model quickly to optimality in 90 seconds on my laptop.

Greedy heuristic


An obvious greedy heuristic is as follows:

  1. Find the pair with the shortest distance \[\min_{i \lt j} d_{i,j}\] and record this segment.
  2. Remove the two nodes that belong to this segment from the problem.
  3. If there are still nodes left, go to step 1.
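The three steps above can be sketched directly in Python. This is a straightforward, unoptimized version. The function name `greedy_pairs`, the random test data and the seed are assumptions for illustration, and node indices are 0-based.

```python
import math
import random

def greedy_pairs(points):
    """Greedy pairing: repeatedly take the closest remaining pair.

    Returns a list of (i, j, length) in the order the pairs are chosen.
    """
    remaining = set(range(len(points)))
    chosen = []
    while len(remaining) >= 2:
        # Step 1: shortest distance among the nodes still in the problem
        i, j = min(
            ((a, b) for a in remaining for b in remaining if a < b),
            key=lambda p: math.dist(points[p[0]], points[p[1]]),
        )
        chosen.append((i, j, math.dist(points[i], points[j])))
        # Step 2: remove both endpoints; step 3 is the loop condition
        remaining -= {i, j}
    return chosen

random.seed(0)  # arbitrary seed (assumption)
# Hypothetical data: 50 random points in a 100 x 100 square
points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(50)]
pairs = greedy_pairs(points)
print(len(pairs), sum(length for _, _, length in pairs))
```

Because each round searches a subset of the previous round's candidate pairs, the recorded lengths never decrease from one pick to the next.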

This heuristic is exceedingly simple to implement. Some results for our 50 point example:

MIP vs Greedy
Points   Optimal MIP Objective   Greedy Objective
50       241.2667                339.236

The greedy algorithm is astonishingly bad: its objective is about 41% higher than the optimal one. The animation below shows why: with a simple greedy algorithm we do very well initially, but pay the price in the end.

Greedy algorithm in action


Bonus picture


A slightly different way to look at this behavior is to observe the distribution of the lengths of the segments.

Lengths of the segments: greedy loses badly at the end

For the MIP solution, the segments are sorted by length along the \(x\)-axis. For the greedy algorithm, the \(x\)-axis is the order in which the pairs are found. We can distinguish three areas:

  • For the first, very small segments, the MIP solution and the greedy algorithm pick the same pairs. No difference.
  • In the middle, the greedy algorithm is doing a little bit better: it picks shorter segments than the MIP solver.
  • But at the end, the greedy algorithm totally collapses: it has to choose very long segments.
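The same effect can be reproduced on a tiny instance. The sketch below is an illustration, not the MIP model from this post: it uses four hypothetical points on a line, and a bitmask dynamic program stands in for the MIP solver to find the optimal pairing. That approach is only practical for small, even \(N\) (roughly 20 points or fewer). All names are made up for this example, and indices are 0-based.

```python
from functools import lru_cache

def exact_min_pairing(dist, n):
    """Minimum total length over all perfect pairings of nodes 0..n-1 (n even).

    Bitmask dynamic programming, used here as a stand-in for a MIP solver.
    """
    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest-numbered unmatched node
        rest = mask & ~(1 << i)
        options = []
        for j in range(i + 1, n):
            if rest & (1 << j):
                cost, pairs = best(rest & ~(1 << j))
                options.append((cost + dist(i, j), ((i, j),) + pairs))
        return min(options)
    return best((1 << n) - 1)

def greedy_pairing(dist, n):
    """Repeatedly pick the closest remaining pair and remove both nodes."""
    remaining = set(range(n))
    total, pairs = 0.0, []
    while remaining:
        i, j = min(((a, b) for a in remaining for b in remaining if a < b),
                   key=lambda p: dist(*p))
        total += dist(i, j)
        pairs.append((i, j))
        remaining -= {i, j}
    return total, pairs

xs = [0.0, 3.0, 5.0, 8.0]  # hypothetical points on a line

def dist(i, j):
    return abs(xs[i] - xs[j])

print(greedy_pairing(dist, 4))     # (10.0, [(1, 2), (0, 3)])
print(exact_min_pairing(dist, 4))  # (6.0, ((0, 1), (2, 3)))
```

Greedy starts with the shortest segment available (length 2), which is shorter than anything in the optimal solution, but is then left with the two outer points and has to accept a segment of length 8. The optimal pairing uses two segments of length 3.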
