Problem
The problem is as follows [1]:
- We have \(N\) persons.
- They have to be assigned to \(M\) groups.
- Each group has a certain, given, size.
- Each person has certain preferences to be in the same group with some other persons. For this, each person provides a list of persons (s)he would like or dislike to be in the same group with.
- Design an assignment plan that takes these preferences into account.
If we don't allow empty spots in groups, we have the following condition on the data: \[\sum_g \mathit{size}_g = N\]
Data
The data is provided as a fancy spreadsheet:
We see we have \(N=28\) persons.
The matrix with preferences has three different colored symbols:
- Yellow exclamation mark for 0
- Green tick mark for +1
- and a red x for -1
The matrix is not symmetric. For example, some columns contain only zeros, while no row does. We need to read the matrix row-wise: each row represents the preferences of one person. Note that the diagonal is all zeros, which is what I would expect.
For the problem, I want to have the total group capacity equal to \(N=28\), so I consider:
- 4 groups of 6
- 1 group of 4
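The capacity condition is easy to check before building any model. A minimal sketch, assuming the sizes are given as a plain list (the helper name `fills_exactly` is mine, not part of the original model); the second size vector is the alternative with fewer, larger groups:

```python
def fills_exactly(sizes, n_persons):
    """True if the group sizes leave no empty spots and no unassigned persons."""
    return sum(sizes) == n_persons

N = 28
print(fills_exactly([6, 6, 6, 6, 4], N))  # 4 groups of 6 and 1 group of 4
print(fills_exactly([8, 8, 6, 6], N))     # fewer, larger groups
```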
High-level Model
A first attempt to model this problem is as follows. We use: \[x_{p,g} = \begin{cases} 1 & \text{if person $p$ is assigned to group $g$} \\ 0 & \text{otherwise}\end{cases} \] as our decision variable. The objective is to maximize the total number of preferences \(+1\) or \(-1\) that are honored. With this we can write:
Quadratic Model |
---|
\[\begin{align}\max & \sum_{p_1,p_2,g} \color{darkblue}{\mathit{Pref}}_{p_1,p_2} \color{darkred} x_{p_1,g} \color{darkred} x_{p_2,g} \\ & \sum_g \color{darkred}x_{p,g}=1 & \forall p \\ & \sum_p \color{darkred}x_{p,g} = \color{darkblue}{\mathit{Size}}_{g} &\forall g \\ & \color{darkred} x_{p,g} \in \{0,1\}\end{align}\] |
We can feed this into an MIQP solver. The solution looks like:
---- 80 PARAMETER solution using MIQP model
group1 group2 group3 group4 group5
aimee 1
amber-la 1
amber-le 1
andrina 1
catelyn-t 1
charlie 1
charlotte 1
cory 1
daniel 1
ellie 1
ellis 1
eve 1
grace-c 1
grace-g 1
holly 1
jack 1
jade 1
james 1
kadie 1
kieran 1
kristiana 1
lily 1
luke 1
naz 1
nibah 1
niko 1
wiki 1
zeina 1
COUNT 6 6 6 6 4
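To get a feel for what the quadratic model computes, here is a small brute-force sketch in Python. It is not the MIQP model itself and it only works for tiny instances. The four persons, their preferences, and the two groups of size 2 are made-up toy data, not the 28-person data set.

```python
from itertools import product

# Hypothetical toy data (NOT the spreadsheet from this post).
persons = ["a", "b", "c", "d"]
sizes = [2, 2]                  # group sizes; they sum to len(persons)
pref = {                        # pref[p1][p2]: +1 like, -1 dislike, missing = 0
    "a": {"b": 1, "c": -1},
    "b": {"a": 1},
    "c": {"d": 1},
    "d": {"a": -1},
}

def objective(assign):
    """Sum of Pref[p1,p2] over all ordered pairs placed in the same group."""
    return sum(
        pref.get(p1, {}).get(p2, 0)
        for p1 in persons
        for p2 in persons
        if assign[p1] == assign[p2]
    )

def brute_force():
    """Enumerate every assignment that respects the group sizes."""
    best_val, best = None, None
    for groups in product(range(len(sizes)), repeat=len(persons)):
        if any(groups.count(g) != sizes[g] for g in range(len(sizes))):
            continue            # violates sum_p x[p,g] = Size[g]
        assign = dict(zip(persons, groups))
        val = objective(assign)
        if best_val is None or val > best_val:
            best_val, best = val, assign
    return best_val, best

best_val, best = brute_force()
print(best_val, best)
```

Each person gets exactly one group by construction, so only the size constraint needs checking. The enumeration grows as \(M^N\), which is why the real problem needs a solver.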
Linearization
We can linearize this model in different ways:
- Let the solver reformulate. Cplex, for instance, will automatically reformulate this problem and will actually solve this as a linear MIP.
- DIY. We can linearize the products: \[y_{p_1,p_2,g} = x_{p_1,g}\cdot x_{p_2,g}\] using \[\begin{align} & y_{p_1,p_2,g} \le x_{p_1,g} \\& y_{p_1,p_2,g} \le x_{p_2,g} \\ & y_{p_1,p_2,g} \ge x_{p_1,g}+x_{p_2,g}-1 \end{align} \] We can save a bit by only using \(y_{p_1,p_2,g}\) for \(p_1 \lt p_2\) (the diagonal of the preference matrix is zero, so pairs with \(p_1 = p_2\) contribute nothing). This requires a bit of careful modeling. Let's give this a try.
Linear Model |
---|
\[\begin{align}\max & \sum_{p_1\lt p_2,g} (\color{darkblue}{\mathit{Pref}}_{p_1,p_2}+\color{darkblue}{\mathit{Pref}}_{p_2,p_1}) \color{darkred} y_{p_1,p_2,g} \\ & \sum_g \color{darkred}x_{p,g}=1 && \forall p \\ & \sum_p \color{darkred}x_{p,g} = \color{darkblue}{\mathit{Size}}_{g} && \forall g \\ & \color{darkred}y_{p_1,p_2,g} \le \color{darkred}x_{p_1,g} && \forall p_1\lt p_2, g \\ & \color{darkred}y_{p_1,p_2,g} \le \color{darkred}x_{p_2,g} && \forall p_1\lt p_2, g \\ &\color{darkred}y_{p_1,p_2,g} \ge \color{darkred}x_{p_1,g}+\color{darkred}x_{p_2,g}-1 && \forall p_1\lt p_2, g\\ & \color{darkred} x_{p,g} \in \{0,1\} \\&\color{darkred}y_{p_1,p_2,g} \in [0,1] && \forall p_1\lt p_2, g \end{align}\] |
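A quick way to convince ourselves that \(y\) can stay continuous: for binary \(x\), the three inequalities together with \(y \in [0,1]\) leave exactly one feasible value, namely the product. The sketch below enumerates the four cases; the helper name `y_bounds` is just for illustration.

```python
from itertools import product

def y_bounds(x1, x2):
    """Feasible interval for y under
       y <= x1, y <= x2, y >= x1 + x2 - 1, 0 <= y <= 1."""
    lo = max(0, x1 + x2 - 1)
    hi = min(1, x1, x2)
    return lo, hi

for x1, x2 in product((0, 1), repeat=2):
    lo, hi = y_bounds(x1, x2)
    # The interval collapses to a single point: the product x1*x2.
    assert lo == hi == x1 * x2
    print(x1, x2, "->", lo)
```

The lower bound \(y \ge x_1 + x_2 - 1\) matters here because some objective coefficients are negative: without it, the solver could set \(y=0\) for a disliked pair that sits in the same group.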
model | MIQP | MIP |
---|---|---|
objective | 84 | 84 |
iterations | 41841 | 67026 |
nodes | 822 | 308 |
time (seconds) | 6 | 15 |
Of course, to make me look a bit smarter, I should have only shown the node count.
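The linear model only uses pairs with \(p_1 \lt p_2\) and adds the two directed preferences together. The sketch below checks on random toy data (a made-up 6-person matrix with zero diagonal, not the spreadsheet data) that this folded objective equals the full sum over ordered pairs.

```python
import random

random.seed(0)
n, n_groups = 6, 2

# Hypothetical asymmetric preference matrix with zero diagonal.
pref = [[0 if i == j else random.choice((-1, 0, 1)) for j in range(n)]
        for i in range(n)]
# Hypothetical assignment: group[i] is the group of person i.
group = [random.randrange(n_groups) for _ in range(n)]

def full_objective(pref, group):
    """Sum over all ordered pairs (p1, p2) in the same group."""
    m = len(group)
    return sum(pref[i][j]
               for i in range(m) for j in range(m)
               if group[i] == group[j])

def folded_objective(pref, group):
    """Sum over unordered pairs p1 < p2 with combined coefficient."""
    m = len(group)
    return sum(pref[i][j] + pref[j][i]
               for i in range(m) for j in range(i + 1, m)
               if group[i] == group[j])

assert full_objective(pref, group) == folded_objective(pref, group)
print(full_objective(pref, group))
```

The identity relies on the diagonal being zero; with nonzero diagonal entries, the folded version would drop those terms.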
Equality
In this model we optimize the sum of the achieved preferences for all persons. This may actually hurt some individuals. Let's look at the number of +1 and -1 preferences in the data and how many were honored in the solution:
---- 96 PARAMETER count preferences stated and achieved
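person | +1 prefs | -1 prefs | +/-1 prefs | +1 ok | -1 ok | +/-1 ok |
---|---|---|---|---|---|---|
aimee | 9 | 3 | 12 | 4 | 3 | 7 |
amber-la | 3 | 2 | 5 | 3 | 2 | 5 |
amber-le | 2 | 6 | 8 | 1 | 6 | 7 |
andrina | 5 | | 5 | 4 | | 4 |
catelyn-t | 5 | | 5 | 4 | | 4 |
charlie | 6 | 4 | 10 | | 4 | 4 |
charlotte | 1 | | 1 | 1 | | 1 |
cory | 4 | 4 | 8 | 4 | 4 | 8 |
daniel | 4 | 3 | 7 | 4 | 3 | 7 |
ellie | 6 | | 6 | 4 | | 4 |
ellis | 11 | 1 | 12 | 3 | 1 | 4 |
eve | 7 | | 7 | 4 | | 4 |
grace-c | 2 | | 2 | 2 | | 2 |
grace-g | 4 | 2 | 6 | 4 | 2 | 6 |
holly | 3 | 2 | 5 | 3 | 2 | 5 |
jack | 3 | 8 | 11 | 3 | 7 | 10 |
jade | 1 | 6 | 7 | 1 | 6 | 7 |
james | 6 | 4 | 10 | 5 | 4 | 9 |
kadie | 6 | 1 | 7 | 4 | 1 | 5 |
kieran | | 3 | 3 | | 2 | 2 |
kristiana | 4 | 2 | 6 | 4 | 2 | 6 |
lily | 2 | 7 | 9 | | 7 | 7 |
luke | 5 | 3 | 8 | 5 | 3 | 8 |
naz | 6 | 2 | 8 | 4 | 2 | 6 |
nibah | 6 | | 6 | 4 | | 4 |
niko | 4 | 4 | 8 | 4 | 4 | 8 |
wiki | 9 | | 9 | 4 | | 4 |
zeina | 3 | 2 | 5 | 3 | 2 | 5 |
sum | 127 | 69 | 196 | 86 | 67 | 153 |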
The left three columns summarize the input data. The last three columns show what materialized in the solution. There is some inequality here: e.g. Charlie gets only 4 of his 10 wishes, while Jack has 10 of his 11 preferences honored. It would be interesting to see whether we can find a more equitable solution that still has a good overall performance.
Looking at the above table we can see that 86 of 127 +1 preferences were honored (68%) while 67 of 69 -1 preferences were obeyed (97%). For any person there are many more seats outside his or her own group than inside it, so avoiding someone is easier than joining them, and this asymmetry makes sense.
It is further noted that the objective value (84) can be recovered from: \[86 - (69-67) = 84\]
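The bookkeeping behind this identity can be sketched in a few lines of Python. The toy preferences and the assignment below are hypothetical (not the spreadsheet data); the function names are mine. A +1 wish counts as honored when both persons share a group, a -1 wish when they do not.

```python
# Hypothetical toy data: pref[p][q] is +1 (like) or -1 (dislike); missing = 0.
pref = {
    "a": {"b": 1, "c": -1},
    "b": {"a": 1},
    "c": {"d": 1},
    "d": {"a": -1},
}
group = {"a": 0, "c": 0, "b": 1, "d": 1}   # hypothetical assignment

def tally(pref, group):
    """Per person: stated +1/-1 wishes and how many of each were honored."""
    rows = {}
    for p, wishes in pref.items():
        plus = [q for q, v in wishes.items() if v == 1]
        minus = [q for q, v in wishes.items() if v == -1]
        rows[p] = {
            "plus": len(plus),
            "minus": len(minus),
            "plus_ok": sum(group[q] == group[p] for q in plus),
            "minus_ok": sum(group[q] != group[p] for q in minus),
        }
    return rows

def objective(pref, group):
    """Model objective: sum of Pref over ordered pairs in the same group."""
    return sum(v for p, wishes in pref.items()
               for q, v in wishes.items() if group[p] == group[q])

rows = tally(pref, group)
t = {k: sum(r[k] for r in rows.values())
     for k in ("plus", "minus", "plus_ok", "minus_ok")}

# Honored +1 wishes minus violated -1 wishes gives back the objective.
recovered = t["plus_ok"] - (t["minus"] - t["minus_ok"])
assert recovered == objective(pref, group)

# Same arithmetic with the totals reported in this post.
assert 86 - (69 - 67) == 84
```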
To increase our objective, we can increase the size of the groups (and thus have fewer groups). For instance, with groups of sizes 8, 8, 6, 6, I achieve an optimal objective of 99.
References
1. Algorithms for optimal student seating arrangements, https://stackoverflow.com/questions/59599718/algorithms-for-optimal-student-seating-arrangements