Channel: Yet Another Math Programming Consultant

Modeling permutations


Introduction


In [1] the following question was posted:

Given a (square) matrix \(A\), reorder the columns such that the sum of the diagonal elements is maximized. We are working in an R environment.

My answer involves permutation matrices and assignment constraints. This is a useful concept for quite a few models.

Permutation Matrix


When dealing with permutations, it is often a good idea to think in terms of a permutation matrix [2]. A permutation matrix \(P\) is an identity matrix with its rows or columns permuted.

Interestingly, when we swap just two indices, it does not matter whether we swap columns or rows of the identity matrix. (For a general permutation, the two results are transposes of each other.) For a single swap, the results are the same:


> n <- 5
> # identity matrix
> I <- diag(n)
> print(I)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    1    0    0    0
[3,]    0    0    1    0    0
[4,]    0    0    0    1    0
[5,]    0    0    0    0    1
> # swap columns 2 and 4
> P1 <- I
> P1[,c(2,4)] <- I[,c(4,2)]
> print(P1)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    0    0    1    0
[3,]    0    0    1    0    0
[4,]    0    1    0    0    0
[5,]    0    0    0    0    1
> # swap rows 2 and 4
> P2 <- I
> P2[c(2,4),] <- I[c(4,2),]
> print(P2)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    0    0    1    0
[3,]    0    0    1    0    0
[4,]    0    1    0    0    0
[5,]    0    0    0    0    1
>
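The equality above relies on a swap being its own inverse. As a small check, here is a sketch in base R. The 3-cycle `perm` is an arbitrary illustrative choice, not something from the original question. It suggests that for a general permutation the two matrices are transposes of each other rather than identical:

```r
n <- 5
I <- diag(n)
perm <- c(2, 3, 1, 4, 5)          # hypothetical 3-cycle on the first three indices

Pc <- I[, perm]                   # permute the columns of the identity
Pr <- I[perm, ]                   # permute the rows of the identity

identical(Pc, Pr)                 # FALSE: a 3-cycle is not its own inverse
identical(Pc, t(Pr))              # TRUE: the two are transposes

swap <- c(1, 4, 3, 2, 5)          # swapping 2 and 4, as in the transcript
identical(I[, swap], I[swap, ])   # TRUE: a swap is its own inverse
```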


A permutation matrix has the following properties:

  1. Exactly one element is 1 in each row
  2. and exactly one element is 1 in each column 

In an optimization model, this can be modeled as a set of assignment constraints: \[\begin{align}&\sum_j p_{i,j} = 1 && \forall i \\ &\sum_i p_{i,j} =1 && \forall j \\ &p_{i,j} \in \{0,1\}\end{align}\]
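To make the assignment constraints concrete, the following base R sketch builds a permutation matrix from a permutation vector and checks the three conditions numerically. The seed and the size `n <- 6` are arbitrary choices for illustration only.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 6
perm <- sample(n)                 # a random permutation of 1..n
P <- diag(n)[, perm]              # column j of P is the unit vector e_perm[j]

all(P %in% c(0, 1))               # binary entries
all(rowSums(P) == 1)              # exactly one 1 in each row
all(colSums(P) == 1)              # exactly one 1 in each column

# The permutation can be read back from P: the row index of the 1 in each column
all(apply(P, 2, which.max) == perm)
```

The last line is handy when a solver returns the matrix \(P\) and we want the column order as a plain vector.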

Row and column permutations


From [2] we can see:


  • Applying a row permutation to an \((m\times n)\)  matrix \(A\) can be viewed as a pre-multiplication with an \((m \times m)\) permutation matrix: \[\widetilde{A}=PA\]
  • and applying a column permutation to an \((m\times n)\)  matrix \(A\) can be viewed as a post-multiplication with an \((n \times n)\) permutation matrix: \[\widetilde{A}=AP\]

As seeing is believing, here is a little experiment:


> m <- 2
> n <- 3
> # random matrix A
> set.seed(123)
> A <- matrix(runif(m*n,min=-10,max=+10),nrow=m,ncol=n)
> print(A)
          [,1]      [,2]      [,3]
[1,] -4.248450 -1.820462  8.809346
[2,]  5.766103  7.660348 -9.088870
> # swap rows 1 and 2
> P1 <- diag(m)
> P1[,c(1,2)] <- diag(m)[,c(2,1)]
> P1 %*% A
          [,1]      [,2]      [,3]
[1,]  5.766103  7.660348 -9.088870
[2,] -4.248450 -1.820462  8.809346
> # swap columns 1 and 3
> P2 <- diag(n)
> P2[,c(1,3)] <- diag(n)[,c(3,1)]
> A %*% P2
          [,1]      [,2]      [,3]
[1,]  8.809346 -1.820462 -4.248450
[2,] -9.088870  7.660348  5.766103
>


Note: If we want to apply both row and column permutations, we would write: \[\widetilde{A}=P A Q\] with \(P\) and \(Q\) permutation matrices.
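A combined row and column permutation can be checked against plain R indexing. This is a minimal sketch. The orders `rperm` and `cperm` are hypothetical, and the matrix size is chosen only to keep the example small.

```r
set.seed(123)
m <- 3
n <- 4
A <- matrix(runif(m * n, min = -10, max = 10), nrow = m, ncol = n)

rperm <- c(3, 1, 2)               # hypothetical row order
cperm <- c(2, 4, 1, 3)            # hypothetical column order

P <- diag(m)[rperm, ]             # row i of P is the unit vector e_rperm[i]
Q <- diag(n)[, cperm]             # column j of Q is the unit vector e_cperm[j]

A_tilde <- P %*% A %*% Q
all.equal(A_tilde, A[rperm, cperm])   # TRUE: same result as direct indexing
```

In plain R code, indexing is of course simpler. The matrix form matters because it is linear in \(P\) and \(Q\) individually, which is what lets us put a permutation inside an optimization model.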

Optimization model


The original problem can now be formulated as a Mixed Integer Programming model. The mathematical model can look like:

Mixed Integer Programming Model
\[\begin{align}\max&\sum_i \color{darkred}y_{i,i} \\ & \color{darkred}y_{i,j} = \sum_k \color{darkblue}a_{i,k} \cdot \color{darkred}p_{k,j} &&\forall i,j \\ & \sum_j \color{darkred}p_{i,j} = 1 &&\forall i \\ & \sum_i \color{darkred}p_{i,j} = 1 &&\forall j \\ & \color{darkred}p_{i,j} \in \{0,1\} \\ & \color{darkred}y_{i,j} \>\mathbf{free}\end{align}\]


Implementation 1: OMPR


The model above can be implemented in a straightforward manner in OMPR.


> library(ompr)
> library(ompr.roi)
> library(dplyr)
> library(ROI)
> library(ROI.plugin.symphony)
> library(tidyr)
> n <- 10
> set.seed(123)
> a <- matrix(runif(n^2,min=-1,max=1),nrow=n,ncol=n)
> print(a)
             [,1]        [,2]        [,3]        [,4]       [,5]       [,6]       [,7]        [,8]       [,9]       [,10]
 [1,] -0.42484496  0.91366669  0.77907863  0.92604847 -0.7144000 -0.9083377  0.3302304  0.50895032 -0.5127611 -0.73860862
 [2,]  0.57661027 -0.09333169  0.38560681  0.80459809 -0.1709073 -0.1155999 -0.8103187  0.25844226  0.3361112  0.30620385
 [3,] -0.18204616  0.35514127  0.28101363  0.38141056 -0.1725513  0.5978497 -0.2320607  0.42036480 -0.1647064 -0.31296706
 [4,]  0.76603481  0.14526680  0.98853955  0.59093484 -0.2623091 -0.7562015 -0.4512327 -0.99875045  0.5763917  0.31351626
 [5,]  0.88093457 -0.79415063  0.31141160 -0.95077263 -0.6951105  0.1218960  0.6292801 -0.04936685 -0.7942707 -0.35925352
 [6,] -0.90888700  0.79964994  0.41706094 -0.04440806 -0.7223879 -0.5869372 -0.1029673 -0.55976223 -0.1302145 -0.62461776
 [7,]  0.05621098 -0.50782453  0.08813205  0.51691908 -0.5339318 -0.7449367  0.6201287 -0.24036692  0.9699140  0.56458860
 [8,]  0.78483809 -0.91588093  0.18828404 -0.56718413 -0.0680751  0.5066157  0.6247790  0.22554201  0.7861022 -0.81281003
 [9,]  0.10287003 -0.34415856 -0.42168053 -0.36363798 -0.4680547  0.7900907  0.5886846 -0.29640418  0.7729381 -0.06644192
[10,] -0.08677053  0.90900730 -0.70577271 -0.53674843  0.7156554 -0.2510744 -0.1203366 -0.77772915 -0.6498947  0.02301092
> m <- MIPModel() %>%
+ add_variable(p[i,j], i=1:n, j=1:n, type="binary") %>%
+ add_variable(y[i,j], i=1:n, j=1:n) %>%
+ add_constraint(y[i,j] == sum_expr(a[i,k]*p[k,j],k=1:n),i=1:n,j=1:n) %>%
+ add_constraint(sum_expr(p[i,j],j=1:n) == 1,i=1:n) %>%
+ add_constraint(sum_expr(p[i,j],i=1:n) == 1,j=1:n) %>%
+ set_objective(sum_expr(y[i,i], i=1:n),"max") %>%
+ solve_model(with_ROI(solver = "symphony",verbosity=1))
Starting Preprocessing...
Preprocessing finished...
with no modifications...
Problem has
120 constraints
200 variables
1300 nonzero coefficients

Total Presolve Time: 0.000000...

Solving...

granularity set at 0.000000
solving root lp relaxation
The LP value is: -7.422 [0,20]


****** Found Better Feasible Solution !
****** Cost: -7.422180


****************************************************
* Optimal Solution Found *
* Now displaying stats and best solution found... *
****************************************************

====================== Misc Timing =========================
Problem IO 0.000
======================= CP Timing ===========================
Cut Pool 0.000
====================== LP/CG Timing =========================
LP Solution Time 0.000
LP Setup Time 0.000
Variable Fixing 0.000
Pricing 0.000
Strong Branching 0.000
Separation 0.000
Primal Heuristics 0.000
Communication 0.000
Total User Time 0.000
Total Wallclock Time 0.000

====================== Statistics =========================
Number of created nodes : 1
Number of analyzed nodes: 1
Depth of tree: 0
Size of the tree: 1
Number of solutions found: 1
Number of solutions in pool: 1
Number of Chains: 1
Number of Diving Halts: 0
Number of cuts in cut pool: 0
Lower Bound in Root: -7.422

======================= LP Solver =========================
Number of times LP solver called: 1
Number of calls from feasibility pump: 0
Number of calls from strong branching: 0
Number of solutions found by LP solve: 1
Number of bounds changed by strong branching: 0
Number of nodes pruned by strong branching: 0
Number of bounds changed by branching presolver: 0
Number of nodes pruned by branching presolver: 0

==================== Primal Heuristics ====================
Time #Called #Solutions
Rounding I 0.00
Rounding II 0.00
Diving 0.00
Feasibility Pump 0.00
Local Search 0.00 1 0
Restricted Search 0.00
Rins Search 0.00
Local Branching 0.00

=========================== Cuts ==========================
Accepted: 0
Added to LPs: 0
Deleted from LPs: 0
Removed because of bad coeffs: 0
Removed because of duplicacy: 0
Insufficiently violated: 0
In root: 0

Time in cut generation: 0.00
Time in checking quality and adding: 0.00

Time #Called In Root Total
Gomory 0.00
Knapsack 0.00
Clique 0.00
Probing 0.00
Flowcover 0.00
Twomir 0.00
Oddhole 0.00
Mir 0.00
Rounding 0.00
LandP-I 0.00
LandP-II 0.00
Redsplit 0.00

===========================================================
Solution Found: Node 0, Level 0
Solution Cost: -7.4221803085
> cat("Status:",solver_status(m),"\n")
Status: optimal
> cat("Objective:",objective_value(m),"\n")
Objective: 7.42218
> df <- get_solution(m,y[i, j])
> spread(df,key=j,value=value)[,-c(1,2)]
             1           2           3           4           5           6          7          8          9         10
1   0.92604847 -0.73860862  0.50895032  0.77907863 -0.42484496  0.91366669 -0.5127611  0.3302304 -0.9083377 -0.7144000
2   0.80459809  0.30620385  0.25844226  0.38560681  0.57661027 -0.09333169  0.3361112 -0.8103187 -0.1155999 -0.1709073
3   0.38141056 -0.31296706  0.42036480  0.28101363 -0.18204616  0.35514127 -0.1647064 -0.2320607  0.5978497 -0.1725513
4   0.59093484  0.31351626 -0.99875045  0.98853955  0.76603481  0.14526680  0.5763917 -0.4512327 -0.7562015 -0.2623091
5  -0.95077263 -0.35925352 -0.04936685  0.31141160  0.88093457 -0.79415063 -0.7942707  0.6292801  0.1218960 -0.6951105
6  -0.04440806 -0.62461776 -0.55976223  0.41706094 -0.90888700  0.79964994 -0.1302145 -0.1029673 -0.5869372 -0.7223879
7   0.51691908  0.56458860 -0.24036692  0.08813205  0.05621098 -0.50782453  0.9699140  0.6201287 -0.7449367 -0.5339318
8  -0.56718413 -0.81281003  0.22554201  0.18828404  0.78483809 -0.91588093  0.7861022  0.6247790  0.5066157 -0.0680751
9  -0.36363798 -0.06644192 -0.29640418 -0.42168053  0.10287003 -0.34415856  0.7729381  0.5886846  0.7900907 -0.4680547
10 -0.53674843  0.02301092 -0.77772915 -0.70577271 -0.08677053  0.90900730 -0.6498947 -0.1203366 -0.2510744  0.7156554
>
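The printed \(y\) values let us read off how the columns were reordered: comparing the first row of the solution with the first row of \(a\) gives the column order 4, 10, 8, 3, 1, 2, 9, 7, 6, 5. The base R sketch below rebuilds the same matrix (same seed as in the transcript) and checks that this order reproduces the reported objective. The vector `perm` is my reading of the output, not something returned by the solver.

```r
set.seed(123)
n <- 10
a <- matrix(runif(n^2, min = -1, max = 1), nrow = n, ncol = n)

# Column order read off from the first row of the printed solution
perm <- c(4, 10, 8, 3, 1, 2, 9, 7, 6, 5)
P <- diag(n)[, perm]              # corresponding permutation matrix
Y <- a %*% P                      # same as a[, perm]

sum(diag(a))                      # diagonal sum before permuting
sum(diag(Y))                      # about 7.42218, matching the solver's objective
```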



Implementation 2: CVXR


The model can also be expressed conveniently in CVXR:


> library(CVXR)
> set.seed(123)
> n <- 10
> A <- matrix(runif(n^2,min=-1,max=1),nrow=n,ncol=n)
> print(A)
             [,1]        [,2]        [,3]        [,4]       [,5]       [,6]       [,7]        [,8]       [,9]       [,10]
 [1,] -0.42484496  0.91366669  0.77907863  0.92604847 -0.7144000 -0.9083377  0.3302304  0.50895032 -0.5127611 -0.73860862
 [2,]  0.57661027 -0.09333169  0.38560681  0.80459809 -0.1709073 -0.1155999 -0.8103187  0.25844226  0.3361112  0.30620385
 [3,] -0.18204616  0.35514127  0.28101363  0.38141056 -0.1725513  0.5978497 -0.2320607  0.42036480 -0.1647064 -0.31296706
 [4,]  0.76603481  0.14526680  0.98853955  0.59093484 -0.2623091 -0.7562015 -0.4512327 -0.99875045  0.5763917  0.31351626
 [5,]  0.88093457 -0.79415063  0.31141160 -0.95077263 -0.6951105  0.1218960  0.6292801 -0.04936685 -0.7942707 -0.35925352
 [6,] -0.90888700  0.79964994  0.41706094 -0.04440806 -0.7223879 -0.5869372 -0.1029673 -0.55976223 -0.1302145 -0.62461776
 [7,]  0.05621098 -0.50782453  0.08813205  0.51691908 -0.5339318 -0.7449367  0.6201287 -0.24036692  0.9699140  0.56458860
 [8,]  0.78483809 -0.91588093  0.18828404 -0.56718413 -0.0680751  0.5066157  0.6247790  0.22554201  0.7861022 -0.81281003
 [9,]  0.10287003 -0.34415856 -0.42168053 -0.36363798 -0.4680547  0.7900907  0.5886846 -0.29640418  0.7729381 -0.06644192
[10,] -0.08677053  0.90900730 -0.70577271 -0.53674843  0.7156554 -0.2510744 -0.1203366 -0.77772915 -0.6498947  0.02301092
> cat("sum diag of A:",sum(diag(A)))
sum diag of A: 0.7133438
> P <- Variable(n,n,boolean=T)
> Y <- Variable(n,n)
> problem <- Problem(Maximize(matrix_trace(Y)),
+ list(Y==A %*% P,
+ sum_entries(P,axis=1) == 1,
+ sum_entries(P,axis=2) == 1))
> result <- solve(problem,verbose=T)
GLPK Simplex Optimizer, v4.47
120 rows, 200 columns, 1300 non-zeros
0: obj = 0.000000000e+000 infeas = 2.000e+001 (120)
* 147: obj = -8.522275744e-002 infeas = 1.743e-014 (1)
* 213: obj = -7.422180308e+000 infeas = 4.854e-016 (1)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
120 rows, 200 columns, 1300 non-zeros
100 integer variables, all of which are binary
Integer optimization begins...
+ 213: mip = not found yet >= -inf (1; 0)
+ 213: >>>>> -7.422180308e+000 >= -7.422180308e+000 0.0% (1; 0)
+ 213: mip = -7.422180308e+000 >= tree is empty 0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
> cat("status:",result$status)
status: optimal
> cat("objective:",result$value)
objective: 7.42218
> print(result$getValue(Y))
             [,1]        [,2]        [,3]        [,4]        [,5]        [,6]       [,7]       [,8]       [,9]      [,10]
 [1,]  0.92604847 -0.73860862  0.50895032  0.77907863 -0.42484496  0.91366669 -0.5127611  0.3302304 -0.9083377 -0.7144000
 [2,]  0.80459809  0.30620385  0.25844226  0.38560681  0.57661027 -0.09333169  0.3361112 -0.8103187 -0.1155999 -0.1709073
 [3,]  0.38141056 -0.31296706  0.42036480  0.28101363 -0.18204616  0.35514127 -0.1647064 -0.2320607  0.5978497 -0.1725513
 [4,]  0.59093484  0.31351626 -0.99875045  0.98853955  0.76603481  0.14526680  0.5763917 -0.4512327 -0.7562015 -0.2623091
 [5,] -0.95077263 -0.35925352 -0.04936685  0.31141160  0.88093457 -0.79415063 -0.7942707  0.6292801  0.1218960 -0.6951105
 [6,] -0.04440806 -0.62461776 -0.55976223  0.41706094 -0.90888700  0.79964994 -0.1302145 -0.1029673 -0.5869372 -0.7223879
 [7,]  0.51691908  0.56458860 -0.24036692  0.08813205  0.05621098 -0.50782453  0.9699140  0.6201287 -0.7449367 -0.5339318
 [8,] -0.56718413 -0.81281003  0.22554201  0.18828404  0.78483809 -0.91588093  0.7861022  0.6247790  0.5066157 -0.0680751
 [9,] -0.36363798 -0.06644192 -0.29640418 -0.42168053  0.10287003 -0.34415856  0.7729381  0.5886846  0.7900907 -0.4680547
[10,] -0.53674843  0.02301092 -0.77772915 -0.70577271 -0.08677053  0.90900730 -0.6498947 -0.1203366 -0.2510744  0.7156554
>


It helps that we can use the built-in functions matrix_trace and sum_entries. A CVXR model in pure matrix algebra would be more difficult to read. A model in pure matrix algebra could look like:

Model in Matrix Algebra
\[\begin{align}\max\>&\mathbf{tr}(\color{darkred}Y) \\ & \color{darkred}Y = \color{darkblue}A \color{darkred}P \\ & \color{darkred}P^T \color{darkblue}e = \color{darkblue}e \\ & \color{darkred}P\color{darkblue}e = \color{darkblue}e \\ & \color{darkred}P \in \{0,1\}^{n \times n} \\ & \color{darkred}Y \>\mathbf{free}\end{align}\]

where \(e\) is a column of ones.

Note that a problem of this size (\(n=10\)) can still be solved by complete enumeration: there are only \(n!=3{,}628{,}800\) column permutations.
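To illustrate complete enumeration, here is a minimal base R sketch on a smaller matrix (\(n=5\), so only 120 permutations). The helper `all_perms` is a hypothetical name introduced for this example and is not part of any package. A simple recursion like this is only sensible for small \(n\).

```r
# All permutations of the vector v, one per row (simple recursion, small n only)
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (i in seq_along(v)) {
    rest <- all_perms(v[-i])            # permutations of the remaining elements
    out <- rbind(out, cbind(v[i], rest))
  }
  out
}

set.seed(123)
n <- 5
A <- matrix(runif(n^2, min = -1, max = 1), nrow = n, ncol = n)

perms <- all_perms(1:n)
# Diagonal sum after applying column order s: sum over i of A[i, s[i]]
vals <- apply(perms, 1, function(s) sum(A[cbind(1:n, s)]))
best <- perms[which.max(vals), ]

best                              # best column order found by enumeration
max(vals)                         # corresponding diagonal sum
```

For \(n=10\) the same idea works in principle, but building all 3.6 million rows this way is slow in R, so a MIP or assignment solver is the more practical route.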

Optimize the formulation


It is possible to simplify the formulation. First, we can observe that not all \(y\)'s are used: only the diagonal elements appear in the objective, so we can remove all off-diagonal variables. Second, we can substitute out the remaining \(y\) variables. This would lead to:

Mixed Integer Programming Model v2
\[\begin{align}\max&\sum_i \sum_k \color{darkblue}a_{i,k} \cdot \color{darkred}p_{k,i} \\ & \sum_j \color{darkred}p_{i,j} = 1 &&\forall i \\ & \sum_i \color{darkred}p_{i,j} = 1 &&\forall j \\ & \color{darkred}p_{i,j} \in \{0,1\} \end{align}\]
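The substitution can be sanity-checked numerically. The base R sketch below uses an arbitrary permutation matrix (not an optimal one) and compares the objective of the original model, the trace of \(Y=AP\), with the double sum of model v2.

```r
set.seed(123)
n <- 10
a <- matrix(runif(n^2, min = -1, max = 1), nrow = n, ncol = n)
P <- diag(n)[, sample(n)]         # an arbitrary permutation matrix

obj_v1 <- sum(diag(a %*% P))      # trace of Y = A P (original model)
obj_v2 <- sum(a * t(P))           # sum_i sum_k a[i,k] * p[k,i] (model v2)
all.equal(obj_v1, obj_v2)         # TRUE up to floating-point rounding
```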

In practical models, I am usually not so worried about a bunch of extra continuous variables. Often I leave them in, so I can inspect their values if I need to debug the model. In the implementations above I printed out the \(y\) variables to observe how columns were permuted.

References


  1. Maximise diagonal of matrix by permuting columns in R, https://stackoverflow.com/questions/61565176/maximise-diagonal-of-matrix-by-permuting-columns-in-r
  2. Permutation matrix, https://en.wikipedia.org/wiki/Permutation_matrix
