The questions for this post are:
- Can we model the inversion of a matrix as an LP problem? (A: yes)
- Should we do that? (A: no)
- If we insist, can we make it a bit quicker? (A: yes)
Inversion
Solving a system of linear equations \(\color{darkblue}A\color{darkred}x=\color{darkblue}b\) is easy to do with an LP solver. Just add a dummy objective, and you are ready to go. Inverting a matrix is not much more difficult. Just form the equations: \[\color{darkblue}A\color{darkred}X=\color{darkblue}I\] Here \(\color{darkblue}I\) is the identity matrix. After the solve, the matrix \(\color{darkred}X\) will contain the inverse of \(\color{darkblue}A\).
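As a small sketch of this formulation (using SciPy's `linprog` rather than GAMS/Cplex, and a tiny 3×3 minij matrix to keep it readable), we can solve \(AX=I\) column by column, each time with a dummy zero objective:

```python
# Sketch: invert a matrix by solving A X = I column-by-column as LPs
# with a dummy (zero) objective. Variables must be declared free via
# bounds=(None, None); linprog defaults to x >= 0 otherwise.
import numpy as np
from scipy.optimize import linprog

idx = np.arange(1, 4)
A = np.minimum.outer(idx, idx).astype(float)   # 3x3 minij: a[i,j] = min(i,j)
n = A.shape[0]

X = np.zeros((n, n))
for j in range(n):
    e_j = np.eye(n)[:, j]                      # j-th column of the identity
    res = linprog(c=np.zeros(n),               # dummy objective
                  A_eq=A, b_eq=e_j,
                  bounds=[(None, None)] * n,   # free variables
                  method="highs")
    X[:, j] = res.x

print(np.max(np.abs(A @ X - np.eye(n))))       # residual should be ~0
```

Of course, for a real \(n\times n\) problem this becomes one LP with \(n^2\) rows and columns, as in the GAMS model below.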
Test
As a test I created a \((200\times 200)\) "minij" matrix: \[\color{darkblue}a_{i,j} = \min(i,j)\] This generates an LP with 40,000 rows and columns and 8 million nonzero elements. Some results using 1 thread with Cplex as LP solver:
| | Default | Optimized |
|---|---|---|
| GAMS generation time (s) | 50 | 15 |
| Cplex solution time (s) | 35 | 5 |
| presolve time (s) | 28 | 0 |
| iteration count | 1121 | 0 |
| max err | 2.8E-12 | 4.2E-11 |
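The LP dimensions quoted above (40,000 rows and columns, 8 million nonzeros) follow directly from \(n=200\): there is one equation and one variable per cell \((i,j)\), and each equation has \(n\) nonzero coefficients because the matrix is dense. A quick check:

```python
# Sketch: size of the LP A X = I for the n x n minij matrix.
# Each of the n^2 equations sum_k a[i,k]*x[k,j] = I[i,j] has
# n nonzero coefficients (A is dense), so n^3 nonzeros in total.
import numpy as np

n = 200
idx = np.arange(1, n + 1)
A = np.minimum.outer(idx, idx)        # a[i,j] = min(i,j), all entries >= 1

rows = cols = n * n                   # one equation/variable per (i,j)
nonzeros = np.count_nonzero(A) * n    # n nonzeros per equation

print(rows, cols, nonzeros)           # 40000 40000 8000000
```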
So what did I do to get the faster results?
- The GAMS generation is slow for this problem, mainly because of the big, somewhat dense matrix with 8 million elements. Inside GAMS, all data is stored in a sparse data structure, and a not-so-sparse matrix is not well suited for this. Note: a typical sparse problem has fewer than 10 nonzeros per column, often just a handful; here we have 200 elements per column.
The equation looks like:
matmul(i,j)..
   sum(k, A(i,k)*Ainv(k,j)) =e= ident(i,j);
Here the index k runs the fastest. GAMS works best if the last index is the fastest, so the access pattern of Ainv is not optimal. We can rewrite the equation as:
matmul(i,j)..
   sum(k, A(i,k)*Ainv(j,k)) =e= ident(i,j);
This will deliver a transposed inverse. But note that my input matrix is symmetric, and the inverse of a symmetric matrix is also symmetric, so we actually get the correct inverse. This change has a significant impact on the model generation time.
- The Cplex time was reduced by two measures. First, I disabled the presolve: the presolver did not do much but was expensive. Second, I provided an advanced basis. For this LP we know the optimal basis in advance. We don't know the levels, but we do know which rows and columns are basic or non-basic: all the free decision variables should become basic in the solution, and all the slacks (rows) should become non-basic. In GAMS we can set the marginals to convey this information:
* basic
Ainv.m(i,j) = 0;
* non-basic
matmul.m(i,j) = 1;
* use advanced basis
option bratio=0;
In a solution, the marginals contain the duals (for equations) or the reduced costs (for variables). On input, the zero/non-zero pattern of the marginals can be used to set up a basis: a zero marginal indicates the variable is basic, and a non-zero marginal means non-basic. The actual value does not matter, so I chose 1 here. In addition, we tell GAMS to pass this advanced basis on to the solver. The bratio option is a bit unusual: it is a heuristic measure for how likely the marginals are to form a good basis. Bratio=0 means: always accept the marginals as an advanced basis, while bratio=1 means: always reject the use of an advanced basis. Note: I also set the marginals for the objective variable/constraint (not shown here).
When we use the advanced basis, Cplex needs 0 simplex iterations. It just needs to do a factorization, and that's it.
- The max err entry indicates the largest absolute error in the solution: \[\max_{i,j} \left|\sum_k \color{darkblue}a_{i,k} \cdot \color{darkred}x^*_{k,j} - \color{darkblue}I_{i,j}\right|\]
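The zero-iteration behavior is easy to see conceptually: if the advanced basis is correct, the simplex method only has to factorize the basis matrix and back-solve. A minimal sketch of that idea (with SciPy standing in for the solver's internals; here the basis matrix for the whole problem is just \(A\), because all structural variables are basic):

```python
# Sketch: why a correct advanced basis needs 0 simplex iterations.
# With all structural variables basic, the basis matrix is A itself,
# and the solution comes from one LU factorization plus back-solves
# of B x_B = b -- no simplex pivots needed.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 200
idx = np.arange(1, n + 1)
A = np.minimum.outer(idx, idx).astype(float)  # the minij matrix

lu, piv = lu_factor(A)                        # the single factorization
X = lu_solve((lu, piv), np.eye(n))            # back-solves for all columns

print(np.max(np.abs(A @ X - np.eye(n))))      # residual should be tiny
```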
So we achieved a remarkable improvement: the total turnaround time went from 85 seconds to 20 seconds. Is it enough? No, not by a long shot. Let's do the same experiment in R:
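(The original comparison was run in R; an analogous sketch with NumPy, for readers without R, looks like this. Timings are machine-dependent, but a dense 200×200 inversion typically takes a few milliseconds.)

```python
# Sketch: invert the same 200x200 minij matrix directly with a dense
# linear-algebra routine -- the baseline the LP approach is being
# compared against. Timings vary by machine.
import time
import numpy as np

n = 200
idx = np.arange(1, n + 1)
A = np.minimum.outer(idx, idx).astype(float)   # a[i,j] = min(i,j)

t0 = time.perf_counter()
X = np.linalg.inv(A)
elapsed = time.perf_counter() - t0

max_err = np.max(np.abs(A @ X - np.eye(n)))
print(f"inv: {elapsed:.4f}s, max err = {max_err:.1e}")
```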
Conclusion
There are better ways to invert a (dense) matrix than using an LP solver. However, there are a number of interesting tricks we can use to make things a bit less embarrassing.