RAS Method
RAS is a well-known algorithm to perform matrix balancing:
\[\begin{align} \min \>& \mathit{dist}(A,A^0)\\ & \sum_j a_{i,j} = u_i &&\forall i\\&\sum_i a_{i,j} = v_j &&\forall j \\ & a_{i,j} \gt 0 \end{align} \]
i.e. find a matrix \(A\) that is as close as possible to a given matrix \(A^0\) while obeying row and column sum constraints. This method is often used to "clean up" data sets that contain data from different sources. The goal is to achieve a consistent data set, as is often required by subsequent modelling efforts.
The RAS method looks like:
RAS Algorithm | | |
---|---|---|
Step 1 | Initialization | \[A := A^0\] |
Step 2 | Row Scaling | \[\begin{align}&\rho_i := \frac{u_i}{\displaystyle\sum_j a_{i,j}}\\ &a_{i,j} := \rho_i a_{i,j}\end{align}\] |
Step 3 | Column Scaling | \[\begin{align}&\sigma_j := \frac{v_j}{\displaystyle\sum_i a_{i,j}}\\ &a_{i,j} := \sigma_j a_{i,j}\end{align}\] |
Step 4 | Convergence Test | If not converged, go to Step 2 |
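As an illustration, here is a minimal sketch of this iteration in Python (NumPy); the function name, tolerance, and convergence test are my own choices:

import numpy as np

def ras(A0, u, v, tol=1e-9, max_iter=1000):
    # alternately scale rows and columns of A0 until the row sums match u;
    # this assumes sum(u) == sum(v), otherwise the iteration cannot converge
    A = np.array(A0, dtype=float)
    for _ in range(max_iter):
        A *= (u / A.sum(axis=1))[:, None]          # Step 2: row scaling
        A *= v / A.sum(axis=0)                     # Step 3: column scaling
        if np.abs(A.sum(axis=1) - u).max() < tol:  # Step 4: convergence test
            break
    return A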
This algorithm works very well and is widely used. However, it does not allow adding side constraints on \(A\), which many applications require. The entropy method will allow you to do that.
Entropy Method
The following optimization model gives the same results as the RAS method:
\[\begin{align} \min \>& \sum_{i,j} a_{i,j} \log \left( \frac{a_{i,j}}{a^0_{i,j}} \right) \\ & \sum_j a_{i,j} = u_i &&\forall i\\&\sum_i a_{i,j} = v_j &&\forall j \\ & a_{i,j} \gt 0 \end{align} \]
Notes:
- The objective can be rewritten as: \(\sum_{i,j} [a_{i,j} \log(a_{i,j}) - a_{i,j} \log(a^0_{i,j}) ]\). The second term is linear.
- Usually we set a lower bound: \(a_{i,j} \ge \varepsilon\) for some \(\varepsilon>0\).
- If we really want to allow \(a_{i,j}=0\) we can replace \(\log(a_{i,j})\) by \(\log(a_{i,j}+\varepsilon ) \).
- We can add extra constraints to this model. An example is when we deal with world trade flows: the sum of all net exports (i.e. exports \(-\) imports) should be zero. (A code sketch of the model follows these notes.)
- This is a convex problem, but it leads to many superbasic variables. This means NLP solvers like MINOS, SNOPT, and CONOPT struggle a bit: they prefer models with relatively few superbasic variables. Interior point solvers like IPOPT and KNITRO solve this problem more quickly.
- Mosek versions from before 2018 can also solve this as a general convex NLP. From 2018 on, we need to reformulate the problem (see below) [1].
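As a quick illustration of the model above, here is the entropy formulation on made-up \(2 \times 2\) data in Python, using scipy's SLSQP as a general-purpose NLP solver (all data and names here are mine):

import numpy as np
from scipy.optimize import minimize

A0 = np.array([[1.0, 2.0], [3.0, 4.0]])  # priors
u = np.array([2.5, 7.5])                 # desired row sums
v = np.array([4.0, 6.0])                 # desired column sums (sum(u) == sum(v))
m, n = A0.shape

def f(x):
    # entropy distance: sum a*log(a/a0)
    return np.sum(x * np.log(x / A0.ravel()))

cons = [{'type': 'eq', 'fun': lambda x: x.reshape(m, n).sum(axis=1) - u},
        {'type': 'eq', 'fun': lambda x: x.reshape(m, n).sum(axis=0) - v}]
res = minimize(f, x0=A0.ravel(), method='SLSQP',
               bounds=[(1e-4, None)] * (m * n), constraints=cons)
A = res.x.reshape(m, n)

Extra constraints, such as the net-export condition mentioned above, would simply be additional entries in cons.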
In addition, we may have some zeroes in the matrix \(A^0\) that we want to keep zero. In other words, we want to preserve the sparsity pattern. In the world of entropy estimation, the elements of \(A^0\) are called priors. In GAMS such a model can look like:
variables A(i,j), z;
equations
   objective  'entropy distance to the priors A0'
   rowsum(i)  'row sum constraints'
   colsum(j)  'column sum constraints'
;

* the $A0(i,j) conditions only generate terms for cells with a
* nonzero prior, so zero cells stay zero
objective.. z =e= sum((i,j)$A0(i,j), A(i,j)*log(A(i,j)/A0(i,j)));
rowsum(i).. sum(j$A0(i,j), A(i,j)) =e= A0(i,'rowTotal');
colsum(j).. sum(i$A0(i,j), A(i,j)) =e= A0('colTotal',j);

* start from the priors and keep the nonzero cells away from zero
A.L(i,j) = A0(i,j);
A.lo(i,j)$A0(i,j) = 0.0001;

model m1 /all/;
solve m1 minimizing z using nlp;
Exponential Cone
Conic programming has become an important way to model and solve certain convex problems. Some solvers support an Exponential Cone. This is a constraint of the form:
\[ x \ge y \exp\left(\frac{z}{y}\right) \]
with \(y > 0\). With a little bit of effort we can show that \(\min (x \log x)\) can be implemented using this construct. The first step is to write \(\min z\) subject to \(z \ge x \log x\). Now we can derive:
\[\begin{align} & z \ge x \log x \Rightarrow \\ & -z \le x \log\left( \frac{1}{x} \right) \Rightarrow \\ & - \frac{z}{x} \le \log\left( \frac{1}{x} \right) \Rightarrow \\ & \exp\left( - \frac{z}{x} \right) \le \frac{1}{x} \Rightarrow \\ & x \exp\left( \frac{-z}{x} \right) \le 1 \end{align}\]
and we have our Exponential Cone.
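Before building a full model, we can sanity-check the algebra numerically: for \(x>0\), the point \(z = x\log x\) should lie exactly on the cone boundary \(x\exp(-z/x)=1\). A minimal check in Python (illustrative only):

import numpy as np

x = np.linspace(0.1, 5.0, 50)
z = x * np.log(x)                          # the entropy term
assert np.allclose(x * np.exp(-z / x), 1)  # points (x,z) sit on the cone boundary

To verify the derivation inside an optimization model, we can solve the convex NLP with the cone simulated by a nonlinear constraint: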
\[\begin{align} \min \>& \sum_{i,j} z_{i,j} - \sum_{i,j} a_{i,j} \log (a^0_{i,j}) \\ & \sum_j a_{i,j} = u_i &&\forall i\\&\sum_i a_{i,j} = v_j &&\forall j \\ & a_{i,j} \exp \left( \frac{-z_{i,j}}{a_{i,j}} \right) \le 1 && \forall i,j\\ & a_{i,j} \gt 0\end{align} \]
This is only to check that our derivation is correct: the formulation has little practical use. With this model we added extra variables \(z_{i,j}\) and extra equations, and we moved the non-linearities from the objective into the constraints. For a general-purpose NLP solver this is bad news. We can see that with a small test model:
Model | Solver | Iterations | Objective | Evaluation Errors |
---|---|---|---|---|
Entropy | Conopt | 15 | -15.7687 | 0 |
Simulated Exponential Cone | Conopt | 47 | -15.7687 | 4 |
Entropy | IPOPT | 5 | -15.7687 | 0 |
Simulated Exponential Cone | IPOPT | 15 | -15.7687 | 0 |
Entropy | Mosek 8 | 8 | -15.7687 | 0 |
Simulated Exponential Cone | Mosek 8 | 13 | -15.7687 | 0 |
Note that iteration counts are not comparable between different solvers. Obviously, and as expected, this conic reformulation makes no sense when using a general NLP solver: iteration counts go up, and we see some overflows in the evaluation of the nonlinear constraint functions (or their gradients). The expression \(\exp(z/y)\) is inherently dangerous. Note that conic solvers handle this very differently internally.
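To see why \(\exp(z/y)\) is dangerous in floating point, a one-line Python experiment is enough (the values are arbitrary, but of a kind a line search can easily visit):

import numpy as np

# a moderate z combined with a small y overflows double precision immediately
np.exp(np.float64(50.0) / np.float64(0.01))  # exp(5000) -> inf, with an overflow warning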
Conic Modeling
Mosek will drop support for general convex NLPs [1]; a conic formulation is suggested instead. So let's try out some other solvers that support exponential cones.
To model this for use with a convex solver, one would typically use a built-in function to express the entropy function. In CVXPY [2] we can use entr, which is defined as \(-x\log(x)\). Here is an attempt to solve this small problem with CVXPY:
from cvxpy import *
from numpy import sign, reshape, log as nplog

A0 = [[ 230 , 375 , 375 , 100 , 0 , 685 , 215 , 0 , 50 , 0 ],
[ 330 , 405 , 419 , 175 , 90 , 504 , 515 , 0 , 240 , 105 ],
[ 268 , 225 , 242 , 0 , 30 , 790 , 301 , 44 , 100 , 0 ],
[ 595 , 380 , 638 , 275 , 30 , 685 , 605 , 88 , 100 , 160 ],
[ 340 , 360 , 440 , 200 , 30 , 755 , 475 , 44 , 150 , 0 ],
[ 132 , 190 , 200 , 0 , 0 , 432 , 130 , 0 , 0 , 0 ],
[ 309 , 330 , 350 , 125 , 0 , 612 , 474 , 0 , 50 , 50 ],
[ 365 , 400 , 330 , 150 , 50 , 575 , 600 , 44 , 150 , 110 ],
[ 210 , 250 , 308 , 125 , 0 , 720 , 256 , 0 , 100 , 50 ]]
u = [2029,2798,1998,3566,2794,1071,2305,2747,2015]   # row totals
v = [2772,2910,3300,1150,240,5760,3526,220,950,495]  # column totals
m = len(u)   # number of rows
n = len(v)   # number of columns
# we need to exclude cases with A0[i,j]=0 in the objective and constraints
indic = [[sign(A0[i][j]) for j in range(n)] for i in range(m)]
# precompute log(A0), masking the zero cells to avoid log(0)
logA0 = [[nplog(A0[i][j]) if A0[i][j] else 0.0 for j in range(n)] for i in range(m)]
a = Variable(m,n)
# these bounds pin the zero cells (indic=0) to zero
lb = a >= mul_elemwise(0.0001,indic)
ub = a <= mul_elemwise(10000.0,indic)
rowsum = sum_entries(a, axis=1) == reshape(u,(m,1))
colsum = sum_entries(a, axis=0) == reshape(v,(1,n))
cons = [lb,ub,rowsum,colsum]
# sum a*log(a/A0) = sum(-entr(a) - a*log(A0)) over the nonzero cells
obj = Minimize(sum_entries(mul_elemwise(indic,-entr(a)-mul_elemwise(logA0,a))))
prob = Problem(obj, cons)
#prob.solve(solver=SCS, verbose=True)
prob.solve(solver=ECOS, verbose=True)
This may not be the best way to implement things: I was struggling a bit with handling the zeroes in the matrix \(A^0\). Instead of fixing \(a_{i,j}=0\), I probably should not even generate those variables (in the GAMS model above this was easy). If the solver has good presolve capabilities this matters less. If you have ideas on how to model this properly in CVXPY, let me know. One possibility is sketched below.
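One possible direction, sketched here with the current CVXPY (1.x) interface rather than the old one used above: collect the nonzero cells of \(A^0\) in a flat vector and only create variables for those, mirroring the $-conditions in the GAMS model. This is just a sketch under the assumption that A0, u, v hold the data from the script above:

import cvxpy as cp
import numpy as np

A0 = np.array(A0, dtype=float)
rows, cols = A0.nonzero()            # coordinates of the nonzero cells
a0 = A0[rows, cols]                  # priors for those cells only
x = cp.Variable(a0.size, pos=True)   # one variable per nonzero cell

cons = []
for i in range(A0.shape[0]):         # row sums over the cells present in row i
    cons.append(cp.sum(x[np.where(rows == i)[0]]) == u[i])
for j in range(A0.shape[1]):         # column sums over the cells present in column j
    cons.append(cp.sum(x[np.where(cols == j)[0]]) == v[j])

# sum x*log(x/a0) = sum(-entr(x)) - sum(log(a0)*x)
obj = cp.Minimize(cp.sum(-cp.entr(x) - cp.multiply(np.log(a0), x)))
cp.Problem(obj, cons).solve(solver=cp.ECOS)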
The results are mixed. The output of ECOS is:
ECOS 2.0.4 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS
It     pcost       dcost      gap    pres   dres    k/t    mu     step   sigma     IR    |   BT
 0  +0.000e+00  -7.401e+05  +5e+02  1e+00  7e-01  1e+00  1e+00    ---     ---   0  0  - |  -  -
 1  +1.202e+03  -7.382e+05  +5e+01  1e+00  7e-01  8e+00  1e-01  0.8875  1e-02  1  1  1 |  0  0
 2  +5.098e+03  -7.301e+05  +1e+01  1e+00  8e-01  4e+01  3e-02  0.8190  4e-02  1  1  1 |  0  0
 3  +8.759e+03  -7.219e+05  +6e+00  1e+00  8e-01  7e+01  1e-02  0.5013  1e-01  1  1  1 |  1  3
 4  +2.579e+04  -6.784e+05  +2e+00  1e+00  8e-01  2e+02  4e-03  0.7151  3e-02  1  1  1 |  0  1
 5  +4.993e+04  -6.066e+05  +8e-01  9e-01  7e-01  4e+02  2e-03  0.5789  4e-02  1  1  1 |  1  2
 6  +1.094e+05  -3.718e+05  +2e-01  7e-01  5e-01  9e+02  5e-04  0.7295  2e-03  1  1  1 |  0  1
 7  +9.292e+04  -2.105e+05  +1e-01  4e-01  4e-01  7e+02  2e-04  0.5738  5e-02  1  1  1 |  1  2
 8  +4.286e+04  -6.659e+04  +3e-02  1e-01  1e-01  3e+02  8e-05  0.7020  1e-02  1  1  1 |  0  1
 9  +1.456e+04  -1.859e+04  +1e-02  4e-02  4e-02  8e+01  2e-05  0.7187  2e-02  1  1  1 |  0  1
10  +4.757e+03  -5.169e+03  +3e-03  1e-02  1e-02  2e+01  7e-06  0.7136  1e-02  1  0  0 |  0  1
11  +1.895e+03  -2.133e+03  +1e-03  5e-03  5e-03  9e+00  3e-06  0.6266  5e-02  2  0  0 |  1  2
12  +3.885e+02  -5.152e+02  +3e-04  1e-03  1e-03  2e+00  6e-07  0.7736  3e-03  2  0  0 |  0  1
13  +1.566e+02  -2.156e+02  +1e-04  5e-04  4e-04  9e-01  3e-07  0.6266  5e-02  2  0  0 |  2  2
14  +2.161e+01  -6.152e+01  +3e-05  1e-04  1e-04  2e-01  6e-08  0.7833  1e-02  1  0  0 |  1  1
15  -3.765e-02  -3.416e+01  +1e-05  5e-05  4e-05  8e-02  2e-08  0.6266  5e-02  1  0  0 |  2  2
16  -1.240e+01  -2.001e+01  +2e-06  1e-05  9e-06  2e-02  5e-09  0.7833  1e-02  1  0  0 |  1  1
17  -1.437e+01  -1.747e+01  +9e-07  4e-06  4e-06  8e-03  2e-09  0.6266  5e-02  0  0  0 |  2  2
18  -1.546e+01  -1.616e+01  +2e-07  9e-07  8e-07  2e-03  5e-10  0.7833  9e-03  1  0  0 |  1  1
19  -1.564e+01  -1.593e+01  +8e-08  4e-07  3e-07  7e-04  2e-10  0.6266  5e-02  1  0  0 |  2  2
20  -1.574e+01  -1.580e+01  +2e-08  8e-08  7e-08  2e-04  4e-11  0.7833  9e-03  1  0  0 |  1  1
21  -1.576e+01  -1.578e+01  +8e-09  3e-08  3e-08  7e-05  2e-11  0.6266  5e-02  0  0  0 |  2  2
22  -1.577e+01  -1.577e+01  +2e-09  8e-09  7e-09  1e-05  4e-12  0.7833  9e-03  1  0  0 |  1  1
OPTIMAL (within feastol=7.7e-09, reltol=1.1e-10, abstol=1.7e-09).
Runtime: 0.012709 seconds.
Obj = -15.76630664400414
We get the correct objective, although with more iterations than IPOPT needs to solve the model with the entropy objective directly as an NLP. With the SCS solver I see more problems:
----------------------------------------------------------------------------
SCS v1.2.6 - Splitting Conic Solver
(c) Brendan O'Donoghue, Stanford University, 2012-2016
----------------------------------------------------------------------------
Lin-sys: sparse-indirect, nnz in A = 540, CG tol ~ 1/iter^(2.00)
eps = 1.00e-03, alpha = 1.50, max_iters = 2500, normalize = 1, scale = 1.00
Variables n = 180, constraints m = 469
Cones: primal zero / dual free vars: 19
linear vars: 180
exp vars: 270, dual exp vars: 0
Setup time: 2.41e-03s
----------------------------------------------------------------------------
Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
----------------------------------------------------------------------------
0| 1.#Je+00 1.#Je+00 -1.#Je+00 -1.#Je+00 -1.#Je+00 1.#Je+00 6.79e-03
100| 5.58e-03 3.02e-03 1.44e-01 -6.58e+04 -4.92e+04 2.61e-10 1.07e-01
200| 3.31e-03 2.28e-03 1.82e-01 -5.56e+04 -3.85e+04 3.70e-10 2.36e-01
300| 2.52e-03 1.97e-03 2.11e-01 -4.69e+04 -3.06e+04 2.61e-10 3.53e-01
400| 1.92e-03 1.92e-03 2.43e-01 -3.94e+04 -2.40e+04 3.61e-10 4.70e-01
500| 1.55e-03 1.80e-03 2.76e-01 -3.27e+04 -1.86e+04 2.44e-10 5.84e-01
600| 1.25e-03 1.64e-03 3.14e-01 -2.68e+04 -1.40e+04 3.22e-10 6.98e-01
700| 1.10e-03 1.44e-03 3.54e-01 -2.16e+04 -1.03e+04 0.00e+00 8.07e-01
800| 1.14e-03 1.23e-03 3.84e-01 -1.71e+04 -7.63e+03 2.55e-10 9.14e-01
900| 8.94e-04 1.07e-03 4.16e-01 -1.36e+04 -5.59e+03 3.05e-10 1.02e+00
1000| 6.46e-04 8.99e-04 4.34e-01 -1.07e+04 -4.22e+03 3.54e-10 1.12e+00
1100| 5.35e-04 7.22e-04 4.47e-01 -8.39e+03 -3.20e+03 3.99e-10 1.21e+00
1200| 5.01e-04 5.90e-04 4.39e-01 -6.62e+03 -2.58e+03 2.19e-10 1.32e+00
1300| 3.90e-04 4.84e-04 4.21e-01 -5.29e+03 -2.16e+03 2.36e-10 1.43e+00
1400| 2.60e-04 4.10e-04 3.83e-01 -4.31e+03 -1.92e+03 2.49e-10 1.54e+00
1500| 3.38e-04 3.31e-04 3.38e-01 -3.52e+03 -1.74e+03 2.60e-10 1.64e+00
1600| 1.96e-04 3.09e-04 2.90e-01 -3.00e+03 -1.65e+03 2.69e-10 1.74e+00
1700| 2.13e-04 2.43e-04 2.32e-01 -2.61e+03 -1.62e+03 2.75e-10 1.83e+00
1800| 2.05e-04 2.24e-04 1.81e-01 -2.28e+03 -1.59e+03 2.80e-10 1.92e+00
1900| 1.49e-04 2.19e-04 1.42e-01 -2.07e+03 -1.56e+03 2.83e-10 2.01e+00
2000| 1.44e-04 2.12e-04 1.13e-01 -1.92e+03 -1.53e+03 2.86e-10 2.12e+00
2100| 1.41e-04 2.09e-04 8.66e-02 -1.79e+03 -1.51e+03 2.88e-10 2.22e+00
2200| 1.32e-04 2.06e-04 6.30e-02 -1.71e+03 -1.50e+03 2.89e-10 2.31e+00
2300| 1.35e-04 2.04e-04 4.97e-02 -1.64e+03 -1.49e+03 2.90e-10 2.40e+00
2400| 1.30e-04 2.05e-04 3.88e-02 -1.59e+03 -1.47e+03 2.91e-10 2.49e+00
2500| 1.26e-04 2.03e-04 3.02e-02 -1.56e+03 -1.47e+03 2.92e-10 2.57e+00
----------------------------------------------------------------------------
Status: Solved/Inaccurate
Hit max_iters, solution may be inaccurate
Timing: Solve time: 2.57e+00s
Lin-sys: avg # CG iterations: 3.88, avg solve time: 5.10e-05s
Cones: avg projection time: 9.44e-04s
----------------------------------------------------------------------------
Error metrics:
dist(s, K) = 1.0027e-03, dist(y, K*) = 0.0000e+00, s'y/|s||y| = -5.5744e-11
|Ax + s - b|_2 / (1 + |b|_2) = 1.2581e-04
|A'y + c|_2 / (1 + |c|_2) = 2.0276e-04
|c'x + b'y| / (1 + |c'x| + |b'y|) = 3.0169e-02
----------------------------------------------------------------------------
c'x = -1560.6383, -b'y = -1469.2022
============================================================================
Obj = -1560.638313679461
This does not look very good.
We are working with very large data sets on similar (but more complicated) models (size indication: the number of cells to estimate varies between 1e4 and 1e6). These models currently use Mosek's general convex NLP functionality. I am curious how this will work out with the new exponential cones. As a backup, we can always resort to NLP solvers like IPOPT and Knitro.
References
- Mosek, Version 9 Roadmap, https://themosekblog.blogspot.de/2018/01/version-9-roadmap_31.html
- CVXPY, http://www.cvxpy.org/en/latest/