CVXPY[1] is a popular modeling tool for convex models. It rigorously checks the model is convex which is very convenient: many convex solvers are thoroughly confused when passing on a non-convex model. This is a bit different from say passing on a non-convex model to a local NLP solver. In that case the solver will accept it and try to find local solutions.
In addition CVXPY provides many high level functions (e.g. for different norms, etc.). and a very compact matrix based modeling paradigm. Although matrix notation can be very powerful, there are limits. When overdoing things, the notation becomes less intuitive.
Example: Transportation model
In this section I'll discuss some modeling issues when implementing a simple transportation model in CVXPY, and compare this to a standard GAMS implementation.
As an example consider the standard transportation model.
Transportation Model |
---|
\[\begin{align}\min&\sum_{i,j}\color{darkblue}c_{i,j}\color{darkred}x_{i,j}\\ & \sum_i\color{darkred}x_{i,j} \ge \color{darkblue} d_j && \forall j\\ &\sum_j\color{darkred}x_{i,j}\le \color{darkblue} s_i && \forall i \\ & \color{darkred}x_{i,j} \ge 0 \end{align}\] | \[\begin{align}\min\>\>&\mathbf{tr}(\color{darkblue}C^T \color{darkred}X) \\ &\color{darkred}X^T \color{darkblue}e \ge \color{darkblue} d \\ &\color{darkred}X \color{darkblue}e \le \color{darkblue}s\\ & \color{darkred}X\ge 0\end{align}\] |
Here \(e\) indicates a column vector of ones of appropriate size. The model in matrix notation is identical to the equation based model on the left. The matrix based model is compacter, but arguably a bit more difficult to read for most readers (me included).
One important way to help matrix models to be more readable, is to add a summation function. As a matrix model does not have indices, we need other ways to indicate what to sum over. This is expressed as the (optional)
axis argument. This means the model above can be expressed in CVXPY as
importnumpyasnp
importcvxpyascp
#----- data -------
capacity = np.array([350, 600])
demand = np.array([325, 300, 275])
distance = np.array([[2.5, 1.7, 1.8],
[2.5, 1.8, 1.4]])
freight =90
cost = freight*distance/1000
#------ set up LP data --------
C = cost
d = demand
s = capacity
#---- matrix formulation ----
ei = np.ones(s.shape)
ej = np.ones(d.shape)
X = cp.Variable(C.shape,"X")
prob = cp.Problem(
cp.Minimize(cp.trace(C.T@X)),
[X.T@ei>= d,
X@ej<= s,
X>=0])
prob.solve(verbose=True)
print("status:",prob.status)
print("objective:",prob.value)
print("levels:",X.value)
#---- summations ----
prob2 = cp.Problem(
cp.Minimize(cp.sum(cp.multiply(C,X))),
[cp.sum(X,axis=0) >= d,
cp.sum(X,axis=1) <= s,
X >=0])
prob2.solve(verbose=True)
print("status:",prob2.status)
print("objective:",prob2.value)
print("levels:",X.value)
The matrix model follows the mathematical model closely. The second model with the summations requires some explanation. In Python * is for elementwise multiplication and @ for matrix multiplication. In CVXPY however, @, * and
matmul are for matrix multiplication while
multiply is for elementwise multiplication. A
sum without axis argument sums over all elements, while
axis=0 sums over the first index, and
axis=1 sums over the second index.
The data for this model is from [2]. The lack of explicit indexing has another consequence. All data is determined by its position. I.e. we need to remember that position 0 means a canning plant in Seattle or a demand market in New York,
I am not so sure about the readability of these two CVXPY models. I think I still prefer the indexed model.
CVXPY uses OSQP [3] as default LP and QP solver. Here is the log:
-----------------------------------------------------------------
OSQP v0.6.0- Operator Splitting QP Solver
(c) Bartolomeo Stellato, Goran Banjac
University of Oxford - Stanford University 2019
-----------------------------------------------------------------
problem: variables n = 6, constraints m = 11
nnz(P) + nnz(A) = 18
settings: linear system solver = qdldl,
eps_abs = 1.0e-05, eps_rel = 1.0e-05,
eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04,
rho = 1.00e-01 (adaptive),
sigma = 1.00e-06, alpha = 1.60, max_iter = 10000
check_termination: on (interval 25),
scaling: on, scaled_termination: off
warm start: on, polish: on, time_limit: off
iter objective pri res dua res rho time
1-2.4113e+003.32e+027.31e+001.00e-015.47e-05s
2001.5368e+022.77e-032.44e-071.23e-034.30e-02s
status: solved
solution polish: unsuccessful
number of iterations: 200
optimal objective: 153.6751
run time: 4.31e-02s
optimal rho estimate: 2.60e-03
status: optimal
objective: 153.67514323621972
levels: [[ 3.17321312e+013.00004719e+022.71849569e-03]
[ 2.93267895e+02-2.77023874e-032.74995427e+02]]
The levels indicate OSQP is pretty sloppy here: we see some negative values of the order -1e-3. The real optimal objective is 153.675000 (so even though the solution is slightly infeasible, this did not help in achieving a better objective). Sometimes OSQP does better: if the
solution polishing step works. This polishing is a bit like a poor man's crossover: it tries to guess the active constraints. In this case polishing did not work.
The number of iterations is large. This is normal: this solver uses a first order algorithm. They typically require a lot of iterations.
For completeness, the original GAMS model looks like:
Set
i 'canning plants' / seattle, san-diego /
j 'markets' / new-york, chicago, topeka /;
Parameter
a(i) 'capacity of plant i in cases'
/ seattle 350
san-diego 600 /
b(j) 'demand at market j in cases'
/ new-york 325
chicago 300
topeka 275 /;
Table d(i,j) 'distance in thousands of miles'
new-york chicago topeka
seattle 2.51.71.8
san-diego 2.51.81.4;
Scalar f 'freight in dollars per case per thousand miles' / 90 /;
Parameter c(i,j) 'transport cost in thousands of dollars per case';
c(i,j) = f*d(i,j)/1000;
Variable
x(i,j) 'shipment quantities in cases'
z 'total transportation costs in thousands of dollars';
Positive Variable x;
Equation
cost 'define objective function'
supply(i) 'observe supply limit at plant i'
demand(j) 'satisfy demand at market j';
cost.. z =e= sum((i,j), c(i,j)*x(i,j));
supply(i).. sum(j, x(i,j)) =l= a(i);
demand(j).. sum(i, x(i,j)) =g= b(j);
Model transport / all /;
solve transport using lp minimizing z;
display x.l, x.m;
The main differences with the CVXPY model are:
- GAMS indexes by names (set elements), CVXPY uses positions in a matrix or vector
- GAMS is equation based while CVXPY uses matrices
- For this model, the CVXPY representation is be very compact, terse but depending on the formulation it requires familiarity with matrix notation.
- Arguably, the most important feature of a modeling tool is readability. I have a preference to the GAMS notation here: it is closer to the original mathematical model in a notation I am used to.
- When printing the results, GAMS is a bit more intuitive:
GAMS:
---- 66 VARIABLE x.L shipment quantities in cases
new-york chicago topeka
seattle 50.000300.000
san-diego 275.000275.000
Python:
[[ 3.17321312e+013.00004719e+022.71849569e-03]
[ 2.93267895e+02 -2.77023874e-032.74995427e+02]]
Example: non-convex binary quadratic optimization
Consider the binary quadratic programming problem (in indexed format and in matrix notation):
Binary Quadratic Model |
---|
\[\begin{align}\min&\sum_{i,j} \color{darkred}x_i \color{darkblue}q_{i,j}\color{darkred}x_j\\ & \color{darkred}x_i \in \{0,1\} \end{align}\] | \[\begin{align}\min\>\>& \color{darkred}x^T\color{darkblue}Q\color{darkred}x \\ &\color{darkred}x \in \{0,1\} \end{align}\] |
We don't assume the Q matrix is positive (semi) definite or even symmetric. This makes the problem non-convex. Interestingly, when we feed this problem into solvers like Cplex or Gurobi, they have no problem in finding the optimal solution. The reason is that they apply a trick to make this problem linear.
We can linearize the binary product \(y_{i,j} = x_i x_j\) by \[\begin{align} & y_{i,j} \le x_i \\& y_{i,j} \le x_j \\& y_{i,j} \ge x_i + x_j -1 \\ & x_i, y_{i,j} \in \{0,1\}\end{align}\] If we want we can relax \(y\) to be continuous between 0 and 1.
After applying this linearization, we have:
Linearized Binary Quadratic Model |
---|
\[\begin{align}\min&\sum_{i,j} \color{darkblue}q_{i,j}\color{darkred}y_{i,j}\\ & \color{darkred}y_{i,j} \le \color{darkred}x_i\\ & \color{darkred}y_{i,j} \le \color{darkred}x_j\\ & \color{darkred}y_{i,j} \ge \color{darkred}x_i+ \color{darkred}x_j - 1\\ & \color{darkred}x_i,\color{darkred}y_{i,j} \in \{0,1\} \end{align}\] | \[\begin{align}\min\>\>& \mathbf{tr}(\color{darkblue}Q^T\color{darkred}Y) \\ & \color{darkred}Y \le \color{darkred}x \cdot\color{darkblue}e^T \\ & \color{darkred}Y \le \color{darkblue}e \cdot\color{darkred}x^T \\& \color{darkred}Y \ge \color{darkred}x \cdot\color{darkblue}e^T + \color{darkblue}e \cdot\color{darkred}x^T - \color{darkblue}e \cdot \color{darkblue}e^T\\ &\color{darkred}x, \color{darkred}Y \in \{0,1\} \end{align}\] |
The matrix form of the objective is similar to the one we saw in the section on the transportation problem. The constraints are a little bit more complicated due to the outer products.
Although a solver like Cplex and Gurobi can solve the quadratic formulation directly, CVXPY will complain with the message:
cvxpy.error.DCPError: Problem does not follow DCP rules. Specifically:
The objective is not DCP. Its following subexpressions are not:x * [[-6.56505736e+006.86533416e+001.00750712e+00 -3.97724192e+00
-4.15575766e+00 -5.51894266e+00 -3.00338992e+007.12540694e+00
-8.65772554e+004.21338000e-03]
[ 9.96235254e+001.57466756e+009.82266078e+005.24500934e+00
-7.38615034e+002.79437518e+00 -6.80964272e+00 -4.99838934e+00
3.37857218e+00 -1.29287238e+00]
[-2.80599468e+00 -2.97117264e+00 -7.37016820e+00 -6.99796424e+00
1.78227300e+006.61785624e+00 -5.38368524e+003.31468920e+00
5.51715212e+00 -3.92683046e+00]
[-7.79015418e+004.76973200e-02 -6.79654476e+007.44924622e+00
-4.69770910e+00 -4.28371356e+001.87911844e+004.45438142e+00
2.56497354e+00 -7.24042700e-01]
[-1.73386012e+00 -7.64609286e+00 -3.71575466e+00 -9.06896972e+00
-3.22899456e+00 -6.35800814e+002.91454254e+001.21491094e+00
5.39923440e+00 -4.04388272e+00]
[ 3.22212522e+005.11643348e+002.54894998e+00 -4.32271604e+00
-8.27150752e+00 -7.94970662e+002.82502302e+009.06189960e-01
-9.36950296e+005.84721284e+00]
[-8.54466004e+00 -6.48677902e+005.12652260e-015.00415338e+00
-6.43752572e+00 -9.31718028e+001.70262346e+002.42459968e+00
-2.21276200e+00 -2.82571694e+00]
[-5.13930766e+00 -5.07156922e+00 -7.38994394e+008.66899440e+00
-2.40124188e+005.66800922e+00 -3.99931484e+00 -7.49033556e+00
4.97748210e+00 -8.61535074e+00]
[-5.95968886e+00 -9.89868284e+00 -4.60773896e+00 -2.97050000e-03
-6.97428262e+00 -6.51661090e+00 -3.38724532e+00 -3.66187892e+00
-3.55826090e+009.27953282e+00]
[ 9.87204410e+00 -2.60193890e+00 -2.54222866e+005.43956660e+00
-2.06631716e+008.26192650e+00 -7.60844540e+004.70957778e+00
-8.89163050e+001.52599610e+00]] * x
A linearized formulation can look like:
import numpy as np
import cvxpy as cp
# -------- data ---------
Q = np.array([
[-6.56505736, 6.86533416, 1.00750712, -3.97724192, -4.15575766, -5.51894266, -3.00338992, 7.12540694, -8.65772554, 0.00421338],
[ 9.96235254, 1.57466756, 9.82266078, 5.24500934, -7.38615034, 2.79437518, -6.80964272, -4.99838934, 3.37857218, -1.29287238],
[-2.80599468, -2.97117264, -7.3701682 , -6.99796424, 1.782273 , 6.61785624, -5.38368524, 3.3146892 , 5.51715212, -3.92683046],
[-7.79015418, 0.04769732, -6.79654476, 7.44924622, -4.6977091 , -4.28371356, 1.87911844, 4.45438142, 2.56497354, -0.7240427 ],
[-1.73386012, -7.64609286, -3.71575466, -9.06896972, -3.22899456, -6.35800814, 2.91454254, 1.21491094, 5.3992344 , -4.04388272],
[ 3.22212522, 5.11643348, 2.54894998, -4.32271604, -8.27150752, -7.94970662, 2.82502302, 0.90618996, -9.36950296, 5.84721284],
[-8.54466004, -6.48677902, 0.51265226, 5.00415338, -6.43752572, -9.31718028, 1.70262346, 2.42459968, -2.212762 , -2.82571694],
[-5.13930766, -5.07156922, -7.38994394, 8.6689944 , -2.40124188, 5.66800922, -3.99931484, -7.49033556, 4.9774821 , -8.61535074],
[-5.95968886, -9.89868284, -4.60773896, -0.0029705 , -6.97428262, -6.5166109 , -3.38724532, -3.66187892, -3.5582609 , 9.27953282],
[ 9.8720441 , -2.6019389 , -2.54222866, 5.4395666 , -2.06631716, 8.2619265 , -7.6084454 , 4.70957778, -8.8916305 , 1.5259961 ]])
n = Q.shape[0]
# ---- linearized model, matrix format -----
x = cp.Variable((n,1),"x",boolean=True)
Y = cp.Variable((n,n),"Y")
e = np.ones((n,1))
prob = cp.Problem(cp.Minimize(cp.trace(Q.T@Y)),
[Y <= x@e.T,
Y <= e@x.T,
Y >= x@e.T + e@x.T - e@e.T,
Y >= 0,
Y <= 1])
prob.solve(solver=cp.GLPK_MI,verbose=True)
print("status:",prob.status)
print("objective:",prob.value)
print("levels:",x.value)
Notes:
- The objective can be replaced by cp.Minimize(cp.multiply(Q,Y))
- I relaxed the Y variables
- We use glpk as the MIP solver
- CVXPY comes with an integer solver called ECOS_BB. This solver seems to choke on this problem.
The original GAMS model for this problem was as follows:
set i /i1*i10/;
alias(i,j);
parameter q(i,j);
q(i,j) = uniform(-10,10);
binary variable x(i);
variable z;
equation obj;
obj.. z =e= sum((i,j), x(i)*q(i,j)*x(j));
model m /obj/;
option miqcp=cplex,optcr=0;
solve m minimizing z using miqcp;
display z.l,x.l;
Cplex will automatically linearize this model.
CVXPY Sparse Variables
CVXPY has some severe limitations on how variables can look like.
First we cannot use three (or more) dimensional variables. So something like
x[i,j,k] is not supported. A declaration like:
X = cp.Variable((n,n,n),"X")
gives:
ValueError: Expressions of dimension greater than 2 are not supported.This is a rather severe restriction. Many practical model have variables exceeding 2 dimensions. Of course matrix notation becomes impractical for symbols with more than 2 dimensions. Which is probably the reason why CVXPY ony wants to handle scalars, vectors and matrices.
Furthermore sparse variables are not supported either. Everything is fully allocated. As an example consider the toy model:
n =100
X = cp.Variable((n,n),"X")
prob = cp.Problem(
cp.Minimize(0),
[X[0,0]==1])
Solver Log:
problem: variables n =10000, constraints m =1
My hope was that I could just declare a large variable matrix and that CVXPY would only export the used variables to the solver. Instead of the solver seeing a model with one variable, it receives a model with 10,000 variables.
Example: a Sparse Network Model
A linear programming formulation for a max-flow network problem can look like:
Max-flow Sparse Network Model |
---|
\[\begin{align}\max\>\>&\color{darkred}f\\ & \sum_{j|\color{darkblue}{\mathit{arc}}(j,i)}\color{darkred}x_{j,i} = \sum_{j|\color{darkblue}{\mathit{arc}}(i,j)}\color{darkred}x_{i,j} + \color{darkred}f\cdot \color{darkblue}b_i && \forall i \\ & 0 \le \color{darkred}x_{i,j} \le \color{darkblue}{\mathit{capacity}_i} && \forall i,j|\color{darkblue}{\mathit{arc}}(i,j) \end{align}\] |
Here \(arc(i,j)\) indicates whether link \(i \rightarrow j\) exists. The sparse data vector \(b_i\) is defined by \[b_i = \begin{cases} -1 & \text{if node $i$ is the source node}\\ +1 & \text{if node $i$ is the sink node}\\ 0 & \text{otherwise} \end{cases}\]
This mathematical model translates directly into a GAMS model:
$ontext
max flow network example
Data from example in
Mitsuo Gen, Runwei Cheng, Lin Lin
Network Models and Optimization: Multiobjective Genetic Algorithm Approach
Springer, 2008
Erwin Kalvelagen, Amsterdam Optimization, May 2008
$offtext
sets
i 'nodes' /node1*node11/
source(i) /node1/
sink(i) /node11/
;
alias(i,j);
parameter capacity(i,j) /
node1.node2 60
node1.node3 60
node1.node4 60
node2.node3 30
node2.node5 40
node2.node6 30
node3.node4 30
node3.node6 50
node3.node7 30
node4.node7 40
node5.node8 60
node6.node5 20
node6.node8 30
node6.node9 40
node6.node10 30
node7.node6 20
node7.node10 40
node8.node9 30
node8.node11 60
node9.node10 30
node9.node11 50
node10.node11 50
/;
set arcs(i,j);
arcs(i,j)$capacity(i,j) = yes;
display arcs;
parameter rhs(i);
rhs(source) = -1;
rhs(sink) = 1;
variables
x(i,j) 'flow along arcs'
f 'total flow'
;
positive variables x;
x.up(i,j) = capacity(i,j);
equations
flowbal(i) 'flow balance'
;
flowbal(i).. sum(arcs(j,i), x(j,i)) - sum(arcs(i,j), x(i,j)) =e= f*rhs(i);
model m /flowbal/;
solve m maximizing f using lp;
The GAMS model exploits that GAMS stores data sparsely. The variables
x(i,j) are only allocated when they are used inside the equations. This usage is restricted to cases where
arcs(i,j) exist. I.e. the number of variables
x(i,j) is 22 instead of \(11 \times 11\).
As we discussed in the previous section, CVXPY does not support sparse variables like GAMS. So instead of variables
x(i,j) we'll use
x[k] where
k indicates the arc number. CVXPY supports sparse data matrices through
scipy.sparse. In the code below we set up a sparse matrix
A with entries as follows:- A[i,k] = -1 if arc \(k\) represents an outgoing link \(i \rightarrow j\)
- A[i,k] = +1 if arc \(k\) represents an incoming link \(j \rightarrow i\)
With this we can formulate the model:
import numpy as np
import scipy.sparse as sparse
import cvxpy as cp
# ------ data --------
data = {
'nodes':['A','B','C','D','E','F','G','H','I','J','K'],
'from':['A','A','A','B','B','B','C','C','C','D','E',
'F','F','F','F','G','G','H','H','I','I','J'],
'to': ['B','C','D','C','E','F','D','F','G','G','H',
'E','H','I','J','F','J','I','K','J','K','K'],
'capacity': [60,60,60,30,40,30,30,50,30,40,60,20,30,40,30,20,40,30,60,30,50,50],
'source' : 'A',
'sink' : 'K'
}
numnodes = len(data['nodes'])
numarcs = len(data['capacity'])
print("Number of nodes: {}".format(numnodes))
print("Number of arcs: {}".format(numarcs))
# ------ lp data --------
# map node name to index
map = dict(zip(data['nodes'],range(numnodes)))
# coefficients
irow = np.zeros(2*numarcs,int)
jcol = np.zeros(2*numarcs,int)
val = np.zeros(2*numarcs)
# arc k: i->j has coefficient -1 in row i, column k
# +1 j
for k in range(numarcs):
i = map[data['from'][k]]
j = map[data['to'][k]]
kk = 2*k
irow[kk] = i
jcol[kk] = k
val[kk] = -1
kk = 2*k+1
irow[kk] = j
jcol[kk] = k
val[kk] = 1
A = sparse.csc_matrix((val,(irow,jcol)))
b = np.zeros(numnodes)
b[map[data['source']]] = -1
b[map[data['sink']]] = 1
cap = data['capacity']
# ------ lp model --------
x = cp.Variable(numarcs,"x")
f = cp.Variable(1,"f")
prob = cp.Problem(cp.Maximize(f),
[A@x == f*b, x >= 0, x <= cap])
prob.solve(verbose=True)
print(prob)
With this experiment, we have confirmed that sparse data matrices work just fine with CVXPY. However, this makes the model not that straightforward anymore. This is more complicated than the corresponding GAMS model.
See [4] for an alternative approach.
Example: Matrix Balancing
In this example we want to estimate the inner part of a matrix subject to row- and column-total constraints. This problem is frequently encountered in economic modeling. An additional constraint is that we want to maintain the sparsity pattern of the matrix. Basically the model is:
Matrix Balancing Model |
---|
\[\begin{align}\min\>\>&\mathbf{dist}(\color{darkred}A,\color{darkblue}A^0)\\ & \sum_i\color{darkred}a_{i,j} = \color{darkblue} v_j && \forall j\\ &\sum_j\color{darkred}a_{i,j} = \color{darkblue} u_i && \forall i \\ & \color{darkblue}a^0_{i,j}=0 \Rightarrow\color{darkred}a_{i,j} = 0 \end{align}\] |
There are different possibilities for the distance function. E.g. it can be a quadratic function \[\mathbf{dist}(A,A^0) = \sum_{i,j} (a_{i,j}-a^0_{i,j})^2\] or in this case an entropy function \[\mathbf{dist}(A,A^0) =\sum_{i,j} a_{i,j}\log\left(\frac{a_{i,j}}{a^0_{i,j}}\right)\]
Often the implication is enforced by just ignoring or skipping all elements \(a_{i,j}\) where \(a^0_{i,j}=0\). This leads again to a sparse representation of the variables. In GAMS this is quite easy.
$ontext
Example from
Using PROC IML to do Matrix Balancing
Carol Alderman, University of Kansas
Institute for Public Policy and Business Research
MidWest SAS Users Group MWSUG 1992
$offtext
sets
p 'products' /pA*pI/
s 'salesmen' /s1*s10/
;
table A0(*,*) 'estimated matrix, known totals'
s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 rowTotal
pA 230375375100685215502029
pB 330405419175905045152401052798
pC 26822524230790301441001998
pD 59538063827530685605881001603566
pE 34036044020030755475441502794
pF 1321902004321301071
pG 30933035012561247450502305
pH 36540033015050575600441501102747
pI 210250308125720256100502015
colTotal 277229103300115024057603526220950495
;
alias (p,i);
alias (s,j);
variables
A(i,j) 'new values'
z 'objective (minimized)'
;
equations
objective
rowsum(i)
colsum(j)
;
objective.. z =e= sum((i,j)$A0(i,j), A(i,j)*log(A(i,j)/A0(i,j)));
rowsum(i).. sum(j$A0(i,j), A(i,j)) =e= A0(i,'rowTotal');
colsum(j).. sum(i$A0(i,j), A(i,j)) =e= A0('colTotal',j);
A.L(i,j) = A0(i,j);
A.lo(i,j)$A0(i,j) = 0.0001;
model m /all/;
solve m minimizing z using nlp;
display A.L,z.l;
When solved with the general purpose NLP solver CONOPT, we get the following results:
---- 58 VARIABLE A.L new values
s1 s2 s3 s4 s5 s6 s7 s8 s9
pA 229.652374.869375.099100.028686.238212.54250.572
pB 330.587406.192420.492175.62694.300506.575510.790243.545
pC 267.313224.685241.80931.297790.595297.24644.017101.037
pD 595.262380.610639.416275.61431.391687.580599.25288.299101.341
pE 339.659360.058440.341200.15831.346756.750469.80944.086151.793
pF 130.330187.815197.821427.953127.080
pG 309.478330.895351.165125.418614.984470.01650.727
pH 360.594395.631326.596148.45551.665569.947586.86743.598150.111
pI 209.123249.246307.260124.701719.378252.398100.874
+ s10
pB 109.894
pD 167.233
pG 52.318
pH 113.535
pI 52.019
---- 58 VARIABLE z.L = -15.769 objective (minimized)
CVXPY does not allow to skip elements like GAMS does. Well, unless we build the whole model in scalar mode: element by element. That is not very attractive so let's try another way. It will be a bit of a struggle. Let's define a (data) matrix D as \[d_{i,j} = \begin{cases} 1 & \text{if $a^0_{i,j} \ne 0$}\\ 0 & \text{if $a^0_{i,j} = 0$}\end{cases}\] This matrix can be used to skip linear terms that are not needed. The objective is another story. CVXPY has the function
entr() which is defined by: \(-x\log(x)\). We expand the objective as: \[\sum_{i,j} a_{i,j}\log\left(\frac{a_{i,j}}{a^0_{i,j}}\right) = \sum_{i,j} - \mathbf{entr}(a_{i,j}) - a_{i,j}\log(a^0_{i,j})\] Finally we insert \(d\) to ignore the zero's: \[ \min \sum_{i,j} - d_{i,j} \mathbf{entr}(a_{i,j}) - d_{i,j} a_{i,j}\log(a^0_{i,j}+1-d_{i,j})\] This is quite some gymnastics to shoehorn our model into an acceptable CVXPY format.
import numpy as np
import cvxpy as cp
# -------- data ----------
A0 = [[ 230 , 375 , 375 , 100 , 0 , 685 , 215 , 0 , 50 , 0 ],
[ 330 , 405 , 419 , 175 , 90 , 504 , 515 , 0 , 240 , 105 ],
[ 268 , 225 , 242 , 0 , 30 , 790 , 301 , 44 , 100 , 0 ],
[ 595 , 380 , 638 , 275 , 30 , 685 , 605 , 88 , 100 , 160 ],
[ 340 , 360 , 440 , 200 , 30 , 755 , 475 , 44 , 150 , 0 ],
[ 132 , 190 , 200 , 0 , 0 , 432 , 130 , 0 , 0 , 0 ],
[ 309 , 330 , 350 , 125 , 0 , 612 , 474 , 0 , 50 , 50 ],
[ 365 , 400 , 330 , 150 , 50 , 575 , 600 , 44 , 150 , 110 ],
[ 210 , 250 , 308 , 125 , 0 , 720 , 256 , 0 , 100 , 50 ]]
u = [2029,2798,1998,3566,2794,1071,2305,2747,2015]
v = [2772,2910,3300,1150,240,5760,3526,220,950,495]
m = len(u)
n = len(v)
# -------- model ----------
D = np.sign(A0)
Dloga0 = D * np.log(A0+np.ones_like(D)-D)
A = cp.Variable((m,n),"A")
obj = cp.Minimize(cp.sum(cp.multiply(D,-cp.entr(A)) - cp.multiply(A,Dloga0)))
cons = [cp.sum(cp.multiply(D,A),axis=1)==u,
cp.sum(cp.multiply(D,A),axis=0)==v,
A >= 0]
prob = cp.Problem(obj,cons)
prob.solve(solver=cp.SCS,verbose=True,max_iters=200000)
Solving this tiny model is very difficult. It is using exponential cones and that is not very robustly implemented in the solvers. With SCS and a lot of iterations, we finally see:
101400| 7.74e-059.08e-057.16e-04 -1.56e+01 -1.56e+013.30e-111.14e+02
101500| 7.72e-059.10e-055.74e-04 -1.54e+01 -1.54e+013.30e-111.14e+02
101600| 7.69e-059.13e-055.30e-04 -1.51e+01 -1.51e+017.77e-111.14e+02
101700| 7.66e-059.17e-054.02e-04 -1.49e+01 -1.49e+015.64e-111.14e+02
101800| 7.62e-059.22e-053.96e-04 -1.46e+01 -1.46e+013.30e-111.14e+02
101900| 7.59e-059.28e-052.10e-04 -1.44e+01 -1.44e+013.30e-111.14e+02
101980| 7.56e-059.33e-053.35e-05 -1.42e+01 -1.42e+011.17e-111.14e+02
----------------------------------------------------------------------------
Status: Solved
Timing: Solve time: 1.14e+02s
Lin-sys: nnz in L factor: 1089, avg solve time: 1.87e-05s
Cones: avg projection time: 1.06e-03s
Acceleration: avg step time: 2.62e-07s
----------------------------------------------------------------------------
Error metrics:
dist(s, K) = 4.8795e-06, dist(y, K*) = 0.0000e+00, s'y/|s||y| = -5.3311e-12
primal res: |Ax + s - b|_2 / (1 + |b|_2) = 7.5554e-05
dual res: |A'y + c|_2 / (1 + |c|_2) = 9.3272e-05
rel gap: |c'x + b'y| / (1 + |c'x| + |b'y|) = 3.3548e-05
----------------------------------------------------------------------------
c'x = -14.2093, -b'y = -14.2103
============================================================================
The objective is in the neighborhood of what CONOPT found for the GAMS model (CONOPT required just a few iterations).
This model is not exactly a showcase for CVXPY.
Conclusion
CVXPY can be very convenient to model certain classes of models: convex and easy too write in matrix notation. For other models it is not very suited. We saw some examples where things become really hairy when using CVXPY while the underlying mathematical model is really quite simple. CVXPY is really a special purpose modeling tool and you may want to consider other tools when the model does not really fits CVXPY's matrix philosophy.
References
- https://www.cvxpy.org/
- https://www.gams.com/products/simple-example/
- https://osqp.org/
- https://www.cvxpy.org/examples/applications/OOCO.html