Best Factorization X=AB

Restated from an original post in [1]:

Given a matrix \(X\) find a factorization \(X\approx AB\) such that all elements of \(A,B\) are between 0 and 1 and the matrix product \(AB\) is as close as possible to \(X\). Can we solve this as an NLP (nonlinear programming) model?

My answer: yes, but....

The model can simply look like: \[\begin{align} \min\> & ||AB-X||^2 \\ & 0 \le A,B \le 1 \end{align}\] or \[\begin{align} \min &\sum_{i,j} \left ( \sum_k a_{i,k} b_{k,j} - x_{i,j} \right )^2\\ & 0 \le a_{i,k},b_{k,j} \le 1\end{align} \] However, this is a rather nasty non-convex problem. As I expected to find local optima, I solved this in a loop with a different random starting point for \(A\) and \(B\) (each uniform between 0 and 1). With a small data set \[\begin{align}&i=1,\dots,8\\ &j=1,\dots,12\\ & k=1,\dots,6\end{align}\] and random values for \(X\) (uniform between 0 and 6), we see:

Results with NLP solver CONOPT


----     59 PARAMETER results  

               obj   modelstat        time        best

iter1      129.370    localopt       0.062     129.370
iter2      129.088    localopt       0.094     129.088
iter3      126.113    localopt       0.047     126.113
iter4      127.768    localopt       0.031
iter5      128.103    localopt       0.046
iter6      128.178    localopt       0.094
iter7      126.644    localopt       0.031
iter8      129.179    localopt       0.032
iter9      128.201    localopt       0.047
iter10     130.898    localopt       0.125
iter11     125.803    localopt       0.063     125.803
iter12     127.407    localopt       0.063
iter13     127.898    localopt       0.031
iter14     127.768    localopt       0.047
iter15     129.823    localopt       0.031
iter16     125.803    localopt       0.047
iter17     129.823    localopt       0.063
iter18     128.901    localopt       0.047
iter19     126.105    localopt       0.047
iter20     128.164    localopt       0.032


Results with NLP solver IPOPT

----     59 PARAMETER results  

               obj   modelstat        time        best

iter1      126.113    localopt       0.344     126.113
iter2      128.129    localopt       0.266
iter3      126.113    localopt       0.313
iter4      128.103    localopt       0.218
iter5      128.901    localopt       0.266
iter6      125.803    localopt       0.344     125.803
iter7      129.330    localopt       0.375
iter8      128.816    localopt       0.360
iter9      128.103    localopt       0.406
iter10     130.276    localopt       0.390
iter11     127.255    localopt       0.328
iter12     126.113    localopt       0.281
iter13     129.002    localopt       0.297
iter14     128.149    localopt       0.235
iter15     129.823    localopt       0.266
iter16     126.545    localopt       0.328
iter17     127.094    localopt       0.204
iter18     128.164    localopt       0.344
iter19     128.201    localopt       0.313
iter20     128.723    localopt       0.281
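
The multistart loop behind these tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model that produced the results above: CONOPT and IPOPT are replaced by a simple projected gradient descent with step halving, the data are freshly generated random numbers (so the objective values will not match the tables), and the function names `matmul`, `transpose`, `objective`, `step_clipped`, `local_search` and `multistart` are made up for this sketch.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def objective(A, B, X):
    """Sum of squared deviations ||AB - X||^2."""
    P = matmul(A, B)
    return sum((p - x) ** 2 for pr, xr in zip(P, X) for p, x in zip(pr, xr))

def step_clipped(M, G, t):
    """Move M against direction G with step t, then project onto [0, 1]."""
    return [[min(1.0, max(0.0, m - t * g)) for m, g in zip(mr, gr)]
            for mr, gr in zip(M, G)]

def local_search(X, A, B, iters=200):
    """Projected gradient descent with step halving (stand-in for an NLP solver)."""
    f, t = objective(A, B, X), 0.01
    for _ in range(iters):
        P = matmul(A, B)
        R = [[p - x for p, x in zip(pr, xr)] for pr, xr in zip(P, X)]
        gA = matmul(R, transpose(B))   # gradient w.r.t. A, factor 2 absorbed in t
        gB = matmul(transpose(A), R)   # gradient w.r.t. B, factor 2 absorbed in t
        while t > 1e-12:
            A2, B2 = step_clipped(A, gA, t), step_clipped(B, gB, t)
            f2 = objective(A2, B2, X)
            if f2 < f:                 # accept only improving steps
                A, B, f, t = A2, B2, f2, 2 * t
                break
            t /= 2
        else:
            break                      # no improving step found: stop
    return A, B, f

def multistart(X, k, starts=5, seed=1):
    """Run the local search from several uniform(0,1) starting points."""
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    best, history = None, []
    for _ in range(starts):
        A = [[rng.random() for _ in range(k)] for _ in range(m)]
        B = [[rng.random() for _ in range(n)] for _ in range(k)]
        A, B, f = local_search(X, A, B)
        history.append(f)
        if best is None or f < best[2]:
            best = (A, B, f)
    return best, history

# Same dimensions as in the text: X is 8 x 12, inner dimension k = 6.
data_rng = random.Random(0)
X = [[data_rng.uniform(0, 6) for _ in range(12)] for _ in range(8)]
(A, B, f), history = multistart(X, k=6)
print([round(h, 3) for h in history], "best:", round(f, 3))
```

As in the solver runs, different starting points will generally end in different objective values, which is the reason for keeping track of the best one.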

Both solvers find a solution with an objective of 125.803. This is a small problem with \(8 \times 6 + 6 \times 12 = 120\) nonlinear variables, which at first sight looks doable for a global solver like BARON or Antigone. The catch is that we have \(8 \times 6\times 12 = 576\) bilinear terms \(a_{i,k}b_{k,j}\). These make life really difficult for global solvers. Let's try anyway, using the 125.803 solution as an initial point and a time limit of 1000 seconds.
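
To get a feeling for why the bilinear terms hurt, consider how a single product is commonly relaxed in spatial branch-and-bound. A standard device is the McCormick envelope. Whether BARON or Antigone use exactly this form for this model is an assumption, and the auxiliary variable \(w_{i,k,j}\) is introduced here only for illustration. With \(0 \le a_{i,k},b_{k,j} \le 1\), the equation \(w_{i,k,j}=a_{i,k}b_{k,j}\) is replaced by the linear inequalities \[\begin{align} & w_{i,k,j} \ge 0 \\ & w_{i,k,j} \ge a_{i,k} + b_{k,j} - 1 \\ & w_{i,k,j} \le a_{i,k} \\ & w_{i,k,j} \le b_{k,j} \end{align}\] This relaxation is loose in the interior of the box: for \(a_{i,k}=b_{k,j}=0.5\) it allows any \(w_{i,k,j}\in[0,0.5]\), while the true product is \(0.25\). With 576 such terms, a lot of branching may be needed before the lower bound moves.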

Results with BARON


  Iteration    Open nodes         Time (s)    Lower bound      Upper bound
          1             1             3.00        54.4268          125.803
*         4             3             8.00        54.4268          125.715
         41            28            38.00        57.2903          125.715
         76            54            69.00        58.6781          125.715
        115            82           100.00        59.7597          125.715
        155           111           131.00        60.2728          125.715
        190           137           162.00        60.5990          125.715
        229           164           192.00        60.8507          125.715
        266           192           223.00        61.1276          125.715
        301           216           254.00        61.3283          125.715
        335           239           284.00        61.4689          125.715
        373           265           315.00        61.7422          125.715
        414           296           346.00        61.9213          125.715
        454           321           377.00        62.1128          125.715
        492           348           408.00        62.3822          125.715
        528           370           438.00        62.4439          125.715
        567           392           469.00        62.5986          125.715
        598           412           499.00        62.6449          125.715
        631           433           531.00        62.6994          125.715
        664           458           561.00        62.7280          125.715
        695           478           593.00        62.7932          125.715
        728           501           624.00        62.8952          125.715
        754           520           655.00        62.9707          125.715
        782           539           685.00        63.0497          125.715
        809           558           716.00        63.1053          125.715
        834           577           746.00        63.1215          125.715
        866           599           777.00        63.1805          125.715
        894           620           808.00        63.1963          125.715
        920           636           839.00        63.3036          125.715
        949           655           869.00        63.3194          125.715
        978           677           900.00        63.4053          125.715
       1007           695           930.00        63.4625          125.715
       1036           717           961.00        63.5189          125.715
       1066           737           992.00        63.5608          125.715
       1096           758          1023.00        63.6598          125.715
       1122           778          1054.00        63.7217          125.715
       1149           797          1085.00        63.8112          125.715
       1160           804          1097.00        63.8213          125.715

                    *** Max. allowable time exceeded ***      


Results with Antigone

-------------------------------------------------------------------------------
Time (s) Nodes explored Nodes remaining Best possible   Best found Relative Gap
-------------------------------------------------------------------------------

     Searching for feasible solutions with 4 starting points at tree level 0 --
      25              1               1    +9.689e+01   +1.258e+02   +2.298e-01
     Adding 0 total cutting planes at tree level 0 ----------------------------
     Adding 739 total cutting planes at tree level 0 --------------------------
     Strong Branching at tree level 1 -----------------------------------------
     Strong Branching at tree level 1 -----------------------------------------
     561              1               2    +9.893e+01   +1.257e+02   +2.131e-01
     Adding 0 total cutting planes at tree level 1 ----------------------------
     Generating Edge-Concave Cuts; 0 generated so far -------------------------
     Strong Branching at tree level 2 -----------------------------------------
     Strong Branching at tree level 2 -----------------------------------------
     684              2               3    +9.893e+01   +1.257e+02   +2.131e-01
     Adding 0 total cutting planes at tree level 1 ----------------------------
     Strong Branching at tree level 2 -----------------------------------------
     Strong Branching at tree level 2 -----------------------------------------
     Strong Branching at tree level 2 -----------------------------------------
     804              3               4    +9.894e+01   +1.257e+02   +2.130e-01
     Adding 0 total cutting planes at tree level 2 ----------------------------
     Strong Branching at tree level 3 -----------------------------------------
     Strong Branching at tree level 3 -----------------------------------------
     905              4               5    +9.894e+01   +1.257e+02   +2.130e-01
     921              5               6    +9.897e+01   +1.257e+02   +2.128e-01
     Adding 0 total cutting planes at tree level 3 ----------------------------
     962              6               7    +9.897e+01   +1.257e+02   +2.128e-01
     970              7               8    +9.897e+01   +1.257e+02   +2.128e-01
     979              8               9    +9.897e+01   +1.257e+02   +2.128e-01
     986              9              10    +9.897e+01   +1.257e+02   +2.128e-01
     993             10              11    +9.897e+01   +1.257e+02   +2.128e-01

                                               Reached time limit of 1000 CPU s
    1000             10              11    +9.897e+01   +1.257e+02   +2.128e-01

Both solvers have trouble closing the gap. Antigone seems to do better initially, but rest assured: it slows down considerably and the bounds are just not getting closer reasonably fast. On the other hand, both solvers are able to improve the solution from 125.803 to 125.715. Interesting results.
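
As a quick check on how far apart the bounds still are, the final gaps can be computed from the last log lines. The snippet below is a small sketch: it assumes the gap is defined as (upper - lower) / upper, which is one common convention and not necessarily what each solver reports, and it uses the rounded values printed in the logs. The helper name `relative_gap` is made up for this example.

```python
def relative_gap(lower, upper):
    """Relative optimality gap for a minimization problem (assumed definition)."""
    return (upper - lower) / upper

# Final bounds as printed in the logs above (rounded by the solvers).
baron = relative_gap(63.8213, 125.715)
antigone = relative_gap(98.97, 125.7)

print(f"BARON    gap: {baron:.1%}")
print(f"Antigone gap: {antigone:.1%}")
```

Under this definition BARON ends with a gap of roughly 49% and Antigone with roughly 21%, the latter in line with the relative gap column of its own log.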

We can make things less non-linear by considering absolute deviations:\[\begin{align}\min & \sum_{i,j} \left| \sum_k a_{i,k} b_{k,j} - x_{i,j}\right |\\ & 0 \le a_{i,k},b_{k,j} \le 1 \end{align}\] or \[\begin{align} \min\> & \sum_{i,j} y_{i,j} \\ & -y_{i,j} \le \sum_k a_{i,k} b_{k,j} - x_{i,j} \le y_{i,j} \\ & 0 \le a_{i,k},b_{k,j} \le 1 \\ & y_{i,j} \ge 0 \end{align}\] Note: when implementing this we should make sure not to duplicate the expression \(\sum_k a_{i,k} b_{k,j} - x_{i,j}\). That means: introduce another set of variables. An alternative least absolute deviations formulation uses variable splitting and can look like: \[\begin{align} \min\> & \sum_{i,j} \left ( y^+_{i,j}+y^-_{i,j}\right) \\ & y^+_{i,j}- y^-_{i,j} = \sum_k a_{i,k} b_{k,j} - x_{i,j}\\ & 0 \le a_{i,k},b_{k,j} \le 1 \\ & y^+_{i,j},y^-_{i,j}\ge 0\end{align} \] These least absolute deviations models do not buy us much w.r.t. performance: we are stuck with the bilinear forms \(a_{i,k} b_{k,j}\).
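
The remark about not duplicating the bilinear expression can be made concrete. A sketch, where the free variable \(r_{i,j}\) is a name chosen here and not part of the original model: \[\begin{align} \min\> & \sum_{i,j} y_{i,j} \\ & r_{i,j} = \sum_k a_{i,k} b_{k,j} - x_{i,j} \\ & -y_{i,j} \le r_{i,j} \le y_{i,j} \\ & 0 \le a_{i,k},b_{k,j} \le 1 \\ & y_{i,j} \ge 0 \end{align}\] The nonlinear expression now appears exactly once per \((i,j)\), and the two bounding inequalities are linear in \(r_{i,j}\) and \(y_{i,j}\). In the variable splitting formulation the same residual is written as \(r_{i,j} = y^+_{i,j} - y^-_{i,j}\). At an optimal solution at most one of \(y^+_{i,j}, y^-_{i,j}\) is positive, so that \(y^+_{i,j} + y^-_{i,j} = |r_{i,j}|\).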

References

[1] Matrix factorization with constraints, https://stackoverflow.com/questions/51924180/matrix-factorization-with-constraints-r-matlab
