Choosing boxes: set covering model


From [1]:

I have the following problem:

I have N square paper documents with side lengths between 150mm and 860mm. I know each document side's length.
I need to create 3−4 differently sized boxes to fit all the documents, e.g. Three box types: Box 1 side L1=300mm, Box 2 side L2=600mm, Box 3: L3=860mm.
There are as many boxes as documents, i.e. each document goes into its own separate box (of the smallest possible size so as to minimize waste of cardboard).
What is the best way to decide on the size of the boxes, so as to minimize the total amount of (surface area) of cardboard used?

The data looks like:

Original Data (partial view)

We have 1166 records. We see there are items with the same length. These duplicates will always go in the same box type. So we don't have to distinguish them in the optimization model. As a result, we can reorganize the data into unique values and their count:

----    392 PARAMETER data  

            size       count

i1       156.000       1.000
i2       162.000       1.000
i3       168.000       1.000
i4       178.000       2.000
i5       180.000       1.000
i6       185.000       2.000
...
i379     806.000       1.000
i380     820.000       1.000
i381     823.000       1.000
i382     827.000       2.000
i383     855.000       2.000
i384     864.000       1.000

This table has 384 records. A big improvement over 1166 items! This will make our optimization models much smaller. Note that we make sure things are ordered by size (increasing). We will exploit this in one of the models below.
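The data step that merges duplicates is not shown in this post. As a rough illustration only, here is a minimal Python sketch of the same idea (the function name `unique_sizes` and the sample list are hypothetical; the sample is chosen so that it reproduces the first six rows of the table above):

```python
from collections import Counter

def unique_sizes(lengths):
    """Collapse raw side lengths into (size, count) pairs, sorted by increasing size."""
    counts = Counter(lengths)
    return sorted(counts.items())

# Hypothetical unsorted sample, not the real 1166-record data set.
raw = [178, 156, 162, 178, 168, 185, 180, 185]
data = unique_sizes(raw)
for size, count in data:
    print(size, count)
```

Sorting matters as much as counting: the covering formulation below relies on the items being in increasing size order.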

Assignment formulation

It is tempting to start with an assignment variable such as: \[x_{i,j} = \begin{cases} 1 & \text{if item $i$ is assigned to box type $j=1,\dots,T$}\\ 0 & \text{otherwise}\end{cases}\] Here $T$ is the number of differently sized boxes. Furthermore let \[a_j = \text{area of box $j$}\]

This can lead to a model like: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}\min & \sum_{i,j} \> \mathit{count}_i \> a_j \> x_{i,j} \\ & \sum_j x_{i,j} = 1 & \forall i & \> \> \text{(assignment)} \\ & x_{i,j} = 1 \Rightarrow a_j \ge \mathit{size}_i^2 & &\>\>\text{(fit)} \\ & x_{i,j} \in \{0,1\} \\ & a_j \ge 0 && \>\>\text{(area of box type $j$)}\end{align}}\]

This is problematic: we have a non-convex quadratic objective. This model can be linearized quite easily: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}\min & \sum_{i,j} \> \mathit{count}_i \> y_{i,j} \\ & a_j \ge \mathit{size}_i^2 \> x_{i,j} \\ & y_{i,j} \ge a_j - \mathit{Amax} (1-x_{i,j}) \\ & a_1 = \mathit{Amax} \\ & a_{j+1} \le a_j - 1 \end{align}}\] where $\mathit{Amax}=864^2$.
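To see why the linearization reproduces the product $a_j x_{i,j}$, it may help to spell out the two cases of the big-$M$ constraint on $y_{i,j}$. This sketch assumes $y_{i,j} \ge 0$, which the model above does not state explicitly: \[y_{i,j} \ge a_j - \mathit{Amax}\,(1-x_{i,j}) \>\Longrightarrow\> \begin{cases} y_{i,j} \ge a_j & \text{if $x_{i,j}=1$}\\ y_{i,j} \ge a_j - \mathit{Amax} & \text{if $x_{i,j}=0$}\end{cases}\] Because $a_1 = \mathit{Amax}$ and $a_{j+1} \le a_j - 1$, every $a_j \le \mathit{Amax}$, so the second case has a non-positive right-hand side and is not binding. The objective minimizes $y_{i,j}$ with positive weights $\mathit{count}_i$, so at an optimum we would get $y_{i,j} = a_j$ when $x_{i,j}=1$ and $y_{i,j}=0$ otherwise.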

This model is very difficult to solve. After 20 minutes it still had a gap of 30%.

Covering formulation

This model is organized around the variable \[x_{i,j} = \begin{cases} 1 & \text{if items $i$ through $j$ are put in the same size box (sized to fit item $j$)} \\ 0 & \text{otherwise}\end{cases}\] Here we see why the ordering is useful: if $x_{i,j}=1$ then all items $i,i+1,\dots,j-1,j$ are placed in the same size box, and item $j$, being the largest of them, determines the box size.

The basic idea is to cover each item exactly once by a variable $x_{i,j}$. In addition we want exactly $T$ variables that have a value of one. Here is how we organize our variables $x_{i,j}$:

Variables x(i,j)

A feasible solution has two properties:

Each item (column) selects exactly one variable (row),
The total number of selected variables is $T$.

A feasible solution can look like:

Feasible solution for T=3

The complete model is surprisingly simple: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}\min & \sum_{i,j} c_{i,j} x_{i,j}\\ & \sum_{i \le k \le j} x_{i,j} = 1 & \forall k\\ & \sum_{i\le j} x_{i,j} = T \\ & x_{i,j} \in \{0,1\}\end{align}}\]

The cost coefficients can be calculated as: \[c_{i,j} = \sum_{i\le k\le j} \mathit{count}_k \>\mathit{size}_j^2 \>\>\>\>\forall i\le j\]

This is like a set covering or, better, a set partitioning problem.
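The cost coefficients are just a range sum of counts times the area of the largest item in the range. A minimal Python sketch of that calculation, using prefix sums (the function name `cost_matrix`, the 0-based indexing and the six-item sample taken from the first rows of the data table are choices made for this illustration):

```python
from itertools import accumulate

def cost_matrix(sizes, counts):
    """Return c[i, j] = (counts[i] + ... + counts[j]) * sizes[j]**2 for all i <= j (0-based)."""
    n = len(sizes)
    # prefix[k] = counts[0] + ... + counts[k-1], so a range sum is one subtraction
    prefix = [0] + list(accumulate(counts))
    c = {}
    for j in range(n):
        area = sizes[j] ** 2          # the box is sized for the largest item j
        for i in range(j + 1):
            c[i, j] = (prefix[j + 1] - prefix[i]) * area
    return c

sizes = [156, 162, 168, 178, 180, 185]
counts = [1, 1, 1, 2, 1, 2]
c = cost_matrix(sizes, counts)
print(len(c))     # number of (i, j) pairs with i <= j
print(c[0, 2])    # first three items in a box of side 168
```

With $n$ items there are $n(n+1)/2$ such pairs. For $n=384$ that is 73,920, which is the number of binary variables in the model.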

This leads to a large but easy-to-solve MIP model. For $T=4$ I see:

MODEL STATISTICS

BLOCKS OF EQUATIONS           3     SINGLE EQUATIONS          386
BLOCKS OF VARIABLES           2     SINGLE VARIABLES       73,921
NON ZERO ELEMENTS     9,658,881     DISCRETE VARIABLES     73,920

This looks scary, but actually solves very fast:

Root relaxation solution time = 6.73 sec. (3375.50 ticks)

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

*     0+    0                       7.96958e+08        0.0000           100.00%
Found incumbent of value 7.9695838e+08 after 102.45 sec. (46615.58 ticks)
*     0     0      integral     0   3.29199e+08   3.29199e+08     2111    0.00%
Elapsed time = 102.56 sec. (46669.97 ticks, tree = 0.00 MB, solutions = 2)
Found incumbent of value 3.2919912e+08 after 102.56 sec. (46669.97 ticks)

The results look like:

----    447 PARAMETER results  

                 count     minsize     maxsize        area    sum area

i1   .i155     549.000     156.000     388.000  150544.000 8.264866E+7
i156 .i268     377.000     389.000     550.000  302500.000 1.140425E+8
i269 .i351     187.000     552.000     705.000  497025.000 9.294368E+7
i352 .i384      53.000     710.000     864.000  746496.000 3.956429E+7
total.        1166.000                                     3.291991E+8

This means we put items 1 through 155 in the smallest box with size 388 (area = $388^2$), etc.
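As a sanity check, the totals for $T=4$ can be recomputed by hand from the count and maxsize columns. A small Python sketch of that arithmetic (the variable names are just for illustration; the numbers are copied from the results table):

```python
# (count, maxsize) per box type, copied from the T=4 results table
boxes = [(549, 388), (377, 550), (187, 705), (53, 864)]

# each document in a group uses one box with area maxsize^2
areas = [count * side ** 2 for count, side in boxes]
total = sum(areas)

print(areas)
print(total)
```

The total should agree with the incumbent value 3.2919912e+08 reported in the solver log.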

When we run the model with $T=3$ we see:

----    447 PARAMETER results  

                 count     minsize     maxsize        area    sum area

i1   .i155     549.000     156.000     388.000  150544.000 8.264866E+7
i156 .i284     423.000     389.000     574.000  329476.000 1.393683E+8
i285 .i384     194.000     576.000     864.000  746496.000 1.448202E+8
total.        1166.000                                     3.668372E+8

Indeed the total cost (area) goes up. Interestingly the smallest box size stays the same.

GAMS implementation

set i /i1*i384/;
alias(i,j,k);

set
ij(i,j)      'allowed i,j combinations: i<=j'
ijk(i,j,k)   'allowed i,j,k combinations: i<=k<=j'
;
ij(i,j) = ord(i)<=ord(j);
ijk(ij(i,j),k) = ord(k)>=ord(i) and ord(k)<=ord(j);

table data(i,*)
           size      count
i1          156        1
i2          162        1
i3          168        1
i4          178        2

. . .

i381        823        1
i382        827        2
i383        855        2
i384        864        1
;

parameter count(i), size(i);
count(i) = data(i,'count');
size(i) = data(i,'size');

scalar T 'number of different box sizes'/4/;

binary variables x(i,j) 'items i-j in a single box type';
variable totalarea;

parameter c(i,j) 'cost when items i-j are in box with size size(j)';
c(ij(i,j)) = sum(ijk(ij,k), count(k)*sqr(size(j)));

equations
   cover(k)   'cover each item k exactly once'
   numboxes   'T different box sizes'
   objective 'minimize total area'
;

cover(k).. sum(ijk(ij,k),x(ij)) =e= 1;
numboxes.. sum(ij,x(ij)) =e= T;
objective.. totalarea =e= sum(ij, c(ij)*x(ij));

model m /all/;
option optcr=0;
solve m minimizing totalarea using mip;

set xij(i,j) 'selected x(i,j)';
xij(i,j) = x.l(i,j) > 0.5;

parameter results(*,*,*);
results(xij(i,j),'minsize') = size(i);
results(xij(i,j),'maxsize') = size(j);
results(xij(i,j),'area') = sqr(size(j));
results(xij,'count') = sum(ijk(xij,k),count(k));
results(xij,'sum area') = c(xij);
results('total','','sum area') = totalarea.l;
results('total','','count') = sum(ijk(xij,k),count(k));
display results;

Notes:

Intermediate sets are used to simplify things. I use this approach a lot: sets are easier to debug than equations.
Reporting is important. I try to collect meaningful information in the 3d parameter results. Hopefully the results can be interpreted even when not knowing the model or even when not knowing GAMS.
The set xij(i,j) will indicate where we have x.l(i,j)=1. In practice, values for a binary variable can be 0.99999 or 1.00001. So I don't test against 1.0 exactly.
I left out the data step where I combine duplicate item sizes. That was done in a separate piece of GAMS code.
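The remark about testing against 0.5 instead of 1.0 is easy to demonstrate outside GAMS. In this minimal Python sketch the level values are hypothetical, made up to mimic the small integrality errors a MIP solver can return, and `selected` is just an illustrative helper:

```python
def selected(levels, threshold=0.5):
    """Return the keys whose solution value is closer to one than to zero."""
    return sorted(key for key, value in levels.items() if value > threshold)

# Hypothetical solver output: binaries are only integer up to a tolerance.
levels = {
    ("i1", "i155"): 0.99999,
    ("i1", "i156"): 0.00001,
    ("i156", "i268"): 1.00001,
}

chosen = selected(levels)
exact = [key for key, value in levels.items() if value == 1.0]
print(chosen)   # the two ranges that are "on"
print(exact)    # an exact comparison misses both of them
```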

Network model

This problem can also be modeled as a network problem. See [2] for more details.
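One way to picture such a network: nodes $0,\dots,n$, an arc from $i-1$ to $j$ with cost $c_{i,j}$, and a cheapest path from $0$ to $n$ that uses exactly $T$ arcs. The Python sketch below solves that path problem by dynamic programming. It is an illustration under this assumption, not the formulation from [2]; the function name `best_boxes` and the six-item sample are made up for the example, and it requires $T \le n$.

```python
from itertools import accumulate

def best_boxes(sizes, counts, T):
    """Minimum total area with exactly T box sizes; sizes must be sorted increasingly."""
    n = len(sizes)
    prefix = [0] + list(accumulate(counts))
    INF = float("inf")
    # best[t][j]: cheapest way to cover the first j items with exactly t box sizes
    best = [[INF] * (n + 1) for _ in range(T + 1)]
    prev = [[0] * (n + 1) for _ in range(T + 1)]
    best[0][0] = 0
    for t in range(1, T + 1):
        for j in range(t, n + 1):
            area = sizes[j - 1] ** 2
            for i in range(t - 1, j):   # items i+1..j share the box sized for item j
                cand = best[t - 1][i] + (prefix[j] - prefix[i]) * area
                if cand < best[t][j]:
                    best[t][j], prev[t][j] = cand, i
    # walk back through the predecessors to recover the chosen box sides
    sides, j = [], n
    for t in range(T, 0, -1):
        sides.append(sizes[j - 1])
        j = prev[t][j]
    return best[T][n], sides[::-1]

sizes = [156, 162, 168, 178, 180, 185]
counts = [1, 1, 1, 2, 1, 2]
total, sides = best_boxes(sizes, counts, 2)
print(total, sides)
```

The triple loop does on the order of $T n^2$ steps, so this stays cheap even for the full 384 unique sizes.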

References

[1] Need to create 3−4 different box sizes and to minimize material waste for a set of n objects that need to fit into these boxes, https://math.stackexchange.com/questions/2843990/need-to-create-3-4-different-box-sizes-and-to-minimize-material-waste-for-a-se/2850260
[2] On the scheduling of reading book chapters, https://yetanothermathprogrammingconsultant.blogspot.com/2018/02/on-scheduling-of-reading-book-chapters.html
