Channel: Yet Another Math Programming Consultant

Choosing boxes: set covering model


From [1]:

I have the following problem:

  1. I have N square paper documents with side lengths between 150mm and 860mm. I know each document side's length.
  2. I need to create 3−4 differently sized boxes to fit all the documents, e.g. Three box types: Box 1 side L1=300mm, Box 2 side L2=600mm, Box 3: L3=860mm.
  3. There are as many boxes as documents, i.e. each document goes into its own separate box (of the smallest possible size so as to minimize waste of cardboard).
  4. What is the best way to decide on the size of the boxes, so as to minimize the total amount of (surface area) of cardboard used?

The data looks like:

Original Data (partial view)

We have 1166 records. We see there are items with the same length. These duplicates will always go in the same box type. So we don't have to distinguish them in the optimization model. As a result, we can reorganize the data into unique values and their count:


----    392 PARAMETER data  

             size       count

i1        156.000       1.000
i2        162.000       1.000
i3        168.000       1.000
i4        178.000       2.000
i5        180.000       1.000
i6        185.000       2.000
...
i379      806.000       1.000
i380      820.000       1.000
i381      823.000       1.000
i382      827.000       2.000
i383      855.000       2.000
i384      864.000       1.000

This table has 384 records. A big improvement over 1166 items! This will make our optimization models much smaller. Note that we make sure things are ordered by size (increasing). We will exploit this in one of the models below.
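The aggregation step itself is not shown in the post. A minimal Python sketch of the idea, assuming the raw data is simply a list of side lengths (the sample list `raw` below is hypothetical, chosen to match the first rows of the table):

```python
from collections import Counter

def unique_sizes(lengths):
    """Collapse raw side lengths into (size, count) pairs, sorted by size."""
    return sorted(Counter(lengths).items())

# Hypothetical raw sample, consistent with the first six rows of the table.
raw = [178, 156, 185, 162, 178, 168, 180, 185]
data = unique_sizes(raw)
print(data)
# [(156, 1), (162, 1), (168, 1), (178, 2), (180, 1), (185, 2)]
```

Sorting by size here is what later allows a box type to be described by a contiguous range of items.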

Assignment formulation


It is tempting to start with an assignment variable such as: \[x_{i,j} = \begin{cases} 1 & \text{if item $i$ is assigned to box type $j=1,\dots,T$}\\ 0 & \text{otherwise}\end{cases}\] Here \(T\) is the number of differently sized boxes. Furthermore let \[a_j = \text{area of box $j$}\] This can lead to a model like: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}\min & \sum_{i,j} \> \mathit{count}_i \> a_j \> x_{i,j} \\ & \sum_j x_{i,j} = 1 & \forall i  & \> \> \text{(assignment)} \\ & x_{i,j} = 1 \Rightarrow a_j \ge \mathit{size}_i^2  & &\>\>\text{(fit)} \\ & x_{i,j} \in \{0,1\} \\ & a_j \ge 0 && \>\>\text{(area of box type $j$)}\end{align}}\]

This is problematic: we have a non-convex quadratic objective. This model can be linearized quite easily: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}\min & \sum_{i,j} \> \mathit{count}_i \>  y_{i,j} \\ & a_j \ge \mathit{size}_i^2 \> x_{i,j} \\ & y_{i,j} \ge a_j - \mathit{Amax} (1-x_{i,j}) \\ & a_1 = \mathit{Amax} \\ & a_{j+1} \le a_j - 1 \end{align}}\] where \(\mathit{Amax}=864^2\).

This model is very difficult to solve. After 20 minutes it still had a gap of 30%.
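To see why the linearization reproduces the product \(a_j x_{i,j}\), a short case analysis helps. It assumes \(y_{i,j} \ge 0\), which the model above does not state explicitly, and uses \(a_j \le \mathit{Amax}\), which follows from \(a_1 = \mathit{Amax}\) and the ordering constraints: \[y_{i,j} \ge a_j - \mathit{Amax}(1-x_{i,j}) = \begin{cases} a_j & \text{if $x_{i,j}=1$} \\ a_j - \mathit{Amax} \le 0 & \text{if $x_{i,j}=0$}\end{cases}\] Because the objective minimizes \(\sum_{i,j} \mathit{count}_i \> y_{i,j}\) with positive counts, each \(y_{i,j}\) is pushed down to its bound: \(y_{i,j}=a_j\) when the item is assigned to box type \(j\), and \(y_{i,j}=0\) otherwise.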

Covering formulation


This model is organized around the variable \[x_{i,j} = \begin{cases} 1 & \text{if items $i$ through $j$ are put in the same size box (with side length $\mathit{size}_j$)} \\ 0 & \text{otherwise}\end{cases}\] Here we see why the ordering is useful: if \(x_{i,j}=1\) then all items \(i,i+1,\dots,j-1,j\) are placed in the same size box.

The basic idea is to cover each item exactly once by a variable \(x_{i,j}\). In addition we want exactly \(T\) variables that have a value of one. Here is how we organize our variables \(x_{i,j}\):

Variables x(i,j)

A feasible solution has two properties:

  • Each item (column) selects exactly one variable (row),
  • The total number of selected variables is \(T\). 
A feasible solution can look like:

Feasible solution for T=3
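The two feasibility properties can be checked mechanically. The following is a small Python sketch, not part of the original model; the helper name `is_feasible` is made up for illustration, and the intervals are the ones reported for \(T=3\) further down in this post:

```python
def is_feasible(selected, n, T):
    """selected: list of (i, j) pairs, 1-based and inclusive, meaning x[i,j] = 1."""
    if len(selected) != T:              # exactly T variables are selected
        return False
    cover = [0] * (n + 1)               # cover[k] = number of selected ranges containing k
    for i, j in selected:
        if not (1 <= i <= j <= n):
            return False
        for k in range(i, j + 1):
            cover[k] += 1
    return all(c == 1 for c in cover[1:])   # each item covered exactly once

reported_t3 = [(1, 155), (156, 284), (285, 384)]
print(is_feasible(reported_t3, n=384, T=3))             # True
print(is_feasible([(1, 200), (156, 384)], n=384, T=2))  # False: items 156..200 covered twice
```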

The complete model is surprisingly simple: \[\bbox[lightcyan,10px,border:3px solid darkblue] {\begin{align}\min & \sum_{i,j} c_{i,j} x_{i,j}\\ & \sum_{i \le k \le j} x_{i,j} = 1 & \forall k\\ & \sum_{i\le j} x_{i,j} = T \\ & x_{i,j} \in \{0,1\}\end{align}}\] The cost coefficients can be calculated as: \[c_{i,j} = \sum_{i\le k\le j} \mathit{count}_k \>\mathit{size}_j^2 \>\>\>\>\forall i\le j\] This is like a set covering problem or, more precisely, a set partitioning problem, since each item must be covered exactly once.
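As an illustration of the cost formula, here is a minimal Python sketch that computes \(c_{i,j}\) for the first six records of the data table. The use of prefix sums is an implementation choice made here, not something the post prescribes; indices are 0-based in the code:

```python
from itertools import accumulate

def cost_matrix(size, count):
    """c[(i, j)] = (sum of count[k] for i <= k <= j) * size[j]**2, for all i <= j."""
    prefix = [0] + list(accumulate(count))   # prefix[m] = count[0] + ... + count[m-1]
    n = len(size)
    return {(i, j): (prefix[j + 1] - prefix[i]) * size[j] ** 2
            for i in range(n) for j in range(i, n)}

size = [156, 162, 168, 178, 180, 185]   # first six rows of the data table
count = [1, 1, 1, 2, 1, 2]
c = cost_matrix(size, count)
print(c[(0, 0)])   # 24336  = 1 * 156^2
print(c[(0, 2)])   # 84672  = 3 * 168^2
print(c[(3, 5)])   # 171125 = 5 * 185^2
```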

This leads to a large but easy-to-solve MIP model. For \(T=4\) I see:


MODEL STATISTICS

BLOCKS OF EQUATIONS           3     SINGLE EQUATIONS          386
BLOCKS OF VARIABLES           2     SINGLE VARIABLES       73,921
NON ZERO ELEMENTS     9,658,881     DISCRETE VARIABLES     73,920
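These counts can be reproduced by hand from the model structure. The Python sketch below assumes the equation and variable names used in the GAMS code later in this post (`cover`, `numboxes`, `objective`, `totalarea`):

```python
n = 384                                  # number of unique sizes

pairs = n * (n + 1) // 2                 # binary variables x(i,j) with i <= j
variables = pairs + 1                    # plus the objective variable totalarea
equations = n + 2                        # cover(k) for each k, numboxes, objective

# x(i,j) appears in j-i+1 cover rows; summing over all i <= j gives n(n+1)(n+2)/6
cover_nz = n * (n + 1) * (n + 2) // 6
numboxes_nz = pairs                      # every x(i,j) once
objective_nz = pairs + 1                 # every x(i,j) plus totalarea
nonzeros = cover_nz + numboxes_nz + objective_nz

print(pairs, variables, equations, nonzeros)
# 73920 73921 386 9658881
```

Almost all nonzero elements come from the cover equations, which is why the matrix is dense relative to its number of rows.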

This looks scary, but actually solves very fast:


Root relaxation solution time = 6.73 sec. (3375.50 ticks)

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

*     0+    0                       7.96958e+08        0.0000           100.00%
Found incumbent of value 7.9695838e+08 after 102.45 sec. (46615.58 ticks)
*     0     0      integral     0   3.29199e+08   3.29199e+08     2111    0.00%
Elapsed time = 102.56 sec. (46669.97 ticks, tree = 0.00 MB, solutions = 2)
Found incumbent of value 3.2919912e+08 after 102.56 sec. (46669.97 ticks)

The results look like:


----    447 PARAMETER results  

                 count     minsize     maxsize        area    sum area

i1  .i155      549.000     156.000     388.000  150544.000 8.264866E+7
i156.i268      377.000     389.000     550.000  302500.000 1.140425E+8
i269.i351      187.000     552.000     705.000  497025.000 9.294368E+7
i352.i384       53.000     710.000     864.000  746496.000 3.956429E+7
total.        1166.000                                     3.291991E+8

This means we put items 1 through 155 in the smallest box with size 388 (area = \(388^2\)), etc.

When we run the model with \(T=3\) we see:


----    447 PARAMETER results  

                 count     minsize     maxsize        area    sum area

i1  .i155      549.000     156.000     388.000  150544.000 8.264866E+7
i156.i284      423.000     389.000     574.000  329476.000 1.393683E+8
i285.i384      194.000     576.000     864.000  746496.000 1.448202E+8
total.        1166.000                                     3.668372E+8

Indeed the total cost (area) goes up. Interestingly the smallest box size stays the same.
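The totals in both results tables can be recomputed from the count and maxsize columns alone. A small Python check (the numbers are copied from the tables; the helper `total_area` is just for illustration):

```python
# (count, box side) per box type, copied from the two results tables
t4 = [(549, 388), (377, 550), (187, 705), (53, 864)]
t3 = [(549, 388), (423, 574), (194, 864)]

def total_area(boxes):
    """Total cardboard area: documents per box type times box side squared."""
    return sum(cnt * side ** 2 for cnt, side in boxes)

print(total_area(t4))   # 329199119
print(total_area(t3))   # 366837228
print(round(100 * (total_area(t3) / total_area(t4) - 1), 1))   # 11.4
```

So dropping from four to three box sizes costs roughly 11% more cardboard on this data set.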

GAMS implementation


set i /i1*i384/;
alias(i,j,k);

set
  ij(i,j)     'allowed i,j combinations: i<=j'
  ijk(i,j,k)  'allowed i,j,k combinations: i<=k<=j'
;
ij(i,j) = ord(i)<=ord(j);
ijk(ij(i,j),k) = ord(k)>=ord(i) and ord(k)<=ord(j);


table data(i,*)
           size    count
i1          156        1
i2          162        1
i3          168        1
i4          178        2
. . .
i381        823        1
i382        827        2
i383        855        2
i384        864        1
;

parameter count(i), size(i);
count(i) = data(i,'count');
size(i) = data(i,'size');

scalar T 'number of different box sizes'/4/;

binary variables x(i,j)  'items i-j in a single box type';
variable totalarea;

parameter c(i,j) 'cost when items i-j are in box with size size(j)';
c(ij(i,j)) = sum(ijk(ij,k), count(k)*sqr(size(j)));


equations
   cover(k)   'cover each item k exactly once'
   numboxes   'T different box sizes'
   objective  'minimize total area'
;

cover(k)..   sum(ijk(ij,k), x(ij)) =e= 1;
numboxes..   sum(ij, x(ij)) =e= T;
objective..  totalarea =e= sum(ij, c(ij)*x(ij));

model m /all/;
option optcr=0;
solve m minimizing totalarea using mip;

set xij(i,j) 'selected x(i,j)';
xij(i,j) = x.l(i,j) > 0.5;

parameter results(*,*,*);
results(xij(i,j),'minsize') = size(i);
results(xij(i,j),'maxsize') = size(j);
results(xij(i,j),'area') = sqr(size(j));
results(xij,'count') = sum(ijk(xij,k),count(k));
results(xij,'sum area') = c(xij);
results('total','','sum area') = totalarea.l;
results('total','','count') = sum(ijk(xij,k),count(k));
display results;



Notes:

  • Intermediate sets are used to simplify things. I use this approach a lot: sets are easier to debug than equations.
  • Reporting is important. I try to collect meaningful information in the 3d parameter results. Hopefully the results can be interpreted even when not knowing the model or even when not knowing GAMS.
  • The set xij(i,j) will indicate where we have x.l(i,j)=1. In practice, values for a binary variable can be 0.99999 or 1.00001. So I don't test against 1.0 exactly.
  • I left out the data step where I combine duplicate item sizes. That was done in a separate piece of GAMS code.
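The point about not testing binary levels against 1.0 exactly can be illustrated outside GAMS. In this Python sketch the dictionary `x_level` is hypothetical solver output, invented only to show the effect of integrality tolerances:

```python
# Hypothetical solution levels: binaries are only integral up to a tolerance.
x_level = {(1, 155): 0.99999, (1, 156): 0.0, (156, 268): 1.00001, (157, 268): 1e-9}

exact = {ij for ij, v in x_level.items() if v == 1.0}    # fragile test
robust = {ij for ij, v in x_level.items() if v > 0.5}    # same idea as x.l(i,j) > 0.5

print(sorted(exact))    # []
print(sorted(robust))   # [(1, 155), (156, 268)]
```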

Network model


This problem can also be modeled as a network problem. See [2] for more details.
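The details are in [2]. As a rough illustration only, one way to read the network view is as a shortest path with exactly \(T\) arcs over nodes \(0,\dots,n\), where arc \((i,j)\) means "items \(i+1\) through \(j\) share one box type". The Python sketch below solves that by dynamic programming. It is an assumption about what such a model could look like, not the formulation from [2], and the function name `best_boxes` is made up:

```python
from itertools import accumulate

def best_boxes(size, count, T):
    """Return (total_area, box_sides) using exactly T box sizes.

    size must be sorted increasing; count[k] is the multiplicity of size[k].
    """
    n = len(size)
    prefix = [0] + list(accumulate(count))
    INF = float("inf")
    # best[t][j] = minimum area for the first j sizes using exactly t box types
    best = [[INF] * (n + 1) for _ in range(T + 1)]
    prev = [[-1] * (n + 1) for _ in range(T + 1)]
    best[0][0] = 0
    for t in range(1, T + 1):
        for j in range(t, n + 1):
            for i in range(t - 1, j):
                if best[t - 1][i] == INF:
                    continue
                # arc (i, j): sizes i..j-1 (0-based) go into a box of side size[j-1]
                cand = best[t - 1][i] + (prefix[j] - prefix[i]) * size[j - 1] ** 2
                if cand < best[t][j]:
                    best[t][j] = cand
                    prev[t][j] = i
    # walk back from node n to recover the chosen box sides
    sides, j = [], n
    for t in range(T, 0, -1):
        sides.append(size[j - 1])
        j = prev[t][j]
    return best[T][n], sides[::-1]

size = [156, 162, 168, 178, 180, 185]   # first six rows of the data table
count = [1, 1, 1, 2, 1, 2]
print(best_boxes(size, count, 2))   # (255797, [168, 185])
```

The triple loop is cubic in the number of unique sizes per box type, so this sketch is meant for small instances; it has not been run against the full 384-record data set here.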

References


  1. Need to create 3−4 different box sizes and to minimize material waste for a set of n objects that need to fit into these boxes, https://math.stackexchange.com/questions/2843990/need-to-create-3-4-different-box-sizes-and-to-minimize-material-waste-for-a-se/2850260
  2. On the scheduling of reading book chapters, https://yetanothermathprogrammingconsultant.blogspot.com/2018/02/on-scheduling-of-reading-book-chapters.html
