Quantcast
Channel: Yet Another Math Programming Consultant
Viewing all articles
Browse latest Browse all 804

Another fast MIP model: covering

$
0
0

In [1], the following problem is stated:

  • There is a collection of \(n=1,000\) test questions.
  • Each question covers a number of skills.
  • Given is a requirement for a number of questions for each required skill (e.g., 4 questions about skill 1, 3 questions about skill 2, etc.).
  • Create a test with the minimum number of questions that fulfills the requirements.
Of course, we see some remarks about NP-Hardness. Complexity theory is about bounds, about worst-case and asymptotic behavior. Even if we can't prove that a particular problem with a particular dataset given to a particular solver must be fast, it does not mean the opposite: that it must be slow. It is not at all unusual that we can solve things very quickly, even though the complexity bounds are terrible. Of course, this depends on the model, on the data set, and on the solver. Here, we see an open-source MIP solver can solve this particular problem in less than 0.01 seconds. Many well-educated people misunderstand complexity theory – to their detriment. They miss out on great ways to solve actual problems! Obviously, this is a problem with how complexity theory is taught. 

Ok, here we go (ignoring the NP-hard doomsayers). We don't have data, so the first thing to do is to invent some. I assume a library of \(n=1,000\) questions and some 25 possible skills. So we have the sets \(q\) and \(s\). A 2-dimensional set \(\color{darkblue}qs_{q,s}\) indicates which skills are covered by which questions. Also, we need to invent some requirements.

A partial view of the randomly-generated sparse question-skill data is:

----     11 SET qsskills covered by questions

skill_1 skill_2 skill_3 skill_4 skill_5 skill_6 skill_7 skill_8 skill_9

question3 YES
question4 YES
question5 YES
question8 YES YES
question9 YES
question11 YES
question14 YES YES
question20 YES YES
. . .


while the requirements are:

----     11 PARAMETER arassessment requirements

skill_2 3.000, skill_6 10.000, skill_14 6.000, skill_17 4.000, skill_21 7.000, skill_22 2.000
skill_24 3.000

The mathematical model can look like:

Mathematical Model
\[ \begin{align} \text{Sets} \\ & q \text{: questions} \\ & s \text{: skills} \\ & \color{darkblue}qs_{q,s}\text{: skill is covered by question} \\ & \hline \\ \text{Data} \\ & \color{darkblue}{\mathit{ar}}_s \text{: assessment requirements}\\ & \hline \\ \text{Variables} \\ & \color{darkred}{\mathit numq} \text{: number of questions needed} \\ & \color{darkred}{\mathit question}_{s} = \begin{cases}1&\text{if question $s$ is used}\\ 0&\text{otherwise}\end{cases} \\ & \hline \\ \text{Model} \\ \min\>& \color{darkred}{\mathit numq} = \sum_{q} \color{darkred}{\mathit question}_{q} \\ & \sum_{q|\color{darkblue}qs_{q,s}}\color{darkred}{\mathit question}_{q} \ge \color{darkblue}{\mathit{ar}}_s && \forall s|\color{darkblue}{\mathit{ar}}_s\gt 0 \\ & \color{darkred}{\mathit question}_s \in \{0,1\} \end{align}\]

This small model solves very quickly. Using the open-source solver CBC, I see:

 
COIN-OR CBC      42.2.0 ef14ea53 Feb 16, 2023          WEI x86 64bit/MS Window

COIN-OR Branch and Cut (CBC Library 2.10.8)
written by J. Forrest

Calling CBC main solution routine...
Integer solution of 19 found by feasibility pump after 0 iterations and 0 nodes (0.00 seconds)
Search completed - best objective 19, took 0 iterations and 0 nodes (0.00 seconds)
Maximum depth 0, 0 variables fixed on reduced cost

Solved to optimality (within gap tolerances optca and optcr).
MIP solution: 1.900000e+01 (0 nodes, 0.006 seconds)

Best possible: 1.900000e+01
Absolute gap: 0.000000e+00 (absolute tolerance optca: 0)
Relative gap: 0.000000e+00 (relative tolerance optcr: 0.0001)

Resolve with fixed discrete variables.
Optimal - objective value 19

The solution time was 0.006 seconds, and the total turn-around time was less than 0.1 seconds. The solution looks like:

 
----     27 VARIABLE numq.L                =       19.000number of questions needed

---- 27 VARIABLE question.L 1 if question is included in test

question9 1.000, question44 1.000, question76 1.000, question90 1.000, question149 1.000, question239 1.000
question347 1.000, question416 1.000, question446 1.000, question484 1.000, question506 1.000, question540 1.000
question579 1.000, question600 1.000, question622 1.000, question633 1.000, question771 1.000, question873 1.000
question923 1.000

A more complete solution report can look like:

----     61 PARAMETER solution  

skill_2 skill_3 skill_5 skill_6 skill_8 skill_9 skill_11 skill_12 skill_13

question9 1.000
question44 1.0001.000
question90 1.000
question149 1.000
question239 1.0001.000
question347 1.000
question416 1.0001.0001.000
question446 1.000
question540 1.000
question579 1.0001.0001.000
question600 1.000
question622 1.0001.000
question633 1.0001.000
question873 1.000
question923 1.000
total 3.0001.0002.00010.0001.0001.0003.0001.0001.000
required 3.00010.000

+ skill_14 skill_15 skill_17 skill_19 skill_20 skill_21 skill_22 skill_24

question9 1.0001.000
question44 1.000
question76 1.0001.000
question90 1.000
question149 1.0001.000
question239 1.000
question446 1.0001.000
question484 1.0001.0001.000
question506 1.0001.000
question540 1.000
question622 1.000
question771 1.0001.000
question873 1.0001.0001.000
question923 1.0001.000
total 6.0001.0004.0001.0001.0007.0002.0003.000
required 6.0004.0007.0002.0003.000


These types of covering models often solve very fast. There is no good reason to not consider a MIP solution for a problem like this. 

Note: increasing the size of the problem to several thousand questions, does not make the problem much more difficult to solve. An advantage of using random data is that we can trivially change the size of the problem.

References


  1. Dynamic programming solution for assigning least number of questions that satisfy skills, https://stackoverflow.com/questions/75488358/dynamic-programming-solution-for-assigning-least-number-of-questions-that-satisf


Appendix: GAMS model

$ontext

 

   Design a test with minimum number of questions such that

   skill coverage is obeyed.

  

   Reference:

   https://stackoverflow.com/questions/75488358/dynamic-programming-solution-for-assigning-least-number-of-questions-that-satisf

  

   This is an easy set covering type of problem. They often solve very fast.

 

$offtext

 

 

*------------------------------------------------------------------

* random data set

*------------------------------------------------------------------

 

set

   'questions' /question1*question1000/

   'skills'    /skill_1*skill_25/

   qs(q,s'skills covered by questions'

;

qs(q,s) = uniform(0,1) < 0.04;

 

Parameter ar(s) 'assessment requirements';

ar(s)$(uniform(0,1)<0.2) = uniformInt(1,10);

display q,s,qs,ar;

 

 

*------------------------------------------------------------------

* model

*------------------------------------------------------------------

 

binary variable question(q) '1 if question is included in test';

variable numq 'number of questions needed';

 

equations

   cover(s) 'required number of questions with skill s'

   count    'number of questions in test'

;

 

* skip if ar(s)=0

cover(s)$ar(s).. sum(qs(q,s), question(q)) =g= ar(s);

count.. numq =e= sum(q,question(q));

 

model m /all/;

option mip=cbc;

solve m minimizing numq using mip;

 

display numq.lquestion.l;

 

 

*------------------------------------------------------------------

* solution report

*------------------------------------------------------------------

 

parameter solution(*,*);

solution(qs(q,s)) = question.l(q)>0.5;

solution('total',s) = sum(qs(q,s)$(question.l(q)>0.5),1);

solution('required',s) = ar(s);

display solution;

 

 


Viewing all articles
Browse latest Browse all 804

Trending Articles