Channel: Yet Another Math Programming Consultant

Dump solution data into a MySQL database II


This is a follow-up on this post. For large datasets, individual inserts are not as efficient as bulk operations. In MySQL, bulk inserts can be done with the LOAD DATA LOCAL INFILE command. This statement takes a local (i.e., client-side) text file, copies it to the server, and then bulk-inserts the whole thing. This approach is built into the tool gdx2mysql: if a symbol has more than N records (N=500 by default), we write a text file and call LOAD DATA LOCAL INFILE; if a symbol has fewer records, we just use a standard prepared insert statement. A verbose log shows what happens.
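The threshold logic can be sketched as follows. This is a minimal Python illustration of the idea, not the actual gdx2mysql source: the function names, the tab-separated file format, and the statement templates are assumptions chosen to mirror the verbose log shown below.

```python
import csv
import tempfile

BULK_THRESHOLD = 500  # stand-in for gdx2mysql's default N


def write_bulk_file(rows):
    """Write rows to a tab-separated temp file, the default format
    that LOAD DATA expects."""
    f = tempfile.NamedTemporaryFile(mode="w", suffix=".tmp",
                                    delete=False, newline="")
    csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)
    f.close()
    return f.name


def insert_plan(table, rows, threshold=BULK_THRESHOLD):
    """Choose between a prepared insert statement (few records) and a
    bulk LOAD DATA LOCAL INFILE via an intermediate file (many records)."""
    if len(rows) <= threshold:
        ncols = len(rows[0])
        sql = "insert into %s values (%s)" % (table, ",".join("?" * ncols))
        return ("prepared", sql)
    path = write_bulk_file(rows)
    # Backslashes in Windows paths must be doubled inside the SQL literal.
    sql = "load data local infile '%s' into table %s" % (
        path.replace("\\", "\\\\"), table)
    return ("bulk", sql)
```

A small symbol like the 100-element set below would take the "prepared" branch; the million-record parameter would take the "bulk" branch.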

set i /i1*i100/;
alias (i,j,k);
parameter a(i,j,k);
a(i,j,k) = uniform(0,100);
execute_unload "test",i,a;
execute "gdx2mysql -i test.gdx -s tmp -u test -p test -v";

In the above test model we generate a set with 100 elements and a parameter with 100³ = 1,000,000 elements. The verbose log looks like:

--- Job Untitled_1.gms Start 04/22/16 04:01:39 24.6.1 r55820 WEX-WEI x86 64bit/MS Windows
GAMS 24.6.1   Copyright (C) 1987-2016 GAMS Development. All rights reserved
Licensee: Erwin Kalvelagen                               G150803/0001CV-GEN
          Amsterdam Optimization Modeling Group                     DC10455
--- Starting compilation
--- Untitled_1.gms(6) 3 Mb
--- Starting execution: elapsed 0:00:00.007
--- Untitled_1.gms(5) 36 Mb
--- GDX File C:\Users\Erwin\Documents\Embarcadero\Studio\Projects\gdx2mysql\Win32\Debug\test.gdx
--- Untitled_1.gms(6) 36 Mb
GDX2MySQL v 0.1
Copyright (c) 2015-2016 Amsterdam Optimization Modeling Group LLC

   GDX Library      24.6.1 r55820 Released Jan 18, 2016 VS8 x86 32bit/MS Windows
   GDX:Input file: test.gdx
   GDX:Symbols: 2
   GDX:Uels: 100
   GDX:Loading Uels
   SQL:Selected driver: MySQL ODBC 5.3 ANSI Driver
   SQL:Connection string: Driver={MySQL ODBC 5.3 ANSI Driver};Server=localhost;User=xxx;Password=xxx
      set autocommit=0
      select @@version_comment
   SQL:RDBMS: MySQL Community Server (GPL)
      select @@version
   SQL:RDBMS version: 5.6.26-log
      select count(*) from information_schema.schemata where schema_name = 'tmp'
   -----------------------
   i (100 records)
      drop table if exists `tmp`.`i`
      create table `tmp`.`i`(`i` varchar(4))
      insert into `tmp`.`i` values (?)
      sqlexecute(100 times)
      commit
      Time : 0.6
   a (1000000 records)
      drop table if exists `tmp`.`a`
      create table `tmp`.`a`(`i` varchar(4),`j` varchar(4),`k` varchar(4),`value` double)
      temp file: [C:\Users\Erwin\AppData\Local\Temp\tmpA046.tmp]
      writing C:\Users\Erwin\AppData\Local\Temp\tmpA046.tmp
      load data local infile 'C:\\Users\\Erwin\\AppData\\Local\\Temp\\tmpA046.tmp' into table `tmp`.`a`
      rows affected: 1000000
      commit
      Time : 39.6
      deleting [C:\Users\Erwin\AppData\Local\Temp\tmpA046.tmp]

*** Status: Normal completion
--- Job Untitled_1.gms Stop 04/22/16 04:02:20 elapsed 0:00:41.353

The smaller set i is imported using normal inserts, while the larger parameter a is imported through an intermediate text file. This is much more efficient than using the standard inserts. There is a gdx2mysql option to force larger symbols to use standard inserts, so we can compare timings:

a (1000000 records)
   drop table if exists `tmp`.`a`
   create table `tmp`.`a`(`i` varchar(4),`j` varchar(4),`k` varchar(4),`value` double)
   insert into `tmp`.`a` values (?,?,?,?)
   sqlexecute(1000000 times)
   commit 100 times
   Time : 257.5

So the bulk load is about 6.5 times as fast (39.6 vs. 257.5 seconds). We can expect even larger differences in other cases.
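As a quick sanity check on that claim, the ratio of the two reported timings:

```python
# Timings taken from the logs above (in seconds)
insert_time = 257.5   # row-by-row prepared inserts
bulk_time = 39.6      # LOAD DATA LOCAL INFILE via a temp file

speedup = insert_time / bulk_time
print(round(speedup, 1))  # prints 6.5
```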

A final way to make imports faster is to use ISAM (or rather MyISAM) tables. MyISAM is an older storage engine (MySQL nowadays uses the InnoDB storage engine by default). However, MyISAM is still faster for our simple (but large) tables, as can be seen when running with the -isam flag:

i (100 records)
   drop table if exists `tmp`.`i`
   create table `tmp`.`i`(`i` varchar(4)) engine=myisam
   insert into `tmp`.`i` values (?)
   sqlexecute(100 times)
   commit
   Time : 0.3
a (1000000 records)
   drop table if exists `tmp`.`a`
   create table `tmp`.`a`(`i` varchar(4),`j` varchar(4),`k` varchar(4),`value` double) engine=myisam
   temp file: [C:\Users\Erwin\AppData\Local\Temp\tmpBF31.tmp]
   writing C:\Users\Erwin\AppData\Local\Temp\tmpBF31.tmp
   load data local infile 'C:\\Users\\Erwin\\AppData\\Local\\Temp\\tmpBF31.tmp' into table `tmp`.`a`
   rows affected: 1000000
   commit
   Time : 9.9

With MyISAM the bulk load drops from 39.6 to 9.9 seconds, a further 4x improvement in getting data into MySQL.
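The only change the ISAM option makes to the generated DDL is an engine clause appended to the CREATE TABLE statement, as visible in the log above. A small sketch of how such DDL could be assembled (the helper name and signature are illustrative, not gdx2mysql's internals):

```python
def create_table_ddl(schema, table, columns, isam=False):
    """Build a CREATE TABLE statement like the ones in the log,
    optionally selecting the MyISAM storage engine."""
    cols = ",".join("`%s` %s" % (name, typ) for name, typ in columns)
    ddl = "create table `%s`.`%s`(%s)" % (schema, table, cols)
    if isam:
        ddl += " engine=myisam"
    return ddl
```

With `isam=True` and one varchar(4) column this reproduces the statement from the MyISAM run; without the flag, the engine clause is omitted and MySQL falls back to its default engine.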

