Channel: Yet Another Math Programming Consultant

Aggregation: database vs GAMS


The following GAMS code aggregates PRODUCTION and AREA HARVESTED from a relatively large FAO data set (2.2 million records). Afterwards we recalculate YIELDS, since yields cannot be aggregated in the same way as PRODUCTION and AREA. (The same issue famously occurs when aggregating prices: first aggregate volume and value, then recalculate prices.)


*-------------------------------------------
* Aggregation
*-------------------------------------------

set
   y(year) /1990*2000/
   aggr_type(type) 'these quantities are aggregated directly' /
         'Area Harvested'        Ha
         'Production Quantity'   Tonnes
   /
;
parameter AggrCropData(cropgroup,region,y,type);

AggrCropData(cropgroup,region,y,aggr_type) =
    sum((cropmap(crop,cropgroup),regionmap(country,region),unit,flag),
             cropdata(country,crop,aggr_type,y,unit,flag));


*-------------------------------------------
* Recalculate Yield (Hg/Ha)
*-------------------------------------------

AggrCropData(cropgroup,region,y,'Yield')$AggrCropData(cropgroup,region,y,'Area Harvested') =
    1e5*AggrCropData(cropgroup,region,y,'Production Quantity') /
        AggrCropData(cropgroup,region,y,'Area Harvested');
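To illustrate why the yield is recomputed instead of aggregated, here is a minimal Python sketch. The two countries and their numbers are made up for illustration (they are not FAO data), and the yield is kept in tonnes per hectare to keep the arithmetic easy to follow.

```python
# Hypothetical toy data: country -> (area harvested in ha, production in tonnes).
countries = {
    "A": (100.0, 200.0),   # yield 2.0 t/ha on a small area
    "B": (900.0, 900.0),   # yield 1.0 t/ha on a large area
}


def yield_t_per_ha(area, production):
    """Yield is production divided by area harvested."""
    return production / area


# Naive: average the country yields, ignoring how much area each one covers.
naive = sum(yield_t_per_ha(a, p) for a, p in countries.values()) / len(countries)

# Consistent: aggregate area and production first, then recompute the yield.
total_area = sum(a for a, _ in countries.values())
total_production = sum(p for _, p in countries.values())
aggregated = yield_t_per_ha(total_area, total_production)

print(f"naive mean of yields : {naive:.2f} t/ha")
print(f"recomputed from sums : {aggregated:.2f} t/ha")
```

The naive mean overstates the regional yield here because the small, high-yield country gets the same weight as the large one. Summing the yields would be even further off.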

See also: http://yetanothermathprogrammingconsultant.blogspot.com/2013/05/large-scale-aggregation-example.html.
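For readers less familiar with GAMS, the two statements can be read as a grouped sum followed by a guarded division. The Python sketch below mirrors that logic on a handful of invented records. The dictionaries merely borrow the names cropmap, regionmap and cropdata from the GAMS listing, and the unit and flag indices are left out for brevity. The factor 1e5 is copied from the listing as is.

```python
from collections import defaultdict

# Hypothetical stand-ins for the GAMS mappings and data (values are invented).
cropmap = {"Wheat": "Cereals", "Maize": "Cereals"}
regionmap = {"France": "Europe", "Spain": "Europe", "Kenya": "Africa"}
cropdata = [
    # (country, crop, element, year, value)
    ("France", "Wheat", "Area Harvested", 1990, 50.0),
    ("France", "Wheat", "Production Quantity", 1990, 300.0),
    ("Spain", "Maize", "Area Harvested", 1990, 10.0),
    ("Spain", "Maize", "Production Quantity", 1990, 60.0),
    ("Kenya", "Maize", "Area Harvested", 1990, 0.0),
    ("Kenya", "Maize", "Production Quantity", 1985, 7.0),  # outside 1990..2000
]
AGGR_TYPES = {"Area Harvested", "Production Quantity"}
YEARS = range(1990, 2001)


def aggregate(records):
    """Sum area and production per (cropgroup, region, year), then add yields."""
    agg = defaultdict(float)
    for country, crop, element, year, value in records:
        if element in AGGR_TYPES and year in YEARS:
            agg[cropmap[crop], regionmap[country], year, element] += value

    # Guarded division, like the $ condition: skip cells with zero area.
    for (group, region, year, element), area in list(agg.items()):
        if element == "Area Harvested" and area != 0:
            production = agg.get((group, region, year, "Production Quantity"), 0.0)
            agg[group, region, year, "Yield"] = 1e5 * production / area
    return dict(agg)


result = aggregate(cropdata)
for key in sorted(result):
    print(key, result[key])
```

The dictionary keyed by a tuple plays the role of the sparse GAMS parameter: only combinations that actually occur in the data are stored.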

Of course we can also do this in SQL:

---
--- Aggregation
---
use aggregation;

---
--- if target table exists, drop it
---
IF EXISTS (SELECT * FROM sys.objects
           WHERE object_id = OBJECT_ID('[dbo].[AggregatedData]') AND type = 'U')
DROP TABLE AggregatedData;

 

---
--- Step 1:
--- Aggregate Area and Production
---
SELECT C.CropGroup, R.Region, A.Element, A.[Year], Sum(A.Value) AS [Value]
INTO AggregatedData
FROM Production_Crops_E_All_Data AS A,
     cropmap AS C,
     regionmap AS R
WHERE
     A.Element In ('Area Harvested','Production Quantity') And
     A.[Year] >= '1990' And A.[Year] <= '2000' And
     A.Country = R.Country And
     A.Item = C.Crop
GROUP BY C.CropGroup, R.Region, A.Element, A.[Year];

 

 

---
--- Step 2:
--- Recalculate Yield
---
INSERT INTO AggregatedData (CropGroup, Region, Element, [Year], [Value])
SELECT A.CropGroup, A.Region, 'Yield', A.[Year], 1.0e5*A.[Value]/B.[Value]
FROM AggregatedData AS A,
     AggregatedData AS B
WHERE
     A.CropGroup = B.CropGroup AND
     A.Region = B.Region AND
     A.[Year] = B.[Year] AND
     A.Element = 'Production Quantity' AND
     B.Element = 'Area Harvested' AND
     B.Value > 0;
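The two SQL steps can be tried without a SQL Server installation. The sketch below is an adaptation, not the script above: it uses Python's built-in sqlite3 module, replaces SELECT ... INTO with SQLite's CREATE TABLE ... AS SELECT, and loads a few invented rows instead of the FAO data. Table and column names follow the listing.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Toy schema and invented rows standing in for the FAO table and the two maps.
con.executescript("""
CREATE TABLE Production_Crops_E_All_Data
    (Country TEXT, Item TEXT, Element TEXT, Year TEXT, Value REAL);
CREATE TABLE cropmap (Crop TEXT, CropGroup TEXT);
CREATE TABLE regionmap (Country TEXT, Region TEXT);

INSERT INTO cropmap VALUES ('Wheat', 'Cereals'), ('Maize', 'Cereals');
INSERT INTO regionmap VALUES ('France', 'Europe'), ('Spain', 'Europe');
INSERT INTO Production_Crops_E_All_Data VALUES
    ('France', 'Wheat', 'Area Harvested',      '1990',  50.0),
    ('France', 'Wheat', 'Production Quantity', '1990', 300.0),
    ('Spain',  'Maize', 'Area Harvested',      '1990',  10.0),
    ('Spain',  'Maize', 'Production Quantity', '1990',  60.0),
    ('Spain',  'Maize', 'Production Quantity', '1985',   7.0);
""")

# Step 1: aggregate area and production (SQLite form of SELECT ... INTO).
con.executescript("""
DROP TABLE IF EXISTS AggregatedData;
CREATE TABLE AggregatedData AS
SELECT C.CropGroup, R.Region, A.Element, A.Year, SUM(A.Value) AS Value
FROM Production_Crops_E_All_Data AS A, cropmap AS C, regionmap AS R
WHERE A.Element IN ('Area Harvested', 'Production Quantity')
  AND A.Year >= '1990' AND A.Year <= '2000'
  AND A.Country = R.Country
  AND A.Item = C.Crop
GROUP BY C.CropGroup, R.Region, A.Element, A.Year;
""")

# Step 2: recalculate yield from the aggregated rows (self-join on the keys).
con.execute("""
INSERT INTO AggregatedData (CropGroup, Region, Element, Year, Value)
SELECT A.CropGroup, A.Region, 'Yield', A.Year, 1.0e5 * A.Value / B.Value
FROM AggregatedData AS A, AggregatedData AS B
WHERE A.CropGroup = B.CropGroup AND A.Region = B.Region AND A.Year = B.Year
  AND A.Element = 'Production Quantity'
  AND B.Element = 'Area Harvested'
  AND B.Value > 0
""")

rows = con.execute(
    "SELECT CropGroup, Region, Element, Year, Value "
    "FROM AggregatedData ORDER BY Element"
).fetchall()
for row in rows:
    print(row)
```

Timings from such a toy run say nothing about the 2.2 million record case; the sketch only shows the shape of the two queries.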

 

It is interesting to see how well SQL Server does compared to GAMS. Here are the timings in seconds:

              Step 1   Step 2
GAMS           1.4      0.08
MS Access      5.4      0.04
SQL Server     1.4      0.06
