Thursday, September 29, 2011

An easy solution for Multi-Sheet EXCEL reporting

Currently the only way to output SAS datasets as a multi-sheet EXCEL workbook for reporting is to use ExcelXP ODS tagset. I like this method a lot, because it can generate stylish multiple EXCEL sheets and is highly customizable. However, in practice it has some weaknesses. 1 - Running this tagset is resource-costly, since it depends on an 8k lines SAS codes - ExcelXP.sas. While dealing with a large SAS dataset, it always gets jammed. 2- It only allows one grouping variable by the BY statement inside the output procedures (PROC REPORT, PROC PRINT, etc.). 3 - The user often has to estimate the width for each column in EXCEL.

Actually we can use SAS macro and VBA macro together to obtain high-quality multi-sheet EXCEL workbook. The workflow is pretty simple: first a SAS macro splits a SAS dataset into many XLS files in a folder through ODS HTML targset. Second a VBA macro merges those single XLS files as sheets in to a workbook. For example, SAS shipped with a sample dataset SASHELP.PRDSAL2 with 23040 observations and 11 variables. If we want to generate a multi-sheet EXCEL workbook grouped by two variables such as ‘state’ and ‘year’, we can set up an empty directory in the hard disk and run a macro like below. As a result, we will have a number of small XLS files.

%macro split(data = , folder = , clsvar1 = , clsvar2 = );
   options nocenter nodate nonumber ps = 9000; 
   title; footnote;
   ods listing close;
   proc sql noprint;
      create table _tmp01 as 
      select &clsvar1, &clsvar2, count(*) as number
      from &data
      group by &clsvar1, &clsvar2
      order by &clsvar1, &clsvar2
   ;quit;
   data _tmp02;
      set _tmp01 nobs = nobs;
      where number gt 0;
      index = _n_;
      call symput('nobs', nobs);
   run;
   %do i = 1 %to &nobs;
      proc sql noprint;
         select &clsvar1, &clsvar2
            into:clsvar1name,:clsvar2name
         from _tmp02
         where index = &i
      ;quit;
      %let filepath = &folder\%sysfunc(dequote(&clsvar1name))_%sysfunc(dequote(&clsvar2name)).xls;
      ods html file = "&filepath " style = minimal; 
      proc print data = &data noobs label;
         where &clsvar1 = "&clsvar1name" and &clsvar2 = &clsvar2name;
      run;
   %end;
   ods listing; 
   ods html close;
%mend;
%split(data = sashelp.PRDSAL2, folder = C:\test1, clsvar1 = state , clsvar2 = year)
Then we can open EXCEL, press ALT+F11, paste the VBA code below and run it. Then we will be able to have a decent multi-sheet EXCEL workbook. The biggest strength for this method is that it is very fast – the overall process (running SAS macro and VBA macro) only takes less than a minute for this relatively large dataset SASHELP.PRDSAL2. And it can be expanded to many grouping variables by modifying the SAS macro a little. In conclusion, for big data EXCEL reporting, combining SAS macro and VBA macro together is a good alternative other than ExcelXP ODS tagset.
VBA Draft 3.1 Blogpost

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...