Wednesday, April 27, 2011

A macro that calls R from SAS for paneled 3D plotting

SAS and R complement each other. SAS is a versatile ETL (extraction, transformation and loading) engine, and its statistical procedures based on generalized linear models are impeccable. R brings cutting-edge data mining and data visualization techniques at low (or no) cost. Although the two packages live in distinct ecosystems (for example, different OS/ETL/database/reporting layers) [Ref. 1], combining them through mixed programming would make an analytics shop invincible.

Some SAS programmers use SAS/IML to call R’s functions [Ref. 2]. However, SAS/IML reportedly fails to work with R releases from 2.12 onward [Ref. 3]. Others use tricks to call R from within a SAS data step to meet their daily needs [Ref. 4]. The macro in this post calls R’s ‘lattice’ package from SAS, on a PC platform, to draw paneled three-dimensional plots, since SAS’s SG procedures currently do not offer such an option. The good thing is that there is no need to check the version of R installed before running it. With small modifications, the macro can be adapted to other tasks that call R from SAS.

1. ‘Keep an Eye on the emerging Open-Source Analytics Stack’. Revolution R Blog.
2. Zhengping Ma. ‘Data mining in SAS with open source software’. SAS Global Forum 2011.
3. ‘SAS/IML incompatible with latest releases of R’. SAS-L. 11APR2011.
4. Liang Xie. ‘Regularized Discriminant Analysis’.

/*******************READ ME*********************************************
* VERSION:     SAS 9.2(ts2m0), windows 64bit
* DATE:        25apr2011
****************END OF READ ME*****************************************/

****************(1) MODULE-BUILDING STEP******************;
%macro scatter3dpanel(data = , x = , y = , z = , factor = , 
                      width = , height = , outfile = );
  /*  MACRO:      scatter3dpanel()
   *  PARAMETERS: data   = dataset for plotting
   *              x      = x-axis variable
   *              y      = y-axis variable
   *              z      = z-axis variable
   *              factor = partition factor variable
   *              width  = width of output graph (pixels)
   *              height = height of output graph (pixels)
   *              outfile= location of output image
   */
  /* Export the SAS dataset to a CSV file that R can read */
  proc export data = &data outfile = "d:\tmp.csv" dbms = csv replace;
  run;
  /* Build a template R script, one statement per row; the sas_* tokens are
     placeholders. The column is 200 wide to leave room for long output paths */
  proc sql;
    create table _tmp0 (string char(200));
    insert into _tmp0 
    set string = 'tmp=read.csv("d:/tmp.csv", header=T)'
    set string = 'attach(tmp)'
    set string = 'library(lattice)'
    set string = 'windows()'
    set string = 'cloud(sas_zvar~sas_xvar+sas_yvar|as.factor(sas_factor), pretty=T)'
    set string = 'dev.print(device=png, width=sas_width, height=sas_height, file="sas_file")';
  quit;
  /* Replace the placeholders with the macro arguments. propcase() assumes the
     column names in the CSV are capitalized (e.g. Length, Type), since R is
     case-sensitive */
  data _tmp1;
    set _tmp0;
    string = tranwrd(string, "sas_xvar", propcase("&x"));
    string = tranwrd(string, "sas_yvar", propcase("&y"));
    string = tranwrd(string, "sas_zvar", propcase("&z"));
    string = tranwrd(string, "sas_factor", propcase("&factor"));
    string = tranwrd(string, "sas_width", "&width");
    string = tranwrd(string, "sas_height", "&height");
    string = tranwrd(string, "sas_file", translate("&outfile", "/", "\"));
  run;

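  /* Write the generated R statements to a script file */
  data _null_;
    set _tmp1;
    file "d:\callRinSAS.r";
    put string;
  run;

  /* Run the script in batch mode; adjust the path to the local R installation */
  options noxsync noxwait;
  x ' "d:\Program Files\R\R-2.12.1\bin\R.exe" CMD BATCH --vanilla --slave "d:\callRinSAS.r" ';
%mend scatter3dpanel;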

****************(2) TESTING STEP******************;
%scatter3dpanel(data = sashelp.cars, x = length, y = wheelbase, z = horsepower, 
            factor = type, width = 1200, height = 600, outfile = d:\test1.png );

****************END OF ALL CODING***************************************;
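
To make the substitution step easier to follow, here is a sketch of the R script that the macro should write to d:\callRinSAS.r for the test call above. It is derived by hand from the template strings and the tranwrd() replacements, not copied from an actual run, and it assumes the exported CSV carries the column names Length, Wheelbase, Horsepower and Type.

```r
# Read the CSV that PROC EXPORT wrote (path is hard-coded in the macro)
tmp=read.csv("d:/tmp.csv", header=T)
attach(tmp)
library(lattice)
# Open a Windows graphics device; this call is only available on Windows
windows()
# One 3D scatter panel per level of Type
cloud(Horsepower~Length+Wheelbase|as.factor(Type), pretty=T)
# Copy the current plot to a PNG file of the requested size
dev.print(device=png, width=1200, height=600, file="d:/test1.png")
```

Because the script relies on windows(), it is tied to a PC platform, as stated earlier. If the plot does not appear, opening this file and checking the substituted variable names against the CSV header is a reasonable first step.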
