Thursday, April 19, 2012

Stored Processes: SAS's voice on Business Intelligence

Everyday I write SAS scripts to extract, transform and load data from various sources, which is a step before the database, and also pull out data to do analysis such as aggregation and regression in SAS, which is a step after the database. According to Norman Nie, a data shop has a four-layer structure: ETL layer, database layer, analytics layer and BI layer. It seems that recently the database layer and the analytics layer look more and more identical. The relational databases start to fiercely create their statistics arms, such as the SQL Server Analysis Services and Oracle R Enterprise. Tricia and Angela’s new book, The 50 Keys to Learning SAS Stored Processes, discloses SAS’s revenge on Business intelligence layer with its stored processes, which are similiar to the stored procedures usually carried out by the database systems.

This book has 12 chapters, including how to create stored processes, optimize the stored processes and implement the stored processes. The book is easy to read and contains detailed illustrations with lots of colorful graphics. Following the examples, I may create my own stored processes in SAS, for example, a simple query to count numbers of a small data set SASHELP.CLASS by gender. It is also interesting to compare SAS’s stored process with the databases' 
stored procedure such as SQL Server -- they have the same functionality. Therefore, on a Windows server, a developer will be able to choose either SQL Server or SAS to build the web applications.

Now with the stored processes, a SAS programmer may evolve toward a SAS developer. Tricia and Angela’s book will be a good reference for this role transition.

/*********(1) LOAD DATA(SASHELP.CLASS) TO SQL SERVER OR SAS *************/
create table class (name char(8), sex char(1), 
age numeric, height numeric, weight numeric );
insert into class values('ALFRED','M',14,69,112.5);
insert into class values('ALICE','F',13,56.5,84);
insert into class values('BARBARA','F',13,65.3,98);
insert into class values('CAROL','F',14,62.8,102.5);
insert into class values('HENRY','M',14,63.5,102.5);
insert into class values('JAMES','M',12,57.3,83);
insert into class values('JANE','F',12,59.8,84.5);
insert into class values('JEFFREY','M',13,62.5,84);
insert into class values('JOHN','M',12,59,99.5);
insert into class values('JOYCE','F',11,51.3,50.5);
insert into class values('JUDY','F',14,64.3,90);
insert into class values('LOUISE','F',12,56.3,77);
insert into class values('MARY','F',15,66.5,112);
insert into class values('PHILIP','M',16,72,150);
insert into class values('ROBERT','M',12,64.8,128);
insert into class values('RONALD','M',15,67,133);
insert into class values('THOMAS','M',11,57.5,85);
insert into class values('WILLIAM','M',15,66.5,112);

/**********************(2) A STORED PROCEDURE IN SQL SERVER **************/
IF OBJECT_ID('CntBySex') IS NOT NULL
   drop proc CntBySex;
go

create proc CntBySex @sex char(1) = null, @Cnt int output
as
   select @Cnt = COUNT(*)
   from class
   where sex = ISNULL(@sex, sex)
go

declare @OutCnt int
exec CntBySex @Cnt = @OutCnt output
select @OutCnt as AllCnt

exec CntBySex @sex = 'F', @Cnt = @OutCnt output
select @OutCnt as FemaleCnt

exec CntBySex @sex = 'M', @Cnt = @OutCnt output
select @OutCnt as MaleCnt
go


/*************(3) A STORED PROCESS IN SAS *********************/
%macro query;
proc sql;
   select count(*) 
   from class
   %if %length(&sex) > 0 %then %do;
      where sex = &sex;
   %end; 
;
quit;
%mend query;

%stpbegin; 
   %query;
%stpend;

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...