Thursday, April 26, 2012

10 keywords from SAS Global Forum 2012


1. In-memory
SAS is famous for hitting the hard disk at every operation, which is a proven strategy for saving memory. To speed up the processing of ‘Big Data’, SAS on the server side will aggregate memory, load the data into memory, and process it there, which is said to be about 1000 times faster than disk-based operation.

2. Hadoop
InformationWeek reported that Dr. Goodnight, CEO of SAS, loathes Hadoop, the open source distributed platform. However, this time SAS presented its DI Studio and SAS/ACCESS interface, which now allow data access through Hive and Pig. Running SAS's statistical procedures on HDFS still looks like a challenge.

3. Web application
SAS’s applications are clearly moving toward the web, as with SAS Visual Analytics, which also runs on various mobile devices. There is a way to tell a desktop application from a web application in SAS: the former’s default background color is white and the latter’s is black.

4. High performance procedures
SAS is vigorously developing procedures with the HP prefix, mostly for servers. Currently 10-20 procedures in SAS/BASE and SAS/STAT have a counterpart, such as PROC HPSummary and PROC HPLogistic. Those procedures can also run locally, but there they won't improve efficiency significantly.

5. JMP
JMP remains a lean desktop analysis package while SAS evolves toward gigantic enterprise solution platforms. One interesting thing -- you can always find the motion chart (which JMP can do and SAS can’t) and John Sall in the demo area.

6. SAS 12.1?
The next release of SAS is not 9.4, but 12.1. SAS version 9, including 9.1, 9.2 and 9.3, has been around for almost a decade. Thus, the version jump from 9.3 to 12.1 is quite a great leap forward.
Correction - thanks to Chris Hemedinger, 'the new release numbering applies only to the analytical products (STAT, ETS, and so on)'.

7. Data Step 2?
To support data management alongside the high performance procedures on the servers, a language called DS2 is under development. It is a strongly typed language, more like Java or C++ than Data Step. However, SAS has a macro that can transform Data Step code to Data Step 2.

Thanks to the corrections by Jason Secosky, who is the development manager for DS2 --
"While DS2 is based on the DATA step, its name is just "DS2" not "DATA step 2". DS2 is statically typed, not strongly typed. Ok, ok, there is no implicit type conversion between some types, like double and timestamp, yet there are functions to explicitly convert these values.
And, there is a PROC that can be used to translate DATA steps generated by SAS Enterprise Miner to DS2. The PROC isn't intended to convert *any* DATA step to DS2."
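
The point about implicit versus explicit conversion can be illustrated outside of SAS. The sketch below is in Python, not DS2, and says nothing about DS2's actual conversion rules or syntax; the helper name try_add is made up for the illustration. It only shows that a language can convert silently between some pairs of types, refuse to do so for others, and still offer a function for converting explicitly.

```python
def try_add(a, b):
    """Hypothetical helper: return a + b, or None if the types do not mix."""
    try:
        return a + b
    except TypeError:
        # No implicit conversion exists between these two types.
        return None


# int and float mix silently: the int is promoted to float.
mixed_numeric = 1 + 2.5

# str and int do not mix: the addition is rejected rather than guessed at.
rejected = try_add("1", 2)

# An explicit conversion function makes the same addition legal.
explicit = int("1") + 2

print(mixed_numeric, rejected, explicit)
```

Note that Python checks these types at run time, whereas a statically typed language such as DS2 does so at compile time; when the check happens is a separate question from which conversions happen implicitly.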

8. In-database
SAS’s in-database technology now supports database systems beyond Teradata and Greenplum. To avoid compilation errors, it is better to use ANSI SQL functions instead of SAS’s own functions. As for me, it is still not very clear how SAS passes its statistical procedures into the relational database systems.

9. Risk management
SAS’s risk management platform is quite mature and has implemented the latest procedures, like PROC COPULA. It seems that end users have to hold a license from Bloomberg or another vendor to update the market data.

10. Statistical graph
More of SAS’s procedures and solutions have integrated layer-based statistical graph technology to visualize results. However, SAS’s Windowing Environment still doesn’t support syntax highlighting for Graph Template Language and the SG procedures: it always shows them in the red font that warns of syntax errors.

Good math, bad engineering

As a former statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...