Sunday, December 12, 2010

5D visualization: from SAS to Google Motion Chart

Three dimensions are usually regarded as the maximum for data presentation. With ODS graphics and the Graph Template Language available since SAS 9.2, 3D graphing is no longer a perplexing problem for SAS programmers. Nowadays, however, huge volumes of multidimensional data call for a more vivid and simpler way to be displayed.

The emergence of Google Motion Chart now provides a sound way to visualize data with more than three dimensions. This web-based analytical technology originated from Dr. Hans Rosling’s innovation. Dr. Rosling and his Gapminder Foundation invented a technology that demonstrates the relationships among multiple dimensions with animated bubbles. They published many bubble plots on Gapminder’s website to discover knowledge from a bulk of public information, especially for regional and national comparisons. It soon attracted Google’s attention. In 2008, after an agreement between Dr. Rosling and Google’s two founders, Google launched its Motion Chart gadget. People can create a motion chart in Google Docs, an online alternative to Microsoft Office.

The combination of SAS and Google Motion Chart offers a handy and cheap way to visualize data with up to five dimensions. Motion Chart supports five variables together in one plot. Commonly the data structure requires time (animation), var1 (X axis), var2 (Y axis), var3 (color) and var4 (bubble size). Correlation among var1 to var4 is expected: usually the bubbles, changing color and size, tend to move along the diagonal line. Overall, a 5D visualization can be rendered within a single plot.

In this example, the SAS help data set SASHELP.SHOES is used. The data set contains several regions to compare with one another. Logged returns are on the Y axis, and logged sales are on the X axis. A series of virtual times is assigned within each region, with inventory as bubble size and the number of stores as color. With SAS, the data structure required by Motion Chart can be prepared quickly. Once the CSV file is uploaded to Google Docs, a motion chart is ready to be published on any webpage. OK, it's time to sit and discover some interesting tendencies...
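As a rough sketch of the layout described above, the toy data step below builds a table with one row per region per time point. The data set name motion_layout and all the values are made up for illustration; the column names are borrowed from the program at the end of this post. It also assumes that Motion Chart wants the entity name in the first column and the time in the second, followed by the numeric measures.

```sas
* Toy illustration only. The data set name and all values are hypothetical;
data motion_layout;
infile datalines dlm=',';
length region $12 mytime $8;
input region $ mytime $ logsales logreturn storenum inventory;
datalines;
Africa,12/11/10,10.2,6.5,12,50000
Africa,12/12/10,10.9,7.1,25,90000
Asia,12/11/10,9.8,6.0,8,30000
Asia,12/12/10,10.4,6.8,17,70000
;
run;

* One row per region per time point, with the four measures side by side;
proc print data=motion_layout noobs;
run;
```

Each region becomes one bubble, and each distinct mytime value becomes one frame of the animation.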

1. ‘Show me--New ways of visualising data’. The Economist. Feb 25th 2010.
2. ‘Making data dance’. The Economist. Dec 11th 2010.
3. Google Docs online help center. 2010.

*********(1) Extract data from SASHELP.SHOES***********;
proc sql;
create table test as
select region, Sales, Inventory, Returns, Stores
from sashelp.shoes
order by region, sales desc;
quit;
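********(2) Create a virtual time variable for each region************;
data test1;
do i=1 by 1 until (last.region);
set test;
by region;
* Assumption: the time assignment is missing in the source. Here each row gets a distinct date counting back from today;
time=today()-i+1;
mytime=put(time, mmddyy8.);
output;
end;
drop i time;
run;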
********(3) Transform some variables with log**********;
proc sql;
create table test2 as
select region, mytime, log(sales) as logsales, log(returns) as logreturn, Stores as storenum, Inventory
from test1
order by region, mytime;
quit;
********(4) Export data as CSV***************;
proc export data=test2 outfile='C:\Users\Yanyi\Desktop\test.csv' dbms=csv replace;
run;
*******(5) Upload CSV to Google Docs************;
******(6)  Create Google Motion Chart manually**********;

**********END*********TEST PASSED 12DEC2010****************************;
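
Before uploading, it may be worth checking the transformed columns, since LOG returns a missing value in SAS when its argument is zero or negative. A minimal sketch, assuming the test2 data set from step (3) is still in the WORK library:

```sas
* Count missing values and show the range of each plotted measure;
proc means data=test2 n nmiss min max;
var logsales logreturn storenum inventory;
run;
```

A nonzero NMISS for logsales or logreturn would point to rows that may not plot as expected in the chart.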

Good math, bad engineering

As a former statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...