Tuesday, August 27, 2013

Bubble plot by SAS and Highcharts.js

A bubble plot is a good way to visualize three numeric variables at once: two of them set the position of each point and the third sets the bubble size. It is popular both on the web and in documents.

Static plotting by SAS

Since SAS 9.3, PROC SGPLOT has provided a BUBBLE statement, which makes a bubble plot easy to draw. For example, the SASHELP.CLASS dataset can be quickly projected onto a bubble plot.
proc sgplot data = sashelp.class;
   title 'bubble plot by sashelp.class';
   bubble x = weight y = height size = age / group = sex transparency = 0.5;
   yaxis grid;
run;

Dynamic plotting by SAS and Highcharts.js

For showing off on the web, an interactive version of the bubble plot above is much more attractive. First we use SAS to transform the SASHELP.CLASS dataset into a nested JSON array. The final dynamic plot is linked here.
data one;
   set sashelp.class;
   length data $20.;
   /* Format each observation as an [x, y, size] triplet with a trailing comma */
   data = cats('[', weight, ',', height, ',', age, '],');
run;

proc sort data = one;
   by sex;
run;
proc transpose data = one out = two;
   by sex;
   var data;
run; 

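data JSON;
   set two;
   length _tmp dataline $300.;
   /* Concatenate the transposed COL1, COL2, ... into one string */
   _tmp = cats(of col:);
   /* Blank out the trailing comma left by the last triplet */
   substr(_tmp, length(_tmp), 1) = ' ';
   /* Wrap the triplets and the group name as one Highcharts series */
   dataline = cats('{data:[', _tmp, '],', 'name:"', sex, '"}');
   keep dataline;
run;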
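Each dataline value produced above has the shape {data:[[...],[...]],name:"F"}. That is a valid JavaScript object literal but not strict JSON, because the keys are unquoted. If the lines need to be parsed rather than pasted into a script, a minimal sketch could look like the following. The helper name datalineToSeries and the shortened sample lines are assumptions for illustration, not part of the original workflow.

```javascript
// Hypothetical helper (not from the original post): turn one SAS `dataline`
// value, which uses unquoted keys, into a parsed series object.
function datalineToSeries(line) {
  // Quote the two known keys so the text becomes strict JSON.
  const strictJson = line.replace(/([{,])\s*(data|name)\s*:/g, '$1"$2":');
  return JSON.parse(strictJson);
}

// Shortened sample lines in the shape the DATA step emits
// (only the first two observations of each group are shown).
const sampleLines = [
  '{data:[[84,56.5,13],[98,65.3,13]],name:"F"}',
  '{data:[[112.5,69,14],[102.5,63.5,14]],name:"M"}'
];

const series = sampleLines.map(datalineToSeries);
console.log(series[0].name, series[0].data.length); // F 2
```

The regular expression only touches the keys data and name, so it relies on the group value (here the sex variable) not containing text such as ",name:".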
One good thing is that the data can be embedded directly in the Highcharts.js script, so no HTTP server is needed to serve a separate data file, unlike a typical D3.js setup that loads its data over HTTP. We only need to paste the output of the SAS DATA step into the series option of the Highcharts bubble chart. Highcharts also provides rich options for better visual effects and convenient downloading.
$(function () {
    $('#container').highcharts({
        chart: {
            type: 'bubble',
            zoomType: 'xy'
        },
        credits: {
            text: "Demo",
            href: 'http://www.sasanalysis.com'
        },
        title: {
            text: 'Bubble plot by sashelp.class'
        },
        series: [{
            data: [
                [84, 56.5, 13],
                [98, 65.3, 13],
                [102.5, 62.8, 14],
                [84.5, 59.8, 12],
                [112.5, 62.5, 15],
                [50.5, 51.3, 11],
                [90, 64.3, 14],
                [77, 56.3, 12],
                [112, 66.5, 15]
            ],
            name: "F"
        }, {
            data: [
                [112.5, 69, 14],
                [102.5, 63.5, 14],
                [83, 57.3, 12],
                [84, 62.5, 13],
                [99.5, 59, 12],
                [150, 72, 16],
                [128, 64.8, 12],
                [133, 67, 15],
                [85, 57.5, 11],
                [112, 66.5, 15]
            ],
            name: "M"
        }]
    });
});
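
To avoid hand-editing the options each time the SAS output changes, the configuration above could be produced by a small function. This is only a sketch under stated assumptions: buildBubbleOptions is a hypothetical name, and the two shortened series stand in for the full SAS output.

```javascript
// Hypothetical helper: build the options object used in the post from a
// title and an array of { name, data } series objects.
function buildBubbleOptions(titleText, seriesList) {
  return {
    chart: { type: 'bubble', zoomType: 'xy' },
    title: { text: titleText },
    // Copy each series so later changes to the options do not touch the input.
    series: seriesList.map(function (s) {
      return {
        name: s.name,
        data: s.data.map(function (point) { return point.slice(); })
      };
    })
  };
}

const options = buildBubbleOptions('Bubble plot by sashelp.class', [
  { name: 'F', data: [[84, 56.5, 13], [98, 65.3, 13]] },
  { name: 'M', data: [[112.5, 69, 14], [102.5, 63.5, 14]] }
]);

// In a page that has loaded jQuery and Highcharts, the call from the post
// would then be: $('#container').highcharts(options);
console.log(options.chart.type, options.series.length); // bubble 2
```

Note that the bubble chart type lives in the highcharts-more.js module, so the page needs to load that file in addition to the core Highcharts script.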

Conclusion

  1. Beyond its statistical features, SAS is also a flexible scripting language that can handle tasks such as creating JSON;
  2. Highcharts.js is a view-tier tool halfway between D3.js (open source; minimal documentation) and Tableau (proprietary software; company support), and it integrates with SAS or other data-tier tools.

Good math, bad engineering

As a former statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...