Thursday, February 9, 2012

Cholesky decomposition to "expand" data

Yesterday Rick showed how to use the Cholesky decomposition to transform data with the ROOT function of SAS/IML. The Cholesky decomposition matters in simulation because it turns uncorrelated random draws into draws with a chosen covariance structure. For DATA step programmers who are not very familiar with SAS/IML, PROC FCMP may be another option, since it provides an equivalent routine, CALL CHOL.

To replicate Rick’s example of the general Cholesky transformation for correlated variables, I randomly chose three variables from SASHELP.CARS and created a simulated dataset whose variance-covariance structure matches that of the original, up to sampling error. The simulated dataset can be viewed as an “expanded” version of the original dataset.
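Why the transformation reproduces the covariance can be sketched in a few lines. The symbols below are my own shorthand rather than anything from Rick's post: \Sigma is the 3-by-3 sample covariance matrix, L is its lower-triangular Cholesky factor, Z is a 3-by-1000 matrix of independent standard normal draws, and X is the simulated result before transposing.

```latex
\[
\Sigma = L L^{\top}, \qquad
z \sim N(0, I_3), \qquad
x = L z
\]
\[
\operatorname{Cov}(x)
  = \operatorname{E}\!\left[ L z z^{\top} L^{\top} \right]
  = L \, \operatorname{E}\!\left[ z z^{\top} \right] L^{\top}
  = L I_3 L^{\top}
  = \Sigma
\]
```

Each column of X = LZ is one such x, so the sample covariance of the 1000 simulated rows should come close to \Sigma without matching it exactly. Because z has mean zero, the simulated variables are also centered at zero rather than at the original means.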

In PROC FCMP, to save memory, avoid allocating many matrices (arrays). A better approach is to use the CALL DYNAMIC_ARRAY routine to resize an array that is no longer needed, much like the ReDim statement in VBA. A VBA programmer can move to SAS quite easily through PROC FCMP.

/* Compute the covariance matrix of three variables from SASHELP.CARS */
proc corr data=sashelp.cars cov outp=corr_cov plots=scatter;
   var weight length mpg_city;
run;

data cov;
   set corr_cov;
   where _type_ = 'COV';
   drop _:;
run;

proc fcmp;  
   /* Allocate space for matrices*/
   array a1[3, 3] / nosymbols;
   array a2[3, 3] / nosymbols;
   array b1[3, 1000] / nosymbols;
   array b2[3, 1000] / nosymbols;

   /* Simulate a matrix by normal distribution*/
   do i = 1 to 3;
      do j = 1 to 1000;
         b1[i, j] = rannor(12345);
      end;
   end;

   /* Read the covariance matrix*/
   rc1 = read_array('cov', a1);
   call chol(a1, a2);
   put a2;
   call mult(a2, b1, b2);

   /* Output the result matrix*/
   call dynamic_array(b1, 1000, 3);
   call transpose(b2, b1);
   rc2 = write_array('result', b1);
run;

/* Check the covariance matrix of the simulated data */
proc corr data=result cov plots=scatter;
run;

Good math, bad engineering

As a former statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...