Monday, January 23, 2012

A test for memory management of SAS/IML

Programming always involves trade-offs between efficiency and memory usage. For efficient programming in SAS/IML, my shortcut is to consult the tip sheet from Rick Wicklin and look for ways to simplify the code. As for the memory management mechanism of SAS/IML, I found only one page of the SAS/IML 9.2 User’s Guide on the Internet.

To see how SAS/IML’s memory management performs, I designed a simple test around the SHOW SPACE statement, which reports the memory details of SAS/IML. A simulated matrix of 200 rows by 300 columns occupies about 400k of memory. I requested only 1MB of memory by specifying WORKSIZE at the beginning, which means that three matrices of this size would overflow the tiny workspace. First I assigned multiple references to this matrix, and then changed one value in it. Finally I freed all of the objects in memory and generated a new matrix of the same size. The memory change is shown in the plot above.
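A rough check of that sizing, assuming each numeric element takes 8 bytes (double precision) and ignoring any bookkeeping overhead; neither assumption is stated in the post, so the figures are only an estimate:

```latex
\begin{align*}
\text{one matrix}     &= 200 \times 300 \times 8\ \text{bytes} = 480{,}000\ \text{bytes} \\
\text{two matrices}   &= 2 \times 480{,}000 = 960{,}000\ \text{bytes} < 1{,}048{,}576\ \text{bytes} \\
\text{three matrices} &= 3 \times 480{,}000 = 1{,}440{,}000\ \text{bytes} > 1{,}048{,}576\ \text{bytes}
\end{align*}
```

Under these assumptions two full copies still fit in the requested space, while a third does not.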

/* Only request 1MB memory for this test */
proc iml worksize=1048576;
    /* 0 -- State 0 */
    show space;
    /* 1 -- State 1 (1 reference on 1 object) */
    x = j(200, 300, 0);          /* allocate a 200 x 300 matrix of zeros */
    call randgen(x, "Normal");   /* fill it with standard normal draws */
    show space;
    /* 2 -- State 2 (2 references on 1 object) */
    y = x;
    show space;
    /* 3 -- State 3 (3 references on 1 object) */
    z = x;
    show space;
    /* 4 -- State 4 (2 references on 1 object and 2 references on another object) */
    x[1, 1] = 5;                 /* change a single cell of x */
    w = x;
    show space;
    /* 5 -- State 5 (0 references on 2 objects) */
    free x y z w;
    show space;
    /* 6 -- State 1 again (1 reference on 1 object) */
    x = j(200, 300, 0);
    call randgen(x, "Normal");
    show space;
quit;

1. Interestingly, an object with two references costs more memory than the same object with three references. Apparently two identical copies of the same matrix exist at State 2, while three references all point to a single matrix at State 3.

SAS/IML seems to have a unique memory management system. The memory manager of IML first checks how much of the workspace (the memory allocated to IML) is still available. If the memory usage is below the warning threshold, it generously makes copies. Otherwise, it becomes very aggressive and starts to compress the memory frenetically.

2. Two similar matrices stay in the memory at State 4, although the difference between them is just one cell. Each of the matrices has 2 references.

SAS/IML follows the rule of copy-on-change. Once values in a matrix are changed, the memory manager will make a copy for the changed object.
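The visible side of this rule can be seen with a much smaller matrix. The following is a minimal sketch, not part of the original test; the 2 x 3 size is an arbitrary choice for illustration, and it shows only the resulting values, not what the memory manager does internally:

```sas
proc iml;
    x = j(2, 3, 0);    /* small 2 x 3 matrix of zeros (size chosen for illustration) */
    y = x;             /* y is assigned the values of x */
    x[1, 1] = 5;       /* change one cell of x only */
    print x, y;        /* x[1,1] is now 5, while y still holds all zeros */
quit;
```

Adding SHOW SPACE statements between the steps, as in the test above, would be one way to watch when the extra storage actually appears.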

3. Running the FREE statement doesn't immediately clear the memory at State 5.

SAS/IML’s FREE statement only kills the references. The unreferenced objects will be discarded during the ensuing operations.

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...