Vectorization is an important skill in any matrix language. From Rick Wicklin’s book about SAS/IML and his recent cheat sheet, I learned about a few vector-wise functions that have been added since SAS 9.22. To compare the computational efficiency of the traditional do-loop style with the vectorized style, I designed a simple test in SAS/IML: square a number sequence (from 1 to 10000) and measure the time used.
Two modules were written, one for each coding style. Each module was run 100 times, and the elapsed time of each run was recorded with SAS/IML’s time() function.
proc iml;
   start module1; * Build the first module;
      result1 = j(10000, 1, 1); * Preallocate memory to the testing vector;
      do i = 1 to 10000; * Use a do-loop to square the sequence;
         result1[i] = i**2;
      end;
      store result1; * Return the resulting object;
   finish;
   t1 = j(100, 1, 1); * Run the first test;
   do m = 1 to 100;
      t0 = time(); * Set a timer;
      call module1;
      t1[m] = time() - t0;
   end;
   store t1;
quit;

proc iml;
   start module2; * Build the second module;
      result2 = t(1:10000)##2; * Vectorise the sequence;
      store result2; * Return the resulting object;
   finish;
   t2 = j(100, 1, 1); * Run the second test;
   do m = 1 to 100;
      t0 = time(); * Set a timer;
      call module2;
      t2[m] = time() - t0;
   end;
   store t2;
quit;

proc iml;
   load result1 result2; * Validate the results;
   print result1 result2;
quit;
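One caveat about the test above: each timed call also includes the store statement, and a single vectorized pass over 10000 elements may finish faster than the clock behind time() can resolve. A minimal sketch of an alternative harness follows, assuming it is acceptable to time a batch of repetitions and divide by the batch size. The names n, nReps, r1, r2, tLoop, tVec and maxDiff are hypothetical and not part of the original test.

```sas
proc iml;
   n = 10000;                       /* length of the sequence, as in the test above */
   nReps = 100;                     /* hypothetical batch size */

   /* Do-loop style: time the whole batch, then average */
   t0 = time();
   do rep = 1 to nReps;
      r1 = j(n, 1, .);              /* preallocate the result vector */
      do i = 1 to n;
         r1[i] = i**2;
      end;
   end;
   tLoop = (time() - t0) / nReps;

   /* Vectorized style: elementwise power on the whole column vector */
   t0 = time();
   do rep = 1 to nReps;
      r2 = t(1:n)##2;
   end;
   tVec = (time() - t0) / nReps;

   /* Check that both styles agree without printing 10000 rows */
   maxDiff = max(abs(r1 - r2));
   print tLoop tVec maxDiff;
quit;
```

Nothing is written to storage inside the timed region here, so the two averages reflect the computation alone. A maxDiff of 0 indicates that both styles produce the same vector.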
The timing vectors t1 and t2 were then passed to Base SAS and visualized as a box plot with the SG procedures. In this experiment, the winner is the vectorized method: vectorization appears to be much faster than the do loop in SAS/IML. Therefore, my conclusions are: (1) avoid the do loop if possible; (2) use the vector-wise functions and operators in SAS/IML; (3) always test the speed of modules and functions with SAS/IML’s time() function.
proc iml;
   load t1 t2;
   t = t1||t2;
   create _1 from t;
   append from t;
   close _1;
   print t;
quit;

data _2;
   set _1;
   length test $25.;
   test = "do_loop";
   time = col1;
   output;
   test = "vectorization";
   time = col2;
   output;
   keep test time;
run;

proc sgplot data = _2;
   vbox time / category = test;
   yaxis grid;
run;
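As a numeric complement to the box plot, the two timing vectors can also be summarized directly in SAS/IML. This is only a sketch: it assumes t1 and t2 are still in the default storage library from the runs above, and that the mean() function is available (SAS/IML 9.22 or later). The matrix name stats is hypothetical.

```sas
proc iml;
   load t1 t2;                      /* timing vectors stored by the two tests */
   t = t1 || t2;                    /* 100 x 2 matrix, one column per style */
   stats = mean(t) // median(t);    /* column-wise mean and median */
   print stats[rowname={"mean" "median"}
               colname={"do_loop" "vectorization"}];
quit;
```

Both mean() and median() operate on the columns of a matrix, so no loop over the two timing vectors is needed.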