Wednesday, July 27, 2011

SAS dataset declassified by Matt Shotwell


Matt Shotwell’s new R package ‘sas7bdat’ is a great achievement to bridge SAS and R. Earlier this year Revolution R, a commercial competitor against SAS, launched a RxSasData() function to read SAS’s unique ‘sas7bdat’ data structure. However, we more like the free lunch provided by the community R.

Now R would have a free access toward SAS’s datasets, including many SAS’s own help datasets. And we will be able to do a lot of tricks toward SAS’s datasets powered by R, in many areas where SAS can’t reach or we didn’t pay the licenses. For example, SAS has a SASHELP.LAKE dataset to show the surface plot feature.  We can use R to directly read it and draw a picture combining a contour plot and a surface plot.


library('sas7bdat', 'lattice')
x = read.sas7bdat('c:/program files/sas/sasfoundation/9.2/graph/sashelp/lake.sas7bda')

panel.3d.contour <-
    function(x, y, z, rot.mat, distance,
           nlevels = 20, zlim.scaled, ...){
    add.line <- trellis.par.get("add.line")
    panel.3dwire(x, y, z, rot.mat, distance,
               zlim.scaled = zlim.scaled, ...)
    clines <-
      contourLines(x, y, matrix(z, nrow = length(x), byrow = TRUE),
                   nlevels = nlevels)
    for (ll in clines) {
      m <- ltransform3dto3d(rbind(ll$x, ll$y, zlim.scaled[2]),
                            rot.mat, distance)
      panel.lines(m[1,], m[2,], col = add.line$col,
                  lty = add.line$lty, lwd = add.line$lwd)
    }
}

wireframe(-Depth ~ Length * Width , x, panel.aspect = 0.6,
        panel.3d.wireframe = "panel.3d.contour",  shade = T,
        screen = list(z = -30, x = 50), lwd = 0.01,
        xlab = "Length", ylab = "Width",  
        zlab = "Depth")


I also had a little test to evaualate the speed of the read.sas7bdat() function. Reading the SAS dataset SASHELP.LAKE 30 times only took 1.64 second on my 3-yea-old desktop, which is certainly much faster than transforming it to a CSV file and inputting.

library('sas7bdat')
test <- function(n = 30) {
 system.time(
     for(i in 1:n)
       read.sas7bdat('c:/program files/sas/sasfoundation/9.2/graph/sashelp/lake.sas7bda')
    )
}
test()

> user  system elapsed 
  1.60    0.05    1.64
Hope Matt continues to improve this wonderful package:
  1. add the support for SAS datasets generated by 64bit systems;
  2. add a write.sas7bdat() function (that will be so cool!).

Monday, July 25, 2011

Play Basel II Accord with SAS (2): portfolio simulation


Although Basel II largely depends on probability instead of generalized linear model that SAS is especially good at, still SAS’ excellent data manipulation and visualization features make it one of the finest tools to explore and implement this accord. Paragraphs from No. 403 to No. 409 of Basel II - Pillar 1 list the requirements for a functioning grading structure, such as at least 7 grading levels are needed.


The internal rating-based (IRB) approaches utilize a bank's own rating structure to estimate the risk weights. To discover the impact of an arbitrary grading structure toward portfolio-wise capital requirement, I simulated a 2000-borrower size credit portfolio. Other settings are 45% loss given default (LGD), 2.5 yr maturity and a 7-level grading structure 0-0.05-0.08-0.15-0.5-2-15, exactly like what were used by Gunter and Peter. The histogram of this simulated portfolio between probability of default (PD) and exposure at default (EAD) is displayed above.


data simu;
   format lgd pd percent8.2 ead dollar8.;
   lgd = 0.45;
   m = 2.5;
   do i = 1 to 2000;
      pd = rand('EXPO') * 1.8 / 100;
      ead = rand('UNIFORM') * 300 + 700;
      output;
   end;
   drop i;
   label lgd = 'Loss given default'
         pd  = 'Probability of default'
         ead = 'Exposure at default'
         m   = 'Maturity';
run;

ods html gpath = 'c:\tmp\' style = money;
ods graphics on;
proc kde data = simu;
   bivar pd ead / plots = histsurface;
run;
ods graphics off;
ods html close;

data grdstr01;
   grdstr = '0-0.05-0.08-0.15-0.5-2-15';
   informat lowbound 8.4;
   do i = 1 to 7;
      lowbound = scan(grdstr, i, '-') / 100;
      output;
   end;
   call symput('grdstr', grdstr);
   keep lowbound;
run;

data grdstr02;
   merge grdstr01 grdstr01(firstobs=2 rename=(lowbound=uppbound));
   if missing(uppbound) = 1 then uppbound = 1; 
   grade = _n_;
   ratio = uppbound - lowbound;
run;

proc sql noprint;
   select cats(lowbound, '-<', uppbound, '=', grade) into: fmtvalue separated by ' '
   from grdstr02;
quit;


A grading format and the capital requirement function were built. Then the portfolio was graded and the by-grade required capitals were calculated. The classified results are showed above. Overall, the weighted portfolio-wise capital requirement is 7.95%.


proc format;
   value gradefmt
      &fmtvalue ;
run;

proc fcmp outlib = work.myfunclib.finance;
  function reqcap(pd, lgd, m);
     corr = 0.12*(1-exp(-50*pd))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*pd))/(1-exp(-50)));
     mtradj = (0.11852 - 0.05478 * log(pd))**2;
     return( (lgd * probnorm((probit(pd) + corr**0.5 * probit(0.999))
           / (1-corr)** 0.5) - pd*lgd) * (1 + (m-2.5)*mtradj) / (1-1.5*mtradj) );
  endsub;
quit;

options cmplib = (work.myfunclib);
proc sql noprint;
   create table _tmp01 as
   select * , put(pd, gradefmt.) as grade 
   from simu
   ;
   select count(*) into: totalno
   from _tmp01
   ;
   select avg(pd) into: totalpd
   from _tmp01
   ;
   create table _tmp02 as
   select distinct grade, count(grade) / &totalno as proppd, avg(pd) as grouppd, 
                  reqcap(calculated grouppd, lgd, m) as groupcr,
                  (calculated proppd * calculated grouppd) / &totalpd as propdft
   from _tmp01
   group by grade
   ;
   create table _tmp03 as
   select a.*, b.groupcr as cr,  
      b.groupcr*a.ead as expcr 'EAD*Required Capital' format = dollar8.2
   from _tmp01 as a left join _tmp02 as b
   on a.grade = b.grade
   ;
   select sum(expcr) / sum(ead) into :totalcr
   from _tmp03
   ;
quit;

ods html gpath = 'c:\tmp\' style = money;
proc sgpanel data = _tmp03 noautolegend;
   title "The portfolio-wise required capital is %sysfunc(putn(&totalcr, percent8.2))";
   title2 "By a grading structure &grdstr";
   panelby grade / columns = 4 rows = 2;
   needle x = pd y = expcr;
   rowaxis grid;
run;
ods html close;

The required capital corresponding to each grade was demonstrated in a tile plot. The size of the subplots indicates the counts of borrowers at individual grades. Obviously, the concentrations of borrows concentrated in some grades such as 5 and 6, which contradicts the requirement by Paragraph 406.


proc sql;
   create table _tmp04 as
   select distinct grade, cr 'Required capital', count(grade) as count
   from _tmp03
   group by grade
   ;
quit;
ods html gpath = 'c:\tmp\' style = harvest;
goptions device=javaimg ftitle="arial/bold" ftext="arial" 
htitle=.15in htext=.2in xpixels=600 ypixels=500;
proc gtile data = _tmp04;
    tile count tileby = (grade, count) / colorvar = cr;
run;
ods html close;


Furthermore we can use area under curve(AUC) or Gini coefficient to evaluate this grading structure. For this portfolio, Gini coefficient by such a grading structure is 0.4296, which is pretty low and may suggest that they don’t match each other.


data _tmp05;
   if _n_ = 1 then do
      sum_proppd = 0; sum_propdft = 0;
      dif_dft = 1; dif_pd = 1;
      output;
   end;
   set _tmp02;
      retain sum_proppd sum_propdft;
      sum_proppd = sum_proppd + proppd;
      sum_propdft = sum_propdft + propdft;
      dif_dft = max(0, 1 - sum_propdft);
      dif_pd  = max(0, 1 - sum_proppd);
   output;
run;

proc iml;
   use _tmp05;
      read all var{dif_pd dif_dft};
   append;
   start TrapIntegral(x,y);
      call sort(x,1);
      call sort(y,1);
      N = nrow(x);
      dx = x[2:N] - x[1:N-1];
      meanY = (y[2:N] + y[1:N-1])/2;
      return( dx` * meanY );
   finish;
   area = TrapIntegral(dif_pd, dif_dft);
   gini = (area - 0.5) / (&totalpd / 2 + (1 - &totalpd) - 0.5);
   call symput('area', left(char(area)));
   call symput('gini', left(char(gini)));
quit;

ods html gpath = 'c:\tmp\'  style = money; title;
proc sgplot data = _tmp05;
   series x = dif_pd  y = dif_dft ;
   scatter x = dif_pd  y = dif_dft ;
   band upper = dif_dft lower = 0 x = dif_pd / transparency=.5;
   xaxis grid label = 'Ratio of observations ';
   yaxis grid label = 'Ratio of default' ;
   inset "GINI Coefficient is %sysfunc(putn(&gini, 8.4))" 
      / position = topleft border;
   keylegend "scatter";
run;
ods html close;
In conclusion, the grading structure has significant impact toward the required capital. Optimization of grading structure can be realized by the method of random walk, which will be discussed next post.

Monday, July 18, 2011

Top 10 most exciting new features in SAS 9.3

The era of SAS 9.3 arrives this summer. As a keen SAS user, I look forward to trying ten new features of SAS 9.3, disclosed by its online documents and Rick’s recent blog posts(his first post and his second post).


1. PROC GROOVY
SAS 9.3 ships with PROC GROOVY, a procedure designed to run Groovy statement within SAS. Groovy is one of many Java’s derivatives and a dynamic typing language. It seems that by its ADD statement PROC GROOVY can invoke Java’s classes and libraries. If so, it may extend the application of SAS to much broader areas. I am interested in exploring the potentials of PROC GROOVY in high performance computation.
Impact: R calls C and Fortran, and SAS can call Java now. I am going to buy a book ‘Groovy in Action’ to warm up.


2. Bubble and pie charts by Graph Template Language (GTL) or PROC SGPLOT
In the past, the SG procedures didn’t count pie chart in, possibly because all statisticians despise its misleading visual effect and treat it not as a ‘statistical’ plot. It really bothered people who work with business report and can’t live without pie chart. With bubble and pie charts joining SAS9.3, ODS graphics grows more powerful. Another good news is that ODS graphics moves to SAS/BASE, which means that I can use the SG procedures or GTL at every computer installed with SAS.
Impact: Time to say good-bye to the ancient GCHART and GPLOT procedures. All of the sudden, except the map-making GMAP procedure, SAS/GRAPH becomes dispensable.


3. HTML output by default
I used to think that HTML costs much more computer resources. Glad that SAS fixed the problem. I am also curious what would happen under UNIX SAS or Mainframe SAS.
Impact: I am going to test it by running 1000 regressions. If SAS got jammed, I will switch back to listing as output.


4. Call procedures within PROC IML
I tried this impressive feature in SAS/IML studio under SAS 9.22. In SAS 9.3, PROC IML can harness all power of SAS's procedures and therefore becomes omnipotent.
Impact: With this feature, a SAS/IML programmer doesn’t need to learn Data Step at all. Codes written by Matlab or R can be smoothly migrated to SAS/IML (just treat SAS’s procedures as functions).


5. PROC COPULA
Ten years after its invention, copula becomes a must-have risk evaluation tool for a portfolio of market or credit assets. Eventually PROC COPULA has its debut, and looks very straightforward and versatile. This new procedure is going to be a big plus for SAS/ETS.
Impact: The first thing I probably will do is to download daily return data and use PROC COPULA to test copula-based VaRs.


6. PROC PLM
The first time I saw the appearance of PROC PLM was SAS 9.22. However, in SAS 9.3, the STORE statement starts to exist in every statistical procedure. Therefore PROC PLM could conveniently implement those stored models.
Impact: In the past, each procedure in SAS/STAT has distinctive way to output model and score new data set. Now PROC PLM is the universal standard to use models being built.


7. PROC FMM
SAS’s latest and finest weapon to build mixture models. It is going to be a lot of fun to fit an exotic distribution made by several uncorrelated distributions, such as normal + weibull.
Impact: Awesome! I will be able to utilize PROC FMM to tackle those infamous ‘fat tail’ distributions.


8. System option procedures OPTIONS, OPTLOAD, and OPTSAVE
I always feel headache about memorizing more than 300 SAS system options. Hope the combination of these procedures could help manage the army of options.
Impact: I would abandon my petty macro that turns on all system options before I start to code.


9. New functions
A number of new functions are added to SAS 9.3: 14 new functions for Data Step; 2 for PROC FCMP; 25 for PROC IML.
Impact:  The ever-increasing stockpile of SAS functions gets bigger!


10. Waterfall chart by Graph Template Language (GTL) or PROC SGPLOT
Waterfall chart turns super popular recently (did you not see it in any fancy business report?). Fortunately SAS 9.3 quickly adopted it.
Impact: Finally I don’t need to painfully emulate a waterfall chart by the BAND and STEP statements of PROC SGPLOT.


Overall, SAS 9.3 enjoys significant enhancement in visualization and programmability. I can’t wait to have a test-drive.

Monday, July 11, 2011

Play Basel II Accord with SAS (1): capital requirement

Basel II Accord, revised by Basel Committee on Banking Supervision (BCBS) and adopted by more than 100 nations, regulates the commercial banks’ capital against risks. Major US banks are under the transition window to fully comply it (2008-2011) [Ref. 1]. I am especially interested in exploring Basel II by SAS codes.

Basel II Accord provides the detailed requirements by its three ‘pillars’. Pillar 1 is about the minimal capital requirement and tries to hold the total capital at 8% of risk-weighted assets. Paragraph 272 gives the equations to calculate the capital requirement for a loan with its risk factors, such as PD(probability of default), LGD(loss given default) and M(effective maturity). I used a user-defined function in SAS to compute it. Then I drew three plots for 1/ 2.5/ 5-year maturity, respectively. The function behaves like a paraboloid, which may suggest that the relationship may not hold when PD is very high. In addition, the longer maturity needs more capital.

proc fcmp outlib = work.myfunclib.finance;
   function reqcap(pd, lgd, m);
      corr = 0.12*(1-exp(-50*pd))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*pd))/(1-exp(-50)));
      mtradj = (0.11852 - 0.05478 * log(pd))**2;
      return( (lgd * probnorm((probit(pd) + corr**0.5 * probit(0.999)) 
            / (1-corr)** 0.5) - pd*lgd) * (1 + (m-2.5)*mtradj) / (1-1.5*mtradj) );
   endsub;
quit;

options cmplib = (work.myfunclib);
data reqcap01;
   do pd = 0.01 to 0.80 by 0.01;
       do lgd = 0.01 to 0.6 by 0.01;
           do m = 1 to 5 by 0.5;
               reqcap = reqcap(pd, lgd, m);
               output;
           end;
       end;
   end;
run;

proc template;
   define statgraph surface001;
      dynamic title;
      begingraph;
         entrytitle title;
         layout overlay3d / cube = true rotate = 30 
                xaxisopts = (label="Probability of default")
                yaxisopts = (label="Loss given default")
                zaxisopts = (label="Required capital");
            surfaceplotparm x = pd y = lgd z = reqcap
                   / surfacetype = fill surfacecolorgradient = reqcap;
            endlayout;
      endgraph;
   end;
run;

%macro surfaceplot(mlist = 1 2.5 5);
   %do i = 1 %to 3; 
      %let maturation = %scan(&mlist, &i, ' ');
      ods html gpath = "c:\tmp\"  style = money;
      proc sgrender data = reqcap01(where=(m=&maturation)) template = surface001;
         dynamic title = "Time to maturation is &maturation yr";
      run;
      ods html close;
   %end;
%mend;
%surfaceplot(mlist = 1 2.5 5);

The unconditional expected loss, LGD * PD, is already subtracted by the capital requirement function, according to Basel II. It can be compared with required capital in the same plot above.

proc template;
   define statgraph surface002;
      dynamic title;
      begingraph;
         entrytitle title;
         layout overlay3d / cube = false rotate = 150 tilt = 30
                xaxisopts = (label="Probability of default")
                yaxisopts = (label="Loss given default")
                zaxisopts = (label="Required capital") ;
            surfaceplotparm x = pd y = lgd z = reqcap
                   / surfacetype = fill surfacecolorgradient = reqcap;
            surfaceplotparm x = pd y = lgd z = eval(pd * lgd); 
            endlayout;
      endgraph;
   end;
run;

ods html gpath = "c:\tmp\"  style = money;
proc sgrender data =  reqcap01(where=(m=2.5)) template = surface002;
   dynamic title = "Time to maturation is 2.5 yr";
run;
ods html close;

Paragraph 273 in Pillar 1 makes firm-size adjustment for small- or medium-sized borrowers with less than 50 million Euros reported sales. Then I made another function to reflect the adjustment.

proc fcmp outlib = work.myfunclib.finance;
   function reqcap_sme(pd, lgd, m, s);
      s1 = max(s, 5);
      corr = 0.12*(1-exp(-50*pd))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*pd))/(1-exp(-50))) 
            - 0.04*(1-(s1-5)/45);
      mtradj = (0.11852 - 0.05478 * log(pd))**2;
      return( (lgd * probnorm((probit(pd) + corr**0.5 * probit(0.999)) 
            / (1-corr)** 0.5) - pd*lgd) * (1 + (m-2.5)*mtradj) / (1-1.5*mtradj) );
   endsub;
quit;

options cmplib = (work.myfunclib);
data reqcap02;
   m = 1.5;
   do s = 5 to 50 by 5;
      do pd = 0.01 to 0.80 by 0.01;
         do lgd = 0.01 to 0.6 by 0.01;
               reqcap = reqcap_sme(pd, lgd, m, s);
               output;
           end;
       end;
   end;
run;
proc export data = reqcap02 outfile = 'c:\tmp\out.csv' replace;
run;


Since SAS does not have an option for parallel 3d plotting yet, I switched to R’s ‘lattice’ package. Comparing the plots, it shows that Basel II made an allowance for small size borrowers to inhibit the trend of over-lending to big entities.

x = read.csv('c:/tmp/out.csv')
library('lattice')
wireframe(reqcap ~ pd * lgd | s , x, shade = TRUE,
          screen = list(z = -30, x = -50), lwd = 0.01, 
          xlab = "Probablity of default", ylab = "Loss given default",  
          zlab = "Required capital")


PS.
Rick Wicklin kindly provided a novel solution for four-dimension visualization - paneled contour plot, powered by SAS's graph template language (GTL). Therefore, we can clearly see the trend as the sizes of the borrowers change. In conclusion, by utilizing the SG procedures and GTL of SAS, a SAS user may easily grow to a data artist.

proc template;
   define statgraph Graph;
      dynamic _X _Y _Z _panelby;
      dynamic _panelnumber_;
         begingraph;
            layout datapanel classvars=(_panelby) /
               rowdatarange=unionall row2datarange=unionall
               columndatarange=unionall column2datarange=unionall
               headerlabeldisplay=value rows=3 columns=4;
               layout prototype;
                  contourplotparm x=_X y=_Y z=_Z /
                     name='contour' contourtype=LABELEDLINEFILL
                     colormodel=ThreeColorRamp gridded=false;
               endlayout;
           endlayout;
        endgraph;
   end;
run;

ods html gpath = "c:\tmp\" style = money;
proc sgrender data=WORK.REQCAP02 template=Graph;
   dynamic _X="PD" _Y="LGD" _Z="REQCAP" _panelby="S";
run; 
ods html close;
References:
1. Philippe Jorion. ‘Financial Risk Manager Handbook’ 6th edition. Wiley, 2010
2. ‘Basel II - Third Consultative Package, Pillar One (29 April 2003)’. www.bis.org/bcbs/cp3part2.pdf

Thursday, July 7, 2011

Using SAS/IML for high performance VaR backtesting


Thanks to Rick’s new book about SAS/IML, I realized that SAS/IML is far beyond PROC IML, and recently it added an advanced IML language called IMLPlus that enjoys many new features. I also start to play with SAS/IML Studio, an IDE exclusively for IMPlus in addition to the SAS enhanced editor we are familiar with (Interestingly, a number of emerging IDEs for matrix languages prefer to use names like XXX Studio. For example, a hot IDE of R is called RStudio). Although SAS users/programmers are notably conservative, I had a rough intuition that learning SAS/IML will be helpful in some occasions of high-demanding computation.

For instance, backtesting of Value-at-risk(VaR) requires a lot of computer resource. At the beginning, VaR based on different methods should be calculated. Jon Danielsson introduced how to use Matlab and R to find VaRs by four ways: exponential moving average, moving average, historical simulation, and generalized autoregressive conditional heteroskedasticity (GARCH), separately [Ref. 1]. And he maintains a daily-updating plot on his website to show the trends [Ref. 2]. As Jon described, if a 4000-day data is fetched and a 1000-day rolling window for calculation is applied, it means that 3000 loops will be needed to have the VaRs , which would make the computation efficiency a concern. Fortunately, we can easily translate the codes by Matlab and R into SAS codes through PROC IML. If there is any hurdle in this porting process, I can always find tips from Rick’s resourceful blog [Ref.3]. The first three chapters of his book are also good reference [Ref. 4].

That is what I did. First I have to use R to download S&P 500 daily returns from 1995-08-09 to 2011-06-28, total 4000 observations. SAS/IML Studio can call R’s function. However, the R on my laptop is version 2.13.0, which is not supported by SAS/IML Studio currently. So I directly used RStudio, my favorite IDE for R, and saved the data as CSV.

######################(0) DOWNLOAD STEP###############################
library("tseries")    
p = get.hist.quote(instrument = "^gspc",
    Start = "1995-08-09",end="2011-06-28",quote="AdjClose",quiet=T) 
y=diff(log(p))
nrow(y)
result=as.data.frame(y)
write.csv(result,file='c:/tmp/result.csv')
####################END OF STEP 0#####################################
Then I used PROC IML to calculate VaRs by exponential moving average, moving average and historical simulation, and eventually inserted them into a macro. Unlike R or Matlab, SAS depends on the procedures rather than the functions. Since PROC IML cannot call other SAS’s procedures, I used PROC FCMP to make PROC AUTOREG as a user-defined function for GARCH. SAS now provides virtual hard disk with physical memory specified by the MEMLIB command, which is indeed a big help for me to slash the computer’s operation time. After all, the total time to run the program below on my 2-year-old laptop is 9 minutes, which is acceptable.

**********************(1) INPUT STEP*********************************;
***********(1.1) INPUT DAILY RETURN DATA FROM R**********************;
data _tmp01;
   infile "c:\tmp\result.csv" delimiter = ',' missover dsd
   lrecl=32767 firstobs=2; 
   format   date yymmdd10. r best32.; 
   informat date yymmdd10. r best32.;
   input date $ r;
run;

***********(1.2) LOAD DATA INTO MEMORY LIBRARY***********************;
libname mem 'd:\tmp' memlib;
data mem._tmp01; 
   set _tmp01(keep=r);
run;
********************END OF STEP 1************************************;

**********************(2) MODULE-BUILDING STEP***********************;
***********(2.1) CREATE A FUNCTION FOR 3 TYPES OF VARS BY AVERAGING**;
%macro var123_macro();
   proc iml;
      lambda = 0.94;
      use mem._tmp01;
         read all var{r} into y;
      close mem._tmp01;
      t1 = &start - 1000;
      t2 = &start - 1;
      window = y[t1:t2];
      start Var(A);            
         n = nrow(A);              
         return( (A-A[:,])[##,]/(n-1) );
      finish Var;
      if exist("mem.b") then do;
         use mem.b;
            read all var{sl1} into sl1;
         close mem.b;
      end;
      else do;
         sl1 = Var(y[1:30]);
         do t = 2 to 1000;
            sl1 = lambda * sl1 + (1-lambda) * y[t-1]**2;
         end;
      end;
      sl1 = lambda * sl1 + (1-lambda) * y[&start-1]**2;
      var1 = -probit(&pvalue) * sqrt(sl1) * &value;
      call symput('var1', left(char(var1)));
      create mem.b var{sl1};
            append;
      close mem.b;
      call sort(window, {1});
      var2 = -sqrt(Var(window)) * probit(&pvalue) * &value;
      var3 = -window[1000*&pvalue] *&value;
      call symput('var2', left(char(var2)));
      call symput('var3', left(char(var3)));
   quit;
%mend var123_macro;

proc fcmp outlib = work.myfunclib.finance;
   subroutine var123(start, pvalue, value, var1, var2, var3);
      outargs var1, var2, var3;
      rc1 = run_macro('var123_macro', start, pvalue, value, var1, var2, var3);
   endsub;
quit;

***********(2.2) CREATE A FUNCTION FOR VAR BY GARCH******************;
%macro var4_macro();
   ods select none;
   %let start = %eval(&start - 1000);
   %let end   = %eval(&start + 999);
   proc autoreg data = mem._tmp01(firstobs = &start obs = &end);
      model r = / noint garch=(q=1, p=1);
      output out = mem._gar01 ht = ht;
   run;
   data _null_;
      set mem._gar01 end = eof;
      if eof then do;
          var4 = -sqrt(ht) * probit(&pvalue) * &value;
          call symput('var4', var4);
      end;
   run;
   ods select all;
%mend;

proc fcmp outlib = work.myfunclib.finance;
   function var4(start, pvalue, value);
       rc1 = run_macro('var4_macro', start, pvalue, value, var4);
       return(var4);
   endsub;
quit;
********************END OF STEP 2************************************;

**********************(3) IMPLEMENTATION STEP************************;
options cmplib = (work.myfunclib);
data mem.one;
   pvalue = 0.01;
   value = 1000;
   do i  = 1001 to 4000;
       call var123(i, pvalue, value, var1, var2, var3);
       var4 = var4(i, pvalue, value);
       output;
   end;
run;
********************END OF STEP 3************************************;
The four VaRs can be displayed in a time series plot. Then the backtests can be conducted by Bernoulli coverage test and independence coverage test.

**********************(4) VISUALIZATION STEP*************************;
data _tmp02;
   set _tmp01;
   i = _n_;
run;

data _tmp03(where=(i>1000));
   merge _tmp02 mem.one;
   by i;
   label var1 = "Exponential Moving Average"
         var2 = "Moving Average"
         var3 = "Historical Simulation"
         var4 = "GARCH";
run;

ods html gpath = "c:\tmp" style = harvest;
proc sgplot data = _tmp03;
   series x = date y = var1;
   series x = date y = var2;
   series x = date y = var3;
   series x = date y = var4;
   format var1 dollar8.2;
   yaxis grid label = 'Value-at-risk '; 
   xaxis label = ' ';
run;
ods html close;
********************END OF STEP 4************************************;
The enhanced version of PROC IML, IMLPlus, can call SAS’s procedures within it. Rick gave a number of illustrations in the 4, 12-14 chapters of his book [Ref.4]. The GARCH part above can be re-written in SAS/IML Studio like:

value = 1000;
pvalue = 0.01;
var4 = j(4000, 1);

do i = 1001 to 4000;
   start = i - 1000;
   end   = i - 1;
   submit start end;
      proc autoreg data = sasuser._tmp01(firstobs = &start obs = &end);
            model r = / noint garch=(q=1, p=1);
            output out = _gar01 ht = ht;
      quit;
   endsubmit;
   use _gar01;
      read all var{ht} into ht;
   close _gar01;
   x = ht[1000];
   getvar4 = -sqrt(x)*probit(pvalue)*value;
   var4[i, 1] = getvar4;
end;

create sasuser.garch from var{var4}; 
   append;       
close sasuser.garch; 
In conclusion, I had a lot of fun in studying SAS/IML this summer. SAS/IML Studio is a friendly environment to write code. And IMLPlus provides a different perspective to think about SAS programming.

References:
1. Jon Danielsson. ‘Financial Risk Forecasting’. Wiley, 2011.
2. Jon Danielsson . http://www.financialriskforecasting.com/results
3. Rick Wicklin. The Do Loop. http://blogs.sas.com/iml/
4. Rick Wicklin. ’ Statistical Programming with SAS/IML Software’. SAS Publishing, 2010.

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...