Sunday, August 14, 2011

Using PROC COPULA in a more volatile market

The last week witnessed one of the wildest fluctuations in the market. Copula could measure the nonlinear dependence of multiple assets in a portfolio, and most importantly, is pronounced as \`kä-pyə-lə\(Thanks to the tip by Rick). The latest COPULA procedure in SAS 9.3 is one of the emerging tools to implement copulas.

To test it, I used R to download the daily return data for a few stock shares plus S&P500 index prices, since January 01, 2010. The six stocks are Citi group(C), JP Morgan(jpm), Pfizer(pfe), IBM(ibm), Apple(aapl), and Google(goog). I constructed an equally weighted portfolio by them. Until August 12, 2011, there are 406 observations. Therefore, in a composite plot by SAS, the stocks of banks show the greatest volatility, followed by pharmaceutical and high-tech companies.


#*********************(0) DOWNLOAD MARKET DATA***********************;
library("tseries")    
sp500= get.hist.quote(instrument="^gspc",start="2010-01-01",quote="AdjClose")
c    = get.hist.quote(instrument="c",    start="2010-01-01",quote="AdjClose") 
jpm  = get.hist.quote(instrument="jpm",  start="2010-01-01",quote="AdjClose") 
pfe  = get.hist.quote(instrument="pfe",  start="2010-01-01",quote="AdjClose")
ibm  = get.hist.quote(instrument="ibm",  start="2010-01-01",quote="AdjClose")
aapl = get.hist.quote(instrument="aapl", start="2010-01-01",quote="AdjClose")
goog = get.hist.quote(instrument="goog", start="2010-01-01",quote="AdjClose")

result=as.data.frame(diff(log(merge(sp500, c, jpm, pfe, ibm, aapl, goog))))
write.csv(result,file='c:/tmp/r2sas.csv')

**********************(1) INPUT RAW DATA*****************************;
options fullstimer; dm 'output;clear; log;clear;';
data raw;
   infile 'c:/tmp/r2sas.csv' delimiter = ',' missover dsd firstobs=2 ;
   informat date yymmdd10. sp500 c jpm pfe ibm aapl goog best32.; 
   format date date9.;
   input date sp500 c jpm pfe ibm aapl goog;
run;

**********************(2) PLOT STOCK RETURNS*************************;
proc transpose data = raw out = test01;
   by date;
   var c jpm pfe ibm aapl goog;
run;
data test02;
   merge test01 raw(keep=date sp500);
   by date;
run;
ods graphics / antialiasmax=2900;
proc sgpanel data = test02;
   panelby _name_ / spacing=5 columns = 3 rows = 2 novarname;
   series y=sp500 x=date / lineattrs=(color=red);
   series y=col1  x=date / lineattrs=(color=blue);
   refline 0/ axis=y lineattrs=(pattern=shortdash);
   rowaxis label = 'daily returns';
   label col1 = 'individual stock' ;
run;

I followed the online document of this procedure and also chose the t copula. The correlation plots of the fitted data are displayed above. It seems that PROC COPULA could only draw up to 5*5 matrix for scatter plots in my test. I don’t know if there is any parameter to activate since I have 6 stocks.

**********************(3) CALCULATE COPULA***************************;
proc copula data = raw(drop=sp500); 
   var c jpm pfe ibm aapl goog;
   fit t / marginals = empirical
           method = mle
           plots = (data = both matrix);
   simulate / ndraws = 10000
            seed = 20100822
            out  = sm_t;
run;

Then the kernel densities between stocks from the simulated dataset were calculated in SAS and plotted in R.
 
**********************(4) CALCULATE KERNEL DENSITIES*****************;
%macro kernel(var_list = );
   ods select none;
   %do i = 1 %to 5;
      %do j = %eval(&i + 1) %to 6;
         %let var1 = %scan(&var_list, &i);
         %let var2 = %scan(&var_list, &j);
         proc kde data= sm_t ;
            bivar &var1 &var2 / out = _tmp01;
         run;
         %if %eval(&i + &j) = 3 %then %do;
            data comb;
               set _tmp01;
            run;
         %end;
         %else %do;
            data comb;
               set comb _tmp01;
            run;
         %end;
      %end;
   %end;
   ods select all;
%mend;
%kernel(var_list = c jpm pfe ibm aapl goog);

data comb1;
   set comb;
   length gname $15;
   gname = cats('x=', var1, ';', 'y=', var2);
   keep value1 value2 gname density;
run;
proc export data = comb1 outfile = 'c:/tmp/sas2r.csv' replace;
run;

#*********************(5) PLOT DENSITY IN R**************************;
x = read.csv('c:/tmp/sas2r.csv')
library('lattice')
wireframe(density ~ value1 * value2 | gname , x, shade = TRUE,
          screen = list(z = -30, x = -50), lwd = 0.01, 
          xlab = "Stock X", ylab = "Stock Y",  
          zlab = "Density")


The simulated daily portfolio returns are likely to follow a normal distribution.

**********************(6) PLOT RETURN DISTRIBUTION*******************;
data port_ret (drop = i ret);
   set sm_t;
   array returns{6} c jpm pfe ibm aapl goog;
   ret =0;
   do i =1 to 6;
      ret = ret+ (1/6)*exp(returns[i]);
   end;
   port_ret = ret-1;
run;

proc kde data = port_ret;
   univar port_ret / percentiles = 1,2.5,5,10,90,95,99 plots=histdensity;
   ods output  percentiles = pcts;
   format port_ret percent8.3;
   label port_ret = 'portfolio return';
run;

Several predicted portfolio changes at given probabilities are given in a tilt plot. Hope this portfolio’s performance next day (August 15, 2011) would be within the expected ranges.

**********************(7) PLOT PORTFOLIO RETURNS*********************;
data prob;
   set pcts;
   percent = percent / 100;
   if percent > 0.5 then do ;
      percent  = 1 - percent ;
    result = put(percent , percent8.1)||
         'probability portfolio gains'|| put(port_ret, percent8.3);
   end;
    else result = put(percent , percent8.1)||
         'probability portfolio loses'|| put(port_ret, percent8.3);
run;

goptions device=javaimg ftitle="arial/bold" ftext="arial" 
htitle=.15in htext=.2in xpixels=600 ypixels=500;
proc gtile data = prob;
    tile Percent tileby = (result, Percent) / colorvar = port_ret;
   format port_ret 8.4;
   label port_ret = 'portfolio return';
run;

PROC COPULA supports five types of copulas(normal copula, t copula, clayton copula, Gumbel copula and Frank copula). Jan Chvosta described a ranking method to choose the best copula. I can easily apply the author's protocol.

Overall, PROC COPULA has much lower learning curve than the R package ‘copula’. Hope it grows to a dominating tool in analyzing copula.

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...