Wednesday, May 25, 2011

Semiparametric methods in predicting loss given default


Sparse data is a major concern when building loss given default (LGD) models for corporate risk. Most LGD predictors are instrument-related, firm-specific, macroeconomic, or industry-specific variables, and collecting such data can be relatively costly. In one example from Loeffler and Posch’s book, the industry-level average default rate, the yearly average default rate, and firm-level leverage were used to predict LGD. To improve predictive power, a rather cumbersome transformation of LGD was applied [Ref. 1]. Nonlinear models could be considered instead.

In a conference paper on consumer risk scoring, Wensui Liu and his coauthors noted that the generalized additive model (GAM) can detect nonlinear relationships between risk behavior and predictors [Ref. 2]. In this example, we are probably most interested in the parameter of firm-specific leverage (lev). I therefore used Proc GAM to estimate this variable’s parameter as a linear term while smoothing the other predictors with LOESS functions. In addition, I used Proc LOESS to fit a fully nonparametric regression for comparison. In a series plot comparing the two methods, their LGD predictions are quite close. As a result, Proc GAM may be a useful tool for building a meaningful semiparametric regression to predict LGD.
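For readers who prefer notation, the two fits described above can be sketched roughly as follows. The symbols \beta_0, \beta_1, s_1, s_2, g and \varepsilon_i are introduced here for illustration and do not appear in the original code; the variable names match the SAS data set. The first line corresponds to the Proc GAM model (a linear term for lev plus LOESS smoothers for the other two predictors), the second to the Proc LOESS model (a single smooth surface over all three predictors).

```latex
\begin{align*}
\text{semiparametric (Proc GAM):}\quad
  \mathrm{lgd}_i &= \beta_0 + \beta_1\,\mathrm{lev}_i
                  + s_1(\mathrm{i\_def}_i) + s_2(\mathrm{lgd\_a}_i) + \varepsilon_i \\
\text{nonparametric (Proc LOESS):}\quad
  \mathrm{lgd}_i &= g(\mathrm{lev}_i,\ \mathrm{lgd\_a}_i,\ \mathrm{i\_def}_i) + \varepsilon_i
\end{align*}
```

In the first form, \beta_1 is the single interpretable leverage parameter, which is the reason for keeping lev outside the smoothers.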

References:
1. Gunter Loeffler and Peter Posch. ‘Credit Risk Modeling using Excel and VBA’. 2nd edition. Wiley. 2011.
2. Wensui Liu, Chuck Vu and Jimmy Cela. ‘Generalizations of Generalized Additive Model (GAM): A Case of Credit Risk Modeling’. SAS Global Forum. 2009.

/* Read the tab-delimited raw data, skipping the header row */
data _tmp01;
  infile "h:\raw_data.txt" delimiter = '09'x missover dsd firstobs=2;
  informat lgd   lev   lgd_a i_def 8.3;
  label lgd   = 'Real loss given default'
      lev   = 'Leverage coefficient by firm'
      lgd_a = 'Mean default rate by year'
      i_def = 'Mean default rate by industry';
  input lgd   lev   lgd_a i_def;
run;

ods html gpath = 'h:\' style = money;
ods graphics on;
/* Nonparametric fit: one LOESS surface over all three predictors */
proc loess data=_tmp01;
   model lgd = lev lgd_a i_def / scale = sd select = gcv degree = 2;
   score;
   ods output scoreresults = predloess;
run;

/* Semiparametric fit: lev enters linearly, the other predictors are smoothed */
proc gam data= _tmp01 plots = components(clm);
   model lgd = loess(i_def) loess(lgd_a) param(lev) / method = gcv;
   output out = predgam p = pbygam;
run;
ods graphics off;

/* Combine both sets of predictions; a merge without BY pairs rows by position */
data _tmp02;
   merge predloess predgam;
   keep p_LGD LGD pbygamLGD obs;
   label p_LGD = 'Prediction by Proc LOESS'
      pbygamLGD = 'Prediction by Proc GAM';
run;

/* Overlay actual LGD and the two predictions by observation */
proc sgplot data = _tmp02;
   series x = obs y = lgd ;
   series x = obs y = p_lgd;
   series x = obs y = pbygamlgd;
   yaxis label = 'loss given default';
run;
ods html close;

Good math, bad engineering

As a former statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...