Monday, December 6, 2010

Proc Fcmp(3): brute force for a distribution's pdf


Proc Fcmp can produce arbitrary distribution formulas. In this example, suppose that I don’t know too much statistics, what if I want to evaluate the pdf of the absolute value of the subtraction between two independent random variable from the uniform distributions? In this example, probably we can follow such procedures: (1) use Proc Kde to find the kernel distribution and make a guess about what this distribution is; (2) use Proc Fcmp to make the pdf of the distribution; (3) use Proc Model to fit the test dataset and find the parameters; (4) use the simulation to validate the hypothesis. Thus, such a method may be a useable solution to evaluate distribution of a random variable, if without stringent mathematical proof.
*************(1) The distribution of the absolute value of between two uniform distribution*****;
data distance;
do i=1 to 10000;
x1=ranuni(0);
x2=ranuni(0);
x=abs(x1-x2);
drop i;
output;
end;
run;
proc sgplot;
histogram x1/transparency=0.3;
histogram x2/transparency=0.7;
histogram x/transparency=0.8;
run;

************(2) Observe the kernel distribution ************;
ods graphics on;
proc kde data=distance;
univar x /plots=all out=myden;
run;
ods graphics off;

***********(3) Fit with Beta distribution*****************;
proc fcmp outlib = sasuser.myfunc.math;
function betadist(a,b,x);
density=(beta(a,b))**(-1)*x**(a-1)*(1-x)**(b-1);
return(density);
endsub;
run;

options cmplib = sasuser.myfunc nodate ls=80;
proc model data=myden(rename=(value=x));
density=betadist(a,b,x);
fit density;
quit;

************(4) Look at the distribution again***************;
data newdata;
do i=1 to 1E5;
y=rand('beta', 1, 2);
output;
end;
run;

proc sgplot data=newdata;
histogram y/fillattrs=(color=green) ;
run;
********************END********************;

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...