Wednesday, September 22, 2010

Multi-study research on Bovine respiratory disease



Situation:
The purpose of this research was to (1) to explore a recent multi-study approach (Arends, et al. 2008) in combining observational survival data instead of traditional meta-analysis, and (2) to develop multivariate random-effects models with or without covariates to aggregate three studies on Bovine Respiratory Disease (BRD). Models were constructed, assessed and presented by programming in SAS®.
Task:The multivariate random-effects models built in this report demonstrated improved efficiency, and generalizability and precision.
Action:First the modeling is simple and easy to explain. Second the aggregation of the three studies was accomplished and survival proportion with CIs were updated. Second the estimated survival proportions by such models, showing reduced standard error, are more precise than Kaplan-Meier estimated survival proportion or Cox proportion hazards regression estimated survival proportion.
Result:Transformed data with LOCF by LAST. and FIRST. variables;Proposed a multi-survival-data model by PROC PHREG/LIFETEST/GLIMMIX and increased >70% precision; Rendered graphs by PROC SGPLOT/SGPANEL and SAS Graph Template language; Generated table and listing by PROC REPORT/SQL/TABULATE

Reference: Arends LR, Hunink MG, Stijnen T. Meta-analysis of summary survival curve data. Stat Med. 2008 Sep 30;27(22):4381-96.
Code I
/******************************************************************/
PROGRAM: NO-COVARIATES MODEL
PURPOSE: CREATE ARENDS' MODEL FOR DATA SETS 1 & 2 $ 3 BY NO-COVARIATE METHOD
AUTHOR: ---------------
LAST REVISE DATE:  JUNE 18, 2010
LAST TEST DATE: JUNE 18, 2010
*****************************************************************/
libname source 'd:\animal_data'; /*set up the library for the input*/;
/*set up the work library for output*/;

********************BEGINNING OF THE PROGRAM********************;

**********#1 USE PROC LIFETEST**********;
ods graphics on;
proc lifetest data = source.data_1_2_3 OUTSURV =pred_bylifetest  plots=(ls,lls) alpha=0.05;   /*generate the log and loglog transformation plots*/
title 'Figure 2A & 2B ';
time DOF_FIRST_TX * status(0);
strata source;
run;
ods graphics off;
**********END**********;

**********#2 GENERATE SURVIVAL CURVE BY WEEKS**********;
proc sgplot data=pred_bylifetest;
title 'Figure 1';
step x= DOF_FIRST_TX y=survival /group=source;
yaxis min=0;
xaxis grid values=(0, 7, 14, 21, 28, 35, 42, 49, 56);
run;
**********END**********;

**********#3 MANIPULATION OF THE PREDICTED DATA SET BY PROC LIFETEST**********;
data pred_bylifetest;
set pred_bylifetest;
*****CLEAN THE DATA SET;
if survival=.  then delete;                 /*delete invalid information for survival proportion*/
*****ADD TWO VARIABLES;
if survival ne 1 then  lls=log(-log(survival));                      /*add the log(-log()) transformed survival proportion*/
if DOF_FIRST_TX gt 0 then ln_day=log(DOF_FIRST_TX);                                 /*add the log transformed time*/
*****ADD LABELS;
label DOF_FIRST_TX='Days to the first treatment'                    
ln_day='log(Days to the first treatment)'
lls='log(-log(Survival probability function))';
run;
**********END**********;

**********#4 FIT THE SCATTER DOTS WITH LEAST-SQUAR LINES**********;
*****FIT THE SCATTER DOTS WITH SEPARATE STRAIGHT LINES;
proc sgplot data=pred_bylifetest;
title 'Figure 2C';
scatter x=ln_day y=lls/group=source;
reg  x=ln_day y=lls/group=source degree=1;        /*fit the scatter dots with the straight lines separately*/
yaxis label='log(-log(Survival probability))';
xaxis label='log(Days to the first treatment)';
run;
*****FIT THE SCATTER DOTS WITH SEPARATE QUARRATIC CURVES;
proc sgplot data=pred_bylifetest;
title 'Figure 2D';
scatter x=ln_day y=lls/group=source;   
reg  x=ln_day y=lls/group=source degree=2;          /*fit the scatter dots with the quadratic curves separately*/
yaxis label='log(-log(Survival probability))';
xaxis label='log(Days to the first treatment)';
run;
**********END**********;

**********#5 USE PROC MIXED TO GENERATE COEFFICIENTS**********;
*****GENERATE THE COEFFICIENTS WITH THE LINEAR MODEL;
proc mixed data= pred_bylifetest;
title 'Table 1 & 2';
class source;
model  lls=ln_day/s outp=pred_bymixed_linear;
random source/s;
run;
*****GENERATE THE COEFFICIENTS WITH THE LINEAR MODEL;
proc mixed data= pred_bylifetest;
title 'Table  & 2';
class source;
model  lls=ln_day ln_day*ln_day/s outp=pred_bymixed_quadratic;
random source/s;
run;
**********END**********;

**********#6 GENERATE THE COMBINED MODEL BY THE LINEAR EQUATION**********;
*****GENERATE THE MODEL;
data combined_bymixed_linear;
do DOF_FIRST_TX=1 to 56 by 1;
Esurvival=exp(-exp(-1.9249+0.6421*log(DOF_FIRST_TX)));
source='Combine';
output;
end;
run;
*****COMBINE THE MODEL WITH RAW SURVIVAL PROPORTION;
data combined_bymixed_linear_plus_raw;
set combined_bymixed_linear(rename=(Esurvival=SURVIVAL))  pred_bylifetest(keep=source DOF_FIRST_TX  SURVIVAL);
run;
*****COMPARE THE COMBINED WITH RAW SURVIVAL CURVES;
proc sgplot data=combined_bymixed_linear_plus_raw;
title 'Figure 3A';
step x=DOF_FIRST_TX y=survival/group=source;
yaxis grid values=(0 to 1 by 0.1);
run;
**********END**********;

**********#7 GENERATE THE COMBINED MODEL BY THE QUADRATIC EQUATION**********;
*****GENERATE THE MODEL;
data combined_bymixed_quadratic;
do DOF_FIRST_TX=1 to 56 by 1;
Esurvival=exp(-exp(-2.4587+1.3285*log(DOF_FIRST_TX)-0.1690*log(DOF_FIRST_TX)*log(DOF_FIRST_TX)));  /*coefficients were given by proc mixed above*/
source='Combine';
output;
end;
run;
*****COMBINE THE MODEL WITH RAW SURVIVAL PROPORTION;
data combined_bymixed_quad_plus_raw;
set combined_bymixed_quadratic(rename=(Esurvival=SURVIVAL))  pred_bylifetest(keep=source DOF_FIRST_TX  SURVIVAL);
run;
*****COMPARE THE COMBINED WITH RAW SURVIVAL CURVES;
proc sgplot data=combined_bymixed_quad_plus_raw;
title 'Figure 3B';
step x=DOF_FIRST_TX y=survival/group=source;
yaxis grid values=(0 to 1 by 0.1);
run;
**********END**********;

**********#8 RECOVER THE MODELS FOR THE SPECIFIC-STUDY CURVES**********;
*****RECOVER THE LINEAR MODEL;
data recover_bymixed_linear;
set pred_bymixed_linear;
Esurvival=exp(-(exp(pred)));
Eucl=exp(-(exp(lower)));
Elcl=exp(-(exp(upper)));
run;
*****RECOVER THE QUADRATIC MODEL;
data recover_bymixed_quadratic;
set pred_bymixed_quadratic;
Esurvival=exp(-(exp(pred)));
Eucl=exp(-(exp(lower)));
Elcl=exp(-(exp(upper)));
run;
**********END**********;

**********#9 GENERATE STUDY-SPECIFIC SURVIVAL PROPORTIONS AND CONFIDENCE INTERVALS*********;
*****DRAW THE RAW SURVIVAL CURVES AND THE CORRESPONDING CONFIDENCE INTERVALS;
proc sgplot data=pred_bylifetest;
title 'Figure 4A';
series x=DOF_FIRST_TX y=survival/group=source;
band lower=SDF_LCL upper=SDF_UCL x=DOF_FIRST_TX /group=source transparency=0.5;
yaxis grid values=(0 to 1 by 0.1);
run;
*****DRAW THE  SURVIVAL CURVES FROM THE LINEAR MODEL AND THE CORRESPONDING CONFIDENCE INTERVALS;
proc sgplot data=recover_bymixed_linear;
title 'Figure 4B';
series x=DOF_FIRST_TX y=Esurvival/group=source;
band lower=Elcl upper=Eucl x=DOF_FIRST_TX /group=source transparency=0.5;
yaxis grid values=(0 to 1 by 0.1);
run;
*****DRAW THE  SURVIVAL CURVES FROM THE QUADRATIC MODEL AND THE CORRESPONDING CONFIDENCE INTERVALS;
proc sgplot data=recover_bymixed_quadratic;
title 'Figure 4C';
series x=DOF_FIRST_TX y=Esurvival/group=source;
band lower=Elcl upper=Eucl x=DOF_FIRST_TX /group=source transparency=0.5;
yaxis grid values=(0 to 1 by 0.1);
run;
**********END**********;

********************END OF THE PROGRAM********************;

Code II
/*******************************************************************
PROGRAM: COVARIATE MODEL
PURPOSE: CREATE ARENDS' MODEL FOR DATA SETS 1 & 2 BY COVARIATE METHOD
AUTHOR: -----------------
LAST REVISE DATE:  JUNE 18, 2010
LAST TEST DATE: JUNE 18, 2010
********************************************************************/

libname source 'd:\animal_data';     /*set up the library for the input*/;
libname mylib 'd:\animal_data_output';    *set up the library for the output*/;

********************BEGINNING OF THE PROGRAM********************;

**********#1 USE PROC PHREG WITH TEMPERATURE COVARIATE**********;
**** INPUT VALUES FOR TEMPERATURE;
data mylib.vector;
do rect=38.1 to 41.7 by 0.1;                      /*generate a vector as covariate for proc phreg*/
id=cats('The temperature=', rect);            /*add a label for each temperature level*/
output;
end;
run;
*****OUTPUT PREDICTIONS ;
proc phreg data=source.data_1_2;
model DOF_FIRST_TX * status(0)=RECT;
strata source;
baseline covariates=mylib.vector out=mylib.pred_byphreg logsurv=ls loglogs=lls survival=_all_ /alpha=0.05;      /*generate the estimated survival  proportions and the corresponding 95% CI*/
run;
***********END ***********;

**********#2 MANIPULATION OF PREDICTION DATA SET BY PHREG*********;
data mylib.pred_byphreg ;
set mylib.pred_byphreg ;
if DOF_FIRST_TX gt 0 then logday=log(DOF_FIRST_TX) ;   /*add a log transformation for the time variable*/
nls=-ls;                     /*add a negative log transformation for the survival proportions*/
label DOF_FIRST_TX='Days to the first treatment'                      /*add the labels to the variables*/
RECT='Rectal temperature'
logday='log(Days to the first treatment)'
nls='-log(Survival probability function)';
rect=trim(left(rect));                                                              /*align all temperatures to the same format*/
run;
***********END***********;

**********#3 USE GRAPH TEMPLATE LANGUAGE TO GENERATE THE RESPONSE SURFACE FOR THE RAW SURVIVAL PROPORTIONS**********;
***** MAKE A 3D MODEL FOR SURVIVAL PROPORTION BY PROC TEMPLATE;
proc template;
define statgraph survival_surface_3d;      /*the template will be saved under sasuser.templat*/
dynamic  var1 var2 var3 ;          /*make the three dynamic variables for proc sgrender*/
begingraph;
layout overlay3d / cube=false;
surfaceplotparm x=var1 y=var2 z=var3 /   
surfacetype=fill
surfacecolorgradient=var3             
colormodel=twocolorramp
reversecolormodel=true ;    /*the gradient color of surface will be by the survival proportion*/;
endlayout;
endgraph;
end;
run;
*****RENDER THE SURVIVAL PROPORTION FOR DATA SET 1 BY PROC TEMPLATE;
proc sgrender data=mylib.pred_byphreg template=survival_surface_3d;
title 'Figure 5A';
dynamic var1='rect' var2="DOF_FIRST_TX" var3="survival" ; 
where source='A';
run;
*****RENDER THE SURVIVAL PROPORTION FOR DATA SET 2 BY PROC TEMPLATE;
proc sgrender data=mylib.pred_byphreg template=surv3d;
title 'Figure 5C';
dynamic var1='rect'  var2="DOF_FIRST_TX" var3="survival";
where source='B';
run;
***** MAKE A 3D MODEL FOR CONFIDENCE INTERVAL BY PROC TEMPLATE;
proc template;
define statgraph ci_surface_3d;        /*the template will be saved under sasuser.templat*/
begingraph;
dynamic  var1 var2 var3 var4 ;          /*make the four dynamic variables for proc sgrender*/
layout overlay3d / cube=false rotate=60;
surfaceplotparm x=var1 y=var2  z=var3;    /*make the reponse surface for the upper confidence interval*/
surfaceplotparm x=var1 y=var2 z=var4;          /*make the reponse surface for the lower confidence interval*/
endlayout;
endgraph;
end;
run;
***** RENDER THE CONFIDENCE INTERVAL FOR DATA SET 1 BY PROC TEMPLATE;
proc sgrender data=mylib.pred_byphreg template=ci_surface_3d;
title 'Figure 5B';
dynamic var1='rect'  var2="DOF_FIRST_TX" var3="UpperSurvival" var4="LowerSurvival";
where source='A';
run;
***** RENDER THE CONFIDENCE INTERVAL FOR DATA SET 2 BY PROC TEMPLATE;
proc sgrender data=mylib.pred_byphreg template=ci_surface_3d;
title 'Figure 5D';
dynamic var1='rect'  var2="DOF_FIRST_TX" var3="UpperSurvival" var4="LowerSurvival";
where source='B';
run;
***** MAKE THE CROSS SECTIONS OF SURVIVAL CURVES BY PROC SGPANEL;
proc sgpanel data=mylib.pred_byphreg(where=(rect in (38.5, 39,39.5, 40, 40.5,41, 41.5 )));  /*choose 7 temperatures to output*/
title 'Figure 5E';
panelby rect source/  layout=lattice novarname columns=7;        /*decide the 2X7 configuration of the panel*/
band x=DOF_FIRST_TX lower=LowerSurvival upper=UpperSurvival;  /*draw the confidence  intervals*/
series x=DOF_FIRST_TX y=survival;               /*draw the survival curves*/
run;
**********END**********;

**********#4 MODEL SELECTION BY PROC GLMSELECT**********;
ods graphics on;
proc glmselect data=mylib.pred_byphreg  plot=CriterionPanel;
title 'Figure 6 ';
class source;
model lls=logday rect logday*logday rect*rect rect*logday/selection = stepwise(select=SL) stats=all;
run;
ods graphics off;
***********END**********;

**********#5 MODEL FITTING BY PROC MIXED**********;
proc mixed data=mylib.pred_byphreg;
title 'Table 3';
class source;
model lls=logday rect  logday*logday/solution outp=mylib.pred_bymixed;  /*use the selection result by proc glmselect*/
random source/s;
run;
**********END**********;

**********#6 GENERATE THE RESPONSE SURFACE FOR THE COMBINED SURVIVAL PROPORTIONS**********;
*****CONSTRUCT  THE COMBINED EQUATION;
data mylib.combined_bymixed;
do DOF_FIRST_TX=1 to 42 by 1;
do rect=38.1 to 41.7 by 0.1;
esuv=exp(-exp(-32.7610+1.5148*log(DOF_FIRST_TX)-0.2024*log(DOF_FIRST_TX)*log(DOF_FIRST_TX)+0.7610*rect)); /*use the coeffiicients of fixed effects result by proc mixed*/
rect=trim(left(rect));
output;
end;
end;
label esuv='Combined survival proportion';
run;
***** RENDER THE COMBINED SURVIVAL PROPORTION ;
proc sgrender data=mylib.combined_bymixed template=survival_surface_3d;   /*use the previous 3D model*/
title 'Figure 7A';
dynamic var1='rect' var2="DOF_FIRST_TX" var3="Esuv" ;
run;
***** MAKE THE CROSS SECTIONS OF SURVIVAL CURVES BY PROC SGPANEL;
proc sgpanel data=mylib.combined_bymixed (where=(rect in (38.5, 39,39.5, 40, 40.5,41, 41.5 )));
title 'Figure 7B';
panelby rect /  novarname columns=7;
series x=DOF_FIRST_TX y=esuv;
run;
**********END **********;

***********#7 GENERATE THE RESPONSE SURFACE FOR THE STUDY-SPECIFIC SURVIVAL PROPORTIONS**********;
data mylib.specific_bymixed(keep=DOF_FIRST_TX rect Esurvival source Eucl Elcl);
set mylib.pred_bymixed;;
Esurvival=exp(-(exp(pred)));
Eucl=exp(-(exp(lower)));
Elcl=exp(-(exp(upper)));
rect=trim(left(rect));
run;
*****RENDER THE SURVIVAL PROPORTION FOR DATA SET 1 ;
proc sgrender data=mylib.specific_bymixed template=survival_surface_3d;
title 'Figure 8A';
dynamic var1='rect' var2="DOF_FIRST_TX" var3="esurvival" ;
where source='A';
run;
*****RENDER THE SURVIVAL PROPORTION FOR DATA SET 2;
proc sgrender data=mylib.specific_bymixed template=survival_surface_3d;
title 'Figure 8C';
dynamic var1='rect' var2="DOF_FIRST_TX" var3="esurvival" ;
where source='B';
run;
*****RENDER THE CONFIDENCE INTERVAL FOR DATA SET 1 ;
proc sgrender data=mylib.specific_bymixed template=ci_surface_3d;
title 'Figure 8B';
dynamic var1='rect'  var2="DOF_FIRST_TX" var3="Eucl" var4="Elcl";
where source='A';
run;
*****RENDER THE CONFIDENCE INTERVAL FOR DATA SET 2 ;
proc sgrender data=mylib.specific_bymixed template=ci_surface_3d;
title 'Figure 8D';
dynamic var1='rect'  var2="DOF_FIRST_TX" var3="Eucl" var4="Elcl";
where source='B';
run;
***** MAKE THE CROSS SECTIONS OF SURVIVAL CURVES BY PROC SGPANEL;
proc sgpanel data=mylib.specific_bymixed(where=(rect in (38.5, 39,39.5, 40, 40.5,41, 41.5 )));  /*choose 7 temperatures to output*/
title 'Figure 8E';
panelby rect source/  layout=lattice novarname columns=7;        /*decide the 2X7 configuration of the panel*/
band x=DOF_FIRST_TX lower=Elcl upper=Eucl;           /*draw the confidence  intervals*/
series x=DOF_FIRST_TX y=Esurvival;               /*draw the survival curves*/
run;
**********END*********;

********************END OF THE PROGRAM********************;

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...