Compute Jackknife Coefficient Estimates in SAS¶
In certain scenarios, we want to estimate a model's parameters for each observation on the sample that excludes that observation. This can be achieved by re-estimating the model on every leave-one-out sample, but doing so is very inefficient. If we instead estimate the model once on the full sample, the coefficient estimates applied to each observation are influenced by that observation itself and are therefore biased. Thankfully, the Jackknife method corrects for this bias and produces the Jackknifed coefficient estimates for each observation.
Variable Definition¶
Let's start with some variable definitions to help with the explanation.
| Variable | Definition |
| --- | --- |
| $b(i)$ | the parameter estimates after deleting the $i$th observation |
| $s^2(i)$ | the variance estimate after deleting the $i$th observation |
| $X(i)$ | the $X$ matrix without the $i$th observation |
| $\hat{y}(i)$ | the $i$th value predicted without using the $i$th observation |
| $r_i = y_i - \hat{y}_i$ | the $i$th residual |
| $h_i = x_i (X'X)^{-1} x_i'$ | the $i$th diagonal of the projection matrix for the predictor space, also called the hat matrix |
| $\text{RStudent} = \frac{r_i}{s(i)\sqrt{1 - h_i}}$ | studentized residual |
| $(X'X)_{jj}$ | the $(j,j)$th element of $(X'X)^{-1}$ |
| $\text{DFBeta}_j = \frac{b_j - b_{(i)j}}{s(i)\sqrt{(X'X)_{jj}}}$ | the scaled measure of the change in the $j$th parameter estimate calculated by deleting the $i$th observation |
Objective¶
Compute the coefficient estimates with the $i$th observation excluded from the sample, i.e. $b(i)$, the Jackknifed coefficient estimates.
Formula¶
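From the table above, rearranging the definition of $\text{DFBeta}_j$ gives the $j$th Jackknifed coefficient estimate $b_{(i)j}$, computed without the $i$th observation:

$$
b_{(i)j} = b_j - \text{DFBeta}_j \times s(i) \sqrt{(X'X)_{jj}}
$$

Rearranging the definition of RStudent gives $s(i) = \frac{r_i}{\text{RStudent} \sqrt{1 - h_i}}$. Hence,

$$
b_{(i)j} = b_j - \text{DFBeta}_j \times \frac{r_i}{\text{RStudent} \sqrt{1 - h_i}} \times \sqrt{(X'X)_{jj}}
$$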
The good thing is that PROC REG produces the coefficient estimate $b_j$ for $j = 1, 2, \dots, K$, where $K$ is the number of coefficients, and the INFLUENCE and I options of the MODEL statement produce the remaining statistics, which are just enough to compute $b(i)$:
| Variable | Option in PROC REG or MODEL statement | Name in the output dataset |
| --- | --- | --- |
| $b_j$ | OUTEST= option in PROC REG | <jthVariable> |
| $r_i$ | OutputStatistics= from INFLUENCE option in MODEL statement | Residual |
| RStudent | OutputStatistics= from INFLUENCE option in MODEL statement | RStudent |
| $h_i$ | OutputStatistics= from INFLUENCE option in MODEL statement | HatDiagonal |
| $\text{DFBeta}_j$ | OutputStatistics= from INFLUENCE option in MODEL statement | DFB_<jthVariable> |
| $(X'X)_{jj}$ | InvXPX= from I option in MODEL statement | <jthVariable> |
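To make the mapping concrete, here is a minimal sketch on the built-in sashelp.class data that regresses Weight on Height and recovers the Jackknifed slope for every observation. The dataset names work.demo_params, work.demo_stats, work.demo_xpxinv and work.demo_jackknife are made up for this illustration, and the sketch assumes the ODS tables carry the column names listed above, with the row label of the InvXPX table stored in a column called Variable.

```sas
/* Suppress printed output; ODS OUTPUT still captures the tables. */
ods exclude all;
ods output OutputStatistics=work.demo_stats   /* Residual, RStudent, HatDiagonal, DFB_* */
           InvXPX=work.demo_xpxinv;           /* inverse cross-product matrix */
proc reg data=sashelp.class outest=work.demo_params;  /* full-sample b_j */
    model Weight = Height / influence i;
run;
quit;
ods exclude none;

data work.demo_jackknife;
    /* Read the full-sample slope and the (Height, Height) element once;
       variables read by SET are retained across iterations. */
    if _n_ = 1 then do;
        set work.demo_params(keep=Height rename=(Height=b_height));
        set work.demo_xpxinv(keep=Variable Height
                             rename=(Height=xpx_height)
                             where=(Variable = 'Height'));
    end;
    set work.demo_stats;
    /* s(i) recovered from the definition of RStudent */
    s_i = Residual / (RStudent * sqrt(1 - HatDiagonal));
    /* b(i) for Height, from the definition of DFBeta */
    b_i_height = b_height - DFB_Height * s_i * sqrt(xpx_height);
    drop Variable;
run;
```

An observation with a residual of exactly zero also has RStudent equal to zero, so s_i is missing for it; deleting such an observation leaves the fit unchanged, so its Jackknifed coefficient can be set to the full-sample estimate.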
Example¶
Discretionary accruals¶
Suppose we want to calculate the firm-level discretionary accruals for each year using the Jones (1991) model and the Kothari et al. (2005) model. For a firm i, we first need to estimate the model for the industry-year excluding firm i, then use the coefficient estimates to generate the predicted accruals for firm i. The firm's discretionary accruals are the actual accruals minus the predicted accruals.
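In the notation of this page, the last two steps can be written as follows. The symbols $x_{ij}$ (the value of the $j$th regressor for firm $i$) and $DA_i$ (the discretionary accruals of firm $i$) are introduced here only for illustration:

$$
\hat{y}(i) = \sum_{j=1}^{K} x_{ij} \, b_{(i)j}, \qquad DA_i = y_i - \hat{y}(i)
$$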
Below is an example PROC REG that produces three datasets named work.params, work.outstats and work.xpxinv, which contain sufficient statistics to compute the Jackknifed estimates and thus the predicted accruals.
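A call of this kind might look like the sketch below. It is an illustration under stated assumptions, not the program referenced on this page: the input dataset work.accruals_input and its variables (fyear, sic2, gvkey, tac for scaled total accruals, inv_at for the inverse of lagged assets, drev for the scaled change in revenue, ppe for scaled property, plant and equipment) are hypothetical names, and whether to keep the default intercept is left to the reader.

```sas
/* BY-group processing requires the data to be sorted by industry-year. */
proc sort data=work.accruals_input out=work.accruals_sorted;
    by fyear sic2;
run;

ods exclude all;                               /* no printed output */
ods output OutputStatistics=work.outstats      /* r_i, RStudent, h_i, DFBeta_j */
           InvXPX=work.xpxinv;                 /* (j,j)th elements of the inverse */
proc reg data=work.accruals_sorted outest=work.params;  /* b_j per industry-year */
    by fyear sic2;
    id gvkey;                                  /* firm identifier listed beside each observation */
    model tac = inv_at drev ppe / influence i; /* Jones-type regression */
run;
quit;
ods exclude none;
```

NOPRINT is deliberately not used on the PROC REG statement because it would also stop the ODS tables from being produced. All three output datasets carry the BY variables fyear and sic2, so they can be merged by industry-year before applying the formula for $b_{(i)j}$.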

The full SAS program for estimating five different measures of discretionary accruals is available at /programs/discretionaryaccruals/.