Panel variable: nr (strongly balanced)
Time variable: year, 1980 to 1987
Delta: 1 unit
More than one way to estimate a model
\[y_{it} = \alpha + \beta x_{it} + \alpha_i + \delta_t + \epsilon_{it}\]
\[\Delta y_{it} = \beta \Delta x_{it} + \delta_t + \Delta \epsilon_{it}\]
Today we are going to describe three other methods: Fixed Effects, Random Effects, and Correlated Random Effects.
These methods require their own estimation procedures, but they can be used in more flexible scenarios.
What do we mean by Fixed Effects? Random Effects? Correlated Random Effects?
Let's consider the following model: \[y_{it} = \beta_1 x_{it} + \beta_2 z_{it} + \alpha_i + \epsilon_{it}\]
which includes neither a time fixed effect nor time-invariant factors.
This could be estimated by simply adding a dummy for each individual in the data set, but that means too many dummies. Instead, consider the following.
Now, for each person, let's compute the average of each variable over time, \(\bar w_i = \frac{1}{T} \sum_{t=1}^T w_{it}\). Applying this to the model above, we obtain:
\[\bar y_{i} = \beta_1 \bar x_{i} + \beta_2 \bar z_{i} + \alpha_i + \bar \epsilon_{i}\]
This equation no longer varies across time.
It is a model that is interesting in itself: it captures Between Effects.
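As a rough sketch, the between estimator can be obtained either with the be option of xtreg or by collapsing the data to one row of averages per person and running OLS. This assumes the wage panel used in these slides (lwage, educ, and so on, identified by nr and year) is already in memory and declared with xtset; because that panel is balanced, both routes should give the same coefficients.

```stata
* Assumption: the wage panel is in memory and declared with -xtset nr year-

* Between effects via xtreg
xtreg lwage educ black hisp exper expersq married union, be

* Same idea by hand: one row of time averages per person, then OLS
preserve
collapse (mean) lwage educ black hisp exper expersq married union, by(nr)
reg lwage educ black hisp exper expersq married union
restore
```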
Subtracting this averaged equation from the original model removes \(\alpha_i\):
\[y_{it}-\bar y_{i} = \beta_1 (x_{it}- \bar x_{i}) + \beta_2 (z_{it}- \bar z_{i}) + \epsilon_{it} - \bar \epsilon_{i}\]
\[\tilde y_{it} = \beta_1 \tilde x_{it} + \beta_2 \tilde z_{it} + \tilde \epsilon_{it}\]
Also of interest:
* This model can now be estimated using OLS.
* It is an application of the FWL theorem (we partial out the time-invariant factors).
* If done by OLS, you need to correct the degrees of freedom to \(NT-N-k\).
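The within transformation can be checked by hand. The sketch below assumes the wage panel from these slides is in memory; the prefixes wm_ and dm_ and the scalar N_g are names made up for this illustration. It demeans each variable by person, runs OLS, and rescales one standard error from \(NT-k\) to \(NT-N-k\) degrees of freedom, which should line up with what xtreg, fe reports.

```stata
* Assumption: the wage panel is in memory and declared with -xtset nr year-

* Demean each variable within person (wm_ = person mean, dm_ = demeaned)
foreach v of varlist lwage exper expersq married union {
    bysort nr: egen double wm_`v' = mean(`v')
    gen double dm_`v' = `v' - wm_`v'
}

* OLS on demeaned data: FE slopes, but SEs based on NT-k degrees of freedom
reg dm_lwage dm_exper dm_expersq dm_married dm_union, nocons

* Rescale one SE to NT-N-k degrees of freedom
qui tab nr
scalar N_g = r(r)
display _se[dm_exper]*sqrt(e(df_r)/(e(df_r) - scalar(N_g)))

* Reference: built-in within estimator
xtreg lwage exper expersq married union, fe
```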
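Now add a time fixed effect:
\[y_{it} = \beta_1 x_{it} + \beta_2 z_{it} + \alpha_i + \delta_t + \epsilon_{it}\]
With a balanced panel, both sets of effects can be removed in one step by transforming each variable as
\[\tilde y_{it} = y_{it} - \bar y_i - \bar y_t + \bar y\]
where \(\bar y_t\) is the average across individuals in period \(t\), and \(\bar y\) is the overall average of \(y_{it}\).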
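A minimal sketch of this two-way transformation, assuming the balanced wage panel from these slides is in memory. The prefixes mi_, mt_, mm_ and tw_ are made up for this example, and exper is left out because, within person, it moves one-for-one with the year effects.

```stata
* Assumption: the balanced wage panel is in memory and declared with -xtset nr year-
foreach v of varlist lwage expersq married union {
    bysort nr:   egen double mi_`v' = mean(`v')   // person average
    bysort year: egen double mt_`v' = mean(`v')   // period average
    egen double mm_`v' = mean(`v')                // overall average
    gen double tw_`v' = `v' - mi_`v' - mt_`v' + mm_`v'
}

* OLS on the transformed data
reg tw_lwage tw_expersq tw_married tw_union, nocons

* Reference: person FE plus year dummies (slopes should match)
xtreg lwage expersq married union i.year, fe
```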
If the panel is not balanced, you need to demean the data using iterative methods.
Let's assume that \(\bar y=0\). We would need to demean the data repeatedly, as follows:
\[ \overline{ty}_{it} = y_{it} - \bar y_i - \bar y_t \] \[ \overline{tty}_{it} = \overline{ty}_{it} - \overline{ty}_i - \overline{ty}_t \] \[ \overline{ttty}_{it} = \overline{tty}_{it} - \overline{tty}_i - \overline{tty}_t \]
And so on, until the individual and time means of the transformed data are all (numerically) zero, so that further demeaning changes nothing.
NOTE: There are more efficient ways to do this.
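As an illustration only, the loop below alternates the two demeaning steps (person means, then period means) until a pass no longer changes the data. This is a sequential variant of the scheme written above, not necessarily the exact algorithm intended; the variable names it_lwage, prev, m_i, m_t and step, the tolerance, and the cap of 100 passes are all assumptions. Community-contributed commands such as reghdfe implement faster versions of the same idea.

```stata
* Assumption: a panel with identifiers nr and year and outcome lwage is in memory
gen double it_lwage = lwage
local change = 1
local iter   = 0
while `change' > 1e-10 & `iter' < 100 {
    qui gen double prev = it_lwage
    * remove person means, then period means
    qui bysort nr: egen double m_i = mean(it_lwage)
    qui replace it_lwage = it_lwage - m_i
    qui bysort year: egen double m_t = mean(it_lwage)
    qui replace it_lwage = it_lwage - m_t
    * largest change made in this pass
    qui gen double step = abs(it_lwage - prev)
    qui summarize step, meanonly
    local change = r(max)
    local ++iter
    drop prev m_i m_t step
}
display "stopped after `iter' passes"
```

In a balanced panel the loop should stop after a couple of passes; with unbalanced data it typically needs more.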
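In the random effects model, the composite error \(e_{it}+a_i\) is correlated over time within each individual. For \(t \neq s\):
\[Corr(e_{it}+a_i,e_{is}+a_i)=\frac{\sigma^2_a}{\sigma^2_a+\sigma^2_e}\]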
GLS: since we know the “theoretical” correlation across errors, we can use it to transform the data and estimate the standard errors. First, define:
\[\theta = 1- \left[ \frac{\sigma^2_e}{\sigma^2_e+T \sigma^2_a}\right]^{1/2}\]
\[\tilde w_{it} = w_{it} - \theta \bar w_i\]
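Two limiting cases may help to read \(\theta\). This is a short check that uses only the two definitions above:

\[\sigma^2_a = 0 \;\Rightarrow\; \theta = 0 \;\Rightarrow\; \tilde w_{it} = w_{it} \quad \text{(pooled OLS)}\]
\[\frac{T\sigma^2_a}{\sigma^2_e} \to \infty \;\Rightarrow\; \theta \to 1 \;\Rightarrow\; \tilde w_{it} \to w_{it} - \bar w_i \quad \text{(fixed effects)}\]

So RE sits between pooled OLS and FE, and moves toward FE as \(T\) grows or as the individual effect accounts for more of the error variance.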
To estimate \(\theta\), first estimate the main model using pooled OLS, \(y_{it}=\beta_0 + \beta_1 x_{it} + v_{it}\), and keep the residuals \(\hat v_{it}\).
Estimate \(\sigma^2_a\) as: \(\hat \sigma^2_a = \frac{\sum_{i=1}^N \sum_{t=1}^{T-1}\sum_{s=t+1}^{T} \hat v_{it} \hat v_{is}}{NT(T-1)/2 - (k+1)}\)
Estimate \(\sigma^2_e\) as: \(\hat \sigma_e^2 = \hat\sigma^2_v - \hat \sigma^2_a\), where \(\hat\sigma^2_v\) is the residual variance from the pooled OLS regression.
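The quasi-demeaning step can be sketched as follows, assuming the balanced wage panel from these slides is in memory. For simplicity this borrows the variance components that xtreg, re stores in e(sigma_u) and e(sigma_e) rather than computing them with the formulas above, and T = 8 is hard-coded from the 1980 to 1987 range. The prefixes rm_ and qd_ are made up for this example.

```stata
* Assumption: the balanced wage panel is in memory and declared with -xtset nr year-
qui xtreg lwage educ black hisp exper expersq married union, re
scalar T     = 8   // periods per worker (assumed: 1980-1987, balanced)
scalar theta = 1 - sqrt(e(sigma_e)^2 / (e(sigma_e)^2 + scalar(T)*e(sigma_u)^2))
display "theta = " scalar(theta)

* Quasi-demean every variable: w - theta * (person mean of w)
foreach v of varlist lwage educ black hisp exper expersq married union {
    bysort nr: egen double rm_`v' = mean(`v')
    gen double qd_`v' = `v' - scalar(theta)*rm_`v'
}
gen double qd_cons = 1 - scalar(theta)   // the constant is transformed too

* OLS on the transformed data: point estimates should match xtreg, re
reg qd_lwage qd_educ qd_black qd_hisp qd_exper qd_expersq qd_married ///
    qd_union qd_cons, nocons
```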
The biggest advantage of the RE model is that you can now estimate effects of time-invariant variables.
It is also more efficient, because you do not need to estimate individual effects, just capture the distribution of \(\alpha_i\).
In Stata, you can estimate panel models using the xtreg command.
This command has options for Fixed Effects, Between Effects and Random Effects.
** Pooled OLS
qui: reg lwage educ black hisp exper expersq married union
est sto m1
** Pooled OLS with clustered SE (by individual)
qui: reg lwage educ black hisp exper expersq married union, cluster(nr)
est sto m2
** Random Effects: default
qui: xtreg lwage educ black hisp exper expersq married union, re
est sto m3
** Fixed Effects
qui: xtreg lwage educ black hisp exper expersq married union, fe
est sto m4
** Compare the four models side by side
esttab m1 m2 m3 m4, se b(4) noomit nonumber mtitle(OLS OLS_CL RE FE)
----------------------------------------------------------------------------
                      OLS          OLS_CL              RE              FE
----------------------------------------------------------------------------
educ               0.0994***       0.0994***       0.1012***
                  (0.0047)        (0.0092)        (0.0089)
black             -0.1438***      -0.1438**       -0.1441**
                  (0.0236)        (0.0501)        (0.0476)
hisp               0.0157          0.0157          0.0202
                  (0.0208)        (0.0392)        (0.0426)
exper              0.0892***       0.0892***       0.1121***       0.1168***
                  (0.0101)        (0.0124)        (0.0083)        (0.0084)
expersq           -0.0028***      -0.0028**       -0.0041***      -0.0043***
                  (0.0007)        (0.0009)        (0.0006)        (0.0006)
married            0.1077***       0.1077***       0.0628***       0.0453*
                  (0.0157)        (0.0261)        (0.0168)        (0.0183)
union              0.1801***       0.1801***       0.1074***       0.0821***
                  (0.0171)        (0.0276)        (0.0178)        (0.0193)
_cons             -0.0347         -0.0347         -0.1075          1.0649***
                  (0.0646)        (0.1201)        (0.1107)        (0.0267)
----------------------------------------------------------------------------
N                    4360            4360            4360            4360
----------------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
We now know how to analyze panel data using three strategies: Pooled OLS, Fixed Effects, and Random Effects.
Choosing between RE and pooled OLS is rarely considered (RE would be the default in most cases).
However, choosing between FE and RE is common, and the usual tool is the Hausman test.
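A sketch of the commands that would produce a Hausman comparison of this kind, assuming the wage panel is in memory. The stored names fe and re are chosen to match the column labels in the output that follows; the exact commands used for the slides are not shown, so treat this as a plausible reconstruction.

```stata
* Assumption: the wage panel is in memory and declared with -xtset nr year-
qui xtreg lwage educ black hisp exper expersq married union, fe
est sto fe
qui xtreg lwage educ black hisp exper expersq married union, re
est sto re

* Consistent estimator first (FE), efficient estimator second (RE)
hausman fe re
```

Only the coefficients that both models estimate (the time-varying variables) enter the test.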
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       fe           re         Difference       Std. err.
-------------+----------------------------------------------------------------
       exper |    .1168467     .1121195        .0047272        .0016276
     expersq |   -.0043009    -.0040689        -.000232        .0001269
     married |    .0453033     .0627951       -.0174918        .0073427
       union |    .0820871     .1073789       -.0252917        .0073636
------------------------------------------------------------------------------
                          b = Consistent under H0 and Ha; obtained from xtreg.
           B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

    chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
            =  31.45
Prob > chi2 = 0.0000
With chi2(4) = 31.45 and a p-value below 0.0001, we reject H0: the FE and RE coefficients differ systematically, which favors FE.
The CRE model is an alternative approach that combines some of the features of RE and FE estimators. \[y_{it} = \beta_1 x_{it} + \beta_2 z_{i} + \alpha_i + \epsilon_{it}\]
One way to handle the “individual” fixed effect is to model it as a function of observed characteristics:
\[\alpha_i = \alpha + \gamma_1 \bar x_{i} + \gamma_2 z_{i} + r_i\]
In this case, we assume the unobserved fixed effect \(\alpha_i\) can be written as a function of the average observed characteristics and the time-invariant factors.
We also assume that \(r_i\) is uncorrelated with \(x_{it}\), \(\bar x_i\) and \(z_{i}\).
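Substituting this into the model gives:
\[y_{it} = \beta_1 x_{it} + \beta_2 z_{i} + \alpha + \gamma_1 \bar x_{i} + \gamma_2 z_{i} + r_i + \epsilon_{it}\]
\[y_{it} = \alpha + \beta_1 x_{it} + (\beta_2 + \gamma_2) z_{i} + \gamma_1 \bar x_{i} + \underbrace{\nu_{it}}_{ r_i + \epsilon_{it}}\]
Note that \(\beta_2\) and \(\gamma_2\) cannot be separated: only their sum is identified.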
We can now estimate this model using pooled OLS or RE. There is no longer any need to worry about correlation between \(r_i\) and \(x_{it}\).
Differences and Advantages:
CRE estimates for time-varying variables are identical to FE. \[\hat \beta_{cre}=\hat \beta_{fe}\]
The CRE model shows clearly why RE is more efficient (RE imposes \(\gamma=0\)).
Thus, we can test for FE vs RE using the following test:
\[H_0: \gamma = 0 \text{ or } RE \] \[H_a: \gamma \neq 0 \text{ or } FE \]
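A sketch of how this test can be run by hand, assuming the wage panel is in memory. The mn_ prefix matches the variable names in the output that follows, but the exact commands behind that output are not shown in the slides, so this is a reconstruction rather than the original code.

```stata
* Assumption: the wage panel is in memory and declared with -xtset nr year-

* Person-level means of the time-varying regressors
foreach v of varlist exper expersq married union {
    bysort nr: egen double mn_`v' = mean(`v')
}

* CRE: random effects with the means added as regressors
xtreg lwage educ black hisp exper expersq married union ///
    mn_exper mn_expersq mn_married mn_union, re

* H0: gamma = 0 (RE is adequate) vs Ha: gamma != 0 (use FE)
test mn_exper mn_expersq mn_married mn_union
```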
Random-effects GLS regression                   Number of obs     =      4,360
Group variable: nr                              Number of groups  =        545

R-squared:                                      Obs per group:
     Within  = 0.1780                                         min =          8
     Between = 0.2192                                         avg =        8.0
     Overall = 0.2002                                         max =          8

                                                Wald chi2(11)     =     976.27
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0946036   .0109043     8.68   0.000     .0732315    .1159757
       black |  -.1388124   .0488709    -2.84   0.005    -.2345977   -.0430271
        hisp |   .0047758   .0426925     0.11   0.911    -.0788999    .0884515
       exper |   .1168467   .0084197    13.88   0.000     .1003444     .133349
     expersq |  -.0043009   .0006053    -7.11   0.000    -.0054872   -.0031146
     married |   .0453033   .0183097     2.47   0.013      .009417    .0811896
       union |   .0820871   .0192907     4.26   0.000      .044278    .1198963
    mn_exper |  -.1672838    .051032    -3.28   0.001    -.2673046    -.067263
  mn_expersq |   .0094254   .0032684     2.88   0.004     .0030195    .0158312
  mn_married |   .0983604   .0450837     2.18   0.029     .0099979    .1867228
    mn_union |   .1885894   .0504022     3.74   0.000     .0898029    .2873759
       _cons |    .492309   .2210094     2.23   0.026     .0591386    .9254794
-------------+----------------------------------------------------------------
     sigma_u |  .32456727
     sigma_e |  .35125535
         rho |  .46057172   (fraction of variance due to u_i)
------------------------------------------------------------------------------

 ( 1)  mn_exper = 0
 ( 2)  mn_expersq = 0
 ( 3)  mn_married = 0
 ( 4)  mn_union = 0

           chi2(  4) =   27.27
         Prob > chi2 =    0.0000
Comparing across models:
------------------------------------------------------------
                       FE             CRE              RE
------------------------------------------------------------
exper              0.1168***       0.1168***       0.1121***
                  (0.0084)        (0.0084)        (0.0083)
expersq           -0.0043***      -0.0043***      -0.0041***
                  (0.0006)        (0.0006)        (0.0006)
married            0.0453*         0.0453*         0.0628***
                  (0.0183)        (0.0183)        (0.0168)
union              0.0821***       0.0821***       0.1074***
                  (0.0193)        (0.0193)        (0.0178)
educ                               0.0946***       0.1012***
                                  (0.0109)        (0.0089)
black                             -0.1388**       -0.1441**
                                  (0.0489)        (0.0476)
hisp                               0.0048          0.0202
                                  (0.0427)        (0.0426)
mn_exper                          -0.1673**
                                  (0.0510)
mn_expersq                         0.0094**
                                  (0.0033)
mn_married                         0.0984*
                                  (0.0451)
mn_union                           0.1886***
                                  (0.0504)
_cons              1.0649***       0.4923*        -0.1075
                  (0.0267)        (0.2210)        (0.1107)
------------------------------------------------------------
N                    4360            4360            4360
------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
In Stata, you can estimate the CRE panel model using xtreg with the re option, after adding the individual means as regressors.
Alternatively, you can use the cre command (available through fra install); its output is shown below.
Random-effects GLS regression                   Number of obs     =      4,360
Group variable: nr                              Number of groups  =        545

R-squared:                                      Obs per group:
     Within  = 0.1780                                         min =          8
     Between = 0.2192                                         avg =        8.0
     Overall = 0.2002                                         max =          8

                                                Wald chi2(11)     =     976.27
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0946036   .0109043     8.68   0.000     .0732315    .1159757
       black |  -.1388124   .0488709    -2.84   0.005    -.2345977   -.0430271
        hisp |   .0047758   .0426925     0.11   0.911    -.0788999    .0884515
       exper |   .1168467   .0084197    13.88   0.000     .1003444     .133349
     expersq |  -.0043009   .0006053    -7.11   0.000    -.0054872   -.0031146
     married |   .0453033   .0183097     2.47   0.013      .009417    .0811896
       union |   .0820871   .0192907     4.26   0.000      .044278    .1198963
    m1_exper |  -.1672838    .051032    -3.28   0.001    -.2673046    -.067263
  m1_expersq |   .0094254   .0032684     2.88   0.004     .0030195    .0158312
  m1_married |   .0983604   .0450837     2.18   0.029     .0099979    .1867228
    m1_union |   .1885894   .0504022     3.74   0.000     .0898029    .2873759
       _cons |  -.0330167   .1330654    -0.25   0.804    -.2938202    .2277867
-------------+----------------------------------------------------------------
     sigma_u |  .32456727
     sigma_e |  .35125535
         rho |  .46057172   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Next week: Switching gears to Time Series Analysis - When the present, the past and the future matter