Advanced Panel Data

More than one way to estimate a model

Fernando Rios-Avila

The old way (last class)

Last class we introduced the basic panel data model:

\[y_{it} = \alpha + \beta x_{it} + \alpha_i + \delta_t + \epsilon_{it}\]

This model could be estimated using the First Differences (FD) approach:

\[\Delta y_{it} = \beta \Delta x_{it} + \delta_t + \Delta \epsilon_{it}\]

It eliminates the \(\alpha_i\), and constrains how \(\delta_t\) is estimated.
It allows you to relate changes in \(x_{it}\) to changes in \(y_{it}\).
- Thus, the effects of variables that are fixed across time cannot be identified.
It requires strong assumptions: strict exogeneity and no serial correlation.
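A minimal sketch of the FD estimator in Stata, using the wagepan data that appears in the examples later in these notes. It assumes the community-contributed frause command is installed (for example via ssc install frause), and the choice of regressors is only illustrative.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
xtset nr year
* D. is Stata's difference operator; after xtset it differences within each nr.
* educ, black and hisp do not change over time, so their differences are zero
* and they are left out. The constant captures a common change across periods.
* Stata omits any differenced regressor that turns out to be collinear.
regress D.lwage D.exper D.expersq D.married D.union, vce(cluster nr)
display "Observations used after differencing: " e(N)
```

Each person loses the first year when differencing, so the regression uses 7 of the 8 years per person.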

The new way (This class)

Today we are going to describe the use of three other methods:
- Fixed Effects (FE)
- Random Effects (RE)
- Correlated Random effects (CRE) \(\simeq\) FE+RE
These methods come with their own assumptions, but can be used in more flexible scenarios.
What do we mean by Fixed Effects? Random Effects? Correlated Random Effects?
- All of these are estimation methods that relate to the same model.
- However, in all cases, we assume the unobserved factors are fixed across time. We simply identify them differently.

Fixed Effects Estimation

Let's consider the following model: \[y_{it} = \beta_1 x_{it} + \beta_2 z_{it} + \alpha_i + \epsilon_{it}\]

which includes neither a time fixed effect nor time-invariant factors.

This could be estimated by simply adding a dummy for each individual in the data set (too many dummies). Instead, consider the following.
For each person, let's compute the average of each variable, \(\bar w_i = \frac{1}{T} \sum_{t=1}^T w_{it}\). We could apply this to the model above and obtain:

\[\bar y_{i} = \beta_1 \bar x_{i} + \beta_2 \bar z_{i} + \alpha_i + \bar \epsilon_{i}\]

This no longer changes across time.
It is a model that is interesting in itself. It captures Between Effects.

Now, let's subtract this from the original model:

\[y_{it}-\bar y_{i} = \beta_1 (x_{it}- \bar x_{i}) + \beta_2 (z_{it}- \bar z_{i}) + \epsilon_{it} - \bar \epsilon_{i}\] \[\tilde y_{it} = \beta_1 \tilde x_{it} + \beta_2 \tilde z_{it} + \tilde \epsilon_{it}\]

What we have just done is apply the within transformation. The model above now captures the relationship between \(X's\) and \(Y's\) using only changes within each individual.
- This “ignores” variation across individuals.
This within transformation eliminates all time-invariant factors, including \(\alpha_i\).

Also of interest:
- This model could now be estimated using OLS.
- It's an application of the FWL theorem (we partial out the time-invariant factors).
- If done by OLS, you need to correct the degrees of freedom (NT-N-k).
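The within transformation can be checked by hand. The sketch below assumes frause is installed so that wagepan can be loaded; the variable prefixes avg_ and wt_ are names chosen only for this illustration.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
xtset nr year
* Within transformation by hand: subtract each person's own time average
foreach v of varlist lwage exper expersq married union {
    bysort nr: egen double avg_`v' = mean(`v')
    generate double wt_`v' = `v' - avg_`v'
}
* OLS on the demeaned data: same slopes as FE, but standard errors that
* ignore the N estimated means (degrees of freedom are not NT-N-k)
regress wt_lwage wt_exper wt_expersq wt_married wt_union
scalar b_hand = _b[wt_union]
* xtreg, fe applies the same transformation and adjusts the degrees of freedom
xtreg lwage exper expersq married union, fe
scalar b_fe = _b[union]
display "union by hand: " b_hand "   union with xtreg, fe: " b_fe
```

The two coefficients should agree up to numerical precision; only the standard errors differ, because plain OLS does not account for the N individual means that were removed.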

Expanding the model: Time fixed effects

Now, let's consider the following model:

\[y_{it} = \beta_1 x_{it} + \beta_2 z_{it} + \alpha_i + \delta_t + \epsilon_{it}\]

We could apply the same transformation as before, but now we need to consider the \(\delta_t\).
- Typically, the number of time periods is small, and we could control for them using dummies. (need to be explicit about it)
- Alternatively, one may need to use a double-demeaning approach.

\[\tilde y_{it} = y_{it} - \bar y_i - \bar y_t + \bar y\]

where \(\bar y_t\) is the average across individuals, and \(\bar y\) is the overall average of \(y_{it}\).

This will work as intended if the panel is balanced.
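As a sketch of double demeaning on a balanced panel, the code below compares it with individual fixed effects plus explicit year dummies. It assumes frause is installed; the prefixes pi_, pt_, po_ and dd_ are illustrative names, and exper is left out of both regressions so that the two specifications are the same.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
xtset nr year
foreach v of varlist lwage expersq married union {
    * person average, year average, and overall average
    bysort nr:   egen double pi_`v' = mean(`v')
    bysort year: egen double pt_`v' = mean(`v')
    egen double po_`v' = mean(`v')
    * double-demeaned variable: y - ybar_i - ybar_t + ybar
    generate double dd_`v' = `v' - pi_`v' - pt_`v' + po_`v'
}
regress dd_lwage dd_expersq dd_married dd_union
scalar b_dd = _b[dd_union]
* Benchmark: person fixed effects with year dummies
xtreg lwage expersq married union i.year, fe
scalar b_twfe = _b[union]
display "double demeaning: " b_dd "   xtreg, fe with i.year: " b_twfe
```

Because wagepan is balanced, the slopes from the two approaches should coincide up to numerical precision.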

When the panel is not balanced:

If the panel is not balanced, you need to demean the data using iterative methods.
Let's assume that \(\bar y=0\). We would need to demean the data many times as follows:

\[ \overline{ty}_{it} = y_{it} - \bar y_i - \bar y_t \] \[ \overline{tty}_{it} = \overline{ty}_{it} - \overline{ty}_i - \overline{ty}_t \] \[ \overline{ttty}_{it} = \overline{tty}_{it} - \overline{tty}_i - \overline{tty}_t \]

So on and so forth, until there is no more variation in the transformed data.

i.e. \(\overline{t \dots ty}_i = \overline{t\dots t y}_t=0\)

NOTE: There are more efficient ways to do this.
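A rough sketch of iterative demeaning on an unbalanced panel. Several things here are assumptions made for illustration: frause is installed, the panel is unbalanced artificially by dropping rows, the person and year means are removed one after the other (a sequential variant of the scheme above, which does not need \(\bar y=0\)), and the number of passes is fixed at 25 rather than checked for convergence.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
* Illustration only: drop some rows to make the panel unbalanced
drop if mod(nr + year, 7) == 0
* Alternate between removing person means and year means, 25 passes per variable
foreach v of varlist lwage expersq married union {
    quietly generate double it_`v' = `v'
    forvalues pass = 1/25 {
        quietly bysort nr: egen double m_i = mean(it_`v')
        quietly replace it_`v' = it_`v' - m_i
        quietly bysort year: egen double m_t = mean(it_`v')
        quietly replace it_`v' = it_`v' - m_t
        drop m_i m_t
    }
}
regress it_lwage it_expersq it_married it_union
scalar b_iter = _b[it_union]
* Benchmark: person fixed effects plus explicit year dummies
xtset nr year
xtreg lwage expersq married union i.year, fe
scalar b_twfe = _b[union]
display "iterative: " b_iter "   xtreg, fe with i.year: " b_twfe
```

In applied work a dedicated command for multiple fixed effects (for example the community-contributed reghdfe) is the more common choice; the loop only makes the mechanics visible.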

FE vs FD: Balanced Panel

FE and FD both aim to estimate the model by “eliminating” individual effects \(\alpha_i\).
With \(T=2\), both will give you the same results.
With \(T\geq 3\), you may need to choose based on assumptions about the error:
- If \(\epsilon_{it}\) is serially uncorrelated, then FE is more efficient. Otherwise, FD may be better (if the correlation is strong).
Otherwise, the typical suggestion is to try both and evaluate the results.
In general, there are few arguments for choosing between FD and FE.
- Empirically, people use FE because it's the default in most software.
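The \(T=2\) equivalence can be verified directly. The sketch below keeps two years of wagepan (assuming frause is installed; the choice of 1980 and 1981 and of the regressors is arbitrary). The constant in the FD regression corresponds to the year dummy in the FE regression.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
keep if inlist(year, 1980, 1981)
xtset nr year
* First differences with a constant
regress D.lwage D.married D.union
scalar b_fd = _b[D.union]
* Fixed effects with a year dummy
xtreg lwage married union i.year, fe
scalar b_fe = _b[union]
display "FD: " b_fd "   FE: " b_fe
```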

FE vs FD: Unbalanced Panel

Unbalanced panel data occur when different units are observed over different time periods.
- Some periods may or may not overlap, some units may skip periods, etc.
- It may be more important to understand why one observes this kind of missing data problem.
In this case, FD may be more difficult to use, because it requires data with regular time gaps. (Observations with missing data may be dropped.)
With FE, you make the most of the available data. Only “singletons” (units observed only once) are dropped.

Random Effects Models

Even if not done by hand, FE estimation is very computationally intensive and inefficient, because it requires estimating a large set of coefficients, one for each individual.
- This, however, is important if we believe that \(\alpha_i\) is correlated with \(x_{it}\).
If \(\alpha_i\) were uncorrelated with \(x_{it}\), then we could use a more efficient approach: the Random Effects model.
- \(Corr(\alpha_i, x_{it})=0\) can be a very hard assumption to defend.
If this is the case, we could estimate the model using pooled OLS (POLS), and the estimates would be consistent.
- However, the standard errors would be biased, because of the correlation across errors.

\[Corr(e_{it}+a_i,e_{is}+a_i)=\frac{\sigma^2_a}{\sigma^2_a+\sigma^2_e} \quad \text{for } t \neq s\]
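A short derivation of this correlation, under the usual RE assumptions that \(a_i\) and \(e_{it}\) are uncorrelated with each other and that \(e_{it}\) is serially uncorrelated with constant variance. For \(t \neq s\):

\[Cov(a_i+e_{it}, a_i+e_{is}) = Var(a_i) + Cov(a_i,e_{is}) + Cov(e_{it},a_i) + Cov(e_{it},e_{is}) = \sigma^2_a\]
\[Var(a_i+e_{it}) = Var(a_i+e_{is}) = \sigma^2_a + \sigma^2_e\]

Dividing the covariance by the common variance gives the expression above. As a numerical check using the values reported in the CRE output later in these notes (sigma_u = .3246, sigma_e = .3513): \(.3246^2/(.3246^2+.3513^2) \approx .1053/.2287 \approx .46\), which matches the reported rho.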

Random Effects Models: SE estimation

There are two ways to estimate the standard errors in a Random Effects model:
- One is to apply “cluster-robust” standard errors, using the individual as the cluster.
  - It's a generic solution to clustering, especially appropriate if we do not know how clustering happens. (but we know
- The second is to apply GLS. Since we know the “theoretical” correlation across errors, we can use it to transform the data and estimate the SE.

First, define:

\[\theta = 1- \left[ \frac{\sigma^2_e}{\sigma^2_e+T \sigma^2_a}\right]^{1/2}\]

All variables in the model (including the constant) should be transformed using quasi-demeaning as follows:

\[\tilde w_{it} = w_{it} - \theta \bar w_i\]

This transformation will eliminate the correlation across errors, and allow us to estimate the model using OLS. \[y_{it}-\theta \bar y_i = \beta_0 (1-\theta) + \beta_1 (x_{it} - \theta \bar x_i) + \beta_2 (z_{it} - \theta \bar z_i) + v_{it} - \theta \bar v_i\]

Last pieces of the puzzle:

Estimate the main model using pooled OLS, \(y_{it}=\beta_0 + \beta_1 x_{it} + v_{it}\), and keep the residuals \(\hat v_{it}\).
Estimate \(\sigma^2_a\) as: \(\hat \sigma^2_a = \frac{\sum_{i=1}^N \sum_{t=1}^{T-1}\sum_{s=t+1}^{T} \hat v_{it} \hat v_{is}}{NT(T-1)/2 - (k+1)}\)
Estimate \(\sigma^2_e\) as: \(\hat \sigma_e^2 = \hat\sigma^2_v - \hat \sigma^2_a\), where \(\hat\sigma^2_v\) is the residual variance from the pooled OLS regression.

The biggest advantage of the RE model is that you can now obtain effects for time-invariant variables.
It is also more efficient, because you do not need to estimate individual effects, just capture the distribution of \(\alpha_i\).
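The quasi-demeaning step can be reproduced by hand. This sketch assumes frause is installed and, as a shortcut, borrows the variance components that xtreg, re stores in e(sigma_u) and e(sigma_e) instead of computing them with the formulas above (xtreg's default estimator of these components may differ from those formulas). T is set to 8 because wagepan is balanced with 8 years; the prefixes mu_ and q_ are illustrative names.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
xtset nr year
xtreg lwage educ black hisp exper expersq married union, re
scalar b_re = _b[union]
* theta from the stored variance components, with T = 8
scalar theta = 1 - sqrt(e(sigma_e)^2 / (e(sigma_e)^2 + 8*e(sigma_u)^2))
* The constant is transformed too, so create it explicitly
generate double one = 1
foreach v of varlist lwage educ black hisp exper expersq married union one {
    bysort nr: egen double mu_`v' = mean(`v')
    generate double q_`v' = `v' - scalar(theta)*mu_`v'
}
* OLS on quasi-demeaned data, without an additional constant
regress q_lwage q_educ q_black q_hisp q_exper q_expersq q_married q_union q_one, noconstant
scalar b_gls = _b[q_union]
display "theta: " scalar(theta) "   by hand: " b_gls "   xtreg, re: " b_re
```

With \(\theta=0\) this collapses to pooled OLS, and with \(\theta=1\) to the within (FE) transformation, which is one way to see RE as sitting between the two.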

Example

In Stata, you could estimate the panel models using the xtreg command.

This command has options for Fixed Effects, Between Effects and Random Effects.

Code

frause wagepan, clear
** Good idea to Set the data as panel data
xtset nr year


Panel variable: nr (strongly balanced)
 Time variable: year, 1980 to 1987
         Delta: 1 unit

Code

** Pool OLS
qui: reg lwage educ black hisp exper expersq married union
est sto m1
** pool OLS with Clustered SE
qui: reg lwage educ black hisp exper expersq married union, cluster(nr)
est sto m2
** Random Effects: Default
qui:xtreg lwage educ black hisp exper expersq married union, re
est sto m3
** Fixed Effects
qui:xtreg lwage educ black hisp exper expersq married union, fe
est sto m4
esttab m1 m2 m3 m4, se b(4) noomit nonumber mtitle(OLS OLS_CL RE FE)


----------------------------------------------------------------------------
                      OLS          OLS_CL              RE              FE   
----------------------------------------------------------------------------
educ               0.0994***       0.0994***       0.1012***                
                 (0.0047)        (0.0092)        (0.0089)                   

black             -0.1438***      -0.1438**       -0.1441**                 
                 (0.0236)        (0.0501)        (0.0476)                   

hisp               0.0157          0.0157          0.0202                   
                 (0.0208)        (0.0392)        (0.0426)                   

exper              0.0892***       0.0892***       0.1121***       0.1168***
                 (0.0101)        (0.0124)        (0.0083)        (0.0084)   

expersq           -0.0028***      -0.0028**       -0.0041***      -0.0043***
                 (0.0007)        (0.0009)        (0.0006)        (0.0006)   

married            0.1077***       0.1077***       0.0628***       0.0453*  
                 (0.0157)        (0.0261)        (0.0168)        (0.0183)   

union              0.1801***       0.1801***       0.1074***       0.0821***
                 (0.0171)        (0.0276)        (0.0178)        (0.0193)   

_cons             -0.0347         -0.0347         -0.1075          1.0649***
                 (0.0646)        (0.1201)        (0.1107)        (0.0267)   
----------------------------------------------------------------------------
N                    4360            4360            4360            4360   
----------------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Pool OLS vs RE vs FE

We now know how to analyze panel data using three strategies: Pooled OLS (POLS), Fixed Effects, and Random Effects.
1. FE is usually preferred to RE because it remains consistent when \(a_i\) and \(x_{it}\) are correlated (it controls for \(a_i\) explicitly). It is less efficient.
2. RE may be preferred to FE if the correlation between \(a_i\) and \(x_{it}\) is small. It is more efficient, and allows you to estimate effects for time-invariant variables.
3. RE and POLS will be consistent under the same assumptions. However, RE will remove some of the serial correlation, and may have less bias than POLS even if \(a_i\) and \(x_{it}\) are correlated.
Choosing between RE and POLS is rarely considered. (RE would be the default in most cases.)
However, choosing between FE and RE is common: the Hausman test.

Hausman Test

The Hausman test is used to choose between two estimators.
1. You assume FE is consistent (but not efficient).
2. You estimate the model using RE. If the RE estimates are close to FE, then RE is consistent and efficient (preferred).
3. Otherwise, we suspect the RE estimates are inconsistent, and we use FE.
For most applied work, however, FE is generally preferred to RE.

Code

** Hausman Test
*** Consistent model FE
qui:xtreg lwage educ black hisp exper expersq married union, fe
est sto fe
*** Efficient under H0
qui:xtreg lwage educ black hisp exper expersq married union, re
est sto re
hausman fe re


                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       fe           re         Difference       Std. err.
-------------+----------------------------------------------------------------
       exper |    .1168467     .1121195        .0047272        .0016276
     expersq |   -.0043009    -.0040689        -.000232        .0001269
     married |    .0453033     .0627951       -.0174918        .0073427
       union |    .0820871     .1073789       -.0252917        .0073636
------------------------------------------------------------------------------
                          b = Consistent under H0 and Ha; obtained from xtreg.
           B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

    chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
            =  31.45
Prob > chi2 = 0.0000

Correlated Random Effects

The CRE model is an alternative approach that combines some of the features of the RE and FE estimators. \[y_{it} = \beta_1 x_{it} + \beta_2 z_{i} + \alpha_i + \epsilon_{it}\]
One way to look at the individual effect \(\alpha_i\) is to model it as a function of individual-level characteristics:

\[\alpha_i = \alpha + \gamma_1 \bar x_{i} + \gamma_2 z_{i} + r_i\]

In this case, we assume the fixed unobserved effect \(\alpha_i\) could be written as a function of average observed characteristics and fixed factors.
And we assume that \(r_i\) would be uncorrelated with \(x_{it}\) , \(\bar x_i\) and \(z_{i}\).

If we combine this with the main regression we have:

\[y_{it} = \beta_1 x_{it} + \beta_2 z_{i} + \alpha + \gamma_1 \bar x_{i} + \gamma_2 z_{i} + r_i + \epsilon_{it}\] \[y_{it} = \alpha + \beta_1 x_{it} + (\beta_2 + \gamma_2) z_{i} + \gamma_1 \bar x_{i} + \underbrace{\nu_{it}}_{ r_i + \epsilon_{it}}\]

We could now estimate this using pooled OLS or RE. There is no more need to worry about correlation between \(r_i\) and \(x_{it}\). Note that \(\beta_2\) and \(\gamma_2\) are not separately identified: the coefficient on \(z_i\) captures their sum.
Differences and Advantages:
- We must include individual level average characteristics.
- We can now estimate effects for time-invariant variables.
- We can test for FE vs RE models.

Correlated Random Effects: FE vs RE

CRE estimates for time-varying variables are identical to FE. \[\hat \beta_{cre}=\hat \beta_{fe}\]
CRE estimates show clearly why RE is more efficient (RE imposes \(\gamma=0\)).
Thus, we can test for FE vs RE using the following test:

\[H_0: \gamma = 0 \text{ or } RE \] \[H_a: \gamma \neq 0 \text{ or } FE \]

Example

Code

foreach i of varlist exper expersq married union {
  bysort nr: egen mn_`i'=mean(`i')
}
xtreg lwage educ black hisp exper expersq married union mn_*, re
est sto m2
** FE vs RE
test mn_exper mn_expersq mn_married mn_union


Random-effects GLS regression                   Number of obs     =      4,360
Group variable: nr                              Number of groups  =        545

R-squared:                                      Obs per group:
     Within  = 0.1780                                         min =          8
     Between = 0.2192                                         avg =        8.0
     Overall = 0.2002                                         max =          8

                                                Wald chi2(11)     =     976.27
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0946036   .0109043     8.68   0.000     .0732315    .1159757
       black |  -.1388124   .0488709    -2.84   0.005    -.2345977   -.0430271
        hisp |   .0047758   .0426925     0.11   0.911    -.0788999    .0884515
       exper |   .1168467   .0084197    13.88   0.000     .1003444     .133349
     expersq |  -.0043009   .0006053    -7.11   0.000    -.0054872   -.0031146
     married |   .0453033   .0183097     2.47   0.013      .009417    .0811896
       union |   .0820871   .0192907     4.26   0.000      .044278    .1198963
    mn_exper |  -.1672838    .051032    -3.28   0.001    -.2673046    -.067263
  mn_expersq |   .0094254   .0032684     2.88   0.004     .0030195    .0158312
  mn_married |   .0983604   .0450837     2.18   0.029     .0099979    .1867228
    mn_union |   .1885894   .0504022     3.74   0.000     .0898029    .2873759
       _cons |    .492309   .2210094     2.23   0.026     .0591386    .9254794
-------------+----------------------------------------------------------------
     sigma_u |  .32456727
     sigma_e |  .35125535
         rho |  .46057172   (fraction of variance due to u_i)
------------------------------------------------------------------------------

 ( 1)  mn_exper = 0
 ( 2)  mn_expersq = 0
 ( 3)  mn_married = 0
 ( 4)  mn_union = 0

           chi2(  4) =   27.27
         Prob > chi2 =    0.0000

Comparing across models:

Code

qui:xtreg lwage educ black hisp exper expersq married union, fe
est sto m1
qui:xtreg lwage educ black hisp exper expersq married union, re
est sto m3
esttab m1 m2 m3, se b(4) noomit nonumber mtitle(FE CRE RE)


------------------------------------------------------------
                       FE             CRE              RE   
------------------------------------------------------------
exper              0.1168***       0.1168***       0.1121***
                 (0.0084)        (0.0084)        (0.0083)   

expersq           -0.0043***      -0.0043***      -0.0041***
                 (0.0006)        (0.0006)        (0.0006)   

married            0.0453*         0.0453*         0.0628***
                 (0.0183)        (0.0183)        (0.0168)   

union              0.0821***       0.0821***       0.1074***
                 (0.0193)        (0.0193)        (0.0178)   

educ                               0.0946***       0.1012***
                                 (0.0109)        (0.0089)   

black                             -0.1388**       -0.1441** 
                                 (0.0489)        (0.0476)   

hisp                               0.0048          0.0202   
                                 (0.0427)        (0.0426)   

mn_exper                          -0.1673**                 
                                 (0.0510)                   

mn_expersq                         0.0094**                 
                                 (0.0033)                   

mn_married                         0.0984*                  
                                 (0.0451)                   

mn_union                           0.1886***                
                                 (0.0504)                   

_cons              1.0649***       0.4923*        -0.1075   
                 (0.0267)        (0.2210)        (0.1107)   
------------------------------------------------------------
N                    4360            4360            4360   
------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

CRE Implementation

As shown above, in Stata, you could estimate CRE panel models using xtreg with the re option.
- You just need to be careful when computing the averages of all variables in the model.
- This is particularly relevant for unbalanced panel data.
- In those cases, you could use cre (from fra install).
You could also extend this to multiple fixed effects (time and individual), but some equivalences are lost.

Code

fra install cre
cre , abs(nr): xtreg lwage educ black hisp exper expersq married union, re

checking cre consistency and verifying not already installed...
all files already exist and are up to date.

Random-effects GLS regression                   Number of obs     =      4,360
Group variable: nr                              Number of groups  =        545

R-squared:                                      Obs per group:
     Within  = 0.1780                                         min =          8
     Between = 0.2192                                         avg =        8.0
     Overall = 0.2002                                         max =          8

                                                Wald chi2(11)     =     976.27
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0946036   .0109043     8.68   0.000     .0732315    .1159757
       black |  -.1388124   .0488709    -2.84   0.005    -.2345977   -.0430271
        hisp |   .0047758   .0426925     0.11   0.911    -.0788999    .0884515
       exper |   .1168467   .0084197    13.88   0.000     .1003444     .133349
     expersq |  -.0043009   .0006053    -7.11   0.000    -.0054872   -.0031146
     married |   .0453033   .0183097     2.47   0.013      .009417    .0811896
       union |   .0820871   .0192907     4.26   0.000      .044278    .1198963
    m1_exper |  -.1672838    .051032    -3.28   0.001    -.2673046    -.067263
  m1_expersq |   .0094254   .0032684     2.88   0.004     .0030195    .0158312
  m1_married |   .0983604   .0450837     2.18   0.029     .0099979    .1867228
    m1_union |   .1885894   .0504022     3.74   0.000     .0898029    .2873759
       _cons |  -.0330167   .1330654    -0.25   0.804    -.2938202    .2277867
-------------+----------------------------------------------------------------
     sigma_u |  .32456727
     sigma_e |  .35125535
         rho |  .46057172   (fraction of variance due to u_i)
------------------------------------------------------------------------------
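The caution about averages in unbalanced panels can be illustrated by hand. In this sketch the panel is unbalanced artificially (an assumption for illustration only), frause is assumed to be installed, and the means are computed only over rows that enter the estimation; touse is an illustrative variable name.

```stata
* Assumption: frause is installed so that wagepan can be loaded
frause wagepan, clear
xtset nr year
* Illustration only: drop some rows to make the panel unbalanced
drop if mod(nr + year, 7) == 0
* Flag the rows that will be used in the estimation
generate byte touse = !missing(lwage, educ, black, hisp, exper, expersq, married, union)
* Individual means over the estimation sample only
foreach v of varlist exper expersq married union {
    bysort nr: egen double mn_`v' = mean(`v') if touse
}
xtreg lwage educ black hisp exper expersq married union mn_* if touse, re
scalar b_cre = _b[union]
xtreg lwage exper expersq married union if touse, fe
scalar b_fe = _b[union]
display "CRE: " b_cre "   FE: " b_fe
```

If the means were instead computed over rows that are not part of the estimation sample, the coefficients on the time-varying variables would in general no longer match FE.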

Final words

All estimation methodologies presented here could also be used in other contexts.
Examples:
- Geronimus and Korenman (1992): Analysis of sibling outcomes accounting for family fixed effects.
- Ashenfelter and Krueger (1994): Returns to education using twins data.
One may also use the principles of panel data with linked data in high-dimensional data sets.
- Education data and School FE
- Health data and Hospital FE
- Wages and Firm FE
- etc.
In these cases, one may also need to consider explicit clustering in addition to “fixed effects”.

That’s all folks!

Next week: Switching gears to Time Series Analysis - When the present, the past and the future matter