Hierarchical Models

This study focuses on implementing hierarchical models with nested data and comparing them to the pooled and unpooled methods. The pooled method completely ignores any nesting structure in the data, so it does not account for the variability in the response among the nesting groups. The unpooled method overstates the variability among the nesting groups by fitting separate estimates for each nesting group without taking into account information from other groups. Hierarchical models serve as a balance between these two methods: they account for the variability among the nesting groups while fitting group estimates that use information across all the groups.

Two key related concepts are present: borrowing strength and shrinkage. Borrowing strength refers to how estimates for groups with small sample sizes are pulled toward the average of all groups. Shrinkage refers to how estimates from hierarchical models are closer together than those from the unpooled method.

Level 1 observational units (i) are the units observed at the lowest level and are nested in groups. Level 2 observational units (j) are the groups in which the level 1 observational units are nested. Predictors at both levels can be used, but the response variable must be at level 1.
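The nesting described above can be sketched as a small long-format data set. This is only an illustration: the field names (`musician`, `solo`, `years_study`, `neg_affect`) and all values are hypothetical and are not taken from the app's data. Here `solo` stands in for a level 1 predictor, `years_study` for a level 2 predictor (constant within a group), and `neg_affect` for the level 1 response.

```python
# Hypothetical long-format records: one row per level 1 unit (a performance).
# Field names and values are made up for illustration only.
rows = [
    {"musician": "m01", "solo": 1, "years_study": 4, "neg_affect": 14.0},
    {"musician": "m01", "solo": 0, "years_study": 4, "neg_affect": 11.0},
    {"musician": "m02", "solo": 1, "years_study": 9, "neg_affect": 19.0},
    {"musician": "m02", "solo": 1, "years_study": 9, "neg_affect": 17.0},
    {"musician": "m02", "solo": 0, "years_study": 9, "neg_affect": 15.0},
]

# Level 2 units j: the distinct groups, in a fixed order.
group_ids = sorted({row["musician"] for row in rows})

# j[i]: for each level 1 unit i, the index of the group it is nested in.
j_of_i = [group_ids.index(row["musician"]) for row in rows]

# Group sizes n_j, counted from the nesting index.
n_j = [j_of_i.count(j) for j in range(len(group_ids))]

print(group_ids)
print(j_of_i)
print(n_j)
```

The list `j_of_i` plays the role of the index j[i] used in the model equations: it maps each level 1 unit to its level 2 group.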

Shiny app by Jimmy Wong

Shiny source files: GitHub Gist

Cal Poly Statistics Dept Shiny Series

Use sample data

The music data set is from a performance anxiety study conducted by Sadler and Miller (2010). They collected data on 37 undergraduate music majors who filled out performance diaries over a whole academic year. Before each performance, each musician also completed a Positive Affect Negative Affect Schedule (PANAS), in which two variables were measured: negative affect (a measure of anxiety) and positive affect (a measure of happiness). Variables were measured both on the musicians and on each of their performances. This app focuses on how negative affect is associated with characteristics of the musicians and characteristics of the performances.

Customize models if using uploaded data.

Varying-intercept:

Varying-intercept and varying-slope:

Varying-intercept and varying-slope with level 2 predictor:

The pooled method completely pools all level 1 observational units and ignores that there is nesting in level 2 observational units. Therefore, the pooled mean is the overall mean of the response variable, not accounting for variability among the level 2 observational units in the response variable.

The unpooled method fits a separate average for each level 2 observational unit. An issue is that the unpooled method exaggerates the variability in the response among the level 2 observational units because information across all level 2 observational units is ignored. For example, some level 2 observational units may have small sample sizes that would yield unreliable estimates with the unpooled method. However, this can be accounted for with hierarchical models.
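The two methods can be contrasted on a toy example. The sketch below uses made-up values (not the music data) and hypothetical group labels; it only shows what each method computes.

```python
from statistics import mean

# Hypothetical toy data: response values keyed by level 2 group.
groups = {
    "A": [14.0, 16.0, 15.0, 17.0, 13.0],
    "B": [20.0, 22.0, 21.0],
    "C": [9.0],  # a single observation, so its own average is very noisy
}

# Pooled: ignore the grouping and average every level 1 observation together.
all_values = [y for ys in groups.values() for y in ys]
pooled_mean = mean(all_values)

# Unpooled: one separate average per group, using only that group's data.
unpooled_means = {g: mean(ys) for g, ys in groups.items()}

print(f"pooled mean: {pooled_mean:.2f}")
for g, m in unpooled_means.items():
    print(f"group {g}: n={len(groups[g])}, unpooled mean={m:.2f}")
```

Group C's unpooled mean rests on one observation, which is the kind of unreliable estimate that a hierarchical model pulls toward the overall average.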

Model Equation

Varying-intercept model:

$$y_i = \alpha_{j[i]} + \epsilon_i, \quad \text{where } \epsilon_i \sim N(0,\sigma_y^{2})$$

$$\alpha_j = \mu_{\alpha} + \eta_j, \quad \text{where } \eta_j \sim N(0,\sigma_\alpha^{2})$$

Unified equation

$$y_i = \mu_{\alpha} + \eta_{j[i]} + \epsilon_i$$

$$\epsilon_i \sim N(0,\sigma_y^{2})$$

$$\eta_j \sim N(0,\sigma_\alpha^{2})$$
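One way to read the unified equation is as a recipe for generating data: draw one deviation per group, then add an individual error to each observation in that group. The sketch below follows that recipe with Python's standard library. The function name, the group sizes, and the parameter values are all hypothetical.

```python
import random

def simulate_varying_intercept(n_per_group, mu_alpha, sigma_alpha, sigma_y, seed=0):
    """Draw data from y_i = mu_alpha + eta_j[i] + eps_i.

    n_per_group lists the sample size n_j of each group; the result maps
    each group index j to the list of simulated responses in that group.
    """
    rng = random.Random(seed)
    data = {}
    for j, n_j in enumerate(n_per_group):
        eta_j = rng.gauss(0.0, sigma_alpha)  # group-level deviation, N(0, sigma_alpha^2)
        alpha_j = mu_alpha + eta_j           # true mean of group j
        # Each observation adds its own error, N(0, sigma_y^2).
        data[j] = [alpha_j + rng.gauss(0.0, sigma_y) for _ in range(n_j)]
    return data

# Hypothetical settings: four groups of unequal size.
data = simulate_varying_intercept([5, 10, 20, 3], mu_alpha=16.0, sigma_alpha=2.0, sigma_y=4.0)
for j, ys in data.items():
    print(j, len(ys), round(sum(ys) / len(ys), 2))
```

Note that `random.gauss` takes a standard deviation, so the variances in the equations correspond to the squares of `sigma_alpha` and `sigma_y`.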

Explore equations

$$y_i = \text{true response for observational unit } i$$

$$\alpha_{j[i]} = \text{true HLM mean of group } j \text{ for observational unit } i$$

$$\epsilon_i = \text{deviation of observational unit } i \text{ from its group average}$$

$$\sigma_y^{2} = \text{within group variance in response}$$

$$\mu_{\alpha} = \text{true average of group averages}$$

$$\eta_j = \text{deviation of group } j \text{ from true average}$$

$$\sigma_\alpha^{2} = \text{between group variance in response}$$

Caterpillar Plot

Shrinkage of Estimates

Hierarchical weighted average for group j:

$$\hat{\alpha}_j^{HLM} = \hat{\omega}_j \cdot \hat{\mu}_\alpha + (1-\hat{\omega}_j) \cdot \bar{y}_j$$

$$\hat{\omega}_j = 1-\frac{n_j \cdot \hat{\sigma}_\alpha^{2}}{n_j \cdot \hat{\sigma}_\alpha^{2}+\hat{\sigma}_y^{2}}$$

Explore formula

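$$\hat{\omega}_j = \text{pooling factor}$$

$$n_j = \text{sample size of group } j$$

$$\bar{y}_j = \text{unpooled mean of group } j$$

$$\hat{\mu}_\alpha = \text{estimated average of group averages}$$

$$\hat{\sigma}_y^{2} = \text{within group (unexplained) variance in response}$$

$$\hat{\sigma}_\alpha^{2} = \text{between group (explained) variance in response}$$

$$\bar{y}_{all} = \text{pooled mean}$$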
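The weighted average above can be evaluated directly once the variance components are known. The sketch below is a minimal illustration that treats the estimates as given inputs rather than fitting a model; the function names and the numeric values are hypothetical.

```python
def pooling_factor(n_j, sigma2_alpha, sigma2_y):
    """omega_j = 1 - n_j*sigma2_alpha / (n_j*sigma2_alpha + sigma2_y)."""
    return 1.0 - (n_j * sigma2_alpha) / (n_j * sigma2_alpha + sigma2_y)

def hlm_mean(ybar_j, n_j, mu_alpha, sigma2_alpha, sigma2_y):
    """Weighted average of the overall mean and the group's unpooled mean."""
    omega_j = pooling_factor(n_j, sigma2_alpha, sigma2_y)
    return omega_j * mu_alpha + (1.0 - omega_j) * ybar_j

# Hypothetical values: same unpooled mean, different group sizes.
mu_alpha, sigma2_alpha, sigma2_y = 16.0, 4.0, 16.0
for n_j in (1, 4, 16):
    omega_j = pooling_factor(n_j, sigma2_alpha, sigma2_y)
    estimate = hlm_mean(9.0, n_j, mu_alpha, sigma2_alpha, sigma2_y)
    print(f"n_j={n_j:2d}  omega_j={omega_j:.2f}  HLM mean={estimate:.2f}")
```

With these made-up inputs, a group with a single observation gets a pooling factor of 0.8 and is pulled most of the way toward the overall mean, while a group with 16 observations gets 0.2 and stays close to its own average. This is borrowing strength in numeric form.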

Distribution of Estimates

Distribution of HLM means:

$$\alpha_j \sim N(\mu_{\alpha},\sigma_\alpha^{2})$$
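The normal distribution of group means can be turned into concrete statements about where group averages are expected to fall. The sketch below uses `statistics.NormalDist` from the Python standard library with hypothetical values for the two parameters.

```python
from statistics import NormalDist

mu_alpha = 16.0    # hypothetical average of group averages
sigma_alpha = 2.0  # hypothetical between-group standard deviation

group_means = NormalDist(mu=mu_alpha, sigma=sigma_alpha)

# Central 95% range for the true group means under the model.
lower = group_means.inv_cdf(0.025)
upper = group_means.inv_cdf(0.975)

# Share of groups whose true mean is expected to exceed 20.
share_above_20 = 1.0 - group_means.cdf(20.0)

print(f"95% of group means between {lower:.2f} and {upper:.2f}")
print(f"share of group means above 20: {share_above_20:.3f}")
```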

Show distribution of group-level errors

Intraclass Correlation Coefficient

Intraclass correlation coefficient:

$$ICC = \frac{\hat{\sigma}_\alpha^{2}}{\hat{\sigma}_\alpha^{2}+\hat{\sigma}_y^{2}}$$

Explore formula

$$\hat{\sigma}_y^{2} = \text{within group (unexplained) variance in response}$$

$$\hat{\sigma}_\alpha^{2} = \text{between group (explained) variance in response}$$
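As a small numeric illustration, the intraclass correlation coefficient can be computed from the two variance components, assuming the usual definition: between-group variance divided by the sum of between-group and within-group variance. The function name and input values below are hypothetical.

```python
def icc(sigma2_alpha, sigma2_y):
    """Share of the total variance in the response that lies between groups."""
    total = sigma2_alpha + sigma2_y
    if total <= 0:
        raise ValueError("variance components must sum to a positive number")
    return sigma2_alpha / total

# Hypothetical variance estimates.
print(f"ICC = {icc(4.0, 16.0):.2f}")
```

A value near 0 means the groups barely differ, so pooling loses little. A value near 1 means most of the variation is between groups.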

Show ICC

Ratio of variances

Ratio of variances:

Explore formula

$$\hat{\sigma}_y^{2} = \text{within group (unexplained) variance in response}$$

$$\hat{\sigma}_\alpha^{2} = \text{between group (explained) variance in response}$$

Show ratio of variances

HLM Output

Show confidence intervals

HLM Estimates Table

HLM Model Equation

Varying-intercept and varying-slope model:

$$y_i = \alpha_{j[i]} + \beta_{j[i]}x_i + \epsilon_i$$

$$\alpha_j = \mu_{\alpha} + \eta^{\alpha}_j$$

$$\beta_j = \mu_{\beta} + \eta^{\beta}_j$$

$$\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \sim N\begin{pmatrix} \begin{pmatrix} \mu_{\alpha} \\ \mu_{\beta} \end{pmatrix} , \begin{pmatrix} \sigma_\alpha^{2} & \rho\sigma_\alpha\sigma_\beta \\ \rho\sigma_\alpha\sigma_\beta & \sigma_\beta^{2} \end{pmatrix} \end{pmatrix}$$

Error terms:

$$\epsilon_i \sim N(0,\sigma_y^{2})$$

$$\begin{pmatrix} \eta^{\alpha}_j \\ \eta^{\beta}_j \end{pmatrix} \sim N\begin{pmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \begin{pmatrix} \sigma_\alpha^{2} & \rho\sigma_\alpha\sigma_\beta \\ \rho\sigma_\alpha\sigma_\beta & \sigma_\beta^{2} \end{pmatrix} \end{pmatrix}$$
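The joint distribution of the group-level errors can be sketched as a simulation. The code below draws a correlated pair of intercept and slope deviations from two independent standard normal draws, which gives the covariance matrix above, and then generates one group's responses. All function names and parameter values are hypothetical, and this is an illustration of the model structure rather than the app's own code.

```python
import math
import random

def draw_group_effects(sigma_alpha, sigma_beta, rho, rng):
    """Draw (eta_alpha_j, eta_beta_j) from a mean-zero bivariate normal with
    standard deviations sigma_alpha, sigma_beta and correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    eta_alpha = sigma_alpha * z1
    # Mixing z1 and z2 this way gives Cov(eta_alpha, eta_beta) = rho*sigma_alpha*sigma_beta.
    eta_beta = sigma_beta * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return eta_alpha, eta_beta

def simulate_group(x_values, mu_alpha, mu_beta, sigma_alpha, sigma_beta, rho, sigma_y, rng):
    """Simulate one group: its intercept, its slope, and its responses."""
    eta_alpha, eta_beta = draw_group_effects(sigma_alpha, sigma_beta, rho, rng)
    alpha_j = mu_alpha + eta_alpha
    beta_j = mu_beta + eta_beta
    ys = [alpha_j + beta_j * x + rng.gauss(0.0, sigma_y) for x in x_values]
    return alpha_j, beta_j, ys

# Hypothetical settings for a single group with four observations.
rng = random.Random(42)
alpha_j, beta_j, ys = simulate_group(
    [0, 1, 2, 3], mu_alpha=16.0, mu_beta=0.5,
    sigma_alpha=2.0, sigma_beta=0.3, rho=-0.4, sigma_y=4.0, rng=rng,
)
print(round(alpha_j, 2), round(beta_j, 2), [round(y, 2) for y in ys])
```

A negative correlation, as in this made-up setting, means groups with higher intercepts tend to have smaller slopes.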

HLM Graph

Caterpillar Plot

HLM Distributions of Estimates

Show distributions of group-level errors

HLM Output

Show confidence intervals

HLM Estimates Table

Intercepts:

Slopes:

HLM Model Equation

Varying-intercept and varying-slope with level 2 predictor model:

$$y_i = \alpha_{j[i]} + \beta_{j[i]}x_i + \epsilon_i$$

$$\alpha_j = \gamma^{\alpha}_0 + \gamma^{\alpha}_1\mu_j + \eta^{\alpha}_j$$

$$\beta_j = \gamma^{\beta}_0 + \gamma^{\beta}_1\mu_j + \eta^{\beta}_j$$

$$\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \sim N\begin{pmatrix} \begin{pmatrix} \gamma^{\alpha}_0 + \gamma^{\alpha}_1\mu_j \\ \gamma^{\beta}_0 + \gamma^{\beta}_1\mu_j \end{pmatrix} , \begin{pmatrix} \sigma_\alpha^{2} & \rho\sigma_\alpha\sigma_\beta \\ \rho\sigma_\alpha\sigma_\beta & \sigma_\beta^{2} \end{pmatrix} \end{pmatrix}$$
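Substituting the two group-level equations into the level 1 equation gives a single-equation form of this model. The derivation below is a worked sketch that keeps the notation above, with the level 2 predictor written with subscript j[i] for the group of observational unit i, and it assumes the error terms follow the same distributions as in the varying-intercept and varying-slope model.

$$y_i = \left(\gamma^{\alpha}_0 + \gamma^{\alpha}_1\mu_{j[i]} + \eta^{\alpha}_{j[i]}\right) + \left(\gamma^{\beta}_0 + \gamma^{\beta}_1\mu_{j[i]} + \eta^{\beta}_{j[i]}\right)x_i + \epsilon_i$$

$$y_i = \underbrace{\gamma^{\alpha}_0 + \gamma^{\alpha}_1\mu_{j[i]} + \gamma^{\beta}_0 x_i + \gamma^{\beta}_1\mu_{j[i]}x_i}_{\text{fixed effects}} + \underbrace{\eta^{\alpha}_{j[i]} + \eta^{\beta}_{j[i]}x_i + \epsilon_i}_{\text{random effects and error}}$$

The product term in the fixed part is a cross-level interaction: the slope of the level 1 predictor changes with the value of the level 2 predictor.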

Caterpillar Plot

HLM Hyperparameters

Fixed effects

Random effects

Show confidence intervals

HLM Output

Customize Model:

HLM Output
