Performance of Wilcoxon-Mann-Whitney Test and t-test



This study compares the Type I error rate and power of the two-sample t-test and the Wilcoxon-Mann-Whitney (WMW) test. The two-sample t-test requires either that both population distributions be Normal or that the sample sizes be large enough for the sampling distribution of the difference in sample means to be approximately Normal. The WMW test is a nonparametric test that requires the two population distributions to have the same shape. When the two populations have the same mean, the Type I error rate is of interest; when they have different means, power is of interest.

This study analyzes several scenarios: two Normal distributions, a Normal and a Gamma distribution, and two Gamma distributions, each with small and large sample sizes. The better test is the one with the lower Type I error rate when the population means are equal, or the higher power when they differ.


Shiny app by Jimmy Wong
Base R code by Jimmy Wong
Shiny source files: GitHub Gist

It is time to Guess the Population! This game demonstrates how difficult it is to identify which samples come from the same population. Below are 4 histograms of randomly generated data, each with a sample size of 20: 2 are from N(3, 1) (Normal distribution) and 2 are from Gamma(6, 0.5) (Gamma distribution).

Can you determine which pair came from the Normal distribution and which pair from the Gamma distribution?
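The game's data can be imitated with a minimal Python sketch using only the standard library. It assumes that N(3, 1) means mean 3 and standard deviation 1, and that Gamma(6, 0.5) means shape 6 and scale 0.5. The seed, the bin range, and the helper names (`normal_sample`, `gamma_sample`, `bin_counts`) are illustrative choices, not taken from the app.

```python
import random

rng = random.Random(2024)  # arbitrary seed, chosen only for reproducibility
N = 20                     # sample size used in the game

def normal_sample():
    # N(3, 1): mean 3, standard deviation 1 (assumed parameterization)
    return [rng.gauss(3, 1) for _ in range(N)]

def gamma_sample():
    # Gamma(6, 0.5): shape 6, scale 0.5 (assumed), so the mean is 6 * 0.5 = 3
    return [rng.gammavariate(6, 0.5) for _ in range(N)]

# Two samples from each population, shuffled so the order gives nothing away
samples = [("Normal", normal_sample()), ("Normal", normal_sample()),
           ("Gamma", gamma_sample()), ("Gamma", gamma_sample())]
rng.shuffle(samples)

def bin_counts(data, lo=0.0, hi=8.0, bins=8):
    # Histogram counts over equal-width bins; values outside [lo, hi)
    # are clamped into the first or last bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in data:
        index = min(max(int((value - lo) / width), 0), bins - 1)
        counts[index] += 1
    return counts

for number, (label, data) in enumerate(samples, start=1):
    # The true label is held back; print it to check a guess
    print(f"Sample {number}: {bin_counts(data)}")
```

With only 20 observations per sample, the bin counts from the two populations tend to look alike, which is the point of the game.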


Normal distribution 1:

Normal distribution 2:






Note:

  1. Variances are fixed at 1
  2. P(rejecting H0 | μ1 = μ2) = Type I error rate
  3. P(rejecting H0 | μ1 ≠ μ2) = Power
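The two probabilities in the notes can be approximated by repeated sampling. The sketch below is one way to do that in standard-library Python and is not the app's own code: it assumes samples of size 20, a two-sided 5% level, a pooled-variance t-test with a hard-coded critical value, and a WMW test based on the normal approximation to the U statistic (without continuity correction). All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance

N = 20           # sample size per group (assumed)
ALPHA = 0.05     # two-sided significance level (assumed)
# Two-sided 5% critical value of Student's t with 2*N - 2 = 38 degrees of
# freedom; hard-coded because the standard library has no t distribution.
T_CRIT_DF38 = 2.0244

def t_test_rejects(x, y):
    # Pooled-variance two-sample t statistic compared with the critical value
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return abs(t) > T_CRIT_DF38

def wmw_rejects(x, y):
    # U counts the pairs with x_i > y_j; ties do not arise for continuous draws
    n1, n2 = len(x), len(y)
    u = sum(1 for xi in x for yj in y if xi > yj)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < ALPHA

def rejection_rates(mu1, mu2, reps=2000, seed=1):
    # Fraction of simulated data sets in which each test rejects H0
    rng = random.Random(seed)
    t_hits = w_hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu1, 1) for _ in range(N)]  # variance fixed at 1
        y = [rng.gauss(mu2, 1) for _ in range(N)]
        t_hits += t_test_rejects(x, y)
        w_hits += wmw_rejects(x, y)
    return t_hits / reps, w_hits / reps

type1_t, type1_w = rejection_rates(3, 3)  # equal means: Type I error rate
power_t, power_w = rejection_rates(3, 4)  # unequal means: power
print(f"Type I error rate: t-test {type1_t:.3f}, WMW {type1_w:.3f}")
print(f"Power:             t-test {power_t:.3f}, WMW {power_w:.3f}")
```

The same function serves both cases: calling it with equal means estimates the Type I error rate, and calling it with unequal means estimates power. The estimates carry Monte Carlo error, so small differences between the two tests should not be over-read.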

If the two Normal distributions have the same mean, focus on the Type I error rate; if they have different means, focus on power.


Normal distribution 1:

Normal distribution 2:





In this scenario, the mean of the 1st Normal distribution varies over the specified range, while the mean of the 2nd Normal distribution remains constant. The Type I error rate and power are compared between the t-test and the WMW test.

In the generated graph, each point is either a Type I error rate or a power. At most one point is a Type I error rate (the one where the two population means are equal); every other point is a power.

Normal distribution:

Gamma distribution:






Note:

  1. Variance of Normal is fixed at 1
  2. Gamma mean is the product of shape and scale
  3. P(rejecting H0 | μ1 = μ2) = Type I error rate
  4. P(rejecting H0 | μ1 ≠ μ2) = Power
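The note on the Gamma mean can be written out as a short worked equation. The symbols k (shape) and θ (scale) are notation introduced here, and reading Gamma(6, 0.5) from the guessing game as shape 6 and scale 0.5 is an assumption that fits this note.

```latex
\begin{align*}
X \sim \mathrm{Gamma}(k, \theta): \quad
  \mathbb{E}[X] &= k\theta, &
  \operatorname{Var}(X) &= k\theta^{2} \\
k = 6,\ \theta = 0.5: \quad
  \mathbb{E}[X] &= 6 \times 0.5 = 3, &
  \operatorname{Var}(X) &= 6 \times 0.5^{2} = 1.5
\end{align*}
```

Under that reading, Gamma(6, 0.5) has the same mean as N(3, 1) but a larger variance (1.5 rather than 1) and a right-skewed shape.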



Normal distribution:

Gamma distribution:





In this scenario, the mean of the Normal distribution varies over the specified range, while the mean of the Gamma distribution remains constant. The Type I error rate and power are compared between the t-test and the WMW test.

In the generated graph, each point is either a Type I error rate or a power. At most one point is a Type I error rate (the one where the two population means are equal); every other point is a power.

Gamma distribution 1:

Gamma distribution 2:







Note:

  1. Gamma mean is the product of shape and scale
  2. P(rejecting H0 | μ1 = μ2) = Type I error rate
  3. P(rejecting H0 | μ1 ≠ μ2) = Power



Gamma distribution 1:

Gamma distribution 2:





In this scenario, the mean of the 1st Gamma distribution remains constant, while the mean of the 2nd Gamma distribution varies according to the specified range of distances. The Type I error rate and power are compared between the t-test and the WMW test.

In the generated graph, each point is either a Type I error rate or a power. At most one point is a Type I error rate (the one where the two population means are equal); every other point is a power.