02: Resampling Methods

General Overview

Dean Adams, Iowa State University

Conceptual Motivation

Statistical methods are often used to test hypotheses and make inferences

Requires parametric theory to estimate parameters & CI

Parametric Expected Distributions

Numerous distributions of expected values have been generated from theory for different types of data and hypotheses

Each has assumptions about the behavior of the underlying data

Challenges to Parametric Theory

Power \(\small\downarrow\) as dimensionality \(\small\uparrow\), and eventually computations cannot be completed (the ‘curse’ of dimensionality)

Alternative mechanisms for evaluating hypotheses are required

Resampling Methods

Outline

1: Randomization/Permutation

2: The Bootstrap

3: The Jackknife

1: Randomization/Permutation

Schematic of randomization procedure for \(t\)-test

Randomization: Fisher’s Exact Test

Complete enumeration of all possible permutations

Provides exact probability of \(E_{obs}\)
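Complete enumeration can be sketched in a few lines of Python. This is a minimal illustration with made-up values, not data from the lecture: every assignment of the pooled observations to the two groups is evaluated, giving the exact probability of a result at least as extreme as the observed one.

```python
from itertools import combinations

# Hypothetical two-group data (illustrative values only)
x = [3.1, 4.5, 2.8, 5.0]
y = [6.2, 7.1, 5.9]
pooled = x + y
n_a = len(x)

# Observed statistic: absolute difference in group means
obs = abs(sum(x) / len(x) - sum(y) / len(y))

# Enumerate every way of assigning n_a of the pooled values to group A
count = total = 0
for idx in combinations(range(len(pooled)), n_a):
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if abs(sum(a) / len(a) - sum(b) / len(b)) >= obs:
        count += 1

p_exact = count / total  # exact probability of a result at least as extreme
```

With 7 observations there are only \(\binom{7}{4}=35\) assignments, but the number of permutations explodes with sample size, which is why Monte Carlo sampling of permutations replaces complete enumeration for larger \(N\).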

Randomization: Example


##    D.obs  P-value 
## 2.445696 0.001000
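The output above comes from R. The same Monte Carlo logic can be sketched in Python; the data here are simulated and the variable names are illustrative, not the lecture's dataset:

```python
import random

random.seed(1)
# Hypothetical two-group samples (simulated, not the lecture's data)
g1 = [random.gauss(0, 1) for _ in range(20)]
g2 = [random.gauss(1, 1) for _ in range(20)]

def mean_diff(a, b):
    return abs(sum(a) / len(a) - sum(b) / len(b))

d_obs = mean_diff(g1, g2)
pooled = g1 + g2

# Shuffle group labels many times; the observed case counts as one permutation
n_iter = 999
exceed = 1
for _ in range(n_iter):
    random.shuffle(pooled)
    if mean_diff(pooled[:20], pooled[20:]) >= d_obs:
        exceed += 1

p_value = exceed / (n_iter + 1)
```

Counting the observed case as one permutation means the smallest attainable significance level is \(1/(n_{iter}+1)\), here 0.001, matching the P-value reported above.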

Randomization: Some Comments

Note the null distribution from our previous example:

How Many Permutations to Use?

\(\small\uparrow\) # of iterations improves precision of estimates of significance

With higher computing power, large numbers of iterations are feasible

2: The Bootstrap

Resample dataset many times with replacement

Each bootstrap iteration contains \({N}\) objects, but some are represented multiple times, and others not at all

The bootstrap is very useful for estimating confidence intervals, because summary measures derived from bootstrap samples approximate those of the original population distribution

Percentile Bootstrap Confidence Intervals

Generate bootstrap datasets

Estimate summary statistic from each \(E_{boot}\)

Bootstrap CI are: upper and lower \(\small\alpha/2\) values of \(E_{boot}\) sample (usually 0.025 and 0.975)
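A minimal percentile-bootstrap sketch for the mean of a simulated sample (the sample, sizes, and seed are arbitrary):

```python
import random

random.seed(2)
data = [random.gauss(10, 2) for _ in range(50)]  # hypothetical sample

# Resample with replacement; each bootstrap sample has N objects
n_boot = 2000
boot_means = []
for _ in range(n_boot):
    resample = [random.choice(data) for _ in data]
    boot_means.append(sum(resample) / len(resample))

# Percentile CI: lower and upper alpha/2 values of the E_boot sample
boot_means.sort()
lo = boot_means[int(0.025 * n_boot)]
hi = boot_means[int(0.975 * n_boot) - 1]
```

The interval (`lo`, `hi`) is the 95% percentile bootstrap CI for the mean.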

Bias-Corrected Percentile Bootstrap Confidence Intervals

Sometimes the bootstrap distribution is skewed, such that \(\small\mu_{E_{boot}}\neq{E_{obs}}\)

Bias is alleviated by finding the fraction (Fr) of bootstrap values below \(\small{E}_{obs}\)

Adjust as \(\small{CI}=\Phi[2\Phi^{-1}(Fr)\pm{Z}_{\alpha/2}]\) where \(\small\Phi\) is the cumulative normal distribution
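The adjustment can be sketched directly from the formula, using Python's `statistics.NormalDist` for \(\Phi\) and \(\Phi^{-1}\). The skewed sample here is simulated, and Fr follows the standard bias-corrected convention of the fraction of bootstrap estimates below \(E_{obs}\):

```python
import random
from statistics import NormalDist

random.seed(3)
nd = NormalDist()

# Skewed hypothetical sample (exponential), statistic = mean
data = [random.expovariate(1.0) for _ in range(40)]
e_obs = sum(data) / len(data)

n_boot = 2000
boot = []
for _ in range(n_boot):
    rs = [random.choice(data) for _ in data]
    boot.append(sum(rs) / len(rs))
boot.sort()

# Bias correction: fraction of bootstrap estimates below E_obs
fr = sum(b < e_obs for b in boot) / n_boot
z0 = nd.inv_cdf(fr)
z = nd.inv_cdf(0.975)  # Z_{alpha/2} for a 95% interval

# Adjusted percentiles: Phi[2*Phi^{-1}(Fr) +/- Z_{alpha/2}]
a_lo = nd.cdf(2 * z0 - z)
a_hi = nd.cdf(2 * z0 + z)
ci_lo = boot[int(a_lo * n_boot)]
ci_hi = boot[min(int(a_hi * n_boot), n_boot - 1)]
```

When the bootstrap distribution is symmetric about \(E_{obs}\), Fr ≈ 0.5, \(z_0 \approx 0\), and the adjusted percentiles collapse back to the ordinary 0.025 and 0.975.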

3: The Jackknife

‘Leave one out’ resampling: each iteration contains \(\small{N-1}\) objects

Investigate the precision of \(\small{E}_{obs}\) and how sensitive it is to specific values in a dataset

Useful to measure bias, standard error or CI of test statistic

\(\small{Bias}(E_{obs})=E_{obs}-\mu_{E_{jack}}\)
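A minimal jackknife sketch for the mean of a small made-up sample, computing the bias estimate above and the jackknife standard error:

```python
# Hypothetical sample (illustrative values only)
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
e_obs = sum(data) / n

# 'Leave one out': each iteration contains N - 1 objects
jack = []
for i in range(n):
    subset = data[:i] + data[i + 1:]
    jack.append(sum(subset) / (n - 1))

mu_jack = sum(jack) / n
bias = e_obs - mu_jack  # Bias(E_obs) = E_obs - mu_E_jack

# Jackknife standard error of the statistic
se_jack = ((n - 1) / n * sum((j - mu_jack) ** 2 for j in jack)) ** 0.5
```

For the sample mean the jackknife bias is exactly zero and `se_jack` equals the usual \(s/\sqrt{n}\); the procedure becomes informative for statistics (e.g., ratios, correlations) whose leave-one-out estimates are sensitive to individual observations.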

Revisiting Permutation: Designing The Test

Devising a proper permutation test requires several components:

1: Identifying the null hypothesis \(\small{H}_{0}\)

2: Determining whether there is a known expected value under \(\small{H}_{0}\)

3: Identifying which values may be permuted and how to estimate expected distribution under \(\small{H}_{0}\)

see: Commenges (2003); Good (2004); Sekora, Adams & Collyer. Heredity (2015); Adams & Collyer. Evol. (2018)


Essentially, one must determine:

1: What to permute?

2: How to permute it?

The Logic of Permutation Tests

Permutation tests generate empirical sampling distributions under \(\small{H}_{0}\)

How do we accomplish this?

A simple logic flow:

  1. Define \(\small{H}_{0}\) and \(\small{H}_{1}\)

  2. Identify what differs between \(\small{H}_{0}\) and \(\small{H}_{1}\) (i.e., what does \(\small{H}_{1}\) quantify relative to \(\small{H}_{0}\)?)

  3. Permute the data which ‘breaks up’ the signal in \(\small{H}_{1}\) relative to \(\small{H}_{0}\)

The Logic of Permutation Tests: Example


Permutation Example: Comments

Here, permuting the response data (\(\mathbf{Y}\)) as a set of exchangeable units appears reasonable (this is ‘full’ randomization)

Exchangeable Units


Example: Say one is interested in the difference between two means. Permuting individuals among groups produces outcomes where the overall mean and overall variance are the same in every permutation. Thus, when compared to this empirical sampling distribution, evaluation of the observed difference is made with respect to constant global mean and variance.
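This invariance is easy to verify numerically. The sketch below (simulated values, arbitrary group sizes) confirms that every permutation of individuals among groups leaves the overall mean and overall variance unchanged:

```python
import random

random.seed(4)
# Hypothetical individuals; group labels would split them 6 vs 4
values = [random.gauss(0, 1) for _ in range(10)]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

grand_mean, grand_var = mean(values), var(values)

for _ in range(100):
    perm = random.sample(values, len(values))  # permute group membership
    # Group labels change, but the pooled sample does not:
    assert abs(mean(perm) - grand_mean) < 1e-9
    assert abs(var(perm) - grand_var) < 1e-9
```

Only the partitioning of values among groups varies across permutations, so the empirical sampling distribution reflects group differences against a constant global mean and variance.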

Effect Sizes from Empirical Null Sampling Distributions

  1. Perform RRPP many times.
  2. Calculate \(F\)-value in every random permutation (observed case counts as one permutation)
  3. For \(N\) permutations, \(P = \small\frac{N(F_{random} \geq F_{obs})}{N}\)
  4. Calculate effect size as a standard deviate of the observed value in a normalized distribution of random values (helps for comparing effects within and between models); i.e., \[\small{z = \frac{ \log\left( F\right) - \mu_{\log\left(F\right)} } { \sigma_{\log\left(F\right)} }}\] where \(\small\mu_{\log\left(F\right)}\) and \(\sigma_{\log\left(F\right)}\) are the expected value and standard deviation from the sampling distribution, respectively.
Collyer et al. (2015); Adams & Collyer. (2016; 2018; 2019)
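The P-value and effect-size steps above can be sketched with a simulated sampling distribution; the distribution used here is arbitrary, standing in for the \(F\)-values that RRPP iterations would produce:

```python
import math
import random

random.seed(5)
# Hypothetical permutation distribution of F-values; observed case included
f_obs = 6.5
f_random = [f_obs] + [random.expovariate(1.0) + 0.1 for _ in range(999)]
n = len(f_random)

# P = N(F_random >= F_obs) / N
p = sum(f >= f_obs for f in f_random) / n

# Effect size: standard deviate of log(F_obs) in the log-F distribution
logs = [math.log(f) for f in f_random]
mu = sum(logs) / n
sd = (sum((x - mu) ** 2 for x in logs) / n) ** 0.5
z = (math.log(f_obs) - mu) / sd
```

Because \(z\) is a standard deviate from the null distribution rather than a raw statistic, effect sizes can be compared within and between models even when the statistics themselves are on different scales.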

Consequences of Incorrect Permutation

Incorrectly assigning exchangeable units can result in elevated type I error rates and incorrect inferences

Here, permuting phylogenetically independent contrasts is incorrect, because these values contain information from both the response (\(\small\mathbf{Y}\)) data as well as the phylogeny among taxa (see PCM lecture)

Full Randomization

For simple linear models: \(\small\mathbf{Y}=\mathbf{X}\mathbf{\beta } + \mathbf{E}\), permuting \(\small\mathbf{Y}\) relative to \(\small\mathbf{X}\) is often proposed

This is sufficient for \(t\)-tests and correlation tests, and for simple linear models (e.g., single-factor models)

Permutation Procedures: Factorial Models

For more complex models with multiple explanatory factors:

Factorial model: \(\small\mathbf{Y}=\mathbf{X_{A}}\mathbf{\beta_{A}} +\mathbf{X_{B}}\mathbf{\beta_{B}} +\mathbf{X_{AB}}\mathbf{\beta_{AB}}+\mathbf{E}\)

Permuting \(\small\mathbf{Y}\) is possible, but seems inadequate

Restricted Randomization

An alternative is to restrict the resampling to sub-strata of data (strata based on levels within factors)

Permute \(\small\mathbf{Y}\) within levels of A then within levels of B

Permits evaluation of \(\small{SS_{A}}\) and \(\small{SS_{B}}\) but NOT \(\small{SS_{AB}}\)

Factorial Models: Understanding the Null

The key to identifying exchangeable units lies with the \(\small{H}_{0}\):

Factorial models \(\small\mathbf{Y}=\mathbf{X_{A}}\mathbf{\beta_{A}} +\mathbf{X_{B}}\mathbf{\beta_{B}} +\mathbf{X_{AB}}\mathbf{\beta_{AB}}+\mathbf{E}\) are a set of sequential hypotheses comparing full (\(\small\mathbf{X}_{F}\)) and reduced (\(\small\mathbf{X}_{R}\)) models

Testing each \(\small\mathbf{X}_{F}\) requires appropriate permutation procedure for each \(\small\mathbf{X}_{R}\)

Residual randomization provides proper exchangeable units under each \(\small\mathbf{X}_{R}\)

Anderson (2001); Anderson & ter Braak (2003); Collyer & Adams (2007); Collyer, Sekora & Adams (2015); Adams & Collyer (2018; 2019)

Residual Randomization

Permute residuals \(\mathbf{E}_{R}\) from reduced model \(\small\mathbf{X}_{R}\), rather than original values

Evaluates \(\small{SS}_{\mathbf{X}_{F}}\) while holding effects of \(\small\mathbf{X}_{R}\) constant

Must specify full and reduced models (Type I SS used in example)

Mathematical justification:

1: Under \(\small{H}_{0}\), for any \(\small\mathbf{X}_{R}\): \(\small{SS}_{\mathbf{X}_{F}}=0\)

2: Under \(\small\mathbf{X}_{R}\), \(\mathbf{E}_{R}\) represent those components of SS NOT explained by \(\small\mathbf{X}_{R}\) (includes \(\small{RSS}\) of \(\small\mathbf{X}_{F}\) plus SS from term(s) not in \(\small\mathbf{X}_{R}\))

3: Thus, permuting \(\mathbf{E}_{R}\) precisely embodies the \(\small{H}_{0}\) of: \(\small{SS}_{\mathbf{X}_{F}}=0\)

Residual Randomization Permutation Procedure (RRPP)

1: Fit \(\small\mathbf{X}_{F}\) for each term in model; obtain coefficients and summary statistics (e.g., \(\small{SS}_{X}\))

2: Fit \(\small\mathbf{X}_{R}\) for each \(\small\mathbf{X}_{F}\); Estimate \(\small\hat{\mathbf{Y}}_{R}\) and \(\mathbf{E}_{R}\)

3: Permute \(\mathbf{E}_{R}\); obtain pseudo-values as: \(\small\mathbf{\mathcal{Y}} = \mathbf{\hat{Y}}_{R} + \mathbf{E}_{R}\)

4: Fit \(\small\mathbf{X}_{F}\) using \(\small\mathbf{\mathcal{Y}}\): obtain coefficients and summary statistics

5: Repeat
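The five steps above can be sketched for the simplest case: a univariate single-factor full model with an intercept-only reduced model. Data, group sizes, and names are illustrative:

```python
import random

random.seed(7)
# Hypothetical data: full model Y ~ group, reduced model Y ~ 1
y = [random.gauss(0, 1) for _ in range(12)]
group = [0] * 6 + [1] * 6

def model_ss(y, group):
    """SS explained by the group term (between-group SS)."""
    gm = sum(y) / len(y)
    ss = 0.0
    for g in set(group):
        vals = [v for v, gg in zip(y, group) if gg == g]
        ss += len(vals) * (sum(vals) / len(vals) - gm) ** 2
    return ss

# Steps 1-2: fit full and reduced models; reduced fitted values and residuals
ss_obs = model_ss(y, group)
y_hat_r = [sum(y) / len(y)] * len(y)       # Y_hat_R under the intercept model
e_r = [v - f for v, f in zip(y, y_hat_r)]  # E_R

# Steps 3-5: permute E_R, form pseudo-values, refit full model, repeat
n_iter = 999
exceed = 1  # observed case counts as one permutation
for _ in range(n_iter):
    random.shuffle(e_r)
    pseudo = [f + e for f, e in zip(y_hat_r, e_r)]  # Y* = Y_hat_R + E_R
    if model_ss(pseudo, group) >= ss_obs:
        exceed += 1

p_value = exceed / (n_iter + 1)
```

Because \(\hat{\mathbf{Y}}_{R}\) is a constant here, the pseudo-values are just permuted \(\mathbf{Y}\) shifted by the mean; with richer reduced models the fitted values differ across observations, and RRPP diverges from full randomization.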

Residual Randomization (RRPP): Single Factor Models

NOTE: for single-factor models, permuting \(\small\mathbf{Y}\) is equivalent to RRPP

Reason: \(\small{H}_{0}\) for single-factor model is \(\mathbf{Y}\)~1 (intercept model)

Residuals of \(\small{H}_{0}\) are simply deviations from \(\small\mu_{Y}\)

And since \(\small\mu_{Y}\) is a constant, the two permutation distributions are identical

Which Permutation Method Should We Use?

OLS Factorial Model (Adams & Collyer, unpubl.)

Full Randomization, Restricted Randomization, and RRPP all perform appropriately (but permuting full-model residuals does not)

Which Permutation Method Should We Use?

GLS Factorial Model (Adams & Collyer, unpubl.)

Only RRPP is appropriate

RRPP & Multivariate Data

RRPP is unaltered for multivariate data

Shuffle ROWS of \(\mathbf{E}_{R}\)

The rest of the procedure is unchanged

RRPP & Distance Data

Mantel tests (see Matrix Covariation) shuffle rows AND columns of distance matrices

Preferred approach: RRPP

1: PCoA of distance matrix to obtain coordinates for \(\small{Y}\)

2: Fit model: \(\small \mathbf{Y}=\mathbf{X}\mathbf{\beta } + \mathbf{E}\)

3: RRPP of rows of \(\mathbf{E}_{R}\)
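Step 1 (PCoA, i.e., classical metric MDS) can be sketched with NumPy; the points, seed, and names below are illustrative, and the distance matrix is built from random 2-D coordinates so the recovery can be checked:

```python
import numpy as np

rng = np.random.default_rng(8)
# Hypothetical Euclidean distance matrix D among 6 objects
pts = rng.normal(size=(6, 2))
D = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))

# PCoA: double-center -0.5 * D^2, then eigendecompose (Gower's method)
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
G = J @ (-0.5 * D ** 2) @ J
evals, evecs = np.linalg.eigh(G)
order = np.argsort(evals)[::-1]          # largest eigenvalues first
pos = evals[order] > 1e-9                # keep positive dimensions only
Y = evecs[:, order[pos]] * np.sqrt(evals[order[pos]])

# Y holds coordinates whose Euclidean distances reproduce D;
# steps 2-3 then proceed as ordinary RRPP on the rows of E_R
```

For a Euclidean distance matrix the coordinates in `Y` reproduce the original distances exactly, so the subsequent linear model and RRPP operate on ordinary (non-distance) data.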

Residual Randomization (RRPP): Example

Does pupfish body shape differ between populations (marsh vs. sinkhole) or between the sexes?

This is a factorial MANOVA: Y ~ Pop + Sex + Pop:Sex

##           Df       SS        MS     Rsq      F      Z Pr(>F)   
## Sex        1 0.015780 0.0157802 0.28012 28.209 4.7773  0.001 **
## Pop        1 0.009129 0.0091294 0.16206 16.320 4.7097  0.001 **
## Sex:Pop    1 0.003453 0.0034532 0.06130  6.173 3.7015  0.001 **
## Residuals 50 0.027970 0.0005594 0.49651                        
## Total     53 0.056333                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual Randomization (RRPP): Example

PCA of data and predicted values (see GLM lecture)

More Complex Designs

More generally, RRPP is also appropriate for GLS (generalized least squares) models

These are models where covariance between objects is not zero (e.g., phylogenetic non-independence, spatial non-independence, temporal non-independence, etc.).

OLS is a special case of GLS (see LM lecture)

CONCLUSIONS

Resampling methods are of primary importance in multivariate analysis

Methods are flexible, and may be used with univariate or high-dimensional data

RRPP is the most general approach (and the only approach that is appropriate for GLS models)