06: Multivariate Data Spaces and Linear Models

Dean Adams, Iowa State University

Last Time:

GLM is the fitting of models: fit a null model (e.g., \(\small{H}_{0}\): Y ~ 1) and then more complicated models, and assess the fit of different models via SS (i.e., LRT, \(\small{F}\)-ratios, etc.)

ANOVA and regression are derived from the same linear model, with the key difference being what type of variable is found in the explanatory variable matrix \(\small\mathbf{X}\) (i.e., categorical or continuous factors)

Parameters of the linear model may be found using matrix algebra:

\[\small\hat{\mathbf{\beta }}=\left ( \mathbf{X}^{T} \mathbf{X}\right )^{-1}\left ( \mathbf{X}^{T} \mathbf{Y}\right )\]

Question: What about multivariate \(\small\mathbf{Y}\)?

Univariate Versus Multivariate Data

Why Jump to Multivariate?

anova(lm(Y[,1]~gp))
## Analysis of Variance Table
## 
## Response: Y[, 1]
##           Df Sum Sq Mean Sq F value  Pr(>F)  
## gp         2 17.888  8.9442  3.0232 0.07696 .
## Residuals 16 47.337  2.9586                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm(Y[,2]~gp))
## Analysis of Variance Table
## 
## Response: Y[, 2]
##           Df  Sum Sq Mean Sq F value Pr(>F)
## gp         2  7.6535  3.8268  2.4914 0.1143
## Residuals 16 24.5760  1.5360
summary(manova(Y~gp))
##           Df  Pillai approx F num Df den Df   Pr(>F)   
## gp         2 0.80324   5.3694      4     32 0.002006 **
## Residuals 16                                           
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here, all of the signal is in the trait covariation, which is missed by univariate analyses of \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)!
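To illustrate how this can happen, here is a minimal simulation sketch (hypothetical data and group labels, not the \(\small\mathbf{Y}\) and gp used above), in which groups differ along a direction orthogonal to the major axis of trait covariation:

library(MASS)                                            # for mvrnorm
set.seed(1)
n <- 10                                                  # observations per group
gp <- factor(rep(1:2, each = n))
Sigma <- matrix(c(1, 0.9, 0.9, 1), 2, 2)                 # strongly correlated traits
E <- mvrnorm(2 * n, mu = c(0, 0), Sigma = Sigma)
shift <- ifelse(gp == 1, 0.3, -0.3)                      # small shift on each trait...
Y <- E + cbind(shift, -shift)                            # ...in opposite directions
anova(lm(Y[, 1] ~ gp))                                   # univariate signal: typically weak
anova(lm(Y[, 2] ~ gp))                                   # univariate signal: typically weak
summary(manova(Y ~ gp))                                  # multivariate signal: strong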

Challenges With Multivariate Data

Three main challenges with multivariate data

1: Describing variation in it

2: Visualizing trends in multivariate spaces

3: Ensuring that the data space corresponds to the statistical analyses we wish to use

To address these we must first back up and discuss data spaces in general

Vectors are Points in Space


Data Spaces

Data Tables and Data Spaces

\[\small\mathbf{Y}=\begin{bmatrix} 2 & 1 \\ 1 & 3 \\ \vdots & \vdots \\ y_{n1} & y_{n2} \end{bmatrix}\]

Note that for continuous data, the three representations (data as a matrix, data as points in a space, and data as vectors in a space) are all equivalent.

Matrix Operations: Geometric Interpretations

\[\small\mathbf{y}_{1}+\mathbf{y}_{2}=\begin{bmatrix} 2 & 1\end{bmatrix} +\begin{bmatrix} 1 & 3\end{bmatrix} \]

\[\small\mathbf{y}_{1}+\mathbf{y}_{2}=\begin{bmatrix} 3 & 4\end{bmatrix} \]

By shifting the second vector so it begins at the end of the first, the new end point is the geometric result of matrix addition!

Matrix Operations & Geometry: Example

Say we have data (\(\small\mathbf{Y}\)) in an \(\small{n}\times{2}\) matrix. We wish to obtain deviations from the mean. Using a null-model vector of ones (\(\small\mathbf{X}_{0}\)), we perform:

\[\small\mathbf{Y}_{c}=\mathbf{Y}-\mathbf{X}_0 \left( \mathbf{X}_0^T\mathbf{X}_0 \right)^{-1}\mathbf{X}_0^T\mathbf{Y}\]

Now plot it! Note that geometrically, the matrix algebra above results in mean centering the data!
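A small sketch of this operation in R, using toy data (the check at the end confirms the algebra simply centers each column):

Y  <- matrix(c(2, 1, 1, 3, 4, 2), ncol = 2, byrow = TRUE)        # toy n x 2 data
X0 <- matrix(1, nrow = nrow(Y), ncol = 1)                        # null model: column of ones
Yc <- Y - X0 %*% solve(t(X0) %*% X0) %*% t(X0) %*% Y             # deviations from the mean
all.equal(Yc, scale(Y, scale = FALSE), check.attributes = FALSE) # TRUE: mean-centered data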

The Geometry of Data Spaces: Intuition

Metric Spaces

Property Meaning
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}\geq0\) non-negativity
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}=0 \Leftrightarrow{\mathbf{y}}_{1}={\mathbf{y}}_{2}\) identity
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}={d}_{\mathbf{y}_{2}\mathbf{y}_{1}}\) symmetry
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{3}}\leq{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}+{d}_{\mathbf{y}_{2}\mathbf{y}_{3}}\) triangle inequality

Metric (Euclidean) Spaces

NOTE: most multivariate analyses assume that the data reside in a metric space. As we will see later, this is not always the case!

Metric Spaces: Distances and Directions

Vector length = distance = magnitude = ‘norm’: \(\small||\mathbf{y}||=\sqrt{\mathbf{y}^{T}\mathbf{y}}\)

Difference in direction between two vectors: \(\small\theta=\cos^{-1}\left(\frac{\mathbf{y}^{T}_{1}\mathbf{y}_{2}}{||\mathbf{y}_{1}||\:||\mathbf{y}_{2}||}\right)\)

IMPORTANT: vectors in metric spaces have a length (distance) and a direction regardless of dimensionality! \(\small||\mathbf{y}||\) & \(\small\theta\) are statistical summary measures of vectors that apply for MVN (multivariate normal) data of any dimensionality.

Geometric Operations: Vector Correlation

\[\small\theta=\cos^{-1}\left(\frac{\mathbf{y}^{T}_{1}\mathbf{y}_{2}}{||\mathbf{y}_{1}||\:||\mathbf{y}_{2}||}\right)\]

\[\small{r}_{\mathbf{y}_{1},\mathbf{y}_{2}}=cos\theta\]

Side note: Biologists tend to think of correlations as being between variables for sets of specimens, where we have n pairs of \(\small{y}_{1}\) and \(\small{y}_{2}\) values. Transposing the problem, we now have two vectors of length n, and the cosine of the angle between them is this same correlation. In fact, this reminds us that “correlations are cosines of angles” (a fascinating little paper titled “Thirteen ways to look at the correlation coefficient” (Rodgers and Nicewander, 1988, Am. Stat.) is quite illuminating!)
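A quick numerical check of this idea, with simulated vectors:

set.seed(2)
y1 <- rnorm(10); y2 <- rnorm(10)
y1c <- y1 - mean(y1); y2c <- y2 - mean(y2)                          # mean-center each vector
cos.theta <- sum(y1c * y2c) / (sqrt(sum(y1c^2)) * sqrt(sum(y2c^2))) # cosine of the angle
all.equal(cos.theta, cor(y1, y2))                                   # TRUE: r = cos(theta)
acos(cos.theta)                                                     # the angle itself (radians)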

Geometric Operations: Projection

Geometric Operations: Rotation

Consider the following example of orthogonal rotation:
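This is a minimal sketch (arbitrary angle, simulated data) of applying an orthogonal rotation matrix; note that interobject distances are unchanged, foreshadowing the invariances discussed next:

theta <- pi / 4                                  # rotate by 45 degrees
R <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), 2, 2)    # orthogonal rotation matrix
Y  <- matrix(rnorm(20), ncol = 2)                # toy data: 10 points in 2D
Yr <- Y %*% R                                    # rotated data
all.equal(dist(Y), dist(Yr))                     # TRUE: distances are rotation-invariant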

Invariances in Metric Spaces

Lack of Invariance is a Problem

Data Spaces from Non-Continuous Data

The above geometric discussion was predicated on the notion that the data were multivariate normal (MVN). In such cases, sets of continuous variables in similar units and scale can be treated as axes of a dataspace, which exhibits metric properties. What happens when we don’t have such data?

Imagine having columns of presence/absence data:

\[\small\mathbf{Y}=\begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0\\ \vdots & \vdots & \vdots \\ 1 & 1 & 1 \end{bmatrix}\]

What does this data space look like?

Binary Data

This is the data space for binary data, called a ‘Hamming’ space. Here, differences (distances) between objects traverse the axes of the space. For a pair of objects, matches and mismatches across variables may be summarized in a 2×2 table: \(\small{a}\) (shared presences), \(\small{b}\) and \(\small{c}\) (mismatches), and \(\small{d}\) (shared absences). The Hamming Distance is an appropriate measure for such data; it is simply the number of positions at which two objects differ: \(\small\mathbf{D}_{Hamming}=b+c\)

Despite being defined on binary data, the Hamming distance is metric. Thus, distances and directions can, in theory, be interpreted in Hamming space.
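A toy example in R (made-up presence/absence vectors):

y1 <- c(0, 1, 1, 0, 1)
y2 <- c(1, 1, 0, 0, 1)
sum(y1 != y2)                       # Hamming distance: number of mismatching positions (b + c)
Y <- rbind(y1, y2, c(0, 0, 1, 1, 1))
dist(Y, method = "manhattan")       # for strictly 0/1 data, Manhattan distance = Hamming distance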

NOTE: There are quite a few other distance and similarity measures for binary data (Jaccard distance, Dice distance, matching coefficient, etc.).

Abundance (Count) Data

Many times, ecologists have species abundance data, where rows of \(\small\mathbf{Y}\) are sites and columns are species.

A VERY common distance measure for this is Bray-Curtis distance: \(\small\mathbf{D}_{BC}=\frac{\sum{|y_{1i}-y_{2i}|}}{\sum{y_{1.}}+\sum{y_{2.}}}\). For each pair of sites, this is the sum of absolute differences in species abundances across all species, divided by the total abundance of both sites. Bray-Curtis distance is NOT metric!!
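A quick sketch with made-up counts (the vegan package’s vegdist(), with method = "bray", implements the same measure for whole site-by-species matrices):

site1 <- c(10, 0, 3, 7)                                   # hypothetical abundances, site 1
site2 <- c(4, 2, 0, 11)                                   # hypothetical abundances, site 2
sum(abs(site1 - site2)) / (sum(site1) + sum(site2))       # Bray-Curtis: 15 / 37 = 0.405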

The Chi-square distance is another common measure for such data: \(\small\mathbf{D}_{\chi^{2}}=\sqrt{\sum\frac{1}{y_{k}}\left(\frac{y_{ki}}{y_{i}}-\frac{y_{kj}}{y_{j}}\right)^{2}}\).

Quite a few other distance and similarity measures for abundance data exist.
NOTE: We will return to matrices of species abundance data briefly at the end of this lecture.

Continuous Data

Thus far we have used Euclidean distance as a measure for continuous data

\[\small\mathbf{d}_{Euclid}=\sqrt{\sum\left({y}_{1i}-{y}_{2i}\right)^{2}}=\sqrt{\left(\mathbf{y}_{1}-\mathbf{y}_{2}\right)^{T}\left(\mathbf{y}_{1}-\mathbf{y}_{2}\right)}\]

This is the distance for metric (Euclidean) spaces.
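A quick check of the two equivalent forms, using toy vectors:

y1 <- c(2, 1); y2 <- c(1, 3)
sqrt(sum((y1 - y2)^2))              # elementwise form: sqrt(1 + 4)
sqrt(t(y1 - y2) %*% (y1 - y2))      # matrix-algebra form; same value (as a 1 x 1 matrix)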

Many other distance measures exist for continuous variables (Manhattan distance, Mahalanobis distance, Canberra distance, etc.)

NOTE: not all sets of continuous variables create Euclidean data spaces. For example, combining continuous variables in different units or scales (e.g., temperature, relative humidity, elevation) results in distances between objects that are not Euclidean and not interpretable. The reason is that deviations in each variable are not commensurate: thus distances and covariances found by combining them have no meaning. (see Adams and Collyer. 2019. Ann. Rev. Ecol. Evol. Syst.)

Distance Metrics Vs. Measures

Not all distance measures are metrics. Metrics must satisfy:

Property Meaning
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}\geq0\) non-negativity
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}=0 \Leftrightarrow{\mathbf{y}}_{1}={\mathbf{y}}_{2}\) identity
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}={d}_{\mathbf{y}_{2}\mathbf{y}_{1}}\) symmetry
\(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{3}}\leq{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}+{d}_{\mathbf{y}_{2}\mathbf{y}_{3}}\) triangle inequality

Some distances are Semimetric (pseudometric): they do not satisfy the triangle inequality (e.g., Bray-Curtis distance & Sørensen’s similarity)

Other distances are Nonmetric: they can have negative distances (e.g., Kulczynski’s coefficient)

Obviously, interpreting patterns in data spaces that have non-metric distance measures is quite complicated, and such patterns frequently do not match our intuition of distances and directions.

Danger: Not all Numerical Data are Combinable!

As should be obvious from the preceding discussion, combining different data types is absolute inanity!

Imagine the following ecological measures:

1: Temperature (centigrade)

2: Relative humidity (percentage)

3: Elevation (meters)

4: Presence/Absence of predators

5: Species abundance of 3 different competitors

Generating distances between objects, or covariances between variables, from such data is GIGO!!!

Combining Incommensurate Data

Note: there are still issues to be addressed (e.g., how to weight the scores for each data type if \(p_1\neq{p_2}\), etc.)

Multivariate Linear Models

Armed with this information, let’s look at inferences using the (general) linear model:

\[\mathbf{Y}=\mathbf{X}\mathbf{\beta } +\mathbf{E}\]

For multivariate data, \(\small\mathbf{Y}\) is a \(\small n \times p\) matrix, not a vector. How does this affect our computations?

The answer depends upon which step of the procedure: parameter estimation or model evaluation.

Multivariate LM: Parameters and Coefficients

Algebraic computations for estimating model parameters and coefficients are identical to those used for univariate \(\small{Y}\) data:

\(\tiny\mathbf{X}_R = \begin{bmatrix} 1\\ 1\\ 1\\ 1\\ 1\\ 1 \end{bmatrix}\) & \(\tiny\mathbf{X}_F = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}\)

Let’s compare the fit of the data to both \(\small\mathbf{X}_{R}\) and \(\small\mathbf{X}_{F}\):

Estimate \(\small\mathbf{X}_{R}\) \(\small\mathbf{X}_{F}\)
Coefficients \(\tiny\hat{\mathbf{\beta_R}}=\left ( \mathbf{X}_R^{T} \mathbf{X}_R\right )^{-1}\left ( \mathbf{X}_R^{T} \mathbf{Y}\right )\) \(\tiny\hat{\mathbf{\beta_F}}=\left ( \mathbf{X}_F^{T} \mathbf{X}_F\right )^{-1}\left ( \mathbf{X}_F^{T} \mathbf{Y}\right )\)
Predicted Values \(\small\hat{\mathbf{Y}}_R=\mathbf{X}_R\hat{\mathbf{\beta}}_R\) \(\small\hat{\mathbf{Y}}_F=\mathbf{X}_F\hat{\mathbf{\beta}}_F\)
Model Residuals \(\small\hat{\mathbf{E}}_R=\mathbf{Y}-\hat{\mathbf{Y}}_R\) \(\small\hat{\mathbf{E}}_F=\mathbf{Y}-\hat{\mathbf{Y}}_F\)
Model Residual Error (\(\small{SSE}\)) \(\small\mathbf{S}_R=\hat{\mathbf{E}}_R^T\hat{\mathbf{E}}_R\) \(\small\mathbf{S}_F=\hat{\mathbf{E}}_F^T\hat{\mathbf{E}}_F\)
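A small sketch of these computations with simulated data (the design matrices match \(\small\mathbf{X}_{R}\) and \(\small\mathbf{X}_{F}\) above):

set.seed(3)
Y  <- matrix(rnorm(12), nrow = 6)              # hypothetical 6 x 2 response matrix
XR <- matrix(1, 6, 1)                          # reduced (null) model design
XF <- cbind(1, rep(c(0, 1), each = 3))         # full model design: two groups
bF <- solve(t(XF) %*% XF) %*% t(XF) %*% Y      # coefficients (one column per trait)
EF <- Y - XF %*% bF                            # model residuals
SF <- t(EF) %*% EF                             # residual SSCP matrix (p x p)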

Ok, but what about statistical evaluation? What test statistics can we use?

Multivariate LM: Test Statistics

With univariate linear models, our test statistic was the ratio of explained to unexplained variance: \(\small{F}=\frac{\sigma^{2}_{M}}{\sigma^{2}_{\epsilon}}\)

For multivariate \(\small\mathbf{Y}\) data, these would be matrices, implying:

\[\tiny{"F"}=\frac{\begin{bmatrix} \sigma^{2}_{M_{11}} & \sigma_{M_{12}} \\ \sigma_{M_{21}} & \sigma^{2}_{M_{22}} \end{bmatrix}} {\begin{bmatrix} \sigma^{2}_{\epsilon_{11}} & \sigma_{\epsilon_{12}} \\ \sigma_{\epsilon_{21}} & \sigma^{2}_{\epsilon_{22}} \end{bmatrix}}=\frac{\left(\Delta k\right)^{-1}\small\left(\mathbf{S}_R-\small\mathbf{S}_{F}\right)}{\left(n-k_f-1\right)^{-1}\small\mathbf{S}_{F}}\]

But since one cannot ‘divide’ matrices, other summary test measures have been derived:

1: Wilks’ lambda: \(\small\Lambda_{Wilks}=\mathbf{\frac{\begin{vmatrix}S_{F}\end{vmatrix}}{\begin{vmatrix}S_{R}\end{vmatrix}}}\)

2: Pillai’s trace: \(\small\Lambda_{Pillai}=tr(\mathbf{(S_{R}-S_{F})S_{R}^{-1}})\)

These are then converted to approximate \(\small{F}\) values via additional formulae that carry further parametric assumptions. Model significance is then evaluated using \(\small{F}_{approx}\)
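Continuing the sketch above, both summary measures follow directly from the residual SSCP matrices of the nested models:

ER <- Y - XR %*% solve(t(XR) %*% XR) %*% t(XR) %*% Y   # reduced-model residuals
SR <- t(ER) %*% ER                                     # reduced-model SSCP
Wilks  <- det(SF) / det(SR)                            # Wilks' lambda
Pillai <- sum(diag((SR - SF) %*% solve(SR)))           # Pillai's trace: tr(H (H + E)^-1)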

Limitations to Parametric Theory

Parametric multivariate methods ‘work’, but have limitations. Consider multivariate regression (using \(\small{F}_{approx}\)). Here are simulation results (type I error and power) as \(\small p\) increases:

Statistical power decreases as the number of dimensions \(\small p\) increases, and power eventually hits zero (as \(\small p\) approaches \(\small n\)).

The reason is that as \(\small p\) approaches \(\small n\), calculating the inverse of \(\small\mathbf{S}_F\) or \(\small\mathbf{S}_R\) becomes numerically unstable, because the determinant decreases towards zero (i.e., the matrix becomes closer and closer to singular).

We require an alternative implementation for model evaluation with highly multivariate data!

High-Dimensional Data: Challenges

When \(p>n\), standard parametric approaches are not possible. There are several ways of coping with the complications of high-dimensional data.

Robust Summary Statistics

Because of the difficulty with high-dimensional shape data, we require summary test measures that do not suffer the ills found with standard, parametric measures. Several come to mind:

From Distances To Summary Statistics

Goodall (1991) and Anderson (2001) independently recognized the link between distances and summary \(\small{F}\)-statistics. Summary \(\small{F}\)-ratios are based on sums-of-squares (SS), and Euclidean distances are square roots of squared differences between objects. Thus, SS (and hence \(\small{F}\)-ratios) may be obtained directly from interobject distances.

For univariate data, this approach yields IDENTICAL \(F\)-values to standard equations, but provides a generalization of \(\small{F}\) (based on matrix traces) for high-dimensional data.
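A quick numerical illustration of this distance/SS connection (Gower's identity), on simulated data:

set.seed(4)
Y  <- matrix(rnorm(30), nrow = 10)            # 10 objects, 3 traits
Yc <- scale(Y, scale = FALSE)                 # mean-centered data
sum(Yc^2)                                     # total SS: trace of the SSCP matrix
sum(dist(Y)^2) / nrow(Y)                      # the same value, from pairwise distances alone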

Adams and Collyer extended this paradigm: applying RRPP for model evaluation (which has higher statistical power for factorial models), and deriving permutation-based effect sizes for comparing signal strength across model effects.

Goodall (1991); Anderson (2001); Collyer et al (2015); Adams & Collyer (2016; 2018, etc.)

RRPP and Empirical Null Sampling Distributions

  1. Perform RRPP many times.
  2. Calculate \(F\)-value in every random permutation (observed case counts as one permutation)
  3. For \(N\) permutations, \(P = \frac{N(F_{random} \geq F_{obs})}{N}\)
  4. Calculate effect size as a standard deviate of the observed value in a normalized distribution of random values (helps for comparing effects within and between models); i.e., \[z = \frac{ \log\left( F\right) - \mu_{\log\left(F\right)} } { \sigma_{\log\left(F\right)} }\] where \(\mu_{\log\left(F\right)}\) and \(\sigma_{\log\left(F\right)}\) are the expected value and standard deviation from the sampling distribution, respectively (Collyer et al. 2015; Adams and Collyer 2016).
Collyer et al. Heredity. (2015); Adams & Collyer. Evolution. (2016); Adams & Collyer. Evolution. (2018)
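A deliberately simplified sketch of this procedure for a single model effect, written as a hypothetical helper function (lm.rrpp in the RRPP package is the proper, full implementation):

rrpp.F <- function(Y, XR, XF, nperm = 999) {
  fitR <- lm.fit(XR, Y)                             # reduced-model fit
  Fcalc <- function(Yp) {                           # generalized F from SS (matrix traces)
    SSR <- sum(lm.fit(XR, Yp)$residuals^2)
    SSF <- sum(lm.fit(XF, Yp)$residuals^2)
    ((SSR - SSF) / (ncol(XF) - ncol(XR))) / (SSF / (nrow(Yp) - ncol(XF)))
  }
  F.obs  <- Fcalc(Y)
  F.rand <- c(F.obs, replicate(nperm, Fcalc(        # observed counts as one permutation
    fitR$fitted.values + fitR$residuals[sample(nrow(Y)), , drop = FALSE])))
  mean(F.rand >= F.obs)                             # empirical P-value
}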

RRPP and Power

Interestingly, RRPP ‘breaks’ Rao’s paradox: power increases as the number of trait dimensions (\(\small p\)) increases.

This may seem paradoxical but makes intuitive sense. Because test measures are based on distances among objects, adding trait dimensions can only provide additional information on any pattern between \(\small\mathbf{Y}\) and \(\small\mathbf{X}\), if such a pattern is present. And since statistical evaluation is based on this, power increases (NOTE: adding noise dimensions neither increases nor decreases power).

Multivariate LM: Example

Note: this is a high-dimensional dataspace (\(\small n=54\) & \(\small p=112\)). Standard parametric methods will fail, as \(\small p \gg n\). Thus, we will use RRPP approaches to investigate patterns of phenotypic dispersion in this dataset.

Example 1: ANOVA

MANOVA via RRPP reveals significant phenotypic variation attributable to all model effects. Pairwise comparisons identify significant sexual dimorphism in the Marsh population, but not the Sinkhole population (hence the significant interaction).

library(RRPP)   # provides lm.rrpp, pairwise, and the Pupfish dataset
data(Pupfish)
Pupfish$Group <- interaction(Pupfish$Sex, Pupfish$Pop)
fit.m <- lm.rrpp(coords ~ Pop * Sex, data = Pupfish, print.progress = FALSE) 
anova(fit.m)$table
##           Df       SS        MS     Rsq      F      Z Pr(>F)   
## Pop        1 0.008993 0.0089927 0.15964 16.076 3.9997  0.001 **
## Sex        1 0.015917 0.0159169 0.28255 28.453 4.1103  0.001 **
## Pop:Sex    1 0.003453 0.0034532 0.06130  6.173 3.7015  0.001 **
## Residuals 50 0.027970 0.0005594 0.49651                        
## Total     53 0.056333                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(pairwise(fit.m,groups = Pupfish$Group))
## 
## Pairwise comparisons
## 
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole 
## 
## RRPP: 1000 permutations
## 
## LS means:
## Vectors hidden (use show.vectors = TRUE to view)
## 
## Pairwise distances between means, plus statistics
##                                d  UCL (95%)          Z Pr > d
## F.Marsh:M.Marsh       0.04611590 0.04297378  2.3159417  0.007
## F.Marsh:F.Sinkhole    0.03302552 0.03312516  1.6124629  0.055
## F.Marsh:M.Sinkhole    0.03881514 0.04633821 -0.5421145  0.699
## M.Marsh:F.Sinkhole    0.04605211 0.05506197 -0.2523753  0.597
## M.Marsh:M.Sinkhole    0.02802087 0.03343901  0.1854026  0.420
## F.Sinkhole:M.Sinkhole 0.02568508 0.04364031 -2.2111968  0.993

Example 1: ANOVA (Cont.)

Visualization via a principal components plot of fitted values from the model (we’ll discuss PCA in a future lecture). Notice that the statistical results are easily visualized here, despite being highly multivariate data.

Example 2: ANCOVA

Incorporating size as a covariate we find that shape covaries with size (i.e., allometry). Pairwise comparisons reveal that male and female Sinkhole fish differ in their allometry patterns.

Pupfish$logSize <- log(Pupfish$CS)
fit.slopes <- lm.rrpp(coords ~ logSize * Pop * Sex, data = Pupfish, print.progress = FALSE)
anova(fit.slopes)$table
##                 Df       SS        MS     Rsq       F      Z Pr(>F)   
## logSize          1 0.014019 0.0140193 0.24886 29.2078 5.0041  0.001 **
## Pop              1 0.010247 0.0102472 0.18190 21.3491 4.7949  0.001 **
## Sex              1 0.005028 0.0050284 0.08926 10.4763 4.5388  0.001 **
## logSize:Pop      1 0.001653 0.0016528 0.02934  3.4434 2.6259  0.004 **
## logSize:Sex      1 0.001450 0.0014498 0.02574  3.0206 2.2721  0.014 * 
## Pop:Sex          1 0.001370 0.0013696 0.02431  2.8534 2.2866  0.010 * 
## logSize:Pop:Sex  1 0.000486 0.0004865 0.00864  1.0135 0.3337  0.382   
## Residuals       46 0.022079 0.0004800 0.39194                         
## Total           53 0.056333                                           
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
PWS <- pairwise(fit.slopes, groups = Pupfish$Group, covariate = Pupfish$logSize)
summary(PWS, test.type = "VC", angle.type = "deg")
## 
## Pairwise comparisons
## 
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole 
## 
## RRPP: 1000 permutations
## 
## Slopes (vectors of variate change per one unit of covariate 
##         change, by group):
## Vectors hidden (use show.vectors = TRUE to view)
## 
## Pairwise statistics based on slopes vector correlations (r) 
##           and angles, acos(r)
## The null hypothesis is that r = 1 (parallel vectors).
## This null hypothesis is better treated as the angle 
##           between vectors = 0
##                               r    angle UCL (95%)          Z Pr > angle
## F.Marsh:M.Marsh       0.6139591 52.12367  95.40553 -1.0051685      0.844
## F.Marsh:F.Sinkhole    0.6318267 50.81498  88.71850 -0.8268874      0.782
## F.Marsh:M.Sinkhole    0.4461018 63.50614 116.45215 -1.3459215      0.904
## M.Marsh:F.Sinkhole    0.7175129 44.15048  77.31288 -1.4269023      0.926
## M.Marsh:M.Sinkhole    0.5629975 55.73665  74.08894  0.5136053      0.316
## F.Sinkhole:M.Sinkhole 0.4627734 62.43378  81.64708  0.4261192      0.342

Example 2: ANCOVA (Cont.)

Visualizing multivariate regression is tricky. Several measures summarizing patterns in \(\small\mathbf{Y}\) may be used, such as regression scores and predicted lines.

see: Drake & Klingenberg. 2008; Adams & Nistri. 2010 respectively.

Permutational MANOVA & Distance Matrices

Sometimes, one must represent the response \(\small\mathbf{Y}\) data as a matrix of object distances. If the distance measure is metric or approximately Euclidean, one may perform ANOVA via RRPP with this as input.

Here, Principal Coordinates Analysis (PCoA) is used to obtain multivariate continuous coordinates from the distance matrix, which is then subject to permutational MANOVA via RRPP.
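A minimal sketch of that workflow (assuming D is a metric distance matrix; cmdscale is base R's PCoA):

D <- dist(matrix(rnorm(90), nrow = 10))       # hypothetical Euclidean distance matrix
pcoa <- cmdscale(D, k = 9, eig = TRUE)        # Principal Coordinates Analysis
Ynew <- pcoa$points                           # continuous coordinates; these become the
                                              # response data for MANOVA via RRPP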

Distance Matrix ANOVA: Example

Here we represent the pupfish data by Euclidean distances. Results are identical to before.

Pupfish$Dist <- dist(Pupfish$coords)
anova(lm.rrpp(coords ~ Pop * Sex, data = Pupfish, print.progress = FALSE))$table 
##           Df       SS        MS     Rsq      F      Z Pr(>F)   
## Pop        1 0.008993 0.0089927 0.15964 16.076 3.9997  0.001 **
## Sex        1 0.015917 0.0159169 0.28255 28.453 4.1103  0.001 **
## Pop:Sex    1 0.003453 0.0034532 0.06130  6.173 3.7015  0.001 **
## Residuals 50 0.027970 0.0005594 0.49651                        
## Total     53 0.056333                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm.rrpp(Dist ~ Pop * Sex, data = Pupfish, print.progress = FALSE))$table 
##           Df       SS        MS     Rsq      F      Z Pr(>F)   
## Pop        1 0.008993 0.0089927 0.15964 16.076 3.9997  0.001 **
## Sex        1 0.015917 0.0159169 0.28255 28.453 4.1103  0.001 **
## Pop:Sex    1 0.003453 0.0034532 0.06130  6.173 3.7015  0.001 **
## Residuals 50 0.027970 0.0005594 0.49651                        
## Total     53 0.056333                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

A Comment on Abundance (Count) Data

Matrices of abundance data (species counts) are common in ecology, but are clearly not MVN. There is a large literature on the subject, and several implementations for analysis are possible.

One school uses matrices of \(log(abund+1)\) or Bray-Curtis distances analyzed using linear models (LM) via permutational MANOVA (e.g., Anderson).

Another approach is to use generalized linear models (GLM, with a Poisson or other link function) on each species separately, followed by evaluating the summed \(\small\sum{F}\)-ratio across species (e.g., Warton).

There are strengths and weaknesses to both approaches. GLM can accommodate specific data types, but LM is more robust to departures from assumptions. Importantly, it is suggested that residual resampling (i.e., RRPP!) cures a lot of LM/GLM ills. Future theoretical work on RRPP will investigate this more thoroughly. A sketch of both approaches follows below.
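A hedged sketch of both approaches on made-up counts (mvabund is Warton and colleagues' package; manyglm fits one GLM per species, and anova() sums the per-species statistics with resampling-based P-values):

set.seed(5)
abund <- matrix(rpois(60, lambda = 4), nrow = 12)        # 12 sites x 5 species (toy counts)
site.gp <- factor(rep(c("A", "B"), each = 6))
# Approach 1: LM on log(abund + 1), evaluated by permutational MANOVA via RRPP
df <- rrpp.data.frame(Y = log(abund + 1), gp = site.gp)
anova(lm.rrpp(Y ~ gp, data = df, print.progress = FALSE))
# Approach 2: per-species GLMs with a summed test statistic (mvabund package)
library(mvabund)
fit <- manyglm(mvabund(abund) ~ site.gp, family = "poisson")
anova(fit)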

A simple example is in the lab tutorial, and those interested should investigate the primary literature, especially:

Warton et al. 2016. Methods Ecol. Evol.

Linear Models: Conclusions