Dean Adams, Iowa State University
GLM is the fitting of models:
1: Fit a null model (e.g., \(\small{H}_{0}\) = Y ~ 1) and then more complicated models
2: Assess the fit of the different models via SS (i.e., LRT, \(\small{F}\)-ratios, etc.)
ANOVA and regression are derived from the same linear model, with the key difference being the type of variable found in the explanatory variable matrix \(\small\mathbf{X}\) (i.e., categorical or continuous factors)
Parameters of the linear model may be found using matrix algebra:
\[\small\hat{\mathbf{\beta }}=\left ( \mathbf{X}^{T} \mathbf{X}\right )^{-1}\left ( \mathbf{X}^{T} \mathbf{Y}\right )\]
Question: What about multivariate \(\small\mathbf{Y}\)?
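A minimal sketch (assumed code, not the lecture's own) of how output like the tables below could be generated, for a hypothetical two-column response matrix Y and a three-level grouping factor gp with \(\small n=19\):

set.seed(1)
gp <- factor(rep(c("a", "b", "c"), length.out = 19))   # hypothetical grouping factor
Y  <- matrix(rnorm(19 * 2), ncol = 2)                  # hypothetical bivariate response

anova(lm(Y[, 1] ~ gp))      # univariate ANOVA on trait 1
anova(lm(Y[, 2] ~ gp))      # univariate ANOVA on trait 2
summary(manova(Y ~ gp))     # multivariate test (Pillai's trace by default)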
## Analysis of Variance Table
##
## Response: Y[, 1]
## Df Sum Sq Mean Sq F value Pr(>F)
## gp 2 17.888 8.9442 3.0232 0.07696 .
## Residuals 16 47.337 2.9586
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Variance Table
##
## Response: Y[, 2]
## Df Sum Sq Mean Sq F value Pr(>F)
## gp 2 7.6535 3.8268 2.4914 0.1143
## Residuals 16 24.5760 1.5360
## Df Pillai approx F num Df den Df Pr(>F)
## gp 2 0.80324 5.3694 4 32 0.002006 **
## Residuals 16
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Three main challenges with multivariate data
1: Describing variation in it
2: Visualizing trends in multivariate spaces
3: Ensuring that the dataspace corresponds to the statistical analyses we wish to use
To address these we must first back up and discuss data spaces in general
What we’ve just done is take data and draw a data space. Data spaces result from us taking sets of measured variables and plotting the values we observe in our data.
Each measured variable represents a dimension of our dataspace. If we measure a single variable, our data ‘live’ in the one-dimensional space of the number line. Statistical analyses of univariate data are concerned with patterns of dispersion along the number line.
If we measure multiple variables, our dataspace is of a higher dimension. Typically we refer to p-dimensions for the p measured variables. For example, if we measure length, width, and height on 30 plant specimens then each plant is represented by a point in a three-dimensional dataspace.
Note that while our data take on finite values, the axes of the dataspace span the entire set of possible values for all variables (dimensions) that could hypothetically exist. Depending on our variables, some regions of the dataspace may be inaccessible empirically (e.g., that combination of values is an impossible object), yet it would exist as a region of the dataspace.
As we will soon see, different data types generate data spaces with different properties. Despite our natural inclination to plot things as such, not all dataspaces are Euclidean!!!
The preceding visual example emphasizes several critical points:
1: Data are dots in space
2: There are relationships between dots in a dataspace that are worth exploring.
Let’s examine these a bit more closely, recalling what we just learned about matrix algebra operations.
Say we have measured two variables, \(\small{y}_{1}\) and \(\small{y}_{2}\). Let’s combine them into a matrix containing two columns:
\[\small\mathbf{Y}=\begin{bmatrix} 2 & 1 \\ 1 & 3 \\ \vdots & \vdots \\ y_{n1} & y_{n2} \end{bmatrix}\]
Note that for continuous data, all three representations (data as a matrix, data as points in a space, and data as vectors in a space) are equivalent.
Matrix algebra has a direct connection to geometry.
Imagine the following (SIMPLE!) operation:
\[\small\mathbf{y}_{1}+\mathbf{y}_{2}=\begin{bmatrix} 2 & 1\end{bmatrix} +\begin{bmatrix} 1 & 3\end{bmatrix} \]
\[\small\mathbf{y}_{1}+\mathbf{y}_{2}=\begin{bmatrix} 3 & 4\end{bmatrix} \]
By shifting the red vector to the end of the black vector, the new end point is the geometric result of matrix addition!
The previous example is admittedly simple, but it serves to demonstrate an important point: that algebra and geometry are ‘duals’ of one another.
In fact, because of the relationship between algebra and geometry, the properties of vectors in geometric spaces and their corresponding matrix operations holds regardless of data dimensionality.
For single values, algebraic operations correspond to movements along the number line. For 2D vectors, it is movement in the 2-dimensional data space. For higher-dimensional data, the algebra holds, as do the geometric interpretations of those operations.
In other words, EVERY algebraic operation has a geometric interpretation. One is free (and in fact encouraged) to think of the geometric consequences of matrix algebra. One’s head may itch from this exercise, but it will provide wisdom!
Perhaps one more example will illustrate the point…
Say we have data (\(\small\mathbf{Y}\)) which is a \(\small{n}\times{2}\) matrix. We wish to obtain deviations from the mean. Using a null model vector of ones (\(\small\mathbf{X}_{0}\)) we perform:
\[\small\mathbf{Y}_{c}=\mathbf{Y}-\mathbf{X}_0 \left( \mathbf{X}_0^T\mathbf{X}_0 \right)^{-1}\mathbf{X}_0^T\mathbf{Y}\]
Now plot it! Note that geometrically, the matrix algebra above results in mean centering the data!
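A small sketch of this operation with hypothetical data (the objects below are illustrative, not the lecture's):

set.seed(2)
Y  <- matrix(rnorm(20, mean = 5), ncol = 2)            # hypothetical n x 2 data
X0 <- matrix(1, nrow = nrow(Y), ncol = 1)              # null-model vector of ones

Yc <- Y - X0 %*% solve(t(X0) %*% X0) %*% t(X0) %*% Y   # deviations from the mean

colMeans(Yc)                                           # effectively zero: the data are now centered
plot(Yc, asp = 1, xlab = "y1 (centered)", ylab = "y2 (centered)")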
Points that are close together (black and red) are more similar in their values than points that are far apart (black and green)
Some vectors between dots point in similar directions (red & blue) while others point in different directions (red and green)
Our intuition suggests that data spaces convey meaningful information about distances and directions. These are properties of METRIC dataspaces!
| Property | Meaning |
|---|---|
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}\geq0\) | non-negativity |
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}=0 \Leftrightarrow{\mathbf{y}}_{1}={\mathbf{y}}_{2}\) | identity |
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}={d}_{\mathbf{y}_{2}\mathbf{y}_{1}}\) | symmetry |
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{3}}\leq{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}+{d}_{\mathbf{y}_{2}\mathbf{y}_{3}}\) | triangle inequality |
Euclidean spaces are metric spaces. Distances and directions have meaning in such spaces, and can be interpreted visually and compared algebraically. In fact, many multivariate analyses essentially evaluate distances and directions among points in metric spaces to characterize patterns
Also implied by metric spaces is the notion of axis perpendicularity (axis orthogonality). Without this property, axes are oblique to one another, and algebraic operations can result in unintuitive outcomes
Vector length = distance = magnitude = ‘norm’: \(\small||\mathbf{y}||=\sqrt{\mathbf{y}^{T}\mathbf{y}}\)
Difference in direction between two vectors: \(\small\theta=\cos^{-1}\left(\frac{\mathbf{y}^{T}_{1}\mathbf{y}_{2}}{||\mathbf{y}_{1}||\;||\mathbf{y}_{2}||}\right)\)
When operating in metric spaces, several operations are quite useful.
1: Vector Correlation: For any pair of vectors one can calculate the angle between them, which describes the difference in their directions. This angle can then be converted to the correlation between the two vectors.
\[\small\theta=\cos^{-1}\left(\frac{\mathbf{y}^{T}_{1}\mathbf{y}_{2}}{||\mathbf{y}_{1}||\;||\mathbf{y}_{2}||}\right)\]
\[\small{r}_{\mathbf{y}_{1},\mathbf{y}_{2}}=\cos\theta\]
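A quick sketch of this calculation for two hypothetical vectors:

y1 <- c(2, 1)
y2 <- c(1, 3)

r     <- sum(y1 * y2) / (sqrt(sum(y1^2)) * sqrt(sum(y2^2)))   # vector correlation = cos(theta)
theta <- acos(r)                                              # angle in radians
theta * 180 / pi                                              # angle in degrees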
Another very useful mathematical operation is projection.
Algebraically, projection is the multiplication of a data vector (or matrix) by some other vector (or matrix). This results in a set of values representing the original objects in the new (projected) space defined by the projection vector/matrix. Typically, the projection vector or matrix describes directions in a Euclidean space.
Ok, what precisely does that mean? Let’s take the case of a single data point in a two-dimensional space (call it a). We wish to project it onto another vector (b) in our dataspace, where b is typically scaled to unit length. Algebraically, this is accomplished by: \(\small\mathbf{a}_{proj}=\mathbf{a}*\mathbf{b}\)
What we see from the example is that the resulting value (for a vector) corresponds to the length of a in the projected space of b.
Note 1: Projection of matrices operates identically: \(\small\mathbf{A}_{proj}=\mathbf{A}*\mathbf{B}\)
Note 2: Projection is an exceedingly useful operation in high-dimensional data analysis. For linear models, both fitted values and residuals are found from projection, and visualizations of data in metric spaces are frequently accomplished by projection.
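A small sketch of projection with a hypothetical point and direction:

a <- c(2, 1)              # data point in 2D
b <- c(1, 1) / sqrt(2)    # direction to project onto, scaled to unit length

a_proj <- sum(a * b)      # score (length) of a along b
a_proj * b                # coordinates of the projected point in the original space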
Rotations ‘spin’ the dataspace in some fashion; typically, via a rigid (orthogonal) rotation. Mathematically, rotation is accomplished as a projection: \(\small\mathbf{YH}\), where \(\small\mathbf{Y}\) is the \(n \times p\) data matrix and \(\small\mathbf{H}\) is a \(p \times p\) rotation matrix
If \(\small\mathbf{H}\) is an orthogonal matrix, this is a rigid rotation, and corresponds to an embodiment of our ‘intuition’ of rotating axes of a data space such that they remain at right angles to one another.
With rigid rotations, distances and directions among objects are preserved. Thus, orthogonal rotations do nothing to the data in terms of statistical inferences, so long as all dimensions of the dataspace are used in summary estimates.
Here are some visual examples of orthogonal rotation:
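A numerical sketch of the same idea with hypothetical data: a rigid rotation changes the orientation of the point configuration, but not the distances among points.

set.seed(3)
Y <- matrix(rnorm(20), ncol = 2)                    # hypothetical n x 2 data
theta <- pi / 6
H <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), nrow = 2)   # 2 x 2 orthogonal rotation matrix
Yrot <- Y %*% H                                     # rotate the data space

plot(rbind(Y, Yrot), type = "n", asp = 1)           # plotting region covering both configurations
points(Y, pch = 19); points(Yrot, pch = 1)          # original (filled) vs. rotated (open) points
max(abs(dist(Y) - dist(Yrot)))                      # ~0: inter-point distances are preserved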
If data reside in a metric space, they are invariant (‘immune’) to certain algebraic operations. This means that the relationships among objects, and the relative distances between them, are the same before and after the algebraic manipulation. This is important, because multivariate analyses are essentially just a series of algebraic manipulations of the data.
Notably, metric spaces are invariant to linear transformations. These include: translations, scaling, and rotations.
Note that in each plot the RELATIVE distances between objects remains unchanged. This is important, because it means that the patterns in our data have been preserved, and thus any multivariate analysis that is also invariant to linear transformations will be unaffected by these operations.
Maintaining metric properties is an incredibly important part of proper analysis of high-dimensional data.
However, some multivariate methods do not retain the metric properties of the data. This means that the relative distances and directions between objects in the dataspace change or are otherwise distorted during the mathematical operations of that method. This is rather undesirable (particularly if one intends to analyze the scores)! With such methods one cannot reliably interpret patterns in the dataspace because the method has altered them.
Other methods violate the invariance properties mentioned above. For instance, methods that summarize patterns axis-by-axis rather than treating the entire high-dimensional dataset at once are typically not rotation-invariant. This can also be bad because it means that one can obtain different statistical estimates for the same data in different orientations!
The above geometric discussion was predicated on the notion that the data were multivariate normal (MVN). In such cases, sets of continuous variables in similar units and scale can be treated as axes of a dataspace, which exhibits metric properties. What happens when we don’t have such data?
Imagine having columns of presence/absence data:
\[\small\mathbf{Y}=\begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0\\ \vdots & \vdots & \vdots\\ 1 & 1 & 1 \end{bmatrix}\]
What does this data space look like?
This is the data space for binary data, and is called a ‘Hamming’ space. Here, differences (distances) between objects traverse the axes of the space. For a pair of objects, these differences can be summarized in a 2×2 table of matches and mismatches. The Hamming Distance is an appropriate measure for such data, which is simply the number of positions at which two objects differ: \(\small{D}_{Hamming}=b+c\), where \(\small b\) and \(\small c\) are the counts of positions scored 1 in one object and 0 in the other.
Despite being based on binary data, the Hamming distance is metric. Thus, distances and directions can, in principle, be interpreted in Hamming space.
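A small sketch of Hamming distances for a hypothetical 0/1 matrix:

Y <- rbind(c(0, 0, 0),
           c(0, 0, 1),
           c(0, 1, 1),
           c(1, 1, 1))

dist(Y, method = "manhattan")   # for 0/1 data, Manhattan distance = number of mismatches (b + c)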
Many times, ecologists have species abundance data, where rows of \(\small\mathbf{Y}\) are sites and columns are species.
A VERY common distance measure for this is Bray-Curtis distance: \(\small{D}_{BC}=\frac{\sum_{i}{|y_{1i}-y_{2i}|}}{\sum_{i}{y_{1i}}+\sum_{i}{y_{2i}}}\). For each pair of sites, this is the summed difference in species abundances across all species, divided by the total abundance at the two sites. Bray-Curtis distance is NOT metric!!
The Chi-square distance is another common measure for such data: \(\small{D}_{\chi^{2}}=\sqrt{\sum_{k}\frac{1}{y_{k}}\left(\frac{y_{ki}}{y_{i}}-\frac{y_{kj}}{y_{j}}\right)^{2}}\).
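A small sketch of these measures for a hypothetical site-by-species matrix, using the vegan package (the package choice is an assumption; the lecture does not specify an implementation):

library(vegan)
abund <- rbind(site1 = c(10, 0, 3, 2),
               site2 = c( 4, 1, 0, 8),
               site3 = c( 0, 6, 5, 1))

vegdist(abund, method = "bray")                 # Bray-Curtis distances (semimetric)
dist(decostand(abund, method = "chi.square"))   # Euclidean distance on chi-square-standardized data, approximating the Chi-square distance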
Thus far we have used Euclidean distance as a measure for continuous data
\[\small\mathbf{d}_{Euclid}=\sqrt{\sum\left({y}_{1i}-{y}_{2i}\right)^{2}}=\sqrt{\left(\mathbf{y}_{1}-\mathbf{y}_{2}\right)^{T}\left(\mathbf{y}_{1}-\mathbf{y}_{2}\right)}\]
This is the distance for metric (Euclidean) spaces:
Many other distance measures exist for continuous variables (Manhattan distance, Mahalanobis distance, Canberra distance, etc.)
Not all distance measures are metrics. Metrics must satisfy:
| Property | Meaning |
|---|---|
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}\geq0\) | non-negativity |
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}=0 \Leftrightarrow{\mathbf{y}}_{1}={\mathbf{y}}_{2}\) | identity |
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}={d}_{\mathbf{y}_{2}\mathbf{y}_{1}}\) | symmetry |
| \(\small{d}_{\mathbf{y}_{1}\mathbf{y}_{3}}\leq{d}_{\mathbf{y}_{1}\mathbf{y}_{2}}+{d}_{\mathbf{y}_{2}\mathbf{y}_{3}}\) | triangle inequality |
Some distances are Semimetric (pseudometric): they do not satisfy the triangle inequality (e.g., Bray-Curtis distance & Sørensen’s similarity)
Other distances are Nonmetric: they can have negative distances (e.g., Kulczynski’s coefficient)
Obviously, interpreting patterns in dataspaces with non-metric distance measures is quite complicated, and such patterns frequently do not match our intuition about distances and directions.
As should be obvious from the preceding discussion, combining different data types is absolute inanity!
Imagine the following ecological measures:
1: Temperature (centigrade)
2: Relative humidity (percentage)
3: Elevation (meters)
4: Presence/Absence of predators
5: Species abundance of 3 different competitors
Generating distances between objects, or covariances between variables, from such data is GIGO!!!
One approach that can be used (but is rarely implemented) is the following (sketched in code after this list):
Separate data into common types (continuous data, binary data, etc.)
Perform ordination on each separately, based on the appropriate distance measure/metric
Combine the ordination scores (which are all continuous and hopefully represent Euclidean spaces) into an overall data matrix
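A hypothetical sketch of this strategy (object names, block sizes, and the vegan dependency are illustrative assumptions, not lecture code):

library(vegan)   # for vegdist (Bray-Curtis) on the abundance block

set.seed(6)
env   <- matrix(rnorm(30), ncol = 3)            # hypothetical continuous variables (10 objects)
pa    <- matrix(rbinom(30, 1, 0.5), ncol = 3)   # hypothetical presence/absence block
abund <- matrix(rpois(40, 3), ncol = 4)         # hypothetical species abundances

sc.env <- prcomp(scale(env))$x                                                    # continuous: PCA (Euclidean)
sc.pa  <- cmdscale(dist(pa, method = "manhattan"), k = 2, add = TRUE)$points      # binary: PCoA on Hamming distances
sc.ab  <- cmdscale(vegdist(abund, method = "bray"), k = 2, add = TRUE)$points     # abundance: PCoA on Bray-Curtis

Y.combined <- cbind(sc.env, sc.pa, sc.ab)       # combined matrix of ordination scores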
Armed with this information, let’s look at inferences using the (general) linear model:
\[\mathbf{Y}=\mathbf{X}\mathbf{\beta } +\mathbf{E}\]
For multivariate data, \(\small\mathbf{Y}\) is a \(\small n \times p\) matrix, not a vector. How does this affect our computations?
The answer depends upon which step of the procedure: parameter estimation or model evaluation.
Algebraic computations for estimating model parameters and coefficients are identical to those used for univariate \(\small{Y}\) data:
\(\tiny\mathbf{X}_R = \begin{bmatrix} 1\\ 1\\ 1\\ 1\\ 1\\ 1 \end{bmatrix}\) & \(\tiny\mathbf{X}_F = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}\)
Let’s compare the fit of the data to both \(\small\mathbf{X}_{R}\) and \(\small\mathbf{X}_{F}\):
| Estimate | \(\small\mathbf{X}_{R}\) | \(\small\mathbf{X}_{F}\) |
|---|---|---|
| Coefficients | \(\tiny\hat{\mathbf{\beta_R}}=\left ( \mathbf{X}_R^{T} \mathbf{X}_R\right )^{-1}\left ( \mathbf{X}_R^{T} \mathbf{Y}\right )\) | \(\tiny\hat{\mathbf{\beta_F}}=\left ( \mathbf{X}_F^{T} \mathbf{X}_F\right )^{-1}\left ( \mathbf{X}_F^{T} \mathbf{Y}\right )\) |
| Predicted Values | \(\small\hat{\mathbf{Y}}_R=\mathbf{X}_R\hat{\mathbf{\beta}}_R\) | \(\small\hat{\mathbf{Y}}_F=\mathbf{X}_F\hat{\mathbf{\beta}}_F\) |
| Model Residuals | \(\small\hat{\mathbf{E}}_R=\mathbf{Y}-\hat{\mathbf{Y}}_R\) | \(\small\hat{\mathbf{E}}_F=\mathbf{Y}-\hat{\mathbf{Y}}_F\) |
| Model Residual Error (\(\small{SSE}\)) | \(\small\mathbf{S}_R=\hat{\mathbf{E}}_R^T\hat{\mathbf{E}}_R\) | \(\small\mathbf{S}_F=\hat{\mathbf{E}}_F^T\hat{\mathbf{E}}_F\) |
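A sketch of these computations for a small hypothetical example matching the design matrices above (a two-column response and two groups of three observations):

set.seed(7)
Y  <- matrix(rnorm(12), ncol = 2)            # hypothetical 6 x 2 response
XR <- matrix(1, nrow = 6, ncol = 1)          # reduced (null) design
XF <- cbind(1, rep(c(0, 1), each = 3))       # full design, with a group dummy variable

bR <- solve(t(XR) %*% XR) %*% t(XR) %*% Y    # coefficients, reduced model
bF <- solve(t(XF) %*% XF) %*% t(XF) %*% Y    # coefficients, full model
ER <- Y - XR %*% bR                          # residuals, reduced model
EF <- Y - XF %*% bF                          # residuals, full model
SR <- t(ER) %*% ER                           # residual SSCP matrix, reduced model
SF <- t(EF) %*% EF                           # residual SSCP matrix, full model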
Ok, but what about statistical evaluation? What test statistics can we use?
With univariate linear models, our test statistic was the ratio of explained to unexplained variance: \(\small{F}=\frac{\sigma^{2}_{M}}{\sigma^{2}_{\epsilon}}\)
For multivariate \(\small\mathbf{Y}\) data, these would be matrices, implying:
\[\tiny{"F"}=\frac{\begin{bmatrix} \sigma^{2}_{M_{11}} & \sigma_{M_{12}} \\ \sigma_{M_{21}} & \sigma^{2}_{M_{22}} \end{bmatrix}} {\begin{bmatrix} \sigma^{2}_{\epsilon_{11}} & \sigma_{\epsilon_{12}} \\ \sigma_{\epsilon_{21}} & \sigma^{2}_{\epsilon_{22}} \end{bmatrix}}=\frac{\left(\Delta k\right)^{-1}\small\left(\mathbf{S}_R-\small\mathbf{S}_{F}\right)}{\left(n-k_f-1\right)^{-1}\small\mathbf{S}_{F}}\]
But since one cannot ‘divide’ matrices, other summary test measures have been derived (a numerical sketch follows this list):
1: Wilks’ lambda: \(\small\Lambda_{Wilks}=\frac{\begin{vmatrix}\mathbf{S}_{F}\end{vmatrix}}{\begin{vmatrix}\mathbf{S}_{R}\end{vmatrix}}\)
2: Pillai’s trace: \(\small\Lambda_{Pillai}=tr\left(\left(\mathbf{S}_{R}-\mathbf{S}_{F}\right)\mathbf{S}_{R}^{-1}\right)\)
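Continuing the hypothetical sketch above (SR and SF from the reduced and full models), these measures can be computed directly from the residual SSCP matrices:

Wilks  <- det(SF) / det(SR)                   # Wilks' lambda
Pillai <- sum(diag((SR - SF) %*% solve(SR)))  # Pillai's trace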
These are then converted to approximate \(\small{F}\) values via additional formulae that carry additional parametric assumptions. Model significance is then evaluated using \(\small{F}_{approx}\)
Parametric multivariate methods ‘work’, but have limitations. Consider multivariate regression (using \(\small{F}_{approx}\)). Here are simulation results (type I error and power) as \(\small p\) increases:
Statistical power decreases as the number of dimensions (\(\small p\)) increases, and power eventually hits zero (as \(\small p\) approaches \(\small n\)).
We require an alternative implementation for model evaluation with highly multivariate data!
When \(p>n\), standard parametric approaches are not possible. There are several ways of coping with the complications of high-dimensional data
Reduce dimensionality: Reduce the number of variables (\(p\)) in some manner (typically use PCA and retain only a subset of axes for the analysis)
Because of the difficulty with high-dimensional shape data, we require summary test measures that do not suffer the ills found with standard, parametric measures. Several come to mind:
Goodall (1991) and Anderson (2001) independently recognized the link between distances and summary \(\small{F}\)-statistics. Summary \(\small{F}\)-ratios are based on sums-of-squares (SS), and distances are found from the square root of squared differences between objects. Thus, SS (and hence \(\small{F}\)-ratios) can be obtained directly from distances among objects.
For univariate data, this approach yields IDENTICAL \(F\)-values to standard equations, but provides a generalization of \(\small{F}\) (based on matrix traces) for high-dimensional data.
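A quick numerical check of this SS-distance link, with small hypothetical data: the trace of the SSCP matrix equals the sum of squared distances of points to their mean, which in turn equals the sum of all pairwise squared distances divided by \(\small 2n\).

set.seed(4)
Y  <- matrix(rnorm(20), ncol = 2)                       # hypothetical n x 2 data
Yc <- scale(Y, scale = FALSE)                           # mean-centered data

SS.trace <- sum(diag(t(Yc) %*% Yc))                     # SS as the trace of the SSCP matrix
SS.dist  <- sum(as.matrix(dist(Y))^2) / (2 * nrow(Y))   # SS from pairwise distances
c(SS.trace, SS.dist)                                    # identical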
Adams and Collyer extended this paradigm: applying residual randomization in a permutation procedure (RRPP) for model evaluation (which has higher statistical power for factorial models), and deriving permutation-based effect sizes for comparing signal strength across model effects.
Interestingly, RRPP ‘breaks’ Rao’s paradox, as power increases as the number of trait dimensions (\(\small p\) ) increases.
Additional trait dimensions contribute signal to the relationship \(\small\mathbf{Y}\sim\mathbf{X}\) if it is present, and since statistical evaluation is based on this signal, power increases (NOTE: adding pure noise dimensions neither increases nor decreases power).

Note: the example below is a high-dimensional dataspace (\(\small n=54\) & \(\small p=112\)). Standard parametric methods will fail, as \(\small p >> n\). Thus, we will use RRPP approaches to investigate patterns of phenotypic dispersion in this dataset.
MANOVA via RRPP reveals significant phenotypic variation attributable to all model effects. Pairwise comparisons identify significant sexual dimorphism in the Marsh population, but not the Sinkhole population (hence the significant interaction).
library(RRPP)
data(Pupfish)
Pupfish$Group <- interaction(Pupfish$Sex, Pupfish$Pop)
fit.m <- lm.rrpp(coords ~ Pop * Sex, data = Pupfish, print.progress = FALSE)
anova(fit.m)$table
## Df SS MS Rsq F Z Pr(>F)
## Pop 1 0.008993 0.0089927 0.15964 16.076 3.9997 0.001 **
## Sex 1 0.015917 0.0159169 0.28255 28.453 4.1103 0.001 **
## Pop:Sex 1 0.003453 0.0034532 0.06130 6.173 3.7015 0.001 **
## Residuals 50 0.027970 0.0005594 0.49651
## Total 53 0.056333
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
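The pairwise comparisons shown below were presumably generated along these lines (a sketch; the exact call is not shown in the source):

PW <- pairwise(fit.m, groups = Pupfish$Group)
summary(PW)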
##
## Pairwise comparisons
##
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole
##
## RRPP: 1000 permutations
##
## LS means:
## Vectors hidden (use show.vectors = TRUE to view)
##
## Pairwise distances between means, plus statistics
## d UCL (95%) Z Pr > d
## F.Marsh:M.Marsh 0.04611590 0.04297378 2.3159417 0.007
## F.Marsh:F.Sinkhole 0.03302552 0.03312516 1.6124629 0.055
## F.Marsh:M.Sinkhole 0.03881514 0.04633821 -0.5421145 0.699
## M.Marsh:F.Sinkhole 0.04605211 0.05506197 -0.2523753 0.597
## M.Marsh:M.Sinkhole 0.02802087 0.03343901 0.1854026 0.420
## F.Sinkhole:M.Sinkhole 0.02568508 0.04364031 -2.2111968 0.993
Visualization via a principal components plot of fitted values from the model (we’ll discuss PCA in a future lecture). Notice that the statistical results are easily visualized here, despite being highly multivariate data.
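One way to produce such a plot with RRPP's plotting method (a sketch; the plotting options are assumptions):

plot(fit.m, type = "PC", pch = 19, col = as.integer(Pupfish$Group))   # PCA of fitted values, colored by group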
Incorporating size as a covariate we find that shape covaries with size (i.e., allometry). Pairwise comparisons reveal that male and female Sinkhole fish differ in their allometry patterns.
Pupfish$logSize <- log(Pupfish$CS)
fit.slopes <- lm.rrpp(coords ~ logSize * Pop * Sex, data = Pupfish, print.progress = FALSE)
anova(fit.slopes)$table
## Df SS MS Rsq F Z Pr(>F)
## logSize 1 0.014019 0.0140193 0.24886 29.2078 5.0041 0.001 **
## Pop 1 0.010247 0.0102472 0.18190 21.3491 4.7949 0.001 **
## Sex 1 0.005028 0.0050284 0.08926 10.4763 4.5388 0.001 **
## logSize:Pop 1 0.001653 0.0016528 0.02934 3.4434 2.6259 0.004 **
## logSize:Sex 1 0.001450 0.0014498 0.02574 3.0206 2.2721 0.014 *
## Pop:Sex 1 0.001370 0.0013696 0.02431 2.8534 2.2866 0.010 *
## logSize:Pop:Sex 1 0.000486 0.0004865 0.00864 1.0135 0.3337 0.382
## Residuals 46 0.022079 0.0004800 0.39194
## Total 53 0.056333
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
PWS <- pairwise(fit.slopes, groups = Pupfish$Group, covariate = Pupfish$logSize)
summary(PWS, test.type = "VC", angle.type = "deg")
##
## Pairwise comparisons
##
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole
##
## RRPP: 1000 permutations
##
## Slopes (vectors of variate change per one unit of covariate
## change, by group):
## Vectors hidden (use show.vectors = TRUE to view)
##
## Pairwise statistics based on slopes vector correlations (r)
## and angles, acos(r)
## The null hypothesis is that r = 1 (parallel vectors).
## This null hypothesis is better treated as the angle
## between vectors = 0
## r angle UCL (95%) Z Pr > angle
## F.Marsh:M.Marsh 0.6139591 52.12367 95.40553 -1.0051685 0.844
## F.Marsh:F.Sinkhole 0.6318267 50.81498 88.71850 -0.8268874 0.782
## F.Marsh:M.Sinkhole 0.4461018 63.50614 116.45215 -1.3459215 0.904
## M.Marsh:F.Sinkhole 0.7175129 44.15048 77.31288 -1.4269023 0.926
## M.Marsh:M.Sinkhole 0.5629975 55.73665 74.08894 0.5136053 0.316
## F.Sinkhole:M.Sinkhole 0.4627734 62.43378 81.64708 0.4261192 0.342
Visualizing multivariate regression is tricky. Several measures summarizing patterns in \(\small\mathbf{Y}\) may be used: regression scores (top) and predicted lines (bottom).
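A sketch of both visualizations using RRPP's plotting method for the allometry model fit above (the plotting options are assumptions):

plot(fit.slopes, type = "regression", predictor = Pupfish$logSize,
     reg.type = "RegScore", pch = 19, col = as.integer(Pupfish$Group))   # regression scores vs. size
plot(fit.slopes, type = "regression", predictor = Pupfish$logSize,
     reg.type = "PredLine", pch = 19, col = as.integer(Pupfish$Group))   # predicted lines vs. size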
Sometimes, one must represent the response \(\small\mathbf{Y}\) data as a matrix of object distances. If the distance measure is metric or approximately Euclidean, one may perform ANOVA via RRPP with this as input.
Here, Principal Coordinates Analysis (PCoA) is used to obtain multivariate continuous coordinates from the distance matrix, which is then subject to permutational MANOVA via RRPP.
Here we represent the pupfish data by Euclidean distances. Results are identical to before.
Pupfish$Dist <- dist(Pupfish$coords)
anova(lm.rrpp(Dist ~ Pop * Sex, data = Pupfish, print.progress = FALSE))$table
## Df SS MS Rsq F Z Pr(>F)
## Pop 1 0.008993 0.0089927 0.15964 16.076 3.9997 0.001 **
## Sex 1 0.015917 0.0159169 0.28255 28.453 4.1103 0.001 **
## Pop:Sex 1 0.003453 0.0034532 0.06130 6.173 3.7015 0.001 **
## Residuals 50 0.027970 0.0005594 0.49651
## Total 53 0.056333
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Matrices of abundance data (species counts) are common in ecology, but are clearly not MVN. There is a large literature on the subject, and several implementations for analysis are possible.
One school uses matrices of \(log(abund+1)\) or Bray-Curtis distances analyzed using linear models (LM) via permutational MANOVA (e.g., Anderson).
Another approach is to use generalized linear models (GLM with Poisson or another appropriate error distribution) on each species separately, followed by evaluating the summed \(\small{F}\)-ratio (or similar statistic) across species (e.g., Warton).
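A hypothetical sketch of both strategies, using the vegan and mvabund packages (package choices and object names are illustrative assumptions, not lecture code):

library(vegan)     # adonis2: permutational MANOVA on a distance matrix
library(mvabund)   # manyglm: the model-based GLM approach of Warton and colleagues

set.seed(5)
abund   <- matrix(rpois(60, lambda = 3), nrow = 12)   # hypothetical sites x species counts
habitat <- factor(rep(c("wet", "dry"), each = 6))     # hypothetical grouping factor

adonis2(vegdist(abund, method = "bray") ~ habitat)    # LM route: Bray-Curtis + permutational MANOVA

fit.glm <- manyglm(mvabund(abund) ~ habitat)          # GLM route: negative-binomial GLM per species
anova(fit.glm)                                        # resampling-based test of the summed statistic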
There are strengths and weaknesses to both approaches. GLM can accommodate specific data types, but LM is more robust to departures from assumptions. Importantly, it is suggested that residual resampling (i.e., RRPP!) cures a lot of LM/GLM ills. Future theoretical work on RRPP will investigate this more thoroughly.
A simple example is in the lab tutorial, and those interested should investigate the primary literature.
Linear models provide flexible tools for evaluating a wide range of statistical hypotheses
For multivariate and high-dimensional data (and distance data), permutational MANOVA facilitates evaluation of models via RRPP
The strength of signal across model effects can be compared using effect sizes (Z-scores)
Pairwise comparisons of groups, slopes, or other attributes can also be evaluated using RRPP