As we discussed in lecture, interactions between main effects in a linear model describe context-dependent patterns in one’s data. For instance, in the model: Y~cov+group+cov:group, a significant cov:group term implies that the relationship between Y and the covariate differs in some way among the groups. Such patterns are very common in biology and in fact describe patterns of considerable interest. However, with multivariate data, simply identifying a significant interaction in a linear model is only the first step towards understanding.
Once a significant interaction has been identified, one must then dig deeper to understand its meaning. Below is an example for a factrial MANOVA.
As seen in previous weeks, one can perform pairwise comparisons among groups via RRPP. The approach is based on distances between group means.
library(RRPP)
data(Pupfish)
Pupfish$Group <- interaction(Pupfish$Sex, Pupfish$Pop)
fit.m <- lm.rrpp(coords ~ Pop * Sex, data = Pupfish, print.progress = FALSE)
anova(fit.m)$table
## Df SS MS Rsq F Z Pr(>F)
## Pop 1 0.008993 0.0089927 0.15964 16.076 3.9997 0.001 **
## Sex 1 0.015917 0.0159169 0.28255 28.453 4.1103 0.001 **
## Pop:Sex 1 0.003453 0.0034532 0.06130 6.173 3.7015 0.001 **
## Residuals 50 0.027970 0.0005594 0.49651
## Total 53 0.056333
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note that the interaction between pop and sex is significant. This implies that some pop:sexcombination(s) differ from the others, but not which ones. To identify which groups differ, we need multiple comparison tests.
summary(pairwise(fit.m,groups = Pupfish$Group))
##
## Pairwise comparisons
##
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole
##
## RRPP: 1000 permutations
##
## LS means:
## Vectors hidden (use show.vectors = TRUE to view)
##
## Pairwise distances between means, plus statistics
## d UCL (95%) Z Pr > d
## F.Marsh:M.Marsh 0.04611590 0.04297378 2.3159417 0.007
## F.Marsh:F.Sinkhole 0.03302552 0.03312516 1.6124629 0.055
## F.Marsh:M.Sinkhole 0.03881514 0.04633821 -0.5421145 0.699
## M.Marsh:F.Sinkhole 0.04605211 0.05506197 -0.2523753 0.597
## M.Marsh:M.Sinkhole 0.02802087 0.03343901 0.1854026 0.420
## F.Sinkhole:M.Sinkhole 0.02568508 0.04364031 -2.2111968 0.993
plot(fit.m, type = "PC", pch=21, bg = Pupfish$Group, cex=2)
legend("topright", levels(Pupfish$Group), pch = 21, pt.bg = 1:4)
Note the major differences between sympatric populations of both species (blue and green), and the overlap between allopetric populations of the same species (black and red). This is the quintessential example of character displacement.
For MANCOVA models, a covariate is included in the model. Now several comparisons are of interest. First, one can compare groups (LS-means) while accounting for the covariate.
Pupfish$logSize <- log(Pupfish$CS)
fit.slopes <- lm.rrpp(coords ~ logSize * Pop * Sex, data = Pupfish, print.progress = FALSE)
anova(fit.slopes)$table
## Df SS MS Rsq F Z Pr(>F)
## logSize 1 0.014019 0.0140193 0.24886 29.2078 5.0041 0.001 **
## Pop 1 0.010247 0.0102472 0.18190 21.3491 4.7949 0.001 **
## Sex 1 0.005028 0.0050284 0.08926 10.4763 4.5388 0.001 **
## logSize:Pop 1 0.001653 0.0016528 0.02934 3.4434 2.6259 0.004 **
## logSize:Sex 1 0.001450 0.0014498 0.02574 3.0206 2.2721 0.014 *
## Pop:Sex 1 0.001370 0.0013696 0.02431 2.8534 2.2866 0.010 *
## logSize:Pop:Sex 1 0.000486 0.0004865 0.00864 1.0135 0.3337 0.382
## Residuals 46 0.022079 0.0004800 0.39194
## Total 53 0.056333
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
PWS.gp <- pairwise(fit.slopes, groups = Pupfish$Group)
summary(PWS.gp)
##
## Pairwise comparisons
##
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole
##
## RRPP: 1000 permutations
##
## LS means:
## Vectors hidden (use show.vectors = TRUE to view)
##
## Pairwise distances between means, plus statistics
## d UCL (95%) Z Pr > d
## F.Marsh:M.Marsh 0.03750941 0.04531287 0.002246837 0.503
## F.Marsh:F.Sinkhole 0.03719039 0.04374134 -0.221789776 0.590
## F.Marsh:M.Sinkhole 0.03808475 0.04277692 -0.495662717 0.696
## M.Marsh:F.Sinkhole 0.04409490 0.05311727 0.352149542 0.354
## M.Marsh:M.Sinkhole 0.02842126 0.03729487 0.008211561 0.482
## F.Sinkhole:M.Sinkhole 0.02477046 0.03228372 -0.122666544 0.550
plot(fit.slopes, type = "PC", pch=21, bg = Pupfish$Group, cex=2)
legend("topright", levels(Pupfish$Group), pch = 21, pt.bg = 1:4)
Note that the distances between pairs of group means are not the same as in the first example, because the covariate has been taken into account in the analysis.
Additionally, one can compare the slopes among groups.
PWS <- pairwise(fit.slopes, groups = Pupfish$Group, covariate = Pupfish$logSize)
summary(PWS)
##
## Pairwise comparisons
##
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole
##
## RRPP: 1000 permutations
##
## Slopes (vectors of variate change per one unit of covariate
## change, by group):
## Vectors hidden (use show.vectors = TRUE to view)
##
## Pairwise distances between slope vector
## (end-points), plus statistics
## d UCL (95%) Z Pr > d
## F.Marsh:M.Marsh 0.08279940 0.2033602 -1.9636763 0.977
## F.Marsh:F.Sinkhole 0.07848121 0.1846494 -1.8491879 0.970
## F.Marsh:M.Sinkhole 0.12322729 0.2581796 -1.4771494 0.928
## M.Marsh:F.Sinkhole 0.06418279 0.1293582 -2.3787828 0.992
## M.Marsh:M.Sinkhole 0.10863856 0.1498666 0.4143532 0.348
## F.Sinkhole:M.Sinkhole 0.11706696 0.1566431 0.3505166 0.377
summary(PWS, test.type = "VC", angle.type = "deg")
##
## Pairwise comparisons
##
## Groups: F.Marsh M.Marsh F.Sinkhole M.Sinkhole
##
## RRPP: 1000 permutations
##
## Slopes (vectors of variate change per one unit of covariate
## change, by group):
## Vectors hidden (use show.vectors = TRUE to view)
##
## Pairwise statistics based on slopes vector correlations (r)
## and angles, acos(r)
## The null hypothesis is that r = 1 (parallel vectors).
## This null hypothesis is better treated as the angle
## between vectors = 0
## r angle UCL (95%) Z Pr > angle
## F.Marsh:M.Marsh 0.6139591 52.12367 95.40553 -1.0051685 0.844
## F.Marsh:F.Sinkhole 0.6318267 50.81498 88.71850 -0.8268874 0.782
## F.Marsh:M.Sinkhole 0.4461018 63.50614 116.45215 -1.3459215 0.904
## M.Marsh:F.Sinkhole 0.7175129 44.15048 77.31288 -1.4269023 0.926
## M.Marsh:M.Sinkhole 0.5629975 55.73665 74.08894 0.5136053 0.316
## F.Sinkhole:M.Sinkhole 0.4627734 62.43378 81.64708 0.4261192 0.342
plot(fit.slopes, type = "regression", reg.type = "RegScore", pch=21, bg = Pupfish$Group, predictor = Pupfish$logSize, cex=2)
legend("topleft", levels(Pupfish$Group), pch = 21, pt.bg = 1:4)
plot(fit.slopes, type = "regression", reg.type = "PredLine", pch=21, bg = Pupfish$Group, predictor = Pupfish$logSize, cex=2)
legend("topleft", levels(Pupfish$Group), pch = 21, pt.bg = 1:4)
Note that the above provides comparisons allow for the length of the regression vector, as well as a test of the angular direction of the regressions in the multivariate trait space.
Together, these procedures provide considerable flexibility in evaluating patterns that comprise significant interaction terms.
Traditional pairwise comparisons following MANOVA are conducted using the distance between group means, as in our examples above. However, for factorial models, this is rather limiting and does not tell the entire story; the approach might leave some relevant biological signal unexamined. For instance, consider the model in the example above: Y~species+site+species:site. In this case, we know that each species is found in each site. Thus, there is a linkage between site and species. To leverage this knowledge, we use trajectory analysis.
In this analysis, each level of factor 1 (e.g., species) is a trajectory in the multivariate space, which is formed by its corresponding least squares means (\(\overline{Y}\)) for the levels of factor B (e.g., site). This effectively ‘links’ the data from factor B within its level of factor A, forming the trajectory.
These trajectories then have higher-order attributes in the multivariate space, and these may be statistically compared. First, they have a magnitude (length), and second, they have a direction. For trajectories with 3 or more levels they also have a shape.
Let’s show an example for the data above. Here, our trajectories represent the degree of sexual dimorphism in each pupfish population.
fit <- lm.rrpp(coords ~ Pop * Sex, data = Pupfish, iter = 999,
print.progress = FALSE)
TA <- trajectory.analysis(fit, groups = Pupfish$Pop,
traj.pts = Pupfish$Sex, print.progress = FALSE)
# Magnitude difference (absolute difference between path distances)
summary(TA, attribute = "MD")
##
## Trajectory analysis
##
## 1000 permutations.
##
## Points projected onto trajectory PCs
##
## Trajectories:
## Trajectories hidden (use show.trajectories = TRUE to view)
##
## Observed path distances by group
##
## Marsh Sinkhole
## 0.04611590 0.02568508
##
## Pairwise absolute differences in path distances, plus statistics
## d UCL (95%) Z Pr > d
## Marsh:Sinkhole 0.02043082 0.012731 2.671911 0.001
# Correlations (angles) between trajectories
summary(TA, attribute = "TC", angle.type = "deg")
##
## Trajectory analysis
##
## 1000 permutations.
##
## Points projected onto trajectory PCs
##
## Trajectories:
## Trajectories hidden (use show.trajectories = TRUE to view)
##
## Pairwise correlations between trajectories, plus statistics
## r angle UCL (95%) Z Pr > angle
## Marsh:Sinkhole 0.7393716 42.32209 24.12074 3.54113 0.001
Notice that in this example, there is evidence for differences in trajectory length, as well as trajectory direction. This means that there is greater sexual dimorphism in one population as compared to another, and also in the manner in which sexual dimorphism occurs.
Looking at these graphically we see the pattern:
TP <- plot(TA, pch = as.numeric(Pupfish$Pop) + 20, bg = as.numeric(Pupfish$Sex),
cex = 1, col = "black")
add.trajectories(TP, traj.pch = c(21, 22), start.bg = 1, end.bg = 2)
legend("topright", levels(Pupfish$Pop), pch = c(21, 22), pt.bg = 1)
The approach works unaltered for more complicated datasets.
data(motionpaths)
fit <- lm.rrpp(trajectories ~ groups, data = motionpaths, iter = 999,
print.progress = FALSE)
TA <- trajectory.analysis(fit, groups = motionpaths$groups, traj.pts = 5,
print.progress = FALSE)
# Magnitude difference (absolute difference between path distances)
summary(TA, attribute = "MD")
##
## Trajectory analysis
##
## 1000 permutations.
##
##
## Trajectories:
## Trajectories hidden (use show.trajectories = TRUE to view)
##
## Observed path distances by group
##
## 1 2 3 4
## 10.144421 6.557011 10.949722 10.097348
##
## Pairwise absolute differences in path distances, plus statistics
## d UCL (95%) Z Pr > d
## 1:2 3.58741011 1.317461 3.8913405 0.001
## 1:3 0.80530081 1.247854 0.7320629 0.250
## 1:4 0.04707265 1.284309 -1.6674019 0.950
## 2:3 4.39271092 1.262817 4.5955836 0.001
## 2:4 3.54033746 1.333765 3.8250970 0.001
## 3:4 0.85237346 1.251712 0.8703440 0.213
# Correlations (angles) between trajectories
summary(TA, attribute = "TC", angle.type = "deg")
##
## Trajectory analysis
##
## 1000 permutations.
##
##
## Trajectories:
## Trajectories hidden (use show.trajectories = TRUE to view)
##
## Pairwise correlations between trajectories, plus statistics
## r angle UCL (95%) Z Pr > angle
## 1:2 0.9989147 2.669593 20.70022 -0.8792581 0.808
## 1:3 0.6122502 52.247611 22.06576 3.5215617 0.001
## 1:4 0.9996899 1.426827 20.81629 -1.3327426 0.901
## 2:3 0.6484120 49.578018 21.87022 3.3651369 0.001
## 2:4 0.9974452 4.096420 21.08371 -0.5446100 0.700
## 3:4 0.5923727 53.674438 21.14215 3.5973574 0.001
# Shape differences between trajectories
summary(TA, attribute = "SD")
##
## Trajectory analysis
##
## 1000 permutations.
##
##
## Trajectories:
## Trajectories hidden (use show.trajectories = TRUE to view)
##
## Pairwise trajectory shape differences, plus statistics
## d UCL (95%) Z Pr > d
## 1:2 0.10279471 0.1855284 -0.15347678 0.551
## 1:3 0.10686382 0.1879151 -0.04044511 0.512
## 1:4 0.28213028 0.1881749 2.98789933 0.002
## 2:3 0.07843294 0.1862637 -0.91031401 0.814
## 2:4 0.36491114 0.1877718 3.94123541 0.001
## 3:4 0.35086211 0.1859428 4.01741549 0.001
TP <- plot(TA, pch = 21, bg = as.numeric(motionpaths$groups),
cex = 0.7, col = "gray")
add.trajectories(TP, traj.pch = 21, traj.bg = 1:4)