Dean Adams, Iowa State University
Univariate covariation, correlation
\[\small{cov}_{y_{1},y_{2}}=\sigma_{y_{1},y_{2}}=\frac{1}{n-1} \sum_{i=1}^{n}\left(y_{i1}-\bar{y}_{1}\right) \left(y_{i2}-\bar{y}_{2}\right)\]
\[\small{cor}_{y_{1},y_{2}}=r_{1,2}=\frac{1}{n-1} \sum_{i=1}^{n}\frac{\left(y_{i1}-\bar{y_1}\right)}{s_{1}} \frac{\left(y_{i2}-\bar{y_2}\right)}{s_{2}}\]
What is the relationship between mass-specific metabolic rate and body mass across 580 mammal species? (data from Capellini et al. Ecol. 2010)
Sums of squares (SS):
\(\small{SS}_{y_1}=\sum_{i=1}^{n}\left(y_{i1}-\bar{y_1}\right)^{2}=\sum_{i=1}^{n}\left(y_{i1}-\bar{y_1}\right)\left(y_{i1}-\bar{y_1}\right)\)
\(\small{SS}_{y_2}=\sum_{i=1}^{n}\left(y_{i2}-\bar{y_2}\right)^{2}=\sum_{i=1}^{n}\left(y_{i2}-\bar{y_2}\right)\left(y_{i2}-\bar{y_2}\right)\)
\(\small{SC}_{y_1y_2}=\sum_{i=1}^{n}\left(y_{i1}-\bar{y_1}\right)\left(y_{i2}-\bar{y_2}\right)\)
Variance components
\(\small{s}^{2}_{y_1}=\sigma^{2}_{y_1}=\frac{\sum_{i=1}^{n}\left(y_{i1}-\bar{y_1}\right)^{2}}{n-1}\)
\(\small{s}^{2}_{y_2}=\sigma^{2}_{y_2}=\frac{\sum_{i=1}^{n}\left(y_{i2}-\bar{y_2}\right)^{2}}{n-1}\)
\(\small{cov}_{y_1y_2}=\sigma_{y_1y_2}=\frac{\sum_{i=1}^{n}\left(y_{i1}-\bar{y_1}\right)\left(y_{i2}-\bar{y_2}\right)}{n-1}\)
Re-express:
\[\small{cor}_{y_{1},y_{2}}=r_{1,2}=\frac{1}{n-1} \sum_{i=1}^{n}\frac{\left(y_{i1}-\bar{y_1}\right)}{s_{1}} \frac{\left(y_{i2}-\bar{y_2}\right)}{s_{2}}\]
\[\small{r}_{1,2}=\frac{1}{n-1}\sum_{i=1}^{n}\frac{\left(y_{i1}-\bar{y_1}\right)}{\sqrt{\frac{SS_{y_1}}{n-1}}}\frac{\left(y_{i2}-\bar{y_2}\right)}{\sqrt{\frac{SS_{y_2}}{n-1}}}\]
\[\small{r}_{1,2}=\frac{SC_{y_1y_2}}{\sqrt{SS_{y_1}SS_{y_2}}}=\frac{\sigma_{y_1y_2}}{\sqrt{\sigma^{2}_{y_1}\sigma^{2}_{y_2}}}=\frac{\sigma_{y_1y_2}}{\sigma_{y_1}\sigma_{y_2}}\]
X: independent (predictor) variable; Y: dependent (response) variable
\(\small{SS}_{x}=2997.23\); \(\small{SS}_{y}=371.48\); \(\small{SC}_{xy}=-925.99\)
\(\small{r}_{xy}=\frac{-925.99}{\sqrt{2997.23\times371.48}}=-0.877\)
Strength of association: \(\small{r}^{2}=0.770\)
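These quantities are simple to verify by hand; a minimal sketch in R, with simulated x and y standing in for the example data:

```r
# SS, SC, and r "by hand" (simulated x and y; not the mammal data)
set.seed(1)
x <- rnorm(30)                                # predictor
y <- -0.8 * x + rnorm(30, sd = 0.5)           # response
SSx  <- sum((x - mean(x))^2)                  # sum of squares, x
SSy  <- sum((y - mean(y))^2)                  # sum of squares, y
SCxy <- sum((x - mean(x)) * (y - mean(y)))    # sum of cross-products
r <- SCxy / sqrt(SSx * SSy)                   # correlation coefficient
c(r = r, r.squared = r^2)                     # strength of association
all.equal(r, cor(x, y))                       # matches R's built-in cor()
```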
Consider two data matrices \(\small\mathbf{X}\) & \(\small\mathbf{Y}\). The dispersion of objects in each may be described by the distance matrices, \(\small\mathbf{D}_X\) & \(\small\mathbf{D}_Y\). A Mantel test is used to evaluate the association between two or more distance matrices.
First, obtain: \(\small{z}_M= \sum{\mathbf{X}_{i}\mathbf{Y}_{i}}\), where \(\small\mathbf{X}\) & \(\small\mathbf{Y}\) are the unfolded (and standardized) distance matrices
Next, obtain the Mantel correlation coefficient: \(\small{r}_M = \frac{z_M}{[n(n-1)/2]-1}\)
Assess the significance of \(\small{r_M}\) via permutation, where the rows and columns of one distance matrix are permuted in tandem.
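A hand-rolled sketch of this procedure in R (simulated matrices; dedicated implementations such as vegan::mantel exist):

```r
# Hand-rolled Mantel test (simulated data; matrix names illustrative)
set.seed(1)
X <- matrix(rnorm(30 * 4), 30, 4)
Y <- matrix(rnorm(30 * 4), 30, 4)
DX <- as.matrix(dist(X))                  # distance matrix for X
DY <- as.matrix(dist(Y))                  # distance matrix for Y
lt <- lower.tri(DX)                       # indices that 'unfold' the matrices
r.obs <- cor(DX[lt], DY[lt])              # Mantel correlation
r.rand <- replicate(999, {
  i <- sample(nrow(DY))                   # permute rows & columns in tandem
  cor(DX[lt], DY[i, i][lt])
})
mean(c(r.obs, r.rand) >= r.obs)           # permutation p-value
```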
Another way to approach this is via matrix algebra
Y1 = log(mass) & Y2 = VO2
\[\mathbf{Y}=[\mathbf{y}_{1}\mathbf{y}_{2}]\]
Let \(\small\mathbf{1}^{T}=\begin{bmatrix}1 & 1 & 1 & \cdots & 1\end{bmatrix}\) then…
\[\small\bar{\mathbf{y}}^{T}=\left(\mathbf{1}^{T}\mathbf{1}\right)^{-1}\mathbf{1}^{T}\mathbf{Y}\]
\[\small\bar{\mathbf{Y}}=\mathbf{1}\left(\mathbf{1}^{T}\mathbf{1}\right)^{-1}\mathbf{1}^{T}\mathbf{Y}\]
\[\small\mathbf{E}=\mathbf{Y}_{c}=\mathbf{Y}-\bar{\mathbf{Y}}\]
\[\small\mathbf{SSCP}=\mathbf{E}^{T}\mathbf{E}=\mathbf{Y}_{c}^{T}\mathbf{Y}_{c}\]
\[\mathbf{SSCP}=\begin{bmatrix} \mathbf{SS}_{y_{1}} & \mathbf{SC}_{y_{1}y_{2}} \\ \mathbf{SC}_{y_{2}y_{1}} & \mathbf{SS}_{y_{2}} \end{bmatrix}\]
\(\small{SSCP}\) is a matrix of sums of squares and cross-products (found using deviations from the mean)
The covariance matrix is simply the \(\small{SSCP}\) standardized by \(\small{n-1}\):
\[\hat{\mathbf{\Sigma}}=\frac{\mathbf{E}^{T}\mathbf{E}}{n-1}\]
\[\hat{\mathbf{\Sigma}}=\frac{\begin{bmatrix} \mathbf{SS}_{y_{1}} & \mathbf{SC}_{y_{1}y_{2}} \\ \mathbf{SC}_{y_{2}y_{1}} & \mathbf{SS}_{y_{2}} \end{bmatrix}}{n-1}\]
The covariance matrix is:
\[\hat{\mathbf{\Sigma}}=\begin{bmatrix} \sigma^{2}_{y_{1}} & \sigma_{y_{1}y_{2}} \\ \sigma_{y_{2}y_{1}} & \sigma^{2}_{y_{2}} \end{bmatrix}\]
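A minimal sketch of this matrix algebra in R, with a simulated two-column \(\small\mathbf{Y}\) standing in for the data:

```r
# SSCP and covariance matrix via the matrix-algebra route (simulated Y)
set.seed(1)
Y <- cbind(y1 = rnorm(100), y2 = rnorm(100))
n <- nrow(Y)
one <- matrix(1, n, 1)                                   # the 1-vector
Ybar <- one %*% solve(t(one) %*% one) %*% t(one) %*% Y   # n x 2 matrix of means
E <- Y - Ybar                                            # centered data (deviations)
SSCP <- t(E) %*% E                                       # sums of squares & cross-products
Sigma.hat <- SSCP / (n - 1)                              # covariance matrix
all.equal(Sigma.hat, cov(Y), check.attributes = FALSE)   # matches cov()
```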
Correlations are simply standardized covariances
In matrix notation let:
\[\hat{\mathbf{\Sigma}}=\begin{bmatrix} \sigma^{2}_{y_{1}} & \sigma_{y_{1}y_{2}} \\ \sigma_{y_{2}y_{1}} & \sigma^{2}_{y_{2}} \end{bmatrix}\]
\[\mathbf{V}=\begin{bmatrix} \sigma^{2}_{y_{1}} & 0 \\ 0 & \sigma^{2}_{y_{2}} \end{bmatrix}\]
Then: \[\mathbf{R}=\mathbf{V}^{-\frac{1}{2}}\hat{\mathbf{\Sigma}}\mathbf{V}^{-\frac{1}{2}}=\begin{bmatrix} 1 & r_{y_{1}y_{2}} \\ r_{y_{2}y_{1}} & 1 \end{bmatrix}\]
\[\small\mathbf{SSCP}=\begin{bmatrix} \mathbf{SS}_{y_{1}} & \mathbf{SC}_{y_{1}y_{2}} \\ \mathbf{SC}_{y_{2}y_{1}} & \mathbf{SS}_{y_{2}} \end{bmatrix} = \begin{bmatrix} 2997.23 & -925.99 \\ -925.99 & 371.48 \end{bmatrix}\]
\[\small\hat{\mathbf{\Sigma}}=\begin{bmatrix} \sigma^{2}_{y_{1}} & \sigma_{y_{1}y_{2}} \\ \sigma_{y_{2}y_{1}} & \sigma^{2}_{y_{2}} \end{bmatrix} = \begin{bmatrix} 5.167 & -1.596 \\ -1.596 & 0.640 \end{bmatrix}\]
\[\small\mathbf{R}=\begin{bmatrix} 1 & -0.877 \\ -0.877 & 1 \end{bmatrix}\]
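In R, this standardization can be verified directly (values taken from the example above; cov2cor is the base-R equivalent):

```r
# Correlation matrix from the covariance matrix (values from the example above)
S <- matrix(c(5.167, -1.596, -1.596, 0.640), 2, 2)
V.inv.sqrt <- diag(1 / sqrt(diag(S)))       # V^(-1/2)
V.inv.sqrt %*% S %*% V.inv.sqrt             # off-diagonals ~ -0.877
cov2cor(S)                                  # base-R equivalent
```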
Consider \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\) as matrices not vectors:
\[\mathbf{Y}=[\mathbf{Y}_{1}\mathbf{Y}_{2}]\]
We can still envision SSCP, \(\small\hat{\mathbf{\Sigma}}\), and \(\small\mathbf{R}\):
\[\hat{\mathbf{\Sigma}}=\begin{bmatrix} \mathbf{S}_{11} & \mathbf{S}_{12} \\ \mathbf{S}_{21} & \mathbf{S}_{22} \end{bmatrix}\]
\[\mathbf{R}=\begin{bmatrix} \mathbf{R}_{11} & \mathbf{R}_{12} \\ \mathbf{R}_{21} & \mathbf{R}_{22} \end{bmatrix}\]
\(\small\hat{\mathbf{\Sigma}}\) and \(\small\mathbf{R}\) now describe covariation and correlations between BLOCKS of variables
Dimensionality of \(\small\hat{\mathbf{\Sigma}}\) and \(\small\mathbf{R}\) determined by dimensions of \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)
\(\small\mathbf{Y}_{1}\) is \(\small{n} \times {p}_{1}\) dimensions
\(\small\mathbf{Y}_{2}\) is \(\small{n} \times {p}_{2}\) dimensions
Therefore \(\small\hat{\mathbf{\Sigma}}\) is \(\small({p}_{1}+{p}_{2}) \times ({p}_{1}+{p}_{2})\) dimensions
Blocks can have different numbers of variables (\(\small{p}_{1}\neq{p}_{2}\))
Different sub-blocks within \(\small\hat{\mathbf{\Sigma}}\) describe distinct components of trait covariation
\(\small\mathbf{S}_{11}\): covariation of variables in \(\small\mathbf{Y}_{1}\)
\(\small\mathbf{S}_{22}\): covariation of variables in \(\small\mathbf{Y}_{2}\)
\(\small\mathbf{S}_{21}=\mathbf{S}_{12}^{T}\): covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)
\(\small\mathbf{S}_{21}=\mathbf{S}_{12}^{T}\) is the multivariate equivalent of \(\small\sigma_{21}\)
Can we generalize an association measure from \(\small\mathbf{S}_{21}\)?
Recall our derivation of the correlation coefficient
\[\small{r}_{y_1,y_2}=\frac{1}{n-1}\sum_{i=1}^{n}\frac{\left(y_{i1}-\bar{y_1}\right)}{s_{1}}\frac{\left(y_{i2}-\bar{y_2}\right)}{s_{2}}\]
\[\small{r}_{y_1,y_2}=\frac{\sigma_{y_1y_2}}{\sigma_{y_1}\sigma_{y_2}}\]
The squared correlation is then:
\[\small{r}^{2}=\frac{\sigma^{2}_{y_1y_2}}{\sigma^{2}_{y_1}\sigma^{2}_{y_2}}\]
Analogously, for multivariate data one may consider:
\[\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}\]
Range of \(\small{RV}\): \(\small{0}\rightarrow{1}\)
The RV coefficient is analogous to \(\small{r}^{2}\) but it is not a strict mathematical generalization
\(\small{r}^{2}=\frac{\sigma^{2}_{y_1y_2}}{\sigma^{2}_{y_1}\sigma^{2}_{y_2}}\) VS. \(\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}\)
The numerator of \(\small{r}^{2}\) & \(\small{RV}\) describes the covariation between \(\small\mathbf{Y}_{1}\) & \(\small\mathbf{Y}_{2}\)
The denominator of \(\small{r}^{2}\) & \(\small{RV}\) describes variation within \(\small\mathbf{Y}_{1}\) & \(\small\mathbf{Y}_{2}\)
Thus, \(\small{RV}\) (like \(\small{r}^{2}\)) is a ratio of between-block relative to within-block variation
However, because each \(\small\mathbf{S}\) is a covariance matrix, the components of \(\small{RV}\) are squared variances and covariances, not variances as in \(\small{r}^{2}\): \(\tiny\text{hence, range of } {RV}={0}\rightarrow{1}\)
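A minimal sketch of the \(\small{RV}\) computation in R (the function name RV is illustrative; blocks are simulated):

```r
# RV coefficient (function name illustrative)
RV <- function(Y1, Y2) {
  S11 <- cov(Y1); S22 <- cov(Y2)
  S12 <- cov(Y1, Y2); S21 <- t(S12)     # between-block covariance
  sum(diag(S12 %*% S21)) /
    sqrt(sum(diag(S11 %*% S11)) * sum(diag(S22 %*% S22)))
}
set.seed(1)
RV(matrix(rnorm(50 * 3), 50, 3), matrix(rnorm(50 * 2), 50, 2))  # near 0: random data
```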
Pecos pupfish (shape obtained using geometric morphometrics)
Is there an association between head shape and body shape?
\[\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}=0.607\]
\[\small\sqrt{RV}=0.779\]
Conclusion: head shape and body shape appear to be correlated (associated). However, we still require a formal test of this summary measure!
Another way to summarize the covariation between blocks is via Partial Least Squares (PLS)
\(\small\mathbf{S}_{21}=\mathbf{S}_{12}^{T}\): expresses covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)
Decompose the information in \(\small\mathbf{S}_{12}\) to find the rotational solution (directions) that describes the greatest covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\), via singular value decomposition (SVD):
\[\small\mathbf{S}_{12}=\mathbf{U\Lambda{V}}^T\]
\(\small\mathbf{U}\) is the matrix of left singular vectors: linear combinations of the variables in \(\small\mathbf{Y}_{1}\) aligned in the direction of maximum covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)
\(\small\mathbf{V}\) is the matrix of right singular vectors: linear combinations of the variables in \(\small\mathbf{Y}_{2}\) aligned in the direction of maximum covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)
\(\small\mathbf{\Lambda}\) contains the singular values (\(\small\lambda\)), which describe the covariation between corresponding pairs of singular vectors in \(\small\mathbf{U}\) and \(\small\mathbf{V}\)
Note: \(\small\frac{\lambda^{2}_{1}}{\sum\lambda^{2}_{i}}\times100\) gives the percent covariation explained by the first pair of axes
Ordination scores are found by projecting the centered data onto the vectors \(\small\mathbf{U}\) and \(\small\mathbf{V}\):
\[\small\mathbf{P}_{1}=\mathbf{Y}_{1}\mathbf{U}\]
\[\small\mathbf{P}_{2}=\mathbf{Y}_{2}\mathbf{V}\]
The first columns of \(\small\mathbf{P}_{1}\) and \(\small\mathbf{P}_{2}\) describe the maximal covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)
The correlation between \(\small\mathbf{P}_{11}\) and \(\small\mathbf{P}_{21}\) is the PLS-correlation
\[\small{r}_{PLS}={cor}_{P_{11}P_{21}}\]
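A sketch of these PLS computations in R with simulated blocks (geomorph::two.b.pls, used below, implements the full analysis):

```r
# Two-block PLS via SVD of the cross-covariance matrix (simulated blocks)
set.seed(1)
Y1 <- scale(matrix(rnorm(50 * 3), 50, 3), scale = FALSE)  # centered block 1
Y2 <- scale(matrix(rnorm(50 * 2), 50, 2), scale = FALSE)  # centered block 2
sv <- svd(cov(Y1, Y2))            # S12 = U %*% diag(d) %*% t(V)
P1 <- Y1 %*% sv$u                 # projection scores, block 1
P2 <- Y2 %*% sv$v                 # projection scores, block 2
cor(P1[, 1], P2[, 1])             # r_PLS: correlation of first pair of axes
sv$d[1]^2 / sum(sv$d^2) * 100     # % covariation explained by first pair
```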
\(\tiny{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}=0.607\) and \(\tiny\sqrt{RV}=0.779\)
\(\small{r}_{PLS}={cor}_{P_{11}P_{21}}=0.917\)
## Data in either A1 or A2 do not have names. It is assumed data in both A1 and A2 are ordered the same.
##
## Call:
## two.b.pls(A1 = y, A2 = x, iter = 999, print.progress = FALSE)
##
## r-PLS: 0.917
##
## Effect Size (Z): 5.4019
##
## P-value: 0.001
##
## Based on 1000 random permutations
Again, head shape and body shape appear to be correlated, but we still require a formal test of this summary measure!
We now have two potential test measures of multivariate correlation
\[\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}\]
\[\small{r}_{PLS}={cor}_{P_{11}P_{21}}\]
Two questions emerge:
1: How do we assess significance of the test measures?
2: Is one approach preferable over the other?
We will use permutation methods (RRPP) for assessing significance
Test statistics:
\(\small\hat\rho=\sqrt{RV}\) and \(\small\hat\rho={r}_{PLS}\)
H0: \(\small\rho=0\)
H1: \(\small\rho>0\)
RRPP Approach:
1: Represent \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\) as deviations from mean (H0)
2: Estimate \(\small\hat\rho=\sqrt{RV}_{obs}\) and \(\small\hat\rho={r}_{PLS_{obs}}\)
3: Permute rows of \(\small\mathbf{Y}_{2}\), obtain \(\small\hat\rho=\sqrt{RV}_{rand}\) and \(\small\hat\rho={r}_{PLS_{rand}}\)
4: Repeat many times to generate sampling distribution
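A minimal sketch of this procedure for \(\small{r}_{PLS}\) in R (simulated blocks; the helper r.pls is illustrative):

```r
# RRPP-style permutation test for r_PLS (simulated blocks; r.pls illustrative)
set.seed(1)
Y1 <- matrix(rnorm(50 * 3), 50, 3)
Y2 <- matrix(rnorm(50 * 2), 50, 2)
r.pls <- function(Y1, Y2) {                  # r_PLS for two blocks
  sv <- svd(cov(Y1, Y2))
  drop(cor(Y1 %*% sv$u[, 1], Y2 %*% sv$v[, 1]))
}
r.obs  <- r.pls(Y1, Y2)                                      # step 2: observed
r.rand <- replicate(999, r.pls(Y1, Y2[sample(nrow(Y2)), ]))  # step 3: permute Y2 rows
mean(c(r.obs, r.rand) >= r.obs)                              # step 4: p-value
```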
For the pupfish dataset, both are significant at p = 0.001
Compare the permutation distributions with one another (minus the observed value, in this case)
All things considered, \(\small{r}_{PLS}\) performs better.
Because \(\small{RV}\) has a finite range of \(\small{0}\rightarrow{1}\), with 0 representing no covariation between blocks, it may seem intuitive to qualitatively compare \(\small{RV}\) values across datasets
THIS TEMPTATION SHOULD BE AVOIDED!
With random MVN data, \(\small{RV}\) varies with both n and p!
Direct comparisons of \(\small{RV}\) values are therefore not meaningful
Similar patterns are found with \({r}_{PLS}\):
However, conversion to effect sizes eliminates trends with n and p, allowing meaningful comparisons
where: \(\small\mathbf{Z}=\frac{r_{PLS_{obs}}-\mu_{r_{PLS_{rand}}}}{\sigma_{r_{PLS_{rand}}}}\)
Effect sizes then compared as: \(\small\mathbf{Z}_{12}=\frac{\vert{(r_{PLS1}-\mu_{r_{PLS1}})-(r_{PLS2}-\mu_{r_{PLS2}})}\vert}{\sqrt{\sigma^2_{r_{PLS1}}+\sigma^2_{r_{PLS2}}}}\)
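A sketch of these calculations, reusing r.obs and r.rand from the permutation sketch above (the dataset-specific quantities in the comparison are illustrative):

```r
# Effect size from the permutation distribution (uses r.obs, r.rand above)
Z <- (r.obs - mean(r.rand)) / sd(r.rand)
Z
# Comparing two effects (quantities for datasets 1 and 2 are illustrative):
# Z12 <- abs((r.obs1 - mean(r.rand1)) - (r.obs2 - mean(r.rand2))) /
#        sqrt(var(r.rand1) + var(r.rand2))
```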
Morphological Integration describes the covariation among sets of variables. Such patterns are often disrupted in organisms living in non-pristine habitats.
Mediterranean wall lizards (Podarcis) live in both rural and urban areas. The question, then, is whether patterns of association in cranial morphology differ between habitats, and between juvenile and adult individuals.
Levels of covariation (integration) are NOT different across habitats, but are lower in juveniles as compared to adults.
Various approaches to multivariate association
Mantel tests: associate distance matrices - high Type I error rates, low power, and bias
RV: strength of association - \(\small{RV}_{Null}\) varies with n & p
PLS: maximum covariation between sets of variables - can use the Z-score transform to compare across datasets
Other methods exist (e.g., canonical correlation: CCorA)
PLS is the most general approach, with the fewest mathematical constraints
Considers association of two matrices: \(\small\mathbf{X}\) & \(\small\mathbf{Y}\)
Provides a visualization (ordination) of Y~X (differing from PCA/PCoA in this respect)
Recall univariate multiple regression: \(\small{Y}=\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2}+\dots+\epsilon\)
Here, the predicted values \(\small\mathbf{\hat{Y}}\) represent the best ordination of \(\small\mathbf{Y}\) along the regression
This regression maximizes the \(\small{R}^2\) between \(\small\mathbf{Y}\) & \(\small\mathbf{\hat{Y}}\), and represents the optimal least squares relationship
Canonical analyses share this property for multivariate \(\small\mathbf{Y}\), and generate ordinations of \(\small\mathbf{Y}\) constrained by the maximal LS relationship to \(\small\mathbf{X}\)
They can be considered ‘regression-based’ methods for describing covariation between \(\small\mathbf{X}\) & \(\small\mathbf{Y}\): \(\small\mathbf{S}_{XY}\)
Several canonical ordination methods exist, including: Canonical Variates Analysis (CVA) / Discriminant Analysis (DA); Redundancy Analysis (RDA); Canonical Correspondence Analysis (CCA)
Ordination that maximally discriminates among known groups (g)
Variation expressed as ratio of between-group variation (\(\small\mathbf{A}\)) relative to within-group variation (\(\small\mathbf{V}\)): \(\small\mathbf{V}^{-1}\mathbf{A}\)
CVA is accomplished via the decomposition: \(\small\mathbf{V}^{-1}\mathbf{A}=\mathbf{C\Lambda{C}^T}\)
Normalized canonical axes then found as: \(\small\mathbf{C}^*=\mathbf{C(C^TVC)}^{-1/2}\)
Scores are then found through projection: \(\small\mathbf{Y}_{score}=\mathbf{(Y-\overline{Y})C^{*}}\)
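A sketch of these steps in R with simulated data (in practice, functions such as MASS::lda perform CVA/DA):

```r
# Sketch of CVA via eigen-decomposition (simulated data)
set.seed(1)
g <- factor(rep(1:3, each = 20))                        # three known groups
Y <- matrix(rnorm(60 * 2), 60, 2)
Y[g == "2", ] <- Y[g == "2", ] + 2                      # shift one group's mean
fit <- lm(Y ~ g)
A <- crossprod(scale(fitted(fit), scale = FALSE))       # between-group SSCP
V <- crossprod(resid(fit))                              # within-group SSCP
eig <- eigen(solve(V) %*% A)                            # V^-1 A = C Lambda C^T
C <- Re(eig$vectors[, 1:(nlevels(g) - 1)])              # canonical axes
# normalize: C* = C (C'VC)^(-1/2); C'VC is (near-)diagonal, so diag() suffices
Cstar <- C %*% diag(1 / sqrt(diag(t(C) %*% V %*% C)))
scores <- scale(Y, scale = FALSE) %*% Cstar             # (Y - Ybar) C*
```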
CVA suggests which groups differ on which variables. Within-group variation in CVA plot is circular
CVA rotates and shears the data space to a space of normalized canonical axes (group variation will be circular)
Classify objects (known or unknown) to groups (VERY useful)
Obtain Mahalanobis distance from objects to group means: \(\small\mathbf{D}_{Mahal}=\mathbf{(Y-\overline{Y})^TV^{-1}(Y-\overline{Y})}\)
Assign each object to the group to which it is closest (this weights distance by the variation in each dimension)
Determine mis-classification rates
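A sketch of this classification in R, continuing from the CVA sketch above (dividing the within-group SSCP \(\small\mathbf{V}\) by \(\small{n-g}\) gives the pooled within-group covariance):

```r
# Mahalanobis-distance classification (uses g, Y, V from the CVA sketch)
Vp <- V / (length(g) - nlevels(g))          # pooled within-group covariance
means <- apply(Y, 2, tapply, g, mean)       # group means (rows = groups)
D2 <- sapply(seq_len(nlevels(g)), function(k) {
  d <- sweep(Y, 2, means[k, ])              # deviations from group-k mean
  rowSums((d %*% solve(Vp)) * d)            # squared Mahalanobis distances
})
pred <- levels(g)[max.col(-D2)]             # assign each object to closest group
mean(pred != g)                             # mis-classification rate
```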
Pupfish (simplified data)
Left: Actual data (PCA). Right: CVA plot (distortion of actual dataspace)
CV axes are not orthogonal in the original dataspace
Linear discrimination forms a linear classification plane ONLY if the within-group covariances are identical (shown as 'equal probability' classification lines below)
Increasing the number of variables enhances PERCEIVED group differences, even when groups are perfectly overlapping!
RDA obtains an ordination from an eigen-decomposition of the covariance matrix of the predicted values \(\small\hat{\mathbf{Y}}\) from the multivariate regression of \(\small\mathbf{Y}\) on \(\small\mathbf{X}\):
\[\small\mathbf{S}_{\hat{\mathbf{Y}}^T\hat{\mathbf{Y}}}=\frac{1}{N-1}{\hat{\mathbf{Y}}^T\hat{\mathbf{Y}}}\]
\[\small\mathbf{S}_{\hat{\mathbf{Y}}^T\hat{\mathbf{Y}}}=\mathbf{C\Lambda{C}^T}\]
Thus, RDA preserves Euclidean distances of objects in space of predicted values \(\small\hat{\mathbf{Y}}\) (appropriate for continuous \(\small\mathbf{Y}\) variables)
RDA provides ordination of \(\small\mathbf{Y}\) as maximally described by \(\small\mathbf{X}\). Thus, one can think of the RDA ordination as being ‘standardized’ for the regression Y~X
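A sketch of this idea in R with simulated data: fit the multivariate regression, then run a PCA on the predicted values (vegan::rda implements RDA in full):

```r
# RDA sketch: PCA of predicted values from the multivariate regression Y ~ X
set.seed(1)
X <- matrix(rnorm(40 * 2), 40, 2)                       # predictors
B <- cbind(c(1, -1), c(0.5, 0.5))                       # true coefficients
Y <- X %*% B + matrix(rnorm(40 * 2, sd = 0.3), 40, 2)   # responses
Yhat <- fitted(lm(Y ~ X))                               # predicted values
rda.scores <- prcomp(Yhat)$x                            # RDA ordination scores
head(rda.scores)
```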
Several RDA ordinations of pupfish:
Y~size+species+sex+species:sex (common-slope MANCOVA)
Y~size*species*sex (separate-slopes MANCOVA)
Y~species*sex (factorial MANOVA)
Plots are quite different depending on the model!
RDA should serve as a visual guide for interpreting patterns, RELATIVE to a particular model. Decisions should thus be based on evaluating the models first (not just plotting the RDA ordination).
For the pupfish data, evaluate full MANCOVA, then visualize from reduced model:
## Df SS MS Rsq F Z Pr(>F)
## SL 1 632.53 632.53 0.91317 1326.4627 9.3185 0.001 **
## species 1 20.03 20.03 0.02892 42.0070 4.8100 0.001 **
## sex 1 7.33 7.33 0.01059 15.3816 4.2227 0.001 **
## SL:species 1 6.18 6.18 0.00892 12.9539 3.6882 0.001 **
## SL:sex 1 0.65 0.65 0.00094 1.3679 0.6042 0.282
## species:sex 1 0.35 0.35 0.00050 0.7318 -0.0435 0.516
## SL:species:sex 1 0.81 0.81 0.00117 1.6941 0.9719 0.160
## Residuals 52 24.80 0.48 0.03580
## Total 59 692.68
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Visualizing variation in residuals from a linear model is quite useful
But it is easy to mis-interpret (wittingly or unwittingly): if the wrong model Y~X is used, interpreting patterns in the RDA ordination is not helpful
Always consider RDA ordination in the context of visualizing patterns in original dataspace via PCA/PCoA
PCA provides visualization of ‘raw’ dataspace, RDA provides visualization of predicted values from a model (both are useful if done correctly).