guppies = read.csv("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter02/chap02e3bGuppyFatherSonAttractiveness.csv")
summary(guppies)
## fatherOrnamentation sonAttractiveness
## Min. :0.0300 Min. :-0.3200
## 1st Qu.:0.3275 1st Qu.: 0.1800
## Median :0.4550 Median : 0.4500
## Mean :0.4908 Mean : 0.4872
## 3rd Qu.:0.6100 3rd Qu.: 0.8025
## Max. :1.1200 Max. : 1.2300
Is there a relationship between the father’s ornamentation and the son’s attractiveness? Both variables are numeric so we could plot the data and then run a correlation.
plot(guppies$fatherOrnamentation, guppies$sonAttractiveness, pch = 16, xlab = "Father's Ornamentation", ylab = "Son's Attractiveness", col = "tomato4")
library(car)
scatterplot(sonAttractiveness ~ fatherOrnamentation, data = guppies, ellipse = TRUE, levels = c(0.5, 0.95), reg.line = FALSE, smoother = FALSE, xlim = c(0, 1.2), ylim = c(-0.4, 1.5), xlab = "Father's Ornamentation", ylab = "Son's Attractiveness")
## Warning in plot.window(...): "levels" is not a graphical parameter
## Warning in plot.window(...): "reg.line" is not a graphical parameter
## Warning in plot.window(...): "smoother" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "levels" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "reg.line" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "smoother" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "levels" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "reg.line" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "smoother" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "levels" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "reg.line" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "smoother" is
## not a graphical parameter
## Warning in box(...): "levels" is not a graphical parameter
## Warning in box(...): "reg.line" is not a graphical parameter
## Warning in box(...): "smoother" is not a graphical parameter
## Warning in title(...): "levels" is not a graphical parameter
## Warning in title(...): "reg.line" is not a graphical parameter
## Warning in title(...): "smoother" is not a graphical parameter
Note the presence of outliers. Is this sufficient to reject the assumption of bivariate normality? Let us test this.
library(MVN)
mardiaTest(guppies, qqplot = FALSE)
## Warning: 'mardiaTest' is deprecated.
## Use 'mvn' instead.
## See help("Deprecated")
In this case we can stick with the Pearson correlation but if the p-value were small enough Spearman would have been appropriate.
cor(guppies, use = "complete.obs", method = "pearson")
## fatherOrnamentation sonAttractiveness
## fatherOrnamentation 1.0000000 0.6141043
## sonAttractiveness 0.6141043 1.0000000
As father’s ornamentation increases, so does the son’s attractiveness. The Spearman correlation is positive and estimated to be 0.6141.
H0: Son’s attractiveness is not correlated with father’s ornamentation \((\rho = 0)\)
HA: Son’s attractiveness is correlated with father’s ornamentation \((\rho \neq 0)\)
cor.test(guppies$sonAttractiveness, guppies$fatherOrnamentation)
##
## Pearson's product-moment correlation
##
## data: guppies$sonAttractiveness and guppies$fatherOrnamentation
## t = 4.5371, df = 34, p-value = 6.784e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3577455 0.7843860
## sample estimates:
## cor
## 0.6141043
The p-value is quite small so we can reject H0; there appears to be a significant positive correlation between father’s ornamentation and son’s attractiveness.
When one or both measures is not numeric and/or one or more measures is non-normally distributed, Pearson’s correlation coefficient is inappropriate. Rather, Spearman’s rank-order correlation is used. We see this in action here with the Indian Rope trick.
rope = read.csv("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter16/chap16e5IndianRopeTrick.csv")
summary(rope)
## years impressivenessScore
## Min. : 2.00 Min. :1.000
## 1st Qu.:17.00 1st Qu.:2.000
## Median :28.00 Median :4.000
## Mean :27.29 Mean :3.429
## 3rd Qu.:39.00 3rd Qu.:4.000
## Max. :50.00 Max. :5.000
plot(rope$years, rope$impressivenessScore, pch=16, xlab = "Time Lapsed (in years) ", ylab = "Impressiveness Score", col = "cornflowerblue")
mardiaTest(rope, qqplot = FALSE)
## Warning: 'mardiaTest' is deprecated.
## Use 'mvn' instead.
## See help("Deprecated")
The multivariate normality test shows no violations so we could stick with Pearson if we wanted to. Since the text uses these data for Spearman’s I’ll show you that as well.
cor.test(rope$impressivenessScore, rope$years)
##
## Pearson's product-moment correlation
##
## data: rope$impressivenessScore and rope$years
## t = 6.9679, df = 19, p-value = 1.223e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.6562731 0.9366690
## sample estimates:
## cor
## 0.84778
cor.test(rope$impressivenessScore, rope$years, method = "spearm")
## Warning in cor.test.default(rope$impressivenessScore, rope$years, method =
## "spearm"): Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: rope$impressivenessScore and rope$years
## S = 332.12, p-value = 2.571e-05
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.7843363
Ties prevent the calculation of exact p-values so like in the past, we can switch to another package. Here I’ll use the coin
library.
library(coin)
spearman_test(impressivenessScore ~ years, data = rope)
##
## Asymptotic Spearman Correlation Test
##
## data: impressivenessScore by years
## Z = 3.5077, p-value = 0.0004521
## alternative hypothesis: true rho is not equal to 0
The p-value is 0.0004 so we can reject the NULL; there appears to be a positive association between the impressiveness of eyewitness accounts and the years lapsed since the account was penned. Note also that regardless of whether we go with Pearson’s correlation or Spearman’s, we see a significant positive correlation between time lapsed and the impressiveness score.
Birds of many species retain the same breeding partner year after year. In some of these species, male and female partners migrate separately and spend the winter in different places, often thousands of kilometers apart. Yet they manage to find one another again each spring. In a field study of individually banded pairs of black-tailed godwits, Gunnarsson et al. (2004) recorded spring arrival dates of males and females on the breeding grounds in the year after they were observed breeding together. The data for 10 pairs are provided in the data, with arrival date measured in days since March 31.
birds = read.csv("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter16/chap16q05GodwitArrivalDates.csv")
summary(birds)
## femaleDate maleDate
## Min. :24.00 Min. :22.00
## 1st Qu.:35.25 1st Qu.:37.25
## Median :44.00 Median :48.00
## Mean :45.50 Mean :45.80
## 3rd Qu.:55.75 3rd Qu.:55.75
## Max. :69.00 Max. :59.00
plot(birds$femaleDate, birds$maleDate, pch = 16, xlab = "Female Arrival Date", ylab = "Male Arrival Date")
library(MVN)
mardiaTest(birds, qqplot = TRUE)
## Warning: 'mardiaTest' is deprecated.
## Use 'mvn' instead.
## See help("Deprecated")
mardiaTest(birds, qqplot = FALSE)
## Warning: 'mardiaTest' is deprecated.
## Use 'mvn' instead.
## See help("Deprecated")
library(car)
scatterplot(femaleDate ~ maleDate, data = birds, ellipse = TRUE, levels = c(0.5, 0.95), reg.line = TRUE, smoother = TRUE, xlim = c(0, 100), ylim = c(0, 100))
## Warning in plot.window(...): "levels" is not a graphical parameter
## Warning in plot.window(...): "reg.line" is not a graphical parameter
## Warning in plot.window(...): "smoother" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "levels" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "reg.line" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "smoother" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "levels" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "reg.line" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "smoother" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "levels" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "reg.line" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "smoother" is
## not a graphical parameter
## Warning in box(...): "levels" is not a graphical parameter
## Warning in box(...): "reg.line" is not a graphical parameter
## Warning in box(...): "smoother" is not a graphical parameter
## Warning in title(...): "levels" is not a graphical parameter
## Warning in title(...): "reg.line" is not a graphical parameter
## Warning in title(...): "smoother" is not a graphical parameter
cor.test(birds$femaleDate, birds$maleDate)
##
## Pearson's product-moment correlation
##
## data: birds$femaleDate and birds$maleDate
## t = 7.0144, df = 8, p-value = 0.000111
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7157944 0.9830331
## sample estimates:
## cor
## 0.9274395
Left-handed people have an advantage in many sports, and it has been suggested that left-handedness might have been an advantage in hand-to-hand combat in early societies. To explore this potential advantage, reearchers compared the frequency of left-handed individuals in societies with measures of the level of violence in these societies. These data are given below:
left = read.csv("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter16/chap16q13LefthandednessAndViolence.csv")
summary(left)
## percentLeft murder
## Min. : 3.50 Min. :0.0100
## 1st Qu.: 6.55 1st Qu.:0.0275
## Median : 9.20 Median :0.3350
## Mean :11.38 Mean :1.6375
## 3rd Qu.:14.85 3rd Qu.:3.2600
## Max. :22.70 Max. :5.3700
mardiaTest(left)
## Warning: 'mardiaTest' is deprecated.
## Use 'mvn' instead.
## See help("Deprecated")
Plot these data. Does the plot indicate any association between left-handedness and propensity for violence?
Use an appropriate method to test for a correlation and report your conclusion.