{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Homework 3 notes\n",
"\n",
"---\n",
"\n",
"In these notes we are using a lot of Tidyverse functions. However, they are\n",
"not essential for this homework, and merely reflect a stylistic preference."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Cauline leaf number (50%)\n",
"\n",
"We read in the data using scripts we have used before. We also load the tidyverse package."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"── \u001b[1mAttaching core tidyverse packages\u001b[22m ────────────────────────────────── tidyverse 2.0.0 ──\n",
"\u001b[32m✔\u001b[39m \u001b[34mdplyr \u001b[39m 1.1.4 \u001b[32m✔\u001b[39m \u001b[34mreadr \u001b[39m 2.1.5\n",
"\u001b[32m✔\u001b[39m \u001b[34mforcats \u001b[39m 1.0.0 \u001b[32m✔\u001b[39m \u001b[34mstringr \u001b[39m 1.5.1\n",
"\u001b[32m✔\u001b[39m \u001b[34mggplot2 \u001b[39m 3.4.4 \u001b[32m✔\u001b[39m \u001b[34mtibble \u001b[39m 3.2.1\n",
"\u001b[32m✔\u001b[39m \u001b[34mlubridate\u001b[39m 1.9.3 \u001b[32m✔\u001b[39m \u001b[34mtidyr \u001b[39m 1.3.1\n",
"\u001b[32m✔\u001b[39m \u001b[34mpurrr \u001b[39m 1.0.2 \n",
"── \u001b[1mConflicts\u001b[22m ──────────────────────────────────────────────────── tidyverse_conflicts() ──\n",
"\u001b[31m✖\u001b[39m \u001b[34mdplyr\u001b[39m::\u001b[32mfilter()\u001b[39m masks \u001b[34mstats\u001b[39m::filter()\n",
"\u001b[31m✖\u001b[39m \u001b[34mdplyr\u001b[39m::\u001b[32mlag()\u001b[39m masks \u001b[34mstats\u001b[39m::lag()\n",
"\u001b[36mℹ\u001b[39m Use the conflicted package (\u001b[3m\u001b[34m\u001b[39m\u001b[23m) to force all conflicts to become errors\n"
]
},
{
"data": {
"text/html": [
"
\n",
"A data.frame: 6 × 21\n",
"\n",
"\t | Genotype | Background | Background.simple | Treatment | Treatment.V | Chamber.ID | Chamber.Irradiance | Vernalization | Daylength | Temperature | ⋯ | Survival.Bolt | Bolt | Days.to.Bolt | Days.to.Flower | Rosette.leaf.num | Cauline.leaf.num | Blade.length.mm | Total.leaf.length.mm | Blade.ratio | Notes |
\n",
"\t | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <chr> | <int> | <int> | ⋯ | <chr> | <chr> | <int> | <int> | <int> | <int> | <dbl> | <dbl> | <dbl> | <chr> |
\n",
"\n",
"\n",
"\t1 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 7B | Low | NV | 16 | 12 | ⋯ | Y | Y | 35 | 48 | 18 | 7 | 15.6 | 29.7 | 0.5252525 | |
\n",
"\t2 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 7B | Low | NV | 16 | 12 | ⋯ | Y | Y | 35 | 48 | 17 | 5 | 16.4 | 32.9 | 0.4984802 | |
\n",
"\t3 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 4T | High | NV | 16 | 12 | ⋯ | Y | Y | 36 | 48 | 22 | 6 | 10.6 | 20.4 | 0.5196078 | |
\n",
"\t4 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 4T | High | NV | 16 | 12 | ⋯ | Y | Y | 36 | 47 | 18 | 5 | 15.9 | 25.6 | 0.6210938 | |
\n",
"\t5 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 4T | High | NV | 16 | 12 | ⋯ | Y | Y | 38 | 50 | 24 | 8 | 14.4 | 26.4 | 0.5454545 | |
\n",
"\t6 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 7B | Low | NV | 16 | 12 | ⋯ | Y | Y | 39 | 53 | 20 | 7 | 15.9 | 29.7 | 0.5353535 | |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 6 × 21\n",
"\\begin{tabular}{r|lllllllllllllllllllll}\n",
" & Genotype & Background & Background.simple & Treatment & Treatment.V & Chamber.ID & Chamber.Irradiance & Vernalization & Daylength & Temperature & ⋯ & Survival.Bolt & Bolt & Days.to.Bolt & Days.to.Flower & Rosette.leaf.num & Cauline.leaf.num & Blade.length.mm & Total.leaf.length.mm & Blade.ratio & Notes\\\\\n",
" & & & & & & & & & & & ⋯ & & & & & & & & & & \\\\\n",
"\\hline\n",
"\t1 & agl24-1 & Col & Col & 12ConLD & 12ConLDNV & 7B & Low & NV & 16 & 12 & ⋯ & Y & Y & 35 & 48 & 18 & 7 & 15.6 & 29.7 & 0.5252525 & \\\\\n",
"\t2 & agl24-1 & Col & Col & 12ConLD & 12ConLDNV & 7B & Low & NV & 16 & 12 & ⋯ & Y & Y & 35 & 48 & 17 & 5 & 16.4 & 32.9 & 0.4984802 & \\\\\n",
"\t3 & agl24-1 & Col & Col & 12ConLD & 12ConLDNV & 4T & High & NV & 16 & 12 & ⋯ & Y & Y & 36 & 48 & 22 & 6 & 10.6 & 20.4 & 0.5196078 & \\\\\n",
"\t4 & agl24-1 & Col & Col & 12ConLD & 12ConLDNV & 4T & High & NV & 16 & 12 & ⋯ & Y & Y & 36 & 47 & 18 & 5 & 15.9 & 25.6 & 0.6210938 & \\\\\n",
"\t5 & agl24-1 & Col & Col & 12ConLD & 12ConLDNV & 4T & High & NV & 16 & 12 & ⋯ & Y & Y & 38 & 50 & 24 & 8 & 14.4 & 26.4 & 0.5454545 & \\\\\n",
"\t6 & agl24-1 & Col & Col & 12ConLD & 12ConLDNV & 7B & Low & NV & 16 & 12 & ⋯ & Y & Y & 39 & 53 & 20 & 7 & 15.9 & 29.7 & 0.5353535 & \\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 21\n",
"\n",
"| | Genotype <chr> | Background <chr> | Background.simple <chr> | Treatment <chr> | Treatment.V <chr> | Chamber.ID <chr> | Chamber.Irradiance <chr> | Vernalization <chr> | Daylength <int> | Temperature <int> | ⋯ ⋯ | Survival.Bolt <chr> | Bolt <chr> | Days.to.Bolt <int> | Days.to.Flower <int> | Rosette.leaf.num <int> | Cauline.leaf.num <int> | Blade.length.mm <dbl> | Total.leaf.length.mm <dbl> | Blade.ratio <dbl> | Notes <chr> |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 1 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 7B | Low | NV | 16 | 12 | ⋯ | Y | Y | 35 | 48 | 18 | 7 | 15.6 | 29.7 | 0.5252525 | |\n",
"| 2 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 7B | Low | NV | 16 | 12 | ⋯ | Y | Y | 35 | 48 | 17 | 5 | 16.4 | 32.9 | 0.4984802 | |\n",
"| 3 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 4T | High | NV | 16 | 12 | ⋯ | Y | Y | 36 | 48 | 22 | 6 | 10.6 | 20.4 | 0.5196078 | |\n",
"| 4 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 4T | High | NV | 16 | 12 | ⋯ | Y | Y | 36 | 47 | 18 | 5 | 15.9 | 25.6 | 0.6210938 | |\n",
"| 5 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 4T | High | NV | 16 | 12 | ⋯ | Y | Y | 38 | 50 | 24 | 8 | 14.4 | 26.4 | 0.5454545 | |\n",
"| 6 | agl24-1 | Col | Col | 12ConLD | 12ConLDNV | 7B | Low | NV | 16 | 12 | ⋯ | Y | Y | 39 | 53 | 20 | 7 | 15.9 | 29.7 | 0.5353535 | |\n",
"\n"
],
"text/plain": [
" Genotype Background Background.simple Treatment Treatment.V Chamber.ID\n",
"1 agl24-1 Col Col 12ConLD 12ConLDNV 7B \n",
"2 agl24-1 Col Col 12ConLD 12ConLDNV 7B \n",
"3 agl24-1 Col Col 12ConLD 12ConLDNV 4T \n",
"4 agl24-1 Col Col 12ConLD 12ConLDNV 4T \n",
"5 agl24-1 Col Col 12ConLD 12ConLDNV 4T \n",
"6 agl24-1 Col Col 12ConLD 12ConLDNV 7B \n",
" Chamber.Irradiance Vernalization Daylength Temperature ⋯ Survival.Bolt Bolt\n",
"1 Low NV 16 12 ⋯ Y Y \n",
"2 Low NV 16 12 ⋯ Y Y \n",
"3 High NV 16 12 ⋯ Y Y \n",
"4 High NV 16 12 ⋯ Y Y \n",
"5 High NV 16 12 ⋯ Y Y \n",
"6 Low NV 16 12 ⋯ Y Y \n",
" Days.to.Bolt Days.to.Flower Rosette.leaf.num Cauline.leaf.num Blade.length.mm\n",
"1 35 48 18 7 15.6 \n",
"2 35 48 17 5 16.4 \n",
"3 36 48 22 6 10.6 \n",
"4 36 47 18 5 15.9 \n",
"5 38 50 24 8 14.4 \n",
"6 39 53 20 7 15.9 \n",
" Total.leaf.length.mm Blade.ratio Notes\n",
"1 29.7 0.5252525 \n",
"2 32.9 0.4984802 \n",
"3 20.4 0.5196078 \n",
"4 25.6 0.6210938 \n",
"5 26.4 0.5454545 \n",
"6 29.7 0.5353535 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# downloading data file from Dryad\n",
"dryadFileDownload <- function(filenum,filename,baseurl=\"https://datadryad.org/api/v2\")\n",
"{\n",
" download.file(paste(baseurl,\"/files/\",filenum,\"/download\",sep=\"\"),filename,mode=\"wb\")\n",
"}\n",
"# load tidyverse\n",
"library(tidyverse)\n",
"# read in the dataset and show the top few lines\n",
"vernFileID <- \"22636\"\n",
"tmpfile <- tempfile(fileext=\"txt\")\n",
"dryadFileDownload(vernFileID,tmpfile)\n",
"vern <- read.table(tmpfile)\n",
"head(vern)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### a. Select data for two genotypes and calculate the mean cauline leaf number for both genotypes in both vernalization conditions.\n",
"\n",
"We select the data for the `hua2-3` and `hua2-3 FRI` genotypes."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Genotype Vernalization Cauline.leaf.num\n",
" Length:3197 Length:3197 Min. : 1.000 \n",
" Class :character Class :character 1st Qu.: 4.000 \n",
" Mode :character Mode :character Median : 7.000 \n",
" Mean : 6.474 \n",
" 3rd Qu.: 9.000 \n",
" Max. :29.000 \n",
" NA's :120 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# select the three columns of interest: genotype, vernalization, and phenotype\n",
"hua <- select(vern,Genotype,Vernalization,Cauline.leaf.num)\n",
"summary(hua)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Genotype Vernalization Cauline.leaf.num\n",
" Length:270 Length:270 Min. : 1.000 \n",
" Class :character Class :character 1st Qu.: 3.000 \n",
" Mode :character Mode :character Median : 5.000 \n",
" Mean : 5.801 \n",
" 3rd Qu.: 8.000 \n",
" Max. :16.000 \n",
" NA's :3 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# filter only observations with `hua2-3` and `hua2-3 FRI` genotypes.\n",
"hua <- filter(hua, (Genotype == \"hua2-3\") | (Genotype == \"hua2-3 FRI\"))\n",
"summary(hua)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We update factor levels after filtering the data frame, so that the levels\n",
"we have dropped do not appear in the data."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Genotype Vernalization Cauline.leaf.num\n",
" Length:270 Length:270 Min. : 1.000 \n",
" Class :character Class :character 1st Qu.: 3.000 \n",
" Mode :character Mode :character Median : 5.000 \n",
" Mean : 5.801 \n",
" 3rd Qu.: 8.000 \n",
" Max. :16.000 \n",
" NA's :3 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# update factor levels\n",
"hua <- droplevels(hua)\n",
"summary(hua)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calculate the mean cauline leaf number for both genotypes in both vernalization conditions. We also calculate the sample sizes (number of non-missing observations) in each of the four categories. Note that they are not exactly balanced."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[1m\u001b[22m`summarise()` has grouped output by 'Genotype'. You can override using the `.groups`\n",
"argument.\n"
]
},
{
"data": {
"text/html": [
"\n",
"A grouped_df: 4 × 4\n",
"\n",
"\tGenotype | Vernalization | meancauline | n |
\n",
"\t<chr> | <chr> | <dbl> | <int> |
\n",
"\n",
"\n",
"\thua2-3 | NV | 5.720588 | 68 |
\n",
"\thua2-3 | V | 4.761194 | 67 |
\n",
"\thua2-3 FRI | NV | 6.453125 | 64 |
\n",
"\thua2-3 FRI | V | 6.294118 | 68 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A grouped\\_df: 4 × 4\n",
"\\begin{tabular}{llll}\n",
" Genotype & Vernalization & meancauline & n\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t hua2-3 & NV & 5.720588 & 68\\\\\n",
"\t hua2-3 & V & 4.761194 & 67\\\\\n",
"\t hua2-3 FRI & NV & 6.453125 & 64\\\\\n",
"\t hua2-3 FRI & V & 6.294118 & 68\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A grouped_df: 4 × 4\n",
"\n",
"| Genotype <chr> | Vernalization <chr> | meancauline <dbl> | n <int> |\n",
"|---|---|---|---|\n",
"| hua2-3 | NV | 5.720588 | 68 |\n",
"| hua2-3 | V | 4.761194 | 67 |\n",
"| hua2-3 FRI | NV | 6.453125 | 64 |\n",
"| hua2-3 FRI | V | 6.294118 | 68 |\n",
"\n"
],
"text/plain": [
" Genotype Vernalization meancauline n \n",
"1 hua2-3 NV 5.720588 68\n",
"2 hua2-3 V 4.761194 67\n",
"3 hua2-3 FRI NV 6.453125 64\n",
"4 hua2-3 FRI V 6.294118 68"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# calculate means grouping by genotype and vernalization\n",
"huameans <- group_by(hua, Genotype, Vernalization) %>% \n",
" summarise(meancauline = mean(Cauline.leaf.num, na.rm = T), \n",
" n = sum(!is.na(Cauline.leaf.num)))\n",
"huameans"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"-0.159007"
],
"text/latex": [
"-0.159007"
],
"text/markdown": [
"-0.159007"
],
"text/plain": [
"[1] -0.159007"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"6.294118 - 6.453125 # vern effect in FRI"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"-0.959394000000001"
],
"text/latex": [
"-0.959394000000001"
],
"text/markdown": [
"-0.959394000000001"
],
"text/plain": [
"[1] -0.959394"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"4.761194 - 5.720588 # vern effect in hua2-3"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0.800387000000001"
],
"text/latex": [
"0.800387000000001"
],
"text/markdown": [
"0.800387000000001"
],
"text/plain": [
"[1] 0.800387"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"-0.159007 - (-0.959394000000001)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0.732537"
],
"text/latex": [
"0.732537"
],
"text/markdown": [
"0.732537"
],
"text/plain": [
"[1] 0.732537"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"6.453125 - 5.720588\t"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### b. Calculate main effect of FRI genotype.\n",
"\n",
"The main effect is defined as the difference in the means of the two genotype categories. In this case, since we have genotype means calculated for two vernalization conditions, we have to use a weighted mean (weighted by the number of observations)."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A tibble: 2 × 2\n",
"\n",
"\tGenotype | meancauline |
\n",
"\t<chr> | <dbl> |
\n",
"\n",
"\n",
"\thua2-3 | 5.244444 |
\n",
"\thua2-3 FRI | 6.371212 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A tibble: 2 × 2\n",
"\\begin{tabular}{ll}\n",
" Genotype & meancauline\\\\\n",
" & \\\\\n",
"\\hline\n",
"\t hua2-3 & 5.244444\\\\\n",
"\t hua2-3 FRI & 6.371212\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 2 × 2\n",
"\n",
"| Genotype <chr> | meancauline <dbl> |\n",
"|---|---|\n",
"| hua2-3 | 5.244444 |\n",
"| hua2-3 FRI | 6.371212 |\n",
"\n"
],
"text/plain": [
" Genotype meancauline\n",
"1 hua2-3 5.244444 \n",
"2 hua2-3 FRI 6.371212 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# group the previous dataframe by genotype\n",
"# calculate a weighted mean since the two vernalization groups\n",
"# have difference sample sizes\n",
"huagenomeans <- group_by(huameans,Genotype) %>% \n",
" summarize(meancauline = weighted.mean(meancauline, w = n ))\n",
"huagenomeans"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1.1268"
],
"text/latex": [
"1.1268"
],
"text/markdown": [
"1.1268"
],
"text/plain": [
"[1] 1.1268"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# the main effect is the difference\n",
"huagenomeans$meancauline |> diff() |> round(4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We could have used the direct definition, as the difference in genotype means as follows. In that case we don't have to use a weighted mean, as we are ignoring vernalization altogether."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A tibble: 2 × 2\n",
"\n",
"\tGenotype | meancauline |
\n",
"\t<chr> | <dbl> |
\n",
"\n",
"\n",
"\thua2-3 | 5.244444 |
\n",
"\thua2-3 FRI | 6.371212 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A tibble: 2 × 2\n",
"\\begin{tabular}{ll}\n",
" Genotype & meancauline\\\\\n",
" & \\\\\n",
"\\hline\n",
"\t hua2-3 & 5.244444\\\\\n",
"\t hua2-3 FRI & 6.371212\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 2 × 2\n",
"\n",
"| Genotype <chr> | meancauline <dbl> |\n",
"|---|---|\n",
"| hua2-3 | 5.244444 |\n",
"| hua2-3 FRI | 6.371212 |\n",
"\n"
],
"text/plain": [
" Genotype meancauline\n",
"1 hua2-3 5.244444 \n",
"2 hua2-3 FRI 6.371212 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# the main effect calulated from the hua data\n",
"group_by(hua,Genotype) %>% \n",
" summarize(meancauline = mean(Cauline.leaf.num,na.rm=T))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### c. Calculate interaction effect of genotype and vernalization.\n",
"\n",
"We use the the definition of an interaction effect. The interaction term is the difference in the main effect of vernalization calculated in two genotype categories."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0.8004"
],
"text/latex": [
"0.8004"
],
"text/markdown": [
"0.8004"
],
"text/plain": [
"[1] 0.8004"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# interaction term is the difference in the vernalization main effects stratified by genotype\n",
"# main effect of vernalization in FRI\n",
"( (huameans$meancauline[4] - huameans$meancauline[3]) - \n",
"# main effect of vernalization in hua2-3\n",
" (huameans$meancauline[2] - huameans$meancauline[1]) ) |> \n",
" round(4) # round to 4 digits"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### d. Verify calculations using `lm`\n",
"\n",
"Note that the main effect is the same as what we calculated above."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lm(formula = Cauline.leaf.num ~ Genotype, data = hua)\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.3712 -2.3712 -0.3712 2.6288 9.6288 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 5.2444 0.2564 20.45 < 2e-16 ***\n",
"Genotypehua2-3 FRI 1.1268 0.3646 3.09 0.00221 ** \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.979 on 265 degrees of freedom\n",
" (3 observations deleted due to missingness)\n",
"Multiple R-squared: 0.03478,\tAdjusted R-squared: 0.03114 \n",
"F-statistic: 9.548 on 1 and 265 DF, p-value: 0.002215\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# main effect of genotype\n",
"lm(Cauline.leaf.num ~ Genotype, data = hua) |> summary()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lm(formula = Cauline.leaf.num ~ Vernalization, data = hua)\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-4.5333 -3.0758 -0.5333 2.4667 9.9242 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 6.0758 0.2629 23.115 <2e-16 ***\n",
"VernalizationV -0.5424 0.3697 -1.467 0.143 \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 3.02 on 265 degrees of freedom\n",
" (3 observations deleted due to missingness)\n",
"Multiple R-squared: 0.00806,\tAdjusted R-squared: 0.004316 \n",
"F-statistic: 2.153 on 1 and 265 DF, p-value: 0.1435\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# main effect of vernalization\n",
"lm(Cauline.leaf.num ~ Vernalization, data = hua) |> summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same is true for the interaction term."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lm(formula = Cauline.leaf.num ~ Genotype * Vernalization, data = hua)\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.2941 -2.7206 -0.2941 2.5469 9.5469 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 5.7206 0.3602 15.883 <2e-16 ***\n",
"Genotypehua2-3 FRI 0.7325 0.5172 1.416 0.1579 \n",
"VernalizationV -0.9594 0.5112 -1.877 0.0617 . \n",
"Genotypehua2-3 FRI:VernalizationV 0.8004 0.7273 1.101 0.2721 \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.97 on 263 degrees of freedom\n",
" (3 observations deleted due to missingness)\n",
"Multiple R-squared: 0.04787,\tAdjusted R-squared: 0.03701 \n",
"F-statistic: 4.408 on 3 and 263 DF, p-value: 0.004801\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# interaction effect of genotype and vernalization\n",
"lm(Cauline.leaf.num ~ Genotype * Vernalization, data = hua) |> summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice how the genotype and vernalization effects change when the interaction term is included."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using `lmabc`\n",
"\n",
"The `lmabc` package uses abundance weighted contrasts, i.e. a difference between the category mean from the overall (weighted) mean. This has the property that the value of the main effects does not change much, whether or not the interaction term is included."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"library(lmabc)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lmabc(formula = Cauline.leaf.num ~ Genotype, data = hua)\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.3712 -2.3712 -0.3712 2.6288 9.6288 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 5.8015 0.1823 31.82 < 2e-16 ***\n",
"Genotypehua2-3 -0.5571 0.1803 -3.09 0.00221 ** \n",
"Genotypehua2-3 FRI 0.5697 0.1844 3.09 0.00221 ** \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.979 on 265 degrees of freedom\n",
"Multiple R-squared: 0.03478,\tAdjusted R-squared: 0.03114 \n",
"F-statistic: 9.548 on 1 and 265 DF, p-value: 0.002215\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"lmabc(Cauline.leaf.num ~ Genotype, data = hua) |> summary()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lmabc(formula = Cauline.leaf.num ~ Vernalization, data = hua)\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-4.5333 -3.0758 -0.5333 2.4667 9.9242 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 5.8015 0.1848 31.390 <2e-16 ***\n",
"VernalizationNV 0.2743 0.1869 1.467 0.143 \n",
"VernalizationV -0.2682 0.1828 -1.467 0.143 \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 3.02 on 265 degrees of freedom\n",
"Multiple R-squared: 0.00806,\tAdjusted R-squared: 0.004316 \n",
"F-statistic: 2.153 on 1 and 265 DF, p-value: 0.1435\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"lmabc(Cauline.leaf.num ~ Vernalization, data = hua) |> summary()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lmabc(formula = Cauline.leaf.num ~ Genotype * Vernalization, \n",
" data = hua)\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.2941 -2.7206 -0.2941 2.5469 9.5469 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 5.8015 0.1818 31.919 < 2e-16 ***\n",
"Genotypehua2-3 -0.5623 0.1798 -3.128 0.00196 ** \n",
"Genotypehua2-3 FRI 0.5751 0.1838 3.128 0.00196 ** \n",
"VernalizationNV 0.2851 0.1838 1.551 0.12216 \n",
"VernalizationV -0.2788 0.1798 -1.551 0.12216 \n",
"Genotypehua2-3:VernalizationNV 0.1963 0.1784 1.101 0.27210 \n",
"Genotypehua2-3 FRI:VernalizationNV -0.2086 0.1895 -1.101 0.27210 \n",
"Genotypehua2-3:VernalizationV -0.1992 0.1810 -1.101 0.27210 \n",
"Genotypehua2-3 FRI:VernalizationV 0.1963 0.1784 1.101 0.27210 \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.97 on 263 degrees of freedom\n",
"Multiple R-squared: 0.04787,\tAdjusted R-squared: 0.03701 \n",
"F-statistic: 4.408 on 3 and 263 DF, p-value: 0.004801\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"lmabc(Cauline.leaf.num ~ Genotype*Vernalization, data = hua) |> summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Contrasts (50%)\n",
"\n",
"We select all six genotypes for this exercise, and the cauline leaf number trait. We check the numbers in each group which are approximately the same."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A tibble: 6 × 2\n",
"\n",
"\tGenotype | n |
\n",
"\t<fct> | <int> |
\n",
"\n",
"\n",
"\tCol Ama | 135 |
\n",
"\tCol FRI | 128 |
\n",
"\thua2-3 | 136 |
\n",
"\thua2-3 FRI | 134 |
\n",
"\tvin3-4 | 135 |
\n",
"\tvin3-4 FRI | 121 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A tibble: 6 × 2\n",
"\\begin{tabular}{ll}\n",
" Genotype & n\\\\\n",
" & \\\\\n",
"\\hline\n",
"\t Col Ama & 135\\\\\n",
"\t Col FRI & 128\\\\\n",
"\t hua2-3 & 136\\\\\n",
"\t hua2-3 FRI & 134\\\\\n",
"\t vin3-4 & 135\\\\\n",
"\t vin3-4 FRI & 121\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 6 × 2\n",
"\n",
"| Genotype <fct> | n <int> |\n",
"|---|---|\n",
"| Col Ama | 135 |\n",
"| Col FRI | 128 |\n",
"| hua2-3 | 136 |\n",
"| hua2-3 FRI | 134 |\n",
"| vin3-4 | 135 |\n",
"| vin3-4 FRI | 121 |\n",
"\n"
],
"text/plain": [
" Genotype n \n",
"1 Col Ama 135\n",
"2 Col FRI 128\n",
"3 hua2-3 136\n",
"4 hua2-3 FRI 134\n",
"5 vin3-4 135\n",
"6 vin3-4 FRI 121"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# filter by the six genotypes\n",
"vern1 <- filter(vern,(Genotype == \"hua2-3\") | (Genotype == \"hua2-3 FRI\")|\n",
" (Genotype == \"vin3-4\") | (Genotype == \"vin3-4 FRI\")|\n",
" (Genotype == \"Col Ama\")| (Genotype == \"Col FRI\")) %>% droplevels()\n",
"vern1$Genotype <- factor(vern1$Genotype)\n",
"# get counts by genotype\n",
"group_by(vern1,Genotype) %>% summarise(n=n())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### a. Calulate the means when plants are not vernalized\n",
"\n",
"We calculate the mean trait for all six genotypes when the plants vere not vernalized. First, we select the non-vernalized plants, and then calculate means for each genotype group."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A tibble: 6 × 3\n",
"\n",
"\tGenotype | meancauline | n |
\n",
"\t<fct> | <dbl> | <int> |
\n",
"\n",
"\n",
"\tCol Ama | 5.803030 | 66 |
\n",
"\tCol FRI | 8.176471 | 51 |
\n",
"\thua2-3 | 5.720588 | 68 |
\n",
"\thua2-3 FRI | 6.453125 | 64 |
\n",
"\tvin3-4 | 5.878788 | 66 |
\n",
"\tvin3-4 FRI | 7.729167 | 48 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A tibble: 6 × 3\n",
"\\begin{tabular}{lll}\n",
" Genotype & meancauline & n\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t Col Ama & 5.803030 & 66\\\\\n",
"\t Col FRI & 8.176471 & 51\\\\\n",
"\t hua2-3 & 5.720588 & 68\\\\\n",
"\t hua2-3 FRI & 6.453125 & 64\\\\\n",
"\t vin3-4 & 5.878788 & 66\\\\\n",
"\t vin3-4 FRI & 7.729167 & 48\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 6 × 3\n",
"\n",
"| Genotype <fct> | meancauline <dbl> | n <int> |\n",
"|---|---|---|\n",
"| Col Ama | 5.803030 | 66 |\n",
"| Col FRI | 8.176471 | 51 |\n",
"| hua2-3 | 5.720588 | 68 |\n",
"| hua2-3 FRI | 6.453125 | 64 |\n",
"| vin3-4 | 5.878788 | 66 |\n",
"| vin3-4 FRI | 7.729167 | 48 |\n",
"\n"
],
"text/plain": [
" Genotype meancauline n \n",
"1 Col Ama 5.803030 66\n",
"2 Col FRI 8.176471 51\n",
"3 hua2-3 5.720588 68\n",
"4 hua2-3 FRI 6.453125 64\n",
"5 vin3-4 5.878788 66\n",
"6 vin3-4 FRI 7.729167 48"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# filter by not vernalized\n",
"# group by genotype\n",
"# get mean cauline leaf number (make sure that NA's are removed)\n",
"cmeans <- filter(vern1, Vernalization == \"NV\") %>% \n",
" group_by(Genotype) %>% \n",
" summarise(meancauline = mean(Cauline.leaf.num,na.rm=T),\n",
" n = sum(!is.na(Cauline.leaf.num)))\n",
"cmeans"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### b. Calculate the main effect of genotype with the reference being the mean of all genotypes\n",
"\n",
"For this experiment (with six genotypes), there is no natural reference genotype, so we use the mean of all genotypes mean as the reference. Then we subtract each genotype mean from that number."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A tibble: 6 × 4\n",
"\n",
"\tGenotype | meancauline | n | maineffect |
\n",
"\t<fct> | <dbl> | <int> | <dbl> |
\n",
"\n",
"\n",
"\tCol Ama | 5.803030 | 66 | -0.8238311 |
\n",
"\tCol FRI | 8.176471 | 51 | 1.5496091 |
\n",
"\thua2-3 | 5.720588 | 68 | -0.9062732 |
\n",
"\thua2-3 FRI | 6.453125 | 64 | -0.1737364 |
\n",
"\tvin3-4 | 5.878788 | 66 | -0.7480736 |
\n",
"\tvin3-4 FRI | 7.729167 | 48 | 1.1023052 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A tibble: 6 × 4\n",
"\\begin{tabular}{llll}\n",
" Genotype & meancauline & n & maineffect\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t Col Ama & 5.803030 & 66 & -0.8238311\\\\\n",
"\t Col FRI & 8.176471 & 51 & 1.5496091\\\\\n",
"\t hua2-3 & 5.720588 & 68 & -0.9062732\\\\\n",
"\t hua2-3 FRI & 6.453125 & 64 & -0.1737364\\\\\n",
"\t vin3-4 & 5.878788 & 66 & -0.7480736\\\\\n",
"\t vin3-4 FRI & 7.729167 & 48 & 1.1023052\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 6 × 4\n",
"\n",
"| Genotype <fct> | meancauline <dbl> | n <int> | maineffect <dbl> |\n",
"|---|---|---|---|\n",
"| Col Ama | 5.803030 | 66 | -0.8238311 |\n",
"| Col FRI | 8.176471 | 51 | 1.5496091 |\n",
"| hua2-3 | 5.720588 | 68 | -0.9062732 |\n",
"| hua2-3 FRI | 6.453125 | 64 | -0.1737364 |\n",
"| vin3-4 | 5.878788 | 66 | -0.7480736 |\n",
"| vin3-4 FRI | 7.729167 | 48 | 1.1023052 |\n",
"\n"
],
"text/plain": [
" Genotype meancauline n maineffect\n",
"1 Col Ama 5.803030 66 -0.8238311\n",
"2 Col FRI 8.176471 51 1.5496091\n",
"3 hua2-3 5.720588 68 -0.9062732\n",
"4 hua2-3 FRI 6.453125 64 -0.1737364\n",
"5 vin3-4 5.878788 66 -0.7480736\n",
"6 vin3-4 FRI 7.729167 48 1.1023052"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# main effect with the sum contrast is the genotype mean minus mean of all genotype means\n",
"dfMainEffect <- mutate(cmeans, maineffect = meancauline-mean(meancauline))\n",
"dfMainEffect"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'Col Ama'
- 'Col FRI'
- 'hua2-3'
- 'hua2-3 FRI'
- 'vin3-4'
- 'vin3-4 FRI'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'Col Ama'\n",
"\\item 'Col FRI'\n",
"\\item 'hua2-3'\n",
"\\item 'hua2-3 FRI'\n",
"\\item 'vin3-4'\n",
"\\item 'vin3-4 FRI'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'Col Ama'\n",
"2. 'Col FRI'\n",
"3. 'hua2-3'\n",
"4. 'hua2-3 FRI'\n",
"5. 'vin3-4'\n",
"6. 'vin3-4 FRI'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"Col Ama\" \"Col FRI\" \"hua2-3\" \"hua2-3 FRI\" \"vin3-4\" \n",
"[6] \"vin3-4 FRI\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"levels(as.factor(vern1$Genotype))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the main effects sum to zero."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A tibble: 1 × 1\n",
"\n",
"\tsum |
\n",
"\t<dbl> |
\n",
"\n",
"\n",
"\t0 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A tibble: 1 × 1\n",
"\\begin{tabular}{l}\n",
" sum\\\\\n",
" \\\\\n",
"\\hline\n",
"\t 0\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 1 × 1\n",
"\n",
"| sum <dbl> |\n",
"|---|\n",
"| 0 |\n",
"\n"
],
"text/plain": [
" sum\n",
"1 0 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# the sum of the main effects is zero\n",
"mutate(cmeans,maineffect = meancauline - mean(meancauline)) %>% summarise(sum = sum(maineffect)) %>% round(5) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### c. Main effect calculated using `lm`\n",
"\n",
"We change our options to use `contr.sum` as the contrasts function. Then we use `lm` to calculate the main effect of each genotype. The values agree with what we get above. "
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"- unordered
- 'contr.treatment'
- ordered
- 'contr.poly'
\n"
],
"text/latex": [
"\\begin{description*}\n",
"\\item[unordered] 'contr.treatment'\n",
"\\item[ordered] 'contr.poly'\n",
"\\end{description*}\n"
],
"text/markdown": [
"unordered\n",
": 'contr.treatment'ordered\n",
": 'contr.poly'\n",
"\n"
],
"text/plain": [
" unordered ordered \n",
"\"contr.treatment\" \"contr.poly\" "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"options()$contrasts"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lm(formula = Cauline.leaf.num ~ Genotype, data = vern1[vern1$Vernalization == \n",
" \"NV\", ])\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.1765 -2.7206 -0.1765 2.2708 9.5469 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 6.6269 0.1458 45.444 < 2e-16 ***\n",
"Genotype1 -0.8238 0.3127 -2.635 0.00879 ** \n",
"Genotype2 1.5496 0.3468 4.468 1.06e-05 ***\n",
"Genotype3 -0.9063 0.3091 -2.932 0.00358 ** \n",
"Genotype4 -0.1737 0.3165 -0.549 0.58337 \n",
"Genotype5 -0.7481 0.3127 -2.392 0.01725 * \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.752 on 357 degrees of freedom\n",
" (25 observations deleted due to missingness)\n",
"Multiple R-squared: 0.1043,\tAdjusted R-squared: 0.09175 \n",
"F-statistic: 8.314 on 5 and 357 DF, p-value: 1.898e-07\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# change contrast option\n",
"options(contrasts=c(\"contr.sum\",\"contr.poly\"))\n",
"# use lm to get main effects\n",
"(out1 <- lm(Cauline.leaf.num ~ Genotype, data = vern1[vern1$Vernalization==\"NV\",])) %>% summary"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lmabc(formula = Cauline.leaf.num ~ Genotype, data = vern1[vern1$Vernalization == \n",
" \"NV\", ])\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.1765 -2.7206 -0.1765 2.2708 9.5469 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 6.50413 0.14445 45.028 < 2e-16 ***\n",
"GenotypeCol Ama -0.70110 0.30642 -2.288 0.02272 * \n",
"GenotypeCol FRI 1.67234 0.35727 4.681 4.06e-06 ***\n",
"Genotypehua2-3 -0.78354 0.30086 -2.604 0.00959 ** \n",
"Genotypehua2-3 FRI -0.05101 0.31222 -0.163 0.87032 \n",
"Genotypevin3-4 -0.62534 0.30642 -2.041 0.04200 * \n",
"Genotypevin3-4 FRI 1.22503 0.37004 3.311 0.00103 ** \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.752 on 357 degrees of freedom\n",
"Multiple R-squared: 0.1043,\tAdjusted R-squared: 0.09175 \n",
"F-statistic: 8.314 on 5 and 357 DF, p-value: 1.898e-07\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"lmabc(Cauline.leaf.num ~ Genotype, data = vern1[vern1$Vernalization==\"NV\",]) |> summary()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'contr.sum'
- 'contr.poly'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'contr.sum'\n",
"\\item 'contr.poly'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'contr.sum'\n",
"2. 'contr.poly'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"contr.sum\" \"contr.poly\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"options()$contrasts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" One genotype level is left out of the `lm` output. Since the sum of the main effects is zero, the one left out is the negative of the sum of the others."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 1 × 5 of type dbl\n",
"\n",
"\tGenotype1 | Genotype2 | Genotype3 | Genotype4 | Genotype5 |
\n",
"\n",
"\n",
"\t-0.8238 | 1.5496 | -0.9063 | -0.1737 | -0.7481 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 1 × 5 of type dbl\n",
"\\begin{tabular}{lllll}\n",
" Genotype1 & Genotype2 & Genotype3 & Genotype4 & Genotype5\\\\\n",
"\\hline\n",
"\t -0.8238 & 1.5496 & -0.9063 & -0.1737 & -0.7481\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 1 × 5 of type dbl\n",
"\n",
"| Genotype1 | Genotype2 | Genotype3 | Genotype4 | Genotype5 |\n",
"|---|---|---|---|---|\n",
"| -0.8238 | 1.5496 | -0.9063 | -0.1737 | -0.7481 |\n",
"\n"
],
"text/plain": [
" Genotype1 Genotype2 Genotype3 Genotype4 Genotype5\n",
"[1,] -0.8238 1.5496 -0.9063 -0.1737 -0.7481 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# all but one main effect\n",
"(meff <- t(coef(out1)[-1])) %>% round(4)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1.1023"
],
"text/latex": [
"1.1023"
],
"text/markdown": [
"1.1023"
],
"text/plain": [
"[1] 1.1023"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# negative of the sum\n",
"-sum(meff) %>% round(4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus, the main effect of the \"Col Ama\" (which is first in alphabetical order) genotype is 1.1023.\n",
"We can get this number by also releveling the factor to make \"vin3-4 FRI\" the reference."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"# relevel\n",
"vern1$Genotype <- relevel(vern1$Genotype,ref=\"vin3-4 FRI\")"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"Call:\n",
"lm(formula = Cauline.leaf.num ~ Genotype, data = vern1[vern1$Vernalization == \n",
" \"NV\", ])\n",
"\n",
"Residuals:\n",
" Min 1Q Median 3Q Max \n",
"-5.1765 -2.7206 -0.1765 2.2708 9.5469 \n",
"\n",
"Coefficients:\n",
" Estimate Std. Error t value Pr(>|t|) \n",
"(Intercept) 6.6269 0.1458 45.444 < 2e-16 ***\n",
"Genotype1 1.1023 0.3556 3.100 0.00209 ** \n",
"Genotype2 -0.8238 0.3127 -2.635 0.00879 ** \n",
"Genotype3 1.5496 0.3468 4.468 1.06e-05 ***\n",
"Genotype4 -0.9063 0.3091 -2.932 0.00358 ** \n",
"Genotype5 -0.1737 0.3165 -0.549 0.58337 \n",
"---\n",
"Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n",
"\n",
"Residual standard error: 2.752 on 357 degrees of freedom\n",
" (25 observations deleted due to missingness)\n",
"Multiple R-squared: 0.1043,\tAdjusted R-squared: 0.09175 \n",
"F-statistic: 8.314 on 5 and 357 DF, p-value: 1.898e-07\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# fit new model\n",
"out2 <- lm(Cauline.leaf.num ~ Genotype, \n",
" data = vern1[vern1$Vernalization==\"NV\",])\n",
"summary(out2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the intercept is unchanged, and that the main effect for the releveled factor\n",
"indicates that the \"Col Ama\" genotype has main effect 1.1023."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A data.frame: 6 × 2\n",
"\n",
"\t | original | releveled |
\n",
"\t | <dbl> | <dbl> |
\n",
"\n",
"\n",
"\t(Intercept) | 6.6268614 | 6.6268614 |
\n",
"\tGenotype1 | -0.8238311 | 1.1023052 |
\n",
"\tGenotype2 | 1.5496091 | -0.8238311 |
\n",
"\tGenotype3 | -0.9062732 | 1.5496091 |
\n",
"\tGenotype4 | -0.1737364 | -0.9062732 |
\n",
"\tGenotype5 | -0.7480736 | -0.1737364 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 6 × 2\n",
"\\begin{tabular}{r|ll}\n",
" & original & releveled\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t(Intercept) & 6.6268614 & 6.6268614\\\\\n",
"\tGenotype1 & -0.8238311 & 1.1023052\\\\\n",
"\tGenotype2 & 1.5496091 & -0.8238311\\\\\n",
"\tGenotype3 & -0.9062732 & 1.5496091\\\\\n",
"\tGenotype4 & -0.1737364 & -0.9062732\\\\\n",
"\tGenotype5 & -0.7480736 & -0.1737364\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 2\n",
"\n",
"| | original <dbl> | releveled <dbl> |\n",
"|---|---|---|\n",
"| (Intercept) | 6.6268614 | 6.6268614 |\n",
"| Genotype1 | -0.8238311 | 1.1023052 |\n",
"| Genotype2 | 1.5496091 | -0.8238311 |\n",
"| Genotype3 | -0.9062732 | 1.5496091 |\n",
"| Genotype4 | -0.1737364 | -0.9062732 |\n",
"| Genotype5 | -0.7480736 | -0.1737364 |\n",
"\n"
],
"text/plain": [
" original releveled \n",
"(Intercept) 6.6268614 6.6268614\n",
"Genotype1 -0.8238311 1.1023052\n",
"Genotype2 1.5496091 -0.8238311\n",
"Genotype3 -0.9062732 1.5496091\n",
"Genotype4 -0.1737364 -0.9062732\n",
"Genotype5 -0.7480736 -0.1737364"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"data.frame(original=coef(out1),releveled=coef(out2))"
]
}
],
"metadata": {
"jupytext": {
"formats": "ipynb,Rmd"
},
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "4.4.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}