Homework 9

Due: 2020-10-20, 11:59pm

For this homework you should submit a ZIP archive called firstnameLastnameHW9.zip. When unzipped there should be a single directory/folder called firstnameLastnameHW9 and all files should be within that directory. For this homework that directory should contain:

  • A single document with the answers to all the following items in HTML or IPYNB format. Make sure you include plain English blocks in between the code and its output, to interpret what R is giving you. There should also be comments in the R code blocks prefixed with #.
  • The code file, in RMD format, used to generate the HTML file (not needed if using IPYNB).

In this homework we will practice hypothesis tests and set the stage for the final project.

1. Arabidopsis data (50%)

  • The fitness of the plants appears to vary from year to year, and between Sweden and Italy. Using hypothesis tests and confidence intervals, what can you say about the average fitness in Sweden in 2010 and 2011? What can you say about the difference in average fitness in 2010 between Italy and Sweden?

  • The FLC gene is known to affect flowering time, a key driver of response to the environment, and fitness in plants. Using hypothesis tests and confidence intervals, what can you say about the effect of the gene on fitness in Arabidopsis in Sweden and Italy in the years 2009, 2010, and 2011?
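
As a starting point, comparisons like the ones asked for above might be set up along the following lines. This is only a sketch on simulated data: the data frame plants and its columns site, year, and fitness are hypothetical stand-ins, not the names used in the actual Arabidopsis file. It also assumes the groups are independent samples and uses Welch t-tests, which is one reasonable choice rather than the required method.

```r
# Simulated stand-in data; the names plants, site, year, and fitness are hypothetical
set.seed(1)
plants <- data.frame(
  site    = rep(c("Sweden", "Italy"), each = 60),
  year    = rep(rep(c(2010, 2011), each = 30), times = 2),
  fitness = rnorm(120, mean = 10, sd = 3)
)

# Pull out the fitness values for each group of interest
sweden_2010 <- subset(plants, site == "Sweden" & year == 2010)$fitness
sweden_2011 <- subset(plants, site == "Sweden" & year == 2011)$fitness
italy_2010  <- subset(plants, site == "Italy"  & year == 2010)$fitness

# 95% confidence interval for the mean fitness in Sweden in 2010
ci_sweden_2010 <- t.test(sweden_2010)$conf.int

# Welch two-sample t-test: Sweden 2010 versus Sweden 2011
year_test <- t.test(sweden_2010, sweden_2011)

# Welch two-sample t-test: Italy versus Sweden in 2010
site_test <- t.test(italy_2010, sweden_2010)

print(ci_sweden_2010)
print(year_test)
print(site_test)
```

The same pattern could be repeated for the FLC comparison by splitting on a genotype column within each site and year. Check the distribution of fitness in the real data first, since a t-test may not be appropriate if it is strongly skewed.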

2. Soccer data (25%)

  • Do half of Premier League games end in a draw? Use the soccer data to answer this question. Use prop.test to perform the hypothesis test. Comment on the conclusion.
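
The call to prop.test might look roughly like the sketch below. The counts here are made up for illustration, and the vector name results and its codes "H", "D", and "A" (home win, draw, away win) are assumptions; the real soccer data may record outcomes differently.

```r
# Made-up outcomes for illustration only; results and its codes are hypothetical
results <- rep(c("H", "D", "A"), times = c(170, 95, 115))

# Count the draws and the total number of games
n_draws <- sum(results == "D")
n_games <- length(results)

# Test the null hypothesis that the proportion of draws is 0.5
draw_test <- prop.test(x = n_draws, n = n_games, p = 0.5)

print(draw_test)
```

The printed result includes the estimated proportion of draws, a confidence interval for it, and the p-value for the null hypothesis that the proportion equals 0.5.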

3. Project preparation (25%)

  • State the main goal of your project (it should succinctly articulate a specific aim).
  • In a paragraph or two explain the background scientific knowledge leading up to your project.
  • Describe the data/study that you will be using for your project (about a paragraph). State if you have access to it, how large it is (number of variables, number of individuals), and if it is in one file or multiple files.
  • Write a brief data analysis plan. Imagine that you are giving instructions to another version of yourself. Make a list of specific analyses you should do. Broadly they should cover: reading the data, data exploration and visualization, summary statistics, and hypothesis-driven analyses.

4. Acknowledgements

Cite any resources or individuals that helped you.