Homework 4
Due: 2020-09-08, 11:59pm
In this homework you will practice organizing your work in a way that carries over to all future homeworks and projects.
1. Organize your folders and files (25%)
For this homework you should submit a ZIP archive called firstnameLastnameHW4.zip whose directory/folder structure is as follows:
firstnameLastnameHW4
    primaryData
        README
        myrecipe.md
        E0.csv
    processedData
        myrecipe.csv
    analysis
        HW4.Rmd
        HW4.html
        HW4.ipynb
- All files should be inside the directories.
- Make sure you only use relative path names in your code.
- Put a README file in the primary data directory that describes where the data came from and what they are.
Your homework file will be in the analysis directory and can be in either RMarkdown (RMD/HTML) or Jupyter notebook (IPYNB) format.
Each item below should be a separate section in the homework file.
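As an illustration of the relative-path requirement, here is a minimal sketch in R. It assumes the homework file runs with the analysis folder as its working directory; the temporary directory is only a stand-in so the sketch runs anywhere, and in your own work the folders already exist.

```r
# Build a throwaway copy of the layout in a temporary directory (illustration only).
root <- file.path(tempdir(), "firstnameLastnameHW4")
for (d in c("primaryData", "processedData", "analysis")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

# Assumption: the homework file runs from the analysis folder.
old_wd <- setwd(file.path(root, "analysis"))

# Relative paths: go up one level, then into the sibling folder.
input_path  <- file.path("..", "primaryData", "E0.csv")
output_path <- file.path("..", "processedData", "myrecipe.csv")

# Neither path names a user or a drive, so it does not depend on one machine.
print(input_path)
print(dir.exists(dirname(input_path)))

# Restore the original working directory.
setwd(old_wd)
```

A path that starts with a drive letter or a home directory would break as soon as the ZIP archive is unpacked on another computer, which is why the paths above start from the analysis folder instead.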
2. Read in your recipe (25%)
Repeat the process of reading in your recipe in R, and writing out a CSV file of the ingredients, amounts, and units. This time, organize it properly, paying attention to the following:
- The recipe file should be in the primaryData folder. Add info to the README file in the primaryData folder describing the file, including how you obtained it.
- The output CSV file should be in the processedData folder.
- The homework files should be in the analysis folder.
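The output step might look like the following sketch. The ingredient table is a hypothetical placeholder (your own table comes from reading myrecipe.md), the column names are assumptions, and the temporary directory only stands in for your real project folder.

```r
# Hypothetical ingredient table; in the homework this comes from reading myrecipe.md.
ingredients <- data.frame(
  ingredient = c("flour", "water"),
  amount     = c(2, 1),
  unit       = c("cup", "cup")
)

# Stand-in project folders so the sketch runs anywhere.
root <- file.path(tempdir(), "firstnameLastnameHW4")
dir.create(file.path(root, "processedData"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "analysis"), recursive = TRUE, showWarnings = FALSE)

# Assumption: the homework file runs from the analysis folder.
old_wd <- setwd(file.path(root, "analysis"))

# row.names = FALSE keeps R's row numbers out of the CSV file.
out_path <- file.path("..", "processedData", "myrecipe.csv")
write.csv(ingredients, out_path, row.names = FALSE)

# Read the file back to confirm the columns were written as intended.
check <- read.csv(out_path)
print(names(check))

# Restore the original working directory.
setwd(old_wd)
```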
3. Read in English Premier League data (25%)
You will use the small datasets archive on Blackboard for this work.
Go to the soccer directory (folder) that contains data for all games in the English Premier League for one season.
- Copy the file E0.csv into the primaryData folder. Update the README file to indicate the source of the data.
- Read in the data using read.csv, and print out the first 10 lines using head.
- How many games were played in that season (use nrow)?
- The data are for which season (look at the original README)?
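The read.csv, head, and nrow steps can be sketched as follows. The toy file here is a made-up stand-in with hypothetical columns, not the real E0.csv, and the sketch assumes the data have one row per game; with the real file in primaryData you would skip the part that writes the toy data.

```r
# Stand-in project folders so the sketch runs anywhere.
root <- file.path(tempdir(), "firstnameLastnameHW4")
dir.create(file.path(root, "primaryData"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "analysis"), recursive = TRUE, showWarnings = FALSE)

# Toy stand-in for E0.csv: made-up column names and 12 made-up rows.
toy <- data.frame(
  game       = 1:12,
  home_goals = rep(c(0, 1, 2), 4),
  away_goals = rep(c(1, 1, 0), 4)
)
write.csv(toy, file.path(root, "primaryData", "E0.csv"), row.names = FALSE)

# Assumption: the homework file runs from the analysis folder.
old_wd <- setwd(file.path(root, "analysis"))

# Read the data with a relative path.
games <- read.csv(file.path("..", "primaryData", "E0.csv"))

# head() shows 6 rows by default, so ask for 10 explicitly.
print(head(games, 10))

# Assuming one row per game, the row count is the number of games.
n_games <- nrow(games)
print(n_games)

# Restore the original working directory.
setwd(old_wd)
```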
4. Project (25%)
In this item you will be developing your project idea a bit further than last time. You should have the answers to the following items as subsections (you may reuse some of this content when you write the formal report).
- Background: Give a one-paragraph scientific background of the project.
- Research question: State the research question for this specific project. For example, “We want to examine whether FLC genotype and fitness in previous years are predictive of future fitness of these recombinant inbred lines.” It should suggest a specific direction for your analysis.
- Data: Briefly describe the origin and content of the data you are proposing to use. State any issues with obtaining and sharing the data. What is the expected format of the data (CSV files, Excel files, database, etc.)? How large is the dataset (number of files, variables, and observation units)?
If you do not have a project idea, and need assistance, please say so. The ideal project is something that is closely related to your research. You can use the class to make research progress.
It is okay for your project idea to drift a bit as you refine it.
5. Acknowledgements
Cite any resources or individuals that helped you.