Homework 4
Due: 2020-09-08, 11:59pm
In this homework you will practice organizing your work. Apply the same structure to all future homeworks and projects.
1. Organize your folders and files (25%)
For this homework you should submit a ZIP archive called
firstnameLastnameHW4.zip whose directory/folder structure is as
follows:
- firstnameLastnameHW4
  - primaryData
    - README
    - myrecipe.md
    - E0.csv
  - processedData
    - myrecipe.csv
  - analysis
    - HW4.Rmd
    - HW4.html
    - HW4.ipynb
- All files should be inside the directories.
- Make sure you only use relative path names in your code.
- Put a README file in the primary data directory that describes where the data came from and what they are.
Your homework file will be in the analysis directory and can be
in either RMarkdown (RMD/HTML) or Jupyter notebook (IPYNB) format.
Each item below should be a separate section in the homework file.
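The folder layout above can also be created from R instead of by hand. The following is a minimal sketch, not a required method: it builds the skeleton under `tempdir()` (an assumption made only so the example does not touch your real files) and uses the placeholder name `firstnameLastnameHW4` from the assignment. It also shows how a relative path from the analysis folder to a sibling folder can be built with `file.path`.

```r
# Sketch: build the HW4 folder skeleton in a scratch location.
# tempdir() is an assumption for illustration; use your own project location.
root <- file.path(tempdir(), "firstnameLastnameHW4")

for (d in c("primaryData", "processedData", "analysis")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

# Create an empty README to fill in with the data sources.
file.create(file.path(root, "primaryData", "README"))

# From a notebook stored in analysis/, sibling folders are reached with "..".
recipe_path <- file.path("..", "primaryData", "myrecipe.md")
print(recipe_path)

# List what was created.
print(list.files(root, recursive = TRUE, include.dirs = TRUE))
```

Building paths with `file.path` and `..` keeps the code free of absolute paths, so the ZIP archive should work on another machine when the notebook is run from the analysis folder.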
2. Read in your recipe (25%)
Repeat the process of reading in your recipe in R, and writing out a CSV file of the ingredients, amounts, and units. This time, organize it properly, paying attention to the following:
- The recipe file should be in the primaryData folder. Add info to the README file in the primaryData folder describing the file, including how you obtained it.
- The output CSV file should be in the processedData folder.
- The homework files should be in the analysis folder.
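Writing the ingredients table to the processed data folder might look like the sketch below. The ingredient values and the column names `ingredient`, `amount`, and `unit` are hypothetical stand-ins for your own recipe, and a scratch folder under `tempdir()` stands in for processedData so that the example runs anywhere.

```r
# Hypothetical ingredients; replace with the contents of your own recipe.
ingredients <- data.frame(
  ingredient = c("flour", "sugar", "butter"),
  amount     = c(2, 0.5, 100),
  unit       = c("cup", "cup", "g"),
  stringsAsFactors = FALSE
)

# In the homework notebook (run from analysis/), the relative path would be
# file.path("..", "processedData", "myrecipe.csv").
# A scratch folder is used here instead (an assumption for illustration).
out_dir <- file.path(tempdir(), "processedData")
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, "myrecipe.csv")

# row.names = FALSE avoids writing an extra column of row numbers.
write.csv(ingredients, out_file, row.names = FALSE)

# Read the file back to confirm it was written as expected.
check <- read.csv(out_file, stringsAsFactors = FALSE)
print(check)
```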
3. Read in English Premier League data (25%)
You will use the small datasets archive on Blackboard for this work.
Go to the soccer directory (folder) that contains data for all games
in the English Premier League for one season.
- Copy the file E0.csv into the primaryData folder. Update the README file to indicate the source of the data.
- Read in the data using read.csv, and print out the first 10 lines using head.
- How many games were played in that season (use nrow)?
- The data are for which season (look at the original README)?
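The read, preview, and count steps can be sketched as follows. So that the example runs without the real E0.csv, it first writes a tiny stand-in file with made-up rows; the file name `E0_toy.csv` and the column names are hypothetical and are not the columns of the actual dataset.

```r
# Stand-in file with made-up rows and hypothetical column names.
toy_file <- file.path(tempdir(), "E0_toy.csv")
writeLines(c(
  "HomeTeam,AwayTeam,HomeGoals,AwayGoals",
  "TeamA,TeamB,2,1",
  "TeamC,TeamD,0,0",
  "TeamB,TeamC,1,3"
), toy_file)

# In the homework, the path would be file.path("..", "primaryData", "E0.csv").
games <- read.csv(toy_file, stringsAsFactors = FALSE)

# Print up to the first 10 rows (the toy file has fewer).
print(head(games, 10))

# Assuming one row per game, the row count is the number of games.
n_games <- nrow(games)
print(n_games)
```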
4. Project (25%)
In this item you will be developing your project idea a bit further than last time. You should have the answers to the following items as subsections (you may reuse some of this content when you write the formal report).
- Background: Give a one-paragraph scientific background of the project.
- Research question: State the research question for this specific project. For example, “We want to examine whether FLC genotype and fitness in previous years are predictive of future fitness of these recombinant inbred lines.” It should suggest a specific direction for your analysis.
- Data: Briefly describe the origin and content of the data you are proposing to use. State any issues with obtaining and sharing the data. What is the expected format of the data (CSV files, Excel files, database, etc.)? How large is the dataset (number of files, variables, and observation units)?
If you do not have a project idea, and need assistance, please say so. The ideal project is something that is closely related to your research. You can use the class to make research progress.
It is okay for your project idea to drift a bit as you refine it.
5. Acknowledgements
Cite any resources or individuals that helped you.