Homework 3

Due: 2020-09-01, 11:59p

For this homework, submit a ZIP archive containing all the files for this homework. You should first create a directory called LastnameFirstname-hw3, put all the homework-related files there, and zip that directory.

In this homework you will practice preparing data for reading into R.

1. (75%) Reading in the ingredient list of your recipe

Take the Markdown text file of the recipe you posted for the last homework and modify it so that the ingredient list can be read into R while the recipe remains human-readable. Do it two ways: first as a delimited file, and then as a fixed-width file.

Semicolon-delimited file

  • Open the recipe file in a text editor, and comment out all lines except the ingredient list by placing a # at the beginning of each line.
  • Tweak the ingredient list so that its fields are delimited in a reasonable way (say, using a semicolon).
  • Read it into R using read.table or another function for reading delimited text.
  • Print out the data frame.

For example, the pancake recipe file might look like this:

  # # Whole grain pancakes
  #
  # _Makes about 10 pancakes._
  #
  # # Ingredients
  #
  amount; unit; ingredient
      60;    g; whole wheat flour
      60;    g; brown rice flour
      1/8;   t; salt
      1;     t; sugar
      2;     t; baking powder
      200;   g; whole milk
      20;    g; yogurt
      50;    g; egg, beaten
      50;    g; vegetable oil or butter
  #
  # # Process
  #
  #    - Preheat skillet on medium heat.
  #    - Combine dry ingredients (flours, salt, sugar, baking powder) in bowl.
  #    - Add yogurt and egg to dry ingredients; do not mix.
  #    - Add 120g milk to mixture in bowl and mix till it becomes a thick mixture.
  #    - Add remaining milk, and stir till just mixed; do not overmix.
  #    - Brush oil on pan; wait till it starts smoking.
  #    - Pour batter onto skillet (adjust to desired size), and wait till 
  #       middle bubbles.
  #    - Flip pancake, and wait till side is done.
  #    - Serve with butter and maple syrup or jam.
  #
  # # Notes
  #
  #    - You may substitute all-purpose flour for whole wheat flour, 
  #      or white rice flour for brown rice flour, but then it won't be a 
  #      whole grain recipe.
  #    - The rice flour can also be replaced by wheat flour.

If I save it in a file called pancake.txt, I can read it in using:

 pancake <- read.table("pancake.txt", sep = ";", comment.char = "#", header = TRUE)

Look up the help for read.table and consider the different options. You may have to edit the text file, or use options not mentioned here to read the data.
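As a rough sketch of how some of those options interact, the snippet below writes a tiny semicolon-delimited file with commented-out lines to a temporary location and reads it back. The file contents and the object name demo are made up for illustration. strip.white and stringsAsFactors are standard read.table arguments, but whether you need them depends on how your own file is laid out.

```r
# Hypothetical miniature recipe file, written to a temporary path.
recipe_lines <- c(
  "# # Demo recipe",
  "#",
  "amount; unit; ingredient",
  "    60;    g; whole wheat flour",
  "   1/8;    t; salt",
  "    50;    g; egg, beaten",
  "#",
  "# # Process"
)
path <- tempfile(fileext = ".txt")
writeLines(recipe_lines, path)

demo <- read.table(
  path,
  sep = ";",                # fields are separated by semicolons
  comment.char = "#",       # everything after a # is ignored
  header = TRUE,            # first non-comment line holds the column names
  strip.white = TRUE,       # drop the padding spaces around each field
  stringsAsFactors = FALSE  # keep text columns as character vectors
)

print(demo)
str(demo)
```

Without strip.white = TRUE, character fields keep the padding spaces around the delimiter. Note also that an amount such as 1/8 is not a number to R, so the whole amount column is read as text.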

Fixed-width file

  • Open the recipe file in a text editor, and tweak the ingredient list so that the columns line up exactly in the text file (see example below).
  • Read it into R using read.fwf.
  • Print out the data frame.
  • Save the dataset in CSV format using write.csv.

For example, the pancake recipe file might look like this:

  # Whole grain pancakes
  
  _Makes about 10 pancakes._
  
  # Ingredients
  
  amount unit ingredient
      60    g whole wheat flour
      60    g brown rice flour
     1/8    t salt
       1    t sugar
       2    t baking powder
     200    g whole milk
      20    g yogurt
      50    g egg, beaten
      50    g vegetable oil or butter
  
  # Process
  
   - Preheat skillet on medium heat.
   - Combine dry ingredients (flours, salt, sugar, baking powder) in bowl.
   - Add yogurt and egg to dry ingredients; do not mix.
   - Add 120g milk to mixture in bowl and mix till it becomes a thick mixture.
   - Add remaining milk, and stir till just mixed; do not overmix.
   - Brush oil on pan; wait till it starts smoking.
   - Pour batter onto skillet (adjust to desired size), and wait till 
     middle bubbles.
   - Flip pancake, and wait till side is done.
   - Serve with butter and maple syrup or jam.
  
  # Notes
  
   - You may substitute all-purpose flour for whole wheat flour, 
     or white rice flour for brown rice flour, but then it won't be a 
     whole grain recipe.
   - The rice flour can also be replaced by wheat flour.
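One possible way to read such a layout is sketched below, using a made-up miniature file. The widths, skip, and n values are assumptions that match only this toy file, so you would need to count columns and lines in your own recipe. Negative widths tell read.fwf to skip that many characters. The header line is skipped and the column names are supplied through col.names, because read.fwf expects a header to be delimited by sep rather than laid out in fixed width.

```r
# Hypothetical miniature recipe file in fixed-width layout.
recipe_lines <- c(
  "# Demo recipe",
  "",
  "# Ingredients",
  "",
  "amount unit ingredient",
  "    60    g whole wheat flour",
  "   1/8    t salt",
  "    50    g egg, beaten",
  "",
  "# Process"
)
path <- tempfile(fileext = ".txt")
writeLines(recipe_lines, path)

fw <- read.fwf(
  path,
  widths = c(6, -1, 4, -1, 30),  # amount, gap, unit, gap, ingredient
  skip = 5,                      # lines before the first data row
  n = 3,                         # number of data rows to read
  col.names = c("amount", "unit", "ingredient"),
  strip.white = TRUE,            # passed on to read.table
  stringsAsFactors = FALSE       # passed on to read.table
)
print(fw)

# Save the data frame as CSV, without the row-number column.
csv_path <- tempfile(fileext = ".csv")
write.csv(fw, csv_path, row.names = FALSE)
```

Reading the CSV back with read.csv is a quick way to check that nothing was lost in the round trip.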

(Supporting files: the modified recipe text files (one delimited, one fixed-width), and a file containing the code used to read the data together with the output from R.)

2. (25%) Project

Let us start thinking about the class project.

The ideal project is something that is closely related to your research. It should be interesting and challenging to you so that it furthers your personal goals. From a technical perspective, it should be complex enough to show off your R programming skills, but still simple enough for you to finish it this term.

(Supporting file: Document with the answers to the questions below; MD, TXT, DOCX, or PDF format.)

  • What is the research question?
  • What data will you use?
  • Will you definitely have the data in hand by September 30? The data owner must not object to the instructor and teaching assistant seeing the data.

3. Acknowledgements

Please cite resources or people who helped you.