Homework 5

Due: 2024-03-05, 11:59pm

Submission instructions

For this homework you should submit a ZIP archive containing:

  • A single document with the answers to all of the following items, in HTML format only. Make sure you include plain-English text between the code and its output to interpret what R is giving you.
  • The code file used to generate the answers (RMD format). The code blocks should include comments.

  • Jupyter notebook (IPYNB) is okay.
  • Please remember to mix your comments with code and output.
  • Do not forget acknowledgements.

In this homework we will work on generalized linear models.

Use the frog abnormalities data for this part. The ABNORMAL variable denotes whether or not a frog had an abnormality; the Perkensus variable is 1 for animals diagnosed with a protozoan pathogen; and the SVL variable denotes animal length, not including the tail.

  • Fit a logistic regression model for the association between frog abnormalities and the protozoan infection. Calculate the odds ratio for the association and its 95% confidence interval.
  • Compare the odds ratio calculated using logistic regression to that calculated directly by tabulating the two variables.
  • Instead of using the logit link function, use the log link function to assess the association between abnormalities and infection using the glm function. What does the coefficient corresponding to Perkensus represent?
  • Fit logistic and probit regression models for abnormalities using animal length as the predictor. Calculate the predicted (fitted) probabilities for each animal under both models using the predict function. Make a scatterplot comparing the predictions obtained from the probit and logistic regressions.
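
The mechanics behind the items above can be sketched in R as follows. This is only an illustration on simulated data, not the frog data set: the variables infected, abnormal and svl are hypothetical stand-ins, and the effect sizes are made up. The Wald interval from confint.default is one choice among several; confint gives a profile-likelihood interval instead.

```r
# Simulated stand-in data (assumption: NOT the assignment's frog data).
set.seed(1)
n <- 500
infected <- rbinom(n, 1, 0.4)              # hypothetical binary exposure
svl <- rnorm(n, mean = 50, sd = 8)         # hypothetical body length
p_true <- plogis(-1.7 + 0.8 * infected + 0.05 * (svl - 50))
abnormal <- rbinom(n, 1, p_true)           # hypothetical binary outcome

# Logistic regression: exponentiate the slope to get the odds ratio.
fit_logit <- glm(abnormal ~ infected, family = binomial(link = "logit"))
or_glm <- exp(coef(fit_logit)["infected"])
ci_wald <- exp(confint.default(fit_logit)["infected", ])  # Wald 95% CI

# Odds ratio computed directly from the 2x2 table as (a * d) / (b * c).
tab <- table(infected, abnormal)
or_tab <- (tab["1", "1"] * tab["0", "0"]) / (tab["1", "0"] * tab["0", "1"])

# Same model with a log link; only the link argument changes.
fit_log <- glm(abnormal ~ infected, family = binomial(link = "log"))

# Logistic and probit fits on the continuous predictor, then fitted
# probabilities on the response scale for every observation.
fit_logit_svl <- glm(abnormal ~ svl, family = binomial(link = "logit"))
fit_probit_svl <- glm(abnormal ~ svl, family = binomial(link = "probit"))
p_logit <- predict(fit_logit_svl, type = "response")
p_probit <- predict(fit_probit_svl, type = "response")

# Scatterplot of the two sets of predictions with a reference line y = x.
plot(p_logit, p_probit,
     xlab = "Logistic fitted probability",
     ylab = "Probit fitted probability")
abline(0, 1, lty = 2)
```

With a single binary predictor the logistic model is saturated, so or_glm and or_tab should agree up to numerical tolerance.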

2. Contingency tables (50%)

Use the emergency visit data for this part. This part shows how to test the association between a binary variable and a quantitative variable (CRP) using log-linear models of summarized counts.

  • Create a categorical variable by splitting CRP into quartiles. Use logistic regression to assess the association of 30-day mortality with sex and CRP (as quartiles).
  • Use a likelihood ratio test to assess if CRP is associated with mortality when sex is included in the model.
  • Tabulate mortality, sex and CRP (as quartiles). Use log-linear models on this contingency table to answer the two questions above.
  • Use a log-linear model to test whether CRP quartiles are associated with sex.
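
A minimal sketch of these steps in R, again on simulated data rather than the emergency visit data: the variables sex, crp and died, the quartile labels Q1 to Q4, and all effect sizes are assumptions made for illustration. It compares a likelihood ratio test from logistic regression on individual records with the matching test from Poisson log-linear models on the cell counts.

```r
# Simulated stand-in data (assumption: NOT the emergency visit data).
set.seed(2)
n <- 800
sex <- factor(sample(c("F", "M"), n, replace = TRUE))
crp <- rlnorm(n, meanlog = 2, sdlog = 1)   # hypothetical CRP values

# Split CRP into quartiles; include.lowest keeps the minimum in Q1.
crp_q <- cut(crp, breaks = quantile(crp, probs = seq(0, 1, 0.25)),
             include.lowest = TRUE, labels = paste0("Q", 1:4))

# Hypothetical 30-day mortality indicator.
p_true <- plogis(-2 + 0.4 * (as.integer(crp_q) - 1) + 0.3 * (sex == "M"))
died <- rbinom(n, 1, p_true)

# Logistic regression on individual records, then a likelihood ratio
# test for CRP quartile with sex kept in the model.
fit_full <- glm(died ~ sex + crp_q, family = binomial)
fit_sex <- glm(died ~ sex, family = binomial)
lrt_logit <- anova(fit_sex, fit_full, test = "Chisq")

# Tabulate the three variables and fit Poisson log-linear models to the
# counts. Both models keep the sex:crp_q margin; they differ only in the
# died:crp_q term.
counts <- as.data.frame(table(died = died, sex = sex, crp_q = crp_q))
ll_full <- glm(Freq ~ sex * crp_q + died * sex + died * crp_q,
               family = poisson, data = counts)
ll_sex <- glm(Freq ~ sex * crp_q + died * sex,
              family = poisson, data = counts)
lrt_loglin <- anova(ll_sex, ll_full, test = "Chisq")

# Two-way table of sex by CRP quartile: the residual deviance of the
# independence model is the likelihood ratio statistic against the
# saturated model.
counts2 <- as.data.frame(table(sex = sex, crp_q = crp_q))
ll_ind <- glm(Freq ~ sex + crp_q, family = poisson, data = counts2)
p_ind <- pchisq(deviance(ll_ind), df.residual(ll_ind), lower.tail = FALSE)

print(lrt_logit)
print(lrt_loglin)
```

In this setup the deviance differences reported in lrt_logit and lrt_loglin should match, since the two pairs of nested models differ by the same mortality-by-CRP term.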

3. Acknowledgements

Cite any resources or individuals that helped you.