Homework 4

Due: 2024-02-20, 11:59pm

Submission instructions

For this homework you should submit a ZIP archive containing:

  • A single document, in HTML format only, with the answers to all of the following items. Include plain-English text between the code and its output to interpret what R is giving you.
  • The code file used to generate the answers (RMD format). The code blocks should include comments.

  • A Jupyter notebook (IPYNB) is also acceptable.
  • Please remember to mix your comments with code and output.
  • Do not forget acknowledgements.

In this homework we will work on non-linear regression and logistic regression.

1. Odds ratios, main effects and interactions (50%)

Consider the emergency department triage data.

  • Tabulate the 30-day mortality variable (mort30). Calculate the probability of death, and the odds of death.
  • Calculate the odds of death separately by the readmission variable (genindl.1). Then calculate the odds ratio and the log odds.
  • Use logistic regression to calculate the same odds ratio.
  • Does the odds ratio for readmission vary by sex? Calculate the readmission odds ratio separately by sex, and then take the ratio.
  • Use logistic regression to calculate the same odds ratio (hint: it corresponds to an interaction term).
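
As a reminder of the arithmetic behind these items, here is a minimal sketch in plain Python, as one might write it in a Jupyter notebook. The counts are invented for illustration and are not the triage data; the table layout and all names (tables, odds, odds_ratio) are assumptions. In a logistic regression with readmission as the only predictor, the exponentiated coefficient should match the crude odds ratio; with readmission, sex, and their product in the model, the exponentiated interaction coefficient should match the ratio of the two sex-specific odds ratios.

```python
import math

# HYPOTHETICAL counts, invented for illustration only (NOT the triage data).
# Layout: sex -> readmission status -> (deaths, survivors)
tables = {
    "female": {"not_readmitted": (40, 460), "readmitted": (15, 85)},
    "male": {"not_readmitted": (60, 440), "readmitted": (25, 75)},
}


def odds(deaths, survivors):
    """Odds of death: deaths per survivor."""
    return deaths / survivors


def odds_ratio(table):
    """Odds of death if readmitted divided by odds of death if not."""
    return odds(*table["readmitted"]) / odds(*table["not_readmitted"])


# Collapse over sex to get the overall readmission-by-death table.
overall = {
    status: tuple(sum(tables[sex][status][i] for sex in tables) for i in (0, 1))
    for status in ("not_readmitted", "readmitted")
}

# Probability and odds of death in the whole sample.
total_deaths = sum(d for d, _ in overall.values())
total_survivors = sum(s for _, s in overall.values())
p_death = total_deaths / (total_deaths + total_survivors)
odds_death = p_death / (1 - p_death)

# Crude odds ratio for readmission, and its logarithm.
crude_or = odds_ratio(overall)
log_crude_or = math.log(crude_or)

# Sex-specific odds ratios and their ratio (male relative to female).
or_female = odds_ratio(tables["female"])
or_male = odds_ratio(tables["male"])
ratio_of_ors = or_male / or_female

print("P(death):", p_death, "odds:", odds_death)
print("crude OR:", crude_or, "log OR:", log_crude_or)
print("OR female:", or_female, "OR male:", or_male, "ratio:", ratio_of_ors)
```

A ratio of odds ratios close to 1 would suggest the readmission effect is similar in both sexes; the direction of the ratio (male over female here) is a choice and should match the reference level used in the regression.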

2. Non-linear modeling (50%)

We will perform non-linear modeling with logistic regression. Use the frog abnormalities data. We will use the abnormalities (ABNORMAL) and tail length (TAIL_LENGTH) variables.

  • Use logistic regression to study the association of abnormalities and tail length.
  • What is the meaning of the regression coefficient in the output of the glm command?
  • Add a quadratic term for tail length. Does it improve the fit? How do you know?
  • Create a new variable by dividing the range of tail length into 10 intervals, and calculate the proportion of abnormalities in each of these categories.
  • Plot the proportion of abnormalities as a function of tail length category. Is it consistent with your logistic regression output?
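
The binning step in the last two items can be sketched as follows, again in plain Python. This assumes that "10 intervals" means ten equal-width intervals over the observed range (deciles would be another reasonable reading). The records are invented for illustration and are not the frog data, and bin_proportions is a hypothetical helper name. In R, cut() plays a similar role.

```python
# HYPOTHETICAL (tail_length, abnormal) pairs, invented for illustration only
# (NOT the frog data). abnormal is coded 1 = abnormal, 0 = normal.
records = [
    (0.0, 0), (0.5, 0), (1.5, 0), (2.5, 0), (2.5, 1),
    (3.5, 0), (4.5, 1), (4.5, 0), (5.5, 1), (6.5, 1),
    (6.5, 0), (7.5, 1), (8.5, 1), (9.5, 1), (10.0, 1),
]


def bin_proportions(records, n_bins=10):
    """Split the observed range into n_bins equal-width intervals.

    Returns one (lower, upper, n, proportion_abnormal) tuple per interval;
    the proportion is None for an interval with no observations.
    """
    lengths = [x for x, _ in records]
    lo, hi = min(lengths), max(lengths)
    if hi == lo:
        raise ValueError("all tail lengths are equal; cannot form intervals")
    width = (hi - lo) / n_bins

    totals = [0] * n_bins
    abnormal = [0] * n_bins
    for x, y in records:
        # The maximum value would index one past the end, so clamp it
        # into the last interval.
        idx = min(int((x - lo) / width), n_bins - 1)
        totals[idx] += 1
        abnormal[idx] += y

    return [
        (
            lo + i * width,
            lo + (i + 1) * width,
            totals[i],
            abnormal[i] / totals[i] if totals[i] else None,
        )
        for i in range(n_bins)
    ]


bins = bin_proportions(records)
for lower, upper, n, prop in bins:
    print(f"[{lower:.1f}, {upper:.1f}): n={n}, proportion abnormal={prop}")
```

Plotting the proportions against the interval midpoints gives a rough, model-free picture to compare with the fitted logistic curve; intervals with few observations will be noisy.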

3. Acknowledgements

Cite any resources or individuals that helped you.