Midterm 2

Due: 2022-11-01, 11:59pm

The purpose of the midterm is to review and test the topics we have covered in the class so far.

This work must be submitted via Blackboard. The answers must be in MS Word (DOCX) or PDF format. Your submitted document should have sections corresponding to those in this midterm.

You should include the Stata output/figures to support your answers. The Stata output should be in a fixed-width (typewriter) font.

Include graphs as images in your document. Use the lecture notes as a guide.

1. Analyze Mulugeta metabolic syndrome data (50%)

Mulugeta and colleagues studied predictors of metabolic syndrome among adults visiting a hospital in Ethiopia (Dryad link). You saw this data in the first midterm.

  • Using a t-test, test the hypotheses that the mean systolic and diastolic blood pressures in this population are 120 and 80 mm Hg, respectively. Construct a 95% confidence interval for each mean.
  • Repeat the above using the sign test, with null medians of 120 and 80 mm Hg, respectively. Are the conclusions from the t-test and the sign test consistent? Why or why not?
  • Test the hypothesis that the prevalence of metabolic syndrome is 30%. Construct 95% and 99% confidence intervals for the proportion with metabolic syndrome.
  • Make a scoring system for metabolic syndrome by counting the presence of the following: sleep difficulty, sedentary behavior (being physically inactive), being obese, and being older than 60 years. This score should range from 0 to 4. Tabulate this variable (call it the Mulugeta score) against the metabolic syndrome variable.
  • Make an ROC curve for using the Mulugeta score to predict metabolic syndrome. (Compute the sensitivity and specificity at different cutoffs of the Mulugeta score, and plot them.)
  • (Optional) Make an ROC curve that only uses systolic blood pressure as the predictor for metabolic syndrome. Compare the performance of systolic blood pressure versus the Mulugeta score.
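
The sensitivity and specificity calculation behind the ROC curve can be sketched as follows. This is only an illustration of the arithmetic, written in Python rather than Stata, and it assumes a test is called positive when the score is at or above the cutoff. The `toy` data and the `roc_points` helper are made up for this example and are not taken from the Mulugeta dataset; your submission should still show Stata output.

```python
# Made-up (score, has_syndrome) pairs for illustration; NOT the Mulugeta data.
toy = [(0, 0), (0, 0), (1, 0), (1, 1), (2, 0), (2, 1), (3, 1), (3, 1), (4, 1), (1, 0)]


def roc_points(pairs, cutoffs):
    """Return (cutoff, sensitivity, specificity), treating 'score >= cutoff' as a positive test."""
    positives = [score for score, outcome in pairs if outcome == 1]
    negatives = [score for score, outcome in pairs if outcome == 0]
    points = []
    for cutoff in cutoffs:
        # Sensitivity: fraction of true cases whose score reaches the cutoff.
        sensitivity = sum(score >= cutoff for score in positives) / len(positives)
        # Specificity: fraction of non-cases whose score stays below the cutoff.
        specificity = sum(score < cutoff for score in negatives) / len(negatives)
        points.append((cutoff, sensitivity, specificity))
    return points


# Cutoffs 0..5: cutoff 0 calls everyone positive, cutoff 5 calls no one positive.
points = roc_points(toy, range(0, 6))
for cutoff, sens, spec in points:
    print(f"cutoff >= {cutoff}: sensitivity={sens:.2f}, "
          f"specificity={spec:.2f}, 1-specificity={1 - spec:.2f}")
```

Plotting sensitivity (y axis) against 1 - specificity (x axis) for these cutoffs and joining the points gives the ROC curve.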

2. Analyze Van Lindert CSF data (50%)

Cerebrospinal fluid (CSF) shunt infections occur in 5–15% of cases. Van Lindert and colleagues (2018) added topical vancomycin to the existing shunt protocol. Their study assesses the efficacy of this change in protocol in reducing the infection rate for a number of age cohorts.

The article and data may be accessed using the following links:

van Lindert EJ, van Bilsen M, van der Flier M, et al: Topical vancomycin reduces the cerebrospinal fluid shunt infection rate: a retrospective cohort study. PLOS ONE 2018;13(1):e0190249.

https://doi.org/10.1371/journal.pone.0190249

van Lindert EJ, van Bilsen M, van der Flier M, et al: Topical vancomycin reduces the cerebrospinal fluid shunt infection rate: a retrospective cohort study. Dryad Digital Repository 2018.

https://doi.org/10.5061/dryad.ch304

Read the data into Stata. This data is a bit unusual (to us) in that the field separator is a semicolon and the decimal separator is a comma (this format is often used in Europe). You will have to tweak the way you read in a CSV file to read in the data.
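
To make the file layout concrete, here is a minimal Python sketch (not Stata) of what parsing semicolon-separated fields with decimal commas involves. The sample text, the column names, and the `parse_decimal_comma` helper are all invented for illustration and do not match the Dryad file. In Stata, the `delimiters()` option of `import delimited` and the `dpcomma` option of `destring` are reasonable places to start; check the help files for your version.

```python
import csv
import io

# Hypothetical sample in the same layout: ';' between fields, ',' as the decimal mark.
# These column names are invented; they are not the variable names in the Dryad file.
sample = "id;age_years;infected\n1;0,5;0\n2;12,25;1\n"


def parse_decimal_comma(text):
    """Convert '12,25' to 12.25; leave text that is not numeric unchanged."""
    try:
        return float(text.replace(",", "."))
    except ValueError:
        return text


# Split on semicolons, then convert every field that looks numeric.
reader = csv.DictReader(io.StringIO(sample), delimiter=";")
rows = [{name: parse_decimal_comma(value) for name, value in row.items()}
        for row in reader]

print(rows[1]["age_years"])
```

If the decimal commas are not handled, values such as `12,25` are read as text rather than numbers, which is the usual symptom to look for after importing.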

  • State how many observations and variables are in the dataset. List the variables and state the type of each variable:

    quantitative - integer
                 - continuous
    categorical  - dichotomous
                 - ordinal
                 - nominal
    
  • Is the information in the README file sufficient for understanding the data? Did you have to read the article to understand the data? Are there any duplicates or missing values in the data?

  • Tabulate all categorical variables, and make a histogram of each numeric variable. Comment.

  • Tabulate infection status by sex and protocol type. Make a spineplot of the same; put infection status on the y axis and the other two variable categories on the x axis.

  • Test the hypothesis that the infection rate in this population is 10% using a 5% level of significance. Make a 95% confidence interval for the infection rate. Are the test and the confidence interval consistent with each other? Why?
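
When thinking about whether the test and the confidence interval agree, it can help to see where each gets its standard error. The Python sketch below is one possible large-sample version, assuming a normal-approximation z-test and a Wald interval; Stata's commands may default to other methods (for example, exact binomial intervals). The counts are made up and the `one_sample_proportion` helper is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion(successes, n, p0, level=0.95):
    """Large-sample z-test of H0: p = p0, plus a Wald confidence interval for p."""
    p_hat = successes / n
    # The test's standard error is computed under the null value p0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # The Wald interval's standard error is computed from the estimate p_hat.
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, z, p_value, (p_hat - half_width, p_hat + half_width)


# Made-up counts for illustration only: 30 infections in 400 procedures.
p_hat, z, p_value, ci = one_sample_proportion(30, 400, p0=0.10)
print(f"p_hat={p_hat:.3f}, z={z:.2f}, p={p_value:.3f}, "
      f"95% CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the two standard errors differ, the test and the interval usually agree but are not guaranteed to when the null value lies close to an interval endpoint.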

3. Acknowledgements

Please acknowledge individuals who helped you or resources that were helpful in completing the midterm.