Modeling Prostate-Specific Antigen Levels in Cancer Patients

Photo by the National Cancer Institute on Unsplash

This scenario-based project was an assignment in my linear regression models class. A university medical center had data on several variables of interest relating to prostate cancer patients. The assignment was to develop a model that uses the variables in the provided data set (or an appropriate subset of them) to describe and predict the main variable of interest: prostate-specific antigen (PSA) levels. The analysis needed to cover the following areas:

  1. Univariate analysis of the variables of interest in the data set.
  2. A thorough assessment of how well the multiple regression model fits the data, with justification for why one model is selected over another.
  3. Interval estimates for the important model coefficients, with interpretations, at a family confidence level of 95%.
  4. Appropriate diagnostics, such as residual analysis and validation of the model assumptions.
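
To make the "family confidence of 95%" requirement concrete, here is a minimal sketch of one common way to get joint intervals: the Bonferroni adjustment, where each of the g intervals is built at level 1 - 0.05/g so that all of them hold simultaneously with at least 95% confidence. This is an illustration under assumptions, not necessarily the method used in the report. The function name `bonferroni_intervals` is hypothetical, the family is assumed to be every coefficient including the intercept, and the data are synthetic rather than the medical center's.

```python
import numpy as np
from scipy import stats


def bonferroni_intervals(X, y, family_alpha=0.05):
    """Fit OLS and return Bonferroni joint intervals for all coefficients.

    X: (n, k) predictor matrix without an intercept column.
    y: (n,) response vector.
    Returns (beta, lower, upper, residuals).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    p = design.shape[1]                        # number of estimated coefficients
    g = p                                      # assumption: family = all coefficients

    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y              # least-squares estimates
    resid = y - design @ beta
    mse = resid @ resid / (n - p)              # error variance estimate
    se = np.sqrt(mse * np.diag(xtx_inv))       # standard error of each coefficient

    # Bonferroni: split the family error rate evenly across the g intervals.
    t_crit = stats.t.ppf(1 - family_alpha / (2 * g), df=n - p)
    return beta, beta - t_crit * se, beta + t_crit * se, resid


# Synthetic stand-in data (NOT the prostate data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=100)

beta, lower, upper, resid = bonferroni_intervals(X, y)
for name, b, lo, hi in zip(["intercept", "x1", "x2"], beta, lower, upper):
    print(f"{name}: {b:.3f}  [{lo:.3f}, {hi:.3f}]")
```

The returned residuals are the starting point for the diagnostics in item 4, for example plotting them against the fitted values to check for non-constant variance.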

The finished report can be found here; for those interested, here’s the data.