To predict the 10-year risk of future coronary heart disease (CHD) in patients
IN PYTHON
The dataset provided information on over 4,000 patients and included 15 attributes, each representing a potential risk factor for CHD. These attributes included demographic, behavioral, and medical risk factors.Task :
“To predict the 10-year risk of future coronary heart disease (CHD) in patients”
- Use logistic regression and Random forest
- Use binning and provide examples
- Which features are really important
- Use Upsampling and downsampling for unbalanced data . And which is the best method and why?
- Provide cross-validation and performance evaluation metrics specific to imbalanced datasets, to ensure the model’s effectiveness and generalizability.