New data-driven algorithm can forecast the mortality risk for certain cardiac surgery patients
A machine learning-based method developed by a Mount Sinai research team allows medical facilities to forecast the mortality risk for certain cardiac surgery patients. The new method is the first institution-specific model for determining the risk of a cardiac patient before surgery and was developed using large volumes of electronic health record (EHR) data.
Compared with the current population-derived models, the data-driven approach shows a considerable performance improvement.
“The standard-of-care risk models used today are limited by their applicability to specific types of surgeries, leaving out significant numbers of patients undergoing complex or combination procedures for which no models exist,” says senior author Ravi Iyengar, Ph.D., the Dorothy H. and Lewis Rosenstiel Professor of Pharmacological Sciences at the Icahn School of Medicine at Mount Sinai, and Director of the Mount Sinai Institute for Systems Biomedicine. “Our team rigorously combined electronic health record data and machine learning methods to demonstrate for the first time how individual institutions can build their own risk models for post-cardiac surgery mortality.”
Institution-specific models versus population-derived models
While benchmarks for hospital performance, such as The Society of Thoracic Surgeons (STS) risk scores, are important, they are derived from population-level data and may not accurately predict risk for particular patients with complicated pathologies who need specialized preoperative assessments and complex surgeries.

The Mount Sinai study team hypothesized that machine learning algorithms applied to EHR data from their own institution could provide a workable solution. Using a rigorous machine learning framework and routinely collected EHR data, they developed a risk prediction model for postsurgical mortality that is specific to both the patient and the institution.
XGBoost model outperformed STS risk scores
The research team used XGBoost to model 6,392 cardiac operations carried out at The Mount Sinai Hospital between 2011 and 2016. These included heart valve procedures; coronary artery bypass graft; aortic dissection, replacement, or anastomosis; and reoperative cardiac operations, which have been shown to significantly increase mortality risk.
The team then evaluated how well its model performed against STS models on the same patient sets. Across all commonly performed categories of cardiac surgery for which STS scores were designed, the study found that the XGBoost model outperformed STS risk scores for mortality.
The XGBoost model had excellent prediction performance across all types of surgery, highlighting the potential of machine learning and EHR data for creating powerful institution-specific models.
Implications for healthcare institutions
Accurate postsurgical mortality prediction is essential if patients undergoing cardiac surgery are to have the best outcomes. The study suggests that institution-specific models may be preferable to the clinical norm, which relies on population data.
The work was supported by funds from the National Institutes of Health and published in The Journal of Thoracic and Cardiovascular Surgery (JTCVS). The Icahn School of Medicine at Mount Sinai is well-known throughout the world for its top-notch clinical care, education, and research initiatives.
Study Abstract:
Background: The Society of Thoracic Surgeons risk scores are widely used to assess risk of morbidity and mortality in specific cardiac surgeries but may not perform optimally in all patients. In a cohort of patients undergoing cardiac surgery, we developed a data-driven, institution-specific machine learning–based model inferred from multi-modal electronic health records and compared the performance with the Society of Thoracic Surgeons models.
Methods: All adult patients undergoing cardiac surgery between 2011 and 2016 were included. Routine electronic health record administrative, demographic, clinical, hemodynamic, laboratory, pharmacological, and procedural data features were extracted. The outcome was postoperative mortality. The database was randomly split into training (development) and test (evaluation) cohorts. Models developed using 4 classification algorithms were compared using 6 evaluation metrics. The performance of the final model was compared with the Society of Thoracic Surgeons models for 7 index surgical procedures.
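The split-and-evaluate procedure described in the Methods can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the 20% test fraction, the random seed, and the toy labels are all assumptions made for illustration, and it covers only the four threshold-based metrics reported in the Results (precision, recall, F-measure, accuracy), not the two curve-based ones.

```python
import random


def split_cohort(records, test_fraction=0.2, seed=0):
    """Randomly split records into (training, test) cohorts.

    The test fraction is an assumption; the abstract does not state it.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def threshold_metrics(y_true, y_pred):
    """Precision, recall, F-measure and accuracy for binary labels (1 = death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / len(pairs)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Toy data only: 100 hypothetical record IDs and 10 hypothetical outcomes.
records = list(range(100))
train, test = split_cohort(records, test_fraction=0.2, seed=42)

y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = threshold_metrics(y_true, y_pred)
print(len(train), len(test))
print(metrics)
```

In a real pipeline the predictions would come from a fitted classifier and the metrics would be computed on the held-out test cohort only, so that the reported performance reflects patients the model has not seen.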
Results: A total of 6392 patients were included and described by 4016 features. Overall mortality was 3.0% (n = 193). The XGBoost algorithm using only features with no missing data (336 features) yielded the best-performing predictor. When applied to the test set, the predictor performed well (F-measure = 0.775; precision = 0.756; recall = 0.795; accuracy = 0.986; area under the receiver operating characteristic curve = 0.978; area under the precision-recall curve = 0.804). eXtreme Gradient Boosting consistently demonstrated improved performance over the Society of Thoracic Surgeons models when evaluated on index procedures within the test set.
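The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the Results; the variable names are illustrative, and the "always predict survival" baseline is an assumption computed on the full cohort, since the abstract does not report the mortality rate within the test set.

```python
# Figures taken from the Results paragraph.
n_patients = 6392
n_deaths = 193
precision = 0.756
recall = 0.795

# Overall mortality as a fraction of the cohort.
mortality_rate = n_deaths / n_patients

# F-measure is the harmonic mean of precision and recall.
f_measure = 2 * precision * recall / (precision + recall)

# Accuracy of a trivial model that predicts survival for every patient,
# computed on the full cohort (assumption: test-set prevalence is similar).
baseline_accuracy = 1 - mortality_rate

print(f"mortality rate:    {mortality_rate:.1%}")     # 3.0%
print(f"F-measure:         {f_measure:.3f}")          # 0.775
print(f"baseline accuracy: {baseline_accuracy:.3f}")  # 0.970
```

The recomputed F-measure matches the reported 0.775. Because only about 3% of patients died, a model that never predicts death would already reach roughly 0.970 accuracy, which is why precision, recall, and the precision-recall curve are more informative here than accuracy alone.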
Conclusions: Machine learning models using institution-specific multi-modal electronic health records may improve performance in predicting mortality for individual patients undergoing cardiac surgery compared with the standard-of-care, population-derived Society of Thoracic Surgeons models. Institution-specific models may provide insights complementary to population-derived risk predictions to aid patient-level decision-making.