Abstract
Background Identify the external applicability of the American College of Surgeons’ National Surgical Quality Improvement Program (NSQIP) risk calculator in the setting of adult spinal deformity (ASD) and subsets of patients based on deformity and frailty status.
Methods ASD patients were isolated in our single-center database and analyzed for the shared predictive variables displayed in the NSQIP calculator. Patients were stratified by frailty (not frail <0.3, frail 0.3–0.5, severely frail >0.5), deformity [T1 pelvic angle (TPA) > 30, pelvic incidence minus lumbar lordosis (PI-LL) > 20], and reoperation status. Brier scores were calculated for each variable to validate the calculator’s predictability in a single center’s database (Quality). External validity of the calculator in our ASD patients was assessed via the Hosmer-Lemeshow test, which identifies whether the differences between observed and expected proportions are significant.
Results A total of 1606 ASD patients were isolated from the Quality database (48.7 years, 63.8% women, 25.8 kg/m2); 33.4% received decompressions, and 100% received a fusion. For each subset of ASD patients, the calculator predicted lower outcome rates than what was identified in the Quality database. The calculator showed poor predictability for frail, deformed, and reoperation patients for the category “any complication” because they had Brier scores closer to 1. External validity of the calculator in each stratified patient group identified that the calculator was not valid, displaying P values <0.05.
Conclusion The NSQIP calculator was not a valid calculator in our single institutional database. It is unable to comment on surgical complications such as return to operating room, surgical site infection, urinary tract infection, and cardiac complications that are typically associated with poor patient outcomes. Physicians should not base their surgical plan solely on the NSQIP calculator but should consider multiple preoperative risk assessment tools.
Level of Evidence 3.
INTRODUCTION
The current health care environment is increasingly emphasizing the need for proper risk stratification that can not only be applied to a wide range of specialties but can also be utilized for specific procedures given a patient’s preoperative disposition. There are many such programs that aim to link patient outcomes from surgery to provider reimbursement, some of which are organized by the Centers for Medicare and Medicaid Services, such as pay for performance and physician quality reporting system. Such programs are constantly being integrated into clinical practice to improve patient outcomes as well as decrease hospital costs.1,2 The intention of these programs is to create a risk stratification model that facilitates appropriate risk-adjusted profiles for individual patients preoperatively with a certain predictability of potential surgical complications. Such risk assessment tools that are customizable to the patient have been shown to be more powerful than generic predictive models.3
The novel American College of Surgeons’ National Surgical Quality Improvement Program (NSQIP) risk calculator was created using data from more than 500 hospitals to aid preoperative risk stratification of patients undergoing major surgery. This calculator is accessible to the public online (https://riskcalculator.facs.org/RiskCalculator/), is inclusive of all surgical specialties, and has been previously validated.4 More specifically, the calculator uses 21 patient-specific variables as well as a current procedural terminology (CPT) code for the patient’s specific procedure in order to generate a predicted risk for the 11 complication categories. The National Quality Forum has previously advocated that this is a viable tool to assess individual risk for numerous specialties.5 Utilization of the calculator in a spine cohort has been previously studied, with Veeravague et al finding that the calculator consistently underestimated complication occurrence.6 McCarthy et al found that in cervical patients undergoing fusions, the calculator was only predictive of overall complication occurrence and discharge status as it was unable to accurately predict complications on a more granular basis.7
Despite the increasing research of the calculator’s predictability in various surgical specialties, there has yet to be a study that utilizes the calculator in an adult spinal deformity (ASD) cohort. The current study aimed to validate the calculator’s applicability in a single institution ASD cohort for all of the shared outcomes predicted by the risk stratification tool.
MATERIALS AND METHODS
Study Design and Data Sources
This study is a single-center validation cohort study of prospectively collected, retrospectively analyzed data. The single-center database (Quality) contains spine patients presenting to a single academic institution from September 2011 to June 2018. Institutional Review Board approval was obtained. Inclusion criteria consisted of age >18 years and operative treatment for ASD, with available radiographic, surgical, and health-related quality of life data. ASD was defined as scoliosis ≥20°, sagittal vertical axis ≥5 cm, pelvic tilt ≥25°, or thoracic kyphosis ≥60° and undergoing ≥4-level fusions.
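The inclusion rule above can be written as a short predicate. The following is only a minimal sketch, not the authors' screening code: the `SpinePatient` record and its field names are hypothetical, and it assumes the definition is read as "any one radiographic criterion, plus a fusion of at least 4 levels." The requirement for available radiographic, surgical, and quality of life data is not modeled.

```python
from dataclasses import dataclass


@dataclass
class SpinePatient:
    # Hypothetical field names chosen for illustration only.
    age_years: float
    operative: bool
    scoliosis_deg: float           # coronal curve magnitude
    sva_cm: float                  # sagittal vertical axis
    pelvic_tilt_deg: float
    thoracic_kyphosis_deg: float
    levels_fused: int


def meets_asd_criteria(p: SpinePatient) -> bool:
    """Return True if the patient satisfies the stated ASD inclusion rule.

    Assumption: the four radiographic thresholds are alternatives (OR),
    while age, operative treatment, and fusion length are all required (AND).
    """
    radiographic = (
        p.scoliosis_deg >= 20
        or p.sva_cm >= 5
        or p.pelvic_tilt_deg >= 25
        or p.thoracic_kyphosis_deg >= 60
    )
    return p.age_years > 18 and p.operative and radiographic and p.levels_fused >= 4


if __name__ == "__main__":
    # Made-up patient: qualifies through sagittal vertical axis alone.
    example = SpinePatient(52, True, 10, 6.2, 18, 40, 7)
    print(meets_asd_criteria(example))
```

Under a different reading of the definition (for example, the fusion requirement attached only to the kyphosis criterion), the boolean grouping would need to change accordingly.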
ASD patients from NSQIP were analyzed from 2005 to 2016. The NSQIP database is an initiative developed by the Veterans Health Administration to track the risk-adjusted outcomes of surgical patients. NSQIP collects and tracks patient demographics, preoperative risk factors, CPT coding, International Classification of Disease 9th Edition coding, surgical information, and 30-day perioperative outcomes from randomly assigned patients at participating hospitals. Online Supplemental Appendix A displays the CPT/International Classification of Disease 9th Edition codes used to define our ASD cohort.
Using NSQIP Calculator
There are a total of 13 postoperative variables that are predicted by the NSQIP calculator as listed in Table 1. However, between our single institution Quality database and the NSQIP database, there were a total of 7 shared postoperative variables, which are labeled in Table 1. In order to utilize the NSQIP calculator in our single institution, we collected baseline demographic data: age, sex, functional status, emergency case, American Society of Anesthesiologists class, steroid use, prior ascites, prior systemic sepsis, ventilator dependence, disseminated cancer, diabetes, hypertension, congestive heart failure, dyspnea, smoker status, history of severe chronic obstructive pulmonary disease, dialysis, acute renal failure, and body mass index (BMI).
Statistical Analysis
ASD patients were isolated in NSQIP according to their CPT code (Online Supplemental Appendix A). These patients were then stratified according to frailty status as developed by Miller et al based on the standard procedure published by Searle et al (not frail <0.3, frail 0.3–0.5, severely frail >0.5),8,9 deformity graded by T1 pelvic angle (TPA) >30 and pelvic incidence minus lumbar lordosis (PI-LL) >20,10 and reoperation status. Individual scores were calculated for each of the above groupings for all the analyzed CPT codes and averaged to create the calculator’s “predicted” value. Patients in the Quality database were then analyzed for the rate of each of the 7 outcomes listed in Table 1. Brier scores were then calculated for each variable in order to validate the calculator’s predictability in Quality. The Brier score is a quadratic scoring rule that measures the distance between observed and predicted risk. It is calculated from the squared differences between the binary outcome ($Y$) and the predicted risk ($p$), $(Y - p)^2$, averaged over patients. A score closer to 1 and >0.05 means the NSQIP calculator is a poor predictive tool for that specific outcome. A score closer to 0 means the NSQIP calculator was a predictive tool for that factor.
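As an illustration of the Brier score described above, the sketch below computes it for one outcome. It is not the software used in this study. The function names and the example data are hypothetical, and the squared differences are averaged over patients (the usual form of the score) so that the result stays between 0 and 1. The 0.05 cutoff is the one stated in the text.

```python
def brier_score(outcomes, predicted_risks):
    """Mean squared difference between binary outcomes (0 or 1) and predicted risks (0 to 1)."""
    if len(outcomes) != len(predicted_risks) or not outcomes:
        raise ValueError("outcomes and predicted_risks must be non-empty and of equal length")
    squared = [(y - p) ** 2 for y, p in zip(outcomes, predicted_risks)]
    return sum(squared) / len(squared)


def is_poor_predictor(score, cutoff=0.05):
    """Apply the cutoff used in the text: scores above 0.05 indicate poor predictability."""
    return score > cutoff


if __name__ == "__main__":
    # Made-up data: 1 complication among 4 patients, each assigned the same predicted risk.
    observed = [1, 0, 0, 0]
    predicted = [0.115, 0.115, 0.115, 0.115]
    score = brier_score(observed, predicted)
    print(round(score, 4), is_poor_predictor(score))
```

Because the score squares each error, a single patient with a complication and a low predicted risk contributes far more than several patients without a complication, which is one reason a rarely predicted but commonly observed outcome tends to score poorly.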
External Validation of NSQIP Calculator
The Hosmer-Lemeshow test was performed to determine whether the differences between observed and expected proportions are significant. A large P value indicates that the difference between the number of observed and expected values is insignificant, and the model is therefore considered valid. If the P value is smaller than the specified level of significance (P < 0.05), the difference between the number of observed and expected values is statistically significant, and the model is therefore considered not valid.
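The test logic can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the procedure actually run for this study: patients are split into equal-sized groups ordered by predicted risk, the degrees of freedom are taken as the number of groups minus 2 (a common convention; some external validations use the number of groups instead), and the function names `chi2_sf` and `hosmer_lemeshow` are hypothetical. The chi-square tail probability is computed with the standard library only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Start from a closed form (nothing for even df, erfc for df = 1),
    # then raise the degrees of freedom by 2 per step using the
    # incomplete-gamma recurrence.
    if df % 2 == 0:
        tail, nu = 0.0, 0
    else:
        tail, nu = math.erfc(math.sqrt(half)), 1
    while nu < df:
        tail += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return tail


def hosmer_lemeshow(outcomes, predicted_risks, groups=10):
    """Return (chi-square statistic, P value) comparing observed and expected events."""
    if groups < 3:
        raise ValueError("at least 3 groups are needed so that df = groups - 2 >= 1")
    # Order patients by predicted risk only (stable sort keeps ties in input order).
    pairs = sorted(zip(predicted_risks, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        variance_term = expected * (1.0 - expected / size)
        if variance_term > 0:
            statistic += (observed - expected) ** 2 / variance_term
    return statistic, chi2_sf(statistic, groups - 2)


if __name__ == "__main__":
    # Made-up example: every patient has a complication but the predicted risk is 10%.
    stat, p_value = hosmer_lemeshow([1] * 40, [0.1] * 40, groups=4)
    print(stat, p_value, "not valid" if p_value < 0.05 else "valid")
```

With well-calibrated predictions the observed and expected counts agree within each group, the statistic stays small, and the P value is large; systematic underprediction, as in the made-up example, inflates the statistic and drives the P value below 0.05.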
RESULTS
Cohort Overview
A total of 1606 ASD patients were isolated from the Quality database (48.7 years, 63.8% women, 25.8 kg/m2). 33.4% received decompressions, and 100% received a fusion. 15.1% of the Quality patients had past medical history of hypertension, 3.1% malignant cancer, 5.2% diabetes, 2.6% connective tissue disease, and 2.8% chronic pulmonary disease (Table 2). All of the patients included were without metastatic spine disease and recovered from their solid organ malignancy.
Outcomes Between Quality and NSQIP Patients
On average, the NSQIP risk calculator predicted lower values for NSQIP patients than were observed in Quality patients for return to operating room (0.8% vs 2.4%), length of stay (3.5 vs 6.5 days), total complication rate (11.5% vs 16.5%), and cardiac complications (0.34% vs 1.9%). The single institution did have lower urinary tract infection and surgical site infection rates (1.7% vs 2.85%; 1% vs 1.8%, respectively). The calculated Brier scores, which identify the calculator’s predictability for each factor, are displayed in Table 3. As identified by scores <0.05, all of the variables had good predictability when used in a single institution cohort.
NSQIP Calculator in Frail Patients
When analyzed by frailty status, 7.6% of ASD patients were categorized as frail while 92.4% were not frail. By basic demographics, frail patients were older (65.2 vs 46.9 years), had a larger BMI (32.2 vs 25.2 kg/m2), and had a greater Charlson Comorbidity Index (CCI) (2.9 vs 0.2; all P < 0.05). These differences were adjusted for in order to properly identify the calculator’s predictability. For not frail patients, all the variables were predictive, with the NSQIP calculator displaying appropriate Brier scores. However, for frail patients, the calculator did not accurately predict “any complications,” displaying the highest Brier score of 0.3 (Table 4).
NSQIP Calculator in Deformed Patients
Patients who had a high TPA (>30) were older (66.7 vs 32.6 years), had a higher BMI (30 vs 24.9), and had a greater CCI (1.9 vs 0.6; all P < 0.05) than those who had a low TPA. The same baseline demographic differences were identified for patients with high PI-LL (>20) and low PI-LL: age (63.2 vs 42.6), BMI (30.4 vs 26.3), and CCI (1.9 vs 1.0; all P < 0.05). Adjusting for these baseline differences, the calculator displayed the same poor predictability for “any complications” as shown in Table 5 and Table 6 for their TPA and PI-LL deformity, respectively.
NSQIP Calculator in Reoperation Patients
Of the 1606 ASD Quality patients who were isolated, 10.8% required a reoperation. There were no differences in basic demographics among these ASD reoperation patients, so no adjustment to the NSQIP calculator was required. As compared with the previously identified predicted values shown in Table 3 and Table 7, the NSQIP calculator accurately predicted cardiac complication, surgical site infection, urinary tract infection, return to operating room, and death. However, it had a Brier score >0.05 for “any complication,” indicating poor predictability for this variable.
External Validation of NSQIP Calculator
After performing the Hosmer-Lemeshow test, the NSQIP calculator was not valid in a single institution for ASD patients when stratified by frailty ($\chi^2 = 587.4$; $P = 8.1 \times 10^{-126}$), high TPA ($\chi^2 = 38.9$; $P = 6.9 \times 10^{-8}$), high PI-LL ($\chi^2 = 43.9$; $P = 6.4 \times 10^{-9}$), and reoperations ($\chi^2 = 54.8$; $P = 3.9 \times 10^{-11}$).
DISCUSSION
Perioperative metrics such as the NSQIP risk calculator have become increasingly utilized to assess surgical risk in various fields in order to ensure quality improvement.11–13 Such preoperative tools enable providers to mitigate a patient’s potential risks by incorporating a pretreatment plan that addresses them. Schenker et al found that risk calculators provide surgeons with improved preoperative morbidity and mortality estimates, thus improving the informed consent process. The attention on these assessments is a result of the use of complication occurrence as a proxy for the quality of care within public reporting efforts.14
In the current study, we evaluated the predictive utility of the NSQIP surgical risk calculator in single-institution data for ASD patients in general and for various subsets of patients. We identified that the NSQIP calculator has poor predictability for “any complications” in patients with a deformed TPA and PI-LL as well as frail patients and those undergoing a reoperation. Although the calculator has been validated in recent literature,15,16 the findings are for a select surgical population and did not apply to our ASD cohort and subgroups since our models failed to be externally validated according to the Hosmer-Lemeshow test.
Currently, the applicability of the NSQIP risk calculator in spine-specific patients is limited. The studies that have utilized the calculator in spine patients have identified that the calculator consistently predicts lower rates than those that are actually observed in the population.6,14 This is in part due to the calculator’s inability to accurately assess a patient’s risk profile regardless of the planned procedure. The calculator’s postoperative risk equation is only based on basic demographics and presurgical comorbidities and does not account for a patient’s frailty status, preoperative deformity, and past surgical history. As identified by Cho et al, reoperation of ASD patients inherently predisposes patients to an elevated risk of complications. However, they identified several risk factors that contribute to these outcomes, such as fusion length, type of osteotomy, and preoperative radiographic measurements.17 The modified Frailty Index has also been related to postoperative complications, with a higher modified Frailty Index associated with an increased risk of 30-day postoperative complications.18 Because the calculator does not take these risk factors into account, it should be combined with other preoperative risk stratification models when used in a spine-specific ASD population.
Nonetheless, the role of a preoperative risk assessment tool should not be discouraged in the light of these results. These tools encourage comprehensive preoperative discussion with the patient and help create a patient-centered treatment plan. Using the NSQIP calculator in conjunction with other preoperative risk assessment tools can aid in postoperative care by adjusting for patient factors at baseline in order to optimize outcomes as previously reported.19,20 Given that the current health care era is more patient-centric, it is imperative to be cognizant of patient characteristics that may lead to the development of impairments and worse patient satisfaction. Properly stratifying spine patients by taking into account the factors identified in this study can ensure appropriate patient care, thus minimizing the patient’s financial burden.
This study was not without limitations. First, our single-center data were obtained through retrospective review, which carries inherent limitations and introduces biases, including the potential for provider selection to confound results. Second, the reoperation rate of our cohort (10.8%) was relatively high compared with others, which may have contributed to the elevated observed overall complication rate among cohorts. However, despite these limitations, these results provide valuable discussion on the use of risk stratification tools such as the NSQIP risk calculator in surgical specialties, taking into consideration baseline patient characteristics.
CONCLUSIONS
The NSQIP calculator is not a valid calculator in our single institutional database. It is unable to comment on surgical complications, such as return to surgery and cardiac complications that are typically associated with poor patient outcomes.
Supplementary material
Online Supplemental File 1.
Footnotes
Funding The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests The authors report no conflicts of interest in this work.
Disclosures Renaud Lafage reports stock/stock options from Nemaris. Virginie Lafage reports royalties/licenses from Nuvasive; consulting for Alphatec Spine and Globus Medical; paid presenter or speaker for DePuy, Stryker, and the Permanente Medical Group; and leadership roles for the Scoliosis Research Society, International Spine Study Group, and European Spine Journal. Michael C. Gerling reports royalties from Integrity Implants; consulting fees from Integrity Implants, RTI Surgical, and Wolf Endoscopic; and a leadership role with AAOS and the Cervical Spine Research Society. Themistocles Stavros Protopsaltis reports royalties from Altus; consulting fees from Globus Medical, Medicrea, Medtronic, Nuvasive, and Stryker; and stock/stock options from Spine Align and Torus Medical. Aaron James Buckland reports consulting fees from Medtronic, Nuvasive, and Stryker. Peter G. Passias reports consulting fees from Medtronic, Medicrea, Royal Biologics, SpineWave, and Terumo; paid presenter/speaker from Globus Medical and Zimmer; research support from the Cervical Scoliosis Research Society; leadership role with Spine; and other financial or material support from Allosource, Cerapedics, and Spinevision.
- This manuscript is generously published free of charge by ISASS, the International Society for the Advancement of Spine Surgery. Copyright © 2023 ISASS. To see more or order reprints or permissions, see http://ijssurgery.com.