Abstract
Background The authors conducted a comprehensive review and integration of insights from 4 webinars hosted by the International Society for the Advancement of Spine Surgery (ISASS) to arrive at recommendations for best clinical practices for guideline development for endoscopic spine surgery. This perspective article discusses the limitations of traditional surgical trials and amalgamates surgeons’ experience and research on various cutting-edge techniques.
Methods Data were extracted from surveys conducted during each webinar session involving 3639 surgeons globally. The polytomous Rasch model was employed to analyze responses, ensuring a robust statistical assessment of surgeon endorsements and educational impacts and focusing on operative nuances and experience-based outcomes. Bias detection was performed using the differential item functioning test.
Results The ISASS webinars provided a dynamic platform for discussing advances in endoscopic spine surgery, identifying a range of high-value procedures from basic discectomies to complex lumbar interbody fusions. Each high-value endoscopic spine surgery was highlighted in separate peer-reviewed publications, which form the basis for this summary document that synthesizes key takeaways from these webinars. High-value clinical applications of endoscopic spine surgery, primarily defined as higher-intensity endorsement transformation from the pre- to postwebinar survey with a shift to higher mean logit locations of test items both with unbiased and orderly threshold progression, were: (a) percutaneous interlaminar endoscopic decompression for lateral canal stenosis, (b) transforaminal debridement of low-grade degenerative spondylolisthesis, (c) transforaminal full-endoscopic interbody fusion for hard disc herniation, (d) endoscopic standalone lumbar interbody fusion, (e) endoscopic debridement of spondylolytic spondylolisthesis, and (f) posterior cervical foraminotomy for herniated disc and bony stenosis.
Conclusions The ISASS webinar series has significantly impacted surgeons’ education and contributed to the identification of high-value endoscopic spine surgery practices that may serve as a cornerstone for surgeon training standards, policy, and guidelines development. Ongoing research on technological advancements and expansions of clinical indications combined with systematic review is expected to refine the recommendations on high-value endoscopic spinal surgeries recommended for enhanced reimbursement.
Clinical Relevance Assessing surgeon confidence and acceptance of endoscopic spinal surgeries using polytomous Rasch analysis.
Level of Evidence Level 2 (inferential) and 3 (observational) evidence because Rasch analysis provides statistical validation of instruments rather than direct clinical outcomes.
- endoscopic spine surgery
- clinical guidelines development
- Rasch analysis
- surgeon experience
- high-value surgical procedures
- bias detection
- surgical trial limitations
Introduction
Nowadays, endoscopic spinal surgery is widely practiced. However, the delineation of best clinical practices in the form of clinical guidelines for innovative technologies requires unbiased superiority evaluation of clinical benefits before calling for the replacement of traditional open spinal surgery protocols. While inevitable, such changes are met with resistance, particularly when high-grade clinical evidence is scarce. Critics call out the paucity of prospective randomized, double-blinded clinical trials in endoscopic spine surgery. These are traditionally regarded as the pinnacle of clinical evidence.1 The subject becomes quickly complex and may result in stalemate because there are many challenges to surgical clinical trials where the rigorous standards required by such trials often lead to the dismissal of innovative therapies and protocols in spine surgery.2 The reality is that clinical trials in spine surgery face substantial limitations, suggesting a need to redefine what constitutes the creation of highest-grade clinical evidence to foster changes in practice.3
Clinical Trial Limitations
Innovations in spine surgery are often driven by entrepreneurial surgeons,4 with outcomes typically reported as opinions or retrospective case series, which are susceptible to various biases. Efforts to mitigate bias through randomization and stringent inclusion/exclusion criteria can extend enrollment periods and potentially skew patient selection,5 leading to a loss of clinical equipoise or a study group that does not accurately represent the typical patient population treated by spine surgeons.6 A comprehensive examination reveals that other surgical subspecialties employ various classifications and levels-of-evidence reporting, tailored to specific clinical scenarios.7 In spine surgery, research tends to focus on diagnosis, preferred treatment options, and economic analyses, while prognosis-based classifications may be more suitable in fields like plastic surgery.8,9 Large-scale clinical trials in spine surgery encounter difficulties with controlled double-blinded randomization. Moreover, many spine surgery trials struggle to progress beyond the Phase II single-center stage for a plethora of reasons, ranging from insufficient organizational and financial support to challenges with institutional review board approvals, trial registration, and the ethical concerns linked to control groups that may potentially harm patients.3,10
Clinical research in spine surgery is also hampered by several problems with randomization.3 Crossover is 1 major problem that may degrade originally well-designed randomized controlled trials (RCTs).11–14 Fast-evolving surgical technologies may quickly make an ongoing RCT no longer needed or justifiable. The most significant drawback to surgical RCTs is the inability to blind. Patients almost always know what surgery was done, and surgeons always know what surgery they performed. While blinding reduces bias, as demonstrated by a recent systematic review of 250 RCTs,5 it may also cause considerable differences in treatment effects between double-blinded and open-label trials. Blinding is nearly impossible in surgical trials as sham-controlled interventions are rarely feasible.2 Furthermore, RCTs often suffer from limited generalizability due to strict eligibility criteria that do not accurately represent the spectrum of clinical issues seen in routine practice.15
Systematic literature reviews suggest that well-designed prospective observational cohort studies could provide higher-grade evidence than poorly executed randomized trials, especially if results are consistent across studies, and study reporting adheres to the STROBE checklist to ensure transparency in reporting by submitting detailed information for different groups in case-control and cohort studies, as discussed by von Elm et al in a 2007 Lancet article.16 This approach might represent the highest attainable standard of evidence in a specialty such as spine surgery that heavily relies on experience and skill. Surgical cohort studies are also more adept at capturing the real-world effectiveness of interventions as they do not restrict participant selection and may be more applicable to typical, nonstudy patients.
Observational studies may also be of higher pertinency for the average practicing spine surgeon who has limited time and resources to support meaningful outcome research in cash-strapped health care systems, where there is no extra time or resources to conduct complex clinical trials. Thus, advocating for strict adherence to double-blinding and randomization to justify protocol change for example from open to endoscopic spine surgery may be unrealistic and unfeasible.17 The inherent limitations of these RCT protocol designs in surgical outcome analysis represent a well-recognized barrier often referred to as the “glass ceiling effect.”18,19 Calling attention to the common RCT constraints requires a pragmatic reassessment of evidence standards and, in this article, specifically how they apply to endoscopic spinal surgery.
Traditional Workarounds
Several practical workarounds have been applied. Pseudorandomization between centers may illustrate differences in preferred treatments. Statistical power may be enhanced by concurrent data collection at different institutions or by orchestrating parallel cohort studies and separating the surgical team from administrators, outcome evaluators, and data analysts in an attempt to mitigate the lack of blinding.20 Propensity scoring is another method to reduce selection bias and confounding of treatment effects due to patients’ characteristics.21–25 The propensity score, the probability of treatment exposure conditional on covariates, is the basis for 2 approaches to adjusting for confounding: methods based on stratification of observations by quantiles of estimated propensity scores and methods based on weighting observations by the inverse of estimated propensity scores. Both of these approaches and related methods offer improved precision by identifying candidate covariates, prioritizing, and integrating them into a propensity-score-based confounder adjustment model.21 However, the process is not straightforward and requires a multistep algorithm to implement high-dimensional proxy adjustments of clinical data that consists of (1) identifying data dimensions, for example, diagnoses, procedures, comorbidities, and previous spinal surgeries; (2) empirically identifying candidate covariates; (3) assessing the frequency and recurrence of problems; (4) prioritizing covariates; (5) selecting covariates for adjustment; (6) estimating the exposure propensity score; and (7) estimating an outcome model.
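The 2 propensity-score adjustments described above (stratification by quantiles and inverse weighting) can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated elsewhere; the function names and any data passed to them are hypothetical and are not taken from the cited studies.

```python
def quantile_strata(propensity, n_strata=5):
    """Assign each observation to a stratum by the rank of its propensity score."""
    order = sorted(range(len(propensity)), key=lambda i: propensity[i])
    strata = [0] * len(propensity)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(propensity)
    return strata


def ipw_weights(treated, propensity):
    """Inverse-probability weights: 1/e for treated patients, 1/(1 - e) for controls."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t else 1.0 / (1.0 - e))
    return weights


def weighted_mean_difference(outcomes, treated, weights):
    """Weighted mean outcome of treated patients minus that of controls."""
    num_t = sum(w * y for y, t, w in zip(outcomes, treated, weights) if t)
    den_t = sum(w for t, w in zip(treated, weights) if t)
    num_c = sum(w * y for y, t, w in zip(outcomes, treated, weights) if not t)
    den_c = sum(w for t, w in zip(treated, weights) if not t)
    return num_t / den_t - num_c / den_c
```

The sketch covers only the final weighting or stratification step; covariate identification, prioritization, and selection, which the text identifies as the difficult part, are not represented.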
This approach may work well in typical pharmacoepidemiological studies, where the proposed high-dimensional propensity score improved effect estimates compared with adjustment limited to predefined covariates when benchmarked against results expected from randomized trials.23 However, adjusting covariates and estimating an outcome model could easily distort outcome data of spine patients, who present with high variability of painful conditions and available treatments. The practicing spine surgeon may therefore question its relevancy to “real-world” scenarios and doubt its generalizability. At a minimum, employing propensity scoring hardly seems more practical than an RCT.
Registries with prospective enrollment, standardized data collection methods, adequate follow-up, and adjustment for confounding variables have been set up to compare novel to established surgeries.15,26 In reality, registries traditionally had little traction in spine surgery because they were impractical or failed to deliver the desired clinical evidence rapidly.27 Their inherent limitations hindered the production of high-grade evidence, and data are typically collected in a nonrandomized fashion.28–30 Selection bias is common, as patients are not randomly assigned to treatment groups. They rely on voluntary data submission from multiple centers with different levels of detail and accuracy in data recording. Inconsistent data quality and the heterogeneity of treatments and patient populations make it difficult to draw generalizable conclusions. Hindsight bias may be introduced by patients,31 and incomplete capture of outcomes and utilization is often problematic because of a lack of long-term follow-up.26 Additionally, there can be underreporting of negative outcomes or complications for various reasons, including reporting bias or inconsistencies in how complications are defined and recorded across different reporting sites.32 Without the strict controls of a clinical trial, patient registries may introduce a wide range of confounding variables, such as variations in surgical technique, surgeon experience, and patient characteristics, which can obscure the effects of the surgical benefit itself.
Another way to demonstrate differences in treatment effects between surgical treatment study groups is to illustrate the durability of the treatment effect over time. This can be visualized using Kaplan-Meier curves in survival analysis.33–35 Although these curves lose accuracy as more patients are censored, they offer a straightforward visual representation of postoperative outcomes over time. These curves not only assist in managing patient expectations concerning the likelihood of reoperation33 and overall functional outcomes but also highlight the clinical benefits of competing treatments to peers and decision-makers, aligning with the goals of improved patient care and the principles of good stewardship of health care resources to be used efficiently and cost-effectively.
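As an illustration of how such a curve is computed, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. It assumes each patient contributes a follow-up time and a flag indicating whether the event (for example, reoperation) was observed or the patient was censored; the function name is hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time per patient.
    events: 1 if the event (e.g., reoperation) was observed, 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve
```

Censored patients shrink the risk set without producing a step in the curve, which is why the estimate becomes less precise as censoring accumulates.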
Evidence-Based Medicine and Guideline Development
The concept of “evidence-based medicine” (EBM) was introduced by Dr Gordon Guyatt at McMaster University.36 The foundational work, also promoted by his adviser, Dr David Sackett,37,38 has since seen the term EBM become widespread. Despite its common use, a deep comprehension of its full meaning is not as pervasive. EBM was originally crafted to blend 3 essential elements—best research evidence, clinical expertise, and patient values. This tripartite foundation, although sometimes neglected in today’s conversations about EBM with a near-exclusive emphasis on clinical trials, is crucial to its application in spine care. Recapping the initial EBM definition makes the case for empowering the stakeholders—surgeons and their patients. In spine surgery, though, clinical evidence and guideline development face a range of challenges due to the complexity of spinal disorders and the diversity of potential treatments.39,40 Clinical studies often involve issues with standardization of surgical techniques and difficulty in controlling for confounding variables, such as the surgeon’s skill and patient selection criteria. Additionally, there is significant variability in patient anatomy and the natural history of spinal diseases, which complicates the generation of high-quality, generalizable evidence (Table 1).
The traditional systematic process of literature review and committee review is laborious, resource intensive, and requires considerable time, expertise, and funding. The key steps to ensure that the guidelines are evidence-based, clinically relevant, and up-to-date are summarized in Table 2. Moreover, conflicts of interest of committee members with industry sponsorship may impact guideline development and must be managed to prevent biased recommendations.41,42 A plethora of spine surgery clinical guidelines has been published.43–62 Keeping them current can be quite challenging because rapid advancements in spine surgery can outpace the update cycle of guidelines. The latter problem may cause practitioners not to implement or adhere to the guidelines in clinical practice, particularly if they are not motivated by a perceived lag between technology advances and adjustments in medical necessity criteria for intervention to prompt payer authorization. Therefore, traditional clinical guidelines development should evolve into more dynamic and technologically advanced approaches, as proposed by the authors in the following section.
Living Clinical Guideline
The “Living Clinical Guidelines” concept is an innovative approach to clinical practice recommendations that aim to keep guidelines current by adapting to the latest evidence without the delays inherent in traditional guideline updating processes.63 In spine surgery, this concept is particularly valuable due to the rapid pace of technological advancement and the continuous emergence of new technology and its associated clinical data. This process of rapidly assessing the clinical evidence of new emerging technologies depends on real-time updates where existing recommendations may be altered or new ones added. Surgeons, patients, and other stakeholders should be engaged via digital surveys to poll their opinions on clinical outcomes with new technologies and to better understand their value and the psychometric motivators of clinical decision-making. “On-the-ground” level engagement is critical to update the guidelines to ensure they remain relevant and practical.
Policy statements for medical coverage recommendations should be issued based on ongoing guideline updates. The International Society for the Advancement of Spine Surgery (ISASS) has issued several such policy statements and updates to facilitate negotiations with the American Medical Association’s Relative Value Scale Update Committee, insurance companies, government entities, and health care systems.64–70 They outline a new technology’s surgery indications and recommended coverage for payment based on comprehensive analysis of clinical efficacy, safety, and cost-effectiveness comparing the new surgical procedure to established standards. One of the latest examples relates to an annular repair device and its appropriate clinical use.71 Others relate to correcting the misvaluation of the Category I Current Procedural Terminology (CPT) code (27279) assigned to minimally invasive sacroiliac joint arthrodesis72 and to the open surgical decompression and interlaminar stabilization CPT code 22867.65 These policy and coverage statements are crucial for both patients and health care providers as they determine accessibility and reimbursement. Integration of modern digital platforms employed by the authors of this article may facilitate the ongoing process by disseminating updates promptly.
The Alternative Approach—Tapping Directly into Spine Surgeons’ Clinical Experience
Spine surgeons’ clinical decision-making has its foundation in postgraduate training, traditionally formulated clinical guidelines, and, most importantly, clinical experience. Implementing protocol change due to new high-grade clinical evidence has been characterized as slow because of study distortions by strict patient selection and randomization criteria that cannot be easily replicated in an individual spine surgeon’s practice. Surgeons may be eager to consume the new information but often return to their established practice protocols because they do not believe the information presented on a new technology applies to their patients. This common scenario creates a disconnect between the formalized clinical evidence study by professional surgeon organizations, payers, and governmental health agencies and the problems encountered in on-the-ground clinical decision-making, payor authorization for surgery, and reimbursement.
Tapping psychometrically into spine surgeons’ clinical experience through surveys analyzed with the Rasch model can significantly enhance the development of living clinical guidelines.65,72,73 The Rasch model is a psychometric tool used for constructing and analyzing surveys statistically after logarithmic transformation, particularly in the health sciences, ensuring that the data derived from these surveys are reliable and valid on a linear scale.72,74–81 This partial agreement analysis methodology has several benefits in rapid guideline development, listed in Table 3. The authors applied the Rasch model to leverage the clinical experience of spine surgeons to rapidly and comprehensively measure surgeons’ experience with specific endoscopic procedures and their most appropriate surgical indications. The psychometric measurement approach of surgeons’ level of endorsement for proposed protocol changes (Figure 1) was expected to provide the deeper granular information needed to create living clinical guidelines that are not only evidence-based but also grounded in the day-to-day practical realities of patient care, thus increasing the issuing organization’s relevance in surgeons’ clinical practice.
Rasch Methodology to Filter Out High-Value Endoscopic Surgeries
The Rasch model suggests that the characteristics of the surgeon, based on their experience and abilities, and of the item (a specific endoscopic surgery and its perceived clinical results) determine the probability of a particular outcome in an empirical context. It models ordered response data by the likelihood of a response falling into categories such as “strongly agree,” “agree,” “disagree,” or “strongly disagree.” In the polytomous Rasch model, scoring x on an item indicates that an individual has surpassed x thresholds on a continuum while not surpassing the remaining m − x thresholds. Mathematically, the application of the Rasch model in the authors’ series of webinar survey studies is expressed as the log odds (or logit) of a surgeon endorsing an item, reflecting the difference between the surgeon’s ability or level of agreement and the item’s difficulty or a specific endoscopic surgery’s appropriate clinical application to achieve favorable clinical outcomes. The model uses χ² fit statistics, outfit, and infit to evaluate the data’s fit to the model, a process contrary to descriptive statistics employing regression or analysis of variance computations where the model is fit to the data.
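For orientation, the verbal description above corresponds to the partial credit form in which the polytomous Rasch model is commonly written. The notation is assumed here for illustration and is not taken from the source articles: θ_n is the location of surgeon n, δ_ik is the k-th threshold of item i, and m is the number of thresholds (one fewer than the number of response categories).

```latex
% Probability that surgeon n scores x (x = 0, ..., m) on item i;
% the empty sum for x = 0 is defined as 0.
P(X_{ni} = x) =
  \frac{\exp\left( \sum_{k=1}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{j=0}^{m} \exp\left( \sum_{k=1}^{j} (\theta_n - \delta_{ik}) \right)}

% Equivalent adjacent-category log odds (logit) for x = 1, ..., m:
\ln \frac{P(X_{ni} = x)}{P(X_{ni} = x - 1)} = \theta_n - \delta_{ix}
```

The second line is the logit statement in the text: the log odds of moving up one category equal the difference between the surgeon’s location and the corresponding item threshold.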
The findings from the polytomous Rasch analysis are visually presented in the Wright plot83 and through person-item map analysis,84 which help to visually separate the easy from the hard to agree on items and to show the level of agreement, the number of endorsing surgeons, and the location of the median logit on a logarithmic scale, where the 0 logit location represents a 50%/50% chance of a surgeon agreeing or disagreeing with the item (Figure 2). In the context of this clinical guidelines article, the logit locations of items that are harder to agree on are shifted to the right in the person-item map. Another way to graphically visualize the Rasch analysis results is through the item characteristic curves (ICCs), which represent the probability that a surgeon with a given ability level will endorse an item of a given difficulty. They plot the likelihood of an endorsing response against the surgeon’s location on the logit scale relative to the item’s difficulty. In the authors’ webinar survey Rasch analyses, the 5 categories of partial agreement measure generated 5 curves per test item, with each curve representing the intensity of the agreement or disagreement.
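The category curves described above can be sketched numerically. The example below assumes the partial credit form of the polytomous Rasch model with 4 thresholds for a 5-category item; the function names and any threshold values passed in are hypothetical and do not reproduce the authors’ estimates.

```python
import math


def category_probabilities(theta, thresholds):
    """Probabilities of responding in categories 0..m for one item.

    theta: person location in logits.
    thresholds: the m item thresholds in logits (m + 1 response categories).
    """
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def category_curves(thresholds, lo=-4.0, hi=4.0, steps=17):
    """Evaluate the category probabilities on an evenly spaced logit grid."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return [(theta, category_probabilities(theta, thresholds)) for theta in grid]
```

Plotting each column of the returned probabilities against theta yields one curve per response category, that is, 5 curves for a 5-category item.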
The authors employed the Rasch methodology as an examination and filtration tool to identify high-value endoscopic spine surgeries defined by high levels of endorsement and orderly response thresholds (Figure 2). The details of the methodology’s boundary conditions employed by the authors are detailed in the individual articles. Repeating them herein and explaining the details of the infit and outfit statistics would be beyond the scope of this perspective study on clinical guideline recommendations. The authors list the high-value endoscopic spinal surgeries in this summary article and kindly ask the reader to refer to each corresponding source article for the 4-part webinar series on Current and Emerging Techniques in Endoscopic Spine Surgery.
Prior calibration of survey questions is commonly deployed in Rasch methodology applications in education, for example, to determine whether a test was too easy or too hard or whether the students were ill- or well-prepared, and thereby to refine a test item’s ability to measure a desired known outcome. Such calibration did not apply here because there were no right or wrong responses per se. The authors also needed to learn the incoming survey responses. Still, they used the Rasch analysis as a filtration method to serendipitously identify definitive high-value clinical applications of the endoscopic spinal surgery platform: those that were easy to agree on, and those procedures with a supportive endorsement transformation from the pre- to postwebinar survey, even if harder to agree on, where a shift to higher mean logit locations occurred with an orderly threshold progression and without detection of bias.
Sample Size
Low sample size and biased outcome assessments are common grievances with traditional clinical trial results. The Rasch model operates under a principle of balanced requirements: to achieve a stable measure of individuals, the number of items presented should match the number of participants required to calibrate those items accurately. This symmetry is critical in psychometrics, as it ensures the reliability of the measurements derived from the model. According to Azizan et al, administering a set number of items (say, 30) to an equal number of participants, when done under conditions of appropriate targeting and good model fit, is likely to produce statistically stable measurements.85 Specifically, measures obtained in this setup are expected to be stable within ±1.0 logits at a 95% confidence level. With more than 50 responses, measures are expected to be stable within ±1.0 logits at a 99% confidence level. This balance helps enhance the precision of the Rasch model, making it a robust tool for assessing the likelihood of responses across a standardized scale. The authors achieved more than 100 responses in 4 of the 8 pre- and postwebinar surveys; the survey with the lowest number was still completed by 34 respondents and the remaining 3 by 57, 63, and 42 respondents, respectively. Therefore, the authors expected stability of the obtained measurements, with these parameters being essential for validating the construct under investigation, thus ensuring that the data that form the basis of these endoscopic spinal surgery summary recommendations reflected true differences in the trait or ability being measured rather than variations due to measurement error or sample size limitations.
Bias Detection
Rasch analysis excels at identifying disturbances in data, including biases, by analyzing residuals—the differences between observed and model-predicted responses. It generates fit statistics for each item to gauge their alignment with Rasch model expectations. The outfit mean square error statistic, sensitive to outliers, measures deviations from model predictions as a ratio of observed to expected variance, where a value of 1.0 signifies perfect fit, values above 1.0 indicate noise, and values below 1.0 suggest overfit. In contrast, infit is a weighted version that lessens the impact of less informative responses. Misfitting items, indicated by infit and outfit statistics, may function differently across respondent subgroups and could signal bias, known as differential item functioning (DIF). This bias can appear when individuals with equivalent abilities but different backgrounds respond inconsistently to an item. Such bias was detected in some of the responses between neurosurgeons and orthopedic surgeons. The difNLR() and difORD() functions are used for detecting DIF in dichotomous and ordinal data, respectively (Figure 2).86 The authors also attempted to detect data distortion with the MAPQ3 methodology rooted in Item Response Theory (IRT) analysis. This tool assists in identifying items that may disproportionately affect certain subgroups. Values of 0.3 or less indicate an absence of data distortion. The authors deemed the Rasch analysis particularly adept at detecting latent traits and item bias and considered it more sensitive than traditional linear regression or analysis of variance in this context because it is anchored in internal criteria, unlike traditional statistical test methods, which rely on external criteria, which in themselves have to be unbiased.87
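The outfit and infit mean squares described above can be sketched for a single item. As a simplifying assumption, the example uses the dichotomous Rasch model (endorse = 1, not endorse = 0) rather than the polytomous model applied in the webinar surveys, and it treats person and item locations as already estimated; all names are hypothetical.

```python
import math


def expected_and_variance(theta, delta):
    """Model-expected score and its variance for a dichotomous Rasch item."""
    p = 1.0 / (1.0 + math.exp(-(theta - delta)))
    return p, p * (1.0 - p)


def item_fit(responses, thetas, delta):
    """Return (outfit, infit) mean squares for one item.

    responses: 0/1 responses, one per person.
    thetas: person locations in logits.
    delta: item location in logits.
    """
    squared_standardized = []
    squared_residual_sum = 0.0
    variance_sum = 0.0
    for x, theta in zip(responses, thetas):
        expected, variance = expected_and_variance(theta, delta)
        squared_residual = (x - expected) ** 2
        squared_standardized.append(squared_residual / variance)
        squared_residual_sum += squared_residual
        variance_sum += variance
    # Outfit: unweighted mean of squared standardized residuals (outlier-sensitive).
    outfit = sum(squared_standardized) / len(squared_standardized)
    # Infit: information-weighted, so off-target responses count for less.
    infit = squared_residual_sum / variance_sum
    return outfit, infit
```

A single unexpected response from a person located far from the item inflates outfit much more than infit, which is the difference in sensitivity the text refers to.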
Criteria for Identification of High-Value Surgeries
The ISASS webinars have provided a dynamic platform for discussing advances in endoscopic spine surgery, identifying a range of high-value clinical applications of endoscopic procedures from basic discectomies to complex lumbar interbody fusions. Each high-value endoscopic spine surgery was highlighted in separate peer-reviewed publications which form the basis for this summary document that synthesizes key takeaways from these webinars to establish recommendations for comprehensive clinical guidelines. The following methodology was applied to identify high-value clinical applications.
In Rasch analysis, negative logit values between 0 and −2 indicate that an item is relatively easy compared with items with higher (or positive) logit values. This specific range suggests that the item is easy but not trivially so. It is easier than average but still requires some degree of ability or knowledge to be able to answer it. The closer the logit value is to 0, the closer the item’s difficulty is to the average level. A logit score of less than −2 for an item represents an extremely low difficulty level, making it an exceptionally easy item for almost all surgeon respondents regardless of their ability levels, which is observed with the test item articulating instruments (logit −15). Positive values up to +2 indicate that the item is more difficult than average but not extremely so. Such items are considered moderately challenging for surgeons since they require a higher level of ability to endorse the proposed clinical application of endoscopic spinal surgery. Items in this logit range are generally good at differentiating between surgeons who have average abilities and those who have slightly above-average abilities. They can effectively help in distinguishing surgeons based on their mastery or surgical skill of the endoscopic surgery platform in the test clinical application. Positive logits between 0 and +2 are an indication that the surveys were balanced and capable of accurately assessing surgeons across a spectrum of abilities for reliability. Most test items’ logits in the 8 surveys were in this range.
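To make the logit ranges above more tangible, the short sketch below converts the gap between a surgeon’s location and an item’s location into an endorsement probability. It assumes the simple dichotomous Rasch relationship for illustration; the function name is hypothetical.

```python
import math


def endorsement_probability(person_logit, item_logit):
    """Probability that a person endorses an item under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(person_logit - item_logit)))


# A surgeon at the average location (0 logits) facing items of varying difficulty.
for item_logit in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = endorsement_probability(0.0, item_logit)
    print(f"item at {item_logit:+.1f} logits: endorsement probability {p:.2f}")
```

Under this assumption, an average surgeon endorses an item located at 0 logits half of the time, an item at −2 logits roughly 88% of the time, and an item at +2 logits roughly 12% of the time.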
The authors also looked at mean threshold location, threshold progression, and threshold spread. Thresholds are the points at which choosing a higher category becomes more likely than choosing the current or a lower category. For example, in the authors’ Likert scale from 1 to 5, the threshold between 1 and 2 is the point at which a respondent is equally likely to choose 1 or 2. In an orderly progression, each successive threshold lies at a higher logit location than the one before it. Disorderly thresholds occur when this progression is not maintained. For example, surgeons might have found that the threshold between “agree” and “strongly agree” is actually lower than between “neutral” and “agree.” This situation can suggest that respondents do not perceive the categories as logically or consistently more demanding of the trait or ability. Disordered threshold progression occurred when the order of response categories for an item did not logically or consistently increase with the trait or ability being measured, indicating a problem with how the response categories function, which can impact the accuracy and reliability of the measurement. Disorderly thresholds may indicate misinterpretation, redundancy, or inappropriate scaling. The authors did collapse nondifferentiating categories in some cases to mitigate this problem.
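Checking threshold progression amounts to a simple ordering test on an item’s estimated thresholds. The sketch below assumes the thresholds are supplied in category order as logit values; the function names and any example values are hypothetical.

```python
def thresholds_are_ordered(thresholds):
    """True if every threshold lies above the one before it."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


def disordered_positions(thresholds):
    """1-based positions k where threshold k + 1 is not above threshold k."""
    return [
        k + 1
        for k, (a, b) in enumerate(zip(thresholds, thresholds[1:]))
        if b <= a
    ]


def threshold_spread(thresholds):
    """Distance in logits between the lowest and highest threshold."""
    return max(thresholds) - min(thresholds)
```

For a 5-category item with thresholds such as −1.8, 0.9, 0.3, and 1.9, the second function reports position 2, flagging the second and third thresholds as reversed. The categories on either side of a reversed pair are natural candidates for collapsing.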
The spread of logits is also of significance. Items with a narrow spread of logits are close in difficulty. This means there is less variance in how challenging the items are, which might have limited the surveys’ ability to differentiate effectively among test-takers of varying abilities. If all items are similar in difficulty, the assessment may only effectively measure a narrow band of abilities. It could be too easy or too difficult for individuals outside this band, leading to floor or ceiling effects where scores are clustered at the low or high ends of the scale. In contrast, a large range in the difficulty of items (item logits) or a broad distribution of abilities among individuals (person logits) suggests that the test includes items varying from very easy to very difficult allowing for better differentiating among surgeons with different levels of ability. When person logits show a wide spread, it indicates that the tested group has a varied range of abilities or traits.
To identify high-value clinical applications of the endoscopic surgery platform, particularly in relation to surgeon ability and confidence in achieving favorable outcomes, the authors used the following 4 criteria:
1. Logit Location With High Positive Logit Values. Items (in this context, specific endoscopic surgery applications) that have higher positive logit values indicate higher difficulty or complexity, which could translate to procedures that require more skill or confidence to perform. A high positive logit location suggests that only surgeons with higher abilities are confident in achieving favorable outcomes with these applications. This can be an indicator of a high-value application, especially if these procedures are recognized for their effectiveness despite their complexity.
2. Moderate to Wide Logit Spread. A broader range of logit values among items could indicate a diverse set of surgical applications ranging from basic to advanced complexity. A wide spread is beneficial because it ensures that the analysis captures a full spectrum of applications, from those considered straightforward to those viewed as more challenging. Applications falling on the higher end of this spread (more positive logits) and endorsed by capable surgeons could be considered high-value due to their specialized nature.
3. Orderly Threshold Progression. This criterion is perhaps the most critical aspect when evaluating high-value clinical applications. Orderly progression in threshold responses means that as surgeons’ confidence or perceived ability increases, so does their endorsement of a procedure’s potential to yield favorable outcomes. This orderly progression is a strong indicator that the surgical application is valid and reliable, and increasing levels of surgeon ability or confidence correlate with a higher expected success rate.
4. Avoiding Disorderly Threshold Progression. In contrast to orderly progression, disorderly thresholds, in which responses do not logically align with increasing ability or confidence levels, indicate confusion or inconsistency in how surgeons perceive the application. Disorderly thresholds might suggest that further clarification about the procedure’s effectiveness or additional training is necessary before the application can be considered high-value.
Thus, for identifying high-value clinical applications in endoscopic spinal surgery, focusing on these aspects within a Rasch model provides a robust method for determining which procedures are most trusted and valued by experienced surgeons, thereby guiding educational priorities and clinical practice.
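The 4 criteria can be read as a simple screening rule, sketched below. This is an illustration under stated assumptions rather than the authors’ procedure: the `ItemSummary` structure, the numeric cutoffs `min_location` and `min_spread`, and the example values are all hypothetical, since no numeric cutoffs are given in this section. Criteria 3 and 4 reduce to the same check, namely whether the thresholds increase in order.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ItemSummary:
    name: str
    location: float          # mean item logit location
    spread: float            # logit spread read from the person-item map
    thresholds: List[float]  # step thresholds in category order, in logits


def thresholds_ordered(thresholds: List[float]) -> bool:
    """True when each threshold lies above the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


def screen_item(item: ItemSummary,
                min_location: float = 0.0,   # hypothetical cutoff
                min_spread: float = 1.0      # hypothetical cutoff
                ) -> Dict[str, bool]:
    """Apply the criteria and report each check plus the overall result."""
    checks = {
        "positive_logit_location": item.location > min_location,
        "moderate_to_wide_spread": item.spread >= min_spread,
        "orderly_thresholds": thresholds_ordered(item.thresholds),
    }
    checks["candidate"] = all(checks.values())
    return checks


# Illustrative items with made-up values, not survey results.
item_a = ItemSummary("item_A", 0.8, 2.5, [-1.0, 0.2, 1.4])
item_b = ItemSummary("item_B", 0.8, 2.5, [-1.0, 1.4, 0.2])
print(screen_item(item_a))
print(screen_item(item_b))
```

Reporting each check separately, rather than only the overall flag, makes it visible whether an item fails on location, spread, or threshold order.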
High-Value Endoscopic Spine Surgeries
The ISASS webinar surveys reached 3639 spine surgeons globally (Part 1: 1311; Part 2: 667; Part 3: 793; and Part 4: 868). Among surgeons who started a pre- or postwebinar survey, completion rates were between 50.0% and 77.8%. The corresponding completion rates using the total number of webinar participants as the denominator ranged between 3.2% and 16.4%. In total, 781 spine surgeons submitted completed surveys (Table 4). The high-value clinical applications of spinal endoscopy (Table 5) were as follows:
percutaneous interlaminar endoscopic decompression for lateral canal stenosis
transforaminal debridement of low-grade degenerative spondylolisthesis
transforaminal full-endoscopic interbody fusion for hard disc herniation
endoscopic standalone lumbar interbody fusion
posterior cervical foraminotomy for herniated disc and bony stenosis
endoscopic debridement of spondylolytic spondylolisthesis
posterior endoscopic single and multilevel decompression of cervical spondylotic myelopathy.
Some test items, such as the benefit of articulating instruments or percutaneous interlaminar decompression surgery, were so easy to endorse that they generated large negative logits. However, the articulating instruments item measured no procedural skill; surgeons merely indicated that such instruments were needed. Endorsement shifts to higher logits were observed with the transforaminal technique for:
transforaminal discectomy
lateral and central canal stenosis
migrated disc herniations
transforaminal decompression of facet cysts
posterior endoscopic lumbar interbody fusion
unilateral biportal endoscopic spine surgery decompression for facet cyst, lateral
combination of transforaminal endoscopy with an interbody process spacer
multiportal strategies for central canal stenosis
However, the person-item maps for all these techniques displayed mostly narrow logit spreads and, more importantly, out-of-order threshold progression. The disordered thresholds indicate that responses did not logically align with increasing ability or confidence levels, which suggests confusion about the utility of a particular endoscopic surgery technique or inconsistency in how surgeons perceive its clinical benefit. The narrow logit spread suggests that the survey questions effectively measured only a narrow band of abilities. Another reasonable explanation is that some of these surgeries are overutilized and applied to painful spine pathologies that do not respond favorably to them. It is also possible that the responses contributing to the disorderly threshold progression came from lesser-skilled surgeons who do not see the same clinical improvements as higher-skilled surgeons performing the same operation for the same surgical indication. The latter explanation appears reasonable considering that endoscopic spinal surgery has gained significant traction within the last decade and is now performed by thousands of surgeons globally, as corroborated by the authors’ ability to attract nearly 4000 surgeons to the webinar series. Before these widely practiced clinical applications of the endoscopic spinal surgery platform can be considered high-value, their disorderly threshold progression requires further clarification, along with investigation of the interplay between the procedure’s effectiveness and surgeon training and skill level.
Discussion
High-value spine surgeries are needed for the subspecialty to remain relevant in the elective treatment of sciatica-type neurogenic low back and leg pain, cervical pain syndromes, and cervical spondylotic myelopathy (CSM). Payers and government institutions look to cut costs while asking surgeons to provide sophisticated care and to use health care resources in the spirit of good stewardship. The authors employed Rasch analysis, a powerful statistical tool commonly used in psychometrics, to assess various dimensions of health care, including the evaluation of high-value endoscopic spine surgery procedures. Asking spine surgeons to articulate their experience with multiple applications of the endoscopic spinal surgery platform regarding favorable outcomes, lower complication and revision rates, and the overall value of the test item is 1 way to empower surgeons to take part in a discussion that is typically dominated by academic centers that run clinical trials or spine societies that devise policies and coverage recommendations for reimbursement. ISASS hosted a series of 4 webinars on endoscopic spine surgery between February and April 2024. The series reached nearly 4000 spine surgeons (N = 3639), and 50% of the responding surgeons had more than 20 years of surgical experience.
Understanding Rasch Analysis in Health Care
Rasch analysis traditionally measures latent traits that are not directly observable. In the context of health care, these latent traits could be the competencies and skills of surgeons. By transforming these abstract qualities into measurable data, Rasch analysis provides a quantitative foundation to assess how these factors correlate with clinical outcomes. In endoscopic spine surgery, the focus is on the interplay between surgeon skill and ability, in which the most able spine surgeons tackle the more complex problems to achieve favorable clinical outcomes. Particular emphasis falls on the impact of this interplay on overall value within the health care system as surgeons employ endoscopy to replace traditional open and other forms of minimally invasive spinal surgery.
Surgeon Skill, Ability, Training, Credentialing, and Clinical Outcomes
In endoscopic spine surgeries, the skill and ability of the surgeon play critical roles in the procedure’s success, perhaps more so than in traditional open spine surgery. These surgeries, known for their precision and minimally invasive nature, demand a high level of technical expertise. They have a steeper learning curve, as confirmed by the results of the first ISASS webinar. Rasch modeling could be further applied to evaluate various competencies such as hand-eye coordination, decision-making under pressure, and mastery of specific surgical techniques. This approach enables the creation of a scalable measure of surgeon ability, proficiency, and competency that can be directly correlated with clinical outcomes. Such a measure could be used to assess how trainees progress in a postgraduate training program or how surgeons who graduated long ago can be credentialed in new surgeries that arose from technology advances not available when they trained in residency or fellowship.
Impact on Reoperation and Complication Rates
One of the critical indicators of a successful surgical intervention is a reduction not just in complication rates but, more importantly, in reoperation rates. Lower reoperation rates reflect not only the immediate success of the surgical procedure but also longer-term effectiveness and patient safety, achieved by preserving spinal motion and avoiding commonly recognized problems such as adjacent segment disease following fusion, thus leading to lower utilization of health care resources. By applying Rasch analysis, it is possible to link higher surgeon competencies with these favorable outcomes statistically, identifying high-value procedures and who is qualified to perform them to achieve the desired high-end, durable results. The model can identify specific skills that significantly lower the risk of adverse effects and the necessity for further surgical interventions, and it allows benchmarks to be set.
Economic Implications and Health Care Value
The economic implications of linking surgeon skill levels to clinical outcomes through Rasch analysis could be profound. High-value procedures, characterized by lower complication and reoperation rates, contribute to the overall efficiency of the health care system. They lead to reduced hospital stays, less need for additional treatments, and improved long-term health, collectively decreasing health care costs. This economic benefit underscores the value of investing in surgeon training and continuous professional development. Furthermore, the concept of high-value endoscopic spine care needs to be expanded to include the patient perspective, which is often underrepresented in the current paradigm of EBM. This paradigm has traditionally focused on clinical trials, somewhat sidelining surgeon experience and patient values. Future investigations should actively solicit patient responses as a third pillar of EBM, ensuring that patients’ experiences and outcomes are integral to defining high-value care. Currently, patients’ voices are mostly confined to reviews on platforms such as Google, Yelp, RateMDs, Vitals, Zocdoc, and Healthgrades. While informative, these reviews often concentrate on doctor-patient interaction, staff and office environment, wait times, billing and cost, accessibility and communication, and the doctor’s technical skills, knowledge, and professionalism. Many culminate in an overall satisfaction rating and a statement of whether the patient would recommend the doctor or facility to others, but they do not comprehensively capture the nuances of patient-perceived value in medical care. There is a clear need for more structured and meaningful mechanisms that allow patients to contribute to the discourse on what constitutes high-value care in a way that influences health care practices and policies.
The authors’ Rasch analysis is suited to ensure that spine care systems are efficient and genuinely responsive to the needs and values of those they serve.
Policy and Training Implications
The insights gained from the Rasch analysis of survey responses from the 4-part ISASS webinar series can inform policy decisions and training programs. Health care systems and medical boards can use these data to set benchmarks for surgical competence, tailor training programs to address identified skill gaps, and prioritize resources toward the most impactful training techniques. This targeted approach enhances the quality of surgical care and ensures a better allocation of health care resources, promoting a more sustainable health care system.88
Conclusion
Using Rasch analysis to evaluate the interplay between surgeon skill and favorable clinical outcomes in endoscopic spine surgery offers a comprehensive way to assess and enhance surgical quality by identifying high-value procedures. By demonstrating how surgeon competence directly affects clinical success and health care value, this approach provides a data-driven foundation for advancing surgical practices and health care policies. Through this analysis, health care systems can maximize the value delivered to patients while minimizing unnecessary costs and improving overall treatment efficacy by reducing the overutilization of low-value endoscopic spine surgeries.
Acknowledgments
Thanks to all the participating surgeons for their invaluable contributions and to the International Society for the Advancement of Spine Surgery (ISASS) staff for facilitating this important educational series. Special thanks to the sponsor—Lange MedTech—for enabling a broad dissemination of knowledge and supporting the webinar series and subsequent clinical guideline development with an educational grant to the ISASS. These guidelines aim to serve as a dynamic document, evolving with ongoing advancements and clinical feedback to remain at the forefront of endoscopic spine surgery practices.
Footnotes
↵† International Society for the Advancement of Spine Surgery, Interamerican Society for Minimally Invasive Spine Surgery - La Sociedad Interamericana de Cirugía de Columna Mínimamente Invasiva (SICCMI), International Intradiscal Therapy Society (IITS.org), National Academy of Medicine of Colombia and Brazil
↵‡ Minimally Invasive Spine Surgery Section of the Chinese Orthopedic Association (COA), International Society for Endoscopic Spine Surgery (ISESS)
↵§ Minimally Invasive Spine Surgery Section of the Chinese Orthopedic Association (COA)
↵¶ Interamerican Society for Minimally Invasive Spine Surgery - La Sociedad Interamericana de Cirugía de Columna Mínimamente Invasiva (SICCMI)
↵** Interamerican Society for Minimally Invasive Spine Surgery - La Sociedad Interamericana de Cirugía de Columna Mínimamente Invasiva (SICCMI)
↵†† International Society for Minimal Intervention in Spinal Surgery (ISMISS)
↵‡‡ Interamerican Society for Minimally Invasive Spine Surgery - La Sociedad Interamericana de Cirugía de Columna Mínimamente Invasiva (SICCMI)
↵§§ Korean Minimally Invasive Spine Society
↵¶¶ Interamerican Society for Minimally Invasive Spine Surgery - La Sociedad Interamericana de Cirugía de Columna Mínimamente Invasiva (SICCMI)
↵*** International Society for Minimal Intervention in Spinal Surgery (ISMISS)
↵††† International Society for Minimal Intervention in Spinal Surgery (ISMISS)
↵‡‡‡ Brazilian Society for Thoracic Surgery – Sociedade Brasileira de Cirurgia Torácica (SBCT)
↵§§§ International Society for the Advancement of Spine Surgery
Funding ISASS received funding for the webinar series upon which this article is based as well as for the publication of this special issue. Funding was paid directly to the organization. No formal funding by private, government or commercial funders was received by the authors.
Declaration of Conflicting Interests The authors volunteered their time and internal resources to support the design and conduct of this research study. All authors aided in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results. The authors declare no conflict of interest relevant to this research, and there was no personal circumstance or interest that may be perceived as inappropriately influencing the representation or interpretation of reported research results. This research was not compiled to enrich anyone.
Disclosures Brian Kwon reports royalties for product design from Globus/NUVA; consulting fees from Globus/NUVA for evaluation of products and technology; payment/honoraria for surgeon education/training from Globus/NUVA and Amplify Surgical; and stock/stock options in Amplify Surgical and SAB. Choll Kim reports consulting fees from Elliquence and Globus Medical. Gregory Basil reports consulting fees from Nuvasive, DePuy Synthes, and Aclarion; ownership and patents planned, issued, or pending with Kinesiometrics; and participation on a data safety monitoring board or advisory board for Hart Clinical Consultants. Christian Morgenstern reports royalties/licenses with Signus GmbH and Hoogland Spine Products GmbH; consulting fees from SpineArt SA and UniTech GmbH; and support for attending meetings and/or travel from Hoogland Spine GmbH and Signus GmbH. Jin-Sung Kim reports serving as a consultant for RIWOSpine GmbH and Elliquence; serving on the AOSpine Task Force, AOSpine Degenerative Knowledge Forum, and NASS Endoscopy Task Force; and stock/stock options from Amplify Surgical. Jorge Felipe Ramírez León reports grants/contracts, consulting fees, and payment/honoraria from Elliquence.
- This manuscript is generously published free of charge by ISASS, the International Society for the Advancement of Spine Surgery. Copyright © 2024 ISASS. To see more or order reprints or permissions, see http://ijssurgery.com.