Abstract
Background Posterolateral fusion (PF) is a common method by which to achieve fusion in lumbar spine surgery. It has been reported that posterior interbody fusion (PIF) yields a higher fusion rate and a better functional and clinical outcome. Our objective was to determine whether PIF improves the clinical and radiologic outcomes in adults surgically treated for degenerative lumbar spine conditions compared with PF.
Methods We performed a systematic search of electronic databases, bibliographies, and relevant journals and meta-analyses.
Results Of 2798 citations identified, 5 studies met our inclusion criteria (none of which was a randomized controlled trial), with a total of 148 patients in the PIF group (intervention) and 159 in the PF group (control). Pooled meta-analyses showed that nonunion rates were lower in the intervention group (relative risk, 0.22; 95% confidence interval [CI], 0.08–0.62). The intervention group had a significantly higher disc height (weighted mean difference, 3.2 mm; 95% CI, 1.9–4.4 mm) and lower residual percent slippage (weighted mean difference, 6.3%; 95% CI, 3.9%–8.7%) at final follow-up. There were no significant differences in segmental or total lumbar lordosis. Because of heterogeneity of results, no conclusions could be made with regard to functional benefits.
Conclusions This review suggests that PIF achieves a higher fusion rate and better correction of certain radiographic aspects of deformity over PF. It also showed a slight but not significant trend toward a better functional outcome in the PIF group. The lack of randomized controlled trials and the methodologic limitations of the available studies call for the planning and conduct of a sufficiently sized, methodologically sound study with clinically relevant outcome measures. Until this has been done, the current evidence regarding the beneficial effects of PIF should be interpreted with caution.
Posterolateral spinal fusion is a long-established treatment for various degenerative disorders of the lumbar spine.1 Since its initial description, a few other techniques have been described to achieve fusion of the lumbar spine, including posterior lumbar interbody fusion (PLIF)2 and unilateral transforaminal lumbar interbody fusion (TLIF).3 The addition of interbody fusion (PLIF/TLIF) allows decompression of the exiting nerve root by distraction of the collapsed disc space and optimizes fusion in the load-bearing vertebral bodies with rich blood supply. The interbody fusion can be performed through an anterior or posterior approach. The addition of posterior interbody fusion (PIF) is more technically demanding, is associated with a higher complication rate when compared with posterolateral fusion (PF) only, and adds time and cost to the procedure.4, 5 A few recent studies have compared PF and PIF in the treatment of degenerative lumbar spine conditions. However, small sample sizes and differing methods of outcome assessment have limited the clinical relevance of their findings.6–10
The objective of this systematic review is to answer the following question: Does the addition of PIF compared with PF alone improve the clinical and radiologic outcomes in adult patients undergoing surgical treatment for lumbar spine degenerative conditions?
Methods
Eligibility criteria
We identified relevant articles with the following inclusion criteria: (1) the target population consisted of adult patients undergoing surgical treatment of lumbar spine degenerative conditions (excluding tumor, trauma, and infection) with a minimum follow-up of 2 years; (2) the comparison was posterolateral fusion with or without instrumentation versus PIF (either PLIF or transforaminal lumbar interbody fusion, with or without instrumentation); and (3) the outcome measure was a patient-centered, disease-specific functional outcome.
Study identification
A computerized search of the electronic databases Embase (1980–2006) and Ovid Medline and PubMed Medline (1966–February 2006) was performed. A hand search of the European Spine Journal, Spine, and the Journal of Spinal Disorders & Techniques, as well as bibliographies of identified studies and relevant narrative reviews, was performed to identify further studies.
Assessment of study quality
We assessed each published study for the quality of the study design using the Newcastle-Ottawa 8-point scale for assessment of nonrandomized studies.11 This scale grades the reporting of the studies based on the representativeness of samples, baseline factors, assessment of outcome, statistical analysis or study design, and length of follow-up.
Data extraction
For each eligible study, data were extracted and checked for accuracy. Specifically, the sizes and demographic data of the intervention and control groups, type of fusion, underlying diagnoses, length of follow-up, loss to follow-up, fusion rate, radiologic parameters, and clinical outcomes at final follow-up were recorded.
Data analysis
Because of the variety of clinical outcome tools used in the studies, surgical results were predefined as satisfactory if the patient had a score of less than 40 on the Oswestry Disability Index, a score of greater than 7 on the Prolo scale, or a greater than 40% gain in the Beaujon score or if the final outcome was rated as excellent or good. An outcome rating of excellent, good, significantly better, satisfied, or success was considered a satisfactory outcome, whereas ratings of fair, poor, same, worse, slightly satisfied, slightly dissatisfied, or unsuccessful were classified as unsatisfactory clinical outcomes.
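The dichotomization rule described above can be sketched in code. This is a minimal illustration only: the function name, the scale labels, and the assumption that each patient has a single score or rating are hypothetical and are not taken from the included studies.

```python
# Hypothetical sketch of the predefined satisfactory/unsatisfactory rule.
# Scale labels ("oswestry", "prolo", ...) are illustrative names, not from the source.

SATISFACTORY_RATINGS = {
    "excellent", "good", "significantly better", "satisfied", "success",
}
UNSATISFACTORY_RATINGS = {
    "fair", "poor", "same", "worse",
    "slightly satisfied", "slightly dissatisfied", "unsuccessful",
}


def is_satisfactory(scale, value):
    """Return True if the result counts as satisfactory under the stated rule."""
    if scale == "oswestry":
        # Oswestry Disability Index: score of less than 40
        return value < 40
    if scale == "prolo":
        # Prolo scale: score of greater than 7
        return value > 7
    if scale == "beaujon_gain_percent":
        # Beaujon score: greater than 40% gain
        return value > 40
    if scale == "rating":
        rating = value.strip().lower()
        if rating in SATISFACTORY_RATINGS:
            return True
        if rating in UNSATISFACTORY_RATINGS:
            return False
        raise ValueError(f"unrecognized rating: {value!r}")
    raise ValueError(f"unrecognized scale: {scale!r}")
```

Boundary values (for example, an Oswestry score of exactly 40) fall on the unsatisfactory side here because the rule is stated with strict inequalities.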
For each study, the abstracted data were entered into Review Manager software, version 4.2, for statistical analysis. Pooled relative risks (RRs) of dichotomous variables (complication, nonunion, or poor outcome) and weighted mean differences of continuous variables (final disc space height and percent of spondylolisthesis slippage) were calculated with a random-effects model12 and used to compare PF and PIF. Statistical heterogeneity of pooled studies was tested and evaluated with the Higgins I2 test of heterogeneity at a significance level of P < .1.13
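For readers unfamiliar with random-effects pooling, the following sketch shows one common way to pool relative risks and compute the I2 statistic. It assumes the DerSimonian-Laird estimator and a 0.5 continuity correction for zero cells; neither detail is confirmed by the text above, and this is not the Review Manager implementation. The example counts in the usage note are invented for illustration and are not data from the included studies.

```python
import math


def log_rr(events_t, n_t, events_c, n_c):
    """Log relative risk and its variance for one study."""
    # Assumed convention: add 0.5 to every cell if any cell is zero.
    if 0 in (events_t, events_c, n_t - events_t, n_c - events_c):
        events_t, events_c = events_t + 0.5, events_c + 0.5
        n_t, n_c = n_t + 1, n_c + 1
    y = math.log((events_t / n_t) / (events_c / n_c))
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return y, var


def pool_random_effects(studies, z=1.96):
    """Pool (events_t, n_t, events_c, n_c) tuples with DerSimonian-Laird weights."""
    effects = [log_rr(*s) for s in studies]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # Between-study variance (tau^2) and I^2 as a percentage.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(y_re),
        "ci_low": math.exp(y_re - z * se),
        "ci_high": math.exp(y_re + z * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

A call such as `pool_random_effects([(1, 30, 4, 32), (0, 25, 5, 28), (2, 40, 6, 41)])` (made-up counts) returns the pooled RR, its 95% CI, Cochran's Q, and I2. The P value for heterogeneity would come from comparing Q with a chi-square distribution on k − 1 degrees of freedom, which is omitted here to keep the sketch dependency-free.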
Results
Study identification
The literature search identified 2798 potentially relevant citations, 1982 from Medline and 816 from Embase. The application of eligibility criteria eliminated all but 5 articles from our study. Four studies were retrospective comparative studies, and one was a prospective nonrandomized trial. Isthmic spondylolisthesis was the preoperative diagnosis in 4 studies.6, 7, 9, 10 Degenerative disc disease, recurrent disc herniation, spondylolisthesis, and spinal stenosis were the indications for surgery in the fifth.8 These studies evaluated 307 patients (148 patients in the intervention group [PIF] and 159 patients in the control group [PF]). A minimum of 2 years’ follow-up was available for all patients. The sample sizes ranged from 35 to 100 patients. The details of the included studies are summarized in Table 1.
Study quality
Only 1 study stated clearly that the cases represented all the patients who underwent the intervention during the study period after the application of strict inclusion and exclusion criteria.8 The only prospective study in this review failed to give details on the representativeness of the sample or baseline factors, did not use a validated outcome assessment scale, and did not adequately describe the surgical details or the study design and statistical analysis.6 Validated outcome assessment scales were used in only 1 study,9 and the mean follow-up period was 2 to 3 years in all but 1 study, which had a 6-year follow-up.6 By use of the Newcastle-Ottawa quality assessment scale, none of the included studies met the criteria for a high-quality study. The patient-specific functional outcome evaluation tools included the following: Oswestry Disability Index, Prolo Economic and Functional Scale, Beaujon score, Modified Somatic Perception Questionnaire, Zung Depression Scale, and Kirkaldy-Willis criteria.
Nonunion
Two studies defined solid fusion as the formation of crossing bony trabeculae with less than 4° of motion on flexion-extension radiographs.7, 10 Madan and Boeree9 used the previously mentioned criteria to define union in addition to the criteria of Lenke et al.14 defining bony union, and La Rosa et al.7 added the absence of halo around the implant on radiographs to define solid union. Bony fusion was graded according to the classification of Brantigan and Steffee15 in the study by Lidar et al.8 The radiologic criteria and classification of fusion data were not reported in 1 study.6
Pooled results showed that nonunion was observed in 3 patients (2%) in the intervention group (PIF) and 21 patients (13%) in the control group (PF). This was statistically significant (P = .002; RR, 0.21; 95% confidence interval [CI], 0.08–0.56) and is shown in Fig. 1.
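As a rough arithmetic check on these counts, the sketch below computes the crude nonunion risks and a crude (collapsed-table) relative risk with a log-scale 95% CI. The function name is hypothetical. The crude ratio, roughly 0.15, is not expected to reproduce the reported RR, because the review pooled per-study estimates with a random-effects model rather than collapsing all patients into a single table.

```python
import math


def crude_relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Crude risks, relative risk, and log-scale CI from one 2x2 table."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for a single 2x2 table.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return risk_a, risk_b, rr, low, high


# Counts reported above: 3 of 148 (PIF) versus 21 of 159 (PF).
risk_pif, risk_pf, rr, low, high = crude_relative_risk(3, 148, 21, 159)
print(f"PIF {risk_pif:.1%} vs PF {risk_pf:.1%}; crude RR {rr:.2f} ({low:.2f} to {high:.2f})")
```

The percentages agree with the 2% and 13% quoted in the text; the crude ratio is shown only to make the arithmetic explicit.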
Radiologic correction of deformity
Four studies evaluated the radiologic correction of deformity, each using different methods.7–10 The intervention group had significantly higher disc height (weighted mean difference, 3.2 mm; 95% CI, 1.9–4.4 mm) and lower residual percent slippage (weighted mean difference, 6.3%; 95% CI, 3.9%–8.7%) at final follow-up. There were no significant differences in segmental or total lumbar lordosis.
Functional outcomes
The various functional outcome assessment instruments used in the included studies are summarized in Table 1. Only 1 study used multiple, validated outcome assessment scales.9 This study was the only study in our review that showed a significantly better functional outcome in the control group (PF) when compared with the intervention group (PIF). The other 4 studies favored the intervention group,6–8, 10 although none had sufficient numbers to show a statistically significant difference. By use of the prespecified definitions of satisfactory and unsatisfactory results, 120 patients (81%) had a satisfactory outcome (good or excellent result) in the intervention group (PIF) compared with 122 patients (77%) in the control group (PF), with no significant difference between the 2 groups (P = .32; RR, 1.06; 95% CI, 0.95–1.18). These pooled results are shown in Fig. 2.
Complications
All the reported complications excluding nonunion were evaluated and are shown in Table 1. One study reported no complications.7 Dehoux et al.6 reported 8 cases in the control group (PF) with persistent postoperative low-back pain that required hardware removal. There was no mention of the method used to diagnose the cause of this pain and whether it was improved after the hardware removal. There were also 2 complications in the intervention group (PIF): in 1 case there was mechanical failure because of a very short fusion, and in the other the cage could not be inserted because of a very narrow canal. Because the last 2 complications could have been avoided by careful preoperative planning and incomplete information is available on the 8 cases of persisting back pain, these 10 complications were eliminated from the final pooled analysis.
The pooled complication rates, shown in Fig. 3, did not differ significantly between the 2 groups (P = .94; RR, 0.96; 95% CI, 0.41–2.28), with a total of 9 complications (6%) in the intervention group (PIF) and 10 (6.2%) in the control group (PF).
Discussion
The current review used most of the methodologic criteria for research overviews. Specifically, it included explicit inclusion and exclusion criteria, assessed the methodologic quality of the studies, showed the reproducibility of selection and assessment criteria, and performed a quantitative analysis. A potential selection bias was eliminated by rigorously searching many databases and bibliographies and by conducting all aspects of the selection process in duplicate.16–18 The major limitation of this review is related to the poor quality of the included studies, which obviously affected the quality of the cumulative data. None of the included studies met the criteria for a high-quality study on the Newcastle-Ottawa scale.11 Bhandari et al.19 stated that the most definitive conclusions can be made only when high-quality randomized trials are pooled.
By design, the current analysis focused on comparing PF with PIF in the surgical treatment of degenerative lumbar spine disorders with regard to fusion, radiologic correction of deformity, functional outcome, and complications. The multiple deficiencies of the included studies made this analysis difficult.
We were unable to abstract enough details to pool radiologic outcomes across all studies. The radiologic deformity correction (eg, disc height and listhesis reduction) was reported differently in the study of Lidar et al.8 and was not pooled. In 1 study the only radiologic parameter reported was fusion.6 This review showed an improvement in disc height and slip percent in the PIF group, with a tendency toward loss of correction over time. It has been shown that poor sagittal balance postoperatively leads to adjacent segment degeneration and poor results.20 The review showed no difference in segmental or total lordosis.
The retrospective review of Madan and Boeree,9 in which multiple functional outcome assessment scales were used and detailed postoperative evaluation was performed, showed better functional outcome in the group treated with PF over interbody fusion. They attributed this to selection bias (age, sex, extent of listhesis, and disc degeneration) and the retraction and scarring of the nerve roots and thecal sac. The quality of the studies included in this review made it difficult to reach any conclusion regarding the previously mentioned factors.
High fusion rates have been shown with interbody fusion.5, 21 Lowe et al.22 reported a 90% fusion rate and 85% rate of satisfactory clinical outcomes using the TLIF technique. Although fusion is often considered a satisfactory surgical outcome, we could not show that the functional outcome was superior in the interbody fusion group compared with the PF group despite the fact that the former group had a higher fusion rate. This in part could be because of the quality and design of the included studies, and a larger sample is needed to detect such a small difference. Another potential reason for this is the short follow-up period in most of the included studies.7–10 It is possible that with longer follow-up, these results might be different. Although the assessment of fusion is important, it is still crucial to recognize other factors, such as confounding comorbidities, preoperative diagnosis, and patient selection, when one is evaluating the functional outcome after degenerative lumbar spine surgery. Unfortunately, the studies included in this review did not do so.
TLIF was developed to address some of the complications associated with PLIF.22 All the interbody fusions in the intervention group were performed by the PLIF technique, and this was not associated with an increased complication rate compared with the control group.
In conclusion, this review suggests that PIF improves the fusion rate, correction of disc height, and reduction of spondylolisthesis slip percent. However, there were no significant differences between PIF and PF in functional outcome, final segmental or total lumbar lordosis, or complication rates. These conclusions are limited by the poor quality of the included studies; this indicates the need for sufficiently sized and methodologically sound studies to assess clinically relevant endpoints. Until these studies are performed, the current evidence regarding the value of adding posterior lumbar interbody fusion in the surgical management of degenerative lumbar spine diseases should be interpreted with caution.
© 2013 Elsevier Inc. All rights reserved.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License, permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.