Operative Time as a Measure of Quality in Pancreaticoduodenectomy: Is Faster Better? A Retrospective Review Using the ACS NSQIP Database

Objective: To determine if pancreaticoduodenectomy operative time can provide insight into surgeon performance and thus be considered for use as a quality indicator. Background: Case volume is the traditional quality metric for complex pancreatic surgery, with studies showing better outcomes for high-volume providers. However, there are surgeons performing fewer cases with good quality who are overlooked for referrals directed to high-volume “centers of excellence”. Additional quality metrics are needed. Methods: The ACS NSQIP database (2005-2011) was used to identify 4805 pancreaticoduodenectomy patients. Cases were divided at the mean operative time (ORtime) into those ≤373 (n = 2638, 54.9%) vs >373 minutes in duration. Complications and outcome measures were compared and predictors of 30-day mortality were assessed. Results: Age ≤ 65 years, male sex, prior chemotherapy, prior radiation, disseminated cancer, diabetes, recent MI, no prior TIA, lower bilirubin and platelet count, and higher prothrombin time were associated with ORtime > 373 minutes. Patients with ORtime > 373 minutes demonstrated more intraabdominal and superficial infections, wound dehiscence, bleeding requiring transfusion, need for reintubation, septic shock, and returns to OR. ORtime > 373 minutes was associated with longer hospital stay and increased 30-day mortality. ORtime > 373 minutes was a significant and independent predictor in a stepwise model of 30-day mortality. Conclusions: Shorter pancreaticoduodenectomy operative time is associated with fewer complications, shorter hospital stays and lower 30-day mortality after adjusting for patient factors. This may imply that shorter operative time is associated with superior surgical outcome. Operative time may provide insight into surgeon performance and be considered for use as a quality metric.
G. M. Garnett et al.


Introduction
Compelling evidence suggests that improved outcomes in pancreatic and other complex surgeries can be achieved through centers of excellence [1]-[6]. However, part of this improvement is negated by patients' increased travel time, travel expenses, loss of income, and separation from their social support systems. Furthermore, regionalization of complex surgeries can lead to health care disparities and reductions in access to care [7]-[11]. Patients who lack the finances or social support to travel to these centers of excellence may receive either suboptimal care or no treatment for their complex problem. Centers of excellence are currently defined by hospital volume for a particular procedure and have been identified for such procedures as pancreatic resections, esophageal resections, and cardiac surgery. Taken as a sole criterion, procedure volume has been shown to be arbitrary, with poorly defined volume thresholds, and untrustworthy [9] [12]-[15]. Nevertheless, centers of excellence have marketed their expertise, recruited more surgeons, and increased their referrals, aiming to increase their volume of complex procedures. This continued regionalization increases the volume of the large centers but, as an unintended consequence, undermines local expertise and deters recruitment of highly skilled surgeons to smaller hospital settings [9]. While case volume provides a simple metric for assessing quality, additional quality metrics are needed to develop appropriate referral patterns and improve outcomes for all patients. The objective of this study is to determine if operative time for pancreaticoduodenectomy (PD) may be used as a prognostic indicator and thus as a measure of surgeon quality.

Methods
Data for this study were obtained from the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP). ACS NSQIP is a prospective, multi-institutional, clinical registry created by the Veterans Health Administration in 1994 for quality improvement purposes. Over 130 pre-operative through 30-day post-operative variables are collected on a randomly selected sample of patients, including patient demographics, surgical profile, preoperative risk assessment, laboratory values, operative information, and 30-day morbidity and mortality rates. A highly trained Surgical Clinical Reviewer (SCR) collects the data. All reviewers receive extensive initial training prior to starting data collection and ongoing training via continuing education. ACS NSQIP monitors accrual rates and data sampling methodologies and conducts audits on a random basis, ensuring highly reliable data [16]. This review of pancreatic resections in the ACS NSQIP database was approved by the Institutional Review Board of our local medical center.
ACS NSQIP participant files for the years 2005-2011 were reviewed and Current Procedural Terminology (CPT) codes were used to identify all patients who underwent pancreatic procedures (48100-48999). We then narrowed these codes to include only those that clearly identified a pancreaticoduodenectomy (2013 CPT Professional Edition, American Medical Association):
CPT code 48150: Pancreatectomy proximal subtotal with total duodenectomy, partial gastrectomy, choledochoenterostomy and gastrojejunostomy (Whipple procedure) with pancreatojejunostomy
CPT code 48153: Pancreatectomy proximal subtotal with near total duodenectomy, choledochoenterostomy and duodenojejunostomy (pylorus sparing), Whipple-type procedure with pancreaticojejunostomy
After inclusion of these two codes, the "principal treatments" listed with each of the procedures were reviewed. Only those procedures with principal treatments listed as "pancreatectomy with pancreaticojejunostomy" or "pancreatectomy, proximal with pancreaticojejunostomy" were included in our analysis. Cases listing "pancreatectomy" and "partial removal of pancreas" were excluded to ensure there was no miscoding of other types of pancreatic resections such as distal pancreatic resections or enucleations.
Patient demographics included sex, age, smoking, and alcohol use. The comorbidities considered were diabetes, chronic obstructive pulmonary disease (COPD), myocardial infarction (MI) within 6 months, congestive heart failure (CHF), hypertension requiring medications, disseminated cancer, and transfusions within 3 days prior to surgery.
Post-operative complications of interest were superficial surgical site infection, deep surgical site infection, organ space surgical site infection, wound disruption, pneumonia, urinary tract infections, unplanned intubation, pulmonary embolism, deep vein thrombosis, cardiac arrest requiring cardiopulmonary resuscitation, myocardial infarction, intraoperative or postoperative transfusions, sepsis and septic shock.
Operative time was defined as the time between start of the surgery (incision) and the finish of surgery (closure of the skin). Room times and anesthetic times were not included in this definition. Operative times were noted and all patients who had operative time listed as less than 120 minutes were excluded to avoid any possible data entry errors. Mean operative time of the remaining patients was 373 minutes and this was the chosen cut-off value for establishing the groups. This study thus analyzed two groups: 1) Operative time equal to or less than 373 minutes and 2) Operative time greater than 373 minutes. All of the demographics, laboratory values, and post-operative complications were compared between the 2 groups in each of the analyses.
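The grouping rule described above can be illustrated with a minimal Python sketch. The function name and the sample operative times are hypothetical and are not NSQIP data; the sketch only assumes the two steps stated in the text, namely exclusion of times below 120 minutes and a split at the mean of the remaining cases.

```python
from statistics import mean

MIN_VALID_MINUTES = 120  # times below this are treated as likely data entry errors


def split_by_mean_operative_time(minutes):
    """Drop implausibly short cases, then split the rest at their mean."""
    valid = [m for m in minutes if m >= MIN_VALID_MINUTES]
    cutoff = mean(valid)
    shorter = [m for m in valid if m <= cutoff]  # group 1: at or below the mean
    longer = [m for m in valid if m > cutoff]    # group 2: above the mean
    return cutoff, shorter, longer


# Hypothetical operative times in minutes, chosen only for illustration.
example_times = [45, 240, 300, 360, 420, 545]
cutoff, shorter, longer = split_by_mean_operative_time(example_times)
print(cutoff, shorter, longer)
```

In this toy example the 45-minute case is dropped before the mean is computed, so the cut-off reflects only the cases that remain in the analysis.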
Finally, the two operative time groups were compared in terms of other outcome measures, including hospital length of stay (LOS), 30-day mortality, and time from operation to death in those patients who expired in the perioperative period.

Statistical Analysis
Patient characteristics, pre-operative laboratory values, and surgical complications were compared between the operative time groups. Categorical and dichotomous variables were compared using the chi-square test, and continuous variables were compared using the t-test. The laboratory tests were log-transformed to meet the requirements of the t-test, and geometric means are displayed. To understand whether operative time is an independent predictor of the outcome measures, stepwise regression models of 30-day mortality and length of hospital stay were performed where all pre-operative factors and operative time were eligible for entry. Entry of operative time was considered a reflection of its importance as a predictor. A stepwise logistic regression was performed for 30-day mortality, with associated risks expressed as odds ratios (OR) with 95% confidence intervals (CIs). A stepwise linear regression was performed for hospital stay. For the stepwise tests, the laboratory tests were entered as indicator variables signifying low and high values, as listed in MedlinePlus (http://www.nlm.nih.gov/medlineplus/ency/article/003646.htm). All reported p values are two-tailed, and for all tests, p < 0.05 was considered statistically significant.
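As a rough illustration of the univariate quantities named above, the following Python sketch computes a Pearson chi-square statistic for a 2x2 table, an unadjusted odds ratio with a Wald 95% confidence interval, and a geometric mean obtained by back-transforming the mean of log values. It is not the stepwise multivariable model used in the study, and the counts and laboratory values are hypothetical. For a 2x2 table (1 degree of freedom), a chi-square statistic above 3.841 corresponds to p < 0.05.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% confidence interval."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def geometric_mean(values):
    """Exponentiate the mean of the log-transformed values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical counts: rows are longer vs shorter operative time,
# columns are event vs no event. These are not results from the study.
a, b, c, d = 30, 70, 15, 85
chi2 = chi_square_2x2(a, b, c, d)
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(a, b, c, d)
gm = geometric_mean([1.0, 4.0, 16.0])  # hypothetical laboratory values
print(round(chi2, 2), round(odds_ratio, 2), round(gm, 2))
```

An adjusted odds ratio from a logistic regression would generally differ from this unadjusted value, because the regression accounts for the other pre-operative factors in the model.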

Results
In this analysis of the ACS-NSQIP database, 11,148 patients had CPT codes 48150 or 48153. Of these patients, 6308 patients were listed with the principal procedures "pancreatectomy" or "partial pancreas resection" and were excluded from the study. This was to ensure that the data included only pancreaticoduodenectomies and excluded distal pancreatic resections, enucleations and central pancreatectomies. Of the remaining 4840 patients, 35 patients had an operative time listed as 0 to 120 minutes and were also excluded from the study. Our study population thus included 4805 patients.
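The cohort arithmetic can be checked with a few lines of Python. All counts are taken from the text (the 2638 patients with ORtime ≤ 373 minutes are reported in the abstract); the variable names are illustrative only.

```python
coded_patients = 11148         # CPT code 48150 or 48153
excluded_by_procedure = 6308   # nonspecific principal procedure listed
excluded_by_time = 35          # operative time listed as 0 to 120 minutes

after_procedure_filter = coded_patients - excluded_by_procedure
study_population = after_procedure_filter - excluded_by_time

shorter_group = 2638           # ORtime <= 373 minutes, from the abstract
longer_group = study_population - shorter_group
shorter_share = round(100 * shorter_group / study_population, 1)

print(after_procedure_filter, study_population, longer_group, shorter_share)
```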
In the overall cohort of 4805 patients, mean age was 63.9 years and 51.6% were male. The mean operative time (ORtime) was 373.0 minutes (SD 130.3 minutes) with a range of 121 to 1295 minutes. Median operative time was 358 minutes. The distribution of operative times is shown in Figure 1.
In comparing patient characteristics, patients with shorter operative times (ORtime ≤ 373 minutes) were more likely to be 65 years or older and female. Patients in the longer operative group were more likely to have diabetes, history of MI within 6 months of surgery, disseminated cancer, no history of TIA and prior radiation or chemotherapy (Table 1). Preoperative laboratory values were generally similar in both groups (Table 2). However, longer operative times were associated with lower bilirubin values and platelet counts, and higher prothrombin times (p < 0.05). These laboratory values were within normal range for both groups.
Post-operative complications are detailed in Table 3. Patients in the group with ORtime > 373 minutes were more likely to have superficial skin infections, intra-abdominal infections, wound dehiscence, need for reintubation, bleeding requiring transfusion, septic shock, and returns to the operating room. Patients with longer operative times also had longer hospital length of stay and increased 30-day mortality (Table 4).
Operative time > 373 minutes was found to be a significant predictor of 30-day mortality, by entering the stepwise logistic regression, along with the following preoperative factors: age 65 or higher, history of COPD, history of MI, hypertension requiring medication, low albumin, and elevated creatinine and prothrombin time (Table 5). Similarly, operative time > 373 minutes was found to be a significant predictor of longer hospital stays, by entering the stepwise linear regression.

Discussion
Patients are increasingly referred to high-volume centers of excellence for PD based on early studies that suggested superior outcomes [1] [8] [10] [11]. However, recent studies have questioned the volume metric, claiming that there is no reliable single measure of outcome or quality [9] [12]-[15]. Our study was inspired by this questioning of the volume quality metric and aimed to evaluate PD operative time as an alternative measure of surgeon quality. We demonstrated that shorter operative times for PD were associated with favorable outcomes including fewer complications, shorter hospital length of stay, and decreased 30-day mortality. Unfortunately, it was difficult to determine optimal time frames. We thus based the study on a cut-off at the mean operative time of 373 minutes. However, the differences in hospital length of stay and mortality were evident when the cutoff was varied to 373, 360 or 300 minutes. Predictors of 30-day mortality included operative time > 373 minutes, advanced age, a history of COPD and MI, hypertension requiring medication, low albumin and elevated creatinine and prothrombin time. Operative time of more than 373 minutes was associated with increased bleeding requiring transfusions, intraabdominal infections, wound infections, wound dehiscence, reintubations, septic shock, and returns to the operating room. This suggests that long operations increased post-operative morbidity. Outcome and, indirectly, resource utilization were clearly affected by operative time. How do we currently measure a surgeon's ability to perform PD? For credentialing committees and hospital employers, surgical skill is very difficult to determine from job applications, letters of reference or evaluation reports as there is a lack of objective measures [17] [18]. Employers often use experience, determined by number of years in practice or number of cases performed, to determine surgical aptitude. We thus presume that quality in surgery is measured by quantity.
However, many factors can affect outcome, and institutions hoping to recruit experienced, high-volume surgeons may be using data reflective of an institution's accomplishments as opposed to the individual surgeon's [1] [2] [9]. Thus, objective measures that can be applied to the individual surgeon, such as operative times, may provide a better assessment of surgical skills. Experience requirements also discriminate against newly trained yet highly skilled surgeons and cannot be used to distinguish the best surgeons. While a certain number of cases are often needed to overcome the "learning curve" in complex cases such as PD, once a surgeon has surpassed this volume threshold, annual volume has not been shown to significantly impact outcome [1] [19]. Operative time may thus provide for another objective measure of surgical proficiency and quality. What are the other shortcomings of the volume metric? While there is an inverse relationship between PD operative volume and morbidity and mortality, studies have failed to define a precise volume cutoff that clearly distinguishes high-volume centers of excellence from other institutions [13]. In addition, it is unclear how much PD volume actually contributes to favorable outcomes obtained at centers of excellence. There is a tremendous amount of variability in outcome amongst and even within centers of excellence that may be explained by infrastructure, particular processes, ancillary staff, and specific care providers including surgeons [2] [9] [13] [14]. This variability persists after controlling for important patient factors, such as age, mortality risk, illness severity, and admission status [14]. Hospital volume also fails to explain this variability at least in terms of mortality. One study estimated that less than 2% of the variability amongst hospitals in pancreatic resection perioperative deaths could be explained by hospital volume as other factors played a more important role in outcome [13]. 
The Leapfrog group, a consortium of health care purchasers whose members strive to improve health care safety, quality, and consumer value, has established a number of standards such as ICU staffing criteria, safe practice score, 5-star health grade ratings, and highly specialized physician services to serve as benchmarks for the achievement of high quality care. Joseph et al. demonstrated that as hospital PD volume increased, there was an incremental increase in the hospitals' fulfillment of several Leapfrog standards. In this study, "strong clinical support" as defined by the fulfillment of these Leapfrog staffing and quality standards and the presence of physician specialists (i.e. interventional radiologists, gastroenterology and surgery fellowship programs) was associated with lower mortality while hospital volume did not impact mortality [2]. Thus, while the volume metric may appear to be a reliable measure as it applies to an entire health care organization, the hospital volume of PD is indirectly related to factors that may be more influential on patient outcomes such as clinical resources, specialized personnel, and individual surgeon performance [2] [8] [9] [13] [14]. Additional quality metrics are clearly needed to appropriately credit and subsequently grow the specific resources most responsible for favorable outcomes.
Medical centers are increasingly constrained by cost and need for quality, yet they must balance this with the need for access to medical care and elimination of healthcare disparities. Hospital volume for complex procedures such as PD has governed referral patterns with the aim of improving outcomes through regionalization. However, there are significant variations in referral patterns to high-volume centers with fewer referrals of ethnic minorities, elderly and lower socio-economic groups [7] [8] [10] [11]. Less variation in high quality care may be feasible if local expertise can be better identified [9]. Perhaps operative time with PD volume could be used by referring physicians to identify highly skilled surgeons within small communities and improve access to high quality care. In addition, hospital recruitment to small centers would be enhanced as high performing surgeons would likely be willing to take jobs in more remote areas if they were assured local referrals based on individual merits. This too would improve access to high quality care that may currently be precluded by a volume-based referral pattern.
While operative time may be used to recruit experienced surgeons, the concept may also be applied for recruiting newly trained surgeons. Many centers again rely on volume to determine credentialing and require a surgeon to perform a certain number of cases before granting them privileges to perform the procedure independently [18] [19]. Acquiring the needed case number may require much time, especially in smaller, more remote centers, which would limit recruitment. Perhaps credentialing committees may consider evaluating operative time in a smaller number of cases to determine competence. This may allow new and highly skilled surgeons to work in less established yet developing institutions and ultimately improve access to care.
There are many limitations to this study regarding both the use of operative time as a quality measure and the use of the ACS NSQIP database. NSQIP does not have information on the specific diagnosis for which PD was performed or on the details of surgery. Factors such as obesity, prior chemo-radiation, previous abdominal surgery, anatomic abnormalities, and additional organ/vascular resections are not captured. Factors such as prior chemo-radiation may create more tedious and difficult dissections, thus prolonging the operative time beyond what can be compensated for by technical prowess [20]. While large data registries, such as ACS NSQIP, provide for well-powered studies, they often lack information on specific nuances of the surgery that can affect outcome. Also, with any large database with numerous data entry participants, the accuracy of the data is always limited by the accuracy of the coders and staff entering the data.
One final limitation of the NSQIP database is the lack of information on the experience level of the surgeons performing the procedures and the degree of resident physician involvement in a case. Like operative time, resident involvement and education is a controversial subject when discussed in the realm of healthcare quality improvement initiatives. Resident involvement has been documented to prolong operative times, though it has not been shown to adversely affect outcomes and quality [21]-[24]. These studies can be somewhat misleading as many factors contribute, including the level of resident, presence of fellows, level of involvement, variability in teaching and the tendency for complex cases to be referred to teaching hospitals. More recent ACS NSQIP data starting from 2012 does capture resident involvement and may be further analyzed to determine the impact of resident involvement on PD operative time. Our current study did not address this as the number of cases would be much smaller with analysis of only 2012 cases.

Conclusion
The use of any single metric for determining quality is inadequate and potentially detrimental to efforts aimed at improving quality. While volume has been the surrogate quality metric for PD, operative time may be another measure of individual surgeon performance. Perhaps operative times in addition to volume may be used to assess quality, although defining specific criteria would require a large study in which individual surgeons, operative times and surgical details can be identified. Additional studies are still needed to determine the accuracy of operative time as a quality metric for PD and other complex surgeries, and to identify more accurate ways to define excellence.