CJEB: Estimating The Incidence Of Barrett's Esophagus In Alberta
Heng Wu(1), Tania Stafinski(1), Yutaka Yasui(1), Clarence Wong(2), Devidas Menon(1)
(1) Department of Public Health Sciences, University of Alberta, Room 3021, Research Transition Facility, 8308 114 Street, Edmonton, Alberta, Canada, T6G 2V2
(2) Division of Gastroenterology, University of Alberta, 331 Community Services Centre, Royal Alexandra Hospital, Edmonton, Alberta, Canada T5H 3V9
Corresponding author: Dr. D. Menon, Room 3021 Research Transition Facility, University of Alberta, Edmonton, Canada T6G 2V2; tel: 780-492-9080; e-mail: firstname.lastname@example.org
Background: In most westernized countries, the incidence of esophageal adenocarcinoma has been rising faster than that of any other cancer, and it continues to be associated with significant morbidity and mortality. One of its known risk factors is Barrett’s esophagus (BE), the incidence of which has yet to be estimated in Canadian populations. Given the burden of esophageal cancer on patients and the health care system, reliable incidence estimates for Canadian populations are needed in order to optimize the effectiveness of efforts aimed at managing the condition.
Methods: Administrative data (ambulatory visits, in-patient stays, and physician services) from Alberta for the 2002/03 to 2006/07 fiscal years were obtained. Because the diagnostic code specific for BE was first introduced in 2006, the annual incidence rate could be calculated directly only for the fiscal year 2006/07. To estimate the trend of BE incidence in Alberta over the fiscal years 2002/03 to 2005/06, a data subset was created from records for the fiscal year 2006/07 and randomly split into 2 parts: a training sample and a validation sample. The training sample was used to develop a logistic regression model for identifying diagnostic and procedure codes associated with a case of BE. After evaluation with the validation sample, the model was used to estimate the incidence of BE for the 2002/03 to 2005/06 fiscal years.
Results: The final model, which included interaction terms, was based on data from 6190 patients with 7006 patient visits in the 2006/07 fiscal year. The number of patients to whom the diagnostic code for BE had been assigned was 1938; the BE incidence rate in the 2006/07 fiscal year was 56.28 cases per 100,000 person years. According to the results of the validation exercise, the model correctly identified a diagnosis of BE approximately 80% of the time. Applying the model to data for 2002/03-2005/06, the number of new cases in Alberta was 642 (2002/03), 733 (2003/04), 628 (2004/05) and 792 (2005/06). These counts translate to estimated annual incidence rates of 20.73, 23.23, 19.26 and 23.66 cases per 100,000 person years, respectively.
Conclusions: This study represents the first attempt to estimate the incidence of BE in Alberta. Its findings suggest that the incidence has remained relatively constant over the past few years.
Key words: Barrett’s esophagus; Logistic regression; Predictions; Incidence
Introduction
Barrett's esophagus (BE) is a condition in which the normal lining of the esophagus, comprised of squamous cells, is replaced by a columnar-like epithelium of the type found in the intestines, through a process called intestinal metaplasia. While BE is a benign condition, it represents a well-established risk factor for esophageal adenocarcinoma, the most common type of esophageal cancer (1,2). Other risk factors include male sex, age over 50 years, Caucasian ethnicity, high body mass index, smoking, uncontrolled gastroesophageal reflux and more than 10 years of reflux symptoms (3,4). Studies suggest that the likelihood of developing esophageal adenocarcinoma among individuals with BE is 30 to 120 fold greater than that for individuals without BE (5,6). Although the incidence of esophageal cancer (all types) in Canada is less than 1% (7), it is the 8th most common cancer worldwide (8). Furthermore, survival is poor, with mortality to incidence ratios approaching 1.00 (8). Until recently, the provincial cancer registry did not collect information on esophageal cancer by cell type or disease stage. Therefore, incidence trends for esophageal adenocarcinoma in Alberta are not available.
Esophageal adenocarcinoma, which often remains undiagnosed in its early stages, is relatively aggressive, spreading quickly once it has invaded the deeper layers of the esophageal wall. Since 1975, its incidence has risen faster than that of any other cancer (9). Consequently, its societal burden is considered to be disproportionately high (8,10). Since BE is a precursor to esophageal adenocarcinoma, estimates of its incidence are required for optimizing the effectiveness of current efforts aimed at better managing the condition at the provincial level. However, a recent comprehensive review of BE highlighted the lack of such information for provinces in Canada (11).
The objective of this study was to estimate the annual incidence of BE in the province of Alberta using information sources/data readily available to provincial level health planners and policy-makers.
The following data sets were provided by Alberta Health and Wellness: The Discharge Abstracts Database (DAD) (which contains information on inpatient hospital visits), the Ambulatory Care Classification System database (ACCS) (which contains information on outpatient visits), and the Fee for Service database (which contains information on physician visits). They spanned the fiscal years 2002/03 (when the province switched from ICD-9 to ICD-10 codes) to 2006/07, and included all visits by patients 18 years of age and older. The province first introduced the ICD-10-CA diagnostic code for BE (K22.7) in 2006/07.
Since the purpose of this study was to estimate the incidence of BE over a time frame during which there was no specific code for the condition, the following approach was used to identify records from the above databases between the fiscal years 2002/03 and 2006/07:
1. Potentially relevant diagnostic and procedure codes for BE were compiled with expert advice from local clinicians and administrators (Table 1).
2. Compiled codes, used alone (Table 1: Part A) or in combination (Table 1: Part B), were applied to the above databases in order to retrieve records (i.e., visits).
3. Retrieved records from each of the databases were merged into a single data set.
4. The data set was cleaned, removing any duplicate records.
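Steps 3 and 4 above can be illustrated with a minimal Python sketch. The paper does not say how duplicates were defined, so the key used here (same patient, same visit date, same codes) is an assumption, as are the field names and the example codes.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def merge_and_deduplicate(*sources: Iterable[Record]) -> List[Record]:
    """Merge records from several databases and drop repeated records."""
    seen = set()
    merged: List[Record] = []
    for source in sources:
        for record in source:
            # Assumed duplicate key: same patient, same date, same codes.
            key = (record["patient_id"], record["visit_date"], record["codes"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


# Illustrative rows only; they are not taken from the study data.
dad = [{"patient_id": "P1", "visit_date": "2003-05-02", "codes": "K21.0"}]
accs = [
    {"patient_id": "P1", "visit_date": "2003-05-02", "codes": "K21.0"},  # repeats the DAD row
    {"patient_id": "P2", "visit_date": "2004-01-15", "codes": "K22.1"},
]
claims = [{"patient_id": "P2", "visit_date": "2004-02-20", "codes": "K22.1"}]

merged = merge_and_deduplicate(dad, accs, claims)
print(len(merged))  # 3
```

A stricter or looser duplicate definition would change only the `key` tuple.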
Development of a logistic regression model to predict BE cases
Based on data from the fiscal year 2006/07 (when the diagnostic code for BE was introduced), the following approach was used to develop a logistic regression model for predicting BE cases in 2002/03 to 2005/06 fiscal years based on diagnostic and procedure codes:
1. Selecting records of the fiscal year 2006/07 to comprise the data set for the model building
Records of patients and patient visits in the fiscal year 2006/07 containing the specific diagnostic code for BE were first identified. The first three diagnostic codes of each of these records were examined to determine whether or not at least one pertained to conditions/diseases of the esophagus, stomach or duodenum. In cases where none of the diagnostic codes were deemed relevant, the first three procedure codes were examined. If none of these codes appeared to be associated with the management of Barrett's esophagus, or if no procedure codes were recorded, the record was excluded from the data set. Thus, the included records with a diagnostic code of K22.7 contained at least one relevant diagnostic or procedure code. They were assigned a value of “1”, indicating presence of the BE condition. A summary of the codes used to determine inclusion/exclusion of records with a specific diagnostic code for BE in the model data set is presented in Table 2. The remaining records of 2006/07 were assigned a value of “0”, indicating cases with no BE but with potential for being confused as BE based on diagnostic and procedure codes. The selection of records was performed independently by two experienced researchers, who then met to compare findings. Agreement between reviewers was assessed using the Kappa statistic (12).
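As an illustration of the agreement measure, the sketch below computes the kappa statistic for two reviewers, assuming the standard two-rater (Cohen's) form; the paper does not state which variant was used. The include/exclude decisions are made-up values.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of records")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Kappa is undefined here; returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# 1 = include the record, 0 = exclude it (illustrative decisions only).
reviewer_1 = [1, 1, 0, 1, 0, 1]
reviewer_2 = [1, 1, 0, 1, 0, 1]
print(cohen_kappa(reviewer_1, reviewer_2))  # 1.0, i.e. perfect agreement
```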
2. Partitioning the data set
Prior to building a model, the 2006/07 data set was split randomly into two, a training sample used to construct or fit the model and a validation sample used to evaluate it (assess the correct classification percentage) (13-15). We used 70% of the 2006/07 records to form the training sample, while the remaining 30% comprised the validation sample (13,15).
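A random 70/30 partition of this kind can be sketched in Python as follows. The seed and the use of visit indices as stand-ins for records are assumptions made for illustration; the study itself used SPSS.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_records(
    records: Sequence[T], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the records and cut them into training and validation samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 7006 stand-in records, matching the number of 2006/07 visits in the model data set.
visits = list(range(7006))
training, validation = split_records(visits)
print(len(training), len(validation))  # 4904 2102
```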
3. Building the model
Since the dependent variable/outcome of interest (i.e., presence or absence of the ICD-10-CA code for BE) was dichotomous, a logistic regression model was selected for the identification of BE cases. Nine independent variables were identified from the original data set based on diagnostic and procedure codes presented in Table 2. They included Esd (Diseases of esophagus, stomach and duodenum), Hernia (Diaphragmatic hernia, including hiatus hernia), Conmal (Congenital malformations of esophagus and other congenital malformations of upper alimentary tract), Post (Postprocedural disorders of digestive system, not elsewhere classified), Odd (Other diseases of digestive system), Car (Malignant neoplasm of the upper gastrointestinal tract), Esotomy (Esophagectomy), Ablation and Endoscopy (Endoscopy including Esophagoscopy). All statistical analyses were performed using SPSS 17.0 software (13).
We used all independent variables to make a main-effects model without any model selection, because the data set was large, there were only 9 independent variables, and our interest was in prediction. We then tested interaction terms for addition to the main-effects model using likelihood ratio tests (16). The correct classification percentage of the main-effects model and that of the interaction model were computed in the validation sample.
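The likelihood ratio test for a single added interaction term can be sketched as below. It assumes each interaction contributes exactly one extra parameter (a product of two indicator variables), so the test statistic is compared with a chi-square distribution with 1 degree of freedom. The log-likelihood values are hypothetical, not taken from the study.

```python
import math
from typing import Tuple


def likelihood_ratio_test_1df(loglik_reduced: float, loglik_full: float) -> Tuple[float, float]:
    """Return (statistic, p-value) for nested models differing by one parameter."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the fuller model cannot have a lower maximised log-likelihood")
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods: main-effects model vs. the same model plus one interaction.
statistic, p_value = likelihood_ratio_test_1df(-2510.0, -2507.0)
print(statistic, round(p_value, 4))
```

With these made-up inputs the statistic is 6.0 and the p-value falls below 0.05, so the term would be retained under the criterion described above.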
4. Estimating the incidence of BE
The model was employed to derive the predicted probability of BE for every patient visit/record in the data set between the fiscal years 2002/03 and 2005/06. A cut-off value of 0.5 was used to categorize visits as BE or non-BE related. Specifically, if the predicted probability of BE was greater than 0.5, the corresponding record was classified as a BE visit. If the predicted probability was less than 0.5, the record was classified as a non-BE visit (13).
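The classification rule can be sketched as follows. The intercept and coefficients are hypothetical placeholders (the fitted values are not reproduced in this text), and only two of the nine variables plus one interaction are shown. A probability of exactly 0.5 is treated as non-BE here, which is an assumption, since the text only specifies the greater-than and less-than cases.

```python
import math
from typing import Dict, Tuple

Term = Tuple[str, ...]

# Hypothetical values for illustration only; not the study's fitted model.
HYPOTHETICAL_INTERCEPT = -1.0
HYPOTHETICAL_COEFFICIENTS: Dict[Term, float] = {
    ("Endoscopy",): 1.2,
    ("Hernia",): -0.8,
    ("Endoscopy", "Hernia"): 0.5,  # interaction term: product of the two indicators
}


def predicted_probability(indicators: Dict[str, int]) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of coefficient * term value)))."""
    linear = HYPOTHETICAL_INTERCEPT
    for term, coefficient in HYPOTHETICAL_COEFFICIENTS.items():
        linear += coefficient * math.prod(indicators.get(name, 0) for name in term)
    return 1.0 / (1.0 + math.exp(-linear))


def classify_visit(probability: float, cutoff: float = 0.5) -> bool:
    """True means the visit is classified as a BE visit."""
    return probability > cutoff


for visit in ({"Endoscopy": 1}, {"Endoscopy": 1, "Hernia": 1}, {}):
    p = predicted_probability(visit)
    print(visit, round(p, 3), classify_visit(p))
```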
To calculate the incidence of BE, we needed to identify the first BE visit of each patient. If a patient had multiple visits during 2002/03 to 2005/06, only the first visit classified as a BE visit was counted in the incidence estimation.
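Counting each patient once, at the first visit classified as BE, might look like the sketch below. It assumes a fiscal year running from April 1 to March 31, which is consistent with the October 1 mid-year point used in this study but is not stated explicitly. The patient identifiers and dates are illustrative.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple


def fiscal_year(visit_date: date) -> str:
    """Label a date with its fiscal year, assuming April 1 to March 31."""
    start = visit_date.year if visit_date.month >= 4 else visit_date.year - 1
    return f"{start}/{(start + 1) % 100:02d}"


def count_incident_cases(be_visits: Iterable[Tuple[str, date]]) -> Counter:
    """Count new cases per fiscal year using each patient's earliest BE visit."""
    first_visit: Dict[str, date] = {}
    for patient_id, visit_date in be_visits:
        if patient_id not in first_visit or visit_date < first_visit[patient_id]:
            first_visit[patient_id] = visit_date
    return Counter(fiscal_year(d) for d in first_visit.values())


# Illustrative BE-classified visits; P1 has two visits and is counted only once.
be_visits = [
    ("P1", date(2004, 6, 1)),
    ("P1", date(2003, 2, 10)),
    ("P2", date(2003, 4, 1)),
    ("P3", date(2005, 3, 31)),
]
print(count_incident_cases(be_visits))
```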
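To calculate the annual incidence rate, defined as the number of new cases during a specified year per 100,000 individuals, the following formula was applied (Equation 1) (17):
\[ \text{Annual incidence rate} = \frac{\text{number of new cases during the year}}{\text{population at risk}} \times 100{,}000 \]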
For “population at risk”, we used an estimated mid-year population within each fiscal year (i.e., October 1st), obtained by linearly interpolating Alberta population estimates from Statistics Canada that are based on the population as of July 1st each year
(i = calendar year 2002, 2003, 2004, 2005, or 2006)
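One way to carry out the interpolation and the rate calculation is sketched below. October 1 falls three months after July 1, so the sketch assumes an interpolation weight of 3/12 between the July 1 estimates of years i and i+1; the exact weighting used in the study is not recoverable from this text. The population figures are hypothetical.

```python
def interpolate_october_population(
    pop_july_year_i: float, pop_july_year_i_plus_1: float, fraction: float = 0.25
) -> float:
    """Linear interpolation: P_oct = P_i + fraction * (P_{i+1} - P_i)."""
    return pop_july_year_i + fraction * (pop_july_year_i_plus_1 - pop_july_year_i)


def annual_incidence_rate(new_cases: int, population_at_risk: float, per: int = 100_000) -> float:
    """New cases during the year per `per` individuals in the population at risk."""
    return new_cases * per / population_at_risk


# Hypothetical July 1 estimates for two consecutive calendar years.
mid_year_population = interpolate_october_population(3_000_000, 3_100_000)
print(mid_year_population)                              # 3025000.0
print(round(annual_incidence_rate(605, mid_year_population), 2))  # 20.0
```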
The study data set, spanning the fiscal years 2002/03 to 2006/07, included 21,250 patients with 26,278 patient visits, after removing duplicates. The 2006/07 sample initially contained 1,995 patients with a diagnostic code of K22.7 including 2,265 visits, but after applying the study’s inclusion/exclusion criteria (Table 2), 57 patients with a diagnostic code of K22.7 including 60 patient visits were excluded. Therefore, the sample size for the model was 1,938 patients with the specific diagnostic code for BE representing 2,205 visits plus 4,252 patients without the specific diagnostic code for BE representing 4,801 visits. The degree of concordance between the 2 researchers who applied the inclusion/exclusion criteria to construct the model data set was K = 1.0, indicating perfect agreement (12).
The classification accuracy of our main-effects model in the validation sample was 76.8% for BE patient visits and 82.8% for non-BE visits (Table 3).
Overall, 80.9% of patient visits in the validation sample (Table 3) were correctly classified, consistent with the prediction accuracy of 79.3% in the training sample.
Thirty-six potential interaction terms were identified, each of which was added to the main-effects model, one at a time, to assess its statistical significance over and beyond the main-effects model (16). Of the 36 terms, the following 6 were found to be statistically significant with p-value < 0.05: Endoscopy×Hernia; Endoscopy×Odd; Endoscopy×Post; Endoscopy×Car; Esotomy×Car; and Esotomy×Endoscopy. When the interaction term Esotomy×Endoscopy was added to the main-effects model along with the other 5 interaction terms, it was no longer statistically significant (p = 0.62); thus, it was dropped from the full model.
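The count of 36 corresponds to the number of distinct pairs among the 9 independent variables. A short sketch, assuming only two-way interactions were considered:

```python
from itertools import combinations

# The nine independent variables named in the model-building step.
VARIABLES = [
    "Esd", "Hernia", "Conmal", "Post", "Odd",
    "Car", "Esotomy", "Ablation", "Endoscopy",
]

# Every unordered pair of distinct variables is one candidate interaction term.
candidate_pairs = list(combinations(VARIABLES, 2))
print(len(candidate_pairs))  # 36
```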
The full model with the 5 remaining interaction terms (Equation 3), when applied to the validation sample, correctly identified 76.7% of the BE visits and 82.9% of the non-BE visits (Table 4). Overall, 80.9% of the visits in the validation sample were correctly classified by the model (Table 4), consistent with the prediction accuracy of 79.4% in the training sample.
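The overall percentages can be cross-checked against the class-specific ones, since overall accuracy is a weighted average of the two. The sketch below assumes the validation sample has the same share of BE visits as the whole 2006/07 model data set (2,205 of 7,006 visits), which should hold approximately for a random split.

```python
def overall_accuracy(be_accuracy_pct: float, non_be_accuracy_pct: float, be_fraction: float) -> float:
    """Weighted average of the accuracy in BE visits and in non-BE visits."""
    return be_fraction * be_accuracy_pct + (1.0 - be_fraction) * non_be_accuracy_pct


# Share of BE visits in the 2006/07 model data set (assumed to carry over to validation).
be_fraction = 2205 / 7006

main_effects = overall_accuracy(76.8, 82.8, be_fraction)
full_model = overall_accuracy(76.7, 82.9, be_fraction)
print(round(main_effects, 1), round(full_model, 1))  # 80.9 80.9
```

Both values agree with the reported overall accuracy of 80.9%.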
The correct classification percentages of both the training and validation samples for the main-effects model and full model were similar. However, the Hosmer and Lemeshow goodness-of-fit test was statistically significant for the main-effects model (p < 0.001), indicating lack of fit, while that for the full model was not (p = 0.94) (16,18). Thus, the full model appears to fit the data better than the main-effects model, even though the prediction accuracy (i.e., dichotomizing at the predicted probability of 0.5) did not differ between the two models (16). The full model was therefore used to predict BE cases in the fiscal years 2002/03 to 2005/06.
The interpolated population estimates and the estimated numbers of new BE cases and incidence rates for fiscal years 2002/03 to 2005/06 are shown in Table 5 and Table 6, respectively. The estimated annual incidence rate ranged from 19.26 cases per 100,000 person years to 23.66 cases per 100,000 person years. As illustrated in Figure 1, the trend remained relatively stable over the 4-year time period.
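As a rough consistency check, the reported case counts and rates imply the population denominators that were used. The sketch below back-calculates them from the figures given in the abstract; the results are derived values, not the Table 5 estimates themselves.

```python
from typing import Dict, Tuple


def implied_population(new_cases: int, rate_per_100k: float) -> float:
    """Population at risk implied by a case count and a rate per 100,000."""
    return new_cases * 100_000 / rate_per_100k


# (estimated new cases, estimated rate per 100,000) as reported for each fiscal year.
reported: Dict[str, Tuple[int, float]] = {
    "2002/03": (642, 20.73),
    "2003/04": (733, 23.23),
    "2004/05": (628, 19.26),
    "2005/06": (792, 23.66),
}

implied = {year: implied_population(cases, rate) for year, (cases, rate) in reported.items()}
for year, population in implied.items():
    print(year, round(population))
```

The implied denominators rise steadily from about 3.1 million to about 3.35 million across the four years.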
Using administrative data and logistic regression, we estimated the annual incidence rate of BE in the province of Alberta over a 4-year time period. Since all health care related to the detection and management of BE is covered through Alberta’s publicly funded health care system, administrative data provided by the ministry of health captured the entire adult population of the province. To our knowledge, this study represents the first attempt to calculate such values for a Canadian province.
Internationally, previously published studies examining the incidence of BE have suggested that rates are correlated with the use of endoscopy (21). Between January 1, 1965 and January 1, 1998 the incidence of BE in Olmsted County, USA, increased 28-fold (0.37 to 10.5 cases per 100,000 person years) (21). Over that same time period, the utilization of endoscopy increased 22-fold (21). In our study, the incidence of BE and the number of endoscopies performed per year were found to be relatively constant over the four years. This difference may be explained by the fact that endoscopy has been part of routine practice for well over a decade. Therefore, any variations resulting from the introduction of a new technology would have stabilized before the starting point for our study (2002).
When we used model selection procedures such as the stepwise method (16), the prediction errors were approximately equal to those of the model with all 9 variables.
Nevertheless, limitations of our study remain. Firstly, the data used were administrative data, and it was not possible to verify their accuracy or completeness (22). One recent study suggested that prevalence is overestimated when administrative data are used (23). At the same time, another study reported that the incidence and prevalence are underestimated with administrative data (24). Secondly, no comparison between the actual and the predicted incidence rate could be made, since the actual number of new BE cases per year was not available. Thirdly, a chart review, considered to be the gold standard for identifying patients with a particular condition, was not performed. Therefore, sensitivity and specificity for the 2002/03 to 2005/06 fiscal years could not be assessed (24-27). However, it is important to note that the purpose of this study was to estimate incidence using data readily accessible to policy makers facing tight time and resource constraints. Organizing and conducting a chart review of hundreds of records would not have been feasible.
Understanding the incidence of BE is critical for developing strategies aimed at preventing esophageal adenocarcinoma. Considerable debate over screening for BE in the general population exists. Therefore, estimates of BE incidence are needed in order to develop economic models to define cost-effective screening programs. Such estimates are also needed for more accurately planning resource requirements for BE treatment programs. Treatment often involves resection of BE using endoscopic ablative technologies and/or surgery.
The relatively constant incidence of BE in Alberta over the time period of the study is not surprising. First, there are no formal recommendations for BE screening. Consequently, physicians may not be referring patients with clinical risk factors for assessment. Second, adenocarcinoma is often diagnosed in its later stages. In some cases, the tumour has already replaced the Barrett’s mucosa. As a result, the underlying BE may only become apparent if the tumour regresses following chemotherapy (28). Third, the incidence of BE is tied to the number of endoscopies performed, and utilization has not risen significantly over the last 5 years. Fourth, there is likely a time lag effect related to the progression from BE to cancer. This time lag has yet to be established, but may be 10 to 15 years. In fact, studies have shown an increase in BE that mirrors esophageal adenocarcinoma if a 15 year period is examined (29). The length of time considered in our study was considerably shorter.
To our knowledge, this study is unique in that it attempts to estimate BE incidence using data at the population level. Such information is important for planning prevention programs for esophageal adenocarcinoma, for which BE is a precursor. Estimates of the annual incidence of BE in Alberta over the fiscal years 2002/03 to 2005/06 suggest that rates have not varied considerably. The introduction of the specific diagnostic code for BE in the fiscal year 2006/07 now makes it possible to calculate the number of new coded cases each year. Therefore, the accuracy of the predictions made by the regression model developed for this study could be assessed when such data become available.
This study was made possible through a financial contribution from Alberta Health and Wellness and under the auspices of the Alberta Health Technologies Decision Process initiative: the Alberta model for health technology assessment and policy analysis. The views expressed herein do not necessarily represent the official policy of Alberta Health and Wellness.
We acknowledge Dr. Xiaoming Wang at the University of Alberta for his valuable discussions on the statistical issues of this work. We also thank Paul McCann, Leigh-Ann Topfer, and Dr. John Walker at the University of Alberta for their advice and support throughout the study.
(1) Sharma P, Sampliner R. Barrett's Esophagus and Esophageal Adenocarcinoma. 2nd ed. Malden, Massachusetts; Oxford, UK: Blackwell Publishing Ltd.; 2006.
(2) Pearson FG, Cooper JD, Deslauriers J, et al., editors. Thoracic Surgery. 2nd ed. Philadelphia: Churchill Livingstone; 2002.
(3) Lagergren J, Bergstrom R, Lindgren A, Nyren O. Symptomatic gastroesophageal reflux as a risk factor for esophageal adenocarcinoma. New England Journal of Medicine 1999;340(11):825-831.
(4) Farrow DC, Vaughan TL, Sweeney C, Gammon MD, Chow W-H, et al. Gastroesophageal reflux disease, use of H2 receptor antagonists, and risk of esophageal and gastric cancers. Cancer Causes & Control 2000;11(3):231.
(5) Shaheen N, Ransohoff DF. Gastroesophageal reflux, barrett esophagus, and esophageal cancer: scientific review. JAMA 2002 Apr 17;287(15):1972-1981.
(6) Spechler SJ, Lee E, Ahnen D, et al. Long-term outcome of medical and surgical therapies for gastroesophageal reflux disease: follow-up of a randomized controlled trial. JAMA 2001 May 9;285(18):2331-2338.
(7) Canadian Cancer Society. Canadian Cancer Statistics 2009. 2009. (http://www.cancer.ca/canada-wide/about%20cancer/cancer%20statistics/canadian%20cancer%20statistics.aspx?sc_lang=en). (Accessed September 29, 2009).
(8) Kamangar F, Dores GM, Anderson WF. Patterns of cancer incidence, mortality, and prevalence across five continents: defining priorities to reduce cancer disparities in different geographic regions of the world. J.Clin.Oncol. 2006 May 10;24(14):2137-2150.
(9) Pohl H, Welch HG. The role of overdiagnosis and reclassification in the marked increase of esophageal carcinoma incidence. Journal of the National Cancer Institute, 2005; 97(2):142-146
(10) Clifton JC, Finley RJ, Gelfand G, et al. Development and validation of a disease-specific quality of life questionnaire (EQOL) for potentially curable patients with carcinoma of the esophagus. Dis.Esophagus 2007;20(3):191-201.
(11) Photodynamic therapy for the treatment of Barrett's esophagus: a systematic review and economic evaluation. Health Technol Policy Series 2009;1.
(12) Haley SM, Osberg JS. Kappa coefficient calculation using multiple ratings per subject: a special communication. Phys.Ther. 1989;69(11):970-974.
(13) SPSS Inc., inventor. SPSS statistical software, release 17.0. Illinois/USA patent. 2008 .
(14) Moss S. Cross validation in discriminant function analysis. 2008. (http://www.psych-it.com.au/Psychlopedia/article.asp?id=157). (Accessed July 7, 2009).
(15) Schneider J. Cross validation. 1997. (http://www.cs.cmu.edu/~schneide/tut5/node42.html). (Accessed June 10, 2009).
(16) Hosmer DW, Lemeshow S editors. Applied Logistic Regression. 2nd ed. New York: Wiley; 2000.
(17) Merrill RM. Introduction to Epidemiology. 5th ed. Sudbury, Massachusetts, USA: Jones and Bartlett Publishers, LLC; 2010.
(18) SAS Institute Inc. The Hosmer-Lemeshow goodness-of-fit test. 2009. (http://support.sas.com/documentation/cdl/en/statug/59654/HTML/default/statug_logistic_sect039.htm). (Accessed September 10, 2009).
(19) Statistics Canada. Population by year, by province and territory. 2009. (http://www40.statcan.gc.ca/l01/cst01/demo02a-eng.htm?sdi=population). (Accessed August 18, 2009).
(20) Statistics Canada Demography Division. Annual Demographic Statistics, 1999. 2000;91-213-XPB:17.
(21) Conio M, Cameron AJ, Romero Y, et al. Secular trends in the epidemiology and outcome of Barrett's oesophagus in Olmsted County, Minnesota. Gut 2001 Mar;48(3):304-309.
(22) Iezzoni LI. Assessing quality using administrative data. Ann.Intern.Med. 1997 Oct 15;127(8 Pt 2):666-674.
(23) Corley DA, Kubo A, DeBoer J, et al. Diagnosing Barrett's esophagus: reliability of clinical and pathologic diagnoses. Gastrointest.Endosc. 2009 May;69(6):1004-1010.
(24) Dodds L, Spencer A, Shea S, et al. Validity of autism diagnoses using administrative health data. Chronic Dis.Can. 2009;29(3):102-107.
(25) Quan H, Li B, Saunders LD, et al. Assessing validity of ICD-9-CM and ICD-10 administrative data in recording clinical conditions in a unique dually coded database. Health Serv.Res. 2008 Aug;43(4):1424-1441.
(26) Guttmann A, Nakhla M, Henderson M, et al. Validation of a health administrative data algorithm for assessing the epidemiology of diabetes in Canadian children. Pediatr.Diabetes 2010; 11(2):122-128.
(27) Benchimol EI, Guttmann A, Griffiths AM, et al. Increasing incidence of paediatric inflammatory bowel disease in Ontario, Canada: evidence from health administrative data. Gut 2009 Nov;58(11):1490-7
(28) Theisen J, Stein HJ, Dittler HJ, Feith M, Moebius C et al. Preoperative chemotherapy unmasks underlying Barrett's mucosa in patients with adenocarcinoma of the distal esophagus. Surgical Endoscopy, 2002; 16(4):671-673
(29) Skinner DB, Walther BC, Riddell RH, Schmidt H, DeMeester TR. Barrett's esophagus. Comparison of benign and malignant cases. Annals of Surgery, 198:554-565.