The Effect of Coding Method on Cause-of-Death Rankings

Abstract

Background: Cause-of-death rankings are often used for planning or evaluating health policy measures. In the European Union, some countries produce cause-of-death statistics by a manual coding of death certificates, while other countries use an automated coding system. The outcome of these two different methods in terms of the selected underlying cause of death for statistics may vary considerably. Therefore, this study explores the effect of coding method on the ranking of countries by major causes of death. Method: Age and sex standardized rates were extracted for 33 European (related) countries from the cause-of-death registry of the European Statistical Office (Eurostat). Wilcoxon’s rank sum test was applied to the ranking of countries by major causes of death. Results: Statistically significant differences due to coding method were identified for dementia, stroke and pneumonia. These differences could be explained by a different selection of dementia or pneumonia as underlying cause of death and by a different certification practice for stroke. Conclusion: Coding method should be taken into account when constructing or interpreting rankings of countries by cause of death.

Share and Cite:

Harteloh, P. (2023) The Effect of Coding Method on Cause-of-Death Rankings. Open Journal of Statistics, 13, 778-788. doi: 10.4236/ojs.2023.136039.

1. Introduction

In health policy, rankings give data a structure so that they become information. Diseases, causes of death, hospitals, health care interventions and even countries are ranked in an attempt to inform the public, government or politicians about the effect of prevention or therapy, the quality of care, or the value for money spent on the health care system. Cause-of-death rankings are part of this information. They are often a first and regular output of mortality statistics [1] [2] [3]. However, different methods are used for the production of cause-of-death statistics. In some European countries, death certificates are processed manually by medical coders and in others automatically by dedicated software (Styx, MIKADO, MMDS, ACME, Iris) [4] - [9]. These two methods have led to different outcomes in terms of selecting an underlying cause of death for mortality statistics [10] [11] [12] [13] [14]. Therefore, this study investigates to what extent a difference in method (manual versus automated coding) affects the ranking of countries by major causes of death.

2. Methods

The cause-of-death registry of the European Statistical Office (Eurostat) provided the material for this study. This registry contained data of 34 countries: 27 European Union (EU) member states and 7 EU-related countries [15]. Ten major causes of death were included for an analysis of rankings. For women, malignant neoplasm of the breast and for men malignant neoplasm of the prostate were added. Ill-defined and unknown causes of mortality were also included as an indicator for the quality of a death registration, and accidental falls as a major external cause of death. Data standardized for age and sex were extracted from the Eurostat database for the year 2017, a representative year for mortality statistics without epidemics, war or disasters and with enough countries (still) coding manually. A metafile provided information about the method countries used for producing their cause-of-death statistics [4]. Thus, countries coding automatically (AC) and countries coding manually (MC) could be distinguished. Data of Liechtenstein were not included in the rankings because, due to the (very) low number of yearly deaths, not all major causes of death were present in its mortality data. Germany was split into states (“Länder”) coding manually (Germany-MC) and states coding automatically (Germany-AC) [4] [7]. For ranking, the median rate of a cause of death in each of these two kinds of states was used.
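The handling of Germany can be illustrated with a minimal sketch. It assumes one standardized rate per state for a single cause of death; the state labels and rates below are invented for illustration and are not Eurostat figures, and the function name is hypothetical.

```python
from statistics import median

# Hypothetical standardized rates per 100,000 for one cause of death.
# State labels, rates and coding methods are invented for illustration.
state_rates = {"State 1": 40.0, "State 2": 44.0, "State 3": 50.0,
               "State 4": 30.0, "State 5": 36.0}
state_method = {"State 1": "MC", "State 2": "MC", "State 3": "MC",
                "State 4": "AC", "State 5": "AC"}

def median_rate_by_method(rates, method):
    """Collapse state-level rates into one median rate per coding method."""
    grouped = {}
    for state, rate in rates.items():
        grouped.setdefault(method[state], []).append(rate)
    return {f"Germany-{m}": median(values) for m, values in grouped.items()}

germany_units = median_rate_by_method(state_rates, state_method)
print(germany_units)
```

The two resulting entries can then be ranked alongside the other countries as separate units.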

Countries were ranked by age and sex standardized death rates for major causes of death from high (rank number 1) to low (rank number 34), i.e. the higher the cause-specific mortality, the lower the rank number. The Mann-Whitney U/Wilcoxon rank-sum test (two-tailed, α = 0.05) was applied to these rankings in order to identify statistically significant differences between countries coding manually and countries coding automatically [16].
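The ranking and test can be sketched in a few lines. This is only an illustration, not the SPSS procedure used in the study. It assumes a large-sample normal approximation without tie or continuity correction, and the country labels, rates and function names are hypothetical.

```python
import math
from statistics import NormalDist

def rank_descending(rates):
    """Rank units from the highest rate (rank 1) to the lowest; ties get the mean rank."""
    ordered = sorted(rates, key=rates.get, reverse=True)
    ranks, i = {}, 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and rates[ordered[j + 1]] == rates[ordered[i]]:
            j += 1
        mean_rank = (i + 1 + j + 1) / 2  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(ranks, group_a, group_b):
    """Two-sided Mann-Whitney U test on given ranks (normal approximation,
    no tie or continuity correction; an assumption of this sketch)."""
    n1, n2 = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[c] for c in group_a)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u = min(u_a, n1 * n2 - u_a)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u          # z <= 0 because u is the smaller statistic
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, p

# Hypothetical rates per 100,000: four units coding automatically, four manually.
rates = {"A": 60.0, "B": 55.0, "C": 50.0, "D": 45.0,
         "E": 30.0, "F": 25.0, "G": 20.0, "H": 15.0}
ac_units, mc_units = ["A", "B", "C", "D"], ["E", "F", "G", "H"]

ranks = rank_descending(rates)
u_stat, p_value = mann_whitney_u(ranks, ac_units, mc_units)
print(u_stat, p_value < 0.05)
```

In this invented example the two groups do not overlap at all, so U is 0 and the two-sided p-value falls below α = 0.05. For small groups an exact test would be preferable to the normal approximation.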

The Eurostat database contained underlying causes of death, defined as the starting point of the causal chain of morbid events leading to death ([17]: p. 32). The 10th edition of the International Statistical Classification of Diseases (ICD-10) was used by all countries contributing to the Eurostat database of 2017.

Cause-of-death statistics are a tabulation of diagnoses on death certificates that are often stated in rather general terms. Therefore, epidemiological analysis requires grouping. As “dementia not otherwise specified” (ICD-10: F03) contained an unknown, i.e. undiagnosed, proportion of deceased with Alzheimer’s disease (ICD-10: G30) or vascular dementia (ICD-10: F01), the group of dementia related diseases was considered as one cause of death (ICD-10: F01-F03, G30-G31). The same holds for stroke, i.e. cerebrovascular accidents. A not otherwise specified cerebrovascular accident (ICD-10: I64) must have been either ischemic (ICD-10: I63) or hemorrhagic (ICD-10: I61). So the group of cerebrovascular accidents, late effects included (ICD-10: I69), was also considered as one cause of death (ICD-10: I60-I69). The SPSS-25 package was used for conducting the rank-sum tests.
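The grouping described above amounts to a lookup on the three-character ICD-10 category. A minimal sketch, assuming codes arrive as strings such as "G30.9" or "I64"; the function name and the group labels are hypothetical:

```python
def icd10_group(code):
    """Map an ICD-10 code to the grouped cause of death used in this study,
    or None if the code falls outside the two groups."""
    category = code.strip().upper().replace(".", "")[:3]
    if len(category) < 3 or not category[1:].isdigit():
        return None  # not a regular letter + two-digit category
    letter, number = category[0], int(category[1:])
    # Dementia related diseases: F01-F03 and G30-G31
    if (letter == "F" and 1 <= number <= 3) or (letter == "G" and 30 <= number <= 31):
        return "dementia"
    # Cerebrovascular accidents, late effects included: I60-I69
    if letter == "I" and 60 <= number <= 69:
        return "stroke"
    return None

for example in ["F03", "G30.9", "I64", "I69.4", "J18.9"]:
    print(example, icd10_group(example))
```

Codes outside both ranges, such as J18.9 for pneumonia, return None here because the sketch covers only the two groups defined in the text.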

3. Results

Of the 33 countries analyzed, 13 appeared to be coding manually and 20 reported the use of an automated coding system (Table 1). Sweden (1990), England/Wales (1993) and Italy (1995) were among the early users of automated coding systems in Europe. They started with the US-MMDS/ACME system and more recently switched to Iris, the language-independent software for producing underlying causes of death for mortality statistics. Most countries in the European region switched from manual to automated coding in the 2000s. Poland and Switzerland reported testing an automated coding system, but they did not switch to the regular use of such a system for producing their cause-of-death statistics. However, testing the system might have had an effect on the manual coding of death certificates as well. There was no effect of the number of deceased on the use of an automated coding system.

Table 1 presents the rankings of 33 EU-related countries by major causes of death. Countries coding automatically reported statistically significantly higher death rates for dementia (incl. Alzheimer’s disease) and lower death rates for stroke and pneumonia than countries coding manually. For major causes of cancer deaths, such as lung, colon, prostate or breast cancer, the difference between countries coding manually and countries coding automatically was not statistically significant. This was also the case for chronic (ischemic) heart diseases, Chronic Obstructive Pulmonary Disorders (COPD), diabetes mellitus, ill-defined causes of death and accidental falls.

4. Discussion

This study. This study identifies an effect of method (manual versus automated coding) on the ranking of countries for three major causes of death: dementia, stroke and pneumonia. Countries coding automatically report on average two times more dementia as underlying cause of death than countries coding manually (Table 1: ratio MC/AC = 0.5). This finding can be explained by the preference of automated coding systems for dementia when noted in part 2 of the death certificate, the place for contributing causes of death. In such cases the selection of dementia as underlying cause of death is a statistical decision (selection rule 3 in the ICD-10 manual/volume 2), not completely supported by the judgment of clinicians [17] [18]. The result is a 14% - 26% increase of dementia as underlying cause of death when countries switch from manual to automated coding [10] [11] [12] [13] [14]. Because the size of this increase differs between countries, it is hard to apply a fixed percentage as correction factor for all countries to align the outcome of manual coding with that of automated coding and restore the homogeneity of the ranking. Thus, for studying dementia as cause of death, countries should be placed in separate rankings according to their method of coding.

Table 1. Major causes of death for 33 EU (related) countries ranked by standardized death rates per 100,000 inhabitants. Source: Eurostat (hlth_cd_asdr2).

For cerebrovascular accidents (stroke), countries coding manually report the disorder on average 1.7 times more as underlying cause of death than countries coding automatically (Table 1: ratio MC/AC = 1.7). Automated coding systems tend to code more to the causes of cerebrovascular accidents (hypertension, cardiac arrhythmia, etc.), if reported on the death certificate, than manual coders do. This tendency of automated coding systems was corrected by a software update in 2019: cerebral hemorrhage and not hypertension is now selected as underlying cause of death when “cerebral hemorrhage due to hypertension” is encountered on a death certificate [19]. However, as not all countries apply the updates of the automated coding system every year, the occurrence of stroke in cause-of-death statistics depends on the software version in use. Thus, for stroke the outcome of coding may vary considerably with the national certification practice (is the cause reported on the death certificate or not?), with the rules applied by manual coders, and with the software version used for automated coding. Therefore, countries should not be ranked by stroke as cause of death.

For pneumonia, countries coding manually report the disease on average 50% more as underlying cause of death than countries coding automatically (Table 1: ratio MC/AC = 1.5). According to the ICD-10 coding instructions, pneumonia is a consequence of any other disease (unless ill-defined) reported on a death certificate [17] . In practice, this instruction is followed more strictly by automated coding systems than by medical coders as the 21% - 44% decrease in occurrence of pneumonia as underlying cause of death shows when countries switch from manual to automated coding [10] [11] [12] [13] [14] . Thus, for studying pneumonia as a cause of death, countries should be placed in separate rankings according to their method of coding.

For other causes of death, this study shows no effect of coding method on the ranking of countries. The ranking can be used for health policy purposes. Malignant neoplasms seem to be a clear cause of death. A malignancy underlies the transition from health to disease and is the beginning of a causal chain leading (eventually) to death. There appeared to be no large differences in frequency by coding method. Rankings provide a reason for further research to investigate why countries such as Hungary, Croatia and Slovakia show relatively high and countries such as Turkey, Cyprus and Switzerland show relatively low mortality rates of malignancies.

The ranking of countries by ischemic heart diseases (mainly acute myocardial infarction) showed no bias by coding method and can be used to identify countries with a relatively low (Lithuania, Hungary, Latvia) or high (Netherlands, Portugal, France) effect of prevention and care on the mortality due to acute myocardial infarction.

The ranking of countries by other heart diseases (mainly heart failure) is also not biased by coding method and can be used to indicate countries with a relatively high (Bulgaria, Serbia, Poland) or low (Lithuania, Finland, United Kingdom) mortality due to heart failure. As the occurrence of heart failure in cause-of-death statistics is mainly due to death certificates not reporting its cause (myocardial infarction, hypertension, heart valve disorder, etc.), this occurrence is also a measure for the quality of the certification practice in a country [20] .

Rankings of countries by symptoms and signs or ill-defined ICD-10 codes are used as indicator of the quality of cause-of-death registrations [21] . Since this ranking is not biased by coding method, it can indeed point out relatively accurate (Malta, Slovenia, Greece) or inaccurate (Poland, Serbia, Denmark) cause-of-death registrations.

The ranking of countries by accidental falls, a major external cause of death, is not biased by coding method and can be used to indicate countries with a relatively strong (Italy, Bulgaria, Spain) or relatively weak (Slovenia, Netherlands, Finland) effect of fall prevention.

Strengths and limitations of this study. This study draws attention to an underexposed aspect of ranking countries by causes of death. The coding method affects such rankings for some major causes of death. To date, this has not been studied, so the results of this study cannot be compared directly with other publications. Bridge- or double coding studies performed on death certificates generally show an increase in the occurrence of dementia and a decrease in the occurrence of pneumonia in cause-of-death statistics when countries switch from manual to automated coding, supporting the findings of this study. The case is more complicated for stroke. Some bridge coding studies show an increase and others a decrease in the frequency of stroke in cause-of-death statistics when countries switch from manual to automated coding [10] [13]. The (national) certification practice seems to determine the occurrence of stroke on death certificates, the material for cause-of-death statistics. This particular circumstance underpins the conclusion of this study not to compare countries on a ranking by stroke as cause of death.

A limitation of this study is inherent to the statistical test used. Wilcoxon’s rank sum/Mann-Whitney U test is a robust non-parametric test to investigate potential bias in rankings. However, the approach is not multifactorial, and factors other than the coding method may also have influenced the rankings studied.

Interpretation of rankings. In this study, age and sex standardized data have been used. A difference in population structure therefore cannot be an explanation for the differences found. The incidence or prevalence of a disease in relation to the quality of the health care system remains a possible explanation for the position of a country in a ranking that is not biased by coding method. For interpreting such (valid) rankings, their intrinsic properties, the method of data production and the nature of the data have to be taken into account.

First of all, the properties of the ranking should be considered. Usually, it is not the difference between the first and second or, for instance, the eighth and ninth place in the ranking which counts, but the difference between the first and the last place in the ranking that is meaningful. This study shows, for example, rankings of malignancies with much less dispersion (9-50/100,000) than rankings of cardiovascular causes of death (269-474/100,000). This implies that a difference between the first and last country in the ranking of a malignant neoplasm may be less relevant than the difference between the first and last country in a ranking of a cardiovascular cause of death. So, the dispersion of data has to be taken into account when interpreting a ranking.

Second, this study shows that the method used to produce the data has to be considered. A difference in method may bias the ranking, as the rankings for dementia, pneumonia and stroke as underlying cause of death show. This finding might change over time as more countries switch from manual to automated coding or countries adjust their (manual or automated) coding method. Therefore, countries coding manually and countries coding automatically should always first be considered separately before putting them into the same ranking, and a ranking should be tested for homogeneity, for example by Wilcoxon’s rank sum test, before drawing conclusions on the place of a country in the ranking.

And last but not least, the nature of the data being ranked should be taken into account. Death is a serious event. But what is its cause? Since the 17th century, statistics have been produced to monitor mortality [17]. In the 20th century, the World Health Organization (WHO) provided a classification (the ICD) and guidelines for tabulation [17] [22]. These guidelines prescribe the selection of one (“the”) cause of death per deceased. However, this reflects an idea of causality (one cause-one effect) which is challenged nowadays by multi-morbidity and interacting (chronic) diseases at the end of life [23] [24]. For example, persons do not seem to die of dementia as such, but from the interaction of dementia with infectious diseases (pneumonia, urinary tract infections), complicating behavioral changes (stopping eating/drinking) or accompanying accidents due to posture and gait problems [25]. What then is the one and only cause of death in such a case? Or, what is the cause of death in case of a stroke against the background of a long-standing diabetes mellitus? And in case of pneumonia as cause of death, why did the person actually die in times of abundant antibiotics and effective medical therapies? Was she/he well treated or not? Immunocompetent or not? Examples of this kind, encountered more and more on death certificates today, show the conceptual difficulty in assigning (only) one cause of death per deceased. More and more, the concept of a network of causes replaces the idea of one single cause of death [26]. So for interpreting a ranking, the appropriateness of the causal theory applied to a disease in relation to death should be taken into account. Given the results of this study, the causal theory underlying cause-of-death statistics seems to fit malignancies or accidental falls better than chronic diseases such as dementia, infectious diseases such as pneumonia, or risk-factor mediated events such as a stroke.

5. Conclusion

Cause-of-death rankings are widely used for comparing the health characteristics of different countries. There appear to be (statistically significant) differences in the ranking of countries coding the cause of death manually or automatically for dementia, stroke and pneumonia. The effect of coding method should be taken into account when constructing or interpreting rankings of countries by causes of death for policy purposes.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Heron, M. (2019) Deaths: Leading Causes for 2017. National Vital Statistics Reports, 68, 1-77.
[2] Eurostat. Statistics Explained. Causes of Death Statistics—People over 65.
https://ec.europa.eu/eurostat/statistics-explained/index.php/Causes_of_death_statistics_-_people_over_65
[3] Griffiths, C., Rooney, C. and Brock, A. (2005) Leading Causes of Death in England and Wales—How Should We Group Causes? Health Statistics Quarterly, 28, 6-17.
[4] Eurostat: Causes of Death. National Metadata.
https://ec.europa.eu/eurostat/cache/metadata/en/hlth_cdeath_sims.htm
[5] Jougla, E., Pavillon, G., Rossollin, F., De Smedt, M. and Bonte, J. (1998) Improvement of the Quality and Comparability of Causes-of-Death Statistics inside the European Community. EUROSTAT Task Force on “Causes of Death Statistics”. Revue D’epidemiologie et de Sante Publique, 46, 447-456.
[6] IRIS Institute (2014) IRIS User Reference Manual V4.4.1. IRIS Institute, Cologne.
https://www.bfarm.de/SharedDocs/Downloads/EN/Code-Systems/iris-institute/manuals/iris-user-reference-manual-v4-4-1s1_pdf.html?nn=922496&cms_dlConfirm=true&cms_calledFromDoc=922496
[7] Eckert, O. (2019) Electronic Coding of Death Certificates. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz, 62, 1468-1475. (In German)
https://doi.org/10.1007/s00103-019-03045-2
[8] Israel, R.A. (1990) Automated Coding and Processing in the United States of America. World Health Statistics Quarterly, 43, 259-262.
[9] Lu, Th., Tsau, S.M. and Wu, T.C. (2005) The Automated Classification of Medical Entities (ACME) System Objectively Assessed the Appropriateness of Underlying Cause-of-Death Certification and Assignment. Journal of Clinical Epidemiology, 58, 1277-1281.
https://doi.org/10.1016/j.jclinepi.2005.03.017
[10] Harteloh, P. (2020) The Implementation of an Automated Coding System for Cause-of-Death Statistics. Informatics for Health and Social Care, 45, 1-14.
https://doi.org/10.1080/17538157.2018.1496092
[11] Floristán, F.Y., Delfrade, O.J., Carrillo, P.J., Aguirre, P.J. and Moreno-Iribas, C. (2016) Coding Causes of Death with IRIS Software. Impact in Navarre Mortality Statistic. Revista Española de Salud Pública, 90, e1-e9. (In Spanish)
[12] Orsi, C., Navarra, S., Frova, L., et al. (2019) Impact of the Implementation of ICD-10 2016 Version and Iris Software on Mortality Statistics in Italy. Epidemiologia & Prevenzione, 43, 161-170. (In Italian)
[13] McKenzie, K., Walker, S. and Tong, S. (2002) Assessment of the Impact of the Change from Manual to Automated Coding on Mortality Statistics in Australia. Health Information Management, 30, 1-11.
[14] Martins, R.C. and Buchalla, C.M. (2015) Automatic Coding and Selection of Causes of Death: An Adaptation of Iris Software for Using in Brazil. Revista Brasileira de Epidemiologia, 18, 883-893.
[15] Eurostat. Causes of Death—Standardized Death Rate by NUTS-2 Region of Residence.
https://ec.europa.eu/eurostat/data/database
[16] MacFarland, T.W. and Yates, J.M. (2016) Mann-Whitney U Test. In: MacFarland, T.W. and Yates, J.M., Eds., Introduction to Nonparametric Statistics for the Biological Sciences Using R, Springer, Cham, 103-132.
https://doi.org/10.1007/978-3-319-30634-6_4
[17] World Health Organization (WHO) (2016) International Statistical Classification of Diseases and Related Health Problems, 10th Revision. Volume 2 (Instruction Manual). WHO, Geneva.
[18] Harteloh, P. (2020) The Role of Dementia as Cause of Death: Certifier’s Opinions versus Automated Coding. Dementia and Geriatric Cognitive Disorders, 49, 511-517.
https://doi.org/10.1159/000510678
[19] World Health Organization (WHO). List of Official ICD-10 Updates.
https://www.who.int/standards/classifications/classification-of-diseases/list-of-official-icd-10-updates
[20] Harteloh, P. (2023) The Overestimation of Heart Failure in Cause-of-Death Statistics. Clinical Cardiovascular Research, 2, 1-6.
https://doi.org/10.58489/2836-5917/010
[21] World Health Organization (WHO). Mortality Database: Ill-Defined Diseases.
https://platform.who.int/mortality/themes/theme-details/MDB/ill-defined-diseases
[22] World Health Organization (WHO) (1949) Manual of the International Statistical Classification of Diseases, Injuries and Causes of Death. 6th Revision, Volume 1, WHO, Geneva, 345-346.
[23] Mackenbach, J.P., Kunst, A.E., Lautenbach, H., Oei, Y.B. and Bijlsma, F. (1997) Competing Causes of Death: A Death Certificate Study. Journal of Clinical Epidemiology, 50, 1069-1077.
https://doi.org/10.1016/S0895-4356(97)00165-0
[24] Grippo, F., Désesquelles, A., Pappagallo, M., Frova, L., Egidi, V. and Meslé, F. (2020) Multi-Morbidity and Frailty at Death: A New Classification of Death Records for an Ageing World. Population Studies, 74, 437-449.
https://doi.org/10.1080/00324728.2020.1820558
[25] Brunnström, H.R. and Englund, E.M. (2009) Cause of Death in Patients with Dementia Disorders. European Journal of Neurology, 16, 488-492.
https://doi.org/10.1111/j.1468-1331.2008.02503.x
[26] Pearl, J. and Mackenzie, D. (2018) The Book of Why. The New Science of Cause and Effect. Penguin Books, London.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.