Data mining of hospital characteristics in online publication of medical quality information


Information disclosure can reduce information asymmetry between health care providers and patients, thus improving both patient safety and medical quality. The National Bureau of Health Insurance (NBHI) in Taiwan currently publishes health-related information online in order to enhance service efficiency and enable the public to monitor the country’s medical system. In this work, a data mining technique, the classification and regression tree (CART), is applied to this publicly available online quality information to compare the characteristics of hospitals. The hospital quality indicators and characteristics data are available on the websites of the NBHI and the Department of Health. The full classification and regression tree presented in this work, grown using the hospitals’ medical quality indicators and characteristic values, classifies all hospitals into seven groups. The rate of stays longer than 30 days, which is the dependent variable in this study, is most influenced by the number of medical staff. This reflects the fact that hospitals employing fewer medical staff are smaller, and patients who are likely to have longer stays tend to go to medium or large hospitals. Policy makers should work to decrease or eliminate persistent healthcare disparities among different socioeconomic groups and offer more online health-related services to reduce information asymmetry between health care providers and patients.
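To make the method concrete, the following is a minimal sketch of how a regression tree can partition hospitals into at most seven terminal groups by repeatedly choosing the split that most reduces the squared error of the dependent variable. It is not the authors' implementation: the data are synthetic, the field names (`medical_staff`, `beds`, `long_stay_rate`), the assumed relationship between staff and long-stay rate, and the `min_leaf` setting are all hypothetical and chosen only for illustration.

```python
import random

def sse(values):
    """Sum of squared errors around the mean (node impurity for regression)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, target, features, min_leaf=5):
    """Return (gain, feature, threshold) for the best binary split, or None."""
    parent = sse([r[target] for r in rows])
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def grow_tree(rows, target, features, max_leaves=7):
    """Best-first growth: always split the leaf with the largest SSE reduction."""
    leaves, splits = [rows], []
    while len(leaves) < max_leaves:
        candidates = [(best_split(leaf, target, features), i)
                      for i, leaf in enumerate(leaves)]
        candidates = [(s, i) for s, i in candidates if s is not None and s[0] > 0]
        if not candidates:
            break
        (gain, f, t), i = max(candidates, key=lambda c: c[0][0])
        leaf = leaves.pop(i)
        leaves.append([r for r in leaf if r[f] <= t])
        leaves.append([r for r in leaf if r[f] > t])
        splits.append((f, t, gain))
    return leaves, splits

# Synthetic hospitals (hypothetical values, not NBHI data).
random.seed(0)
hospitals = []
for _ in range(200):
    staff = random.randint(10, 2000)
    beds = random.randint(20, 1500)
    # Assumed for illustration only: long-stay rate rises with staff count.
    rate = 0.5 + 0.002 * staff + random.gauss(0, 0.3)
    hospitals.append({"medical_staff": staff, "beds": beds, "long_stay_rate": rate})

leaves, splits = grow_tree(hospitals, "long_stay_rate", ["medical_staff", "beds"])
for group in leaves:
    mean_rate = sum(r["long_stay_rate"] for r in group) / len(group)
    print(f"group of {len(group):3d} hospitals, mean long-stay rate {mean_rate:.2f}")
print("first split on:", splits[0][0])
```

In a tree of this kind, the variable chosen for the first split is the one that explains the most variation in the dependent variable, which is the sense in which the number of medical staff can be said to be the most influential characteristic. In practice a library implementation with pruning and cross-validation would normally be preferred over a hand-rolled version like this one.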

Share and Cite:

Kreng, V. and Yang, S. (2013) Data mining of hospital characteristics in online publication of medical quality information. Health, 5, 931-937. doi: 10.4236/health.2013.55123.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.