Cybernetics and Computer Engineering, 2021, 1(203)
Researcher, the Medical Information Systems Department
KOZAK L.M., DSc (Biology), Senior Researcher,
Leading Researcher, the Medical Information Systems Department
International Research and Training Center for Information Technologies
and Systems of the National Academy of Sciences of Ukraine
and Ministry of Education and Science of Ukraine,
40, Acad. Glushkov av., Kyiv, 03187, Ukraine
INFORMATION TECHNOLOGY FOR CLASSIFICATION OF DONOSOLOGICAL AND PATHOLOGICAL STATES USING THE ENSEMBLE OF DATA MINING METHODS
Introduction. The implementation of digital technologies enables the registration of large amounts of biomedical data (ECG, EEG, electronic medical records) as a basis for assessing and predicting patients' condition. Data Mining methods make it possible to identify the most informative indicators and typological groups, and to classify a person's functional state and the stages of a patient's disease in order to predict their changes.
The purpose of the paper is to develop an information technology for classifying human health states using a set of Data Mining methods and to validate it on two examples: operators' functional state and the severity of patients' disease.
Results. The developed IT combines several stages: (I) data pre-processing; (II) clustering to select homogeneous groups (data segmentation); (III) identification of predictors; (IV) classification of the studied states and development of predictive models using machine learning algorithms (decision trees, support vector machines, neural networks) with cross-validation. The proposed IT was used to classify operators' functional state and the severity of patients' condition during disease progression.
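The four stages can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the data are synthetic, all function names are hypothetical, k-means stands in for the clustering step, a Fisher-type score stands in for predictor identification, and a nearest-centroid classifier stands in for the decision trees, support vector machines, and neural networks named above. For brevity, scaling and feature ranking use all rows; a stricter setup would repeat them inside each cross-validation fold.

```python
import random
import statistics


def standardize(rows):
    """Stage I: z-score every column (zero mean, unit variance)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return [statistics.fmean(c) for c in zip(*rows)]


def kmeans(rows, k, iters=20, seed=0):
    """Stage II: plain k-means; returns one cluster label per row."""
    centers = random.Random(seed).sample(rows, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = centroid(members)
    return labels


def fisher_scores(rows, y):
    """Stage III: per-feature ratio of between-class to within-class spread."""
    scores = []
    for col in zip(*rows):
        a = [v for v, cls in zip(col, y) if cls == 0]
        b = [v for v, cls in zip(col, y) if cls == 1]
        spread = statistics.pvariance(a) + statistics.pvariance(b)
        gap = statistics.fmean(a) - statistics.fmean(b)
        scores.append(gap ** 2 / (spread or 1.0))
    return scores


def cv_accuracy(rows, y, folds=5, seed=0):
    """Stage IV: k-fold cross-validated accuracy, nearest-centroid classifier."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    hits = 0
    for f in range(folds):
        test = set(idx[f::folds])  # every folds-th shuffled index is held out
        cents = {}
        for cls in set(y):
            train = [rows[i] for i in idx if i not in test and y[i] == cls]
            cents[cls] = centroid(train)
        for i in test:
            pred = min(cents, key=lambda cls: dist2(rows[i], cents[cls]))
            hits += pred == y[i]
    return hits / len(rows)


# Synthetic two-class data (an assumption, not the paper's data):
# feature 0 is strongly informative, feature 1 weakly, feature 2 is noise.
rng = random.Random(42)
y = [0] * 60 + [1] * 60
raw = [[rng.gauss(3.0 * c, 1.0), rng.gauss(1.0 * c, 1.0), rng.gauss(0.0, 1.0)]
       for c in y]

X = standardize(raw)                       # Stage I
clusters = kmeans(X, k=2)                  # Stage II
scores = fisher_scores(X, y)               # Stage III
ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
X_top = [[row[j] for j in ranked[:2]] for row in X]
accuracy = cv_accuracy(X_top, y)           # Stage IV
print("feature ranking:", ranked, "CV accuracy:", round(accuracy, 2))
```

In this sketch the cluster labels are only computed, not used downstream; in a workflow like the one described they would typically be inspected to define homogeneous groups before predictors are selected and classifiers are trained within or across those groups.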
Conclusions. Using the IT to assess the success of operators' activity made it possible to identify the most informative HRV indicators, whose changes can predict operators' reliability, taking into account the type of autonomic (vegetative) regulation. Using the IT to assess disease activity in children with dysplasia made it possible to identify diagnostic markers of CCC and to develop diagnostic rules for determining disease stages from ECG parameters (T wave symmetry, an integral indicator of the ST_T segment shape).
Keywords: information technology, Data Mining, machine learning models, severity of the patient.