Issue 1 (203), article 5

DOI:https://doi.org/10.15407/kvt203.01.077

Cybernetics and Computer Engineering, 2021, 1(203)

KRYVOVA O.A.,
Researcher, the Medical Information Systems Department
e-mail: ol.kryvova@gmail.com
ORCID: 0000-0002-4407-5990

KOZAK L.M., DSc (Biology), Senior Researcher,
Leading Researcher, the Medical Information Systems Department
e-mail: lmkozak52@gmail.com
ORCID: 0000-0002-7412-3041

International Research and Training Center for Information Technologies
and Systems of the National Academy of Sciences of Ukraine
and Ministry of Education and Science of Ukraine,
40, Acad. Glushkov av., Kyiv, 03187, Ukraine

INFORMATION TECHNOLOGY FOR CLASSIFICATION OF DONOSOLOGICAL AND PATHOLOGICAL STATES USING THE ENSEMBLE OF DATA MINING METHODS

Introduction. The implementation of digital technologies enables the recording of large amounts of biomedical data (ECG, EEG, electronic medical records), which serve as a basis for assessing and predicting patients' condition. Data Mining methods make it possible to identify the most informative indicators and typological groups, to classify a person's functional state and the stages of a patient's disease, and to predict their changes.

The purpose of the paper is to develop an information technology for classifying human health states using a set of Data Mining methods and to validate it on two examples: the functional state of operators and the disease severity of patients.

Results. The developed IT combines four stages: (I) data pre-processing; (II) clustering to select homogeneous groups (data segmentation); (III) identification of predictors; (IV) classification of the studied states and development of predictive models using machine learning algorithms (decision trees, support vector machines, neural networks) with cross-validation. The proposed IT was used to classify the operators' functional state and the severity of patients' condition as the disease progresses.
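The four stages can be illustrated with a minimal sketch in standard-library Python. It is not the authors' implementation. The data are synthetic, all function names are hypothetical, and several techniques are stand-ins chosen for brevity: k-means for the clustering stage, a class-mean gap as the predictor criterion, and a one-split decision tree (stump) in place of the full decision trees, support vector machines and neural networks named above. Scaling is applied once to the whole data set for simplicity; a stricter setup would fit it on the training folds only.

```python
import random
import statistics

def standardize(rows):
    """Stage I (pre-processing): z-score every feature column."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def kmeans(rows, k=2, iters=20, seed=0):
    """Stage II (segmentation): plain k-means, returns one cluster index per row."""
    centers = random.Random(seed).sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(r, centers[j])))
                  for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [statistics.fmean(c) for c in zip(*members)]
    return labels

def rank_predictors(rows, y):
    """Stage III: order feature indices by the gap between the two class means."""
    gaps = []
    for j, col in enumerate(zip(*rows)):
        m0 = statistics.fmean([v for v, t in zip(col, y) if t == 0])
        m1 = statistics.fmean([v for v, t in zip(col, y) if t == 1])
        gaps.append((abs(m1 - m0), j))
    return [j for _, j in sorted(gaps, reverse=True)]

def fit_stump(rows, y, feature):
    """Stage IV: one-split decision tree; returns (threshold, class predicted above it)."""
    best = (-1.0, 0.0, 0)
    for thr in sorted({r[feature] for r in rows}):
        for above in (0, 1):
            hits = sum((above if r[feature] > thr else 1 - above) == t
                       for r, t in zip(rows, y))
            best = max(best, (hits / len(rows), thr, above))
    return best[1], best[2]

def cross_validate(rows, y, folds=5):
    """Mean held-out accuracy; predictor choice and fitting use the training folds only."""
    accs = []
    for f in range(folds):
        train = [i for i in range(len(rows)) if i % folds != f]
        test = [i for i in range(len(rows)) if i % folds == f]
        tr_rows, tr_y = [rows[i] for i in train], [y[i] for i in train]
        feature = rank_predictors(tr_rows, tr_y)[0]
        thr, above = fit_stump(tr_rows, tr_y, feature)
        hits = sum((above if rows[i][feature] > thr else 1 - above) == y[i] for i in test)
        accs.append(hits / len(test))
    return statistics.fmean(accs)

# Synthetic data (not from the paper): 60 subjects, 3 features,
# and only feature 1 differs between the two classes.
rng = random.Random(1)
y = [i % 2 for i in range(60)]
raw = [[rng.gauss(0, 1), rng.gauss(3 * t, 1), rng.gauss(0, 1)] for t in y]

X = standardize(raw)
clusters = kmeans(X, k=2)
ranking = rank_predictors(X, y)
cv_accuracy = cross_validate(X, y)
print("clusters found:", sorted(set(clusters)))
print("predictor ranking:", ranking)
print("5-fold CV accuracy:", round(cv_accuracy, 2))
```

Selecting the predictor inside each fold, rather than once on all data, keeps the held-out accuracy from being inflated by information from the test subjects.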

Conclusions. Applying the IT to assess the success of operators' activity made it possible to identify the most informative HRV indicators, whose changes can predict operators' reliability, taking into account the type of autonomic (vegetative) regulation. Applying the IT to assess disease activity in children with dysplasia made it possible to identify diagnostic markers of CCC and to develop diagnostic rules for determining the stages of the disease from ECG parameters (T wave symmetry and an integral indicator of the ST_T segment shape).
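For readers unfamiliar with HRV indicators, the sketch below computes two widely used time-domain measures, SDNN and RMSSD, from a series of RR intervals. The abstract does not state which HRV indicators the authors found most informative, so these two are only illustrative assumptions, and the RR values are made up.

```python
import math
import statistics

def sdnn(rr_ms):
    """SDNN: sample standard deviation of the RR intervals, in milliseconds."""
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    """RMSSD: root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(statistics.fmean([d * d for d in diffs]))

def mean_heart_rate(rr_ms):
    """Mean heart rate in beats per minute, derived from the mean RR interval."""
    return 60000.0 / statistics.fmean(rr_ms)

# Hypothetical RR intervals in milliseconds (not data from the paper).
rr = [800, 810, 790, 805, 795]
print("SDNN, ms:", round(sdnn(rr), 2))
print("RMSSD, ms:", round(rmssd(rr), 2))
print("Mean HR, bpm:", round(mean_heart_rate(rr), 1))
```

Indicators of this kind would form the feature columns passed to the clustering and classification stages.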

Keywords: information technology, Data Mining, machine learning models, severity of the patient.


Received 31.11.2020