Unstructured Text and Tabular Information Processing in the Clinical Decision Making System for the Respiratory Diseases Diagnosis
- DOI
- 10.2991/aisr.k.201029.061
- Keywords
- clinical decision support system, data processing, processing of text information, text mining, machine learning
- Abstract
This article describes the basic algorithm for extracting text and tabular data in a clinical decision support system (CDSS) developed for diagnosing respiratory diseases. It covers how the data structure for an individual patient is formed, how the records of all patients are combined into a dataset for machine learning, and how the ML models that detect disease in patients are built. Data extraction and dataset generation are implemented in Python with additional libraries: “docx” and “pandas” for data processing, and “sklearn”, “lightgbm” and “catboost” for building machine learning models. The task is relevant because the CDSS receives large volumes of unstructured data as input and needs them to function effectively. The novelty of the work lies in combining a set of existing algorithms with newly developed ones for the extraction and primary processing of text and tabular information.
- Copyright
- © 2020, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
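The extraction steps named in the abstract (reading text and tables from a document, forming one record per patient, and combining records into a dataset) can be illustrated with a minimal sketch. It is not the authors' algorithm: the "Field: value" line layout, the two-column table with a header row, and the names `read_docx`, `build_patient_record`, `build_dataset` and `to_number` are all assumptions made for illustration. Only standard python-docx and pandas calls are used.

```python
import re
from typing import Dict, List, Tuple

import pandas as pd

# Assumption: free-text lines in the record look like "Field: value".
KEY_VALUE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def read_docx(path: str) -> Tuple[List[str], List[List[List[str]]]]:
    """Return paragraph texts and tables (each table is a list of rows of cell texts)."""
    # python-docx is imported lazily so the parsing helpers work without it.
    from docx import Document

    doc = Document(path)
    paragraphs = [p.text for p in doc.paragraphs]
    tables = [[[c.text for c in row.cells] for row in t.rows] for t in doc.tables]
    return paragraphs, tables


def to_number(value: str):
    """Convert '1,8' or '1.8' to float; return other strings unchanged."""
    try:
        return float(value.replace(",", "."))
    except ValueError:
        return value


def build_patient_record(paragraphs, tables) -> Dict[str, object]:
    """Merge text fields and table fields into one flat record for a patient."""
    record: Dict[str, object] = {}
    for line in paragraphs:
        match = KEY_VALUE.match(line)
        if match:
            record[match.group(1)] = to_number(match.group(2))
    for table in tables:
        # Assumption: the first row is a header, then (indicator, value) rows.
        for row in table[1:]:
            if len(row) >= 2 and row[0].strip():
                record[row[0].strip()] = to_number(row[1].strip())
    return record


def build_dataset(records) -> pd.DataFrame:
    """Stack per-patient records; fields missing for a patient become NaN."""
    return pd.DataFrame(list(records))
```

A dataset built this way would typically be passed to `build_dataset([build_patient_record(*read_docx(p)) for p in paths])`, where `paths` is a hypothetical list of document files, before any model in “sklearn”, “lightgbm” or “catboost” is trained on it.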
Cite this article
TY - CONF
AU - G. R. Shakhmametova
AU - A.A. Evgrafov
AU - R.Kh. Zulkarneev
PY - 2020
DA - 2020/11/10
TI - Unstructured Text and Tabular Information Processing in the Clinical Decision Making System for the Respiratory Diseases Diagnosis
BT - Proceedings of the 8th Scientific Conference on Information Technologies for Intelligent Decision Making Support (ITIDS 2020)
PB - Atlantis Press
SP - 323
EP - 327
SN - 1951-6851
UR - https://doi.org/10.2991/aisr.k.201029.061
DO - 10.2991/aisr.k.201029.061
ID - Shakhmametova2020
ER -