Empowering Global Health with AI: Using NLP to Extract Medicinal Plants and Disease-fighting Compounds from PubMed
- DOI
- 10.2991/978-94-6463-294-1_7
- Keywords
- Natural Language Processing; PubMed; Python; Disease; Artificial Intelligence
- Abstract
PubMed is a free database maintained by the National Library of Medicine (NLM) at the National Institutes of Health (NIH) in the United States. It contains more than 30 million citations and abstracts of biomedical literature, including scientific publications on medicinal plants and phytocompounds from around the world. In this study, Natural Language Processing (NLP) and the Natural Language Toolkit (NLTK) were used to extract information on medicinal plants and disease-fighting compounds from PubMed, with the aim of empowering global health research. The methodology involved a Python-based NLP pipeline with several stages: text pre-processing, named entity recognition (NER), and relationship extraction. Text pre-processing involved cleaning and formatting the abstracts to remove irrelevant information and standardize the text. NER was performed using NLP libraries to identify chemical compounds and disease targets. Relationship extraction used NLTK to identify co-occurring terms and analyze their relationships based on context and proximity. The results indicate that NLP and NLTK can be powerful tools for extracting and analyzing information on medicinal plants and disease-fighting compounds from PubMed. The code developed in this study can be used to automate the extraction of key information from a large number of scientific articles, saving researchers time and effort. The results also showed that this approach can identify relationships between different plants, compounds, and diseases, providing insights that may not be apparent through manual analysis.
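The three stages named in the abstract (pre-processing, entity recognition, co-occurrence-based relationship extraction) can be illustrated with a minimal sketch. This is not the authors' code: the lexicons `PLANTS`, `COMPOUNDS`, and `DISEASES`, the sample abstracts, and all function names are hypothetical. To stay self-contained, it swaps in a regex sentence splitter and dictionary lookup where the study used NLTK and NER libraries, and it treats "proximity" as appearing in the same sentence, which is an assumption.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical mini-lexicons for illustration only; a real pipeline would
# rely on a trained NER model or curated vocabularies instead.
PLANTS = {"curcuma longa", "azadirachta indica"}
COMPOUNDS = {"curcumin", "azadirachtin"}
DISEASES = {"diabetes", "malaria", "inflammation"}


def preprocess(abstract):
    """Remove markup remnants, collapse whitespace, and lowercase the text."""
    text = re.sub(r"<[^>]+>", " ", abstract)  # drop HTML/XML tags
    text = re.sub(r"\s+", " ", text)          # normalize whitespace
    return text.strip().lower()


def split_sentences(text):
    """Naive sentence splitter (a stand-in for NLTK's sentence tokenizer)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def find_entities(sentence, lexicon):
    """Dictionary-based entity lookup (a stand-in for a real NER step)."""
    return {
        term for term in lexicon
        if re.search(r"\b" + re.escape(term) + r"\b", sentence)
    }


def cooccurrences(abstracts):
    """Count entity pairs that appear together in the same sentence."""
    counts = Counter()
    for abstract in abstracts:
        for sentence in split_sentences(preprocess(abstract)):
            plants = find_entities(sentence, PLANTS)
            compounds = find_entities(sentence, COMPOUNDS)
            diseases = find_entities(sentence, DISEASES)
            for plant, compound in product(plants, compounds):
                counts[("plant-compound", plant, compound)] += 1
            for compound, disease in product(compounds, diseases):
                counts[("compound-disease", compound, disease)] += 1
    return counts


# Made-up abstracts, used only to exercise the sketch.
sample_abstracts = [
    "<p>Curcumin from Curcuma longa reduced inflammation in mice. "
    "It was well tolerated.</p>",
    "Azadirachtin, isolated from Azadirachta indica, showed activity "
    "against malaria.",
    "Curcumin improved glycemic control in diabetes. "
    "Curcumin also reduced inflammation.",
]

counts = cooccurrences(sample_abstracts)
for (relation, left, right), n in counts.most_common():
    print(f"{relation}: {left} <-> {right} ({n})")
```

Sentence-level co-occurrence is only a coarse signal: a pair counted here may be mentioned together without any causal or therapeutic link, so counts like these are better read as candidates for manual review than as confirmed relationships.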
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Rehan Khan
AU  - Preenon Bagchi
AU  - Krutanjali Patil
PY  - 2023
DA  - 2023/11/17
TI  - Empowering Global Health with AI: Using NLP to Extract Medicinal Plants and Disease-fighting Compounds from PubMed
BT  - Proceedings of the International Conference on Advances in Nano-Neuro-Bio-Quantum (ICAN 2023)
PB  - Atlantis Press
SP  - 72
EP  - 86
SN  - 2468-5739
UR  - https://doi.org/10.2991/978-94-6463-294-1_7
DO  - 10.2991/978-94-6463-294-1_7
ID  - Khan2023
ER  -