Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)

Information Extraction from Web as Knowledge Resources for Indonesian Question Answering System

Authors
Abdiansah ABDIANSAH, Alvi Syahrini UTAMI
Corresponding Author
Abdiansah ABDIANSAH
Available Online 6 May 2020.
DOI
10.2991/aisr.k.200424.064
Keywords
information extraction, Indonesian Question Answering System
Abstract

Research on Open Domain Question Answering Systems (OD-QAS) generally involves external knowledge that is dynamic and requires high-level representation. Strong external knowledge is one of the keys to the success of a QAS, so intensive research is needed in this area. The Web is a large source of information that a QAS can use as external knowledge. The main problem, however, is that the Web contains a lot of unstructured data, so a model is needed to extract information from it. The model developed in this research is based on a pipeline architecture and consists of three main processes: pre-processing, information extraction processing, and text processing. The input to the model is factoid questions, and the output is snippets, or sets of sentences, that contain the target answers. Three search engines assist in finding relevant information on the Web: Yahoo!, Bing, and Ask. The average precision and deviation values differ only slightly across the search engines. Yahoo! generated the highest total number of true-positive snippets (65), while Bing obtained the best average precision (25.33%).
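
The evaluation described in the abstract (true-positive snippet totals, average precision, and deviation per search engine) can be illustrated with a minimal sketch. It assumes that precision is the fraction of retrieved snippets containing the target answer and that average precision is the mean of per-question precision; the abstract does not spell out either definition. The engine names and counts below are hypothetical toy values, not the paper's data.

```python
from statistics import mean, pstdev


def precision(true_positives: int, retrieved: int) -> float:
    """Fraction of retrieved snippets that contain the target answer."""
    if retrieved == 0:
        return 0.0  # no snippets retrieved: treat precision as zero
    return true_positives / retrieved


def summarize(per_question):
    """Summarize one engine from (true_positives, retrieved) pairs, one per question."""
    scores = [precision(tp, n) for tp, n in per_question]
    return {
        "total_true_positives": sum(tp for tp, _ in per_question),
        "average_precision": mean(scores),   # assumed: mean over questions
        "deviation": pstdev(scores),         # assumed: population std. deviation
    }


# Hypothetical counts for illustration only (not results from the paper).
toy_counts = {
    "engine_a": [(2, 10), (4, 10), (0, 10)],
    "engine_b": [(3, 10), (3, 10), (3, 10)],
}

report = {name: summarize(counts) for name, counts in toy_counts.items()}

for name, stats in report.items():
    print(name, stats)
```

Under these assumptions, an engine can lead on total true positives without leading on average precision, which is consistent with the split between Yahoo! and Bing reported above.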

Copyright
© 2020, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
Series
Advances in Intelligent Systems Research
Publication Date
6 May 2020
ISBN
978-94-6252-963-2
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Abdiansah ABDIANSAH
AU  - Alvi Syahrini UTAMI
PY  - 2020
DA  - 2020/05/06
TI  - Information Extraction from Web as Knowledge Resources for Indonesian Question Answering System
BT  - Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
PB  - Atlantis Press
SP  - 419
EP  - 425
SN  - 1951-6851
UR  - https://doi.org/10.2991/aisr.k.200424.064
DO  - 10.2991/aisr.k.200424.064
ID  - ABDIANSAH2020
ER  -