Distributed News Crawler Using Fog Cloud Approach
- DOI
- 10.2991/978-94-6463-100-5_26
- Keywords
- Web crawler; News; Distributed web crawling; Fog cloud
- Abstract
Technology has advanced quickly during the Industrial Revolution 4.0, causing the Internet to develop rapidly and grow larger. Constantly changing website technology therefore poses a major challenge for using the large and complex data available on the global Internet. Stand-alone web crawlers have traditionally struggled to keep up with this rapid growth of information, which makes it challenging to extract a lot of data in a short period. This research uses distributed technology to build a more effective distributed web crawling system for searching news. Crawler systems can work efficiently with multiple threads working together, and each node can work efficiently with multithreading. This study applies a new fog cloud approach to web crawling that is considered more efficient in navigating URLs: it arranges URLs according to the domain used and divides URL limitations into various priority URL queues, so that URLs can be dispersed across concurrent crawler operations to get rid of the new building. In particular, the proposed model can utilize resources optimally in the cloud-fog layer by deploying distributed crawlers in the cloud-fog infrastructure to detect news. With the fog cloud, analysis is dynamically distributed across the fog and cloud layers, enabling real-time distribution. The research phases of the distributed news crawler run from URL collection through URL filtering, scheduling, and accessing URLs to extracting news data. This research focuses on developing web crawlers for distributed news crawling.
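The abstract's idea of grouping URLs by domain, placing them in priority queues, and spreading those queues over concurrent crawler workers can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`assign_node`, `build_queues`, `crawl`), the CRC32-based domain hashing, and the caller-supplied `priority_of` and `fetch` functions are all assumptions made for illustration, and the mapping of workers onto fog and cloud layers is not modeled.

```python
import heapq
import zlib
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lower-cased host name of a URL ('' if it has none)."""
    return (urlsplit(url).hostname or "").lower()


def assign_node(url, num_nodes):
    """Map a URL to a worker index so that one domain always goes to one worker.

    CRC32 is used only because it is stable across runs (an assumption of
    this sketch, not something stated in the abstract).
    """
    return zlib.crc32(domain_of(url).encode("utf-8")) % num_nodes


def build_queues(urls, num_nodes, priority_of):
    """Filter duplicate URLs and push the rest onto per-worker priority queues."""
    queues = [[] for _ in range(num_nodes)]
    seen = set()
    for order, url in enumerate(urls):
        if url in seen:  # URL filtering: skip exact duplicates
            continue
        seen.add(url)
        # Lower priority value is served first; 'order' breaks ties stably.
        entry = (priority_of(url), order, url)
        heapq.heappush(queues[assign_node(url, num_nodes)], entry)
    return queues


def drain(queue, fetch):
    """Process one worker's queue in priority order."""
    results = []
    while queue:
        _, _, url = heapq.heappop(queue)
        results.append(fetch(url))
    return results


def crawl(urls, num_nodes, priority_of, fetch):
    """Run one thread per worker queue and return the results per worker."""
    queues = build_queues(urls, num_nodes, priority_of)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        return list(pool.map(lambda q: drain(q, fetch), queues))
```

In this sketch `fetch` stands in for the "accessing URLs" and "extracting news data" phases; a real crawler would download and parse the page there. Keeping each domain on a single worker is one simple way to avoid two workers hitting the same site at once.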
- Copyright
- © 2022 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - I. Gusti Lanang Putra Eka Prismana
PY - 2022
DA - 2022/12/27
TI - Distributed News Crawler Using Fog Cloud Approach
BT - Proceedings of the International Joint Conference on Science and Engineering 2022 (IJCSE 2022)
PB - Atlantis Press
SP - 251
EP - 260
SN - 2352-5401
UR - https://doi.org/10.2991/978-94-6463-100-5_26
DO - 10.2991/978-94-6463-100-5_26
ID - Prismana2022
ER -