Detection Model for URL Phishing with Comparison Between Shallow Machine Learning and Deep Learning Models

Nizam Aditya Zuhayr; Girinoto; Nurul Qomariasih; Hermawan Setiawan

doi:10.2991/978-94-6463-174-6_13

Authors

Nizam Aditya Zuhayr¹,*, Girinoto², Nurul Qomariasih², Hermawan Setiawan²

¹The Center for Research and Development of Cyber and Crypto Security Technology, National Cyber and Crypto Agency Jakarta, Jakarta, Indonesia

²Crypto Software Engineering, State Cyber and Crypto Polytechnic Bogor, Bogor, Indonesia

*Corresponding author. Email: nizam.aditya.1@gmail.com

Available Online 22 May 2023.

DOI: 10.2991/978-94-6463-174-6_13
Keywords: phishing; machine learning; deep learning; classification model; flask
Abstract: According to the phishing activity trends report released by the Anti-Phishing Working Group (APWG), global phishing cases continued to increase from 2021 through the first quarter of 2022. This study compares shallow machine learning algorithms that have been used by governments with deep learning algorithms for classifying phishing URLs. A dataset of 30,047 URLs, consisting of 15,022 phishing URLs and 15,025 legitimate URLs, was split into training data and test data. Phishing URL modeling uses the deep learning algorithms LSTM and GRU together with the best shallow machine learning algorithms from the research conducted by Rao et al., namely Random Forest (RF), Logistic Regression (LR), and Decision Tree (DT). Models are built on URL characteristics, on text structure, and on a combination of URL characteristics with text structure. Based on URL characteristics, the most accurate shallow machine learning model is Random Forest at 97.4%, while the most accurate deep learning model is LSTM at 96.7%. Based on text structure, the best deep learning algorithm is GRU at 97.8%. The combination model, which uses both deep learning algorithms (LSTM and GRU), achieves an accuracy of 98.1%. The combination model, as the best model, is then implemented as a website using the Flask framework, with the classification result reported as a probability score that the submitted URL is a phishing URL.
Copyright: © 2023 The Author(s)
Open Access: Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
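
The abstract distinguishes two kinds of model input: URL characteristics and text structure. The following is a minimal sketch of what each input might look like, using only the Python standard library. The feature names, the character vocabulary, and the `max_len` default are illustrative assumptions, not the feature set or preprocessing used by the authors.

```python
import string
from urllib.parse import urlsplit

# Hypothetical character vocabulary; id 0 is reserved for padding and unknown characters.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}


def url_characteristics(url: str) -> dict:
    """Illustrative hand-crafted lexical features (assumed, not the paper's feature set)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(c.isdigit() for c in url),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
        # Rough check for a dotted-quad host such as 192.168.0.1
        "host_is_ipv4": host.count(".") == 3 and host.replace(".", "").isdigit(),
    }


def encode_chars(url: str, max_len: int = 200) -> list:
    """Map a URL to a fixed-length list of character ids, zero-padded on the right."""
    ids = [VOCAB.get(ch, 0) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))
```

A feature dictionary of this kind could feed a shallow classifier such as Random Forest, while the padded id sequence is the usual input shape for a recurrent model such as LSTM or GRU behind an embedding layer.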

Volume Title: Proceedings of the 1st International Conference on Neural Networks and Machine Learning 2022 (ICONNSMAL 2022)
Series: Advances in Intelligent Systems Research
Publication Date: 22 May 2023
ISBN: 978-94-6463-174-6
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Nizam Aditya Zuhayr
AU  - Girinoto
AU  - Nurul Qomariasih
AU  - Hermawan Setiawan
PY  - 2023
DA  - 2023/05/22
TI  - Detection Model for URL Phishing with Comparison Between Shallow Machine Learning and Deep Learning Models
BT  - Proceedings of the 1st International Conference on Neural Networks and Machine Learning 2022 (ICONNSMAL 2022)
PB  - Atlantis Press
SP  - 146
EP  - 156
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-174-6_13
DO  - 10.2991/978-94-6463-174-6_13
ID  - Zuhayr2023
ER  -