Proceedings of the 2018 Second International Conference of Sensor Network and Computer Engineering (ICSNCE 2018)

Research of Email Classification based on Deep Neural Network

Authors
Wang Yawen, Yu Fan, Wei Yanxi
Corresponding Author
Wang Yawen
Available Online April 2018.
DOI
10.2991/icsnce-18.2018.16
Keywords
Deep Neural Networks; Spam Email; Classification; Naive Bayes; SpamBase Data Set
Abstract

Distinguishing normal email from spam effectively, so as to filter out as much spam as possible, has become an active research topic. Naive Bayes is a frequently used, statistics-based algorithm for email classification. It assumes that the attributes are conditionally independent of each other given the target value. This assumption clearly does not hold in email classification, so the accuracy of email classification based on naive Bayes is low. To address this poor accuracy, scholars have proposed several new email classification algorithms, and the algorithm based on a deep neural network (DNN) is one of them. The DNN considered here is an artificial neural network in which adjacent layers are fully connected. The algorithm extracts email features from the training samples, constructs a DNN with multiple hidden layers, trains the DNN classifier on the training samples, and finally classifies the test emails, marking each one as spam or not spam. To verify the effect of the DNN-based email classification algorithm, in this paper we constructed a DNN with 2 hidden layers of 30 nodes each. For training, we used 2000 batches of 3 training samples each. We used the well-known SpamBase dataset. The experimental results showed that the DNN achieved higher email classification accuracy than naive Bayes when the proportion of the training set was 10%, 20%, 30%, 40% and 50%, and that the DNN showed a good classification effect. With the development of science and technology, spam takes many forms and causes more serious damage, which places higher requirements on the accuracy of spam recognition. Future research will focus on combining various algorithms to further improve the effect of email classification.
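
The configuration described in the abstract (2 fully connected hidden layers of 30 nodes, trained on 2000 batches of 3 samples) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the sigmoid activations, cross-entropy loss, learning rate, weight initialization, random sampling of batches, and all function names (`train_dnn`, `forward`, `backward`, `predict`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function (assumed activation).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_layer(n_in, n_out, rng):
    # One fully connected layer: a weight row per output node, plus biases.
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return {"w": weights, "b": [0.0] * n_out}


def forward(layers, x):
    # Returns the activations of every layer; acts[0] is the input itself.
    acts = [x]
    for layer in layers:
        prev = acts[-1]
        acts.append([
            sigmoid(sum(w * a for w, a in zip(row, prev)) + b)
            for row, b in zip(layer["w"], layer["b"])
        ])
    return acts


def backward(layers, acts, y):
    # Per-layer error terms for one sample (sigmoid output + cross-entropy).
    deltas = [None] * len(layers)
    deltas[-1] = [acts[-1][0] - y]
    for k in range(len(layers) - 1, 0, -1):
        w, d, a = layers[k]["w"], deltas[k], acts[k]
        deltas[k - 1] = [
            sum(w[j][m] * d[j] for j in range(len(d))) * a[m] * (1.0 - a[m])
            for m in range(len(a))
        ]
    return deltas


def train_dnn(samples, labels, hidden=(30, 30), n_batches=2000,
              batch_size=3, lr=0.5, seed=0):
    # Defaults follow the abstract: 2 hidden layers x 30 nodes, 2000 batches of 3.
    # lr and seed are illustrative assumptions.
    rng = random.Random(seed)
    sizes = [len(samples[0]), *hidden, 1]
    layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    for _ in range(n_batches):
        picks = [rng.randrange(len(samples)) for _ in range(batch_size)]
        passes = []
        for i in picks:
            acts = forward(layers, samples[i])
            passes.append((acts, backward(layers, acts, labels[i])))
        # Apply the averaged gradient only after the whole batch was evaluated.
        for acts, deltas in passes:
            for k, layer in enumerate(layers):
                for j, d in enumerate(deltas[k]):
                    step = lr * d / batch_size
                    row = layer["w"][j]
                    for m, a in enumerate(acts[k]):
                        row[m] -= step * a
                    layer["b"][j] -= step
    return layers


def predict(layers, x):
    # 1 = spam, 0 = normal email.
    return 1 if forward(layers, x)[-1][0] >= 0.5 else 0
```

With SpamBase feature vectors in `samples` and 0/1 labels in `labels`, `train_dnn(samples, labels)` would train with the batch count and layer sizes from the abstract, and `predict` would mark each test email. Raw SpamBase features vary widely in scale, so some normalization before training is likely needed for a sigmoid network; that step, like the naive Bayes baseline, is omitted here.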

Copyright
© 2018, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2018 Second International Conference of Sensor Network and Computer Engineering (ICSNCE 2018)
Series
Advances in Computer Science Research
Publication Date
April 2018
ISBN
978-94-6252-498-9
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Wang Yawen
AU  - Yu Fan
AU  - Wei Yanxi
PY  - 2018/04
DA  - 2018/04
TI  - Research of Email Classification based on Deep Neural Network
BT  - Proceedings of the 2018 Second International Conference of Sensor Network and Computer Engineering (ICSNCE 2018)
PB  - Atlantis Press
SP  - 73
EP  - 77
SN  - 2352-538X
UR  - https://doi.org/10.2991/icsnce-18.2018.16
DO  - 10.2991/icsnce-18.2018.16
ID  - Yawen2018/04
ER  -