A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream
- DOI
- 10.2991/fmsmt-17.2017.260
- Keywords
- frequent pattern mining, semi-structured data stream, schema feature.
- Abstract
Data mining extracts useful information from massive data sets, and frequent pattern mining is one of its important tasks. Research on frequent pattern mining for semi-structured data has made progress in recent years, and mining over data streams has also received considerable attention. However, only a few studies address semi-structured data and data streams together. This paper proposes an algorithm named SPrefixTreeISpan. The algorithm first segments the semi-structured data stream and then uses the pattern-growth method to mine each segment. Finally, all results are maintained in a structure called patternTree. The mining algorithm is also optimized using the inevitable parent-child relationship and the inevitable child-parent relationship extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
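The segment-then-merge flow described in the abstract can be illustrated with a rough sketch. This is not SPrefixTreeISpan itself: it assumes the stream is an iterable of small XML documents, it counts only parent-child tag pairs instead of running pattern-growth over subtrees, and it uses a plain Counter as a stand-in for patternTree. The names segments, parent_child_pairs, and mine_stream are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import islice
import xml.etree.ElementTree as ET


def segments(stream, size):
    """Yield consecutive lists of at most `size` items from an iterable."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def parent_child_pairs(xml_text):
    """Return the set of (parent_tag, child_tag) pairs found in one document."""
    root = ET.fromstring(xml_text)
    return {(parent.tag, child.tag) for parent in root.iter() for child in parent}


def mine_stream(stream, segment_size, min_support):
    """Mine each segment separately, then merge the counts into one summary.

    Assumption: `summary` is a simplified stand-in for the paper's patternTree,
    and tag pairs are a simplified stand-in for frequent tree patterns.
    """
    summary = Counter()
    total_docs = 0
    for batch in segments(stream, segment_size):
        local = Counter()
        for doc in batch:
            # Each pattern is counted at most once per document.
            local.update(parent_child_pairs(doc))
        summary.update(local)  # merge the segment result into the global summary
        total_docs += len(batch)
    threshold = min_support * total_docs
    return {pair: n for pair, n in summary.items() if n >= threshold}


if __name__ == "__main__":
    docs = ["<a><b/><c/></a>", "<a><b/></a>", "<a><b><c/></b></a>"]
    print(mine_stream(docs, segment_size=2, min_support=0.6))
```

Under the same simplification, a schema-derived "inevitable parent-child" rule would mean that a pair such as (a, b) never needs to be counted separately when the schema guarantees every a element has a b child, since its support equals that of a.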
Cite this article
TY - CONF
AU - Weiqi Fu
AU - Husheng Liao
AU - Xueyun Jin
PY - 2017/04
DA - 2017/04
TI - A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream
BT - Proceedings of the 2017 5th International Conference on Frontiers of Manufacturing Science and Measuring Technology (FMSMT 2017)
PB - Atlantis Press
SP - 1329
EP - 1336
SN - 2352-5401
UR - https://doi.org/10.2991/fmsmt-17.2017.260
DO - 10.2991/fmsmt-17.2017.260
ID - Fu2017/04
ER -