A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream

Weiqi Fu; Husheng Liao; Xueyun Jin

doi:10.2991/fmsmt-17.2017.260

Authors

Weiqi Fu, Husheng Liao, Xueyun Jin

Corresponding Author

Weiqi Fu

Available Online April 2017.

DOI: 10.2991/fmsmt-17.2017.260
Keywords: frequent pattern mining, semi-structured data stream, schema feature.
Abstract: Data mining finds useful information in massive data, and frequent pattern mining is one of its important tasks. Research on frequent pattern mining for semi-structured data has made progress in recent years, and mining over data streams has also received considerable attention. However, only a few studies address semi-structured data and data streams together. This paper proposes an algorithm named SPrefixTreeISpan. It first segments the semi-structured data stream, then mines each segment with the pattern-growth method, and finally maintains all results in a structure called patternTree. The mining algorithm is further optimized using the inevitable parent-child and inevitable child-parent relationships extracted from the XML schema. Experiments show that SPrefixTreeISpan achieves better performance.
Copyright: © 2017, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title: Proceedings of the 2017 5th International Conference on Frontiers of Manufacturing Science and Measuring Technology (FMSMT 2017)
Series: Advances in Engineering Research
Publication Date: April 2017
ISBN: 978-94-6252-331-9
ISSN: 2352-5401

Cite this article

TY  - CONF
AU  - Weiqi Fu
AU  - Husheng Liao
AU  - Xueyun Jin
PY  - 2017/04
DA  - 2017/04
TI  - A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream
BT  - Proceedings of the 2017 5th International Conference on Frontiers of Manufacturing Science and Measuring Technology (FMSMT 2017)
PB  - Atlantis Press
SP  - 1329
EP  - 1336
SN  - 2352-5401
UR  - https://doi.org/10.2991/fmsmt-17.2017.260
DO  - 10.2991/fmsmt-17.2017.260
ID  - Fu2017/04
ER  -
