Multivariable Statistical Correlation Measure Applied to Association Rules Mining
- DOI
- 10.2991/cisia-15.2015.266
- Keywords
- data mining; association rules; M-correlation; FP-Forest
- Abstract
Correlation is usually defined for real-valued sequences. In data mining, however, values may be of various types: real, nominal, or ordinal. This paper reviews methods for measuring correlation between multivariable data sequences regardless of their type, and proposes a new method for measuring the statistical correlation of multivariable sequences. Because the method relies on the geometric meaning of the dot product to obtain the degree of multivariable correlation, it is called M-correlation. In this paper, M-correlation is used to prune redundant association rules. To improve mining efficiency, a novel algorithm, FT-Miner, is presented that finds all frequent sub-trees in a forest using two new data structures, UFP-Tree and FP-Forest. Experiments show that the algorithm not only eliminates many unusable rules but also performs better than classical algorithms.
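The abstract does not give the M-correlation formula, so the following is only background on the "geometric meaning of the dot product" for the familiar two-variable case, not the authors' method. It is a minimal sketch showing that the Pearson correlation of two real-valued sequences equals the cosine of the angle between their mean-centered vectors. The function names `dot`, `cosine`, and `pearson` are illustrative and do not come from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between u and v: (u . v) / (|u| |v|)."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    x_centered = [a - mean_x for a in x]
    y_centered = [b - mean_y for b in y]
    return cosine(x_centered, y_centered)


if __name__ == "__main__":
    # Two roughly proportional sequences: the value printed is close to 1.
    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.0, 4.1, 5.9, 8.2]
    print(pearson(x, y))
```

A value near 1 or -1 means the centered vectors are nearly parallel or anti-parallel, and a value near 0 means they are nearly orthogonal. How the paper generalizes this picture to more than two sequences, and to nominal or ordinal values, is not stated in the abstract.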
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - J. Hu
AU - H.F Jian
AU - J.H Sun
PY - 2015/06
DA - 2015/06
TI - Multivariable Statistical Correlation Measure Applied to Association Rules Mining
BT - Proceedings of the International Conference on Computer Information Systems and Industrial Applications
PB - Atlantis Press
SP - 983
EP - 985
SN - 2352-538X
UR - https://doi.org/10.2991/cisia-15.2015.266
DO - 10.2991/cisia-15.2015.266
ID - Hu2015/06
ER -