
MINING TOP-K FREQUENT SEQUENTIAL PATTERN IN ITEM INTERVAL EXTENDED SEQUENCE DATABASE

Duong Huy Tran, Thang Truong Nguyen, Thi Duc Vu, Anh The Tran

Abstract


Frequent sequential pattern mining in item interval extended sequence databases (iSDBs) has been an active research task in recent years. Unlike classic frequent sequential pattern mining, pattern mining in an iSDB also considers the item interval between successive items; thus, it may extract sequential patterns that are more meaningful in real-life applications. Most previous algorithms for frequent sequential pattern mining in iSDBs need a minimum support threshold (minsup) to perform the mining. However, it is not easy for users to provide an appropriate threshold in practice. Too high a minsup value leads to missing valuable patterns, while too low a value may generate too many useless patterns. To address this problem, we propose an algorithm, TopKWFP (Top-k weighted frequent sequential pattern mining in item interval extended sequence database). Our algorithm does not require the user to provide a fixed minsup value; instead, the minsup value is raised dynamically during the mining process.
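
The idea of raising minsup dynamically can be illustrated with a minimal sketch. This is not the TopKWFP algorithm itself: it ignores item intervals and weights, treats every sequence as a plain list of single items, and uses a simple prefix-growth search. The function name `top_k_sequential_patterns` and the sample data are hypothetical and chosen only for illustration. The sketch keeps the k best patterns found so far in a min-heap; once the heap is full, the smallest support in it becomes the working minsup, and any candidate below that value is pruned together with all of its extensions.

```python
import heapq
from itertools import count


def top_k_sequential_patterns(database, k):
    """Return up to k most frequent sequential patterns as (support, pattern).

    Illustrative assumptions (not from the paper): each sequence is a list
    of single items, support is the number of sequences containing the
    pattern as a subsequence, and intervals and weights are ignored.
    """
    heap = []            # min-heap of (support, tiebreak, pattern)
    minsup = 1           # working threshold, raised as the heap fills
    tiebreak = count()   # keeps heap entries comparable on equal support

    def grow(prefix, projected):
        nonlocal minsup
        # projected holds (sequence index, position after the prefix match).
        occurrences = {}
        for sid, start in projected:
            seen = set()
            seq = database[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:  # first occurrence per sequence only
                    seen.add(item)
                    occurrences.setdefault(item, []).append((sid, pos + 1))

        # Visit the most frequent extensions first so minsup rises quickly.
        ranked = sorted(occurrences.items(), key=lambda kv: -len(kv[1]))
        for item, proj in ranked:
            support = len(proj)
            if support < minsup:
                # Extensions can only have lower or equal support: prune.
                continue
            pattern = prefix + (item,)
            heapq.heappush(heap, (support, next(tiebreak), pattern))
            if len(heap) > k:
                heapq.heappop(heap)
            if len(heap) == k:
                minsup = max(minsup, heap[0][0])  # dynamic threshold raise
            grow(pattern, proj)

    grow((), [(sid, 0) for sid in range(len(database))])
    return sorted(((s, p) for s, _, p in heap), key=lambda sp: (-sp[0], sp[1]))


if __name__ == "__main__":
    sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    for support, pattern in top_k_sequential_patterns(sample, 2):
        print(support, pattern)
```

The user supplies only k; the threshold starts at 1 and is tightened by the search itself. When several patterns tie at the k-th support value, which of them is kept depends on the order of exploration.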

Keywords


sequential pattern; time; item interval; top-K; weighted


DOI: https://doi.org/10.15625/1813-9663/34/3/13053

Journal of Computer Science and Cybernetics ISSN: 1813-9663

Published by Vietnam Academy of Science and Technology