Maximal Sequence Mining Approach For Topic Detection From Microblog Streams
Abstract
The unprecedented expansion of user-generated content in recent years demands more effort in information filtering to extract high-quality information from the huge amount of available data. In particular, topic detection from microblog streams is the first step toward monitoring and summarizing social data. This task is challenging because microblog content is short and noisy. Moreover, the underlying models need to handle heterogeneous streams that contain multiple stories evolving simultaneously. In this work, we introduce a frequent pattern mining approach for topic detection from a microblog stream. The approach first uses a Maximal Sequence Mining (MSM) algorithm to extract pattern sequences, each an ordered set of terms. This scheme can capture more semantic information than unordered sets of the same terms. A pattern graph, which is a directed-graph representation of the mined sequences, can then be constructed. Subsequently, a community detection algorithm is applied to the pattern graph to group the mined patterns into different topic clusters. Experiments on Twitter datasets demonstrate that the MSM approach achieves high performance in comparison with state-of-the-art methods.
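The three stages named in the abstract (mining maximal term sequences, building a directed pattern graph, grouping patterns into topics) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes contiguous term n-grams as a stand-in for the MSM algorithm, and it assumes connected components of the pattern graph as a stand-in for the community detection step. All function names, the `min_support` and `max_len` parameters, and the sample posts are hypothetical.

```python
from collections import Counter, defaultdict


def frequent_sequences(posts, min_support=2, max_len=4):
    """Return contiguous term sequences found in at least min_support posts.

    Assumption: contiguous n-grams approximate the mined sequences.
    """
    counts = Counter()
    for tokens in posts:
        seen = set()
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        counts.update(seen)  # each post contributes at most once per sequence
    return {seq for seq, c in counts.items() if c >= min_support}


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def maximal_sequences(seqs):
    """Keep only sequences not contained in another frequent sequence."""
    return {s for s in seqs if not any(s != t and contains(t, s) for t in seqs)}


def pattern_graph(seqs):
    """Directed graph: an edge a -> b for each pair of consecutive terms."""
    edges = defaultdict(set)
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            edges[a].add(b)
    return dict(edges)


def topic_clusters(seqs):
    """Group patterns whose terms are linked in the graph.

    Assumption: connected components (via union-find) replace a real
    community detection algorithm.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for seq in sorted(seqs):
        groups[find(seq[0])].append(seq)
    return sorted(groups.values())


# Hypothetical toy stream with two stories evolving at the same time.
posts = [
    "earthquake hits city center".split(),
    "breaking earthquake hits city".split(),
    "team wins final match".split(),
    "our team wins final".split(),
]
patterns = maximal_sequences(frequent_sequences(posts))
graph = pattern_graph(patterns)
for cluster in topic_clusters(patterns):
    print(cluster)
```

Keeping term order in the patterns is what lets the graph be directed, which is the property the abstract credits with capturing more semantic information than unordered term sets. On a real stream, the connected-components step would likely merge unrelated stories that share a common term, which is one reason a proper community detection algorithm is used instead.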
Publication Date
2-9-2017
Publication Title
2016 IEEE Symposium Series on Computational Intelligence, SSCI 2016
Document Type
Article; Proceedings Paper
Personal Identifier
scopus
DOI Link
https://doi.org/10.1109/SSCI.2016.7849940
Copyright Status
Unknown
Scopus ID
85016064403 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85016064403
STARS Citation
Jafariakinabad, Fereshteh and Hua, Kien A., "Maximal Sequence Mining Approach For Topic Detection From Microblog Streams" (2017). Scopus Export 2015-2019. 6694.
https://stars.library.ucf.edu/scopus2015/6694