Title

Exploiting Topical Perceptions Over Multi-Lingual Text For Hashtag Suggestion On Twitter

Abstract

Microblogging websites, such as Twitter, provide a seemingly endless amount of textual information on a wide variety of topics generated by a large number of users. Microblog posts, or tweets in Twitter, are often written informally and mix multiple languages. Ignoring informal styles or multiple languages can hamper the usefulness of microblog mining applications. In this paper, we present a statistical method for processing tweets according to users' perceptions of topics and hashtags. Based on the non-classical notion of relatedness of vocabulary terms to topics in a corpus, which is quantified by discriminative term weights, our method builds a ranked list of terms related to each hashtag. Subsequently, given a new tweet, our method can suggest a ranked list of hashtags. Our method allows enhanced understanding and normalization of users' perceptions for improved information retrieval applications. We evaluate our method on a dataset of 14 million tweets collected over a period of 52 days. Results demonstrate that the method learns useful relationships between vocabulary terms and topics, and that it performs better than a Naive Bayes suggestion system. Copyright © 2013, Association for the Advancement of Artificial Intelligence. All rights reserved.
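
The pipeline the abstract describes (weight terms per hashtag, rank them, then score a new tweet) can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the smoothed log relative risk used here is an assumption standing in for the paper's discriminative term weights, and the names `tokenize`, `train`, and `suggest` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; real tweets would need more careful handling.
    return text.lower().split()


def train(tweets, smoothing=1.0):
    """Learn per-hashtag term weights.

    tweets: list of (text, set_of_hashtags) pairs.
    Returns {hashtag: {term: weight}}. The weight is an ASSUMED measure:
    log of P(term | hashtag) / P(term | other tweets), with add-one smoothing.
    """
    n_total = len(tweets)
    tag_docs = Counter()                  # number of tweets carrying each hashtag
    term_docs = Counter()                 # number of tweets containing each term
    tag_term_docs = defaultdict(Counter)  # tweets containing term, per hashtag
    for text, tags in tweets:
        terms = set(tokenize(text))
        term_docs.update(terms)
        for tag in tags:
            tag_docs[tag] += 1
            tag_term_docs[tag].update(terms)

    weights = {}
    for tag, counts in tag_term_docs.items():
        n_in = tag_docs[tag]
        n_out = n_total - n_in
        tag_weights = {}
        for term, c_in in counts.items():
            c_out = term_docs[term] - c_in
            p_in = (c_in + smoothing) / (n_in + 2 * smoothing)
            p_out = (c_out + smoothing) / (n_out + 2 * smoothing)
            ratio = p_in / p_out
            if ratio > 1.0:  # keep only terms that favor this hashtag
                tag_weights[term] = math.log(ratio)
        weights[tag] = tag_weights
    return weights


def suggest(weights, text, k=3):
    """Return up to k hashtags ranked by the summed weights of the tweet's terms."""
    terms = set(tokenize(text))
    scores = {
        tag: sum(tag_weights.get(term, 0.0) for term in terms)
        for tag, tag_weights in weights.items()
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, score in ranked if score > 0][:k]
```

Sorting `weights[tag]` by value gives a ranked term list for that hashtag, and `suggest` gives a ranked hashtag list for an unseen tweet. Because the weights are computed from term co-occurrence alone, a sketch like this makes no assumption about the language a term is written in.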

Publication Date

12-13-2013

Publication Title

FLAIRS 2013 - Proceedings of the 26th International Florida Artificial Intelligence Research Society Conference

Number of Pages

474-479

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

Scopus ID

84889772769 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/84889772769
