Title

NELasso: Group-Sparse Modeling for Characterizing Relations Among Named Entities in News Articles

Keywords

LASSO; Named entities; News understanding; Semantic network construction; Sparse group learning

Abstract

Named entities such as people, locations, and organizations play a vital role in characterizing online content. They often reflect information of interest and are frequently used in search queries. Although named entities can be detected reliably from textual content, extracting relations among them is more challenging, yet useful in various applications (e.g., news recommender systems). In this paper, we present a novel model and system for learning semantic relations among named entities from collections of news articles. We model each named entity occurrence with sparse structured logistic regression, and consider the words (predictors) to be grouped based on background semantics. This sparse group LASSO approach forces the weights of word groups that do not influence the prediction towards zero. The resulting sparse structure is used to define the type and strength of relations. Our unsupervised system yields a network of named entities in which each relation is typed, quantified, and characterized in context. These relations are key to understanding news material over time and customizing newsfeeds for readers. Extensive evaluation of our system on articles from TIME magazine and BBC News shows that the learned relations correlate with static semantic relatedness measures such as WLM and capture the evolving relationships among named entities over time.

Publication Date

10-1-2017

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Volume

39

Issue

10

Pages

2000-2014

Document Type

Article

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/TPAMI.2016.2632117

Scopus ID

85029938662 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85029938662

