HOW DOES THE SIZE OF A DOCUMENT AFFECT LINKED OPEN DATA USER MODELING STRATEGIES?

R. MANRIQUE, O. MARIÑO

INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, 2017

Abstract

Semantic user modeling techniques use a representation based on concepts that are linked to a Knowledge Base (KB). Current research uses Linked Open Data (LOD) because of its comprehensive interlinked datasets, which allow excellent cross-domain modeling capabilities. LOD semantic user profiles have been employed in the context of Social Networks, which require users’ posts as input. Less attention has been paid to other domains in which input documents differ from short and concise posts. In this paper, we perform a comparative study of different LOD semantic user modeling techniques by taking different types of documents as input: short, medium, and long texts. As the evaluation scenario, we selected the recommendation of academic documents based on a model of the user’s research interests. Academic documents’ titles, abstracts, and body text were used, respectively, as short, medium, and long documents. Our results showed that expansion strategies work best for short and medium documents, while filtering strategies are more appropriate when the whole document is used as input. Finally, we explored alternatives for documents that do not include a summary or abstract, and we concluded that, in this case, the two best alternatives are a filtering strategy over the whole text and the use of the TextRank algorithm to build a set of key sentences that serves as input to an expansion strategy.
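
To make the last alternative concrete, the following is a minimal sketch of TextRank-style key sentence extraction. It is not the authors' implementation: the function names, the naive sentence splitter, the use of unique-word sets, and the parameter values (`top_k`, `damping`, `iterations`) are all assumptions chosen for illustration. Sentences are nodes in a graph, edges are weighted by word overlap, and a PageRank-style iteration ranks the sentences.

```python
import math
import re


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (an assumption, not from the paper)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased set of unique alphanumeric tokens (a simplification)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Word overlap normalized by the log of each sentence's size."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def key_sentences(text, top_k=2, damping=0.85, iterations=50):
    """Return the top_k highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Symmetric weighted graph: weights[i][j] is the similarity of i and j.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # Weighted PageRank-style iteration with a fixed number of steps.
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(top)]
```

In a pipeline like the one the abstract describes, the sentences returned by `key_sentences` would stand in for a missing abstract and be passed on to concept annotation and an expansion strategy. Those later steps are not shown here.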