COMPARING GRAPH SIMILARITY MEASURES FOR SEMANTIC REPRESENTATIONS OF DOCUMENTS

R. MANRIQUE, F. CUETO-RAMIREZ, O. MARIÑO

COLOMBIAN CONFERENCE ON COMPUTING, 2018

Abstract

Semantic representations of documents built from open Knowledge Graphs (KGs) have proven beneficial in tasks such as recommendation, user profiling, and document retrieval. Broadly speaking, a semantic representation of a document can be defined as a graph whose nodes represent concepts and whose edges represent the semantic relationships between them. Fine-grained information about the concepts found in the KGs (e.g., DBpedia, YAGO, BabelNet) can be exploited to enrich and refine the representation. Although this kind of semantic representation is a graph, most applications that compare semantic representations reduce the graph to a “flattened” concept-weight representation and apply well-known vector similarity measures. Consequently, relevant information encoded in the graph structure is not exploited. In this paper, several graph-based similarity measures are adapted to semantic representation graphs, implemented, and evaluated. Experiments performed on two datasets show that the graph similarity measures yield better results than the vector similarity measures. This paper presents the conceptual background, the adapted measures, and their evaluation, and ends with some conclusions on the trade-off between precision and computational complexity.
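The loss of structural information described above can be illustrated with a minimal sketch. It is not one of the measures adapted in the paper: cosine similarity stands in for the vector measures, and a simple Jaccard overlap of edges stands in for a graph-based measure. The concept names, relation labels, weights, and function names are all hypothetical and chosen only for illustration.

```python
import math


def cosine(w1, w2):
    """Cosine similarity between two flattened concept -> weight dicts."""
    dot = sum(w1[c] * w2.get(c, 0.0) for c in w1)
    n1 = math.sqrt(sum(v * v for v in w1.values()))
    n2 = math.sqrt(sum(v * v for v in w2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def edge_jaccard(e1, e2):
    """Jaccard overlap between two sets of (source, relation, target) edges.

    Used here only as a stand-in for a structure-aware measure.
    """
    if not e1 and not e2:
        return 1.0
    return len(e1 & e2) / len(e1 | e2)


# Two hypothetical documents that mention the same concepts with equal weights...
weights_a = {"Graph": 1.0, "Ontology": 1.0, "Document": 1.0}
weights_b = dict(weights_a)

# ...but connect them through different (hypothetical) semantic relationships.
edges_a = {("Graph", "relatedTo", "Ontology"), ("Ontology", "describes", "Document")}
edges_b = {("Graph", "relatedTo", "Document"), ("Ontology", "describes", "Document")}

flat_sim = cosine(weights_a, weights_b)     # sees only concepts and weights
graph_sim = edge_jaccard(edges_a, edges_b)  # also sees how concepts are linked

print(f"flattened cosine: {flat_sim:.3f}")
print(f"edge Jaccard:     {graph_sim:.3f}")
```

Under these assumptions the flattened representations are indistinguishable (cosine similarity of approximately 1), while the edge overlap is only one third, because the two graphs share a single edge out of three distinct ones. The structure-aware comparison costs more than a dot product, which is the kind of precision versus complexity tension the abstract refers to.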

Journal