Abstract

Knowledge graphs (KGs) are widely used for modeling scholarly communication, performing scientometric analyses, and supporting a variety of intelligent services to explore the literature and predict research dynamics. However, they often suffer from incompleteness (e.g., missing affiliations, references, research topics), which reduces the scope and quality of the resulting analyses. This issue is usually tackled by computing knowledge graph embeddings (KGEs) and applying link prediction techniques. However, only a few KGE models are capable of taking the weights of facts in the knowledge graph into account. Such weights can have different meanings, e.g., they may describe the degree of association or the degree of truth of a certain triple. In this paper, we propose the Weighted Triple Loss, a new loss function for KGE models that takes full advantage of the additional numerical weights on facts and is even tolerant of incorrect weights. We also extend the Rule Loss, a loss function that is able to exploit a set of logical rules, so that it works with weighted triples. The evaluation of our solutions on several knowledge graphs indicates significant performance improvements with respect to the state of the art. Our main use case is the large-scale AIDA knowledge graph, which describes 21 million research articles. Our approach makes it possible to complete information about affiliation types, countries, and research topics, greatly improving the scope of the resulting scientometric analyses and providing better support to systems for monitoring and predicting research dynamics.

Open Resources

Dataset

We have released the dataset that we used to train our model. Figure 1 shows the data model with its entities and relationships. Table 1 lists the 25 distinct relationships available in our dataset.

Figure 1. Data model.

List of Relationships
http://aida.kmi.open.ac.uk/aida35k/ontology#hasAffiliation
http://aida.kmi.open.ac.uk/aida35k/ontology#hasAffiliation-weight
http://aida.kmi.open.ac.uk/aida35k/ontology#hasAffiliationDistribution
http://aida.kmi.open.ac.uk/aida35k/ontology#hasAuthor
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCitationYear
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCitationYear-weight
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCitationDistribution
http://aida.kmi.open.ac.uk/aida35k/ontology#hasConfName
http://aida.kmi.open.ac.uk/aida35k/ontology#hasConfSeries
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCountry
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCountry-weight
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCountryDistribution
http://aida.kmi.open.ac.uk/aida35k/ontology#hasCsoEnhancedTopic
http://aida.kmi.open.ac.uk/aida35k/ontology#hasEntityType
http://aida.kmi.open.ac.uk/aida35k/ontology#hasGridType
http://aida.kmi.open.ac.uk/aida35k/ontology#hasGridType-weight
http://aida.kmi.open.ac.uk/aida35k/ontology#hasGridTypeDistribution
http://aida.kmi.open.ac.uk/aida35k/ontology#hasIndustrialSector
http://aida.kmi.open.ac.uk/aida35k/ontology#hasJourName
http://aida.kmi.open.ac.uk/aida35k/ontology#hasNetworkInDistribution
http://aida.kmi.open.ac.uk/aida35k/ontology#hasPaper
http://aida.kmi.open.ac.uk/aida35k/ontology#hasReference
http://aida.kmi.open.ac.uk/aida35k/ontology#hasType
http://aida.kmi.open.ac.uk/aida35k/ontology#hasWorkedInDistribution
http://aida.kmi.open.ac.uk/aida35k/ontology#hasYear
Table 1. List of semantic relationships available in the AIDA35K dataset.
Download dataset (2.6 MB)
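The relationship names in Table 1 follow a visible naming pattern: some relations have a companion ending in `-weight`, and some end in `Distribution`. The short Python sketch below groups the 25 local names by that pattern. The reading that a `-weight` relation carries the numerical weight of its base relation is an assumption inferred from the names only, and the helper `split_relations` is a hypothetical name introduced for illustration; neither is taken from the dataset documentation.

```python
NAMESPACE = "http://aida.kmi.open.ac.uk/aida35k/ontology#"

# Local names copied from Table 1 (the part after the '#').
RELATIONS = [
    "hasAffiliation", "hasAffiliation-weight", "hasAffiliationDistribution",
    "hasAuthor", "hasCitationYear", "hasCitationYear-weight",
    "hasCitationDistribution", "hasConfName", "hasConfSeries",
    "hasCountry", "hasCountry-weight", "hasCountryDistribution",
    "hasCsoEnhancedTopic", "hasEntityType", "hasGridType",
    "hasGridType-weight", "hasGridTypeDistribution", "hasIndustrialSector",
    "hasJourName", "hasNetworkInDistribution", "hasPaper", "hasReference",
    "hasType", "hasWorkedInDistribution", "hasYear",
]

WEIGHT_SUFFIX = "-weight"
DISTRIBUTION_SUFFIX = "Distribution"


def split_relations(names):
    """Group relation names by suffix (hypothetical helper, naming-based only)."""
    # Base names of relations that have a '-weight' companion (assumed meaning:
    # the companion stores the numerical weight of the base relation).
    weighted = sorted(n[: -len(WEIGHT_SUFFIX)] for n in names if n.endswith(WEIGHT_SUFFIX))
    distributions = sorted(n for n in names if n.endswith(DISTRIBUTION_SUFFIX))
    plain = sorted(
        n for n in names
        if not n.endswith(WEIGHT_SUFFIX) and not n.endswith(DISTRIBUTION_SUFFIX)
    )
    return plain, weighted, distributions


plain, weighted, distributions = split_relations(RELATIONS)
print("relations with a -weight companion:", weighted)
print("distribution relations:", len(distributions))
print("full URI example:", NAMESPACE + weighted[0])
```

Grouping by suffix keeps the sketch independent of the file format of the download, which is not described here.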

Code

The code for training our link prediction model is available in the following GitHub repository.

Weighted Graph Embedding

GitHub Repository

This repository includes the implementation for the paper "Link Prediction of Weighted Triples for Knowledge Graph Completion within the Scholarly Domain", submitted to the IEEE Access journal.
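To give a rough idea of what training on weighted triples involves, the sketch below scales each triple's contribution to a margin ranking loss by its weight. This is not the Weighted Triple Loss defined in the paper or implemented in the repository. It is a generic illustration under stated assumptions: a TransE-style score, a margin ranking loss, and hypothetical function names (`transe_score`, `weighted_margin_loss`).

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility (assumption): negative L2 distance of h + r from t."""
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def weighted_margin_loss(pos_scores, neg_scores, weights, margin=1.0):
    """Weighted mean of per-triple margin ranking terms.

    Each positive triple is paired with one corrupted (negative) triple.
    A triple with a larger weight contributes more to the loss; a triple
    with weight 0 contributes nothing. Illustrative only, not the paper's loss.
    """
    total = 0.0
    for pos, neg, weight in zip(pos_scores, neg_scores, weights):
        total += weight * max(0.0, margin - pos + neg)
    return total / sum(weights)


# Toy scores for two positive triples and their corrupted counterparts.
pos = [-0.5, -2.0]
neg = [-1.0, -1.0]

uniform = weighted_margin_loss(pos, neg, weights=[1.0, 1.0])
downweighted = weighted_margin_loss(pos, neg, weights=[1.0, 0.25])
print(uniform, downweighted)
```

In this toy case the second triple is ranked badly, so lowering its weight lowers the loss: the model is penalised less for facts it is told are weakly supported.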
