Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation

Thesis (MSc)--Stellenbosch University, 2020.

Bibliographic Details
Main Author: Magangane, Luyolo
Other Authors: Brink, Willie
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2020
Subjects:
_version_ 1867614033677385728
access_status_str Open Access
author Magangane, Luyolo
author2 Brink, Willie
author_browse Brink, Willie
Magangane, Luyolo
author_facet Brink, Willie
Magangane, Luyolo
author_sort Magangane, Luyolo
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MSc)--Stellenbosch University, 2020.
format Thesis
id oai:scholar.sun.ac.za:10019.1/109328
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:45:36.533Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2020
publishDateRange 2020
publishDateSort 2020
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/109328 Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation Magangane, Luyolo Brink, Willie Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Applied Mathematics. Link prediction Factorization (Mathematics) Machine learning Tensor Algebra Artificial intelligence UCTD Thesis (MSc)--Stellenbosch University, 2020. ENGLISH ABSTRACT: Reasoning over knowledge expressed in natural language is a problem at the forefront of artificial intelligence. Question answering is one of the core tasks of this problem, and is concerned with giving machines the capability of generating an answer given a question, by mimicking the reasoning behaviour of humans. Relational learning, in combination with information retrieval, has been explored as a framework for solving this problem. Knowledge graphs (KGs) are used to represent facts about multiple domains as entities (nodes) and relations (edges), and the resource description framework formalism, subject-predicate-object, is used to encode these facts. Link prediction then powers knowledge discovery by scoring possible relationships between entities. This thesis explores latent feature modelling using tensor factorisation as an approach to link prediction. Tensor decompositions are an attractive approach as relational domains are usually high-dimensional and sparse, a setting where factorisation methods have shown very good results. Previous approaches have focused on shallow models that can scale to large datasets, and recently deep models have been applied, specifically neural tensor factorisation models, as these models are more expressive and automatically learn the most useful latent features for entities and relations. In this work we introduce training algorithm optimisations to the neural tensor network (NTN) and HypER neural tensor factorisation models. 
We make use of the TensorFlow reimplementation of NTNs and apply early stopping, adaptive moment estimation and hyperparameter optimisation using random search. We see improvements in both cost and accuracy over the baseline NTN reimplementation, using standard link prediction benchmark datasets WordNet and Freebase. We then apply optimisations to the HypER model training algorithm. We begin with compensating for covariate shift caused by hypernetworks, using batch normalisation, and propose HypER+. We see similar performance to the HypER baseline on the WN18 dataset, and see significant improvement using the FB15k dataset. We extend our optimisation by initialising entity and relation embeddings using pretrained word vectors from the GloVe language model. We see marginal improvements over the baseline using the WN18RR and FB15k-237 datasets. Our results establish HypER+ as a state-of-the-art model in latent feature modelling based link prediction. AFRIKAANS SUMMARY: Reasoning over knowledge expressed in natural language is a problem at the forefront of artificial intelligence. Question answering is one of the core tasks of this problem, and aims to give machines the ability to produce an answer to a given question by imitating the reasoning behaviour of humans. Relational learning, in combination with information retrieval, has been investigated as a framework for solving this problem. Knowledge graphs (KGs) are used to represent facts about multiple domains as entities (nodes) and relations (edges), and the resource description framework formalism, namely subject-predicate-object, is used to encode such facts. Link prediction then drives knowledge discovery by scoring possible relations between entities. This thesis investigates latent feature modelling by means of tensor factorisation as an approach to link prediction. 
Tensor decompositions are an attractive approach, since relational domains are usually high-dimensional and sparse, conditions under which factorisation methods have already shown very good results. Previous approaches focused on shallow models, which can scale to large datasets. More recently, deep models have been applied, specifically neural tensor factorisation models, since these models are more expressive and can automatically learn the most useful latent features for entities and relations. In this work we propose optimisations of the training algorithms for the neural tensor network (NTN) and HypER neural tensor factorisation models. We make use of the TensorFlow reimplementation of NTNs, and apply early stopping, adaptive moment estimation, as well as hyperparameter optimisation with random search. We see improvements in cost as well as accuracy over the baseline NTN reimplementation, on the standard link prediction datasets WordNet and Freebase. We then apply optimisations to the training algorithm of the HypER model. We begin by compensating for the covariate shift caused by hypernetworks, using batch normalisation, and propose HypER+. We see performance similar to the HypER baseline model on the WN18 dataset, and significant improvement on the FB15k dataset. We extend our optimisation by initialising entity and relation embeddings with pretrained word vectors from the GloVe language model. We see marginal improvements over the baseline model on the WN18RR and FB15k-237 datasets. Our results establish HypER+ as a competitive model in latent feature modelling based link prediction. Masters 2020-12-01T08:38:23Z 2021-01-31T19:44:53Z 2020-12-01T08:38:23Z 2021-01-31T19:44:53Z 2020-12 Thesis http://hdl.handle.net/10019.1/109328 en_ZA Stellenbosch University viii, 84 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Link prediction
Factorization (Mathematics)
Machine learning
Tensor Algebra
Artificial intelligence
UCTD
Magangane, Luyolo
Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
title Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
title_full Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
title_fullStr Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
title_full_unstemmed Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
title_short Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
title_sort link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation
topic Link prediction
Factorization (Mathematics)
Machine learning
Tensor Algebra
Artificial intelligence
UCTD
url http://hdl.handle.net/10019.1/109328
work_keys_str_mv AT maganganeluyolo linkpredictioninknowledgegraphsusinglatentfeaturemodellingandneuraltensorfactorisation