Latent semantic models : a study of probabilistic models for text in information retrieval

Mini Dissertation (MSc)--University of Pretoria, 2020.

Bibliographic Details
Other Authors: De Waal, Alta
Format: Thesis
Language: English
Published: University of Pretoria 2020
_version_ 1867613529957203968
access_status_str Open Access
author2 De Waal, Alta
author_browse De Waal, Alta
author_facet De Waal, Alta
collection Thesis
dc_rights_str_mv © 2019 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
description Mini Dissertation (MSc)--University of Pretoria, 2020.
format Thesis
id oai:repository.up.ac.za:2263/73881
institution University of Pretoria (South Africa)
language English
last_indexed 2026-06-10T12:37:36.202Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository
publishDate 2020
publishDateRange 2020
publishDateSort 2020
publisher University of Pretoria
publisherStr University of Pretoria
record_format dspace
source_str UPSpace — University of Pretoria Institutional Repository
spelling oai:repository.up.ac.za:2263/73881 Latent semantic models : a study of probabilistic models for text in information retrieval De Waal, Alta siyabongamjali@gmail.com Mjali, Siyabonga Zimozoxolo UCTD Mini Dissertation (MSc)--University of Pretoria, 2020. Large volumes of text are being generated every minute, which necessitates effective and robust tools to retrieve relevant information. Supervised learning approaches have been explored extensively for this task, but it is difficult to secure large collections of labelled data to train such models. Since a supervised approach is too expensive in terms of annotating data, we consider unsupervised methods such as topic models and word embeddings in order to represent corpora in lower-dimensional semantic spaces. Furthermore, we investigate different distance measures to capture similarity between indexed documents based on their semantic distributions. These include cosine, soft cosine and Jensen-Shannon similarities. The collection of methods discussed in this work allows for the unsupervised association of semantically similar texts, which has a wide range of applications such as fake news detection, sociolinguistics and sentiment analysis. The Hub Internship Centre for Artificial Intelligence Research Statistics MSc (Mathematical Statistics) Unrestricted 2020-03-31T07:21:02Z 2020-03-31T07:21:02Z 2020-09 2020 Mini Dissertation Mjali, SZ 2020, Latent semantic models: A study of probabilistic models for text in information retrieval, Masters mini dissertation, University of Pretoria, Pretoria S2020 http://hdl.handle.net/2263/73881 en © 2019 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. application/pdf University of Pretoria
spellingShingle UCTD
Latent semantic models : a study of probabilistic models for text in information retrieval
title Latent semantic models : a study of probabilistic models for text in information retrieval
title_full Latent semantic models : a study of probabilistic models for text in information retrieval
title_fullStr Latent semantic models : a study of probabilistic models for text in information retrieval
title_full_unstemmed Latent semantic models : a study of probabilistic models for text in information retrieval
title_short Latent semantic models : a study of probabilistic models for text in information retrieval
title_sort latent semantic models a study of probabilistic models for text in information retrieval
topic UCTD
url http://hdl.handle.net/2263/73881
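
The abstract in this record names three measures for comparing documents by their semantic distributions: cosine, soft cosine and Jensen-Shannon similarity. The following is a minimal illustrative sketch of those measures, not the dissertation's own implementation. The function names, the example vectors and the term-similarity matrix `s` are hypothetical, and the Jensen-Shannon similarity is assumed here to be one minus the base-2 Jensen-Shannon divergence; the dissertation may define it differently.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def soft_cosine_similarity(p, q, s):
    """Cosine similarity weighted by a term-similarity matrix s.

    s[i][j] is an assumed similarity between dimensions i and j;
    with s equal to the identity this reduces to plain cosine.
    """
    def bilinear(x, y):
        return sum(
            x[i] * s[i][j] * y[j]
            for i in range(len(x))
            for j in range(len(y))
        )

    return bilinear(p, q) / math.sqrt(bilinear(p, p) * bilinear(q, q))


def jensen_shannon_similarity(p, q):
    """Assumed definition: 1 - JSD(p, q) with base-2 logarithms.

    p and q must be probability distributions (non-negative, sum to 1).
    With base-2 logs the divergence lies in [0, 1], so the result does too.
    """
    m = [0.5 * (a + b) for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL divergence.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 1.0 - 0.5 * (kl(p, m) + kl(q, m))


if __name__ == "__main__":
    # Hypothetical topic distributions for two documents over three topics.
    doc_a = [0.7, 0.2, 0.1]
    doc_b = [0.1, 0.2, 0.7]
    # Hypothetical matrix saying topics 0 and 2 are partly related.
    s = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
    print(cosine_similarity(doc_a, doc_b))
    print(soft_cosine_similarity(doc_a, doc_b, s))
    print(jensen_shannon_similarity(doc_a, doc_b))
```

Cosine and soft cosine accept any non-negative vectors, such as word counts or averaged embeddings, whereas the Jensen-Shannon measure only makes sense for normalised distributions such as the per-document topic proportions produced by a topic model.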