Constructing topic-based Twitter lists

Thesis (MSc)--Stellenbosch University, 2013.

Bibliographic Details
Main Author: De Villiers, Francois
Other Authors: Hoffmann, McElory R.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2013
_version_ 1867614021582061568
access_status_str Open Access
author De Villiers, Francois
author2 Hoffmann, McElory R.
author_browse De Villiers, Francois
Hoffmann, McElory R.
author_facet Hoffmann, McElory R.
De Villiers, Francois
author_sort De Villiers, Francois
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MSc)--Stellenbosch University, 2013.
format Thesis
id oai:scholar.sun.ac.za:10019.1/80054
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:45:24.995Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2013
publishDateRange 2013
publishDateSort 2013
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/80054
Constructing topic-based Twitter lists
De Villiers, Francois
Hoffmann, McElory R.
Kroon, R. S. (Steve)
Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Computer Science.
Online social networks
Data clustering
Machine learning
Twitter
Dissertations -- Mathematical sciences
Theses -- Mathematical sciences
Dissertations -- Computer science
Theses -- Computer science
Thesis (MSc)--Stellenbosch University, 2013.
ENGLISH ABSTRACT: The amount of information that users of social networks consume on a daily basis is steadily increasing. The resulting information overload is usually associated with a loss of control over the management of information sources, leaving users feeling overwhelmed. To address this problem, social networks have introduced tools with which users can organise the people in their networks. However, these tools do not integrate any automated processing. Twitter has lists that can be used to organise people in the network into topic-based groups. This feature is a powerful organisation tool, but it faces two main obstacles to widespread user adoption: the initial setup time and the continual curation it requires. In this thesis, we investigate the problem of constructing topic-based Twitter lists. We identify two subproblems, an unsupervised task and a supervised task, that need to be considered when tackling this problem. These subproblems correspond to a clustering approach and a classification approach, respectively, which we evaluate on Twitter data sets. The clustering approach is evaluated using multiple representation techniques, similarity measures and clustering algorithms. We show that it is possible to incorporate a Twitter user’s social graph data into the clustering approach to find topic-based clusters. The classification approach is implemented, from a statistical relational learning perspective, with kLog. We show that kLog can use a user’s tweet content and social graph data to perform accurate topic-based classification. We conclude that it is feasible to construct useful topic-based Twitter lists with either approach.
AFRIKAANS ABSTRACT (translated): The stream of information that social network users process on a daily basis is growing. For many users, this overdose of information creates a feeling that they are losing control of their information sources. As a solution, social networks have implemented mechanisms with which users can manage the information in their network. These mechanisms are not automatic, but require input from the user. Twitter has implemented lists with which users can group other people in their social network. Lists are a powerful organising mechanism, yet they are not used on a large scale. Users feel that setting them up takes too much time and that maintaining them is too much effort. This thesis investigates the problem of creating topic-based Twitter lists. We identify two subproblems, which are tackled with an unsupervised method and a supervised method. These two methods correspond to clustering and classification, respectively. We evaluate both the clustering and the classification on two Twitter data sets. The clustering method is evaluated by considering different representation techniques, similarity measures and clustering algorithms. We show that it is possible to include a user’s Twitter network data to find topic-based groupings. The classification approach is implemented with kLog, from a statistical relational learning perspective. We show that accurate topic-based classification results can be obtained using users’ tweet content and social network data. In both cases we show that it is possible to build topic-based Twitter lists with good results.
2013-02-20T09:21:46Z 2013-03-15T07:33:06Z 2013-02-20T09:21:46Z 2013-03-15T07:33:06Z
2013-03
Thesis
http://hdl.handle.net/10019.1/80054
en_ZA
Stellenbosch University
98 p. : ill.
application/pdf
Stellenbosch : Stellenbosch University
spellingShingle Online social networks
Data clustering
Machine learning
Twitter
Dissertations -- Mathematical sciences
Theses -- Mathematical sciences
Dissertations -- Computer science
Theses -- Computer science
De Villiers, Francois
Constructing topic-based Twitter lists
title Constructing topic-based Twitter lists
title_full Constructing topic-based Twitter lists
title_fullStr Constructing topic-based Twitter lists
title_full_unstemmed Constructing topic-based Twitter lists
title_short Constructing topic-based Twitter lists
title_sort constructing topic based twitter lists
topic Online social networks
Data clustering
Machine learning
Twitter
Dissertations -- Mathematical sciences
Theses -- Mathematical sciences
Dissertations -- Computer science
Theses -- Computer science
url http://hdl.handle.net/10019.1/80054
work_keys_str_mv AT devilliersfrancois constructingtopicbasedtwitterlists