Feature selection for multi-label classification

Thesis (PhD)--Stellenbosch University, 2020.

Bibliographic Details
Main Author: Contardo-Berning, Ivona E.
Other Authors: Steel, S. J.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2020
Subjects: Multi-label classification; Correspondence analysis (Statistics); Biplots; UCTD
_version_ 1867613972650262528
access_status_str Open Access
author Contardo-Berning, Ivona E.
author2 Steel, S. J.
author_browse Contardo-Berning, Ivona E.
Steel, S. J.
author_facet Steel, S. J.
Contardo-Berning, Ivona E.
author_sort Contardo-Berning, Ivona E.
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (PhD)--Stellenbosch University, 2020.
format Thesis
id oai:scholar.sun.ac.za:10019.1/109247
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:44:38.662Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2020
publishDateRange 2020
publishDateSort 2020
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/109247 Feature selection for multi-label classification Contardo-Berning, Ivona E. Steel, S. J. Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Economics. Multi-label classification Correspondence analysis (Statistics) Biplots UCTD Thesis (PhD)--Stellenbosch University, 2020. ENGLISH ABSTRACT : The field of multi-label learning is a popular new research focus. In the multi-label setting, a data instance can be associated simultaneously with a set of labels instead of only a single label. This dissertation reviews the subject of multi-label classification, emphasising some of the notable developments in the field. The nature of multi-label datasets typically means that these datasets are complex and dimensionality reduction might aid in the analysis of these datasets. The notion of feature selection is therefore introduced and discussed briefly in this dissertation. A new procedure for multi-label feature selection is proposed. This new procedure, relevance pattern feature selection (RPFS), utilises the methodology of the graphical technique of Multiple Correspondence Analysis (MCA) biplots to perform feature selection. An empirical evaluation of the proposed technique is performed using a benchmark multi-label dataset and synthetic multi-label datasets. For the benchmark dataset it is shown that the proposed procedure achieves results similar to the full model, while using significantly fewer features. The empirical evaluation of the procedure on the synthetic datasets shows that the results achieved by the reduced sets of features are better than those achieved with a full set of features for the majority of the methods. The proposed procedure is then compared to two established multi-label feature selection techniques using the synthetic datasets. The results again show that the proposed procedure is effective. 
AFRIKAANS SUMMARY (translated into English) : The field of multi-label learning theory is a popular new research area. In the multi-label setting, a data instance can be associated simultaneously with a set of labels rather than with only a single label. This dissertation provides an overview of the subject of multi-label classification and highlights certain notable developments in the field. The nature of multi-label datasets typically lends itself to complex datasets where dimension reduction can ease the analysis of these datasets. The concept of variable selection is therefore introduced and discussed briefly in this dissertation. A new procedure for multi-label variable selection is proposed. This new procedure, relevance pattern feature selection (RPFS), makes use of the methodology of the graphical technique of Multiple Correspondence Analysis biplots to perform variable selection. An empirical evaluation of the proposed technique was carried out using a benchmark multi-label dataset and synthetic multi-label datasets. For the benchmark dataset it is shown that the proposed procedure delivers results similar to those of the full model, but with significantly fewer variables. The empirical evaluation of the procedure on the synthetic datasets shows that the results delivered by the reduced set of variables are better than those delivered with the full set of variables, for the majority of the methods. The proposed procedure is then compared with two established multi-label variable selection techniques using the synthetic datasets. The results again show that the proposed procedure is effective.
Doctoral 2020-11-27T06:23:42Z 2021-01-31T19:41:05Z 2020-11-27T06:23:42Z 2021-01-31T19:41:05Z 2020-12 Thesis http://hdl.handle.net/10019.1/109247 en_ZA Stellenbosch University xxiv, 686 pages ; illustrations, includes annexures application/pdf Stellenbosch : Stellenbosch University
spellingShingle Multi-label classification
Correspondence analysis (Statistics)
Biplots
UCTD
Contardo-Berning, Ivona E.
Feature selection for multi-label classification
title Feature selection for multi-label classification
title_full Feature selection for multi-label classification
title_fullStr Feature selection for multi-label classification
title_full_unstemmed Feature selection for multi-label classification
title_short Feature selection for multi-label classification
title_sort feature selection for multi label classification
topic Multi-label classification
Correspondence analysis (Statistics)
Biplots
UCTD
url http://hdl.handle.net/10019.1/109247
work_keys_str_mv AT contardoberningivonae featureselectionformultilabelclassification