L-classifier chains classification and variable selection for multi-label datasets

Thesis (MCom)--Stellenbosch University, 2016.

Bibliographic Details
Main Author: Du Toit, Monika
Other Authors: Steel, S. J.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2016
_version_ 1867614131316588544
access_status_str Open Access
author Du Toit, Monika
author2 Steel, S. J.
author_browse Du Toit, Monika
Steel, S. J.
author_facet Steel, S. J.
Du Toit, Monika
author_sort Du Toit, Monika
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MCom)--Stellenbosch University, 2016.
format Thesis
id oai:scholar.sun.ac.za:10019.1/100164
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:47:09.638Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2016
publishDateRange 2016
publishDateSort 2016
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/100164 L-classifier chains classification and variable selection for multi-label datasets Du Toit, Monika Steel, S. J. Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics & Actuarial Science. Multi-label classification Statistical methods Random-forests Instrumental variables (Statistics) UCTD Thesis (MCom)--Stellenbosch University, 2016. ENGLISH SUMMARY: Multi-label classification extends binary and multi-class classification to scenarios where every data case is assigned several labels simultaneously. Applications include labelling images with tags, identifying the instruments playing in a musical piece and classifying text according to two or more labels. Variable selection is an important part of multi-label data analysis, but it has received little attention in the literature. Multi-label variable selection is more complex than variable selection for binary classification, mainly because there is more than one response and because the labels are dependent. In this thesis, a multi-label classification approach called L-classifier chains (LCC) is proposed. This method implements a compromise between the simple classifier chains procedure and the ensemble of classifier chains procedure. The LCC approach uses an ensemble of classifier chains with a semi-random chain structure and random forests as base learners to perform variable selection. The specific structural assumptions of the LCC method allow for variable importance inference based on the output from the random forests. The results from LCC include multi-label predictions and a matrix of variable importance values. This thesis illustrates the application of the LCC classifier by conducting empirical work using multi-label benchmark datasets, simulated datasets and a practical dataset obtained from a South African credit bureau. Throughout the practical applications, it compares the performance of LCC with that of three other classifiers, namely binary relevance, classifier chains and ensemble of classifier chains. AFRIKAANS SUMMARY: No summary available. Masters 2016-12-22T13:22:20Z 2016-12-22T13:22:20Z 2016-12 Thesis http://hdl.handle.net/10019.1/100164 en_ZA Stellenbosch University xii, 169 pages ; illustrations, includes annexures application/pdf Stellenbosch : Stellenbosch University
spellingShingle Multi-label classification
Statistical methods
Random-forests
Instrumental variables (Statistics)
UCTD
Du Toit, Monika
L-classifier chains classification and variable selection for multi-label datasets
title L-classifier chains classification and variable selection for multi-label datasets
title_sort l classifier chains classification and variable selection for multi label datasets
topic Multi-label classification
Statistical methods
Random-forests
Instrumental variables (Statistics)
UCTD
url http://hdl.handle.net/10019.1/100164
work_keys_str_mv AT dutoitmonika lclassifierchainsclassificationandvariableselectionformultilabeldatasets