Categorical CVA biplots

Thesis (MCom)--Stellenbosch University, 2020.

Bibliographic Details
Main Author: Rodwell, David Timothy
Other Authors: Van der Merwe, Carel Johannes
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2020
_version_ 1867614087879327744
access_status_str Open Access
author Rodwell, David Timothy
author2 Van der Merwe, Carel Johannes
author_browse Rodwell, David Timothy
Van der Merwe, Carel Johannes
author_facet Van der Merwe, Carel Johannes
Rodwell, David Timothy
author_sort Rodwell, David Timothy
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MCom)--Stellenbosch University, 2020.
format Thesis
id oai:scholar.sun.ac.za:10019.1/109122
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:46:28.519Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2020
publishDateRange 2020
publishDateSort 2020
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/109122 Categorical CVA biplots Rodwell, David Timothy Van der Merwe, Carel Johannes Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. Biplots Canonical Variate Analysis (CVA) Categorical data Canonical correlation (Statistics) Information visualization Multivariate analysis UCTD Thesis (MCom)--Stellenbosch University, 2020. ENGLISH ABSTRACT: In the modern era a great amount of emphasis is placed on data visualisation, especially in cases where a large amount of data is present. Usually, in these instances, the data is of a high-dimensional nature and cannot be visualised using conventional means. Fortunately, there has been a recent surge in using biplots to visualise multivariate data, where biplots can be described as a generalisation of a scatterplot. Moreover, these biplots use dimension reduction techniques to construct a two-dimensional representation of the data with non-orthogonal axes. However, at present there is no effective biplot construction technique that adequately separates classes when categorical data is present. Hence, this research builds upon an existing biplot construction technique by using elements from Canonical Variate Analysis (CVA) and non-linear Principal Component Analysis (PCA) to develop a technique that can perform class separation in cases where both numerical and categorical data are present. This novel biplot construction methodology forms the crux of this research assignment. Subsequently, the feasibility of this method was explored by considering the well-known Iris data set, where two variables are binned to form categorical variables. It is shown that this novel method improves upon existing biplot construction in terms of classification accuracy and class separation. 
However, it is noted that this method can be extended by incorporating CVA into the iterative algorithm that solves for the optimal categorical level scores. A web-based Shiny application was built as a supplement to this paper, and can be found at https://davidrodwell.shinyapps.io/CategoricalCVABiplotApp/. Here the user can interact with the data sets, proposed methodology, and functionalities presented in this research. AFRIKAANS SUMMARY (translated): In the modern era, great emphasis is placed on the visualisation of data, especially where large data sets are involved. In these cases the data is usually high-dimensional in nature, which means that it cannot be represented visually by conventional means. Recent developments have led to an increase in the use of biplots to represent multivariate data, where a biplot can be described as a generalisation of a scatterplot. These biplots use dimension reduction techniques to construct a two-dimensional representation of the data on a non-orthogonal system of axes. At present there is no effective biplot construction technique that can separate classes when categorical data is present. This research builds on an existing biplot construction technique, using elements of Canonical Variate Analysis (CVA) and non-linear Principal Component Analysis (PCA) to develop a technique that can separate classes in cases where both numerical and categorical data are present. This new biplot construction methodology forms the crux of this research assignment. The feasibility of this method was also investigated by considering the well-known Iris data set, where two variables are binned to become categorical variables. It is shown that this new method improves on existing biplot construction techniques in terms of classification accuracy and class separation. 
It was noted, however, that this method can be extended by incorporating CVA into the iterative algorithm that solves for the optimal categorical level scores. A web-based Shiny application was built as a supplement to this paper, and can be found at https://davidrodwell.shinyapps.io/CategoricalCVABiplotApp/. Here the user can interact with the data sets, proposed methodology, and functionalities presented in this research. Masters 2020-10-27T10:11:50Z 2021-01-31T19:36:14Z 2020-10-27T10:11:50Z 2021-01-31T19:36:14Z 2020-12 Thesis http://hdl.handle.net/10019.1/109122 en_ZA Stellenbosch University v, 30 pages : illustrations application/pdf Stellenbosch : Stellenbosch University
spellingShingle Biplots
Canonical Variate Analysis (CVA)
Categorical data
Canonical correlation (Statistics)
Information visualization
Multivariate analysis
UCTD
Rodwell, David Timothy
Categorical CVA biplots
title Categorical CVA biplots
title_full Categorical CVA biplots
title_fullStr Categorical CVA biplots
title_full_unstemmed Categorical CVA biplots
title_short Categorical CVA biplots
title_sort categorical cva biplots
topic Biplots
Canonical Variate Analysis (CVA)
Categorical data
Canonical correlation (Statistics)
Information visualization
Multivariate analysis
UCTD
url http://hdl.handle.net/10019.1/109122
work_keys_str_mv AT rodwelldavidtimothy categoricalcvabiplots