Nearest hypersphere classification : a comparison with other classification techniques

Thesis (MCom)--Stellenbosch University, 2014.

Bibliographic Details
Main Author:	Van der Westhuizen, Cornelius Stephanus
Other Authors:	Lamont, M. M. C.
Format:	Thesis
Language:	en_ZA
Published:	Stellenbosch : Stellenbosch University 2015
Subjects:	UCTD; Dissertations > Statistics and actuarial science; Theses > Statistics and actuarial science; Classification; Machine learning; Kernel functions

_version_	1867613867698290688
access_status_str	Open Access
author	Van der Westhuizen, Cornelius Stephanus
author2	Lamont, M. M. C.
author_browse	Lamont, M. M. C. Van der Westhuizen, Cornelius Stephanus
author_facet	Lamont, M. M. C. Van der Westhuizen, Cornelius Stephanus
author_sort	Van der Westhuizen, Cornelius Stephanus
collection	Thesis
dc_rights_str_mv	Stellenbosch University
description	Thesis (MCom)--Stellenbosch University, 2014.
format	Thesis
id	oai:scholar.sun.ac.za:10019.1/95839
institution	Stellenbosch University (South Africa)
language	en_ZA
last_indexed	2026-06-10T12:42:57.574Z
license_str	Other — see source repository
provenance_str_mv	Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate	2015
publishDateRange	2015
publishDateSort	2015
publisher	Stellenbosch : Stellenbosch University
publisherStr	Stellenbosch : Stellenbosch University
record_format	dspace
source_str	SUNScholar — Stellenbosch University Repository
spelling	oai:scholar.sun.ac.za:10019.1/95839 Nearest hypersphere classification : a comparison with other classification techniques Van der Westhuizen, Cornelius Stephanus Lamont, M. M. C. Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. UCTD Dissertations -- Statistics and actuarial science Theses -- Statistics and actuarial science Classification Machine learning Kernel functions Thesis (MCom)--Stellenbosch University, 2014. ENGLISH ABSTRACT: Classification is a widely used statistical procedure to classify objects into two or more classes according to some rule which is based on the input variables. Examples of such techniques are Linear and Quadratic Discriminant Analysis (LDA and QDA). However, classification of objects with these methods can get complicated when the number of input variables in the data becomes too large (n ≪ p), when the assumption of normality is no longer met or when classes are not linearly separable. Vapnik et al. (1995) introduced the Support Vector Machine (SVM), a kernel-based technique, which can perform classification in cases where LDA and QDA are not valid. SVM makes use of an optimal separating hyperplane and a kernel function to derive a rule which can be used for classifying objects. Another kernel-based technique was proposed by Tax and Duin (1999) where a hypersphere is used for domain description of a single class. The idea of a hypersphere for a single class can be easily extended to classification when dealing with multiple classes by just classifying objects to the nearest hypersphere. Although the theory of hyperspheres is well developed, not much research has gone into using hyperspheres for classification and the performance thereof compared to other classification techniques. 
In this thesis we will give an overview of Nearest Hypersphere Classification (NHC) as well as provide further insight regarding the performance of NHC compared to other classification techniques (LDA, QDA and SVM) under different simulation configurations. We begin with a literature study, where the theory of the classification techniques LDA, QDA, SVM and NHC will be dealt with. In the discussion of each technique, applications in the statistical software R will also be provided. An extensive simulation study is carried out to compare the performance of LDA, QDA, SVM and NHC for the two-class case. Various data scenarios will be considered in the simulation study. This will give further insight in terms of which classification technique performs better under the different data scenarios. Finally, the thesis ends with the comparison of these techniques on real-world data. AFRIKAANS SUMMARY: Classification is a statistical method used to classify objects into two or more classes based on a rule built on the independent variables. Examples of these methods include Linear and Quadratic Discriminant Analysis (LDA and QDA). However, when the number of independent variables in a data set becomes too large, when the assumption of normality no longer holds or when the classes are no longer linearly separable, applying methods such as LDA and QDA becomes too difficult. Vapnik et al. (1995) introduced a kernel-based method, the Support Vector Machine (SVM), which can be used for classification in situations where methods such as LDA and QDA fail. SVM makes use of an optimal separating hyperplane and a kernel function to derive a rule that can be used to classify objects. Another kernel-based technique was proposed by Tax and Duin (1999), in which a hypersphere can be used to construct a domain description for a data set with only one class. 
This idea of a single class that can be described by a hypersphere can easily be extended to a multi-class classification problem. This can be done by simply classifying objects to the nearest hypersphere. Although the theory of hyperspheres is well developed, not much research has yet been done on the use of hyperspheres for classification. Little attention has also been given to the performance of hyperspheres compared with other classification techniques. In this thesis we give an overview of Nearest Hypersphere Classification (NHC) as well as further insight into the performance of NHC compared with other classification techniques (LDA, QDA and SVM) under certain simulation configurations. We begin with a literature study, in which the theory of the classification techniques LDA, QDA, SVM and NHC is covered. For each technique, applications in the statistical software R are also shown. An extensive simulation study is carried out to compare the performance of the techniques LDA, QDA, SVM and NHC. The comparison is done for situations where the data have only two classes. A variety of data situations are also examined to give further insight into which technique performs best in which situation. The thesis concludes by applying the techniques mentioned to practical data sets. Masters 2015-01-13T11:47:35Z 2015-01-13T11:47:35Z 2014-12 Thesis http://hdl.handle.net/10019.1/95839 en_ZA Stellenbosch University 109 p. : Ill. application/pdf Stellenbosch : Stellenbosch University
spellingShingle	UCTD Dissertations -- Statistics and actuarial science Theses -- Statistics and actuarial science Classification Machine learning Kernel functions Van der Westhuizen, Cornelius Stephanus Nearest hypersphere classification : a comparison with other classification techniques
title	Nearest hypersphere classification : a comparison with other classification techniques
title_full	Nearest hypersphere classification : a comparison with other classification techniques
title_fullStr	Nearest hypersphere classification : a comparison with other classification techniques
title_full_unstemmed	Nearest hypersphere classification : a comparison with other classification techniques
title_short	Nearest hypersphere classification : a comparison with other classification techniques
title_sort	nearest hypersphere classification a comparison with other classification techniques
topic	UCTD Dissertations -- Statistics and actuarial science Theses -- Statistics and actuarial science Classification Machine learning Kernel functions
url	http://hdl.handle.net/10019.1/95839
work_keys_str_mv	AT vanderwesthuizencorneliusstephanus nearesthypersphereclassificationacomparisonwithotherclassificationtechniques
