Application of cluster analysis and multidimensional scaling on medical schemes data

Thesis (MComm (Statistics and Actuarial Science))--Stellenbosch University, 2008.

Bibliographic Details
Main Author: Roux, Ian
Other Authors: Le Roux, N. J.
Format: Thesis
Language: English
Published: Stellenbosch : Stellenbosch University 2008
_version_ 1867614077997547520
access_status_str Open Access
author Roux, Ian
author2 Le Roux, N. J.
author_browse Le Roux, N. J.
Roux, Ian
author_facet Le Roux, N. J.
Roux, Ian
author_sort Roux, Ian
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MComm (Statistics and Actuarial Science))--Stellenbosch University, 2008.
format Thesis
id oai:scholar.sun.ac.za:10019.1/2040
institution Stellenbosch University (South Africa)
language English
last_indexed 2026-06-10T12:46:18.613Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2008
publishDateRange 2008
publishDateSort 2008
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/2040 Application of cluster analysis and multidimensional scaling on medical schemes data Roux, Ian Le Roux, N. J. McLeod, H. Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. Cluster analysis Dissimilarities Binary data Medical schemes data Dissertations -- Statistics and actuarial science Theses -- Statistics and actuarial science Multidimensional scaling Health insurance -- Statistical methods Chronic diseases Thesis (MComm (Statistics and Actuarial Science))--Stellenbosch University, 2008. Cluster analysis and multidimensional scaling (MDS) methods can be used to explore the structure in multidimensional data and can be applied to various fields of study. In this study, clustering techniques and MDS methods are applied to a data set from the health insurance field. This data set contains information on the number of medical scheme beneficiaries, aged 55 to 59, who are treated for certain combinations of chronic diseases. Clustering techniques and MDS methods are used to describe the interrelations among these chronic diseases and to determine certain clusters of chronic diseases. Similarity or dissimilarity measures between the chronic diseases are constructed before the application of MDS methods or clustering techniques, because the chronic diseases are binary variables in the data set. The calculation of dissimilarities between the chronic diseases is based on various dissimilarity coefficients, where a different dissimilarity coefficient will produce a different set of dissimilarities. One of the aims of this study is to compare different dissimilarity coefficients, and it is shown that the Jaccard, Ochiai, Baroni-Urbani-Buser, Phi and Yule dissimilarity coefficients are the most suitable for use on this particular data set.
MDS methods are used to produce a lower dimensional display space where the chronic diseases are represented by points and distances between these points give some measurement of similarity between the chronic diseases. The classical scaling, metric least squares scaling and nonmetric MDS methods are used in this study, and it is shown that the nonmetric MDS method is the most suitable MDS method to use for this particular data set. The Scaling by Majorizing a Complicated Function (SMACOF) algorithm is used to minimise the loss functions in this study, and it was found to perform well. Clustering techniques are used to provide information about the clustering structure of the chronic diseases. Chronic diseases that are in the same cluster can be considered to be more similar, while chronic diseases in different clusters are more dissimilar. The robust clustering techniques PAM, FANNY, AGNES and DIANA are applied to the data set. It was found that AGNES and DIANA performed very well on the data set, while PAM and FANNY performed only marginally well. 2008-10-10T09:55:12Z 2010-06-01T08:39:23Z 2008-10-10T09:55:12Z 2010-06-01T08:39:23Z 2008-12 Thesis http://hdl.handle.net/10019.1/2040 en Stellenbosch University application/pdf Stellenbosch : Stellenbosch University
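Illustrative note (not part of the thesis record): the abstract says that dissimilarities between binary disease variables are computed from dissimilarity coefficients before MDS or clustering is applied. The sketch below shows how two of the named coefficients, Jaccard and Ochiai, might be computed from their standard definitions (one minus the similarity coefficient). The toy records, the labels disease_A to disease_C, and all function names are hypothetical, and the one-row-per-beneficiary layout is an assumption; the thesis data set and its exact formulas are not reproduced here.

```python
from math import sqrt

def joint_counts(x, y):
    """Return (a, b, c): both present, only x present, only y present."""
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)
    return a, b, c

def jaccard_dissimilarity(x, y):
    """1 - a / (a + b + c); joint absences are ignored."""
    a, b, c = joint_counts(x, y)
    total = a + b + c
    return 1.0 - a / total if total else 0.0

def ochiai_dissimilarity(x, y):
    """1 - a / sqrt((a + b) * (a + c))."""
    a, b, c = joint_counts(x, y)
    denom = sqrt((a + b) * (a + c))
    return 1.0 - a / denom if denom else 0.0

# Hypothetical toy data: one row per beneficiary, one 0/1 column per disease.
records = [
    (1, 1, 0),
    (1, 1, 0),
    (1, 0, 0),
    (0, 0, 1),
    (0, 1, 1),
]
labels = ["disease_A", "disease_B", "disease_C"]  # placeholder names
columns = list(zip(*records))  # transpose: one tuple per disease

def dissimilarity_matrix(cols, coefficient):
    """Symmetric matrix of pairwise dissimilarities between variables."""
    return [[coefficient(u, v) for v in cols] for u in cols]

jaccard = dissimilarity_matrix(columns, jaccard_dissimilarity)
ochiai = dissimilarity_matrix(columns, ochiai_dissimilarity)

for name, row in zip(labels, jaccard):
    print(name, [round(value, 3) for value in row])
```

A matrix of this kind is the usual input to an MDS routine or a dissimilarity-based clustering method. Different coefficients give different matrices for the same data, which is why the abstract compares them.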
spellingShingle Cluster analysis
Dissimilarities
Binary data
Medical schemes data
Dissertations -- Statistics and actuarial science
Theses -- Statistics and actuarial science
Multidimensional scaling
Health insurance -- Statistical methods
Chronic diseases
Roux, Ian
Application of cluster analysis and multidimensional scaling on medical schemes data
title Application of cluster analysis and multidimensional scaling on medical schemes data
title_full Application of cluster analysis and multidimensional scaling on medical schemes data
title_fullStr Application of cluster analysis and multidimensional scaling on medical schemes data
title_full_unstemmed Application of cluster analysis and multidimensional scaling on medical schemes data
title_short Application of cluster analysis and multidimensional scaling on medical schemes data
title_sort application of cluster analysis and multidimensional scaling on medical schemes data
topic Cluster analysis
Dissimilarities
Binary data
Medical schemes data
Dissertations -- Statistics and actuarial science
Theses -- Statistics and actuarial science
Multidimensional scaling
Health insurance -- Statistical methods
Chronic diseases
url http://hdl.handle.net/10019.1/2040
work_keys_str_mv AT rouxian applicationofclusteranalysisandmultidimensionalscalingonmedicalschemesdata