The implementation of noise addition partial least squares

Thesis (MComm (Statistics and Actuarial Science))--University of Stellenbosch, 2009.

Bibliographic Details
Main Author: Moller, Jurgen Johann
Other Authors: Kidd, Martin
Format: Thesis
Language: English
Published: Stellenbosch : University of Stellenbosch 2009
access_status_str Open Access
collection Thesis
dc_rights_str_mv University of Stellenbosch
id oai:scholar.sun.ac.za:10019.1/3362
institution Stellenbosch University (South Africa)
last_indexed 2026-06-10T12:41:15.521Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/3362 The implementation of noise addition partial least squares Moller, Jurgen Johann Kidd, Martin University of Stellenbosch. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. Dissertations -- Statistics and actuarial science Theses -- Statistics and actuarial science Assignments -- Statistics and actuarial science Chemistry, Analytic -- Statistical methods Principal components analysis Regression analysis Ridge regression (Statistics) Thesis (MComm (Statistics and Actuarial Science))--University of Stellenbosch, 2009.
When determining the chemical composition of a specimen, traditional laboratory techniques are often both expensive and time-consuming. It is therefore preferable to employ more cost-effective spectroscopic techniques such as near infrared (NIR). Traditionally, the calibration problem has been solved by means of multiple linear regression to specify the model between the spectral data (X) and the chemical reference values (Y). Traditional regression techniques, however, quickly fail when using spectroscopic data, as the number of wavelengths can easily be several hundred, often exceeding the number of chemical samples. This scenario, together with the high level of collinearity between wavelengths, will necessarily lead to singularity problems when calculating the regression coefficients. Ways of dealing with the collinearity problem include principal component regression (PCR), ridge regression (RR) and PLS regression. Both PCR and RR require a significant amount of computation when the number of variables is large. PLS overcomes the collinearity problem in a similar way to PCR, by modelling both the chemical and spectral data as functions of common latent variables. The quality of the employed reference method greatly impacts the coefficients of the regression model and, therefore, the quality of its predictions. With both X and Y subject to random error, the quality of the predictions of Y will be reduced with an increase in the level of noise. Previously conducted research focussed mainly on the effects of noise in X. This paper focuses on a method proposed by Dardenne and Fernández Pierna, called Noise Addition Partial Least Squares (NAPLS), that attempts to deal with the problem of poor reference values. Some aspects of the theory behind PCR, PLS and model selection are discussed. This is then followed by a discussion of the NAPLS algorithm. Both PLS and NAPLS are implemented on various datasets that arise in practice, in order to determine cases where NAPLS will be beneficial over conventional PLS. For each dataset, specific attention is given to the analysis of outliers, influential values and the linearity between X and Y, using graphical techniques. Lastly, the performance of the NAPLS algorithm is evaluated for various
Masters 2009-03-05T11:04:41Z 2010-07-09T11:08:35Z 2009-03-05T11:04:41Z 2010-07-09T11:08:35Z 2009-03 Thesis http://hdl.handle.net/10019.1/3362 en University of Stellenbosch application/pdf Stellenbosch : University of Stellenbosch
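The singularity problem described in the abstract, and the way PLS sidesteps it through latent variables, can be illustrated with a small sketch. This is not code from the thesis and it does not implement NAPLS: the data are simulated, the sizes (20 samples, 200 wavelengths, 3 latent factors) are arbitrary assumptions, and `pls1_fit` is a hypothetical helper implementing a textbook NIPALS-style PLS1 loop for a single response.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with a NIPALS-style PLS1 loop.

    Returns the coefficient vector and intercept on the original scale.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # centre both blocks
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                        # weight: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                           # latent variable (score)
        tt = t @ t
        p = Xk.T @ t / tt                    # X loading
        c = (yk @ t) / tt                    # y loading
        Xk = Xk - np.outer(t, p)             # deflate X
        yk = yk - c * t                      # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # B = W (P'W)^{-1} q
    return beta, y_mean - x_mean @ beta


# Simulated "spectra": many collinear wavelengths driven by a few latent factors.
rng = np.random.default_rng(0)
n_samples, n_wavelengths, n_factors = 20, 200, 3   # assumed sizes
T = rng.normal(size=(n_samples, n_factors))
L = rng.normal(size=(n_factors, n_wavelengths))
X = T @ L + 0.01 * rng.normal(size=(n_samples, n_wavelengths))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=n_samples)

# With more wavelengths than samples, X'X cannot have full rank, so the
# ordinary least-squares normal equations have no unique solution.
rank_xtx = np.linalg.matrix_rank(X.T @ X)
print("rank of X'X:", rank_xtx, "of", n_wavelengths)

# PLS regresses on a handful of latent variables instead.
beta, intercept = pls1_fit(X, y, n_components=n_factors)
y_hat = X @ beta + intercept
print("in-sample correlation:", np.corrcoef(y, y_hat)[0, 1])
```

The in-sample fit shown here says nothing about predictive quality; in practice the number of components would be chosen by a model selection procedure such as cross-validation.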
topic Dissertations -- Statistics and actuarial science
Theses -- Statistics and actuarial science
Assignments -- Statistics and actuarial science
Chemistry, Analytic -- Statistical methods
Principal components analysis
Regression analysis
Ridge regression (Statistics)
url http://hdl.handle.net/10019.1/3362