Influential data cases when the C-p criterion is used for variable selection in multiple linear regression

Dissertation (PhD)--Stellenbosch University, 2003.

Bibliographic Details
Main Author: Uys, Daniel Wilhelm
Other Authors: Steel, S. J.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2012
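Subjects: Regression analysis; C-p criterion; Variable selection; Dissertations -- Statistics and actuarial science; Theses -- Statistics and actuarial science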
_version_ 1867613834818093057
access_status_str Open Access
author Uys, Daniel Wilhelm
author2 Steel, S. J.
author_browse Steel, S. J.
Uys, Daniel Wilhelm
author_facet Steel, S. J.
Uys, Daniel Wilhelm
author_sort Uys, Daniel Wilhelm
collection Thesis
dc_rights_str_mv Stellenbosch University
description Dissertation (PhD)--Stellenbosch University, 2003.
format Thesis
id oai:scholar.sun.ac.za:10019.1/53464
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:42:26.594Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2012
publishDateRange 2012
publishDateSort 2012
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/53464
Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
Uys, Daniel Wilhelm
Steel, S. J.
Van Vuuren, J. O.
Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistical and Actuarial Science.
Regression analysis
Dissertations -- Statistics and actuarial science
C-p criterion
Variable selection
Theses -- Statistics and actuarial science
Dissertation (PhD)--Stellenbosch University, 2003.
ENGLISH ABSTRACT: In this dissertation we study the influence of data cases when the Cp criterion of Mallows (1973) is used for variable selection in multiple linear regression. The influence is investigated in terms of the predictive power of, and the predictor variables included in, the model that results when variable selection is applied. In particular, we focus on the importance of identifying and dealing with these so-called selection influential data cases before model selection and fitting are performed. For this purpose we develop two new selection influence measures, both based on the Cp criterion. The first measure is specifically developed to identify individual selection influential data cases, whereas the second identifies subsets of selection influential data cases. The success with which these influence measures identify selection influential data cases is evaluated in example data sets and in simulation. All results are derived in the coordinate-free context, with special application in multiple linear regression.
AFRIKAANS ABSTRACT (translated): Influential observations when the C-p criterion is used for variable selection in multiple linear regression: In this dissertation we investigate the influence of observations when the Cp criterion of Mallows (1973) is used for variable selection in multiple linear regression. The influence of observations on the predictive power and on the independent variables included in the final selected model is investigated. In particular, we focus on the importance of identifying and dealing with so-called selection influential observations before model selection and fitting are carried out. For this purpose two new influence measures, both based on the Cp criterion, are developed. The first measure is specifically developed to measure the influence of individual observations, whereas the second measures the influence of subsets of observations on the selection process. The success with which these influence measures identify selection influential observations is assessed in example data sets and in simulation. All results are derived within the coordinate-free context, with special application in multiple linear regression.
Doctoral
2012-08-27T11:35:29Z
2012-08-27T11:35:29Z
2003
Thesis
http://hdl.handle.net/10019.1/53464
en_ZA
Stellenbosch University
189 p.
application/pdf
Stellenbosch : Stellenbosch University
spellingShingle Regression analysis
Dissertations -- Statistics and actuarial science
C-p criterion
Variable selection
Theses -- Statistics and actuarial science
Uys, Daniel Wilhelm
Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
title Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
title_full Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
title_fullStr Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
title_full_unstemmed Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
title_short Influential data cases when the C-p criterion is used for variable selection in multiple linear regression
title_sort influential data cases when the c p criterion is used for variable selection in multiple linear regression
topic Regression analysis
Dissertations -- Statistics and actuarial science
C-p criterion
Variable selection
Theses -- Statistics and actuarial science
url http://hdl.handle.net/10019.1/53464
work_keys_str_mv AT uysdanielwilhelm influentialdatacaseswhenthecpcriterionisusedforvariableselectioninmultiplelinearregression