A framework for estimating risk

Thesis (PhD (Statistics and Actuarial Sciences))--Stellenbosch University, 2008.

Bibliographic Details
Main Author: Kroon, Rodney Stephen
Other Authors: Steel, S. J.
Format: Thesis
Language: English
Published: Stellenbosch : Stellenbosch University 2008
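Subjects: Risk estimation; Concentration inequalities; Training sample bounds; Covering numbers; Risk assessment -- Mathematical models; Estimation theory; Bayesian statistical decision theory; Sampling (Statistics); Dissertations -- Statistics and actuarial science; Theses -- Statistics and actuarial science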
_version_ 1867614015745687552
access_status_str Open Access
author Kroon, Rodney Stephen
author2 Steel, S. J.
author_browse Kroon, Rodney Stephen
Steel, S. J.
author_facet Steel, S. J.
Kroon, Rodney Stephen
author_sort Kroon, Rodney Stephen
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (PhD (Statistics and Actuarial Sciences))--Stellenbosch University, 2008.
format Thesis
id oai:scholar.sun.ac.za:10019.1/1104
institution Stellenbosch University (South Africa)
language English
last_indexed 2026-06-10T12:45:19.124Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2008
publishDateRange 2008
publishDateSort 2008
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/1104 A framework for estimating risk Kroon, Rodney Stephen Steel, S. J. Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. Risk estimation Concentration inequalities Training sample bounds Covering numbers Risk assessment -- Mathematical models Estimation theory Bayesian statistical decision theory Sampling (Statistics) Dissertations -- Statistics and actuarial science Theses -- Statistics and actuarial science Thesis (PhD (Statistics and Actuarial Sciences))--Stellenbosch University, 2008. We consider the problem of model assessment by risk estimation. Various approaches to risk estimation are considered in a unified framework. This framework is an extension of a decision-theoretic framework proposed by David Haussler. Point and interval estimation based on test samples and training samples is discussed, with interval estimators being classified based on the measure of deviation they attempt to bound. The main contribution of this thesis is in the realm of training sample interval estimators, particularly covering number-based and PAC-Bayesian interval estimators. The thesis discusses a number of approaches to obtaining such estimators. The first type of training sample interval estimator to receive attention is estimators based on classical covering number arguments. A number of these estimators were generalized in various directions. Typical generalizations included: extension of results from misclassification loss to other loss functions; extending results to allow arbitrary ghost sample size; extending results to allow arbitrary scale in the relevant covering numbers; and extending results to allow arbitrary choice of in the use of symmetrization lemmas. These extensions were applied to covering number-based estimators for various measures of deviation, as well as for the special cases of misclassification loss estimators, realizable case estimators, and margin bounds. Extended results were also provided for stratification by (algorithm- and data-dependent) complexity of the decision class.
In order to facilitate application of these covering number-based bounds, a discussion of various complexity dimensions and approaches to obtaining bounds on covering numbers is also presented. The second type of training sample interval estimator discussed in the thesis is Rademacher bounds. These bounds use advanced concentration inequalities, so a chapter discussing such inequalities is provided. Our discussion of Rademacher bounds leads to the presentation of an alternative, slightly stronger, form of the core result used for deriving local Rademacher bounds, by avoiding a few unnecessary relaxations. Next, we turn to a discussion of PAC-Bayesian bounds. Using an approach developed by Olivier Catoni, we develop new PAC-Bayesian bounds based on results underlying Hoeffding's inequality. By utilizing Catoni's concept of "exchangeable priors", these results allowed the extension of a covering number-based result to averaging classifiers, as well as its corresponding algorithm- and data-dependent result.
The last contribution of the thesis is the development of a more flexible shell decomposition bound: by using Hoeffding's tail inequality rather than Hoeffding's relative entropy inequality, we extended the bound to general loss functions, allowed the use of an arbitrary number of bins, and introduced between-bin and within-bin "priors". Finally, to illustrate the calculation of these bounds, we applied some of them to the UCI spam classification problem, using decision trees and boosted stumps. Doctoral 2008-06-18T10:37:52Z 2010-06-01T08:12:29Z 2008-06-18T10:37:52Z 2010-06-01T08:12:29Z 2008-03 Thesis http://hdl.handle.net/10019.1/1104 en Stellenbosch University application/pdf Stellenbosch : Stellenbosch University
spellingShingle Risk estimation
Concentration inequalities
Training sample bounds
Covering numbers
Risk assessment -- Mathematical models
Estimation theory
Bayesian statistical decision theory
Sampling (Statistics)
Dissertations -- Statistics and actuarial science
Theses -- Statistics and actuarial science
Kroon, Rodney Stephen
A framework for estimating risk
title A framework for estimating risk
title_full A framework for estimating risk
title_fullStr A framework for estimating risk
title_full_unstemmed A framework for estimating risk
title_short A framework for estimating risk
title_sort framework for estimating risk
topic Risk estimation
Concentration inequalities
Training sample bounds
Covering numbers
Risk assessment -- Mathematical models
Estimation theory
Bayesian statistical decision theory
Sampling (Statistics)
Dissertations -- Statistics and actuarial science
Theses -- Statistics and actuarial science
url http://hdl.handle.net/10019.1/1104
work_keys_str_mv AT kroonrodneystephen aframeworkforestimatingrisk
AT kroonrodneystephen frameworkforestimatingrisk