Techniques for handling clustered binary data

Bibliography : leaves 143-153.

Bibliographic Details
Main Author:	Hanslo, Monique
Other Authors:	Juritz, June
Format:	Thesis
Language:	English
Published:	Department of Statistical Sciences 2014
Subjects:	Mathematical Statistics

_version_	1867613308419309568
access_status_str	Open Access
author	Hanslo, Monique
author2	Juritz, June
author_browse	Hanslo, Monique Juritz, June
author_facet	Juritz, June Hanslo, Monique
author_sort	Hanslo, Monique
collection	Thesis
description	Bibliography : leaves 143-153.
format	Thesis
id	oai:open.uct.ac.za:11427/6950
institution	University of Cape Town (South Africa)
language	eng
last_indexed	2026-06-10T12:34:03.682Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate	2014
publishDateRange	2014
publishDateSort	2014
publisher	Department of Statistical Sciences
publisherStr	Department of Statistical Sciences
record_format	dspace
source_str	UCTD — University of Cape Town Open Access Repository
spelling	oai:open.uct.ac.za:11427/6950 Techniques for handling clustered binary data Hanslo, Monique Juritz, June Mathematical Statistics Bibliography : leaves 143-153. Over the past few decades there has been increasing interest in clustered studies and hence much research has gone into the analysis of data arising from these studies. It is erroneous to treat clustered data, where observations within a cluster are correlated with each other, as one would treat independent data. It has been found that point estimates are not as greatly affected by clustering as are the standard deviations of the estimates. As a consequence, confidence intervals and hypothesis tests are severely affected. Therefore one has to approach the analysis of clustered data with caution. Methods that specifically deal with correlated data have been developed. Analysis may be further complicated when the outcome variable of interest is binary rather than continuous. Methods for estimation of proportions, their variances, calculation of confidence intervals and a variety of techniques for testing the homogeneity of proportions have been developed over the years (Donner and Klar, 1993; Donner, 1989; Rao and Scott, 1992). The methods developed within the context of experimental design generally involve incorporating the effect of clustering in the analysis. This cluster effect is quantified by the intracluster correlation and needs to be taken into account when estimating proportions, comparing proportions and in sample size calculations. In the context of observational studies, the effect of clustering is expressed by the design effect, which is the inflation in the variance of an estimate that is due to selecting a cluster sample rather than an independent sample. Another important aspect of the analysis of complex sample data that is often neglected is the use of sampling weights. One needs to recognise that each individual may not have the same probability of being selected. 
These weights adjust for this fact (Little et al., 1997). Methods for modelling correlated binary data have also been discussed quite extensively. Among the many models which have been proposed for analyzing binary clustered data are two approaches which have been studied and compared: the population-averaged and cluster-specific approaches. The population-averaged model focuses on estimating the effect of a set of covariates on the marginal expectation of the response. One example of the population-averaged approach for parameter estimation is known as generalized estimating equations, proposed by Liang and Zeger (1986). It involves assuming that elements within a cluster are independent and then imposing a correlation structure on the set of responses. This is a useful application in longitudinal studies where a subject is regarded as a cluster. Then the parameters describe how the population-averaged response rather than a specific subject's response depends on the covariates of interest. On the other hand, cluster-specific models introduce cluster-to-cluster variability in the model by including random effects terms, which are specific to the cluster, in the linear predictor of the regression model (Neuhaus et al., 1991). Unlike the special case of correlated Gaussian responses, the parameters for the cluster-specific model obtained for binary data describe different effects on the responses compared to those obtained from the population-averaged model. For longitudinal data, the parameters of a cluster-specific model describe how a specific individual's probability of a response depends on the covariates. The decision to use either of these modelling methods depends on the questions of interest. Cluster-specific models are useful for studying the effects of cluster-varying covariates and when an individual's response rather than an average population's response is the focus. 
The population-averaged model is useful when interest lies in how the average response across clusters changes with covariates. A criticism of this approach is that there may be no individual with the characteristics of the population-averaged model. 2014-09-08T09:50:08Z 2014-09-08T09:50:08Z 2002 Master Thesis Masters MSc http://hdl.handle.net/11427/6950 eng application/pdf Department of Statistical Sciences Faculty of Science University of Cape Town
spellingShingle	Mathematical Statistics Hanslo, Monique Techniques for handling clustered binary data
thesis_degree_str	Master's
title	Techniques for handling clustered binary data
title_full	Techniques for handling clustered binary data
title_fullStr	Techniques for handling clustered binary data
title_full_unstemmed	Techniques for handling clustered binary data
title_short	Techniques for handling clustered binary data
title_sort	techniques for handling clustered binary data
topic	Mathematical Statistics
url	http://hdl.handle.net/11427/6950
work_keys_str_mv	AT hanslomonique techniquesforhandlingclusteredbinarydata
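
The abstract describes the design effect as the inflation in the variance of an estimate caused by cluster sampling, and says the intracluster correlation must be accounted for when estimating proportions. The record itself gives no formula, so the sketch below is an illustration only: it assumes the standard equal-cluster-size form, design effect = 1 + (m - 1) * rho, and a normal-approximation confidence interval. The function names and the example figures are hypothetical and are not taken from the thesis.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation factor 1 + (m - 1) * rho.

    Assumes every cluster has the same size m (an assumption of this
    sketch); rho is the intracluster correlation.
    """
    return 1.0 + (cluster_size - 1.0) * icc


def clustered_proportion_ci(successes: int, n_total: int,
                            cluster_size: float, icc: float,
                            z: float = 1.96):
    """Normal-approximation CI for a proportion, widened by the design effect.

    Returns (estimate, naive_se, adjusted_se, (lower, upper)).
    """
    p = successes / n_total
    # Standard error that wrongly treats all observations as independent.
    naive_se = math.sqrt(p * (1.0 - p) / n_total)
    # Inflate the variance by the design effect, i.e. the SE by its square root.
    adjusted_se = naive_se * math.sqrt(design_effect(cluster_size, icc))
    interval = (p - z * adjusted_se, p + z * adjusted_se)
    return p, naive_se, adjusted_se, interval


if __name__ == "__main__":
    # Hypothetical data: 20 clusters of 10 subjects, 60 positive responses,
    # and an assumed intracluster correlation of 0.05.
    p, naive_se, adjusted_se, ci = clustered_proportion_ci(60, 200, 10, 0.05)
    print(f"estimate={p:.3f} naive SE={naive_se:.4f} adjusted SE={adjusted_se:.4f}")
    print(f"95% CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

The point estimate is the same with or without the adjustment; only the standard error and the interval change. This matches the abstract's remark that clustering affects the variability of estimates more than the estimates themselves.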

Similar Items