Techniques for handling clustered binary data

Bibliography : leaves 143-153.

Bibliographic Details
Main Author: Hanslo, Monique
Other Authors: Juritz, June
Format: Thesis
Language: English
Published: Department of Statistical Sciences 2014
Subjects: Mathematical Statistics
_version_ 1867613308419309568
access_status_str Open Access
author Hanslo, Monique
author2 Juritz, June
author_browse Hanslo, Monique
Juritz, June
author_facet Juritz, June
Hanslo, Monique
author_sort Hanslo, Monique
collection Thesis
description Bibliography : leaves 143-153.
format Thesis
id oai:open.uct.ac.za:11427/6950
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:34:03.682Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2014
publishDateRange 2014
publishDateSort 2014
publisher Department of Statistical Sciences
publisherStr Department of Statistical Sciences
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/6950 Techniques for handling clustered binary data Hanslo, Monique Juritz, June Mathematical Statistics Bibliography : leaves 143-153. Over the past few decades there has been increasing interest in clustered studies and hence much research has gone into the analysis of data arising from these studies. It is erroneous to treat clustered data, where observations within a cluster are correlated with each other, as one would treat independent data. It has been found that point estimates are not as greatly affected by clustering as are the standard deviations of the estimates. As a consequence, confidence intervals and hypothesis tests are severely affected. Therefore one has to approach the analysis of clustered data with caution. Methods that specifically deal with correlated data have been developed. Analysis may be further complicated when the outcome variable of interest is binary rather than continuous. Methods for estimation of proportions, their variances, calculation of confidence intervals and a variety of techniques for testing the homogeneity of proportions have been developed over the years (Donner and Klar, 1993; Donner, 1989; Rao and Scott, 1992). The methods developed within the context of experimental design generally involve incorporating the effect of clustering in the analysis. This cluster effect is quantified by the intracluster correlation and needs to be taken into account when estimating proportions, comparing proportions and in sample size calculations. In the context of observational studies, the effect of clustering is expressed by the design effect, which is the inflation in the variance of an estimate that is due to selecting a cluster sample rather than an independent sample. Another important aspect of the analysis of complex sample data that is often neglected is sampling weights. One needs to recognise that each individual may not have the same probability of being selected. 
These weights adjust for this fact (Little et al., 1997). Methods for modelling correlated binary data have also been discussed quite extensively. Among the many models which have been proposed for analyzing binary clustered data are two approaches which have been studied and compared: the population-averaged and cluster-specific approaches. The population-averaged model focuses on estimating the effect of a set of covariates on the marginal expectation of the response. One example of the population-averaged approach for parameter estimation is known as generalized estimating equations, proposed by Liang and Zeger (1986). It involves assuming that elements within a cluster are independent and then imposing a correlation structure on the set of responses. This is a useful application in longitudinal studies where a subject is regarded as a cluster. Then the parameters describe how the population-averaged response rather than a specific subject's response depends on the covariates of interest. On the other hand, cluster-specific models introduce cluster-to-cluster variability in the model by including random effects terms, which are specific to the cluster, as linear predictors in the regression model (Neuhaus et al., 1991). Unlike the special case of correlated Gaussian responses, the parameters for the cluster-specific model obtained for binary data describe different effects on the responses compared to those obtained from the population-averaged model. For longitudinal data, the parameters of a cluster-specific model describe how a specific individual's probability of a response depends on the covariates. The decision to use either of these modelling methods depends on the questions of interest. Cluster-specific models are useful for studying the effects of cluster-varying covariates and when an individual's response rather than an average population's response is the focus. 
The population-averaged model is useful when interest lies in how the average response across clusters changes with covariates. A criticism of this approach is that there may be no individual with the characteristics of the population-averaged model. 2014-09-08T09:50:08Z 2014-09-08T09:50:08Z 2002 Master Thesis Masters MSc http://hdl.handle.net/11427/6950 eng application/pdf Department of Statistical Sciences Faculty of Science University of Cape Town
spellingShingle Mathematical Statistics
Hanslo, Monique
Techniques for handling clustered binary data
thesis_degree_str Master's
title Techniques for handling clustered binary data
title_full Techniques for handling clustered binary data
title_fullStr Techniques for handling clustered binary data
title_full_unstemmed Techniques for handling clustered binary data
title_short Techniques for handling clustered binary data
title_sort techniques for handling clustered binary data
topic Mathematical Statistics
url http://hdl.handle.net/11427/6950
work_keys_str_mv AT hanslomonique techniquesforhandlingclusteredbinarydata