Comparison of methods to calculate measures of inequality based on interval data

Thesis (MComm)—Stellenbosch University, 2015.

Bibliographic Details
Main Author: Neethling, Willem Francois
Other Authors: De Wet, Tertius
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2015
_version_ 1867613922043887616
access_status_str Open Access
author Neethling, Willem Francois
author2 De Wet, Tertius
author_browse De Wet, Tertius
Neethling, Willem Francois
author_facet De Wet, Tertius
Neethling, Willem Francois
author_sort Neethling, Willem Francois
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (MComm)—Stellenbosch University, 2015.
format Thesis
id oai:scholar.sun.ac.za:10019.1/97780
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:43:50.076Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2015
publishDateRange 2015
publishDateSort 2015
publisher Stellenbosch : Stellenbosch University
publisherStr Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/97780 Comparison of methods to calculate measures of inequality based on interval data Neethling, Willem Francois De Wet, Tertius Neethling, Ariane Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science Interval data UCTD Income distribution -- Statistical methods Thesis (MComm)—Stellenbosch University, 2015. ENGLISH ABSTRACT: In recent decades, economists and sociologists have taken an increasing interest in the study of income attainment and income inequality. Many of these studies have used census data, but social surveys have also increasingly been utilised as sources for these analyses. In these surveys, respondents’ incomes are most often not measured in true amounts, but in categories of which the last category is open-ended. The reason is that income is seen as sensitive data and/or is sometimes difficult to reveal. Continuous data divided into categories is often more difficult to work with than ungrouped data. In this study, we compare different methods to convert grouped data to data where each observation has a specific value or point. For some methods, all the observations in an interval receive the same value; an example is the midpoint method, where all the observations in an interval are assigned the midpoint. Other methods include random methods, where each observation receives a random point between the lower and upper bound of the interval. For some methods, random and non-random, a distribution is fitted to the data and a value is calculated according to the distribution. The non-random methods that we use are the midpoint-, Pareto means- and lognormal means methods; the random methods are the random midpoint-, random Pareto- and random lognormal methods. Since our focus falls on income data, which usually follows a heavy-tailed distribution, we use the Pareto and lognormal distributions in our methods. 
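The midpoint and random point-assignment ideas described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the function names, the `(lower, upper, count)` representation and the sample intervals are hypothetical, the random draw is assumed to be uniform within each interval, and the open-ended top category is assumed to have been given a finite upper bound beforehand.

```python
import random

def midpoint_values(intervals):
    """Assign every observation in an interval the interval midpoint.

    intervals: list of (lower, upper, count) tuples (hypothetical layout).
    """
    values = []
    for lower, upper, count in intervals:
        values.extend([(lower + upper) / 2.0] * count)
    return values

def random_uniform_values(intervals, seed=0):
    """Assign each observation a random point between the interval bounds.

    Assumption: points are drawn uniformly; the source only says "random".
    """
    rng = random.Random(seed)
    values = []
    for lower, upper, count in intervals:
        values.extend(rng.uniform(lower, upper) for _ in range(count))
    return values

# Hypothetical grouped income data; the last interval is capped at an
# assumed upper bound because the real top category is open-ended.
intervals = [(0, 1000, 3), (1000, 5000, 4), (5000, 20000, 2)]
mid = midpoint_values(intervals)
rnd = random_uniform_values(intervals)
```

The Pareto and lognormal variants mentioned in the abstract would replace the midpoint or the uniform draw with a value derived from a fitted distribution; they are not shown here.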
The above-mentioned methods are applied to simulated and real datasets. The raw values of these datasets are known, and are categorised into intervals. These methods are then applied to the interval data to reconvert the interval data to point data. To test the effectiveness of these methods, we calculate some measures of inequality. The measures considered are the Gini coefficient, the quintile share ratio (QSR), the Theil measure and the Atkinson measure. The estimated measures of inequality, calculated from each dataset obtained through these methods, are then compared to the true measures of inequality. AFRIKAANS ABSTRACT (translated): Over the past decades, economists and sociologists have shown an increasing interest in studies of income attainment and income inequality. Many of these studies use census data, but the use of social surveys as sources for these analyses has also increased markedly. In these surveys, a person's income is mostly indicated in categories, of which the last interval is open, rather than as numerical values. The reason for the categories is that income data is regarded as sensitive and is sometimes also difficult to report. Continuous data divided into categories is usually more difficult to work with than ungrouped data. In this study, several methods are compared for converting grouped data to data in which each observation has a numerical value. For some of the methods, the same value is given to all the observations in an interval, for example the midpoint method, where each observation receives the midpoint of the interval. Other methods are random methods, where each observation receives a random value between the lower and upper bounds of the interval. For some of the methods, random and non-random, a distribution is fitted to the data and a value is calculated according to the distribution.
The non-random methods used are the midpoint, Pareto means and lognormal means methods, and the random methods are the random midpoint, random Pareto and random lognormal methods. Our focus is on income data, which usually follows a heavy-tailed distribution, and for this reason we use the Pareto and lognormal distributions in our methods. All the methods are applied to simulated and real datasets. The raw values of the datasets are known and are categorised into intervals. The methods are then applied to the interval data to convert it back to data in which each observation has a numerical value. To test the effectiveness of the methods, several measures of inequality are calculated. The measures include the Gini coefficient, the quintile share ratio (QSR), and the Theil and Atkinson measures. The estimated measures of inequality, calculated from the datasets obtained through the methods, are then compared with the true measures of inequality. Masters 2015-12-14T07:42:24Z 2015-12-14T07:42:24Z 2015-12 Thesis http://hdl.handle.net/10019.1/97780 en_ZA Stellenbosch University 167 pages application/pdf Stellenbosch : Stellenbosch University
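The four inequality measures named in the abstract can be illustrated with common textbook sample formulas. The sketch below is an assumption about which variants are meant, not a reproduction of the thesis's definitions: the function names are hypothetical, the Theil measure is taken to be the Theil T index, the Atkinson aversion parameter `epsilon` is a free choice, the QSR uses a simple "top fifth over bottom fifth" split, and all incomes are assumed to be strictly positive.

```python
import math

def gini(x):
    """Gini coefficient from the rank-weighted sum of sorted incomes."""
    xs = sorted(x)
    n = len(xs)
    weighted = sum(i * v for i, v in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1.0) / n

def quintile_share_ratio(x):
    """Total income of the top fifth divided by that of the bottom fifth.

    Simplification: the fifth is taken as n // 5 observations (at least 1).
    """
    xs = sorted(x)
    k = max(1, len(xs) // 5)
    return sum(xs[-k:]) / sum(xs[:k])

def theil_t(x):
    """Theil T index; requires strictly positive incomes."""
    mu = sum(x) / len(x)
    return sum((v / mu) * math.log(v / mu) for v in x) / len(x)

def atkinson(x, epsilon=0.5):
    """Atkinson index for inequality-aversion parameter epsilon >= 0."""
    n = len(x)
    mu = sum(x) / n
    if epsilon == 1:
        # Limiting case: one minus geometric mean over arithmetic mean.
        ede = math.exp(sum(math.log(v) for v in x) / n)
    else:
        ede = (sum(v ** (1 - epsilon) for v in x) / n) ** (1 / (1 - epsilon))
    return 1.0 - ede / mu

# Hypothetical incomes, e.g. point values recovered from interval data.
incomes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
measures = {
    "gini": gini(incomes),
    "qsr": quintile_share_ratio(incomes),
    "theil": theil_t(incomes),
    "atkinson": atkinson(incomes),
}
```

Computing these measures once on the raw values and once on the values recovered from the intervals gives the kind of "estimated versus true" comparison the abstract describes.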
spellingShingle Interval data
UCTD
Income distribution -- Statistical methods
Neethling, Willem Francois
Comparison of methods to calculate measures of inequality based on interval data
title Comparison of methods to calculate measures of inequality based on interval data
title_full Comparison of methods to calculate measures of inequality based on interval data
title_fullStr Comparison of methods to calculate measures of inequality based on interval data
title_full_unstemmed Comparison of methods to calculate measures of inequality based on interval data
title_short Comparison of methods to calculate measures of inequality based on interval data
title_sort comparison of methods to calculate measures of inequality based on interval data
topic Interval data
UCTD
Income distribution -- Statistical methods
url http://hdl.handle.net/10019.1/97780
work_keys_str_mv AT neethlingwillemfrancois comparisonofmethodstocalculatemeasuresofinequalitybasedonintervaldata