Multi-label feature selection with application to musical instrument recognition

Thesis (PhD)--Stellenbosch University, 2013.

Bibliographic Details
Main Author: Sandrock, Trudie
Other Authors: Steel, S. J.
Format: Thesis
Language: en_ZA
Published: Stellenbosch : Stellenbosch University 2014
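Subjects: Machine learning; Signal processing; Information storage and retrieval systems -- Music; Musical analysis -- Data processing; Musical instruments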
access_status_str Open Access
author Sandrock, Trudie
author2 Steel, S. J.
collection Thesis
dc_rights_str_mv Stellenbosch University
description Thesis (PhD)--Stellenbosch University, 2013.
format Thesis
id oai:scholar.sun.ac.za:10019.1/95550
institution Stellenbosch University (South Africa)
language en_ZA
last_indexed 2026-06-10T12:42:53.367Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from SUNScholar — Stellenbosch University Repository
publishDate 2014
publisher Stellenbosch : Stellenbosch University
record_format dspace
source_str SUNScholar — Stellenbosch University Repository
spelling oai:scholar.sun.ac.za:10019.1/95550
Multi-label feature selection with application to musical instrument recognition
Sandrock, Trudie
Steel, S. J.
Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.
Machine learning
Signal processing
Information storage and retrieval systems -- Music
Musical analysis -- Data processing
Musical instruments
Thesis (PhD)--Stellenbosch University, 2013.

ENGLISH ABSTRACT: An area of data mining and statistics that is currently receiving considerable attention is the field of multi-label learning. Problems in this field are concerned with scenarios where each data case can be associated with a set of labels instead of only one. In this thesis, we review the field of multi-label learning and discuss the lack of suitable benchmark data available for evaluating multi-label algorithms. We propose a technique for simulating multi-label data, which allows good control over different data characteristics and which could be useful for conducting comparative studies in the multi-label field.

We also discuss the explosion in data in recent years, and highlight the need for some form of dimension reduction in order to alleviate some of the challenges presented by working with large datasets. Feature (or variable) selection is one way of achieving dimension reduction, and after a brief discussion of different feature selection techniques, we propose a new technique for feature selection in a multi-label context, based on the concept of independent probes. This technique is empirically evaluated by using simulated multi-label data and it is shown to achieve classification accuracy with a reduced set of features similar to that achieved with a full set of features.

The proposed technique for feature selection is then also applied to the field of music information retrieval (MIR), specifically the problem of musical instrument recognition. An overview of the field of MIR is given, with particular emphasis on the instrument recognition problem. The particular goal of (polyphonic) musical instrument recognition is to automatically identify the instruments playing simultaneously in an audio clip, which is not a simple task. We specifically consider the case of duets – in other words, where two instruments are playing simultaneously – and approach the problem as a multi-label classification one. In our empirical study, we illustrate the complexity of musical instrument data and again show that our proposed feature selection technique is effective in identifying relevant features and thereby reducing the complexity of the dataset without negatively impacting on performance.

AFRIKAANSE OPSOMMING: ’n Area van dataontginning en statistiek wat tans baie aandag ontvang, is die veld van multi-etiket leerteorie. Probleme in hierdie veld beskou scenarios waar elke datageval met ’n stel etikette geassosieer kan word, instede van slegs een. In hierdie skripsie gee ons ’n oorsig oor die veld van multi-etiket leerteorie en bespreek die gebrek aan geskikte standaard datastelle beskikbaar vir die evaluering van multi-etiket algoritmes. Ons stel ’n tegniek vir die simulasie van multi-etiket data voor, wat goeie kontrole oor verskillende data eienskappe bied en wat nuttig kan wees om vergelykende studies in die multi-etiket veld uit te voer.

Ons bespreek ook die onlangse ontploffing in data, en beklemtoon die behoefte aan ’n vorm van dimensie reduksie om sommige van die uitdagings wat deur sulke groot datastelle gestel word die hoof te bied. Veranderlike seleksie is een manier van dimensie reduksie, en na ’n vlugtige bespreking van verskillende veranderlike seleksie tegnieke, stel ons ’n nuwe tegniek vir veranderlike seleksie in ’n multi-etiket konteks voor, gebaseer op die konsep van onafhanklike soek-veranderlikes. Hierdie tegniek word empiries ge-evalueer deur die gebruik van gesimuleerde multi-etiket data en daar word gewys dat dieselfde klassifikasie akkuraatheid behaal kan word met ’n verminderde stel veranderlikes as met die volle stel veranderlikes.

Die voorgestelde tegniek vir veranderlike seleksie word ook toegepas in die veld van musiek dataontginning, spesifiek die probleem van die herkenning van musiekinstrumente. ’n Oorsig van die musiek dataontginning veld word gegee, met spesifieke klem op die herkenning van musiekinstrumente. Die spesifieke doel van (polifoniese) musiekinstrument-herkenning is om instrumente te identifiseer wat saam in ’n oudiosnit speel. Ons oorweeg spesifiek die geval van duette – met ander woorde, waar twee instrumente saam speel – en hanteer die probleem as ’n multi-etiket klassifikasie een. In ons empiriese studie illustreer ons die kompleksiteit van musiekinstrumentdata en wys weereens dat ons voorgestelde veranderlike seleksie tegniek effektief daarin slaag om relevante veranderlikes te identifiseer en sodoende die kompleksiteit van die datastel te verminder sonder ’n negatiewe impak op klassifikasie akkuraatheid.

Doctoral
2014-09-05T10:07:38Z
2014-09-05T10:07:38Z
2013-12
Thesis
http://hdl.handle.net/10019.1/95550
en_ZA
Stellenbosch University
319 p. : ill.
application/pdf
application/pdf
Stellenbosch : Stellenbosch University
title Multi-label feature selection with application to musical instrument recognition
topic Machine learning
Signal processing
Information storage and retrieval systems -- Music
Musical analysis -- Data processing
Musical instruments
url http://hdl.handle.net/10019.1/95550