Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3

Thesis (PhD (Information Technology))--University of Pretoria, 2008.

Bibliographic Details
Other Authors: Bothma, T.J.D. (Theodorus Jan Daniel)
Format: Thesis
Published: University of Pretoria 2013
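Subjects: Online analytical processing (OLAP); XML; Hebrew Bible; Three-dimensional array; Visualisation; Computational linguistics; Text data mining; Data warehousing; Database management; Round-tripping; UCTD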
_version_ 1867613526060695552
access_status_str Open Access
author2 Bothma, T.J.D. (Theodorus Jan Daniel)
author_browse Bothma, T.J.D. (Theodorus Jan Daniel)
author_facet Bothma, T.J.D. (Theodorus Jan Daniel)
collection Thesis
dc_rights_str_mv ©University of Pretoria 2008 B23/
description Thesis (PhD (Information Technology))--University of Pretoria, 2008.
format Thesis
id oai:repository.up.ac.za:2263/26750
institution University of Pretoria (South Africa)
last_indexed 2026-06-10T12:37:32.711Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository
publishDate 2013
publishDateRange 2013
publishDateSort 2013
publisher University of Pretoria
publisherStr University of Pretoria
record_format dspace
source_str UPSpace — University of Pretoria Institutional Repository
spelling oai:repository.up.ac.za:2263/26750 Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3 Bothma, T.J.D. (Theodorus Jan Daniel) jan.kroeze@gmail.com Matthee, Machdel C. Kroeze, J.H. (Jan Hendrik) Online analytical processing (olap) Xml Hebrew bible Threedimensional array Visualisation Computational linguistics Text data mining Data warehousing Database management Round-tripping UCTD Thesis (PhD (Information Technology))--University of Pretoria, 2008. The thesis discusses a series of related techniques that prepare and transform raw linguistic data for advanced processing in order to unveil hidden grammatical patterns. A three-dimensional array is identified as a suitable data structure for building a data cube that captures multidimensional linguistic data in a computer's temporary storage. It also enables online analytical processing operations, such as slicing, to be executed on this data cube in order to reveal various subsets and presentations of the data. XML is investigated as a suitable mark-up language for permanently storing such an exploitable databank of Biblical Hebrew linguistic data. This concept is illustrated by tagging a phonetic transcription of Genesis 1:1-2:3 on various linguistic levels and manipulating this databank. Transferring the data set between an XML file and a three-dimensional array creates a stable environment that allows editing and advanced processing of the data in order to confirm existing knowledge or to mine for new, as yet undiscovered, linguistic features. Two experiments are executed to demonstrate possible text-mining procedures. Finally, visualisation is discussed as a technique that enhances interaction between the human researcher and the computerised technologies supporting the process of knowledge creation.
Although the data set is very small, there are encouraging indications that the compilation and analysis of aggregate linguistic data may help linguists to perform rigorous research, for example regarding the definitions of semantic functions and the mapping of these functions onto the syntactic module. Information Science unrestricted 2013-09-07T07:36:38Z 2008-09-08 2013-09-07T07:36:38Z 2008-09-02 2008-09-08 2008-07-28 Thesis 2008 B23/eo http://hdl.handle.net/2263/26750 http://upetd.up.ac.za/thesis/available/etd-07282008-121520/ ©University of Pretoria 2008 B23/ application/pdf application/pdf application/pdf application/pdf application/pdf application/pdf application/pdf application/pdf application/pdf application/pdf application/pdf application/octet-stream University of Pretoria
spellingShingle Online analytical processing (olap)
Xml
Hebrew bible
Threedimensional array
Visualisation
Computational linguistics
Text data mining
Data warehousing
Database management
Round-tripping
UCTD
Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3
title Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3
title_full Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3
title_fullStr Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3
title_full_unstemmed Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3
title_short Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3
title_sort developing an xml based exploitable linguistic database of the hebrew text of gen 1 1 2 3
topic Online analytical processing (olap)
Xml
Hebrew bible
Threedimensional array
Visualisation
Computational linguistics
Text data mining
Data warehousing
Database management
Round-tripping
UCTD
url http://hdl.handle.net/2263/26750
http://upetd.up.ac.za/thesis/available/etd-07282008-121520/
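
Illustrative sketch (not taken from the thesis): the abstract describes moving a tagged text between an XML file and a three-dimensional array, and slicing that array to view one linguistic level at a time. The Python example below shows one way this round trip could look. The element names (text, clause, phrase), the three level names, the clause ids, the transliterations and the function labels are all assumptions chosen for illustration; the thesis may use a different schema, different levels and a different programming language.

```python
import xml.etree.ElementTree as ET

# Hypothetical linguistic levels; these form the third dimension of the cube.
LEVELS = ["transcription", "syntax", "semantics"]

# Hypothetical mark-up for two clauses; the labels are illustrative only.
SAMPLE_XML = """
<text>
  <clause id="Gen01v01a">
    <phrase><transcription>bereshit</transcription><syntax>Adjunct</syntax><semantics>Time</semantics></phrase>
    <phrase><transcription>bara</transcription><syntax>Main verb</syntax><semantics>Action</semantics></phrase>
    <phrase><transcription>elohim</transcription><syntax>Subject</syntax><semantics>Agent</semantics></phrase>
    <phrase><transcription>et hashamayim ve'et ha'arets</transcription><syntax>Object</syntax><semantics>Product</semantics></phrase>
  </clause>
  <clause id="Gen01v03a">
    <phrase><transcription>wayyomer</transcription><syntax>Main verb</syntax><semantics>Action</semantics></phrase>
    <phrase><transcription>elohim</transcription><syntax>Subject</syntax><semantics>Agent</semantics></phrase>
  </clause>
</text>
"""


def xml_to_cube(xml_text):
    """Parse the XML into (clause ids, cube[clause][phrase][level])."""
    root = ET.fromstring(xml_text)
    clauses = root.findall("clause")
    # Pad shorter clauses so every clause has the same number of phrase slots.
    width = max(len(c.findall("phrase")) for c in clauses)
    cube = []
    for clause in clauses:
        rows = [[p.findtext(level, default="") for level in LEVELS]
                for p in clause.findall("phrase")]
        rows += [[""] * len(LEVELS) for _ in range(width - len(rows))]
        cube.append(rows)
    return [c.get("id") for c in clauses], cube


def cube_to_xml(ids, cube):
    """Serialise the cube back to XML, skipping the empty padding slots."""
    root = ET.Element("text")
    for clause_id, rows in zip(ids, cube):
        clause = ET.SubElement(root, "clause", id=clause_id)
        for cells in rows:
            if not any(cells):
                continue  # padding slot, nothing to store
            phrase = ET.SubElement(clause, "phrase")
            for level, value in zip(LEVELS, cells):
                ET.SubElement(phrase, level).text = value
    return ET.tostring(root, encoding="unicode")


def slice_level(cube, level):
    """Slice the cube: one level for every clause and phrase slot."""
    k = LEVELS.index(level)
    return [[cells[k] for cells in rows] for rows in cube]


ids, cube = xml_to_cube(SAMPLE_XML)
for clause_id, row in zip(ids, slice_level(cube, "syntax")):
    print(clause_id, row)
```

Padding every clause to the same number of phrase slots keeps the structure a regular array, so a slice along any one dimension has a fixed shape. The padding is dropped again on export, which is what lets the data pass from XML to array and back without change.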