Unsupervised discovery of relations for analysis of textual data in digital forensics

Dissertation (MSc)--University of Pretoria, 2010.

Bibliographic Details
Other Authors: Engelbrecht, Andries P.
Format: Thesis
Published: University of Pretoria 2013
_version_ 1867613520491708416
access_status_str Open Access
author2 Engelbrecht, Andries P.
author_browse Engelbrecht, Andries P.
author_facet Engelbrecht, Andries P.
collection Thesis
dc_rights_str_mv © 2009, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
description Dissertation (MSc)--University of Pretoria, 2010.
format Thesis
id oai:repository.up.ac.za:2263/27479
institution University of Pretoria (South Africa)
last_indexed 2026-06-10T12:37:27.405Z
license_str Other — see source repository
provenance_str_mv Harvested via OAI-PMH from UPSpace — University of Pretoria Institutional Repository
publishDate 2013
publishDateRange 2013
publishDateSort 2013
publisher University of Pretoria
publisherStr University of Pretoria
record_format dspace
source_str UPSpace — University of Pretoria Institutional Repository
spelling oai:repository.up.ac.za:2263/27479 Unsupervised discovery of relations for analysis of textual data in digital forensics Engelbrecht, Andries P. anita.louis@gmail.com Louis, Anita Lily Text analysis Text mining Information extraction Relation discovery Digital forensics UCTD Dissertation (MSc)--University of Pretoria, 2010. This dissertation addresses the problem of analysing digital data in digital forensics. It will be shown that text mining methods can be adapted and applied to digital forensics to aid analysts to more quickly, efficiently and accurately analyse data to reveal truly useful information. Investigators who wish to utilise digital evidence must examine and organise the data to piece together events and facts of a crime. The difficulty with finding relevant information quickly using the current tools and methods is that these tools rely very heavily on background knowledge for query terms and do not fully utilise the content of the data. A novel framework in which to perform evidence discovery is proposed in order to reduce the quantity of data to be analysed, aid the analysts' exploration of the data and enhance the intelligibility of the presentation of the data. The framework combines information extraction techniques with visual exploration techniques to provide a novel approach to performing evidence discovery, in the form of an evidence discovery system. By utilising unrestricted, unsupervised information extraction techniques, the investigator does not require input queries or keywords for searching, thus enabling the investigator to analyse portions of the data that may not have been identified by keyword searches. The evidence discovery system produces text graphs of the most important concepts and associations extracted from the full text to establish ties between the concepts and provide an overview and general representation of the text. 
Through an interactive visual interface the investigator can explore the data to identify suspects, events and the relations between suspects. Two models are proposed for performing the relation extraction process of the evidence discovery framework. The first model takes a statistical approach to discovering relations based on co-occurrences of complex concepts. The second model utilises a linguistic approach using named entity extraction and information extraction patterns. A preliminary study was performed to assess the usefulness of a text mining approach to digital forensics as against the traditional information retrieval approach. It was concluded that the novel approach to text analysis for evidence discovery presented in this dissertation is a viable and promising approach. The preliminary experiment showed that the results obtained from the evidence discovery system, using either of the relation extraction models, are sensible and useful. The approach advocated in this dissertation can therefore be successfully applied to the analysis of textual data for digital forensics Copyright Computer Science unrestricted 2013-09-07T11:38:24Z 2010-08-23 2013-09-07T11:38:24Z 2010-04-12 2010-08-23 2010-08-23 Dissertation Louis, AL 2009, Unsupervised discovery of relations for analysis of textual data in digital forensics, MSc dissertation, University of Pretoria, Pretoria, viewed yymmdd < http://hdl.handle.net/2263/27479 > E10/449/gm http://hdl.handle.net/2263/27479 http://upetd.up.ac.za/thesis/available/etd-08232010-193559/ © 2009, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. application/pdf University of Pretoria
spellingShingle Text analysis
Text mining
Information extraction
Relation discovery
Digital forensics
UCTD
Unsupervised discovery of relations for analysis of textual data in digital forensics
title Unsupervised discovery of relations for analysis of textual data in digital forensics
title_full Unsupervised discovery of relations for analysis of textual data in digital forensics
title_fullStr Unsupervised discovery of relations for analysis of textual data in digital forensics
title_full_unstemmed Unsupervised discovery of relations for analysis of textual data in digital forensics
title_short Unsupervised discovery of relations for analysis of textual data in digital forensics
title_sort unsupervised discovery of relations for analysis of textual data in digital forensics
topic Text analysis
Text mining
Information extraction
Relation discovery
Digital forensics
UCTD
url http://hdl.handle.net/2263/27479
http://upetd.up.ac.za/thesis/available/etd-08232010-193559/