Named Entity Recognition from Biomedical Text

As vast amounts of unstructured data become available digitally, computer-based methods are needed to extract relevant and meaningful information. Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories....

Bibliographic Details
Main Author: Guirguis, Maged
Format: Thesis
Published: AUC Knowledge Fountain 2023
_version_ 1867613421942341632
access_status_str Open Access
author Guirguis, Maged
author_browse Guirguis, Maged
author_facet Guirguis, Maged
author_sort Guirguis, Maged
collection Thesis
description As vast amounts of unstructured data become available digitally, computer-based methods are needed to extract relevant and meaningful information. Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories. Despite the existence of numerous well-established NER methods, the biomedical domain remains under-studied. The objective of this research is to identify an efficient technique for NER on biomedical data. This is done by investigating deep learning technologies, namely the pre-trained BERT [1] model and its variants SciBERT [2] and BioBERT [3]. Because preprocessing the data before training influences model performance, several preprocessing rules are also investigated to monitor their effect. Our model outperforms vanilla BERT and BioBERT, achieving Precision: 66.20%, Recall: 98.96%, F1: 79.33%.
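The reported F1 can be cross-checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the variable names are illustrative and are not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the description above, expressed as fractions.
reported_precision = 0.6620
reported_recall = 0.9896

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 79.33%"
```

Under that assumption, the three reported figures are mutually consistent.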
format Thesis
id oai:fount.aucegypt.edu:etds-3014
institution American University in Cairo (Egypt)
last_indexed 2026-06-10T12:35:53.165Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from AUC Knowledge Fountain — bepress
publishDate 2023
publishDateRange 2023
publishDateSort 2023
publisher AUC Knowledge Fountain
publisherStr AUC Knowledge Fountain
record_format dspace
source_str AUC Knowledge Fountain — bepress
spelling oai:fount.aucegypt.edu:etds-3014 Named Entity Recognition from Biomedical Text Guirguis, Maged As vast amounts of unstructured data become available digitally, computer-based methods are needed to extract relevant and meaningful information. Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories. Despite the existence of numerous well-established NER methods, the biomedical domain remains under-studied. The objective of this research is to identify an efficient technique for NER on biomedical data. This is done by investigating deep learning technologies, namely the pre-trained BERT [1] model and its variants SciBERT [2] and BioBERT [3]. Because preprocessing the data before training influences model performance, several preprocessing rules are also investigated to monitor their effect. Our model outperforms vanilla BERT and BioBERT, achieving Precision: 66.20%, Recall: 98.96%, F1: 79.33%. 2023-02-15T08:00:00Z thesis application/pdf https://fount.aucegypt.edu/etds/1983 https://fount.aucegypt.edu/context/etds/article/3014/viewcontent/Maged_Guirguis_Thesis.pdf Theses and Dissertations AUC Knowledge Fountain Keywords: NER; named entity recognition; chemprot; drugprot; BERT; SciBERT; BioBERT Data Science
spellingShingle Keywords: NER; named entity recognition; chemprot; drugprot; BERT; SciBERT; BioBERT
Data Science
Guirguis, Maged
Named Entity Recognition from Biomedical Text
title Named Entity Recognition from Biomedical Text
title_full Named Entity Recognition from Biomedical Text
title_fullStr Named Entity Recognition from Biomedical Text
title_full_unstemmed Named Entity Recognition from Biomedical Text
title_short Named Entity Recognition from Biomedical Text
title_sort named entity recognition from biomedical text
topic Keywords: NER; named entity recognition; chemprot; drugprot; BERT; SciBERT; BioBERT
Data Science
url https://fount.aucegypt.edu/etds/1983
https://fount.aucegypt.edu/context/etds/article/3014/viewcontent/Maged_Guirguis_Thesis.pdf
work_keys_str_mv AT guirguismaged namedentityrecognitionfrombiomedicaltext