Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/5927
Title: Semi-Automated Data-Driven Methods to Support Ontology Development: A Case Study on a Rehabilitation Therapy Ontology
Authors: Halawani, Mohammad K.
Issue Date: 2023
Publisher: Newcastle University
Abstract: In the fields of computer and information sciences, data need to be represented in order to be accessible by machines for processing, sharing and presenting in human readable format. This is achieved through different techniques and levels of data representation, such as databases. An ontology is a technique for knowledge representation that is machine accessible. It represents concepts about knowledge and the relationships between them in a particular domain of interest. It models concepts within a specific scope in the domain of discourse. The concepts refer to real-world objects or abstract ideas in the domain, which are identified by different domain terminologies. Therefore, an ontology can be seen as a controlled vocabulary of terms and relations. Ontology development is expensive and requires significant effort from both domain experts and ontologists. Automating the process usually produces unsatisfactory results and involves knowledge acquisition, which is intrinsically difficult. Instead, this project investigates semi-automated techniques for bootstrapping and supporting data-driven ontology development; specifically, two methods both using scaffolds, which are then expanded manually. First, scaffolds are employed from existing knowledge resources. The ontology scaffolding technique has previously been trialled in the mitochondrial disease ontology, where knowledge from other existing databases formed the scaffolds. It is used in this project to develop a meteorological ontology of clouds, where scaffolds are manually formed from textual guidelines. In the cloud ontology, the main classifications of clouds form the concepts that act as the scaffolds in the ontology. These scaffolds are used in different ontology patterns to construct the cloud ontology. The intention is that combining ontology scaffolding and patternisation would lead to more robust development and easier maintenance of the ontology.
Second, in the absence of knowledge resources with which to form the scaffolds, we use bootstrapping from textual resources. The ontology bootstrapping technique builds the ontology from different textual resources. Rehabilitation therapies are hard to describe, measure and compare; unlike pharmacologic therapies, they are not precisely defined. This gives rise to an interesting ontological challenge, because rehabilitation treatments are practice-based, diverse and involve interactions between a therapist, a patient and their environment. Therefore, for this project, the domain of rehabilitation was used as a case study for building a rehabilitation therapy ontology (RTO). The result of the project is a methodological, semi-automated, data-driven pipeline for building semantic knowledge structures, or graphs, to support the development of ontologies from biomedical literature. The pipeline starts with an initial small set of articles provided by experts in the domain, expands this to a corpus that is relevant and covers the scope of the initial set, applies information retrieval and extraction techniques to extract representative terminologies, and constructs a semantic knowledge graph of concepts and relationships from the representative terminologies and their lexical and semantic meanings in the literature. This knowledge structure can be used by domain experts and curators as a knowledge resource to bootstrap an ontology rather than starting from scratch. This is similar to ontology scaffolding, with scaffolds being generated from data-driven methods rather than using scaffolds from existing knowledge resources. These scaffolds are initially linked to easily discover semantic relations, and they have a “to do” list ranked by importance so that the curators can bootstrap the ontology in order.
Finally, the project contributes methodological techniques that build knowledge structures to support the development of semantic nets, ontologies, and other knowledge representation formalisms from text-based knowledge resources. Moreover, in the biomedical field, it builds the knowledge structure from a small set of articles in the domain. The approaches, however, are largely independent of any domain; they depend to a large extent on the starting input data. This moves ontology development towards a data-driven approach, embedding the process in advances in big data analytics.
Description: Ph. D. Thesis
URI: http://hdl.handle.net/10443/5927
Appears in Collections: School of Computing

Files in This Item:
File | Description | Size | Format
dspacelicence.pdf | Licence | 43.82 kB | Adobe PDF
Halawani Mohammad Kamal H 160573767- Final Submission ecopy.pdf | Thesis | 9.49 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.