Scaling the development of large ontologies : identitas and hypernormalization

Alshammry, Nizal Khalf

Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/5546

Full metadata record

DC Field	Value	Language
dc.contributor.author	Alshammry, Nizal Khalf	-
dc.date.accessioned	2022-08-23T11:20:46Z	-
dc.date.available	2022-08-23T11:20:46Z	-
dc.date.issued	2021	-
dc.identifier.uri	http://hdl.handle.net/10443/5546	-
dc.description	PhD Thesis	en_US
dc.description.abstract	During the last decade, ontologies have become a fundamental part of the life sciences to build organised computational knowledge. Currently, there are more than 800 biomedical ontologies hosted by the NCBO BioPortal repository. However, the proliferation of ontologies in the biomedical and biological domains has highlighted a number of problems. As ontologies become large, their development and maintenance become more challenging and time-consuming. Therefore, the scalability of ontology development has become problematic. In this thesis, we examine two new approaches that can help address this challenge. First, we consider a new approach to identifiers that could significantly facilitate the scalability of ontologies and overcome some related issues with monotonic, numeric identifiers while remaining semantics-free. Our solutions are described, along with the Identitas library, which allows concurrent development, pronounceability and error checking. The library is integrated into two ontology development environments, Protégé and Tawny-OWL. This thesis also discusses the ways in which current ontological practices could be migrated towards the use of this scheme. Second, we investigate the usage of the hypernormalisation, patternisation and programmatic approaches by asking how we could use them to rebuild the Gene Ontology (GO). The aim of the hypernormalisation and patternisation techniques is to allow the ontology developer to manage its maintainability and evolution. To apply this approach we had to analyse the ontology structure, starting with the Molecular Function Ontology (MFO). The MFO is formed from several large and tangled hierarchies of classes, each of which describes a broad molecular activity. The exploitation of the hypernormalisation approach resulted in the creation of a hypernormalised form of the Transporter Activity (TA) and Catalytic Activity (CA) hierarchies, which together constitute 78% of all classes in MFO. 
The hypernormalised structures of the TA and CA are generated based on developed higher-level patterns and novel content-specific patterns, and exploit ontology logical reasoners. The generated ontologies are robust, easy to maintain and can be developed and extended freely. Although there is a variety of ontology development tools, Tawny-OWL is a programmatic interactive tool for ontology creation and management and provides a set of patterns that explicitly support the creation of a hypernormalised ontology. Finally, the investigation of the hypernormalisation highlighted inconsistent classifications and identified a significant semantic mismatch between GO and the Chemical Entities of Biological Interest (ChEBI). Although both ontologies describe the same real entities, GO often refers to the form most common in biology, while ChEBI is more specific and precise. The use of hypernormalisation forces us to deal with this mismatch; we used the equivalence axioms created by the GO-Plus ontology. To sum up, to address scalability and ease the development of ontologies, we propose a new identifier scheme and investigate the use of the hypernormalisation methodology. Together, the Identitas and the hypernormalisation technique should enable the construction of large-scale ontologies in the future.	en_US
dc.description.sponsorship	Northern Borders University, Saudi Arabia	en_US
dc.language.iso	en	en_US
dc.publisher	Newcastle University	en_US
dc.title	Scaling the development of large ontologies : identitas and hypernormalization	en_US
dc.type	Thesis	en_US
Appears in Collections:	School of Computing

Files in This Item:

File	Description	Size	Format
Alshammry N 2021		4.45 MB	Adobe PDF	View/Open
dspacelicence.pdf		43.82 kB	Adobe PDF	View/Open
