Title: The exploitation of provenance and versioning in the reproduction of e-experiments
Authors: Abang Ibrahim, Dayang Hanani
Issue Date: 2016
Publisher: Newcastle University
Abstract: Reproducibility has long been a cornerstone of science, and is now becoming a key research area for e-Science. This is because it provides a way to validate, and build on, previous results. Underpinning reproducibility in e-Science is provenance, which has the potential to provide scientists with a complete understanding of data generated in e-experiments, including the services that produced and consumed it. This thesis explores the issues in exploiting provenance for reproducibility. Based on this, a reproducibility framework is designed and implemented to allow past experiments to be reproduced. Seven aspects of reproducibility are considered: 1) experiments, 2) reproducibility, 3) provenance, 4) provenance models, 5) provenance and versioning, 6) automatic transformation of provenance to support reproduction, and 7) a reproducibility taxonomy. A key to reproducibility is the provenance model: a data model that structures information about an e-experiment. A review of existing provenance systems shows that the problem caused by services being updated has been neglected. This can have a severe impact on the ability to reproduce experiments and it is therefore argued that the issue of service versioning must be addressed. Even after information on the provenance of an execution, and versioning of services, is captured there is the need for a method to transform this knowledge into a form that allows past experiments to be reproduced: that is another output of this thesis. The thesis focuses on the use of workflow as a means to represent the composition, and to execute experiments. This work explores how workflows can be automatically generated to re-execute past experiments. In order to do this, a transformation algorithm is described that maps a past experiment's execution log data into a workflow format that can be read and processed by the workflow system.
The thesis also introduces a Reproducibility Taxonomy that captures and structures the information required for reproducibility in the presence of versions and provenance.
Description: PhD Thesis
Appears in Collections: School of Computing Science

Files in This Item:
File                            Description   Size       Format
Abang Ibrahim, D.H. 2016.pdf    Thesis        4.32 MB    Adobe PDF
dspacelicence.pdf               Licence       43.82 kB   Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.