Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/4922
Title: Scalable Bayesian Time Series Modelling for Streaming Data
Authors: Law, Jonathan
Issue Date: 2019
Publisher: Newcastle University
Abstract: Ubiquitous cheap processing power and reduced storage costs have led to increased deployment of connected devices used to collect and store information about their surroundings. Examples include environmental sensors used to measure pollution levels and temperature, or vibration sensors deployed on machinery to detect faults. This data is often streamed in real time to cloud services and used to make decisions, such as when to perform maintenance on critical machinery, and to monitor systems, such as how interventions to reduce pollution are performing. The data recorded at these sensors is unbounded, heterogeneous and often inaccurate, recorded with different sampling frequencies, and often on irregular time grids. Connection problems or hardware faults can cause information to be missing for days at a time. Additionally, multiple co-located sensors can report different readings for the same process. A flexible class of dynamic models can be used to ameliorate these issues and to smooth and interpolate the data. Irregularly observed time series can be conveniently modelled using state space models with a continuous-time latent state represented by diffusion processes. In order to model the wide array of different environmental sensors, the observation distributions of these dynamic models are flexible; in all cases particle filtering methods can be used for inference, and in some cases the exact Kalman filter can be used. The models, along with a binary composition operator, form a semigroup, making model composition and reuse straightforward. Heteroskedastic time series are accounted for by using a factor structure to model a full-rank time-dependent system noise matrix for the dynamic models, which can account for changes in variance and in the correlation structure between the time series in a multivariate model.
Finally, to model multiple nearby sensors, a dynamic model is used for a time-dependent common mean, and a time-invariant Gaussian process accounts for the spatial variation between the sensors. Functional programming in Scala is used to implement these time series models. Functional programming provides a unified, principled API (application programming interface) for interacting with different collection types using higher-order functions. This, combined with the type-class pattern, makes it possible to write inference algorithms once, deploy them locally using serial collections, and later apply them to unbounded time series data with libraries such as Akka Streams, using techniques from functional reactive programming.
Description: Ph. D. Thesis
URI: http://theses.ncl.ac.uk/jspui/handle/10443/4922
Appears in Collections: School of Mathematics and Statistics

Files in This Item:
File                Description   Size       Format
Law J 2019.pdf      Thesis        9.01 MB    Adobe PDF
dspacelicence.pdf   Licence       43.82 kB   Adobe PDF

