Data-driven approaches for formal synthesis of cyber-physical systems

Kazemi Mehrabadi, Milad

Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/5930

Title:	Data-driven approaches for formal synthesis of cyber-physical systems
Authors:	Kazemi Mehrabadi, Milad
Issue Date:	2023
Publisher:	Newcastle University
Abstract:	The traditional view in control theory connects sensing, actuation, and computation in a feedback loop to provide stability, performance, and robustness. In recent years, the control community has started looking at controlling systems such as trustworthy autonomous systems and networked systems to satisfy complex requirements. The question has then changed to addressing dynamics, interconnection, and computation in a unified and scalable framework against high-level logical requirements, including safety. Such a comprehensive framework is needed especially for the design of safety-critical systems. The requirements on the system’s behaviour can generally be expressed as temporal logic specifications, which state formally how the system should behave as time passes. Examples of logical specifications include always staying in a safe region, reaching a destination within a certain time, and visiting a region infinitely often. A prominent approach for formal control synthesis against logical specifications is to use abstraction-based methods. Because the system dynamics evolve over continuous or hybrid spaces, an abstract model is first constructed that approximates the dynamical system’s behaviour with a simple finite model. Analysing the finite-state model is more tractable than analysing the continuous-state model, and efficient computational methods based on compact data structures are available from the computer science literature. There is a gap between the dynamical behaviour of the original model and that of the abstraction, and this gap must be accounted for to guarantee that the specification is satisfied on the original model. It is generally addressed by ensuring that the abstract model over-approximates the behaviour of the original model or by making the specification more conservative.
This thesis aims to push the boundaries of abstraction-based methods by making them applicable to large-scale systems that operate in an uncertain environment, using data-driven compositional learning approaches. Formal abstraction-based synthesis schemes rely on a precise mathematical model of the system to build a finite-state abstract model. Their usage is limited to small-scale models because the finite abstract model is generally constructed by state space discretisation. This thesis addresses these limitations through the following contributions.
The first contribution of this thesis is to make abstraction-based schemes applicable when the system dynamics are unknown. We study the formal synthesis of controllers for continuous-space systems with unknown dynamics to satisfy requirements expressed as linear temporal logic (LTL) formulas. We propose a data-driven approach that computes the growth bound of the system using a finite number of trajectories. The growth bound bounds the distance between trajectories started from different initial states. The growth bound and the sampled trajectories are used to construct the abstraction and synthesise a controller. Our approach casts the computation of the growth bound as a robust convex optimisation program (RCP). Since the unknown dynamics appear in the optimisation, we formulate a scenario convex program (SCP) corresponding to the RCP using a finite number of sampled trajectories. We establish a sample complexity result that gives a lower bound on the number of sampled trajectories needed to guarantee, with a given confidence, the correctness of the growth bound computed from the SCP.
The second contribution of this thesis is to address the scalability of abstraction-based methods for systems that are influenced by random uncertainties. We design model-free reinforcement learning methods to satisfy temporal properties on unknown stochastic systems with continuous state spaces. We show how reinforcement learning (RL) can be applied to compute finite-memory, deterministic policies using only the paths of the stochastic process. We address properties expressed in LTL and give a path-dependent reward function that is maximised by the RL algorithm. We develop the assumptions and theory required for the learned policy to converge to the optimal policy in the continuous state space.
The third contribution of this thesis is to provide a formal compositional synthesis approach designed for large-scale interconnected systems. We introduce a novel RL scheme to synthesise policies for networks of continuous-space stochastic control systems with unknown dynamics. The proposed compositional framework applies model-free two-player RL in an assume-guarantee fashion and compositionally computes strategies for continuous-space interconnected systems without explicitly constructing their finite-state abstractions. This approach gives a guaranteed lower bound on the probability that the interconnected system satisfies the property, based on the corresponding probabilities for the individual controllers over the subsystems.
As our last contribution, we address the use of average-reward RL for controller synthesis with formal convergence guarantees. Previous approaches rely on discounted RL, with formal guarantees that hold only when the discounting factor converges to one. Discounted RL prioritises the short-term behaviour of the system over its long-term performance. To satisfy an LTL property, the discounting factor must be chosen close to one, which leads to instability of the learning algorithms. An alternative to discounted RL is to use the average objective without discounting, which inherently focuses on the system’s long-term behaviour. We restrict our attention to specifications corresponding to absolute liveness properties (i.e., those that cannot be invalidated by a finite prefix). We propose a translation from absolute liveness properties to an average-reward objective for RL. This reduction can be made on the fly without full knowledge of the system, thereby enabling the use of model-free RL algorithms.
The contributions made during the course of this PhD enable formal control synthesis against high-level logical specifications on larger classes of systems. This is achieved by designing novel data-driven and learning methods with proper formal (convergence or closeness) guarantees.
Description:	Ph. D. Thesis.
URI:	http://hdl.handle.net/10443/5930
Appears in Collections:	School of Computing

Files in This Item:

File	Description	Size	Format
Kazemi Milad ecopy 180672684.pdf	Thesis	7.33 MB	Adobe PDF
dspacelicence.pdf	Licence	43.82 kB	Adobe PDF
