Abstract: In this work, we propose a novel method to find temporal properties that lead to unexpected behaviors from a labeled dataset. We express these properties in past time Signal Temporal Logic (ptSTL). First, we present a novel approach for finding parameters of a template ptSTL formula, which extends the results on monotonicity based parameter synthesis. The proposed method optimizes a given monotone criterion while bounding an error. Then, we employ the parameter synthesis method in an iterative unguided formula synthesis framework. In particular, we combine optimized formulas iteratively to describe the causes of the labeled events while bounding the error. We illustrate the proposed framework on two examples.
What problem does this paper attempt to address?
This paper addresses the problem of automatically finding the temporal properties that lead to unexpected behaviors from a labeled dataset, and expressing these properties as past time Signal Temporal Logic (ptSTL) formulas. Specifically, the authors propose a novel method to synthesize ptSTL formulas so that, given a dataset of labeled system trajectories, the main causes of the labels are captured.
### Problem Description
Designing and implementing cyber-physical systems (CPS) for complex tasks is a difficult and error-prone process. Models are usually complex and consist of various sub-modules, such as Simulink models (MATLAB, 2016). Once a model is developed, its trajectories are checked against the specifications. Although it is relatively easy to simulate the system and mark unexpected behaviors, it is very challenging to locate the errors in the model that cause them. For this reason, the paper proposes a novel method to automatically find the temporal properties that lead to unexpected behaviors from labeled system trajectories. The generated properties can provide insight into the underlying causes and help design engineers identify the corresponding modeling errors.
### Solution Overview
The authors use Signal Temporal Logic (STL) formulas to express these temporal properties. STL is a rich specification language that extends linear temporal logic (Baier et al., 2008) and is used to describe the properties of real-valued signals (Donze, 2013). Due to its expressive power and efficient algorithms, STL is widely used in different fields, including runtime verification (Bartocci et al., 2018), time-series data analysis (Vazquez-Chanlatte et al., 2017), and formal control (Raman et al., 2015). This paper focuses on the past time fragment of STL (past time STL, ptSTL), in which only past time temporal operators are allowed.
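To make the past time fragment concrete, the following is a minimal sketch of Boolean semantics for two common bounded past time operators over a discrete-time signal. The function names (`predicate_gt`, `previously`, `always_past`) are hypothetical, and the handling of the trajectory start (windows are clipped, and a window lying entirely before step 0 evaluates to false) is an assumption, not necessarily the convention used in the paper.

```python
from typing import List, Sequence


def predicate_gt(signal: Sequence[float], c: float) -> List[bool]:
    """Atomic predicate x > c, evaluated at every time step."""
    return [x > c for x in signal]


def previously(phi: Sequence[bool], a: int, b: int) -> List[bool]:
    """P_[a,b] phi: phi held at some step in the window [t-b, t-a].

    Assumption: the window is clipped at step 0, and an empty window is false.
    """
    out = []
    for t in range(len(phi)):
        lo, hi = max(0, t - b), t - a
        out.append(hi >= 0 and any(phi[lo:hi + 1]))
    return out


def always_past(phi: Sequence[bool], a: int, b: int) -> List[bool]:
    """A_[a,b] phi: phi held at every step in the window [t-b, t-a].

    Assumption: the window is clipped at step 0, and an empty window is false.
    """
    out = []
    for t in range(len(phi)):
        lo, hi = max(0, t - b), t - a
        out.append(hi >= 0 and all(phi[lo:hi + 1]))
    return out


# Example: "x exceeded 30 between one and two steps ago".
x = [10, 35, 20, 20, 20]
was_high = previously(predicate_gt(x, 30), 1, 2)
```

Because both operators look only at steps up to the current one, a formula built from them can be evaluated online as the trajectory unfolds.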
### Specific Problem
Given a labeled trajectory dataset (such as test results), find a ptSTL formula such that the evaluation of this formula along the trajectory can mimic the given labels. The goal is to use the temporal semantics to capture the main causes of the labels as ptSTL formulas.
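How well a formula mimics the labels can be quantified by comparing its truth value with the label at each time step. The sketch below assumes the formula has already been evaluated to a list of Booleans; `confusion_counts` and `accuracy` are hypothetical helper names, and the paper may measure the mismatch differently.

```python
from typing import Dict, Sequence


def confusion_counts(pred: Sequence[bool], labels: Sequence[int]) -> Dict[str, int]:
    """Count agreements and disagreements between formula values and 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, l in zip(pred, labels):
        if p and l == 1:
            counts["tp"] += 1  # formula true, label 1
        elif p and l == 0:
            counts["fp"] += 1  # formula true, label 0
        elif not p and l == 1:
            counts["fn"] += 1  # formula false, label 1
        else:
            counts["tn"] += 1  # formula false, label 0
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Fraction of time steps where the formula value agrees with the label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


example = confusion_counts([True, True, False, False], [1, 0, 1, 0])
```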
### Method Framework
The authors propose an iterative framework to efficiently synthesize ptSTL formulas. The framework performs parameter synthesis for every parametric ptSTL formula up to a given formula length and describes the labels in the dataset by combining the optimized formulas. Because the candidate templates are enumerated rather than supplied by the user, the method does not require expert guidance. For parameter synthesis, the authors propose a new method that exploits the monotonicity properties of the parameters. It does not require an ordering among the parameters, efficiently generates parameters that optimize a given monotone criterion while bounding an error, and improves the result through an iterative process.
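The role of monotonicity can be illustrated on the simplest possible template, `x > c` with a single parameter `c`. Raising `c` can only shrink the set of time steps where the formula holds, so both the true positive and false positive counts are non-increasing in `c`. The smallest `c` that respects a false positive bound therefore maximizes the true positives, and it can be found by binary search. This is a single-parameter sketch under those assumptions, not the authors' multi-parameter algorithm; `synthesize_threshold`, `candidates`, and `max_fp` are hypothetical names.

```python
from typing import Optional, Sequence, Tuple


def synthesize_threshold(
    signal: Sequence[float],
    labels: Sequence[int],
    candidates: Sequence[float],
    max_fp: int,
) -> Optional[Tuple[float, int]]:
    """Return (c, true_positives) maximizing TP for 'x > c' with FP <= max_fp.

    Returns None if no candidate threshold satisfies the false positive bound.
    """
    cands = sorted(candidates)

    def counts(c: float) -> Tuple[int, int]:
        tp = sum(1 for x, l in zip(signal, labels) if x > c and l == 1)
        fp = sum(1 for x, l in zip(signal, labels) if x > c and l == 0)
        return tp, fp

    lo, hi = 0, len(cands) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        tp, fp = counts(cands[mid])
        if fp <= max_fp:
            best = (cands[mid], tp)
            hi = mid - 1  # feasible: a smaller threshold may add true positives
        else:
            lo = mid + 1  # infeasible: a larger threshold removes false positives
    return best


result = synthesize_threshold(
    signal=[10, 20, 30, 40, 50],
    labels=[0, 0, 1, 1, 1],
    candidates=[5, 15, 25, 35, 45],
    max_fp=0,
)
```

Binary search needs only a logarithmic number of formula evaluations in the number of candidate values, which is the kind of saving monotonicity makes possible compared with an exhaustive sweep.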
### Case Study
To evaluate the proposed method, the authors conducted experiments on a traffic system case study. The system contains 6 links and 2 traffic lights (as shown in Figure 4). It was simulated 20 times with random initial conditions, each run lasting 100 steps. During the simulation, the signal values at each time step were randomly generated. The trajectories were labeled according to the following rule:
\[ l_t = \begin{cases}
1 & \text{if } (x_1^t > 30) \\
0 & \text{otherwise}
\end{cases} \]
That is, when the number of vehicles on link 1 exceeds 30 (75% of the capacity), it is marked as 1. The final dataset contains 2,000 data points, of which 456 are marked as 1.
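The labeling rule above amounts to a pointwise threshold on the link 1 signal. A minimal sketch, assuming each trajectory of $x_1$ is available as a list of vehicle counts (the function name `label_trajectory` and the sample values are hypothetical):

```python
from typing import List, Sequence


def label_trajectory(x1: Sequence[float], threshold: float = 30) -> List[int]:
    """Label each time step 1 if the vehicle count on link 1 exceeds the threshold."""
    return [1 if v > threshold else 0 for v in x1]


# Illustrative values only, not data from the paper.
labels = label_trajectory([10, 30, 31, 40])

# Dataset size stated in the text: 20 runs of 100 steps each.
n_points = 20 * 100
positive_fraction = 456 / n_points
```

With 456 of the 2,000 points labeled 1, about 22.8% of the dataset is positive. A value of exactly 30 is labeled 0 because the rule uses a strict inequality.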
### Results
The experimental results show that the method can generate complex ptSTL formulas within a reasonable time, reaching an accuracy of up to 97.46%, which supports its effectiveness and efficiency.