Automatically Derived Semantic Scenario Instance Descriptions: Analyzing & Improving the Existing Approach

Authors: Michael Perl and Gabe Cohen
Supervisors: Nicola Kolb, Claudius Jordan
Organization: Technical University of Munich — Chair for Software & Systems Engineering

General Problem and Context

With the international growth in self-driving car development comes a need for new, efficient ways of testing the safety and reliability of these technologies. In particular, generating test data for self-driving vehicles poses significant challenges. One factor limiting the roll-out of self-driving technologies is the difficulty car companies face in arguing for the safe behavior of their systems (UNECE 2020). Self-driving cars would need to complete approximately 5 billion miles of on-road testing to show with 95% confidence that they have a lower failure rate than human drivers (Kalra 2016). For this reason, researchers and developers must consider alternative ways of testing automated vehicles for safety. The solution we worked on was to create simulations of traffic scenarios in which the software of self-driving vehicles can be tested. These simulations can be either created manually or generated automatically from recorded traffic scenarios (Hauer 2020). As Hauer et al. (2020) state, "The goal of scenario-based testing is to identify instances that stress the autonomous driving behavior (e.g., near crashes, abrupt acceleration or deceleration)". Accumulating more traffic scenarios allows vehicle manufacturers to test their self-driving software in a wider variety of situations. This will either demonstrate the strength of such vehicles or reveal bugs that might otherwise have remained hidden.

Description of the Specific Human/Cyber-Physical System Problem

There are two common ways to identify traffic scenarios: a knowledge-based approach and a data-driven approach. In the knowledge-based approach, trained experts invent possible traffic scenarios, while the data-driven approach uses real traffic data to generate the scenarios (Tenbrock 2021). Data-driven scenarios can, for example, be recorded by a drone hovering above an intersection with a bird’s-eye view (inD Dataset 2020). However, creating simulations from real-world drone footage of traffic is not trivial and requires significant care and effort. The resulting simulations will ultimately be used to argue for the safety of these systems, so if the simulations are faulty, the safety of self-driving vehicles cannot be assured. The problem at hand is therefore how to evaluate the adequacy of a set of simulations generated automatically from a publicly available drone dataset of traffic at four different intersections in Germany (inD Dataset 2020). This dataset was captured from bird’s-eye drone footage of the four intersections, and the footage was converted into metadata that includes the timestamps, position, velocity, and acceleration of each vehicle within the bounds of the footage (Bock 2020). Several software tools can recreate traffic scenarios in a simulation, but their output will not be completely accurate.

More precisely, the problem is to develop an approach for assessing how well traffic footage is converted into simulations that are generated from semantic descriptions. Converting drone footage into traffic scenarios that can be simulated opens up many future possibilities, and simulatable traffic scenarios will be an integral part of autonomous driving research. The focus of this research, however, is on assessing the descriptions that are created. As stated before, a faulty simulation could result in a self-feeding error loop.

The Challenges of Reaching a Functional System

Scenario-based testing is promising and is already commonplace in industry for testing autonomous vehicles (Hauer 2020). In this work, a traffic scenario is the complete raw drone footage of an intersection, which can span hours, and a scenario is a small piece of that footage. Each vehicle in the footage is defined once as an ego vehicle. A scenario starts when that ego vehicle enters the designated recording section of the intersection and ends when that vehicle leaves. So, if 10 vehicles enter the intersection over the entire drone footage (or traffic scenario), then there are 10 separate scenarios, each with a different vehicle as its ego vehicle. The term “concrete scenario instances” is used for scenarios observed and recorded from real-world data (Menzel 2018). With the concept of scenarios established, the utility of scenario-based testing becomes clear: by placing autonomous vehicle software in a large number of generated scenario instances, we can observe how the vehicle behaves in these traffic scenarios.
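The splitting of one recording into per-ego-vehicle scenarios can be sketched as follows. This is a minimal illustration and not the project's actual code: it assumes the recording is available as (track_id, frame) observations, and the names Scenario and split_into_scenarios are hypothetical.

```python
from collections import namedtuple

# Hypothetical container: one scenario per ego vehicle.
Scenario = namedtuple("Scenario", ["ego_id", "start_frame", "end_frame", "other_ids"])


def split_into_scenarios(observations):
    """Derive one scenario per vehicle from (track_id, frame) observations.

    A scenario spans the frames in which its ego vehicle is visible;
    other_ids lists the vehicles that are visible at some point in that span.
    """
    # First and last frame in which each vehicle appears.
    spans = {}
    for track_id, frame in observations:
        first, last = spans.get(track_id, (frame, frame))
        spans[track_id] = (min(first, frame), max(last, frame))

    scenarios = []
    for ego_id, (start, end) in sorted(spans.items()):
        # A vehicle takes part in the scenario if its span overlaps the ego's span.
        others = sorted(
            other
            for other, (o_start, o_end) in spans.items()
            if other != ego_id and o_start <= end and o_end >= start
        )
        scenarios.append(Scenario(ego_id, start, end, others))
    return scenarios


if __name__ == "__main__":
    demo = [(1, 0), (1, 1), (1, 2), (2, 2), (2, 3), (3, 5), (3, 6)]
    for scenario in split_into_scenarios(demo):
        print(scenario)
```

With three vehicles in the recording, the sketch yields three scenarios, one per ego vehicle, which matches the counting argument above.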

However, there are several challenges associated with this approach. The positional data of the vehicles in a scenario carries no inherent meaning. While a person can manually observe a car’s behavior in a simulated environment, to a computer the data are simply coordinates, heading, velocity, and acceleration. This is where semantic descriptions are valuable, as they assign meaningful descriptions to a vehicle’s movement. In the current approach, however, semantic descriptions must be assigned manually, which restricts the ability to scale and makes the approach infeasible for larger applications. There is a need to create and standardize an automatically derived semantic description of scenario instances to increase the meaning and utility of recorded traffic data for scenario-based testing.

The Technical Problem and Research Setting

Our research group at the Technical University of Munich was focused on developing new software systems in a variety of research areas including but not limited to: cybersecurity, machine learning, and Unmanned Automated Vehicles (UAVs). Specifically, our team was focused on developing a new infrastructure for simulating vehicle trajectories from real-world footage of vehicles in an intersection. This infrastructure, if successfully built, can improve automated vehicle testing because traffic scenarios can then be simulated with software such as esmini (Esmini 2022). In addition, our team aimed to create simulations of vehicles using semantic descriptions instead of vehicle trajectories. To elaborate, current approaches to simulating vehicles rely on positional data to map out a traffic scenario (Park 2020). The semantic approach instead translates that positional data into semantic descriptions, or phrases, that describe a vehicle’s movement at a given timestamp. For example, trajectory (or positional) data might say that a vehicle is at position “x, y going n m/s with an acceleration of 0 m/s²” at a given time, while a semantic description simply says that a car is “going straight and keeping velocity” at that time. Our goal was to improve the existing approach to translating positional data into semantic descriptions, and to evaluate the adequacy of the translation via the simulations that are based on those semantic descriptions.
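To illustrate the idea of translating positional data into phrases, the following minimal sketch labels each sample by simple thresholds and merges consecutive samples that share a label. It is an assumption-laden illustration, not the project's translation: the function names, the yaw-rate input, its sign convention, the threshold values, and all phrases other than “going straight and keeping velocity” are hypothetical.

```python
from itertools import groupby


def describe_sample(yaw_rate, acceleration, yaw_threshold=0.05, accel_threshold=0.3):
    """Map one kinematic sample to a phrase.

    yaw_rate is in rad/s (assumed positive for left turns) and acceleration
    in m/s^2. Both thresholds are illustrative values, not tuned ones.
    """
    if yaw_rate > yaw_threshold:
        lateral = "turning left"
    elif yaw_rate < -yaw_threshold:
        lateral = "turning right"
    else:
        lateral = "going straight"

    if acceleration > accel_threshold:
        longitudinal = "accelerating"
    elif acceleration < -accel_threshold:
        longitudinal = "decelerating"
    else:
        longitudinal = "keeping velocity"

    return f"{lateral} and {longitudinal}"


def describe_trajectory(samples):
    """Turn (timestamp, yaw_rate, acceleration) samples into labelled segments.

    Returns a list of (start_time, end_time, phrase), merging consecutive
    samples that receive the same phrase.
    """
    labelled = [(t, describe_sample(yaw, acc)) for t, yaw, acc in samples]
    segments = []
    for phrase, group in groupby(labelled, key=lambda item: item[1]):
        times = [t for t, _ in group]
        segments.append((times[0], times[-1], phrase))
    return segments


if __name__ == "__main__":
    demo = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, -0.2, -1.0)]
    for segment in describe_trajectory(demo):
        print(segment)
```

The result is a short list of time-stamped phrases per vehicle, which is far more compact than the per-frame positional data it was derived from.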

Prior work by our team had already developed an infrastructure for this “translation” from trajectories to semantic descriptions; however, it was only functional for translating, simulating, and evaluating footage and data from one intersection in Germany. This presented a few challenges for us, mainly involving abstracting the existing infrastructure and applying it to the other three intersections. There were also challenges in evaluating the adequacy of the newly generated simulations. In its existing state, the infrastructure compared the footage to the simulations using Fréchet distances, a measure of similarity between trajectories (Eiter 1994). However, this measure was only calculated for each car in the footage and for each traffic scenario. There was no way to measure the Fréchet distance of each maneuver that was being described. For example, if we were interested in how similar the right turns in our simulation were to the right turns in the original dataset, there was no way to measure this. The existing evaluation was therefore limited in the feedback it could give, which in turn limited the advancement and accuracy of the generated simulations.
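For reference, the discrete Fréchet distance between two trajectories can be computed with the dynamic-programming recurrence described by Eiter and Mannila. The sketch below is a generic, iterative formulation that assumes each trajectory is a list of (x, y) points; it is not taken from the project's codebase, and the function name is our own.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two non-empty polylines.

    p and q are sequences of (x, y) points. ca[i][j] holds the coupling
    distance between the prefixes p[:i + 1] and q[:j + 1].
    """
    if not p or not q:
        raise ValueError("both trajectories must contain at least one point")

    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both, whichever is cheapest.
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


if __name__ == "__main__":
    recorded = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    simulated = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(recorded, simulated))  # two parallel lines one unit apart: 1.0
```

A smaller value means the simulated trajectory stays closer to the recorded one along its whole length, which is why the measure suits trajectory comparison better than a pointwise average.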

Based on these problems, we improved two components of the project. First, we optimized the generation of simulated scenarios by refactoring the initial codebase, which translated the trajectory data into semantic descriptions, simulated traffic scenarios based on those descriptions, and evaluated the adequacy of those simulations. The initial code was decentralized and difficult to execute smoothly, so we developed a program that automates the entire process for those who work on this project in the future. Second, we added functionality for assessing the Fréchet distances of the simulations, specifically for calculating the Fréchet distance by maneuver (e.g., all right turns). This was a major accomplishment because it gives future work more precise feedback for improving the translation process. Eventually, we hope that our work will be a small piece of a larger, more complex program that can generate adequate traffic instances based on semantic descriptions.
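The per-maneuver evaluation can be sketched as follows. This is a minimal illustration under stated assumptions rather than the project's implementation: trajectories are assumed to be lists of (t, x, y) samples, maneuvers are assumed to be given as (start_time, end_time, label) segments, and the names slice_by_time and frechet_by_maneuver are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def discrete_frechet(p, q):
    """Discrete Frechet distance between two non-empty lists of (x, y) points."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def slice_by_time(trajectory, start, end):
    """Keep the (x, y) points of (t, x, y) samples with start <= t <= end."""
    return [(x, y) for t, x, y in trajectory if start <= t <= end]


def frechet_by_maneuver(recorded, simulated, segments):
    """Group Frechet distances by maneuver label.

    For every (start, end, label) segment, the recorded and simulated
    trajectories are cut to the same time window and compared. Segments
    for which either cut is empty are skipped.
    """
    distances = defaultdict(list)
    for start, end, label in segments:
        rec = slice_by_time(recorded, start, end)
        sim = slice_by_time(simulated, start, end)
        if rec and sim:
            distances[label].append(discrete_frechet(rec, sim))
    return {
        label: {"count": len(values), "mean": mean(values), "max": max(values)}
        for label, values in distances.items()
    }


if __name__ == "__main__":
    recorded = [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0), (3, 2.0, -1.0)]
    simulated = [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.5), (3, 2.5, -1.0)]
    segments = [(0, 1, "going straight"), (2, 3, "turning right")]
    print(frechet_by_maneuver(recorded, simulated, segments))
```

Reporting the count, mean, and maximum per label makes it visible when one maneuver type, such as right turns, is reproduced noticeably worse than the others.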

Future Research

The next step for this research would be to improve the translation mechanism from traffic footage data to simulations. The current approach assigns semantic descriptions based on a set of kinematic observations about the moving vehicle. With more data, it would become reasonable to apply machine learning clustering algorithms that could reveal patterns in different types of maneuvers. Larger datasets might also expose discrepancies between similar maneuvers. In addition, a machine learning approach would likely allow the translation to adapt the maneuver description to the intersection. A more robust approach to identifying specific maneuvers would make the work we completed for this project more meaningful. Future research could also begin incorporating road users other than vehicles, such as pedestrians and cyclists. If a translation can capture these entities and use the same Fréchet distance method to test its adequacy, the generated traffic scenarios will improve. This matters because self-driving vehicles will ultimately have to be tested in such realistic, mixed environments. Further research will share the simple goal that we had, which was to develop ways of testing the conversion of traffic data into traffic simulations.

References

A. Tenbrock, A. König, T. Keutgens, and H. Weber. “The ConScenD Dataset: Concrete Scenarios from the highD Dataset According to ALKS Regulation UNECE R157 in OpenX.” In: 2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops). 2021, pp. 174–181. doi: 10.1109/IVWorkshops54471.2021.9669219.
Environment Simulator Minimalistic (esmini). Technical Report. Online at https://github.com/esmini/esmini, retrieved 15 September 2022.
F. Hauer, I. Gerostathopoulos, T. Schmidt, and A. Pretschner. “Clustering Traffic Scenarios Using Mental Models as Little as Possible.” In: 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2020, pp. 1007–1012. doi: 10.1109/IV47402.2020.9304636.
inD Dataset Python Tools. Technical Report. Online at https://github.com/ika-rwth-aachen/drone-dataset-tools, retrieved 15 September 2022.
J. Bock, R. Krajewski, T. Moers, S. Runde, L. Vater, and L. Eckstein. “The inD Dataset: A Drone Dataset of Naturalistic Road User Trajectories at German Intersections.” In: 2020 IEEE Intelligent Vehicles Symposium (IV). 2020, pp. 1929–1934. doi: 10.1109/IV47402.2020.9304839.
N. Kalra and S. M. Paddock. “Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability?” In: Transportation Research Part A: Policy and Practice (2016), pp. 182–193. doi: 10.1016/j.tra.2016.09.010.
T. Menzel, G. Bagschik, and M. Maurer. “Scenarios for Development, Test and Validation of Automated Vehicles.” In: 2018 IEEE Intelligent Vehicles Symposium (IV). 2018, pp. 1821–1827.
T. Eiter and H. Mannila. “Computing Discrete Fréchet Distance.” 1994.
S.-W. Park, K. Patil, W. Wilson, M. Corless, G. Choi, and P. Adam. “Creating Driving Scenarios from Recorded Vehicle Data for Validating Lane Centering System in Highway Traffic.” The MathWorks, Inc., 2020.
UNECE. “Proposal for a new UN Regulation on uniform provisions concerning the approval of vehicles with regards to Automated Lane Keeping System.” In: ECE/TRANS/WP.29/2020/81. 2020.