Evaluation of multi-agent vehicle control based on the Daruma design pattern in CARLA #6488
Replies: 3 comments
-
Quick question to the CARLA Leaderboard developers. A post in this forum has already mentioned several non-deterministic effects in Leaderboard evaluations. In our experiments, too, the Leaderboard scores vary substantially when we run the same route evaluation several times with the same agent software. A few examples:
This score variation also makes it hard to rank and objectively compare different agents. How is this accounted for in the official benchmarking tables? How can developers like us improve determinism in the benchmarking setup? Many thanks in advance!
-
Behavior-based robotics had a similar idea but often uses a predefined ordering of the different behaviors (channels in your terminology). It seems all channels are created equal in your work. Have you published the metrics you use to select or evaluate channels?
-
Thanks a lot for this association, @doganulus! I didn't know about behavior-based robotics, but it is indeed similar to the Daruma design pattern. Could you recommend a good (survey) article about this technique?

We also focus on the benefits of diverse, heterogeneous channels. One channel can, for example, use cameras and lidars with advanced neural networks; another channel can be based on traditional computer vision with built-in fault tolerance; a high-integrity minimum risk maneuver (MRM) channel can constantly work on evasive maneuvers based on the radar; etc. All channels except the MRM channel have the same goal of following waypoints towards the destination. Channels can substantially disagree in their perception and/or motion planning while still being safe and available. By the way, availability (i.e. the ability to progress towards the destination instead of halting) is illustrated in the last demo video of our blog post.

To select a channel, we don't use majority voting, which is inadequate here. Instead, Daruma leverages three metric categories described in our paper "Characterization and Mitigation of Insufficiencies in Automated Driving Systems", in the section "DARUMA: ARCHITECTURAL DESIGN...": similarities, cross-channel risks, and preferences. These metrics are used at runtime to select a channel appropriate for the current scenario. Eindhoven University of Technology made the first Daruma implementation: the Safety Shell. You can read about the implemented algorithms in our paper "Detection and Mitigation of Functional Insufficiencies in Autonomous Vehicles: The Safety Shell". There are more publications in the pipeline. 😉
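
To make the three metric categories a bit more concrete, here is a minimal Python sketch of how similarities, cross-channel risks, and preferences could be combined at runtime. It is only an illustration under my own assumptions, not the Safety Shell algorithm from the paper: the names `ChannelOutput`, `cross_channel_risk`, `dissimilarity`, and `select_channel`, the 2 m safety distance, and the lexicographic ordering (risk first, then preference, then similarity) are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class ChannelOutput:
    """What one AD channel exposes in a planning cycle (hypothetical interface)."""
    name: str
    trajectory: List[Point]   # planned ego positions, one per time step
    obstacles: List[Point]    # road users this channel currently perceives
    preference: int           # static ranking, lower is preferred (assumption)


def cross_channel_risk(candidate: ChannelOutput,
                       others: List[ChannelOutput],
                       safe_distance: float = 2.0) -> int:
    """Count obstacles seen by the OTHER channels that the candidate's
    trajectory approaches closer than safe_distance (assumed value)."""
    risk = 0
    for other in others:
        for ox, oy in other.obstacles:
            if any(hypot(px - ox, py - oy) < safe_distance
                   for px, py in candidate.trajectory):
                risk += 1
    return risk


def dissimilarity(a: ChannelOutput, b: ChannelOutput) -> float:
    """Mean point-wise distance between two trajectories (0 means identical)."""
    pairs = list(zip(a.trajectory, b.trajectory))
    if not pairs:
        return float("inf")
    return sum(hypot(p[0] - q[0], p[1] - q[1]) for p, q in pairs) / len(pairs)


def select_channel(channels: List[ChannelOutput]) -> ChannelOutput:
    """Pick the channel with the lowest cross-channel risk; break ties by
    preference, then by similarity to the remaining channels."""
    def key(c: ChannelOutput):
        others = [o for o in channels if o is not c]
        mean_dis = (sum(dissimilarity(c, o) for o in others) / len(others)
                    if others else 0.0)
        return (cross_channel_risk(c, others), c.preference, mean_dis)
    return min(channels, key=key)


# Illustrative scenario: the "nn" channel misses a pedestrian at (10, 0) and
# drives through; the "cv" channel sees the pedestrian and stops at (5, 0).
nn = ChannelOutput("nn", [(0, 0), (5, 0), (10, 0)], obstacles=[], preference=0)
cv = ChannelOutput("cv", [(0, 0), (5, 0), (5, 0)], obstacles=[(10, 0)], preference=1)
selected = select_channel([nn, cv])
print(selected.name)  # cv
```

In this toy scenario the preferred channel loses because its plan conflicts with an obstacle that only the other channel perceives, which is exactly the case where majority voting or a fixed ordering would not help.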
-
Hi, our team recently ran an experiment with multiple autonomous agents driving a single ego vehicle on a CARLA Leaderboard route. The key idea, which we call the Daruma design pattern, is to dynamically select the agent that best fits the ongoing driving scenario. In a safety-critical scenario, one agent may perform a safe stop in front of a crossing pedestrian, while in another scenario a different agent can unblock the stuck ego vehicle. In our blog post we explain the idea in more detail and present a demo video of switching between multiple agents, which we call Automated Driving (AD) channels, in the CARLA simulator. What do you think about this approach to safe automated driving system design?
What we learned in this experiment is that agents based on neural networks need to expose internal information so that the Daruma cross-channel analysis can make a smart choice among the agents. In fact, exposing road users’ locations and speeds, motion predictions, and the ego vehicle’s planned path is also useful for debugging, testing, and validation. In short, we favor eXplainable AI over opaque end-to-end neural networks. So, instead of designing a neural network that directly triggers a braking action, it’s better to design a network that computes the location and predicted motion of road users.
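
As a rough sketch of what such exposed information could look like, the Python snippet below has a channel report road users and the planned ego path, and derives the braking decision downstream from that data. The types `RoadUser` and `ChannelReport`, the function `first_conflict_step`, the one-point-per-time-step convention, and the 2 m safety distance are all assumptions made for illustration; they are not part of CARLA or of our implementation.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class RoadUser:
    """One perceived road user as exposed by a channel (hypothetical type)."""
    position: Point               # current location in metres
    speed: float                  # current speed in m/s
    predicted_path: List[Point]   # predicted positions, one per time step


@dataclass
class ChannelReport:
    """Internal state a channel exposes each cycle (hypothetical type)."""
    ego_path: List[Point]                          # planned ego positions per step
    road_users: List[RoadUser] = field(default_factory=list)


def first_conflict_step(report: ChannelReport,
                        safe_distance: float = 2.0) -> Optional[int]:
    """Return the first time step at which the planned ego position comes
    within safe_distance of a predicted road-user position, else None."""
    for step, (ex, ey) in enumerate(report.ego_path):
        for user in report.road_users:
            if step < len(user.predicted_path):
                ux, uy = user.predicted_path[step]
                if hypot(ex - ux, ey - uy) < safe_distance:
                    return step
    return None


# Illustrative scenario: a pedestrian crosses in front of the ego vehicle.
pedestrian = RoadUser(position=(10.0, 5.0), speed=2.5,
                      predicted_path=[(10.0, 5.0), (10.0, 2.5), (10.0, 0.0)])
report = ChannelReport(ego_path=[(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)],
                       road_users=[pedestrian])
conflict = first_conflict_step(report)
print("brake" if conflict is not None else "continue")  # brake
```

Because the braking decision is computed from the reported positions and predictions, the same report can be inspected by a cross-channel analysis, logged for debugging, or replayed in tests.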