Publication Entry

Talks and Poster Presentations (with Proceedings-Entry):

M. Lechner, R. Hasani, R. Grosu, D. Rus, T. Henzinger:
"Adversarial Training is Not Ready for Robot Learning";
Talk: 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), Xi'an, China; 2021-05-31 - 2021-06-04; in: "In Proc. of ICRA'21, the International Conference on Robotics and Automation", IEEE, (2021), ISBN: 978-1-7281-9077-8; 1 - 8.

English abstract:

Adversarial training is an effective method for training deep learning models that are resilient to norm-bounded perturbations, at the cost of a drop in nominal performance. While adversarial training appears to enhance the robustness and safety of a deep model deployed in open-world, decision-critical applications, it counterintuitively induces undesired behaviors in robot learning settings. In this paper, we show theoretically and experimentally that neural controllers obtained via adversarial training are subject to three types of defects, namely transient, systematic, and conditional errors. We first generalize adversarial training to a safety-domain optimization scheme that allows for more generic specifications. We then prove that such a learning process tends to cause certain error profiles. We support our theoretical results with a thorough experimental safety analysis in a robot-learning task. Our results suggest that adversarial training is not yet ready for robot learning.

German abstract:

Adversariales Training ist eine effektive Methode, um Deep-Learning-Modelle zu trainieren, die gegenüber normgebundenen Störungen widerstandsfähig sind, allerdings um den Preis eines nominalen Leistungsabfalls. Während adversariales Training die Robustheit und Sicherheit eines Deep-Learning-Modells in entscheidungskritischen Open-World-Anwendungen zu verbessern scheint, führt es kontraintuitiv zu unerwünschten Verhaltensweisen in Roboter-Lernumgebungen. In diesem Beitrag zeigen wir theoretisch und experimentell, dass neuronale Steuerungen, die durch adversariales Training gewonnen werden, drei Arten von Fehlern aufweisen, nämlich transiente, systematische und bedingte Fehler. Zunächst verallgemeinern wir das adversariale Training zu einem Optimierungsschema für den Sicherheitsbereich, das allgemeinere Spezifikationen zulässt. Anschließend beweisen wir, dass ein solcher Lernprozess dazu neigt, bestimmte Fehlerprofile zu verursachen. Wir stützen unsere theoretischen Ergebnisse durch eine gründliche experimentelle Sicherheitsanalyse in einer Roboter-Lernaufgabe. Unsere Ergebnisse deuten darauf hin, dass adversariales Training noch nicht für das Roboterlernen geeignet ist.

Keywords:

robot learning, adversarial training

Electronic version of the publication:

https://publik.tuwien.ac.at/files/publik_300946.pdf

Created from the Publication Database of the Vienna University of Technology.