Title: Few-shot reasoning-based safe reinforcement learning framework for autonomous robot navigation

Authors: Weiqiang Wang; Xu Zhou; Benlian Xu; Siwen Chen; Mingli Lu; Jun Li; Yuejiang Gu

Addresses: School of Mechanical Engineering, Changshu Institute of Technology, Suzhou, 215500, China; School of Mechanical Engineering, Changshu Institute of Technology, Suzhou, 215500, China; School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Software Institute, Nanjing University, Nanjing, 210008, China; School of Electrical and Automatic Engineering, Changshu Institute of Technology, Suzhou, 215500, China; School of Automation, Nanjing University of Science and Technology, Nanjing, 210094, China; R&D Center, General Elevator Co., Ltd., Suzhou, 215232, China

Abstract: Unsafe exploration during training hinders the practical deployment of reinforcement learning (RL) on autonomous robots. Some safe RL methods use safety constraints derived from prior or external knowledge to reduce or avoid unsafe exploration, but such knowledge is usually unavailable in practice, especially in unknown environments. In this work, we propose a few-shot reasoning-based safe reinforcement learning framework that includes a new few-shot learning method with a dynamic support set to reason about the safety of unexplored actions and hence guide safer action selection. Additionally, the framework endows robots with the capability of reverting to previous safe states and reflecting on failures to update the dynamic support set and further improve the accuracy of safety reasoning. Experimental results show that our new few-shot learning method is more accurate, and that our proposed framework can significantly reduce the number of failures in the learning phase, especially for long-term autonomy.

Keywords: safe reinforcement learning; few-shot learning; dynamic support set; autonomous robots.

DOI: 10.1504/IJAAC.2024.135093

International Journal of Automation and Control, 2024 Vol.18 No.1, pp.30 - 52

Received: 16 Dec 2022
Accepted: 10 Feb 2023

Published online: 30 Nov 2023
