Frontiers in Robotics and AI
We developed a novel framework that applies deep reinforcement learning (DRL) to task-constrained path generation for robotic manipulators, leveraging human-demonstrated trajectories. The main contribution of this article is a reward function, usable with generic reinforcement learning algorithms, that applies Koopman operator theory to build a human intent model from the demonstrated trajectories. To ensure that this reward function produces correct rewards, the demonstrated trajectories are also used to construct a trust domain within which the Koopman operator–based human intent prediction is used. Outside this domain, the proposed algorithm asks for human feedback to obtain rewards. The reward function is incorporated into the deep Q-network (DQN) framework, yielding a modified DQN algorithm. The effectiveness of the proposed learning algorithm is demonstrated on a simulated robotic arm that learns paths for constrained end-effector motion while accounting for the safety of humans in the robot's surroundings.
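The reward logic summarized above, with a model-based reward inside a trust domain and human feedback elsewhere, can be illustrated with a minimal Python sketch. It is not the article's implementation: the abstract does not specify how the trust domain is built or how the predicted intent is scored, so the distance threshold `trust_radius`, the negative-distance reward, and the names `make_reward`, `predict_next`, and `ask_human` are all assumptions made for illustration. The Koopman operator–based intent model is represented only by a placeholder callable.

```python
import math
from typing import Callable, List, Sequence

State = Sequence[float]


def distance_to_demos(state: State, demos: List[List[State]]) -> float:
    """Smallest Euclidean distance from `state` to any demonstrated state."""
    return min(math.dist(state, d) for traj in demos for d in traj)


def make_reward(
    demos: List[List[State]],
    predict_next: Callable[[State], State],
    ask_human: Callable[[State, State], float],
    trust_radius: float = 0.1,
) -> Callable[[State, State], float]:
    """Build a reward function gated by a trust domain.

    Assumption: the trust domain is the set of states within `trust_radius`
    of some demonstrated state. The article may define it differently.
    """

    def reward(state: State, next_state: State) -> float:
        if distance_to_demos(state, demos) <= trust_radius:
            # Inside the trust domain: score how closely the robot's next
            # state matches the intent model's prediction (assumed form).
            return -math.dist(next_state, predict_next(state))
        # Outside the trust domain: the model is not trusted, so ask a human.
        return ask_human(state, next_state)

    return reward


# Toy usage with one demonstrated trajectory along the x-axis.
demos = [[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]]
asked = []  # records every query sent to the (simulated) human


def predict_next(state: State) -> State:
    # Placeholder for a Koopman-based intent model: one unit step along x.
    return (state[0] + 1.0, state[1])


def ask_human(state: State, next_state: State) -> float:
    # Placeholder for real human feedback: log the query, return a fixed value.
    asked.append((state, next_state))
    return -1.0


reward = make_reward(demos, predict_next, ask_human)
r_inside = reward((1.0, 0.0), (2.0, 0.5))   # state lies on the demonstration
r_outside = reward((5.0, 5.0), (6.0, 5.0))  # state is far from every demonstration
```

In a DQN-style training loop, a function like `reward` would stand in for the environment's reward signal, so the human is queried only for transitions that start outside the trust domain.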
Sinha A and Wang Y (2022) Koopman Operator–Based Knowledge-Guided Reinforcement Learning for Safe Human–Robot Interaction. Front. Robot. AI 9:779194. doi: 10.3389/frobt.2022.779194