Learning Spatial Behavior Cognition Model in the Dynamic Environment
Advisor: Dr. Han-Pang Huang (黃漢邦)
Student: Yen-Wen Chen (陳衍文)
Abstract:
With the rapid development of robotics, robots have expanded from industrial production lines into daily life. Besides acting as servants, they can be pets, companions, or guides. In the near future, robots will appear in human environments such as campuses, offices, hospitals, museums, and even households. To be useful and to be accepted by humans, robots need to understand human behaviors and to adapt to and relate to their environments. Human behaviors, however, are strongly affected by implicit human factors such as culture, social conventions, laws, and even the mental states of individuals and groups. If robots are to be accepted by humans, they must conform to common social norms and local customs and recognize highly socialized spatial behaviors. The main goal of this thesis is to develop the Dynamic Spatial Behavior Cognition Model (Dynamic SBCM) for robots. The model lets a robot learn the specific, invisible rules of human society and then progressively fine-tune the learned result while operating in the learned environment or in similar environments. The robot uses inverse reinforcement learning (IRL) to learn behavior by imitating human behavior. However, different people may perceive the same environment differently, and these different perceptions lead to different actions. The thesis therefore uses information entropy to separate observed actions into several states, and the robot learns each state so that it can represent the social rules more precisely. In addition, the robot must modify the learned result during operation in order to adapt to the dynamic environment. The thesis also demonstrates how the same learning approach can cluster trajectories by three velocity levels (slow, medium, and fast) to describe human preferences, with the corresponding cost function used to predict those preferences.
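Chinese abstract (translated):
The aim of this thesis is to develop the Dynamic Spatial Behavior Cognition Model. The model can teach a robot the social behaviors of a specific environment, including those that cannot be observed directly by eye, and the learned result can be further fine-tuned during subsequent operation to suit the environment. The robot imitates human behavior through inverse reinforcement learning. However, because each person perceives the world differently, behaviors in the same environment are not necessarily the same. The thesis therefore first uses information entropy to divide the behaviors observed by the robot into several states. The robot learns each state so that it can express the observed social norms more precisely; in addition, the robot modifies its previously learned results during operation to suit the dynamic environment. The thesis enables a robot to infer possible differences in social behavior from changes in human velocity: taking changes among three velocity levels (fast, medium, slow) as actions and classifying these action sequences with information entropy lets the robot recognize different social behaviors (expressed in the thesis as preferences) and learn them.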
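
The abstract does not specify how velocities are discretized or how entropy is computed, so the following is only a minimal sketch of the general idea: map each observed speed to one of three levels and measure the Shannon entropy of the resulting action sequence. The function names and the thresholds `slow_max` and `fast_min` are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def speed_level(speed, slow_max=0.5, fast_min=1.2):
    """Map a speed (m/s) to 'slow', 'medium', or 'fast'.

    The thresholds are illustrative assumptions, not values from the thesis.
    """
    if speed < slow_max:
        return "slow"
    if speed < fast_min:
        return "medium"
    return "fast"


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def trajectory_entropy(speeds):
    """Entropy of the velocity-level sequence of one observed trajectory."""
    levels = [speed_level(s) for s in speeds]
    return shannon_entropy(levels)


# A pedestrian walking at a steady pace versus one who keeps changing pace.
steady = [1.0, 1.0, 1.0, 1.0]
varied = [0.2, 1.0, 1.5, 0.3]
print(trajectory_entropy(steady), trajectory_entropy(varied))
```

Under these assumptions, a trajectory with a constant velocity level has zero entropy, while one that mixes levels has higher entropy, which is one plausible signal for grouping action sequences into different behavioral states.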