
- Number of slides: 37

Learning and Control of Biped Locomotion
Dr Changjiu Zhou
School of Electrical & Electronic Engineering, Singapore Polytechnic
zhoucj@sp.edu.sg
www.robo-erectus.org
Development of Humanoid Soccer Robots www.robo-erectus.org

Outline
· Introduction
· Biped Walking Cycles
· How to Control Biped Locomotion
· How to Plan/Learn Biped Gaits
· Biped Learning by Reinforcement
· Some Research Topics

Biped Gait (Frontal Plane)
[Figure: biped gait in the frontal view; phases along the time axis: single support, double support, single support]

Biped Gait (Sagittal Plane)

Finite State Machine for Biped Walking Control
[Figure: state diagram]
· States: Right Support, Right-to-Left Transition, Left-to-Right Transition, Left Support
· Events: swing time completed (appears twice), right foot touches down, left foot touches down
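
The state diagram on this slide can be sketched as a small transition table. The wiring below is an assumption reconstructed from the slide labels (a support phase ends when the swing time completes, and a transition phase ends when the new support foot touches down); the names `State`, `Event`, `TRANSITIONS`, and `step` are hypothetical and not taken from the Robo-Erectus software.

```python
from enum import Enum, auto


class State(Enum):
    RIGHT_SUPPORT = auto()
    RIGHT_TO_LEFT = auto()   # transition phase, waiting for the left foot
    LEFT_SUPPORT = auto()
    LEFT_TO_RIGHT = auto()   # transition phase, waiting for the right foot


class Event(Enum):
    SWING_TIME_COMPLETED = auto()
    LEFT_FOOT_TOUCHES_DOWN = auto()
    RIGHT_FOOT_TOUCHES_DOWN = auto()


# Assumed wiring of the diagram: (current state, event) -> next state.
TRANSITIONS = {
    (State.RIGHT_SUPPORT, Event.SWING_TIME_COMPLETED): State.RIGHT_TO_LEFT,
    (State.RIGHT_TO_LEFT, Event.LEFT_FOOT_TOUCHES_DOWN): State.LEFT_SUPPORT,
    (State.LEFT_SUPPORT, Event.SWING_TIME_COMPLETED): State.LEFT_TO_RIGHT,
    (State.LEFT_TO_RIGHT, Event.RIGHT_FOOT_TOUCHES_DOWN): State.RIGHT_SUPPORT,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in a table rather than in nested conditionals makes it easy to compare the code against the diagram edge by edge.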

Static Walking
· In static walking, the biped has to move very slowly so that the dynamics can be ignored.
· The biped's projected center of gravity (PCOG) must stay within the supporting area.
[Figure: supporting area in single support and in double support]
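
A minimal sketch of the static-walking condition, assuming the supporting area is a convex polygon given by its vertices in counter-clockwise order (one foot outline in single support, the convex hull of both feet in double support). The function name and the foot dimensions in the usage note are hypothetical.

```python
def pcog_is_supported(pcog, polygon):
    """Return True if the 2D point pcog lies inside or on a convex polygon.

    polygon: list of (x, y) vertices in counter-clockwise order.
    """
    px, py = pcog
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Cross product of the edge with the vector from its start to the point.
        # A negative value puts the point to the right of a counter-clockwise
        # edge, which means it is outside the polygon.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross < 0:
            return False
    return True
```

For example, with a 0.2 m by 0.1 m foot outline `[(0, 0), (0.2, 0), (0.2, 0.1), (0, 0.1)]`, the point `(0.1, 0.05)` is supported and `(0.3, 0.05)` is not.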

Dynamic Walking
· In dynamic walking, the motion is fast, so the dynamics cannot be neglected.
· In dynamic walking, we should look at the zero moment point (ZMP) rather than the PCOG.
· The stability margin of dynamic walking is much harder to quantify.

Why is Biped Robotics Hard?
· There is an unpowered DOF between the foot and the ground.
· This constraint limits the trajectory-tracking approaches commonly used in manipulator research.

Biped Control: Model-based
[Block diagram] Feet position and ZMP (PCOG) → Inverse kinematics model → Desired joint angles → Biped Robot
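
The inverse kinematics block can be illustrated with a planar two-link leg. This is a sketch under stated assumptions, not the robot's actual model: the hip is at the origin, x points forward, z points up, both angles are measured from the downward vertical, and a positive knee angle flexes the knee. The function names and link lengths are hypothetical.

```python
import math


def leg_forward_kinematics(q_hip, q_knee, l_thigh, l_shank):
    """Ankle position (x forward, z up) relative to the hip."""
    x = l_thigh * math.sin(q_hip) + l_shank * math.sin(q_hip - q_knee)
    z = -l_thigh * math.cos(q_hip) - l_shank * math.cos(q_hip - q_knee)
    return x, z


def leg_inverse_kinematics(x, z, l_thigh, l_shank):
    """Hip and knee angles that place the ankle at (x, z) relative to the hip."""
    d2 = x * x + z * z
    d = math.sqrt(d2)
    if d > l_thigh + l_shank + 1e-9 or d < abs(l_thigh - l_shank) - 1e-9:
        raise ValueError("target is outside the leg's reachable range")
    # Law of cosines gives the knee flexion; clamp against rounding error.
    c = (d2 - l_thigh ** 2 - l_shank ** 2) / (2.0 * l_thigh * l_shank)
    q_knee = math.acos(max(-1.0, min(1.0, c)))
    # Direction of the hip-to-ankle line, measured from the downward vertical.
    phi = math.atan2(x, -z)
    # Offset between that line and the thigh caused by the bent knee.
    beta = math.atan2(l_shank * math.sin(q_knee),
                      l_thigh + l_shank * math.cos(q_knee))
    return phi + beta, q_knee
```

Feeding the output of `leg_forward_kinematics` back into `leg_inverse_kinematics` should return the original angles, which is a convenient self-check for the sign conventions.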

Biped Control: Model-based
· Except for certain massless-leg models, most biped models are nonlinear and do not have analytical solutions.
· The massless-leg model is the simplest model. The body of the robot is usually assumed to be a point mass and can be viewed as an inverted pendulum.
· When the leg inertia and other dynamics, such as those of the actuators and joint friction, are included, the overall dynamic equations can be highly nonlinear and complex.

Example: Massless leg model
• The simplest biped model
• Some assumptions, e.g., [equations not recovered]
• From D'Alembert's principle: [equation not recovered]
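
The slide's own equations did not survive extraction. As an illustration only, a commonly used point-mass relation can be written down under these assumptions: the whole mass m is concentrated at (x, z) in the sagittal plane, the legs are massless, the ground is flat at height zero, and the ZMP lies at (p, 0). The symbols m, x, z, p, and z_c are introduced here and are not taken from the slide. D'Alembert's principle requires the moment of the gravity and inertial forces about the ZMP to vanish:

```latex
\begin{align}
  m(\ddot{z} + g)(x - p) - m\,\ddot{x}\,z &= 0 \\
  \Rightarrow\quad p &= x - \frac{z\,\ddot{x}}{\ddot{z} + g}
\end{align}
% With the mass kept at a constant height z = z_c (so \ddot{z} = 0):
\begin{equation}
  p = x - \frac{z_c}{g}\,\ddot{x}
\end{equation}
```

When the mass is not accelerating horizontally, p equals x, so the ZMP coincides with the PCOG and the static-walking condition is recovered.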

Biped Control: Biologically Inspired
· Since no humanoid robot matches its biological counterparts in mobility, adaptability, and stability, many researchers study biological bipeds in order to extract algorithms that are applicable to robots.
[Figure label: Reverse Engineering]

Biped Control: Biologically Inspired
Two Main Research Areas
1. Central Pattern Generators (CPG)
2. Passive Walking

ZMP-based Gait Planning
• Plan the hip and ankle trajectories according to the walking constraints and ground constraints.
• Derive all joint trajectories by inverse kinematics.

Example: Gait Planning for Walking on Slope
- Plan the gait using a third-order spline, which guarantees continuity of both the first and second derivatives.
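
A minimal sketch of third-order spline interpolation through trajectory via-points. It assumes a natural boundary condition (zero second derivative at both ends), which the slide does not specify; a gait planner may instead fix the end velocities. The function name and any sample via-points are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(ts, ys):
    """Return f(t) interpolating the knots (ts, ys) with a natural cubic spline.

    ts must be strictly increasing and contain at least three knots.
    """
    n = len(ts) - 1
    h = [ts[i + 1] - ts[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives m[1..n-1];
    # the natural boundary condition fixes m[0] = m[n] = 0.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm: forward elimination, then back substitution.
    for k in range(1, n - 1):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    m = [0.0] * (n + 1)
    for j in range(n - 2, -1, -1):
        m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def f(t):
        # Pick the interval that contains t, clamping to the first or last one.
        i = min(max(bisect_right(ts, t) - 1, 0), n - 1)
        a, b = ts[i + 1] - t, t - ts[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return f
```

Each interval is a cubic polynomial, and solving for the shared second derivatives at the knots is what makes the first and second derivatives continuous across them.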

Example: Planning Results
[Figures: joint angles; consecutive walking gait along the slope]

IP-based Gait Planning
• The dynamic equation of the inverted pendulum (IP) model: [equation not recovered]
• If the angle is small, it can be simplified to a linear homogeneous second-order differential equation: [equation not recovered]
[Figure labels: v, L, 2 wf]
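
The small-angle simplification can be illustrated with a closed-form solution. This sketch assumes a point mass on a massless rod of length L pivoting at the ankle with no ankle torque, so the linearized equation is θ'' = (g/L)·θ with θ measured from the vertical. The slide's own equation was not recovered, so this form, the constant `G`, and the function name are assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def linearized_ip_state(theta0, theta_dot0, length, t):
    """State (theta, theta_dot) at time t of the linearized inverted pendulum.

    Solves theta'' = (G / length) * theta for the given initial angle and rate.
    """
    w = math.sqrt(G / length)
    theta = theta0 * math.cosh(w * t) + (theta_dot0 / w) * math.sinh(w * t)
    theta_dot = theta0 * w * math.sinh(w * t) + theta_dot0 * math.cosh(w * t)
    return theta, theta_dot
```

Because the solution is available in closed form, the state at the end of a step can be computed directly from the state at its start, without numerical integration.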

3D Linear Pendulum Model

Example: IP-based Gait Planning

Biped Kicking
Constraints:
– Kicking range
– Friction
– …

Kicking Pattern

Biped Learning by Reinforcement (1)
· A humanoid robot aims to select a good value for the swing-leg parameters at each consecutive step so that it achieves stable walking.
· A reward function that correctly defines this objective is critical for reinforcement learning.
[Figure: supporting foot; stable: r = 0 (reward); unstable: r = -1 (punishment)]
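
The two-valued reward in the figure can be sketched as follows. The stability test used here (the ZMP coordinate along one axis lies within the extent of the supporting foot) is an assumption based on the figure; the function and parameter names are hypothetical.

```python
def binary_reward(zmp, foot_min, foot_max):
    """Return 0.0 (reward) if the ZMP is under the supporting foot, else -1.0.

    zmp, foot_min, foot_max: coordinates along one axis, in the same units.
    """
    return 0.0 if foot_min <= zmp <= foot_max else -1.0
```

A signal of this kind says only whether a step failed, not how close it came to failing, which is the limitation that graded feedback is meant to address.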

Biped Learning by Reinforcement (2)
• The control objective of gait synthesis for biped dynamic balance can be described as: [equation not recovered]
• To evaluate biped dynamic balance in the frontal plane, a penalty signal should be given if the biped robot falls down in the frontal plane.

Biped Learning by Reinforcement (3)
Reinforcement Learning with Fuzzy Evaluative Feedback
[Figure: supporting foot with regions labelled Excellent, Good, OK, Very Bad]

The RL Agent
· AEN: the action-state evaluation network
· ASN: the action selection network
· SAM: the stochastic action modifier
• Both the AEN and ASN are initialized randomly.
• Learning starts from scratch.
• A large number of trials is needed for learning.
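
The role of the stochastic action modifier can be illustrated with a short sketch. The slide does not give its form, so the Gaussian perturbation and the exponential spread below are assumptions: the action recommended by the ASN is perturbed by noise whose spread shrinks as the AEN's predicted evaluation improves. All names are hypothetical.

```python
import math
import random


def stochastic_action_modifier(recommended, predicted_eval, rng=random):
    """Perturb the ASN's recommended action to explore around it.

    Assumed form: Gaussian noise with standard deviation exp(-predicted_eval),
    so a well-rated state is explored less than a poorly rated one.
    """
    sigma = math.exp(-predicted_eval)
    return rng.gauss(recommended, sigma)
```

Passing a seeded `random.Random` instance as `rng` makes the exploration reproducible between runs.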

The FRL Agent
• Neural fuzzy networks replace the neuron-like adaptive elements.
• Expert knowledge can be built directly into the FRL agent as a starting configuration.
• The ASN and/or AEN can house available expert knowledge to speed up learning.

The FRL Agent with Fuzzy Evaluative Feedback
• Numerical evaluative feedback is not biologically plausible.
• Fuzzy evaluative feedback is much closer to the learning environment in the real world.
• Fuzzy evaluative feedback is based on a form of continuous evaluation.
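
A minimal sketch of continuous, fuzzy evaluative feedback, using the labels from the earlier figure (Excellent, Good, OK, Very Bad). Everything numeric here is an assumption for illustration: the input is a normalized ZMP offset d (0 at the foot centre, 1 at the foot edge), the membership functions are triangular with peaks at 0.0, 0.3, 0.6, and 1.0, and each label carries an assumed score between 0 and -1.

```python
# (label, peak position in normalized offset, assumed score); all hypothetical.
LABELS = [
    ("Excellent", 0.0, 0.0),
    ("Good", 0.3, -0.2),
    ("OK", 0.6, -0.5),
    ("Very Bad", 1.0, -1.0),
]


def memberships(d):
    """Degree of membership of offset d in each label (triangular functions)."""
    d = min(max(d, 0.0), 1.0)  # offsets beyond the foot edge count as the edge
    peaks = [peak for _, peak, _ in LABELS]
    mu = {}
    for k, (label, peak, _) in enumerate(LABELS):
        left = peaks[k - 1] if k > 0 else None
        right = peaks[k + 1] if k < len(peaks) - 1 else None
        if d == peak:
            mu[label] = 1.0
        elif left is not None and left < d < peak:
            mu[label] = (d - left) / (peak - left)
        elif right is not None and peak < d < right:
            mu[label] = (right - d) / (right - peak)
        else:
            mu[label] = 0.0
    return mu


def fuzzy_feedback(d):
    """Membership-weighted average of the label scores: a continuous signal."""
    mu = memberships(d)
    total = sum(mu.values())
    return sum(mu[label] * score for label, _, score in LABELS) / total
```

Unlike a two-valued reward, this signal changes gradually as the ZMP drifts from the foot centre toward the edge, so the learner receives graded information on every step.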

Comparison of FRL Agents
Types                Action Network (ASN)   Critic Network (AEN)   Evaluative Feedback
RL agent             neural                 neural                 numerical
FRL agent (Type A)   neuro-fuzzy            neural                 numerical
FRL agent (Type B)   neuro-fuzzy            neuro-fuzzy            numerical
FRL agent (Type C)   neuro-fuzzy            neuro-fuzzy            fuzzy

Information Available for Biped Gait Synthesizing
Description of the information in each case:
· Case A: No expert knowledge is available. Only the numerical reinforcement signal is used to train the gait synthesizer.
· Case B: Only the intuitive biped balancing knowledge is used as the initial configuration of the gait synthesizer.
· Case C: Both the intuitive biped balancing knowledge and the walking evaluation knowledge are utilized.
· Case D: In addition to all the information used in Case C, fuzzy evaluative feedback is used instead of numerical evaluative feedback.

The Gait Synthesizer Using Two Independent FRL Agents

Before and After Learning
[Figure panels: ankle joint, knee joint]

Results (1)
The ZMP trajectory after FRL (Type C)

Results (2)
Walk (Backward)

Some Research Topics
• Online gait generation
• Online footprint planning
• Constraints
  – ZMP constraint for stable walking
  – Friction constraint for stable walking
  – …
• Current Challenges
  – Knee bending
  – Body shifting
  – …
• …

Acknowledgements
• Staff Members: P. K. Yue, F. S. Choy, Nazeer Ahmed, M. F. Ercan, Mike Wong, H. Li
• Research Associates: Z. Tang (Tsinghua U.), J. Ni (Shanghai Jiao Tong U.)
• Technical Support Officers: H. M. Tan, W. Ye
• Students: P. P. Khing, H. W. Yin, H. F. Lu, H. X. Tan, J. X. Teo, Stephen Quah, H. M. Tan, Y. T. Tan

Thanks!
Dr Changjiu Zhou
School of Electrical and Electronic Engineering, Singapore Polytechnic
zhoucj@sp.edu.sg
www.robo-erectus.org