From Human to Robot Grasping

Detta är en avhandling från Stockholm : KTH Royal Institute of Technology

Sammanfattning: Imagine that a robot fetched this thesis for you from a book shelf. How doyou think the robot would have been programmed? One possibility is thatexperienced engineers had written low level descriptions of all imaginabletasks, including grasping a small book from this particular shelf. A secondoption would be that the robot tried to learn how to grasp books from yourshelf autonomously, resulting in hours of trial-and-error and several bookson the floor.In this thesis, we argue in favor of a third approach where you teach therobot how to grasp books from your shelf through grasping by demonstration.It is based on the idea of robots learning grasping actions by observinghumans performing them. This imposes minimum requirements on the humanteacher: no programming knowledge and, in this thesis, no need for specialsensory devices. It also maximizes the amount of sources from which therobot can learn: any video footage showing a task performed by a human couldpotentially be used in the learning process. And hopefully it reduces theamount of books that end up on the floor.This document explores the challenges involved in the creation of such asystem. First, the robot should be able to understand what the teacher isdoing with their hands. This means, it needs to estimate the pose of theteacher's hands by visually observing their in the absence of markers or anyother input devices which could interfere with the demonstration. Second,the robot should translate the human representation acquired in terms ofhand poses to its own embodiment. Since the kinematics of the robot arepotentially very different from the human one, defining a similarity measureapplicable to very different bodies becomes a challenge. Third, theexecution of the grasp should be continuously monitored to react toinaccuracies in the robot perception or changes in the grasping scenario.While visual data can help correcting the reaching movement to the object,tactile data enables accurate adaptation of the grasp itself, therebyadjusting the robot's internal model of the scene to reality. Finally,acquiring compact models of human grasping actions can help in bothperceiving human demonstrations more accurately and executing them in a morehuman-like manner. Moreover, modeling human grasps can provide us withinsights about what makes an artificial hand design anthropomorphic,assisting the design of new robotic manipulators and hand prostheses.All these modules try to solve particular subproblems of a grasping bydemonstration system. We hope the research on these subproblems performed inthis thesis will both bring us closer to our dream of a learning robot andcontribute to the multiple research fields where these subproblems arecoming from.