Representation and Learning for Robotic Grasping, Caging, and Planning
Abstract: Robots need to grasp, handle, and manipulate objects, navigate their environment, and understand the state of the world around them. Like all artificial intelligence agents, they have to make predictions, formulate goals, reason about actions, and make plans. Expressive, informative, and compact representations of their state, task, or environment are therefore essential, because they allow us to address these problems by computational means. To create suitable representations, we need to consider the agent's goals, its means or resources, and external performance requirements, and decide what is relevant to the task.
This thesis investigates the construction, learning, and application of representations in different robotic scenarios. We study representations and algorithms for agents whose goal is to reliably grasp an object, to prevent an object from escaping by caging it, or to learn a model of their interaction with the environment so that they can plan actions and track the state of the world. Each scenario considers different aspects of representation: efficient computation and optimization, tractable reasoning, relating different parameterizations, or autonomous learning and execution of behavior under uncertainty.
For the grasping agent, we introduce an embedding space that allows us to associate contact locations with hand postures, and we derive a hierarchical representation of object surfaces; together, these give rise to an efficient fingertip grasp synthesis algorithm. For the caging agent, we only consider objects with holes through their body, which allows us to focus on caging configurations that mechanically interlock objects and hands, similar to the links of a chain. Further, we change from a geometric to a topology-based representation, which allows us to construct caging configurations by control-based optimization and sampling-based search.
For the learning agent, we consider the environment and robot as a dynamical system and learn predictive state representations that are directly based on observable data. We demonstrate two contrasting methods to influence the resulting model. For an in-hand manipulation task, we consider training sequences as strings of symbols and introduce feature functions that integrate both actions and observations to reduce state ambiguity. For a simulated visual navigation task, we learn a feature embedding with prior information and training labels to enhance model interpretability while at the same time improving planning performance.