Mapping and Merging Using Sound and Vision : Automatic Calibration and Map Fusion with Statistical Deformations

Sammanfattning: Over the last couple of years both cameras, audio and radio sensors have become cheaper and more common in our everyday lives. Such sensors can be used to create maps of where the sensors are positioned and the appearance of the surroundings. For sound and radio, the process of estimating the sender and receiver positions from time of arrival (TOA) or time-difference of arrival (TDOA) measurements is referred to as automatic calibration. The corresponding process for images is to estimate the camera positions as well as the positions of the objects captured in the images. This is called structure from motion (SfM) or visual simultaneous localisation and mapping (SLAM). In this thesis we present studies on how to create such maps, divided into three parts: to find accurate measurements; robust mapping; and merging of maps.The first part is treated in Paper I and involves finding precise – on a subsample level – TDOA measurements. These types of subsample refinements give a high precision, but are sensitive to noise. We present an explicit expression for the variance of the TDOA estimate and study the impact that noise in the signals has. Exact measurements is an important foundation for creating accurate maps. The second part of this thesis includes Papers II–V and covers the topic of robust self-calibration using one-dimensional signals, such as sound or radio. We estimate both sender and receiver positions using TOA and TDOA measurements. The estimation process is divided in two parts, where the first is specific for TOA or TDOA and involves solving a relaxed version of the problem. The second step is common for different types of problems and involves an upgrade from the relaxed solution to the sought parameters. In this thesis we present numerically stable minimal solvers for both these steps for some different setups with senders and receivers. We also suggest frameworks for how to use these solvers together with RANSAC to achieve systems that are robust to outliers, noise and missing data. Additionally, in the last paper we focus on extending self-calibration results, especially for the sound source path, which often cannot be fully reconstructed immediately. The third part of the thesis, Papers VI–VIII, is concerned with the merging of already estimated maps. We mainly focus on maps created from image data, but the methods are applicable to sparse 3D maps coming from different sensor modalities. Merging of maps can be advantageous if there are several map representations of the same environment, or if there is a need for adding new information to an already existing map. We suggest a compact map representation with a small memory footprint, which we then use to fuse maps efficiently. We suggest one method for fusion of maps that are pre-aligned, and one where we additionally estimate the coordinate system. The merging utilises a compact approximation of the residuals and allows for deformations in the original maps. Furthermore, we present minimal solvers for 3D point matching with statistical deformations – which increases the number of inliers when the original maps contain errors.

  KLICKA HÄR FÖR ATT SE AVHANDLINGEN I FULLTEXT. (PDF-format)