Visual Object Tracking and Classification Using Multiple Sensor Measurements
Abstract: Multi-sensor measurement has gained popularity for computer vision tasks such as visual object tracking and visual pattern classification. The main idea is that multiple sensors can provide rich and redundant information through wide spatial or frequency coverage of the scene. This gives an advantage over single-sensor measurement when learning object models and features and when inferring target states and attributes in complex scenarios. This thesis addresses two problems, both of which exploit multi-sensor measurement. The first is video object tracking through occlusions using multiple uncalibrated cameras with overlapping fields of view. The second is multi-class image classification through sensor fusion of visual-band and thermal infrared (IR) cameras. Paper A proposes a multi-view tracker that operates in an alternate mode, with online learning on Riemannian manifolds by cross-view appearance mapping. Object appearance is mapped between views by projective transformations, which are estimated from the warped vertical axis of the tracked object by combining multi-view geometric constraints. A similarity metric is defined on Riemannian manifolds as the shortest geodesic distance between a candidate object and a set of mapped references from multiple views. Based on this metric, a multi-view maximum likelihood (ML) criterion is introduced for inferring the object state. Paper B proposes a visual-IR fusion-based classifier using multi-class boosting with sub-ensemble learning. In this scheme, a multi-class AdaBoost classification framework is presented in which information obtained from the visual and thermal IR bands complements each other interactively. This is accomplished by learning weak hypotheses for the visual and IR bands independently and then fusing them by learning a sub-ensemble.
Experiments on real-world datasets show that the proposed methods are effective and improve on closely related previous approaches.
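The similarity metric in Paper A, the shortest geodesic distance between a candidate and a set of mapped references, can be illustrated with a minimal sketch. The abstract does not say which appearance descriptor or which Riemannian metric is used, so the sketch assumes that appearances are symmetric positive definite (SPD) matrices, for example region covariance descriptors, compared with the affine-invariant metric. The function names `spd_geodesic_distance` and `min_geodesic_distance` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def spd_geodesic_distance(a, b):
    """Affine-invariant geodesic distance between two SPD matrices.

    Assumption: appearance descriptors are SPD matrices; the thesis
    abstract does not specify the descriptor or the metric.
    """
    # Inverse square root of a via its eigendecomposition.
    w, v = np.linalg.eigh(a)
    a_inv_sqrt = (v / np.sqrt(w)) @ v.T
    # Whitened matrix a^{-1/2} b a^{-1/2}, symmetrised for numerical safety.
    m = a_inv_sqrt @ b @ a_inv_sqrt
    m = (m + m.T) / 2.0
    # Distance is the Frobenius norm of log(m), computed from eigenvalues.
    eig = np.linalg.eigvalsh(m)
    return float(np.sqrt(np.sum(np.log(eig) ** 2)))


def min_geodesic_distance(candidate, references):
    """Shortest geodesic distance from a candidate to a set of references."""
    return min(spd_geodesic_distance(candidate, r) for r in references)
```

Under these assumptions, a tracker could score each candidate region by `min_geodesic_distance` against the references mapped from the other views and prefer the candidate with the smallest distance. How that score enters the multi-view ML criterion is not described in the abstract.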