Local Signal Models for Image Sequence Analysis

Abstract: The thesis describes novel methods for image motion computation and template matching.

A multiscale algorithm for energy-based estimation and representation of local spatiotemporal structure by second-order symmetric tensors is presented. An efficient spatiotemporal implementation of a signal modelling method called normalized convolution is described. This provides a means to handle signals with varying degrees of reliability.

As an application of the above results, a smooth pursuit motion tracking algorithm is described that uses observations of both target motion and position for camera head control and motion prediction. The target is detected using a novel motion field segmentation algorithm, which assumes that the motion fields of the target and its immediate vicinity can each, at least occasionally, be modelled by a single parameterized motion model. A method to eliminate camera-induced background motion in the case of a pan/tilt rotating camera is suggested.

In a second application, a high-precision image motion estimation algorithm performing clustering in motion parameter space is developed. The algorithm, which can handle multiple motions by simultaneous motion parameter estimation and image segmentation, iteratively maximizes the posterior probability of the motion parameter set given the observed local spatiotemporal structure tensor field. The probabilistic formulation provides a natural way to incorporate additional prior information about the segmentation of the scene into the objective function. A simple homotopy continuation method (embedding algorithm) is used to increase the likelihood of convergence to a near-optimal solution.

The final part of the thesis is concerned with tracking of (partially) occluded targets. An algorithm for target tracking in head-up display sequences is presented. The method generalizes cross-correlation coefficient matching by introducing a signal confidence-based distance metric. To handle target shape changes, a method for template mask shape adaptation based on geometric transformation parameter optimisation is introduced. The presence of occluding objects makes local structure descriptors (e.g., the gradient) unreliable, which means that only pixelwise comparisons of target and template can be made unless the local structure operators are modified to take the varying signal certainty into account. Normalized convolution provides the means for such a modification. This is demonstrated in a section on phase-based target tracking, which also presents a generic method for tracking occluded targets by combining normalized convolution with iterative reweighting.
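
To make the role of signal certainty concrete, the following is a minimal sketch of the simplest (zeroth-order) form of normalized convolution, often called normalized averaging: the certainty-weighted signal and the certainty map are filtered with the same applicability kernel, and the two results are divided. It is an illustration only and not the efficient spatiotemporal implementation described in the thesis. The function names, the Gaussian applicability, and the parameter values are assumptions chosen for the example.

```python
import numpy as np


def gaussian_kernel(sigma, radius):
    """1-D Gaussian applicability, normalized to unit sum (assumed choice)."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-0.5 * (x / sigma) ** 2)
    return g / g.sum()


def separable_filter(data, kernel):
    """Apply the same 1-D kernel along every axis (zero padding at borders)."""
    out = np.asarray(data, dtype=float)
    if min(out.shape) < len(kernel):
        raise ValueError("every axis must be at least as long as the kernel")
    for axis in range(out.ndim):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out


def normalized_averaging(signal, certainty, sigma=1.5, radius=4, eps=1e-12):
    """Zeroth-order normalized convolution.

    signal    -- N-D array (e.g. an image or a spatiotemporal volume)
    certainty -- array of the same shape, 0 = unreliable, 1 = fully reliable
    Returns the certainty-weighted local average and the filtered certainty,
    which indicates how much reliable data supported each output sample.
    """
    kernel = gaussian_kernel(sigma, radius)
    numerator = separable_filter(signal * certainty, kernel)
    denominator = separable_filter(certainty, kernel)
    return numerator / np.maximum(denominator, eps), denominator


# Hypothetical example: a constant image with a 3x3 patch corrupted by an occluder.
frame = np.full((16, 16), 3.0)
certainty = np.ones_like(frame)
frame[6:9, 6:9] = 1000.0      # occluded pixels carry meaningless values
certainty[6:9, 6:9] = 0.0     # ...so they are given zero certainty

estimate, support = normalized_averaging(frame, certainty)
```

Because the occluded samples have zero certainty, they do not contribute to the estimate, and the same division also compensates for the zero padding at the image border. A plain Gaussian filter applied to `frame` would instead smear the corrupted values into their neighbourhood.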