Enhancement of Salient Image Regions for Visual Object Detection

This is a dissertation from Chalmers University of Technology

Abstract: Salient object/region detection aims at finding interesting regions in images and videos, since such regions contain important information and easily attract human attention. The detected regions can then be used in more complex computer vision applications such as object detection and recognition, image compression, content-based image editing, and image retrieval. One of the fundamental challenges in salient object detection is to emphasize the desired objects uniformly while suppressing irrelevant background. Existing heuristic color contrast-based methods tend to produce false detections in complex scenes and to attenuate the inner part of large salient objects. To achieve uniform object enhancement and background suppression, this thesis develops several new techniques to assist saliency computation, including color feature integration, graph-based geodesic saliency propagation, and hierarchical segmentation based on graph spectrum decomposition. Paper 1 proposes a superpixel-based salient object detection method that takes advantage of color contrast and distribution. It develops complementary abilities among hypotheses and generates high-quality saliency maps. Paper 2 proposes a novel geodesic propagation method for salient region enhancement. It starts from an initial coarse saliency map that highlights potential salient regions and then conducts geodesic propagation. The local connectivity of objects is retained after the proposed propagation. Papers 3 and 4 use graph-based spectral decomposition for hierarchical segmentation, which enhances saliency detection. As most previous work on salient region detection addresses still images, Paper 5 extends graph-based saliency detection methods to video processing. It combines static appearance and motion cues to construct a graph.
A spatial-temporal smoothing operation is proposed on a structured graph derived from consecutive frames to maintain visual coherence both within and across frames. All the proposed methods are validated on benchmark datasets and achieve performance comparable to or better than that of state-of-the-art methods.