Deep Perceptual Loss and Similarity

Abstract: This thesis investigates deep perceptual loss and deep perceptual similarity: methods for computing loss and similarity for images as the distance between deep features extracted from neural networks. The primary contributions of the thesis are (i) aggregating much of the existing research on deep perceptual loss and similarity, and (ii) presenting novel research into understanding and improving these methods. This novel research provides insight into how to implement the methods for a given task, their strengths and weaknesses, how to mitigate those weaknesses, and whether the methods can handle the inherent ambiguity of similarity.

Society increasingly relies on computer vision technology, from everyday smartphone applications to legacy industries like agriculture and mining. Much of this technology relies on machine learning methods for its success, and the most successful machine learning methods in turn rely on the ability to compute the similarity of instances.

In computer vision, the computation of image similarity often strives to mimic human perception, called perceptual similarity. Deep perceptual similarity has proven effective for this purpose and achieves state-of-the-art performance. Furthermore, the method has been used for loss calculation when training machine learning models, with impressive results in various computer vision tasks. However, many open questions remain, including how to best utilize and improve the methods. Since similarity is ambiguous and context-dependent, it is also uncertain whether the methods can handle changing contexts.

This thesis addresses these questions through (i) a systematic study of different implementations of deep perceptual loss and similarity, (ii) a qualitative analysis of the strengths and weaknesses of the methods, (iii) a proof-of-concept investigation of the methods' ability to adapt to new contexts, and (iv) cross-referencing the findings with already published works.

Several interesting findings are presented and discussed, including the following. Deep perceptual loss and similarity are shown not to follow existing transfer learning conventions. Flaws of the methods are discovered and mitigated. Deep perceptual similarity is demonstrated to be well suited for applications in various contexts.

There is much left to explore, and this thesis provides insight into which future research directions are promising. Many improvements to deep perceptual similarity remain to be applied to loss calculation. Studying how related fields have dealt with problems caused by ambiguity and context could lead to further improvements. Combining these improvements could lead to metrics that perform close to the optimum on existing datasets, which motivates the development of more challenging datasets.
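As a rough illustration of the core idea described above (a minimal sketch, not the thesis implementation), the code below computes a deep perceptual distance between two images as the Euclidean distance between activations of a pretrained network. The choice of torchvision's VGG16 and of the layer at which features are extracted are assumptions made for this example only; any pretrained feature extractor and layer could be substituted.

    # Minimal sketch of deep perceptual similarity/loss (assumed setup, not the thesis code).
    import torch
    import torchvision.models as models

    # Frozen pretrained network; its intermediate activations serve as "deep features".
    feature_extractor = models.vgg16(weights="DEFAULT").features[:16].eval()
    for p in feature_extractor.parameters():
        p.requires_grad_(False)

    def deep_perceptual_distance(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
        """Euclidean distance between deep features of two image batches of shape (N, 3, H, W)."""
        feat_a = feature_extractor(img_a)
        feat_b = feature_extractor(img_b)
        return torch.linalg.vector_norm(feat_a - feat_b, dim=(1, 2, 3))

    # Usage: smaller distances indicate more perceptually similar images.
    a, b = torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224)
    print(deep_perceptual_distance(a, b))

Used as a similarity metric, smaller distances indicate more perceptually similar images; used as a loss, the same distance is backpropagated through the frozen feature extractor to the model being trained.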
