Understanding Neural Machine Translation: An investigation into linguistic phenomena and attention mechanisms

Abstract: In this thesis, I explore neural machine translation (NMT) models through targeted investigation of various linguistic phenomena and thorough exploration of the models' internal structure, in particular the attention mechanism. With respect to linguistic phenomena, I explore the ability of NMT models to translate ambiguous words, to learn long-range dependencies, to learn morphology, and to translate negation, all of which have been challenging for the older paradigm of statistical machine translation. I find that morphological inflection and negation are better modeled in encoder hidden states, while the senses of ambiguous words are better learned in decoder hidden states. Hidden states from lower layers are better at capturing aspects of form, such as morphological inflections and negation cues, while hidden states from higher layers are better at capturing semantic and relational aspects, such as word senses, negation events, and negation scope. I conclude that NMT models learn linguistic knowledge in a bottom-up manner. In the final part of the thesis, I interpret attention mechanisms in encoder-free models and character-level models. I show that attending to word embeddings directly does not make attention mechanisms more alignment-like; instead, it demonstrates that the attention mechanism is adaptable and more important for NMT than encoders. In character-level models, all characters attract equal attention except the final separators. Overall, the ability of NMT models to deal with the studied linguistic phenomena improves as architectures evolve. NMT models perform well in translating frequent ambiguous words and learning long-range dependencies, but still suffer from morphological errors and the under-translation of negation. Attention mechanisms are crucial and adaptable, and their behavior is not uniform across settings.
