Automatic Adaptation of Swedish Text for Increased Inclusion

Abstract: Imagine what life would be like if words did not make any sense to you. If you had to use a dictionary, or Google Translate, to check the meaning of each word. If you forgot the first part of the sentence when you got to the last word. Working would be hard, studying perhaps even more difficult. Reading is a skill essential for participation in many parts of modern society, but there are many individuals incapable of assimilating text. This could be for various reasons, for instance due to dyslexia, cognitive disabilities, or as a consequence of having a different first language. In order to automate the adaptation process, more knowledge about the properties of Easy Language texts is needed. There are several collections of guidelines aimed at writers of Easy Language texts. Such guidelines are useful for getting an idea of what Easy Language should be, but they are often rather vague and rely on the expert knowledge and intuition of the writer. By analysing texts produced by writers of Easy Language, it is possible to operationalise the guidelines and obtain data for building models for automatic text adaptation. However, this approach requires large corpora of high quality, and the results are highly dependent on the training data.

This doctoral thesis adopts an end-user approach to automatic text adaptation. The application of a reading comprehension perspective on Easy Language text adaptation differs from the text-based perspectives commonly used within the field, and this position implies that the strengths and weaknesses of individuals in the different groups of poor readers are important to consider. We present and evaluate a variety of techniques for automatic text adaptation, as well as work on text complexity visualisation. Assessment of text complexity can be useful for both readers and writers of Easy Language texts. However, common complexity features are not easily interpretable, and this thesis presents work on enhancing the interpretability of such features, using clustering and visualisation methods. The main contributions of this thesis are 1) a mapping of Easy Language guidelines to a theoretical model of reading comprehension, 2) a corpus of simple and standard documents, aligned at the sentence level, 3) clustering and visualisation of text complexity measures, and 4) a number of tools and services for the automatic adaptation of Swedish text.
