Showing results 1 - 5 of 107 theses containing the word corpora.
Abstract: “Idiomatic” expressions, usually called “idioms”, such as a dime a dozen, a busman’s holiday, or to have bats in your belfry are a curious part of any language: they usually have a fixed lexical (why a busman?) and structural composition (only dime and dozen in direct conjunction mean ‘common, ordinary’), can be semantically obscure (why bats?), yet are widely recognized in the speech community, in spite of being so rare that only large corpora can provide us with access to sufficient empirical data on their use. In this compilation thesis, four published studies focusing on idioms in corpora are presented.
2. Recycling Translations : Extraction of Lexical Data from Parallel Corpora and their Application in Natural Language Processing
Abstract: The focus of this thesis is on re-using translations in natural language processing. It involves the collection of documents and their translations in an appropriate format, the automatic extraction of translation data, and the application of the extracted data to different tasks in natural language processing.
Abstract: This thesis presents open source resources in the form of annotated corpora and modules for automatic morphosyntactic processing and analysis of Persian texts. More specifically, the resources consist of an improved part-of-speech tagged corpus and a dependency treebank, as well as tools for text normalization, sentence segmentation, tokenization, part-of-speech tagging, and dependency parsing for Persian.
Abstract: Information contained in the free text of health records is useful for the immediate care of patients as well as for medical knowledge creation. Advances in clinical language processing have made it possible to automatically extract this information, but most research has, until recently, been conducted on clinical text written in English.
5. Marqueurs corrélatifs en français et en suédois : Étude sémantico-fonctionnelle de d’une part… d’autre part, d’un côté… de l’autre et de non seulement… mais en contraste
Abstract: This thesis deals with the correlative markers d’une part… d’autre part, d’un côté… de l’autre and non seulement… mais in French and their Swedish counterparts dels… dels, å ena sidan… å andra sidan and inte bara… utan. These markers are composed of two separate parts generally occurring together, and announce a series of at least two textual units to be considered together.