Studies in Corpora and Idioms: Getting the cat out of the bag

Abstract: “Idiomatic” expressions, usually called “idioms”, such as a dime a dozen, a busman’s holiday, or to have bats in your belfry are a curious part of any language: they usually have a fixed lexical composition (why a busman?) and a fixed structural composition (only dime and dozen in direct conjunction mean ‘common, ordinary’), and can be semantically obscure (why bats?), yet they are widely recognized in the speech community despite being so rare that only large corpora can provide sufficient empirical data on their use.
In this compilation thesis, four published studies focusing on idioms in corpora are presented. Study 1 details the creation of, and the data in, the author’s medium-sized corpus from 1999, the 3.7-million-word Coll corpus of online university student newspapers, with comparisons to data from standard corpora of the time. Study 2 examines the extent to which recognized idioms are found in the Coll corpus and how they can be varied. Study 3 draws upon the British National Corpus and a series of British and American newspaper corpora to see how idioms may be “anchored” in their contexts, primarily by the device of premodification via an adjective appropriate to the context, not to the idiom. Study 4 examines idiom-usage patterns in the Time Magazine corpus, focusing on possible aspects of diachronic change over the near-century that Time represents.
The introductory compilation chapter places and discusses these studies in their contexts of contemporary idiom and corpus research. Building on these studies, it provides two specific examples of potential ways forward in idiom research: an examination of the idioms used in a specific subgenre of newspapers (editorials), and a detailed suggestion for teachers about how to examine multiple facets of a specific modern idiom (the glass ceiling) in the classroom.
Finally, a summing-up includes suggestions for further research, particularly at the level of the patterning of individual idioms, rather than treating them as a homogeneous phenomenon.