This that and the other: Multi-word clusters in spoken English as visible patterns of interaction
Keywords: Multi-word clusters, spoken English, corpus of conversational English, pragmatic integrity, Corpus Linguistics
This paper investigates multi-word strings automatically retrieved from a 5-million-word corpus of conversational English from Britain and Ireland. Many such strings have neither syntactic nor semantic integrity, for example "at the", "it was a", "what do you". However, many strings display pragmatic integrity, encoding interactive functions such as hedging, vagueness, and discourse marking. Examples include "and that sort of thing", "you know", "a couple of". We identify the most common pragmatically integrated clusters, discuss their functions, and compare their frequency with that of single words, illustrating that many clusters are more frequent than single words accepted as belonging to the core vocabulary of English. The high frequency of these clusters also contrasts with the low frequency of opaque idiomatic expressions. High-frequency clusters raise issues around the distinction between lexis and grammar, and support a synthetic view of language production and storage, with implications for the understanding of notions such as fluency and idiomaticity.
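As a rough illustration of what automatic retrieval of multi-word strings can look like, the following minimal sketch counts contiguous word sequences (n-grams) within each utterance and sets their frequencies beside those of single words. It is not the retrieval software used for the corpus described here: the tokenizer, the function names `tokenize` and `cluster_counts`, and the three sample utterances are all hypothetical and chosen only for demonstration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def cluster_counts(utterances, n):
    """Count every contiguous n-word string, never crossing utterance boundaries."""
    counts = Counter()
    for utterance in utterances:
        tokens = tokenize(utterance)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Invented toy data, not drawn from any real corpus.
sample = [
    "you know it was a couple of days",
    "and that sort of thing you know",
    "a couple of them you know",
]

single_words = cluster_counts(sample, 1)
two_word = cluster_counts(sample, 2)
three_word = cluster_counts(sample, 3)

# Rank clusters by frequency, then compare a cluster with a single word.
print(two_word.most_common(3))
print(two_word["you know"], single_words["days"])
```

Raw counts of this kind return every recurrent string, including fragments such as "it was a"; deciding which strings have pragmatic integrity remains a matter of manual analysis.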