Native vs. non-native English: data-driven lexical analysis

Ewa Witalisz, Justyna Leśniewska


This article presents a preliminary, data-driven study of a corpus of texts written by advanced Polish learners of English, analysed against a baseline corpus of native-speaker texts. The texts in both corpora were produced in similar circumstances (a classroom setting), under the same time and word limits, and in response to the same task. We conducted a comparative lexical analysis of the two corpora, using corpus methodology (word lists, cluster analysis, concordances, keyness) to identify the most significant differences. The main conclusion is that advanced foreign language use may differ from native-speaker use in ways that become visible only in larger samples of language; examined individually, these differences would not be regarded as errors and would go unnoticed. The study offers some evidence that a number of these differences may be attributed to cross-linguistic influence.
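To illustrate the keyness step named in the abstract, here is a minimal sketch in Python. It is not the procedure used in the study. It assumes the log-likelihood statistic, which is one common choice for keyness; the abstract does not say which statistic or settings were applied. The function names (`tokenize`, `log_likelihood`, `keyness`), the simple regular-expression tokenizer, and the two toy texts are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a: frequency in the study (learner) corpus
    b: frequency in the reference (native) corpus
    c: total tokens in the study corpus
    d: total tokens in the reference corpus
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keyness(study_tokens, ref_tokens):
    """Return (word, score, direction) tuples, highest score first.

    direction is "+" when the word is relatively more frequent in the
    study corpus and "-" when it is relatively less frequent.
    """
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    c, d = len(study_tokens), len(ref_tokens)
    results = []
    for word in set(study) | set(ref):
        a, b = study[word], ref[word]
        direction = "+" if a / c > b / d else "-"
        results.append((word, log_likelihood(a, b, c, d), direction))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy texts standing in for the learner and native-speaker corpora.
    learner = tokenize("Moreover it is true that moreover people think so")
    native = tokenize("It is true that people think so though")
    for word, score, direction in keyness(learner, native)[:3]:
        print(f"{word:10s} {direction} {score:.2f}")
```

With this kind of scoring, a word can rank as key even when every individual use of it is acceptable. That is the sense in which the differences described above surface only in larger samples.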

Keywords: advanced EFL use, corpus analysis of learner language, lexical features of L2 writing


The journal is published continuously online.
The electronic version is the original and only form of the journal.