International Corpus of English (ICE)


The International Corpus of English (ICE) began in 1990 with the primary aim of collecting material for comparative studies of English worldwide. Twenty-six research teams around the world are preparing electronic corpora of their own national or regional variety of English. Each ICE corpus consists of one million words of spoken and written English produced after 1989. For most participating countries, the ICE project is stimulating the first systematic investigation of the national variety. To ensure compatibility among the component corpora, each team is following a common corpus design, as well as a common scheme for grammatical annotation.



We have been informed that the Sri Lankan component of ICE (ICE-SL) has been released. It is available in standard SGML-annotated format and in a POS-tagged version built on the CLAWS C7 tagset. Along with the data, a manual documenting procedures and decisions is also available. Scholars interested in using the data can request it by e-mail and will then be provided with the data as well as the manual.


Until 2016, the ICE corpora were distributed by Prof. Gerry Nelson at the Department of English, The Chinese University of Hong Kong. They are now coordinated by Prof. Marianne Hundt and hosted at the English Department of the University of Zurich.

The ICE-hosting team:

Prof. Dr. Marianne Hundt
Dr. Hans Martin Lehmann
PD Dr. Gerold Schneider