This book discusses some of the basic issues relating to corpus
generation and the methods normally used to generate a corpus. Since
corpus-related research goes beyond corpus generation, the book also
addresses other major topics connected with the use and application of
language corpora, namely corpus readiness, in the context of
sanitation and pre-editing of corpus texts; the application of
statistical methods; and various text processing techniques.
Importantly, it explores how corpora can be used as a primary or
secondary resource in English language teaching, in creating
dictionaries, in word sense disambiguation, in various language
technologies, and in other branches of linguistics. Lastly, the book
sheds light on the status quo of corpus generation in Indian languages
and identifies current and future needs.
Discussing various technical issues in the field in a lucid manner,
providing extensive new diagrams and charts for easy comprehension, and
using simplified English, the book is an ideal resource for non-native
readers of English. Written by academics with many years of experience
teaching and researching corpus linguistics, it focuses on Indian
languages and on English corpora, which makes it relevant to graduate and
postgraduate students of applied linguistics, computational linguistics
and language processing in South Asia and in other countries where English
is spoken as a first or second language.