Corpus Approaches to Lexicogrammar

EnglishGrammar.Pro is a quasi-lexicogrammar research website that starts at the grammar end of the lexicogrammar continuum and works towards the lexis end. Much of its work focuses on expanding L2 grammatical complexity by using L1 corpora to locate the most frequent collocates or n-grams. The research is primarily aimed at developing a free online resource for language acquisition, teaching, testing and assessment, but it is also, to some degree, of use to corpus researchers. This paper reports primarily on the development and applications of the EnglishGrammar.Pro website, focussing on its pivot: the ‘complexity checker’.

What is the complexity checker?

The complexity checker is primarily a PHP webpage that automatically highlights items from ‘English Grammar Profile’ (EGP) and ‘English Vocabulary Profile’ (EVP) research in submitted text. A superficially similar tool, ‘Text Inspector’, already exists on the EVP website, but it works in the opposite direction along the lexicogrammar continuum. ‘Text Inspector’ highlights vocabulary as tokens and then lets the user click through CEFR level options, mostly related to word sense, while largely disregarding the grammar end. The ‘complexity checker’, by contrast, identifies grammar first, then some vocabulary at its lowest-level sense, and links the highlighted grammar to L1 corpus research. The complexity checker also has an overall text prediction function: it summarises the proportions of examples at each CEFR level, compares them with the texts of the Cambridge Learner Corpus, and predicts a CEFR level for the submitted text.
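The prediction step described above can be illustrated with a minimal Python sketch. It is not the site's PHP implementation: the function names, the distance measure, and above all the reference profiles are assumptions made for illustration. The profile figures are placeholders, not values from the Cambridge Learner Corpus.

```python
from collections import Counter

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def level_proportions(item_levels):
    """Return the share of highlighted items found at each CEFR level."""
    counts = Counter(item_levels)
    total = sum(counts[level] for level in CEFR_LEVELS)
    if total == 0:
        raise ValueError("no CEFR-tagged items found")
    return {level: counts[level] / total for level in CEFR_LEVELS}


def predict_level(proportions, reference_profiles):
    """Pick the reference profile closest to the text's proportions.

    Closeness is the sum of absolute differences across levels,
    an assumed measure chosen only for its simplicity.
    """
    def distance(profile):
        return sum(abs(proportions[lv] - profile[lv]) for lv in CEFR_LEVELS)

    return min(reference_profiles, key=lambda name: distance(reference_profiles[name]))


# Hypothetical placeholder profiles, NOT Cambridge Learner Corpus data.
REFERENCE_PROFILES = {
    "A2": {"A1": 0.60, "A2": 0.30, "B1": 0.10, "B2": 0.00, "C1": 0.00, "C2": 0.00},
    "B1": {"A1": 0.45, "A2": 0.30, "B1": 0.20, "B2": 0.05, "C1": 0.00, "C2": 0.00},
    "B2": {"A1": 0.35, "A2": 0.25, "B1": 0.20, "B2": 0.15, "C1": 0.05, "C2": 0.00},
}

# CEFR levels of the items highlighted in one submitted text (invented input).
highlighted = ["A1", "A1", "A2", "B1", "A1", "A2", "B1", "B2", "A1", "A2"]
props = level_proportions(highlighted)
print(predict_level(props, REFERENCE_PROFILES))  # B1
```

A nearest-profile comparison is only one way to turn proportions into a prediction. Any real system would need reference profiles derived from learner data at each level.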

The linked webpages are L1 corpus investigations that show the vocabulary that goes with the grammar. In addition, the CLAWS7 tagset codes and patterns are freely shared so that anyone wishing to replicate the searches can do so on corpora such as iWeb.
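To show what a shared tag pattern amounts to, here is a small Python sketch that matches a sequence of CLAWS7 tags against text that has already been tagged. The pattern (modal + infinitive "be" + past participle), the hand-tagged sentence, and the function name are illustrative assumptions. They are not taken from the website, and corpus interfaces such as iWeb have their own query syntax.

```python
def find_tag_pattern(tagged_tokens, pattern):
    """Return the word sequences whose tags match the pattern exactly.

    tagged_tokens: list of (word, tag) pairs
    pattern: list of CLAWS7 tags to match in consecutive positions
    """
    matches = []
    n = len(pattern)
    for start in range(len(tagged_tokens) - n + 1):
        window = tagged_tokens[start:start + n]
        if all(tag == wanted for (_, tag), wanted in zip(window, pattern)):
            matches.append(" ".join(word for word, _ in window))
    return matches


# Hand-tagged example sentence (an assumption, not tagger output).
tagged = [
    ("The", "AT"), ("results", "NN2"), ("can", "VM"),
    ("be", "VBI"), ("seen", "VVN"), ("below", "RL"), (".", "."),
]

# VM = modal auxiliary, VBI = "be" infinitive, VVN = past participle.
MODAL_PASSIVE = ["VM", "VBI", "VVN"]

print(find_tag_pattern(tagged, MODAL_PASSIVE))  # ['can be seen']
```

Run over a large tagged corpus, the matched word sequences could then be counted to find the most frequent lexical realisations of a grammar pattern.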

The applications for TESOL

The complexity checker is a tool for teachers marking writing. Feedback should say something about the complexity of the writing, and the checker helps by highlighting specific language items that can be shared with the student. It can also give an overall assessment prediction that can form part of grading, although no automated system can be fully trusted with such a difficult task.

The complexity checker is also a tool for evaluating any texts that students may be asked to read. It can very quickly suggest how difficult a text will be for students and which parts may need to be explained. It also makes it easy to discover grammar points in the text that can be chosen as a language focus.

For any of these uses, a teacher can click on a highlight and go to a page that gives authentic uses of the language with frequency information. In essence, this provides real, natural examples for a presentation of the grammar.

The linked webpages also contain a good deal of material that test writers could utilise.

The development and theoretical standpoint

As a teacher of English to speakers of other languages, I have come across several areas in need of improvement:

  1. Marking language complexity is not as intuitive to an L1 expert as error correction or achievement of communicative function.
  2. The extensive research of the EGP is challenging to utilise.
  3. Some ESL textbooks fall short in providing examples of lexicogrammar.