Testing the English Grammar Profiler against CLC

Sadly, Sketch Engine does not give us access to A1 or A2 level texts from the learner corpus.  For this reason, this short test starts with B1 level students.  All the concordances are gathered by searching for the very common article “a”.

The following concordance lines are randomly shuffled sentences from B1 level learners in the Cambridge Learner Corpus (uncoded).  The first 17 were taken and put through the complexity checker.  The C1 language in blue is an almost accurate sentence, apart from “a nice country with a special…. meal”.  The C2 signalling expression in purple, which is often a rote-learnt writing structure at pre-intermediate level, has been highlighted, yet there is a comma splice at the end of it.  The B2 “everybody” should be followed by “has”, and “whose” should be “whose fault it was”.

Therefore, if error correction were applied first, most of these higher-level highlights would not appear.

Overall, this first test against the corpus shows the complexity checker is quite useful, even though it is still in beta and incomplete.

B1 shuffled concordances

As I rush to repeat the same procedure for the next level, I am a little sad that I don’t immediately see much evidence of higher-level highlights.  I do notice more B2 lexis and a few “green” B2 grammar points highlighted.

B2 shuffled concordances


At C1, we start to notice that sentence length has increased and the amount of academic vocabulary (in white) has doubled.



The 20 sentences at C2 continue the same trend.  Length keeps increasing and more academic vocabulary is used.  Unfortunately, when it comes to grammatical complexity, there does not seem to be an equally clear indication of C2 grammar that would serve as proof of the level.
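The two trends noted by eye here, sentence length and academic vocabulary, could be checked with a rough count.  The following is only a minimal sketch, not how the complexity checker itself works: the function names `tokenize` and `profile` are made up for illustration, the tokeniser is a crude regular expression, and `academic_words` stands in for whatever academic word list is actually used.

```python
import re
from statistics import mean


def tokenize(sentence):
    """Split a sentence into lowercase word tokens (crude, assumption only)."""
    return re.findall(r"[A-Za-z']+", sentence.lower())


def profile(sentences, academic_words):
    """Return mean sentence length (in tokens) and the share of academic tokens.

    `academic_words` is a placeholder set of lowercase words; it is NOT
    the list used by the complexity checker.
    """
    token_lists = [tokenize(s) for s in sentences]
    token_lists = [tokens for tokens in token_lists if tokens]  # skip empty lines
    if not token_lists:
        return {"mean_length": 0.0, "academic_share": 0.0}

    total_tokens = sum(len(tokens) for tokens in token_lists)
    academic_tokens = sum(
        1 for tokens in token_lists for word in tokens if word in academic_words
    )
    return {
        "mean_length": mean(len(tokens) for tokens in token_lists),
        "academic_share": academic_tokens / total_tokens,
    }
```

Running `profile` on the sampled concordance lines for each level, with the same word list each time, would give numbers to set beside the highlights rather than relying on impression alone.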

