Assessment by numbers

I once sat in on a series of demos of primary school tracking systems for a large MAT. At lunchtime the CEO – a secondary head by trade – sat with his head in his hands and asked: “how have we reached the point where we have so little faith in teachers’ judgement that we need software to give us the answers?” He was talking about the ‘assessment by algorithm’ approaches employed by many systems – both commercial and home-grown – whereby teachers tick off, RAG-rate and score numerous objectives and a formula calculates an overall grade or level for the pupil.

And yet here we are – thousands of schools use such systems. Teachers dutifully spend hours ticking off the learning objectives, not just for core subjects but often for other subjects as well, and we leave it to the system to decide whether pupils are below, at or above ‘age-related expectations’. The fact that teachers may disagree with the computer is all too often overlooked, because senior leaders want the security of supposed consistency. And this speaks volumes. Senior leaders in such schools evidently do not have faith in teachers’ judgement, and perhaps teachers themselves are in a comfortable place here: having been absolved of responsibility for assessment, they can point at the computer and say “it wasn’t me, it was the software”.

But the whole thing is a fallacy. There is no consistency – it’s not standardised, it just gives the illusion of standardisation. First, it relies on multiple, subjective micro-assessments, and the sum of multiple, subjective micro-assessments is never going to be anything approaching reliable. Second, teachers, when confronted with the result of the formula, may tweak the underlying detail, ticking and unticking objectives to get the desired outcome. Make sure it’s low at the start of the year and high at the end because, you know, progress. Yes, there are still so many who think this process is actual evidence of progress.

The system itself may take various routes to arrive at its grade. The simplest is based on the percentage of objectives achieved at a particular point in time compared to some arbitrary threshold, e.g. 70%. But which 70%? What if one pupil is missing objectives relating to shape, another is struggling with fractions, whilst another can’t get their head round Roman numerals or telling the time? Each has achieved 70% of objectives but are they all secure? Clearly not. And some of our 70% group are less secure than others. Such simplistic approaches don’t work.
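To make the ‘which 70%?’ problem concrete, here is a rough sketch of a percentage-threshold formula. It is not taken from any real tracking system: the objective list, the pupil names and the `grade` function are all made up for illustration, and the 70% cut-off is simply the arbitrary figure used above.

```python
# Hypothetical objective list, invented for illustration only.
OBJECTIVES = [
    "place value", "addition", "subtraction", "multiplication", "division",
    "fractions", "shape", "measures", "telling the time", "Roman numerals",
]
THRESHOLD_PERCENT = 70  # the arbitrary cut-off


def percent_achieved(ticked):
    """Percentage of the objective list that has been ticked off."""
    return 100 * len(ticked & set(OBJECTIVES)) / len(OBJECTIVES)


def grade(ticked):
    """Label a pupil purely from the percentage of objectives ticked."""
    return "expected" if percent_achieved(ticked) >= THRESHOLD_PERCENT else "below"


def gaps(ticked):
    """The objectives the pupil has not yet achieved."""
    return [o for o in OBJECTIVES if o not in ticked]


# Three made-up pupils, each missing a different three objectives.
pupils = {
    "Pupil A": set(OBJECTIVES) - {"shape", "measures", "place value"},
    "Pupil B": set(OBJECTIVES) - {"fractions", "multiplication", "division"},
    "Pupil C": set(OBJECTIVES) - {"telling the time", "Roman numerals", "measures"},
}

for name, ticked in pupils.items():
    print(name, grade(ticked), "| gaps:", ", ".join(gaps(ticked)))
```

All three pupils get the same label from the formula, yet their gaps are entirely different, and the formula has no way of telling us which of those gaps matters most.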

This leads us down the road of weighting objectives: a murky world of scoring objectives according to perceived difficulty or value. But who decides on the weighting? Can everyone agree? Are we going to spend the next few years constantly tweaking the weighting until we get something we accept – a reasonable fix? And based on the examples above, what is difficult for one child is easier for another, so weighting is utterly subjective both for teachers and pupils. In short, you’ll never get it right. And based on our scoring system, what if a pupil scores high on some objectives and low on others, and it all averages out to a middling score? Does that mean they are ‘on track’ or ‘expected’? No!
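The same point can be sketched for a weighted scoring scheme. Everything here is an assumption made for illustration: the 1 to 3 scoring scale, the four objectives, both sets of weights and the two pupils are invented, and no real system is being reproduced.

```python
# Hypothetical scores on a 1-3 scale (1 = not yet, 2 = partly, 3 = secure).
steady = {"fractions": 2, "shape": 2, "telling the time": 2, "Roman numerals": 2}
spiky = {"fractions": 3, "shape": 1, "telling the time": 3, "Roman numerals": 1}

# Three made-up weighting schemes that different people might argue for.
equal_weights = {"fractions": 1, "shape": 1, "telling the time": 1, "Roman numerals": 1}
weights_one = {"fractions": 2, "shape": 1, "telling the time": 2, "Roman numerals": 1}
weights_two = {"fractions": 1, "shape": 2, "telling the time": 1, "Roman numerals": 2}


def weighted_average(scores, weights):
    """Weighted mean of a pupil's objective scores."""
    total = sum(scores[o] * weights[o] for o in scores)
    return total / sum(weights[o] for o in scores)


# With equal weights the two pupils are indistinguishable.
print(weighted_average(steady, equal_weights), weighted_average(spiky, equal_weights))

# Change the weights and the 'spiky' pupil moves above or below the steady one.
print(weighted_average(spiky, weights_one), weighted_average(spiky, weights_two))
```

With equal weights both pupils average out to the same score, even though one is partly secure everywhere and the other is secure in half the content and lost in the rest. Swap the weights and the second pupil lands above or below the first, purely because of who chose the weighting.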

These approaches are so massively flawed, but they are still very much relied upon in so many schools. Senior leaders persist in the belief that their systems are oracles, that they will provide accurate and consistent data, but they are underpinned by subjective assessment – often made begrudgingly and with one eye on performance management – convoluted weighting systems, and arbitrary thresholds. Putting aside the workload issues – which can be enormous – these systems essentially undermine trust in teachers’ judgement. And for those who state “teachers need this because they’re not experienced enough to make that decision themselves”, how can we ever expect teachers to get the experience if we continue to rely on the assessment equivalent of stabilisers? Teachers are professionals – they should be diagnosticians – and they should be perfectly capable of stating whether pupils are where they expect them to be based on all the evidence at their disposal, without having to resort to endless RAG-rated tick lists.

Let’s return to that CEO and his vital question: “how have we reached the point where we have so little faith in teachers’ judgement that we need software to give us the answers?”

It’s worth thinking about.
