The latest exam fiasco should be used as a catalyst to finally fix the current "outdated" system, according to one local teacher.
Rory Steel shared what we can learn from the saga in his latest column for Express...
"On Sunday I sat down and wrote a scathing take-down about why the UK Government and exams regulator Ofqual, should trust teachers and not mathematical algorithms. Student demonstrations, angry teachers and a huge national backlash have forced a staggering but important U-Turn. As of 16:00 yesterday, teachers predictions will be used for both GCSEs and A Levels.
So why the sudden U-turn?
While last week's A-Level results threw university placements into chaos, GCSE results involve hundreds of thousands of even younger students, so the potential for harm was even more acute: some two million grades were expected to be downgraded.
Many colleges had already jumped the gun and said they would ignore the algorithm and use teachers' predicted grades instead. The UK Government had lost the high ground, if it ever really had it.
Last week it became clear that teachers were not trusted: 40% of grades were reduced by statistical modelling, which played a more significant role than many had dared suspect. The looming dread many felt was that GCSE results would rely even more heavily on a similar mathematical model.
While there were many variables, A-Level students' grades were at least partially anchored to their GCSE results from two years ago; current GCSE students do not have that waypoint. Teachers have been gathering data on GCSE students for at least five years. We know what they should be getting, but the exams regulator, Ofqual, said we were too generous.
The reality is that teachers give students results based on their prior work, while the exam system relies every year on a single one- to two-hour exam to judge them. Many cram at the last minute and pull off the seemingly impossible; others feel the overwhelming pressure of several high-stakes tests placed on a young mind and fail to realise their potential. It is this last group that effectively brings teachers’ predictions down to the levels Ofqual expected.
As a teacher, I am not going to try to predict which students will ultimately succumb to that ridiculous pressure cooker, constructed to test memory over real-life practical skills. I will not construct an exam week that places the same stresses on them; I will test their knowledge in a more stable mock-exam environment. It would be sadistic to put a child through that kind of mental stress more than once in their life. I have predicted, and will continue to predict, a student’s grade based on numerous data points gathered over many years and on my experience, not on a single moment where they are crowded into a mental torture chamber.
So why was the exams regulator so hell-bent on using statistical modelling over teacher predictions?
Surely, if teachers were all just inflating their grades, Ofqual could simply have taken those predictions and moderated them down? As someone who, each year, did the statistical analysis at my school the day before results day, I know the depressing answer: they are simply disconnected from the day-to-day running of a school and don’t understand the human reality of statistics.
Before a recent “shake-up” that doesn’t yet affect current GCSE students, all Year 7 students (11-year-olds) took a CAT (cognitive abilities test) that was not based on any subject knowledge. It was a series of logic- and word-based tests that formed a prediction of each child’s potential, including predicted GCSE grades five years into the future. The test takes a few hours and has flaws, but the catch is that its predictions are actually quite accurate for a whole cohort, which is why Ofqual likes mass prediction modelling.
The reality of this “accuracy” is that while the predictions for a whole country, school district or even a single school average out, the model is woeful at predicting individual students. When I used to analyse school results data, I would consistently see the discrepancies: some students would achieve a result one or two grades lower than their prediction, others one or two grades higher. The statistical reality is that those variations cancel each other out. When all the students’ grades are added together and treated as a whole, it leaves an “accurate” overall prediction en masse, despite the individual errors. I believe this gave Ofqual a misplaced confidence in mathematical modelling at a student level, because it works so well when averaged over five million grades.
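You can see the effect with a toy simulation (a minimal sketch, not Ofqual’s actual model; the grade scale and error spread are purely illustrative assumptions):

    import random

    random.seed(1)

    # Hypothetical setup: 100,000 students with "true" GCSE grades on the
    # 9-1 scale, and a model whose individual predictions err by up to two
    # grades either way, with no overall bias.
    N = 100_000
    true_grades = [random.randint(3, 7) for _ in range(N)]
    predicted = [t + random.choice([-2, -1, 0, 1, 2]) for t in true_grades]

    # Cohort level: the errors cancel, so the averages almost match...
    print(sum(true_grades) / N, sum(predicted) / N)

    # ...yet most individual students still receive the wrong grade.
    wrong = sum(t != p for t, p in zip(true_grades, predicted))
    print(f"{wrong / N:.0%} of individual predictions are wrong")

Averaged over the whole cohort the model looks flawless; for the one student opening their results envelope, it is close to a coin toss.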
Over five years, some students inevitably over- or underachieve for a variety of reasons, while others do hit those mathematical expectations. The fear for many was that the last five years of effort from the students, and indeed the teachers, would count for nothing if teacher assessments were once again largely ignored; we are all now collectively breathing a sigh of relief.
As a statistician, I see how the modelling works, but as a teacher I see the human variation that statistics can overlook. There are far too many human variables for a mathematical model to understand, which is why the teacher prediction is so important. We saw this most starkly in how disadvantaged students’ A-Level results were disproportionately downgraded. The model said that if a class had fewer than 15 students, the teacher’s grade would be relied on more heavily, due to the lack of data. The issue was that private schools in the UK tend to have smaller class sizes. This statistical oversight saw UK private schools achieve 4.7% higher A-A* results, while their comprehensive counterparts in some sectors saw no increase.
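Reduced to its bones, the rule described above looks something like the sketch below (a deliberate simplification: the 15-pupil threshold comes from the paragraph above, but the weighting is my own illustrative assumption, not Ofqual’s published standardisation formula):

    def final_grade(teacher_grade: float, model_grade: float,
                    class_size: int) -> float:
        # Illustrative blend: classes under 15 pupils lean on the teacher's
        # grade because the statistical model lacks historical data; the
        # exact weighting here is an assumption for demonstration only.
        if class_size < 15:
            teacher_weight = (15 - class_size) / 15  # smaller class, more trust
            return (teacher_weight * teacher_grade
                    + (1 - teacher_weight) * model_grade)
        return model_grade  # 15 pupils or more: the statistical model decides

Because small classes cluster in fee-paying schools, the branch that trusts the typically more generous teacher grade fires disproportionately for their pupils, and the gap in results follows mechanically.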
The UK Government and the exams regulator knew they had got it wrong last week: the last-minute changes, the debacle that was the appeals process. But they didn’t have a fallback position they trusted. There will be many articles explaining that this cohort now has an advantage over their peers of previous years, with “inflated” grades. I’m sure there will be schools singled out as having played the game; if so, they deserve scrutiny.
For now, however, maybe the mental stress this year group has gone through means they deserve a break. They have narrowly avoided the collateral damage that statistical variation would inevitably have brought them this week.
The new question is: could we use this as a catalyst to replace an outdated exam system that is no longer fit for purpose, and continue to trust teacher predictions?
I don’t think the UK Government is brave enough, but I can always dream."