Joanna's blog: a collection of thoughts about education and assessment for learning: August 2013

Wednesday, 7 August 2013

What do Grades Mean?

A ‘grade’ is a summative expression of performance in a task or examination taken at a particular time. Grades can be expressed as letters (A, B, C), numbers (6, 5, 4), grade descriptors (excellent, good, satisfactory) or sometimes as percentages which correspond to particular marks or grades. Attaining a particular grade in an exam, for example, should not be confused with a measure or indication of progress over time, because a grade given for a piece of work or for an exam performance is just a reflection of that particular performance and nothing else. Examination grades are only an approximation of a particular achievement, as the same student sitting the same exam paper can produce different outcomes on different days. Grades also depend on the types of questions set, the mark schemes and the quality of markers, as well as the reliability of the whole process of quality assurance. Parents and policy makers would like to believe that examination grades are exactly reliable; however, this is not the case because, for many reasons, there is an element of error in any test.

In the case of public examinations, grades are moderated and standardized to ensure, as far as possible, grade validity and reliability, so that certain comparisons can be made, and to warrant confidence in the system. Ensuring comparability of examinations in different subjects has been more controversial and harder to achieve, because it has to reflect the different levels of difficulty of different subjects. Although statistical models are applied to analyse, for example, GCSE grades for subjects with different degrees of difficulty, absolute inter-subject reliability is not easy to achieve: not only are some subjects harder than others, but there are also gender differences in achievement across subjects, and differences in attainment between top grades and lower grades, where on average the differences between the highest grades are twice as big as those between the bottom grades. In England in 2004, about 600,000 candidates’ GCSE scripts were analyzed[1] in order to construct greater grade reliability between different subjects.

Samples on the scale of 600,000 candidates are very large indeed and not applicable at school level, where students’ work is routinely graded in the course of their studies. Therefore, it can be quite difficult to establish with any certainty what the actual grades mean and how they translate from achievement in one subject to another. This grade consistency can be difficult to achieve even within one subject, unless a robust moderation system is in place.

On a practical level, I am often asked what attaining a “6” or “64%”, for example, in a test means. It means exactly what it says: that a particular student’s performance was judged as “6” or “64%” attainment in this particular test. This grade or score does not give any other information and, as mentioned above, is only an approximation of a student’s performance. It is a summative judgement of a performance in a particular task at a given time, and it is not a predictor of any future performance, which can change with effort, task and many other variables. Similarly, assigning a student to a particular set (where schools have different ability sets in some subjects) reflects the best-fit ability position at the time and should not in any way be taken as a predictor or an indication of where the student may end up with further learning and effort. In other words, these are positions at a given time and should not be viewed as fixed, as this could be counter-productive to future learning and student effort.

When I asked students what type of feedback was helpful to their learning and whether they understood grades/marks in different subjects, these were some of the typical answers:

“Corrected work and told us how to be done right.”

“It is different in different subjects and I don’t really understand what the grades mean.”

“I understand grades and marks some of the time.”

“It is useful when teachers tell us what we’ve done well and how to improve.”

“It helps when it shows were you could’ve done better. I don’t really know what is a B in history and what it is in science.”

It seems that students are rather confused regarding what their grades mean and make frequent references to guidance on improvement, which is what they seem to value as helpful feedback to future learning.

There is another risk in too much focus on grading: students may see themselves as a certain grade of performer, e.g. a C-grader or even an A-grader, and stop putting in further effort because they are satisfied with the grades already attained. This attitude puts a ceiling on learning, even at the higher end, where students may stop trying their best through continued effort and develop a ‘fixed mindset’ (satisfaction from own ends).

Parents, who have the best intentions at heart, may contribute to this type of mindset, as they often put too much emphasis on grades and can praise ability as a form of encouragement. This is counter-productive to effort and learning development, and results in setbacks for students, because they become reluctant to try in case they fail and may become defensive, blaming outside factors for their lack of achievement (Dweck[2]).

The meaning of grades can be even more confusing when looking at the grading of transfer tests/examinations, where different institutions set their own grade criteria and boundaries. Examples of such tests, where there is no moderation and no grade standardization, are transfer tests to different or senior schools, for instance the ISEB Common Entrance examination.

Confused about the meaning of grades in these situations?

I am.

We should be questioning the validity of such examinations as they can have a negative impact on learning and render grades quite meaningless, to be frank. They also contribute nothing in terms of performance/data analysis because of the lack of any standardization. The only purpose they serve is selection to particular institutions according to their own criteria.

Therefore, if we are really concerned with learning and individual progress, we should be questioning the meaning of the status quo regarding reporting educational progress in the form of grades or levels, where level descriptors inhibit the overall performance and undermine learning[3], and grades can be ambiguous and can put a ceiling on learning.

To serve students well, we need to have high expectations and involve them in their learning to a greater extent, where we value their voice and guide them to their next steps of learning through formative feedback, and the grades will take care of themselves...

Dr Joanna Goodman

[1] Comparability of GCSE examinations in different subjects: an application of the Rasch model

[2] Dweck, C. (2000). Essays in Social Psychology. Self-theories: Their Role in Motivation, Personality, and Development. Hove: Brunner/Mazel.

[3] A report by the Expert Panel for the National Curriculum review

Saturday, 3 August 2013

Does the New Curriculum Aim to Improve Standards?

Examining curricula in different countries, it seems that there can be a gap between what the curriculum prescribes and what is actually being taught at the class level. This discrepancy is mainly down to the fact that, at the grass-roots level, teachers are predominantly concerned with how the examination boards examine different parts of the curriculum, and their teaching is focused on what is required to pass examinations successfully. In this sense, high-stakes assessment drives the curriculum content which is being taught to students, especially if the programmes of study are not sufficiently linked with attainment targets.

This concerns me, as we are currently on the brink of introducing the new national curriculum content aimed at raising educational standards. The policy makers appear to be convinced that the new national curriculum, with its greater focus on early knowledge acquisition, will be a driver for much-needed improvement in educational standards. So far, the curriculum content has been emphasised with little reference to aims and purposes, despite the recommendations of the Expert Panel (James, Oates, Pollard, Wiliam), who stated in their report[1]:

We believe defining curricular aims is the most effective way of establishing and maintaining coherent provision.

If the Government is sincere in its desire to reduce central prescription, we need to evaluate the goals implicit in our current practices and select only those that provide a sound basis for the future. In other words, we need be very clear about the particular aims and purposes of the school curriculum and the justification for them – bearing in mind the needs of society, the nature of knowledge, and the needs of pupils, as well as comparisons with other jurisdictions. Then we need to be thorough in our analysis of what content will serve them best.

The Expert Panel also advised about a proper consultation process with all stakeholders involved and cautioned against the pace of changes:

...we believe it is right that there should be a period of engagement/consultation on the key decisions that have the potential to radically change the National Curriculum, beyond changes to the content. This is important given the pace of the review. In Hong Kong, a review process extended over a decade[2].

So far, most of the recommendations have been ignored. Although Programmes of Study for some subjects have been published, they are being developed as we go along and, it would seem, without much considered attention to aims and purposes, as the only aim seems to be the rush to push the changes through.

Do the policy makers really believe that the new framework has the “potential to result in radical change to the National Curriculum, beyond change to curriculum content” (Expert Panel Review)? The risk could be another missed opportunity and that risk seems very real indeed.

If the Programmes of Study are not sufficiently linked to effective assessment that defines the expected standards, the danger again is that the new curriculum may fail to raise standards – the raison d’être behind driving all the changes. The experts agreed:

We emphasise the importance of establishing a very direct and clear relationship between ‘that which is to be taught and learned’ and assessment (both formative and ongoing and periodic and summative).

The threat still remains that, if these direct links between what is being taught and assessed are not established at the outset, high-stakes testing developed by examination boards to serve the new qualifications may be the real driver for what is being taught and learnt. Unless there are well developed, explicit Programmes of Study clearly linked to Attainment Targets defining expected standards, the danger is that teachers will teach to the test and schools will deliver irrelevant qualifications in the quest for satisfying self-manipulated accountability aims (in the high-stakes accountability game of meeting required standards), despite the Panel's recommendation to “reduce the flexibility schools currently enjoy to ensure that the Key Stage 4 curriculum meets the vocational and academic aspirations of their students at the time”.

The students only have one chance – let’s give them the best chance possible.

[1] Department for Education (2011). The Framework for the National Curriculum. A report by the Expert Panel for the National Curriculum review. London: Department for Education.

[2] Kwok, S. (2008). New Horizons in Cultivation of Talents: a decade of education in Hong Kong. Hong Kong: Education Bureau.

Thursday, 1 August 2013

Brief Overview of the use of AfL in English Schools

It appears that the use of AfL in everyday classroom practice is still quite patchy, despite the overwhelming evidence of the effect that effective use of AfL has on pupils’ learning. The reported ‘effect size’ (the ratio between the average improvements in pupils’ scores on tests and the range of scores for typical groups of pupils on the same tests) is between 0.4 and 0.7 (Black and Wiliam, 1998), where a gain of effect size of 0.4 equates to an improvement of 1 to 2 grades in GCSEs (public examinations at 16).
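To make the definition of effect size concrete, here is a minimal sketch of the calculation. It assumes that the "range of scores" is measured by the standard deviation, pooled across the two groups, which is a common convention but an assumption here rather than something the post specifies. The function name `effect_size` and both sets of scores are hypothetical and invented purely for illustration.

```python
from statistics import mean, stdev


def effect_size(intervention_scores, comparison_scores):
    """Difference in mean scores divided by the pooled standard deviation.

    Assumption: the 'range of scores' in the definition is taken to be
    the pooled sample standard deviation of the two groups.
    """
    n1, n2 = len(intervention_scores), len(comparison_scores)
    s1, s2 = stdev(intervention_scores), stdev(comparison_scores)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(intervention_scores) - mean(comparison_scores)) / pooled_sd


# Hypothetical test scores (percentages), not real data.
comparison = [40, 50, 60, 70, 80]
intervention = [48, 58, 68, 78, 88]

d = effect_size(intervention, comparison)
print(round(d, 2))  # prints 0.51
```

In this made-up example the intervention group scores 8 percentage points higher on average, and the spread of scores is about 15.8 points, so the effect size is roughly 0.5, which falls inside the 0.4 to 0.7 band quoted above.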

The strategies seem to be poorly understood by the teaching profession and, despite investment in training and the Ofsted requirement for evidence of AfL in practice, its effective use is still in the very early stages of development.

In a recent interview, Dylan Wiliam (TES, July 2012) expressed his disappointment with poor implementation of the AfL principles, after 14 years of government initiative, by stating:

There are very few schools where all the principles of AfL, as I understand them, are being implemented effectively.

The problem is that government told schools that it was all about monitoring pupils' progress; it wasn't about pupils becoming owners of their own learning.

Dylan also expressed his regret at using the word “assessment” (in AfL):

“The big mistake that Paul Black and I made was calling this stuff 'assessment',” he said. “Because when you use the word assessment, people think about tests and exams. For me, AfL is all about better teaching.”

My research also confirms that teachers are reluctant to implement AfL fully, mainly because they lack in-depth knowledge of what it involves and can be content to carry on with their tried and trusted methods. As any change involves a certain shift away from trusted methods, teachers can be reluctant to make this change for fear of ‘letting go’.