Case Title: 2. Rethinking Grading Part 2

Element Title: The averaging trap (7 of 13)

Averaging is where the grading train often comes off the rails. While we certainly need aggregation schemes to arrive at an overall judgement of performance, that judgement doesn't have to be based on the average of all of a student's scores. Take the two students below. Are they the same?

Student | Assignment 1 | Assignment 2 | Assignment 3 | Assignment 4 | Assignment 5

Let's revisit with percentages:

Student | Assignment 1 | Assignment 2 | Assignment 3 | Assignment 4 | Assignment 5
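
To make the comparison concrete, here is a small sketch with made-up numbers. The two score lists are hypothetical and are not the values from the table; they were chosen only so that the mean, median and mode coincide while the pattern of performance differs.

```python
from statistics import mean, median, mode

# Hypothetical percentage scores on five assignments (illustrative only,
# not the values from the course table).
steady = [70, 70, 70, 70, 70]
improving = [40, 70, 70, 70, 100]


def summarize(scores):
    """Return the three usual averages plus two things they hide."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": mode(scores),
        "range": max(scores) - min(scores),  # spread of the scores
        "latest": scores[-1],                # most recent evidence
    }


for name, scores in [("steady", steady), ("improving", improving)]:
    print(name, summarize(scores))
```

In this sketch the mean, median and mode are identical for both students, yet the range and the most recent score tell two different stories.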


The point here is that while the mean, median and mode are all the same, these are really different students in terms of their level of mastery. If these were free throw shooting rates, who would you want on the line at the end of the game?

Let's take a test that has been developed using a table of specifications and a guiding rubric that describes the complexity of work at each level. For simplicity's sake, let's just say the test is made up of 'A', 'B' and 'C' questions (see image). If a student got all of the C questions right and all of the B questions right, they would still have a failing percentage grade (66% if the three sections carry equal weight; a D is really just a fancy F). Is that really a fair or accurate description of what this student knows and can do? Why not give them a B? Why not score each section separately and, depending on the highest level they reach, assign them that "tier," a.k.a. grade? Why not retest a specific section if you need to? The point is that we can and should think hard about what our aggregation schemes are doing.
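
The contrast between a percentage grade and a tier grade can be sketched in a few lines. This is only an illustration under stated assumptions: ten questions per section, a mastery cut-off of 80% per section, and a rule that a tier counts only if every easier tier was also met. None of these numbers or rules come from the test described above; the names `percentage_grade`, `tier_grade` and `MASTERY_THRESHOLD` are hypothetical.

```python
# Tiers ordered from least to most complex, using the labels from the text.
TIER_ORDER = ["C", "B", "A"]
MASTERY_THRESHOLD = 0.8  # assumed cut-off for "mastered this section"


def percentage_grade(correct, total):
    """Classic averaging: pool every question into one percentage."""
    return 100 * sum(correct.values()) / sum(total.values())


def tier_grade(correct, total, threshold=MASTERY_THRESHOLD):
    """Score each section separately and return the highest tier reached.

    Assumption: a tier only counts if all easier tiers were also met.
    Returns None if even the 'C' section falls below the threshold.
    """
    earned = None
    for tier in TIER_ORDER:
        if correct[tier] / total[tier] >= threshold:
            earned = tier
        else:
            break  # this section is a candidate for a retest
    return earned


# Hypothetical student: every C and B question right, no A questions right.
total = {"C": 10, "B": 10, "A": 10}
correct = {"C": 10, "B": 10, "A": 0}

print(round(percentage_grade(correct, total), 1))  # pooled percentage
print(tier_grade(correct, total))                  # highest tier reached
```

Under these assumptions the pooled percentage is about 66.7, while the tier rule reports a B. The section where the loop stops is the natural place to offer a retest.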

We do need aggregation protocols that are defensible, but they don't have to be averaging. Averaging is just a statistical tool. Use it wisely. Here's how I aggregate scores in my course.