Thursday, March 18, 2010

Grading on a Curve

This is the second post of two about grading issues.

The phrase "grading on a curve" has become nearly meaningless because it is so often misused to mean all sorts of things by people who do not understand what they are discussing.

Here is how it works according to the real, statistical definition.

Picture test scores on a histogram. Here is a histogram of a recent test I gave.

Now, my tests are designed to put students into clumps corresponding to letter grades. In the histogram above you can see the A clump (13 to 15), the B clump (10 to 12), the C clump (7 to 9.5), and the students below that. My tests are designed to compare students to the test's difficulty, and the questions are carefully chosen so that these clumps form.

A well-written test intended for grading on a curve works differently. It groups students in one clump, centered around the average. This is a bell curve.

The vertical lines mark "standard deviations", which is a topic we'll skip for now. But do notice the percentages of students in each section of the histogram. These percentages are the same for any bell curve, whether it measures people's heights or weights or test scores.
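Those percentages can be checked directly. Here is a short sketch, using only Python's standard-library error function, that computes the fraction of any bell curve falling in each standard-deviation band:

```python
import math

def normal_cdf(z):
    """Fraction of a bell curve lying below z standard deviations from the mean."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# The bands between the vertical lines, plus the two tails:
bands = [(-math.inf, -2), (-2, -1), (-1, 0), (0, 1), (1, 2), (2, math.inf)]
for lo, hi in bands:
    frac = normal_cdf(hi) - normal_cdf(lo)
    print(f"{lo:>5} to {hi:>5} SD: {frac:.1%}")
```

Running this prints roughly 2.3%, 13.6%, 34.1%, 34.1%, 13.6%, and 2.3%, and those fractions are the same no matter what the bell curve measures.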

When grading to the curve, students are graded compared to each other. On tests like the SAT, which thousands of students take, this becomes the same as comparing the students to the test's difficulty. But for a small group of students it might not work, just as we can predict that 10% of a large group of men will be taller than 6' 1", yet many small groups of 10 guys have zero guys, or more than one guy, taller than 6' 1".
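That small-group caveat is easy to quantify. Assuming, as above, that 10% of men are taller than 6' 1" (an illustrative figure), the binomial distribution gives the odds for a group of 10:

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_tall = 0.10   # assumed chance any one man is taller than 6'1"
n = 10          # size of the group
p_zero = binom_pmf(0, n, p_tall)
p_exactly_one = binom_pmf(1, n, p_tall)
p_more_than_one = 1 - p_zero - p_exactly_one
print(f"zero tall guys: {p_zero:.0%}")          # about 35%
print(f"exactly one:    {p_exactly_one:.0%}")   # about 39%
print(f"more than one:  {p_more_than_one:.0%}") # about 26%
```

So the "expected" outcome of exactly one tall guy happens in only about four groups out of ten, which is why curving a small class can misfire.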

Traditionally, those vertical lines and percentage categories are used "as is" for letter grades when grading on a curve. So on the right hand side of average:
  • 34% get C's (the right side of the center area)
  • 13.6% get B's (the next area to the right)
  • 2.4% get A's (the two farthest right areas)

On the left hand side of average:
  • 34% get D's (the left side of the center area)
  • 16% get F's (the three farthest left areas)
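Putting the traditional cutoffs together, assigning a curved letter grade is just a matter of measuring how many standard deviations a score sits above or below the class average. A minimal sketch (the class average of 75 and standard deviation of 8 are made-up numbers for illustration):

```python
def curve_grade(score, mean, sd):
    """Assign a letter grade from how many standard deviations a score
    sits from the class average, using the traditional cutoffs."""
    z = (score - mean) / sd
    if z >= 2:  return "A"
    if z >= 1:  return "B"
    if z >= 0:  return "C"
    if z >= -1: return "D"
    return "F"

# Hypothetical class with average 75 and standard deviation 8:
print(curve_grade(92, 75, 8))  # z = 2.125  -> A
print(curve_grade(80, 75, 8))  # z = 0.625  -> C
print(curve_grade(70, 75, 8))  # z = -0.625 -> D
```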

Because deciding before a class even starts that only 2.4% of the students will get A's is harsh by today's standards, many instructors shift the vertical lines and percentage categories. But there is rarely a logical and defensible pedagogical reason for doing so in any particular way, besides the ever-popular "because I'm the teacher and I said so." (Nevertheless, a syllabus that describes how grading happens is a binding agreement that should be respected.)

Many teachers do a sloppy thing they mistakenly call "grading on a curve": boosting everyone's score by the amount the highest score fell below 100. This is a symptom of poor test-writing. Boosting scores might prevent grumbling, but the boosted test neither compares students to the test's difficulty nor to each other. If the student with the highest failing grade complained, how would the instructor defend this grading plan? "I'm not good at math and I've always done it this way" is a terrible reason to assign someone a failing grade! (And a syllabus rarely describes how each test is graded.)
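To see why the "boost" is not a curve at all, note that adding the same number of points to every score changes nothing about how the scores relate to each other. A tiny sketch with made-up scores:

```python
def boost_scores(scores):
    """The 'sloppy curve': add enough points so the top score becomes 100."""
    boost = 100 - max(scores)
    return [s + boost for s in scores]

scores = [91, 78, 77, 62, 45]  # hypothetical raw test scores
print(boost_scores(scores))    # [100, 87, 86, 71, 54]
# Every gap between students is unchanged, so the boost says nothing
# about the test's difficulty and nothing about how students compare.
```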

For students, there are two lessons to take with you and make use of...

First, ask your next term's instructors to describe in detail how they assign grades. Don't trust that they mean the same thing if they say they "curve grades". It might be worth switching to a different section of a class, or waiting until another term, if an instructor grades haphazardly and your schedule is so busy that you are aiming for a C.

Similarly, do not trust quiz grades to predict overall grades until you've talked to the instructor about his or her grading plan. Many instructors grade quizzes in a straightforward manner (70% = C, 80% = B, etc.) and then do something totally different for overall grades. For example, quizzes might compare your knowledge to the quiz difficulty, but overall grades might compare you to your classmates.

Second, if a test puts you on the borderline between passing and failing and you fail, ask to see the instructor's grading histogram. You might be able to argue with calm logic that you belong in the passing clump--or that the test failed to clump students at all, showing it was poorly written, so you should be allowed to take another version of the test.
