27 June 2008

More Thoughts on Number Crunching

Six months ago, I wrote about what I see as the grail of standards-based grading: how to convert rubric scores to standard A-F grades. The best solution, natch, is not to have to convert them at all. If you have the option of reporting your standards-based scores on the report card, then you're golden. The rest of us have to operate with two different ways of grading and reporting. We must number crunch.

I am a bit torn about this search for some mystical tool for converting between the two systems. The concrete-sequential part of me would love to have some sort of simple formula---something that anyone could use and parents/students/counselors could see. I like the idea of something which maintains the integrity of the scores while communicating a single letter grade. The reality, however, is that grading is messy. No matter how many rubrics, standards, and valid assessment tools you have, we are still human beings evaluating the work of other human beings. Reliability is always going to be an issue.

Knowing all of that, I'm still looking for a way to tilt at this Quixotic windmill. Here is where my thoughts have led so far.

There is a rough sort of equivalence. If you think about the continuum above, student performance can either be described along an A - F scale or a 4 - 1 scale. If we plot those, then we think about an F as the lowest possible grade and an A the highest. In Standards-based Land, the lowest possible scores are 1's and 2's while the highest are 3's and 4's. I am leaving percentages out of this for two reasons. First, their only purpose is to rank students, and that is not the point here. Here, we want to consider an individual's performance. Second, percentages don't translate well to an ordinal sort of scale (if you're the kind of teacher who uses 10-point spreads between A - D and a 60-point spread for an F).

Although I don't have the other letter grades plotted here, we might think of them as being evenly spaced along the line. This idea led me to draft the graph shown below.

Here again, things are a bit messy. I chose not to hatch the y-axis because the number of standards evaluated in any given grading period can vary. I think we could safely say that any performance which included no 3's (evidence of standard performance) would be an "F," and student performances of all 3's (or above) would be an "A," but between that, things get interesting. If you're a school which uses + and - along with letter grades, can we use the number of 1's and 2's as a way to distinguish between a C+ and B-?

I have to think some more about the possible practical applications (if any) of a graph like this. Could a teacher, perhaps, use this to develop some sort of algorithm at the end of each grading period? If four standards had been assessed during a given grading period, could you get to a point where two 3's + one 2 + one 1 = C-?
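To make the question concrete, here is a minimal Python sketch of what such an end-of-term rule could look like. Only the two endpoints come from the reasoning above (no 3's or 4's is an "F," all 3's or 4's is an "A"). Everything in between is an assumption made purely for illustration: the function name `letter_grade`, the even spacing of B, C, and D by share of standards met, and the use of 1's versus 2's to pick + or -. It is one possible rule, not a recommended or tested one.

```python
def letter_grade(scores):
    """Map a list of 1-4 rubric scores to a letter grade (illustrative only)."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s not in (1, 2, 3, 4) for s in scores):
        raise ValueError("scores must be integers from 1 to 4")

    total = len(scores)
    at_standard = sum(1 for s in scores if s >= 3)

    # These two endpoints follow the post: no 3's/4's is an F, all 3's/4's is an A.
    if at_standard == 0:
        return "F"
    if at_standard == total:
        return "A"

    # ASSUMPTION: B, C, and D are spaced evenly by the share of standards met.
    # Integer arithmetic avoids floating-point surprises at the cutoffs.
    if 3 * at_standard >= 2 * total:
        base = "B"
    elif 3 * at_standard >= total:
        base = "C"
    else:
        base = "D"

    # ASSUMPTION: among below-standard scores, more 2's than 1's earns a "+",
    # more 1's than 2's earns a "-", and a tie leaves the grade unmarked.
    ones = scores.count(1)
    twos = scores.count(2)
    if twos > ones:
        return base + "+"
    if ones > twos:
        return base + "-"
    return base


# Four standards assessed in one grading period.
print(letter_grade([3, 3, 2, 1]))
print(letter_grade([3, 3, 1, 1]))
```

Under these particular assumptions, two 3's plus one 2 plus one 1 comes out as a plain C, and swapping the 2 for another 1 drops it to a C-. Different cutoffs would give different answers, which is exactly the judgment call a teacher would still have to make.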

There are different "end users" for grades. I understand that a college will look at a transcript differently than a parent, student, employer, or other teachers. We all see different things in the alphabet soup that emerges at the end of a reporting period. As a teacher, my most important goal is that students can explain why they have earned the grades that they have---that they know what their grades represent. I don't have any way to have those conversations with the other stakeholders, but I would like to think that we're all more or less on the same page. Maybe that's the real purpose behind the number-crunching.


Jim Anderson said...

Why does your rubric go from 1-4, and not, say, 1-5?

The Science Goddess said...

It certainly could, depending upon what the teacher was using. I use a 1 - 4 scale in my class, mainly because it's in line with what the state uses for scoring---and what our elementary schools use for both grading and reporting (it's familiar to students).

Roger Sweeny said...

Maybe I'm missing something but ... why not make your rubric go from 1 to 5 and then rename 1,2,3,4,5 to F,D,C,B,A?

Of course, you'll still have the problem that some assessments are worth more than others but you'd also have that problem in a 1-4 reporting world.

The Science Goddess said...

I suppose you could, but then what would be the point? If a 5 = A, 4 = B..., then why not just use A, B, C...to begin with?

I don't see any problem with just using letter grades as long as what they mean is described by standards. After all, whether it's letters or numbers, they're just symbols being used to represent something. The goal here is to ensure that they represent learning (and not a hodgepodge of other things).

Personally, I think using a numeric scale was important this year because it forced kids (and parents) to think about grades differently. Because they weren't using some sort of pre-conceived notion of what an "A" meant, our conversations were much more focused on learning (and not grades). It wasn't about percentages, or averages, or weighting. We talked about whether or not there was enough evidence to "convict a student of learning."

Roger Sweeny said...

What's the point? As you say, to get kids and parents to think about grades differently by presenting them in a different "language" (even if you know, and the kids will soon know, that there is a simple translation from that language to the report card letter language).

I'm not sure what you mean when you say the numerical score "wasn't about percentages, or averages, or weighting." If you have to give a term-end summative grade, even if it is on a 4-point number scale, and even if it only represents "learning," you will still have to somehow "average" your assessments, and you will probably want to weigh some of them more than others.

The Science Goddess said...

It's possible that there are teachers out there who will want to do that. As for me, I didn't use averages or weighting. It's really about looking for something summative. At the end of term, what do these scores add up to in terms of understanding where a student is with the standards?

Most teachers use some sort of percentages for scoring assignments...I only used rubric scores. 3's were just fine for earning A's (at end of term) because the goal is to meet the standard. A 3 doesn't represent a range of percentage right.

It's quite likely that there will never be some simple way to take a series of rubric scores at the end of a term and mash them into a single letter grade (and I wish I didn't have to). Kids and I discussed the grades that would be on their report cards this year and that worked well. The most important thing (to me) is that they could talk about what their grade represented and why.

MR. MARTI said...

I know you mentioned that there is no grail in grading, but I do have some questions. Can you explain the 2-D chart some more? I understand that the more 3's and 4's you have, the higher grade you should get. But what does the x-axis represent? What are the 1's and 2's? And what do you mean by the "receives letter grade indicated" and "receives below letter grade indicated"?

Also, what was your final grade distribution? Was it different than before?

Sorry for the tons of questions, but your blog has really got me thinking about next school year.

The Science Goddess said...

Excellent questions! Really, the more questions the better, because it helps me refine my own thinking.

I use a 4-point scale in my grading, with 1 and 2 representing work that is below standard, 3 representing work at standard, and 4 representing work that exceeds the standard. Only the 1's and 2's are shown on the graph (3's and 4's are not, because a student would have to have them in order to reach an A, B, C, or D).

The x-axis on that chart is trying to represent a continuum. For example, both a "B" and a "C" must mean that there are at least some scores at 3 or 4, but a student who has 1's mixed in is closer to a C...and a student with 2's mixed in is moving toward a B. It's an attempt to think about the variety of scores students are going to generate and determine how those might "fit" on an A - F scale.

I have not done this yet, but I would like to see if it is possible to scatter plot the scores and determine the line of best fit. From there, it would seem that you could determine the report card grade if you knew (a) how many 3's/4's a student had (x-axis) and (b) how many 3's/4's were possible (y-axis). If the point where those met was above the line of best fit, the student would have the letter grade shown directly below on the x-axis. If not, they earn the next letter grade down. (Although + and - could certainly be factored in, too.)

I don't have my grade info handy at the moment; however, from what I recall, I had fewer F's than in previous years. It had been 4 or 5 years since I had taught "regular" biology---and this was also a very different SES population. Hard for me to say with certainty that distributions were different.

What I can say, however, is that for the first time, the grades actually represented student learning only. That was awesome for both kids and me.