Monday, January 26, 2015

Observation Bias?

At a recent networking event for teacher evaluation leaders, we were presented with a provocative research article about classroom observations as the basis for teacher evaluation. One of the authors' contentions is that there is an inherent bias in the Charlotte Danielson rubrics against teachers working with underperforming students. The authors write, “…a rating of ‘distinguished’ on questioning and discussion techniques requires the teacher’s questions to consistently provide high cognitive challenge with adequate time for students to respond, and requires that students formulate many questions during discussion. Intuitively, the teacher with a greater share of students who are challenging to teach is going to have a tougher time performing well under this rubric than the teacher in the gifted and talented classroom.”

My initial response to this was one of disagreement. As a former instructional coach and Charlotte Danielson fan, I wanted to argue that good teaching is good teaching. For years we’ve argued that the rubrics apply equally in all settings: PE classrooms, special education classrooms, basic skills classrooms, and honors classrooms. It’s been a foundation of both our Q-Comp program as well as our teacher evaluation program for over a decade. Despite my initial dismissal of the article, I continued reading. And then the authors shared the data.
The disproportionate ratings gave me pause to reflect on my own teaching experiences. When I returned to teaching after spending three-and-a-half years as an instructional coach, I asked my principal to assign me the students with the greatest needs. It was rewarding. And challenging. And exhausting. How would I have been rated that year? I loved the kids I taught and believed in them, and still they did not engage in the learning in the same way as my students who had historically been successful in school.

The implications of this research are huge. As the stakes associated with teacher evaluation increase, teachers may be less inclined to work with students who need the most support. Currently, there is legislation under discussion to connect lay-offs with teacher evaluation. While 35% of teacher evaluation is now, by statute, based on student achievement measures, the remainder is likely based on administrative observation.

An additional implication of this research is its effect on the feedback teachers receive on their classes. In most cases, teachers choose which of their classes will be the subject of the observation. In a growth model, such as the Q-Comp instructional coach model, teachers will frequently invite a colleague into their most challenging class. The second set of eyes provides insights that lead them to reflect on their practice and consider alternatives to their current methodologies. If, however, the observations are high-stakes, and if the research is correct, teachers will be less likely to invite their administrators into their most challenging classes, bypassing the opportunity to get feedback that would ultimately help their students.

If the research is valid, how might evaluators level the playing field for teachers? The authors of the study suggest a complex statistical analysis of student demographics and performance to create a value-added formula that would adjust for these differences. This may be over-solving the problem, and may actually lead to less transparency and additional issues. A simpler solution may be to raise awareness of this potential bias; that awareness alone may influence administrators’ assessments. It also calls for additional emphasis on the pre-observation and post-observation conferences. Teachers need the opportunity to educate the evaluator on the composition of their classes and to articulate the strategies they are employing to meet the needs of those students.

To be sure, the research presented in this article wasn’t perfect. The sample size was small, and the authors seem to have some biases of their own. Even with those flaws, it should give pause to all of those involved in high-stakes observations.
