Evaluating Teachers Who Lack Test Data
One of the most cogent arguments against a teacher evaluation system such as the one recently adopted by Tennessee (and others) is the question of how to evaluate teachers and other school professionals who don’t have test data. On that point, The Tennessean ran a recent story about the “wait and see” mode that we’re all in as regards the new evaluations and the soon-to-be-created Teacher Evaluation Advisory Committee.
As it stands now, this group of folks is, by far, in the majority. TCAP scores capture only about 30% of Tennessee teachers at this point. So, how do we evaluate the other 70% of school staff (including librarians, guidance counselors, music teachers, and P.E. teachers)? It’s a good question, but it is NOT unanswerable. Taking a look around the web, it seems at first as if there are a lot of folks asking the question, but not a lot offering answers. It’s mostly pointed out, with trembling hands and quavering voice, like some impassable Scylla and Charybdis (either that, or it’s triumphantly trotted out as an impassable obstacle destined to doom any such effort). For those who are pointing this obstacle out, here are some ideas:
- Memphis is planning to up the number of teachers with test data from 30% to 64% over 5 years. They plan to do this in the following manner:
  - Year 1 (p. 33): Convene task force to develop agreed-upon approach to capture value-added data for as many specialist teachers as possible (e.g., SPED, ESL, Reading specialists)
  - Year 2 (p. 34): Implement methodology to capture value-added for specialists (e.g., SPED, ESL, Reading Specialists) based on existing assessment data for students with whom they work most; Plan for implementation of additional value-added data from new End of Course exams
  - Year 3 (p. 34): Capture value-added data for all teachers in subjects with new End of Course exams implemented as part of the TN Diploma project; Plan implementation of additional assessments to expand value-added to additional grade levels and subject areas (as needed)
  - Years 4-5 (p. 35): Tests in place to capture value-added data for all major academic subjects (~65% of teachers); Implement additional assessments across grade levels and subject areas to expand the percent of teachers with value-added data (as needed)
- D.C. uses a substitute measure, much more heavily weighted on observations, for teachers without tests (h/t Prichard Blog):
We could create more authentic measures of “student achievement/improvement” for each teacher. This would involve creating more in-depth evaluations based on observations, student work, teacher feedback, student portfolios, videotapes, etc. The National Board for Professional Teaching Standards (NBPTS), which I have discussed in another context, actually has a pretty good model for evaluating teachers, looking at videotaped lessons, examples of student assignments, examples of student feedback, and portfolios showing growth.
- Some places (e.g., Denver and Washington, D.C.) use school-wide achievement data as an aspect (though not a large one) of non-tested teachers’ evaluations.
CECR (The Center for Educator Compensation Reform) has an excellent paper on the subject called “The Other 69%.” Granted, this agency is advocating (strongly) for compensation reform (read: pay for achievement), but their research is pretty high quality. Also note that the report came from a Vanderbilt staffer, Cynthia Prince. Looks like we may have some expertise in this right in our own back yard! Among its conclusions:
- For K-2 (an area of major concern, since traditional tests are notoriously unreliable for catching student learning/development at this age):
  - Create a developmentally appropriate rubric to assess how well teachers in pre-kindergarten to Grade 2 are supporting young children’s development on dimensions such as students’ cognitive development, social-emotional development, motor development, and language acquisition.
  - Use student results from adaptive tests such as the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) or Measures of Academic Progress (MAP) to assess teacher performance at the early grades.
  - Use measures other than individual classroom achievement as a way to include teachers of pre-K to Grade 2 in the compensation system.
- For non-tested subjects (art, music, etc., but also social studies and science in many places):
  - Teachers are eligible for schoolwide performance bonuses only.
  - Teachers are eligible for some, but not all, of the individual performance bonuses that teachers of core academic subjects are eligible to receive.
  - Student test scores are not used to determine non-core teachers’ eligibility for rewards. Instead, eligibility is based exclusively on non-test measures, such as observed evaluations of classroom performance, acquisition of additional knowledge and skills, or assumption of additional roles or responsibilities. All of these, and other non-standardized test measures, may be displayed in a thoughtful portfolio.
  - New student tests are created to assess teacher performance in non-core subjects.
  - There is an interesting variation on the schoolwide performance bonus option: Teachers of non-tested subjects get to choose what kind of skills they will emphasize in class (likely reading vs. math) and then be a part of the school-wide bonus for those gains. This seems particularly interesting, since I could easily imagine an art teacher incorporating geometry and, at more advanced levels, basic trigonometry (angles, perspective, etc.). South Carolina, through the TAP program, is doing something along these lines.
- For ELL teachers:
  - Base performance rewards for teachers of English language learners on schoolwide achievement gains or reward them by team when the performance of English language learners improves.
  - Use student gains in English language proficiency, in addition to gains in subject matter knowledge, as an additional performance measure for teachers of English language learners.
  - Use knowledge and skills-based pay structures to reward teachers of English language learners for their expertise.
- For teachers of students with disabilities:
  - Base performance rewards for teachers of students with disabilities on schoolwide achievement gains.
  - Reward teacher teams when the performance of students with disabilities improves.
  - Develop a new “student sharing” average to assess the performance of special education teachers.
There is, of course, much richer detail in the report, and I encourage anyone interested in the subject to read it.
***************
The point here is not that this is going to be easy, or that there is a ready-made solution. Clearly, tackling the issue of pay for achievement for teachers of non-core subjects or specialized students (early grades, ELL, students with disabilities) is going to take some careful thought, planning, and tweaking once the system goes into place. HOWEVER, this challenge is not insurmountable. States around the country, as well as academics, administrators, and teachers, have been grappling with this question already. There are options out there. For my money, evaluating teachers on a portfolio model would be one of the best things we could do. Instead of a one-shot test score that needs (secret) statistical manipulation to be “correct,” why not look at the continuum of student work and teacher feedback/reflection over the course of a year? Combine this with observations of teacher practice, and you have a strong measure of how effective a teacher is. Certainly this would work for art/music/shop teachers (compare student work from the beginning of the year with student work from the end of the year, combined with interspersed observations during the year), but would also be a wonderful measure for all teachers.
The problem you run into, however, is the same one you run into with students. We know what richer tests/evaluations look like. We can do a lot better than multiple choice (or a one-shot test score). For science, it would be much more genuine (for HS Chemistry, say) to put kids in a lab with materials and ask them to set up an experiment to create a precipitate out of a specific solution (or whatever). That’s a genuine test. But that takes time and money and isn’t as easy to grade. Same thing with teachers. If we can just plug in a TVAAS score, it’s a lot quicker and easier (though, given the ridiculous amounts we pay SAS, I’m not sure cheaper) than doing a full-on, genuine evaluation of teachers using more authentic measures like portfolios and observations. Don’t take this to mean that my support for holding teachers accountable for student achievement gains has weakened. It hasn’t. And I still think some sort of student achievement score should be in the mix. However, in the end, quickest and easiest isn’t usually the best model. I know we’ve been focusing so much on doing what’s best for students, but in this case, doing what’s best for teachers may well be the best thing for everybody.

Humans were not intended to be quantified.