Not all evaluations are created equal. Research demonstrates that students can be biased in their feedback for a variety of reasons. What we do with this information could affect both our staff and our students well into the future.
Evaluations are used for multiple purposes, so it’s important to understand their flaws before we base decisions on a potentially biased dataset. Decisions around promotions and awards are made on the back of evaluations, just as changes to the curriculum can derive from comments made in student feedback.
*EDI stands for Equality, Diversity and Inclusion.
What kind of bias shows up in the data?
Gender and race/ethnicity are perhaps the most widely discussed elements of bias that emerge in evaluations. Women are often expected to perform more nurturing roles than men, so when female instructors do not fulfil a perceived caring role, students can respond negatively, while male staff carry no such burden of expectations. Caring roles also have less traditional professional prestige associated with them.
When all conditions are the same, it matters who the students are. A 1978 study in the USA showed that female students judged their teachers equally regardless of gender, while male students judged their male teachers as superior.
Higher education suffers from a lack of diversity across senior teaching and leadership roles, which drives perceptions of who has power within institutions, so it is unsurprising that “women and minorities must work harder to be perceived as equally competent as White men” (see Basow & Martin 2012). There is much literature from the USA that explores this and provides illustrative examples, such as this quote from an academic at Ohio State University:
The complaints are never-ending, voluminous, and contradictory. I talk too loud or not loud enough. I walk too close to people and make them nervous. If I look at students, they are nervous. If I do not look at them they are angry. If I call on them, I am picking on them. If I do not call on them, I have a personal vendetta against them …. When I talk to students in an attempt to ascertain what I do that is so different from the other professors teaching the same section of first-year students, they admit that I do no more in class than their white male professors-my class is no more rigorous, no more intimidating, no more work. In fact, they seem to like the class…. Most students appear to like the use of overheads, the introductory and periodic summaries, and question and answer periods …. The only difference appears to be that I am a Black female …. (Merritt 2008)
The article that uses this quote concludes that social stereotypes are powerful influencers of thought and action, and it is incredibly difficult to replace them with reflective thinking. Such reflective thinking would, however, help students to produce fairer evaluations.
The more patriarchally traditional the discipline, the more pronounced these negative differences can appear. How people teach also affects ratings: whether they deliver small group teaching or lectures, and how didactic or student-oriented they are. Students also care about the grades they get, and the data show a correlation between grade leniency and more positive evaluations from students.
Other factors such as age and attractiveness impact data too. Older staff tend to be given poorer evaluations, while staff deemed attractive are rewarded with kinder evaluations. Culture can also push bias in one direction or another, such as the importance of class or accent. Contemporary social issues can also impact evaluations, such as exaggerated political issues (e.g. the ridiculous “War on Woke”; the hateful smears against the Trans community).
Heffernan’s (2021) synthesis of the literature articulates the issues with evaluations, acknowledging the impact of racism, sexism, homophobia, and bias relating to disability and linguistic diversity. He also notes that free text comments are often abusive towards female staff and marginalised groups. His paper gets to the crux of an uncomfortable truth in higher education: evaluations are typically flawed and can inflict negative consequences on already marginalised staff.
How do we defend against bias?
Perhaps higher education policy needs to change, as argued by Heffernan (2021), so that hiring and promotion systems abolish reliance on flawed evaluations as evidence. In a sector still beset by pay gaps, maybe biased evaluations are part of the problem. This may seem like a step too far for some. Certainly, universities find it tricky to change complex, integrated systems.
Another study found that “informing students of the potential for gender biases can have significant effects on the evaluation of female instructors” (Peterson et al. 2019). This very simple tactic at least partially negated the bias of male students evaluating female instructors. Bias can be conscious or unconscious, so the act of articulating it as a risk can help students to identify and mitigate their own unconscious bias.
There is value to be found in evaluations, so maybe we can find a way to fight against bias instead. One option is to use alternative methods. Here are some suggestions from Baldwin & Blattner (2010):
Use a variety of teaching evaluation methods as “one size fits all” approaches don’t always work.
Use formative assessment of teaching effectiveness and methods rather than relying on one piece of evaluation at the end of a unit.
Put together a teaching portfolio that showcases your practice. This can include multi-media content.
Invite others to evaluate your teaching. This can work well when well-regarded colleagues observe your practice and provide evaluation.
Put results and comments into context, such as the context of adapting to the pandemic.
Similar recommendations, such as contextualising comments, are found in a study by Kreitzer & Cushman (2022). Other recommendations are directed at senior staff who use this data to make decisions, such as treating the data with caution and refusing to rely on it as the only method of assessing teaching. One recommendation suggests the complete elimination of qualitative comments, as this is the space where bias and cruelty towards women and marginalised groups are most evident.
These recommendations are perhaps best reviewed by senior colleagues who provide leadership for career development and steer trends in recruitment in academia. Perhaps a culture-shift in what is taken as evaluation evidence will open our schools to more diverse methods of reviewing and valuing our teaching and learning practice.
Key reading: Basow, S.A. and Martin, J.L. (2012) Bias in Student Evaluations. In: Kite, M.E., Ed., Effective Evaluation of Teaching: A Guide for Faculty and Administrators, Society for the Teaching of Psychology, Washington DC, 40-49 https://ldr.lafayette.edu/concern/publications/fb4948799
Merritt, D. J. (2008). Bias, the brain, and student evaluations of teaching. St. John’s Law Review, 82(1), 235-288