
Last week I came across a 2019 article published on Kellogg Insight about how numeric performance reviews can be biased against women. The article focused on 10-point rating systems and proposed a simple fix: replacing the 10-point scale with a six-point one. As I read through, I was struck by some underlying assumptions folded into the narrative. The overarching theme suggested that men held biases against women. However, as the article worked through the studies, it reported that in more balanced teams men and women were rated equally, while bias was more prevalent in male-dominated settings. The researchers argued that, on a 10-point scale, a "10" is perceived as perfection, but on a six-point scale, a "6" isn’t. They linked the gender rating gap to stereotypes around what a "perfect 10" looks like.
When you dig into the evidence, there's an inference that the root cause is connected to familiarity: raters tend to view performance through the lens of familiar traits. This is consistent with other findings that people tend to form a universal view of performance based on their own characteristics. In unbalanced populations, these traits often align with those of the dominant group, which frequently disadvantages women, who are underrepresented in male-dominated fields and leadership roles. This is important to be aware of because, while we often don’t see bias in ourselves, recognising bias in the system can help us review our own decisions more objectively.
With many organisational talent systems using performance as a measure of top talent—and diversity widely recognised as critical to driving innovation and improvement—bias in performance rating systems can significantly impact long-term organisational performance. To better understand this issue, I looked into more recent research to identify recurring themes and potential solutions.
Latest Research on Performance Bias
A 2022 study from Stanford echoed similar themes to the Kellogg research, identifying gender bias in performance reviews, especially in male-dominated industries. The researchers found that while women were often praised for communal behaviours, such as being helpful, they weren’t rewarded for these traits. Instead, their technical skills were scrutinised more harshly than men’s. Meanwhile, men were penalised for behaviours perceived as being "soft," such as a lack of assertiveness. The researchers noted that vague evaluation criteria exacerbated these biases, allowing managers’ stereotypes to influence their judgments. This lack of clarity leaves room for subjective interpretations, which often perpetuates bias.
A 2023 review further emphasised the persistence of biases in job evaluations, showing that non-white employees tend to receive lower scores than their white counterparts. This discrepancy was linked to "response bias"—where ratings on performance tasks are influenced by the evaluator’s overall impression of the employee, rather than their actual work quality. For example, a positive or negative perception of an employee could shape ratings on unrelated tasks, thus distorting the feedback provided.
Additionally, a 2021 study[1] explored how perceptions of fairness in performance appraisal systems impact overall organisational performance. The study highlighted that these systems are less effective in knowledge-based work environments, where tasks are often non-routine, complex, and collaborative. Traditional performance measures, which tend to favour short-term, quantifiable goals, fall short in these contexts, underscoring the need for more tailored evaluation criteria in certain industries.
Another key driver of performance discrepancies is the quality of feedback received. A 2022 study found that employees who received regular, high-quality feedback, including clear information about their rank within the team, significantly outperformed those who received low-quality or no feedback. Interestingly, no significant difference in performance was found between those given low-quality feedback and those receiving none at all. While this study didn’t directly tie its findings to bias, it suggests that low-quality feedback—often vague or biased—can create a self-fulfilling prophecy. Employees who are perceived more favourably by their managers tend to receive better feedback, which enhances their performance, while those who are less favoured struggle to improve, regardless of their potential. This ties into the idiosyncratic rater effect [2], where more than half of a manager’s rating of someone else reflects their own biases and characteristics, not the employee’s.
This reminded me of the well-known Rosenthal and Jacobson (1965) experiment in educational psychology. In this study, students at The Oak School were given the "Harvard Test of Inflected Acquisition," and teachers were told that certain students were identified as "growth spurters" based on the results. By the end of the year, these randomly selected students showed a 10 to 30-point IQ increase. The catch? There was no such thing as a "Harvard Test of Inflected Acquisition", and "growth spurters" were chosen at random. The improved performance was due to the teachers’ higher expectations, which influenced how they interacted with the students. Interestingly, this effect was less pronounced with older children, likely because teachers had already formed solid perceptions of these students. This experiment highlights how perceptions can powerfully shape performance.
Addressing Bias in Performance Reviews
Given that bias is ingrained in the human psyche, what options do we have to create a more equitable performance environment (beyond bias training), one that supports our organisations in thriving? Here are a few approaches:
1. Transparency and Consistent Criteria
Focus on outcomes-based objective setting. By rewarding outcomes rather than effort, you reduce the impact of bias on how work is perceived. When the focus is on what’s been achieved, rather than how, there's less room for subjective biases. Switching to outcomes-based working requires planning, training, and, initially, some difficult conversations, but it will fundamentally change how business performance is measured.
2. Involve Employees in Setting Evaluation Criteria
Rating scales that use leaders' expectations as measures of performance are susceptible to bias. While your 5-point scale with a normal distribution might work for your reward system, it can bury employee achievements under subjective interpretations of what each point on the scale means. Involving employees in defining these scales, and making the definitions more objective, can deliver fairer results.
3. Smaller, More Frequent (Outcome-Based) Reviews
Bias thrives in ambiguity. Regular, smaller reviews that focus on tangible outcomes can drive more meaningful, performance-based discussions. This approach prevents managers from curating narratives based on vague recollections of an employee’s work over long periods, thus reducing the influence of bias on final ratings.
4. Blind Reviews / Calibration
Removing identifying information, such as gender, race, or even names, from performance reviews can help minimise bias. Blind assessments encourage managers to focus solely on performance data, reducing the influence of personal characteristics that might unconsciously sway their judgments. While this method may not work for all aspects of performance, especially in smaller teams or roles with heavy interpersonal elements, it can be effective in reviewing quantitative achievements or deliverables.
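As a rough illustration of the blinding step, here is a minimal Python sketch. It assumes each review is a simple record; the field names ("name", "gender", "race", "deliverables_completed") and the sample record are hypothetical, not taken from any particular HR system. Identifying fields are removed and replaced with an opaque ID, and the lookup key is kept separately so ratings can be matched back to people once the review is complete.

```python
import uuid

# Assumed names of identifying fields; adjust to your own records.
IDENTIFYING_FIELDS = {"name", "gender", "race"}

def blind_reviews(reviews):
    """Return (blinded_reviews, key).

    blinded_reviews: copies of the input records without identifying
    fields, each tagged with an opaque review_id.
    key: maps review_id back to the original name, to be held by
    someone other than the reviewer.
    """
    blinded, key = [], {}
    for review in reviews:
        review_id = uuid.uuid4().hex
        key[review_id] = review.get("name")
        stripped = {k: v for k, v in review.items()
                    if k not in IDENTIFYING_FIELDS}
        stripped["review_id"] = review_id
        blinded.append(stripped)
    return blinded, key

# Hypothetical record for illustration only.
sample_reviews = [
    {"name": "Employee One", "gender": "F", "race": "X",
     "deliverables_completed": 12},
]
blinded, key = blind_reviews(sample_reviews)
```

Free-text comments can still reveal identity through pronouns or project names, so a sketch like this only suits the quantitative parts of a review.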
5. Use Data Analytics to Track Bias
Leveraging data can reveal trends that human evaluators might overlook. By analysing patterns in performance ratings (e.g., differences between genders, races, or other groups), organisations can identify where bias might be occurring. This data can highlight whether certain groups are consistently receiving lower ratings despite equal performance, allowing leadership to address disparities before they impact promotions, pay, or other opportunities. An analysis I once conducted, which looked beyond the usual end-of-process gender metrics, revealed that mid-level women were hitting a ceiling because they were not being offered the same development opportunities as their male counterparts. This triggered a change to the development planning process, increasing transparency and promoting more development of female staff.
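To make the idea of "lower ratings despite equal performance" concrete, here is a minimal Python sketch. It assumes each record carries a group label, an outcome band (a stand-in for equal performance, such as "met all objectives"), and the rating awarded. The field names and sample figures are hypothetical and purely illustrative. Within each outcome band it compares the average rating of two groups, so any gap reflects how similar results were rated rather than differences in results.

```python
from collections import defaultdict
from statistics import mean

def mean_rating_by_band_and_group(records):
    """Return {outcome_band: {group: mean rating}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for r in records:
        buckets[r["outcome_band"]][r["group"]].append(r["rating"])
    return {
        band: {group: mean(ratings) for group, ratings in groups.items()}
        for band, groups in buckets.items()
    }

def rating_gaps(records, group_a, group_b):
    """Return {outcome_band: mean(group_a) - mean(group_b)}.

    Bands where either group has no records are skipped.
    """
    means = mean_rating_by_band_and_group(records)
    return {
        band: groups[group_a] - groups[group_b]
        for band, groups in means.items()
        if group_a in groups and group_b in groups
    }

# Hypothetical records for illustration only, not real data.
sample = [
    {"group": "A", "outcome_band": "met all", "rating": 4},
    {"group": "A", "outcome_band": "met all", "rating": 5},
    {"group": "B", "outcome_band": "met all", "rating": 3},
    {"group": "B", "outcome_band": "met all", "rating": 4},
    {"group": "A", "outcome_band": "met some", "rating": 3},
    {"group": "B", "outcome_band": "met some", "rating": 3},
]
gaps = rating_gaps(sample, "A", "B")
```

A persistent positive or negative gap within the same band is a prompt for closer investigation, not proof of bias on its own; small groups in particular can produce gaps by chance.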
By taking these steps, organisations can begin to mitigate the effects of bias in performance reviews, creating a fairer and more accurate assessment process that benefits both employees and the company’s long-term success.
References
[1]: (2021), "Academic performance appraisal systems: design, fairness and effectiveness", Human Resource Management International Digest, Vol. 29 No. 6, pp. 36-38.
[2]: Scullen, S.E., Mount, M.K. and Goff, M. (2000), "Understanding the latent structure of job performance ratings", Journal of Applied Psychology, Vol. 85 No. 6, pp. 956-970.