Stanford University researchers found that an AI system gave more praise and less criticism to essays written by Black students.
This finding suggests that linguistic biases within AI tools can lead to differential feedback based on race and gender. Such disparities may affect how students perceive their academic progress and the quality of guidance they receive from automated systems.
The study analyzed 600 middle-school essays [1]. Researchers observed that essays by Black students received positive feedback more often, and criticism less often, than essays by students in other groups [2].
The researchers said the AI exhibited linguistic biases. These biases likely stem from the data used to train the system, creating a pattern where the AI responds differently depending on the perceived identity of the student.
Automated grading and feedback tools are increasingly used in U.S. classrooms to reduce teacher workloads. However, the Stanford results indicate that these tools may not provide equitable evaluations across different demographics [1].
The study highlights a gap between the intended neutrality of AI and the actual output produced by these models [2]. By identifying these patterns, researchers aim to push developers toward more transparent and unbiased training sets.
This research underscores the risk of 'algorithmic bias,' in which AI can discriminate not through overt hostility but through skewed positivity or leniency. In an educational context, giving specific groups too little criticism can be as detrimental as giving them too much, because it may deprive students of the critical feedback they need for academic growth.





