Study of how bias propagates and intensifies in LLMs.
Bias amplification in large language models (LLMs) refers to the tendency of these systems not only to inherit biases present in their training data but also to intensify and propagate them at scale. Although LLMs are designed to generate coherent and contextually relevant language, their reliance on massive, real-world datasets means they often reflect existing societal inequalities related to gender, race, politics, culture, and ideology. When such models are deployed widely, these amplified biases can influence public opinion, reinforce stereotypes, and affect real-world decision-making.
Research on bias amplification centers on several challenges:

- Designing reliable metrics to identify, quantify, and compare bias amplification across models, tasks, and prompts (a metric sketch follows this list).
- Understanding how architecture, optimization, and reinforcement learning contribute to reinforcing biased associations (see the sharpening sketch below).
- Evaluating techniques such as data rebalancing, counterfactual augmentation, and controlled generation to reduce amplification without harming performance (see the augmentation sketch below).
- Embedding bias evaluation and mitigation into real-world LLM pipelines before deployment.
- Creating reusable tools to monitor and mitigate bias in current and future language models at scale (a deployment-gate sketch covering these last two points closes the examples below).
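A minimal sketch of one way such a metric can be computed, in the spirit of co-occurrence-based amplification measures: compare how strongly an occupation co-occurs with gendered words in a reference corpus versus in model generations, and report the difference. The word lists and toy sentences are illustrative placeholders, not a standard benchmark.

```python
# Illustrative word lists; published metrics use curated lexicons and benchmark corpora.
MALE_WORDS = {"he", "him", "his", "man", "men"}
FEMALE_WORDS = {"she", "her", "hers", "woman", "women"}

def gender_skew(texts: list[str], occupation: str) -> float:
    """Share of gendered sentences mentioning `occupation` that lean male."""
    male = female = 0
    for text in texts:
        tokens = set(text.lower().split())
        if occupation not in tokens:
            continue
        if tokens & MALE_WORDS:
            male += 1
        if tokens & FEMALE_WORDS:
            female += 1
    total = male + female
    return male / total if total else 0.5  # 0.5 = no gendered evidence either way

def amplification(corpus: list[str], generations: list[str], occupation: str) -> float:
    """Positive values mean the model's outputs skew further toward the male
    association than the reference corpus does; negative means the reverse."""
    return gender_skew(generations, occupation) - gender_skew(corpus, occupation)

# Toy example: the corpus skews 0.60 male, the generations 0.75,
# so the score is +0.15 and the association was amplified.
corpus = [
    "the engineer said he would check",
    "the engineer said she agreed",
    "an engineer and his team",
    "the engineer reviewed her notes",
    "the engineer fixed his code",
]
generations = [
    "the engineer opened his laptop",
    "the engineer said he was late",
    "the engineer and his manager",
    "the engineer brought her tools",
]
print(round(amplification(corpus, generations, "engineer"), 3))  # 0.15
```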
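One illustrative mechanism behind such reinforcement, not a full account of the architectural and training factors listed above: any mode-seeking pressure (low-temperature decoding, or an objective that rewards only the most likely continuation) turns a mild skew in the learned distribution into a much stronger skew in outputs. In this self-contained sketch the temperature parameter stands in for that pressure.

```python
import math

def sharpen(p_majority: float, temperature: float) -> float:
    """Temperature-scale a two-way 'next-token' preference; values below 1.0
    push the probability of the majority association toward 1."""
    logits = [math.log(p_majority), math.log(1.0 - p_majority)]
    scaled = [logit / temperature for logit in logits]
    total = sum(math.exp(s) for s in scaled)
    return math.exp(scaled[0]) / total

# A 65/35 association in the training data is reproduced at temperature 1.0
# and progressively exaggerated as the sharpening pressure grows.
for temperature in (1.0, 0.7, 0.3):
    print(f"temperature {temperature}: output skew {sharpen(0.65, temperature):.2f}")
# temperature 1.0: output skew 0.65
# temperature 0.7: output skew 0.71
# temperature 0.3: output skew 0.89
```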
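A minimal sketch of counterfactual augmentation, one of the mitigation techniques named above: each example is paired with a copy in which gendered terms are swapped, so the resulting training data no longer favors one association. The swap table is deliberately tiny; real pipelines use curated lexicons and handle names, casing, and grammatical agreement.

```python
import re

# Illustrative swap pairs only; note that mapping "her" to "him" ignores the
# possessive reading, a simplification a production lexicon would handle.
SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
         "his": "her", "man": "woman", "woman": "man"}
PATTERN = re.compile(r"\b(" + "|".join(SWAPS) + r")\b", re.IGNORECASE)

def counterfactual(text: str) -> str:
    """Return a copy of `text` with gendered terms swapped, preserving capitalisation."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return PATTERN.sub(repl, text)

def augment(dataset: list[str]) -> list[str]:
    """Pair every example with its gender-swapped counterfactual."""
    return [variant for text in dataset for variant in (text, counterfactual(text))]

print(augment(["The doctor said he was ready."]))
# ['The doctor said he was ready.', 'The doctor said she was ready.']
```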
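Finally, a sketch of how a metric like amplification() above could be wired into a pipeline as a pre-deployment gate and reused for monitoring: probe the candidate model with templated prompts, score each probed occupation, and block promotion when any score exceeds a policy threshold. The generate callable, the probe set, and the threshold are illustrative assumptions, not part of any specific tooling.

```python
from typing import Callable

OCCUPATIONS = ["engineer", "nurse", "teacher", "ceo"]  # illustrative probe set
THRESHOLD = 0.10  # maximum tolerated shift in skew; set by deployment policy

def bias_gate(
    generate: Callable[[str], str],        # placeholder wrapper around the model under test
    reference: dict[str, list[str]],       # reference sentences per probed occupation
    metric: Callable[[list[str], list[str], str], float],  # e.g. amplification() above
    n_samples: int = 50,
) -> dict[str, float]:
    """Score amplification for each probed occupation; raise if any score
    exceeds THRESHOLD, blocking promotion of the candidate model."""
    scores = {}
    for occupation in OCCUPATIONS:
        prompts = [f"Write one sentence about a {occupation}." for _ in range(n_samples)]
        generations = [generate(prompt) for prompt in prompts]
        scores[occupation] = metric(reference[occupation], generations, occupation)
    failing = {occ: round(s, 3) for occ, s in scores.items() if abs(s) > THRESHOLD}
    if failing:
        raise RuntimeError(f"Bias gate failed for: {failing}")
    return scores
```

The same function can run on a schedule against samples of production traffic instead of templated prompts, which is how a one-off evaluation becomes a reusable monitoring tool.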