Quantcast
Channel: Large Newsroom - Online Journalism Awards
Viewing all articles
Browse latest Browse all 8

Uncovering bias in two leading AI systems

$
0
0
Fourteen boxes shaded with various skin tones are labeled with occupation titles Two charts side by side, both showing small grey squares labeled with job titles of all kinds. The chart on the left is labeled "Average Skin Tone vs. Average Income" and shows a trend line for higher paying jobs going to lighter skin and lower paying jobs to darker skin. The chart on the right is labeled "Gender Proportion vs. Average Income" and shows a trend line of higher paying jobs going to men, and lower paying jobs going to women. A black and pink name sorting chart with the following labels: "How We Derived Demographically Distinct Names", "1 Calculate how often first and last names are associated with White American men from public records", "2 Filter out first names that aren't distinct (below 90%) and select the 100 most popular names. Filter last names to the 20 most distinct", and "3 Pair filtered first and last names randomly to produce distinct names"

About the Project

Bloomberg conducted a pair of months-long, data-driven investigations to systematically test two of the most widely-used generative AI services for harmful biases: Stable Diffusion, an image generator, and OpenAI’s GPT, the technology underpinning ChatGPT.

Stable Diffusion

At the end of 2022, Bloomberg reporters found that more and more institutions were relying on generative AI to create images for advertising, entertainment and other use cases, even though there had been minimal external testing of how safe these systems were for the general public. Bloomberg investigated Stable Diffusion and found that it is prone to exacerbating racial and gender stereotypes. We prompted the AI model to create more than 5,000 images of workers with various jobs, as well as people associated with crime, and compared the results with US government data on race and gender. The analysis revealed that the text-to-image model does not just replicate real-world stereotypes — it amplifies them. For example, women are rarely doctors, according to Stable Diffusion, and men with dark skin are more likely to commit crimes. Through data visualizations and AI-generated photographs, the story explored how these systemic biases in generative AI have the frightening potential to exacerbate real-world inequities.

GPT

Months after publishing the Stability AI investigation, Bloomberg reporters noticed another concerning trend: A cottage industry of services had emerged to help HR vendors and recruiters interview and screen job candidates using AI chatbots, despite concerns that generative AI can replicate and amplify biases found in their training data. Over the course of five months, Bloomberg conducted a data-driven investigation and found that OpenAI’s GPT – the best-known large language model and one that powers some of these recruiting tools – discriminates against names when ranking resumes. We replicated the recruitment workflow in a classic hiring discrimination study: First, we assigned demographically-distinct names derived from public records to equally-qualified resumes. Then we asked GPT to rank the candidates against one of four job postings from Fortune 500 companies. Bloomberg found GPT’s answers displayed stark disparities: Names distinct to Black Americans were the least likely to be the top ranked candidate for a financial analyst role, for example. Meanwhile, names distinct to Asian women were ranked as the top candidate for the analyst role more than twice as often as those associated with Black men. Importantly, Bloomberg did not find one direction of bias across the four jobs we tested. Instead, biases were based on the job posting used to evaluate candidates. For example, GPT preferred names distinct to women for an HR role, adhering to stereotypes. Bloomberg’s reporting shows that names are enough to lead to racialized outcomes when generative AI is used in hiring.

Judges Comments

This investigation provides a social service, demonstrating bias in AI systems. It´s a complex topic to investigate, but extremely relevant because it provides the transparency that algorithms and LLMs don’t provide. The presentation is beautiful and clean, and the search for truth in the darkness is inspirational.

The post Uncovering bias in two leading AI systems appeared first on Online Journalism Awards.


Viewing all articles
Browse latest Browse all 8

Trending Articles