As artificial intelligence becomes more sophisticated and more capable of closely depicting reality, researchers at Carnegie Mellon University's Human-Computer Interaction Institute (HCII) are working to ensure that the outputs of large language models are representative of the communities they reference.

This work is the primary focus of William Agnew, a Carnegie Bosch postdoctoral fellow and one of the lead organizers of Queer in AI. Alongside Carnegie Mellon, Queer in AI is a member of the National Institute of Standards and Technology's AI Safety Institute Consortium, which seeks to advance the trustworthiness and safety of AI systems.

"Researchers, corporations and governments have long and painful histories of excluding marginalized groups from , deployment and oversight," Agnew and the other organizers of Queer in AI wrote in their paper on AI risk management. "As a result, these technologies are less useful and even harmful to minoritized groups."

Since starting to work with this organization approximately eight years ago, Agnew has used his expertise to analyze the integrity of training datasets for large language models. Through his work, he helps AI developers identify and overcome biases across mediums—generated text, images, voice and music—with the end goal of helping technology be more equitable in its application.

"These audits really have the goal of asking, are they representative?" Agnew said. "Are they inclusive or are they biased? Do they contain toxic stereotypes? Are they taking people's intellectual property or other work without their permission? Do communities not want to be in these datasets?"

Answering all of these questions requires diving into the data behind the content. "By understanding the dataset, we can really understand what's going to happen in all the downstream models," he said.

The ultimate goal of this work is to empower those who have previously been left out of discussions around privacy and security in the implementation of AI.

"A lot of communities want to have control over their data and their representations. They don't want companies to decide how they are represented in media or AI. They want to control that," Agnew said. "It's valid and important. Marginalized groups have had decades if not centuries of stereotypes, caricatures and misrepresentation in media."

Agnew also explained that anyone whose content ends up online, not just members of marginalized communities, would benefit from the ability to opt out of inclusion in these datasets, pointing to the growing practice in journalism of training models on authors' writing without their consent.

How does representation in the AI age differ from the past?

Until recently, questions of representation and belonging—most notably, how marginalized communities depict and perceive themselves—have fallen into the domain of traditional artists and historians.

Harrison Apple, founder of the Pittsburgh Queer History Project and associate director of the Frank-Ratchye STUDIO for Creative Inquiry in the College of Fine Arts, noted that the historical archive itself plays a role in recording events and situations as "technologies of belonging"—tools that identify members of a group through shared experience. For archivists, this experience takes the form of a shared and localized past. For technologists and social media users, it is an immediate and globally accessible present.

Individuals in both contexts are often used as a point of reference to identify or demarcate their entire community, even if they cannot consent to it. "Setting out to form a community is such a fraught project. Community is deployed by whoever wields the word—to protect, to destroy—but it is always a circumscribing concept," Apple said.

In their article critiquing community archives, titled "I Can't Wait for You to Die," Apple identified the ability to use imagery to shape discourse around a topic as a privilege. For instance, archival exhibitions often focus on those who are no longer alive, with the goal of helping modern-day onlookers develop a sense of community and identity. However, because deceased individuals cannot approve the use of their stories or likenesses, ethical questions can arise for those entrusted with personal material.

More information: Rebuilding Trust: Queer in AI Approach to Artificial Intelligence Risk Management. docs.google.com/document/d/19d … ading=h.fkl2vs8ckg7z

