Last year, the Writers Guild of America (WGA) labor union, which represents film and TV writers, went on strike for nearly five months, in part to regulate AI's role in scriptwriting. "Alexa will not replace us," read one picket sign.
Now, researchers at Penn Engineering, Haverford College, and Penn State have presented a paper at the 2024 Association for Computing Machinery Conference on Fairness, Accountability and Transparency (ACM FAccT) that identifies a previously unreported drawback to writing scripts using OpenAI's ChatGPT: content moderation so overzealous that even some PG-rated scripts are censored, potentially limiting artistic expression.
The research is published in The 2024 ACM Conference on Fairness, Accountability, and Transparency.
The guidelines established by the agreement between the WGA and the Alliance of Motion Picture and Television Producers (AMPTP) that ended the strike permitted certain uses of AI in scriptwriting. While both the WGA and AMPTP agreed that AI cannot be credited as a writer, they allowed the use of AI as a tool in the creative process.
The new study raises questions about the efficacy of this approach, showing that automated content moderation restricts ChatGPT from producing content that has already been permitted on television. ChatGPT's automated content moderation filters for topics including violence, sexuality and hate speech to prevent the generation of inappropriate or dangerous content.
The study examined both real and ChatGPT-generated scripts for IMDb's 100 most-watched television shows, including Game of Thrones, Stranger Things and 13 Reasons Why. ChatGPT flagged nearly 20% of the scripts it had generated itself for content violations, and nearly 70% of the actual scripts from the shows on the list, including half of the tested PG-rated shows.
"If AI is used to generate cultural content, such as TV scripts, what stories won't be told?" write the paper's co-senior authors, Danaë Metaxa, Raj and Neera Singh Assistant Professor in Computer and Information Science (CIS) at Penn Engineering, and Sorelle Friedler, Shibulal Family Computer Science Professor at Haverford College.
"We tested real scripts," says Friedler, "and 69% of them wouldn't make it through the content filters, including even some of the PG-rated ones. That really struck me as indicative of the system being a little overager to filter out content."
Prompted by the writers' strike, the project began with Friedler and Metaxa wondering if a large language model (LLM) like ChatGPT could actually produce a high-quality script. "We started trying to produce scripts with LLMs," recalls Metaxa, "and we found that before we could even get to the question of whether the script is high quality, in many cases we were not able to get the LLM to generate a script at all."
In one instance, given a prompt drawn from a summary of an episode of Game of Thrones, ChatGPT declined to produce the script and responded with a red warning, "This content may violate our usage policies."
More information: Yaaseen Mahomed et al, Auditing GPT's Content Moderation Guardrails: Can ChatGPT Write Your Favorite TV Show?, The 2024 ACM Conference on Fairness, Accountability, and Transparency (2024). DOI: 10.1145/3630106.3658932
Citation: Censoring creativity: The limits of ChatGPT for scriptwriting (2024, June 12) retrieved 12 June 2024 from https://techxplore.com/news/2024-06-censoring-creativity-limits-chatgpt-scriptwriting.html