AI Researchers Remove Over 2,000 Child Abuse Image Links from Training Dataset

Researchers behind a prominent dataset used to train AI image generators announced on Friday that they have deleted more than 2,000 web links to suspected child sexual abuse imagery. The dataset, maintained by LAION (Large-scale Artificial Intelligence Open Network), has been widely used to train popular AI tools like Stable Diffusion and Midjourney.

This action follows a December 2023 report by the Stanford Internet Observatory, which revealed that the dataset contained links to sexually explicit images of children. According to the report, that material made it easier for some AI tools to produce photorealistic deepfakes depicting minors, raising significant concerns about the misuse of AI technology.

In response, LAION immediately took down its dataset last year and has since worked closely with Stanford researchers and anti-abuse organizations from Canada and the United Kingdom. After eight months of collaboration, LAION released a cleaned-up version of its dataset, aiming to prevent the future inclusion of harmful content in AI research.

David Thiel, the Stanford researcher behind the December report, praised LAION's efforts to improve the dataset. However, he emphasized the need to remove the "tainted models" still capable of producing explicit child imagery.

One such tool, an older version of Stable Diffusion identified by Stanford as a major source of explicit content generation, was accessible until Thursday. It was removed from the AI model repository Hugging Face by New York-based company Runway ML. Runway described this action as part of a "planned deprecation of research models and code that have not been actively maintained."

The cleanup of the LAION dataset comes as governments around the world increasingly scrutinize tech tools that facilitate the creation and distribution of illegal child imagery. For instance, earlier this month, San Francisco's city attorney filed a lawsuit to shut down websites enabling the generation of AI-created nudes of women and girls. Additionally, French authorities have brought charges against Telegram founder Pavel Durov over the platform's role in distributing child sexual abuse images.

David Evan Harris, a researcher at UC Berkeley, noted that Durov's arrest signals a significant shift for the tech industry: platform founders can now be held personally accountable for illegal activities that their platforms facilitate.