EU's New AI Laws to Mandate Greater Transparency in Data Usage

The European Union has introduced a new set of laws that will require AI companies to be more transparent about the data used to train their models, challenging one of the industry's most closely guarded secrets.

Since the launch of ChatGPT by OpenAI 18 months ago, there has been a significant increase in public interest and investment in generative AI, which can quickly create text, images, and audio. However, concerns have been raised about the sources of data used for training these models and whether using copyrighted materials without permission violates intellectual property rights.

The EU's recently enacted AI Act, which will be phased in over the next two years, aims to address these issues by obliging companies to provide detailed summaries of the data used to train general-purpose AI models such as ChatGPT. The newly formed AI Office plans to issue a template for these summaries in early 2025 after consulting with stakeholders.

Many AI companies are resistant to these new transparency requirements, arguing that revealing their datasets would compromise their competitive edge. "It's like cooking," said Matthieu Riouf, CEO of Photoroom. "There's a secret part of the recipe that the best chefs wouldn't share."

The level of detail required in these transparency reports could significantly impact both smaller AI startups and major tech firms such as Google and Meta. Over the past year, companies like Google, OpenAI, and Stability AI have faced lawsuits for allegedly using copyrighted content to train their models without permission.

In the U.S., President Joe Biden has issued executive orders addressing AI security risks, but copyright issues remain largely unresolved. There is bipartisan support in Congress for requiring tech companies to compensate rights holders for the data they use.

AI companies continue to face scrutiny even after signing content-licensing agreements with media outlets. OpenAI recently faced backlash for using an AI-generated voice that resembled Scarlett Johansson's without her permission.

Thomas Wolf, co-founder of AI startup Hugging Face, supports greater transparency but acknowledges the industry's mixed feelings. "It's hard to know how it will work out. There is still a lot to be decided," he said.

European lawmakers are divided on the issue. Dragos Tudorache, a key figure in drafting the AI Act, believes AI companies should disclose their datasets so that creators can verify whether their work was used. However, French officials, including President Emmanuel Macron, have expressed concerns that stringent regulations might hinder the competitiveness of European AI startups.

As the AI Act rolls out, the industry awaits clearer guidelines on balancing trade secret protection with the rights of copyright holders.