Canada’s copyright laws face AI test as creators demand accountability

With the rise of generative Artificial Intelligence (AI) systems in the mainstream, there is growing concern among creators about the volume of copyrighted data being used to train these models. In February 2025, the Canadian Government released its Consultation on Copyright in the Age of Generative AI, which includes observations about the lack of transparency around the use of copyrighted data and how this might be addressed – reflections that likely ring true for nations across the globe.

Creators vs. tech innovators

The first observation made in the Consultation focuses on a prevalent concern amongst creators – that generative AI models use their copyrighted content through text and data mining without obtaining consent or providing compensation.

“Actors from the creative industries are generally strongly opposed to any potential exceptions that would grant permissions to those undertaking data mining to use their works. They would say it should not be done without fair compensation,” said Vincent Bergeron, Principal at ROBIC.

Establishing consent and compensation would require implementing restrictions, but the rapid development of generative AI models means legislators are playing catch-up. On the flip side, tech industry professionals express enthusiasm for this speed of innovation, with concerns that introducing regulations would limit progress, particularly for smaller companies.

“The tech industry is generally in strong favour of such an exception. They’re saying, we’re not extracting the expressive content, we’re just trying to see the factual patterns which are not protected. They argue that on costs and compensation, because of the massive amount of data that is required to train such a model, it would make it almost impossible for smaller companies to compete with the big ones in developing Large Language Models (LLMs),” added Bergeron.

Barriers to regulation

Whether regulation is the right way forward cannot be assessed until it is determined whether it is even feasible. Generative AI models draw on millions of sources for text and data mining, a volume that makes it very challenging to establish which rights holders to seek consent from and compensate. This also blocks potential solutions such as compulsory licensing, under which a licensee (in this case the owner of the AI model) would pay a predetermined fee to the copyright holder.

“Compulsory licensing has been raised as a solution, but I think this would be hard because these systems aren’t just mining copyright protected content – they’re mining everything. It’s hard to know what the data has been trained on. Right now, there is nothing that would force the owner of such a model to disclose these sources. At the moment, it’s more like a fishing expedition for the people filing these lawsuits. They know their content has been mined as they can find traces of the content, or see similar content produced, so they can deduce that the model has mined their content,” explains Bergeron.

“In addition, some companies will have a hard time describing what the model has actually learned. Initially, they’re feeding the model certain types of content intentionally, but at some point the model is also learning from itself, and from synthetic data.”

This lack of available data on the sources used to train AI models makes copyright infringement claims difficult to substantiate, strengthening the case for transparency requirements in any regulation – another observation raised in the Canadian Government’s consultation, since transparency would help identify which creators should then be offered fair compensation.

Currently, it’s unclear whether increased copyright regulation is the path forward – but unpicking the complexities around establishing increased transparency would serve as a good start.