How can you tell whether content has been used to train an artificial intelligence?

Language models, like the one behind ChatGPT, are accused of having been trained on copyrighted texts.

While seventeen American writers represented by the Authors Guild, including Jonathan Franzen and John Grisham, filed a complaint in September against OpenAI and its conversational agent ChatGPT for copyright infringement, a team from Imperial College London has found a way to detect whether a literary or scientific text was seen by a language model during its training.

As a reminder, a language model is the software that allows a chatbot to respond to and communicate with a human in natural language: the conversational robot produces grammatically correct sentences, adapts its style, coins original expressions, and so on. These capabilities are obtained through a rather “brute-force” learning method, which consists of making the model guess the next word in a sentence, over a huge corpus of texts reaching trillions of “tokens” (semantic subunits such as syllables, prefixes, suffixes, etc.). These texts come from web pages, forums, scientific articles, books and newspaper articles, most likely protected by copyright.
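To make this “guess the next word” mechanism concrete, here is a minimal sketch using a small public model (GPT-2, chosen purely as an assumed stand-in, not the model behind ChatGPT): the text is cut into sub-word tokens and the model assigns a probability to every possible next token.

```python
# Minimal illustration of next-token prediction, the training signal described above.
# GPT-2 is an assumed stand-in for any causal language model.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

sentence = "The quick brown fox jumps over the lazy"
ids = tokenizer(sentence, return_tensors="pt").input_ids
print(tokenizer.convert_ids_to_tokens(ids[0].tolist()))  # the text split into sub-word "tokens"

with torch.no_grad():
    logits = model(ids).logits[0, -1]        # scores for every possible next token
probs = F.softmax(logits, dim=-1)
best = torch.argmax(probs).item()
print(tokenizer.decode([best]), probs[best].item())  # the model's guess and its probability
```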

Few players describe this corpus in detail, even those whose language models are said to be open source. OpenAI does not communicate this information; Meta did so for Llama, but not for Llama 2; and Google no longer does so for Bard…


Despite this lack of transparency, can we read the “brains” of these algorithms, which consist of billions of parameters? Can we know what they have or have not read? The English team answers in the affirmative. “We were motivated by the idea of making this aspect of language models less opaque, because what they know comes precisely from this data,” explains Yves-Alexandre de Montjoye, associate professor at Imperial College.

An opaque training corpus

The researchers carried out a so-called “membership inference” attack on a large language model, Llama, from the company Meta, or rather on an identical version, OpenLlama, whose training corpus was made public, which made it possible to test the researchers’ predictions, presented in a preprint (an article not yet accepted by a scientific journal) submitted to a conference on October 23.

The researchers first selected their own corpus of books (38,300 of them) and scientific articles (1.6 million), drawn from the RedPajama database hosted by the company Hugging Face. Each of these two families was split in two: “possible member of the training corpus” or “non-member” (because it was produced after OpenLlama’s training). For each token in these texts, they tested the language model by studying which word it suggests after a context of about 128 tokens, and what probability it assigns to the actual next word. These gaps between the model and reality, accumulated across thousands of sentences, make it possible to construct a kind of signature for each book or article. “We actually look at whether the model is ‘surprised’ by a text,” summarizes Yves-Alexandre de Montjoye. In a second step, they built a program that classifies a text as “member of the training corpus” or “non-member”, by training it on the results obtained for the two types of text. These calculations take approximately one minute per book of roughly 100,000 tokens.
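As an illustration of the approach described above, here is a minimal sketch of such a “surprise”-based membership test. The model name (openlm-research/open_llama_3b), the 128-token context, the histogram-based signature and the logistic-regression classifier are assumptions made for the example; the Imperial team’s exact features and classifier may differ.

```python
# Sketch of a membership-inference signature: how "surprised" is the model by a text?
import numpy as np
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "openlm-research/open_llama_3b"  # assumed stand-in for OpenLlama
CONTEXT = 128                                 # ~128-token context, as in the article

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def token_log_probs(text: str) -> np.ndarray:
    """Log-probability the model assigns to each real token, given the preceding context."""
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    scores = []
    with torch.no_grad():
        for i in range(1, len(ids)):
            context = ids[max(0, i - CONTEXT):i].unsqueeze(0)  # up to 128 preceding tokens
            logits = model(context).logits[0, -1]               # prediction for the next token
            log_p = F.log_softmax(logits, dim=-1)[ids[i]]
            scores.append(log_p.item())                         # low value = model is "surprised"
    return np.array(scores)

def signature(text: str, bins: int = 20) -> np.ndarray:
    """Turn per-token surprise values into a fixed-size signature (here, a simple histogram)."""
    hist, _ = np.histogram(token_log_probs(text), bins=bins, range=(-20.0, 0.0), density=True)
    return hist

def train_membership_classifier(member_texts, non_member_texts):
    """Second step: train a classifier on signatures of known members and non-members.
    `member_texts` / `non_member_texts` are placeholders for the two document families."""
    X = [signature(t) for t in member_texts] + [signature(t) for t in non_member_texts]
    y = [1] * len(member_texts) + [0] * len(non_member_texts)
    return LogisticRegression(max_iter=1000).fit(X, y)
```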

