The entire internet is turning into a generative AI mess caused by OpenAI. Even Grok, supposedly an original large language model, mimics OpenAI's responses when refusing a task ("As it goes against OpenAI's use case policy…"). This phenomenon is called "model collapse", in which models trained on AI-generated text gradually lose the original human-written information and replace it with the synthetic output of other AI models.
Last year, this issue was highlighted in a research paper titled 'The Curse of Recursion: Training on Generated Data Makes Models Forget'. The paper found that using model-generated content in training causes irreversible defects in the resulting models, where the tails of the original content distribution disappear. At the time, the warning was seen as farsighted but largely theoretical; now, evidence of the problem has begun to emerge in deployed models.
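To see why the tails vanish, consider a toy simulation (not the paper's experiment, just an illustrative sketch with assumed settings): each "generation" fits a simple Gaussian model to the previous generation's samples, then produces only synthetic data for the next one. The heavy-tailed structure of the original data is lost almost immediately.

```python
# Minimal sketch of recursive training on generated data.
# All settings (t-distribution source data, n=2000, |x|>6 as the "rare event"
# threshold) are illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Generation 0: "real" data with heavy tails (Student's t, 3 degrees of freedom).
data = rng.standard_t(df=3, size=n)

for gen in range(21):
    mu, sigma = data.mean(), data.std()      # fit a simple Gaussian "model"
    tail = np.mean(np.abs(data) > 6)         # fraction of rare, extreme samples
    if gen % 5 == 0:
        print(f"gen {gen:2d}: sigma={sigma:.3f}  P(|x|>6)={tail:.4f}")
    # The next generation trains only on synthetic samples from the fitted model.
    data = rng.normal(mu, sigma, size=n)
```

In a run like this, the rare events present in the original data all but disappear after a single synthetic generation, and the fitted spread slowly drifts with each round of resampling. That is the "disappearing tails" effect the paper describes, scaled down to a few lines of code.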
According to experts, the issue will only worsen as the internet is flooded with generative AI output from OpenAI's GPT models, Meta's LLaMA, xAI's Grok and others. It is increasingly likely that, in the future, AI models built by different companies will all look the same, producing near-identical results.