Nature research shows ‘model collapse’: Scientists warn against letting AI eat its own tail

Why Letting AI Train on Its Own Data Could Lead to “Model Collapse”

Model Collapse – A Primer

Scientists from Oxford and other institutions warn that AI models could run into a problem they call “model collapse”.
It arises as AI systems learn from data produced by other AIs instead of original human-created content.

AI models work by identifying patterns in the data they are trained on.

So, for instance, if you ask an AI how to make a snickerdoodle, it will tend to give the recipe that appeared most often in its training data.

If later AIs are trained on output that came from other AIs rather than from people, they begin to treat these common patterns as even more common than they truly are.
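The amplification described above can be sketched with a toy calculation. This is only an illustration under stated assumptions, not the researchers' method: the recipe names and starting shares are hypothetical, and the `exponent` parameter is an assumed stand-in for a model's tendency to favor whatever it saw most often.

```python
def sharpen(shares, exponent=1.5):
    """One generation: the next model trains on output that over-represents
    whatever the current model already considers common (exponent > 1)."""
    weights = {name: share ** exponent for name, share in shares.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def simulate(shares, generations=5, exponent=1.5):
    """Return one share table per generation, starting with the original data."""
    history = [dict(shares)]
    for _ in range(generations):
        history.append(sharpen(history[-1], exponent))
    return history


# Hypothetical shares of snickerdoodle recipe variants in human-written data.
HUMAN_DATA = {"classic": 0.60, "brown butter": 0.25, "vegan": 0.10, "chai": 0.05}

if __name__ == "__main__":
    for generation, table in enumerate(simulate(HUMAN_DATA)):
        print(generation, {name: round(share, 3) for name, share in table.items()})
```

In this sketch the share of the most common recipe grows with every generation while the rarer variants shrink toward zero, even though nobody's actual baking habits changed.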

Image Courtesy – Nature.com

AI-generated images are also influencing how AI “thinks”: a model trained on them learns what things look like from generated pictures rather than from real ones.
For example, if many of the dogs in web search results look like golden retrievers because people keep posting AI-made pictures, a model trained on those results will picture most dogs as golden retrievers, even though the images were never photographs of real dogs.
With time, these distortions add up and make the AI worse at understanding dogs as they really exist.

AI-generated content is teaching AI models that the world works the way it appears in recycled data.
This creates a downward spiral, called model collapse, in which each generation of AI gets less accurate and its output more distorted.
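The downward spiral can be imitated with a very small simulation. The sketch below is a toy under stated assumptions, not the scientists' experiment: each "model" is just a bell curve (a mean and a spread) fitted to a handful of samples drawn from the previous model, and the function name, sample size, and generation count are all illustrative choices.

```python
import random
import statistics


def collapse_demo(generations=500, sample_size=20, seed=0):
    """Repeatedly fit a normal distribution to samples from the previous fit.

    Returns the fitted spread (standard deviation) for each generation,
    starting with the original "human" data at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: the assumed real-world distribution
    spreads = [std]
    for _ in range(generations):
        # The new model only ever sees output from the previous model.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        spreads.append(std)
    return spreads


if __name__ == "__main__":
    spreads = collapse_demo()
    for generation in (0, 100, 200, 300, 400, 500):
        print(generation, spreads[generation])
```

Because every generation learns from a small sample of the one before, estimation errors compound rather than cancel. In this toy setup the fitted spread drifts toward zero, so later models describe an ever narrower slice of the original variety.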

This would be a terrible outcome, and it would be the direct result of training AIs on AI-generated data. Scientists say one solution could be marking AI-generated content so that it can be told apart from human-made data.
To ensure AI quality, companies need to protect and share real human data.

The concern is that such tools might eventually be left with little but synthetic data in their datasets to learn from, which is what leads to so-called “model collapse”.
Curated AI training data should be protected from this bias to ensure the quality of the output.