Urgent Call for Cross-Disciplinary Research on Social and Cultural Threats of AI Systems

AI commentators and news outlets are forecasting the end of the generative AI frenzy, with talk of an impending catastrophic “model collapse.” But how plausible are these predictions? And what is model collapse, anyway?

“Model collapse,” a theory first discussed in 2023 that has recently gained traction, is the idea that as AI-generated data proliferates on the internet, future AI systems trained on it will become progressively less capable.

Machine learning is the building block of modern AI systems. Programmers create the underlying mathematical structure, but the system’s actual “intelligence” comes from training it to recognize patterns in data, and not just any data.
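That split between structure and training can be made concrete with a minimal sketch, with all names and values invented for illustration: the programmer fixes the model’s mathematical form (here, a straight line), and everything the “trained” model knows (the line’s parameters) comes from the data.

```python
# Minimal, hypothetical sketch of the structure-vs-training split:
# the programmer chooses the model's form (y = w*x + b); the values
# of w and b are learned entirely from data.
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.5, size=100)  # a pattern hidden in noisy data

w, b = np.polyfit(x, y, deg=1)  # "training": recover the pattern from the data
print(f"learned w={w:.2f}, b={b:.2f}")  # close to the true values 3.0 and 1.0
```

Feed the same structure worse data, and it learns a worse pattern; the structure itself contributes no knowledge.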

Today’s generative AI systems depend on vast amounts of high-quality data. Prominent tech firms such as OpenAI, Google, Meta, and Nvidia continuously scrape the internet for it, gathering terabytes of content to feed their models. But since the arrival of capable, widely accessible generative AI systems in 2022, people have been uploading and sharing more and more content that is partly or wholly AI-generated.

In 2023, researchers began to wonder whether they could rely solely on AI-generated data for training, rather than data supplied by humans. There are strong incentives to make this work: AI-generated content is not only widely available online but also far cheaper to obtain than human data, and collecting it at scale does not raise the same ethical and legal questions. However, researchers found that without high-quality human data, AI systems trained on AI-made output get progressively worse as each model learns from the one before it.
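The mechanism can be illustrated with a deliberately simple toy simulation, a sketch rather than any lab’s actual experiment: each “generation” is a model that fits a Gaussian to samples produced by the previous generation’s model. Sampling error compounds across generations, the fitted spread drifts toward zero, and the rare “tail” cases of the original human data are forgotten first.

```python
# Toy illustration of recursive ("regurgitive") training: each generation
# fits a Gaussian to samples drawn from the previous generation's fitted
# model. With no fresh human data, sampling error compounds and the
# learned distribution collapses toward a single point.
import numpy as np

rng = np.random.default_rng(42)
SAMPLE_SIZE = 20  # kept small on purpose so the effect shows quickly

# Generation 0 trains on "human" data with genuine variety.
data = rng.normal(loc=0.0, scale=1.0, size=SAMPLE_SIZE)

for generation in range(1, 201):
    mu, sigma = data.mean(), data.std()             # "train" on the current data
    data = rng.normal(mu, sigma, size=SAMPLE_SIZE)  # next model sees only model output
    if generation % 25 == 0:
        print(f"generation {generation:3d}: fitted std = {sigma:.4f}")
```

The printed standard deviation shrinks over the generations: each model faithfully imitates its predecessor’s samples, yet the variety of the original data is steadily lost.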

It is something like a digital version of inbreeding. This “regurgitive training” appears to make model behavior less varied and lower in quality. Here “quality” roughly means a combination of being helpful, harmless, and honest, while “diversity” refers to the variety of responses and to whose social and cultural perspectives are represented in AI outputs. In short, by using AI systems so heavily, we could be polluting the very data source they need to be useful in the first place.
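How might that loss of diversity be quantified? One simple proxy, sketched here with invented example responses rather than real model outputs, is the Shannon entropy of the response distribution: the more a model repeats itself, the lower the entropy.

```python
# Hedged sketch of one diversity proxy: Shannon entropy over responses.
# The generation-by-generation outputs below are invented for illustration.
from collections import Counter
import math

def response_entropy(responses: list[str]) -> float:
    """Shannon entropy (in bits) of the empirical response distribution."""
    counts = Counter(responses)
    total = len(responses)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical outputs: the later "generation" repeats itself far more.
gen_1 = ["a sonnet", "a haiku", "free verse", "a limerick", "a ballad"]
gen_5 = ["a haiku", "a haiku", "free verse", "a haiku", "a haiku"]

print(f"generation 1 entropy: {response_entropy(gen_1):.2f} bits")  # ~2.32
print(f"generation 5 entropy: {response_entropy(gen_5):.2f} bits")  # ~0.72
```

Falling entropy across generations would signal exactly the narrowing of responses, and of represented perspectives, that the collapse theory predicts.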
