Why DeepSeek’s new AI model thinks it’s ChatGPT

December 29, 2024


Earlier this week, DeepSeek, a well-funded Chinese AI lab, released an “open” AI model that beats many rivals on popular benchmarks. The model, DeepSeek V3, is large but efficient, handling text-based tasks like coding and writing essays with ease.

It also seems to think it’s ChatGPT.

Posts on X, along with TechCrunch’s own tests, show that DeepSeek V3 identifies itself as ChatGPT, OpenAI’s AI-powered chatbot platform. Asked to elaborate, DeepSeek V3 insists that it’s a version of OpenAI’s GPT-4 model released in 2023.

This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while claiming to be DeepSeekV3 only 3 times.

Gives you a rough idea of some of their training data distribution. https://t.co/Zk1KUppBQM pic.twitter.com/ptIByn0lcv

— Lucas Beyer (bl16) (@giffmana) December 27, 2024

The delusions run deep. If you ask DeepSeek V3 a question about DeepSeek’s API, it’ll give you instructions on how to use OpenAI’s API. DeepSeek V3 even tells some of the same jokes as GPT-4, down to the punchlines.

So what’s going on?

Models like ChatGPT and DeepSeek V3 are statistical systems. Trained on billions of examples, they learn patterns in those examples to make predictions, like how “to whom” in an email typically precedes “it may concern.”
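To make the idea of learning patterns from examples concrete, here is a minimal sketch of a word-count predictor. It is a toy, not how ChatGPT or DeepSeek V3 actually work: the three-sentence corpus and the function names `train_counts` and `predict_next` are made up for illustration, and real models use neural networks trained on vastly more data.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration (real models see billions of examples).
corpus = [
    "to whom it may concern",
    "to whom it may concern",
    "to whom should i send this",
]

def train_counts(sentences):
    """Count which word follows each two-word context in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - 2):
            context = (words[i], words[i + 1])
            counts[context][words[i + 2]] += 1
    return counts

def predict_next(counts, first, second):
    """Return the most frequent word seen after the context, or None if unseen."""
    followers = counts.get((first, second))
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train_counts(corpus)
print(predict_next(model, "to", "whom"))  # prints "it"
```

The predictor has no notion of what is true. It simply repeats whatever most often followed a phrase in the text it was given.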

DeepSeek hasn’t revealed much about the source of DeepSeek V3’s training data. But there’s no shortage of public datasets containing text generated by GPT-4 via ChatGPT. If DeepSeek V3 was trained on these, the model might have memorized some of GPT-4’s outputs and is now regurgitating them verbatim.

“Obviously, the model is seeing raw responses from ChatGPT at some point, but it’s not clear where that is,” Mike Cook, a research fellow at King’s College London specializing in AI, told TechCrunch. “It could be ‘accidental’ … but unfortunately, we have seen instances of people directly training their models on the outputs of other models to try and piggyback off their knowledge.”

Cook noted that the practice of training models on outputs from rival AI systems can be “very bad” for model quality, because it can lead to hallucinations and misleading answers like the above. “Like taking a photocopy of a photocopy, we lose more and more information and connection to reality,” Cook said.

It might also be against those systems’ terms of service.

OpenAI’s terms prohibit users of its products, including ChatGPT customers, from using outputs to develop models that compete with OpenAI’s own.

OpenAI and DeepSeek didn’t immediately respond to requests for comment. However, OpenAI CEO Sam Altman posted what appeared to be a dig at DeepSeek and other competitors on X Friday.

“It is (relatively) easy to copy something that you know works,” Altman wrote. “It is extremely hard to do something new, risky, and difficult when you don’t know if it will work.”

Granted, DeepSeek V3 is far from the first model to misidentify itself. Google’s Gemini and others sometimes claim to be competing models. For example, prompted in Mandarin, Gemini says that it’s Chinese company Baidu’s Wenxinyiyan chatbot.

And that’s because the web, which is where AI companies source the bulk of their training data, is becoming littered with AI slop. Content farms are using AI to create clickbait. Bots are flooding Reddit and X. By one estimate, 90% of the web could be AI-generated by 2026.

This “contamination,” if you will, has made it quite difficult to thoroughly filter AI outputs from training datasets.

It’s certainly possible that DeepSeek trained DeepSeek V3 directly on ChatGPT-generated text. Google was once accused of doing the same, after all.

Heidy Khlaaf, chief AI scientist at the nonprofit AI Now Institute, said the cost savings from “distilling” an existing model’s knowledge can be attractive to developers, regardless of the risks.

“Even with internet data now brimming with AI outputs, other models that would accidentally train on ChatGPT or GPT-4 outputs would not necessarily demonstrate outputs reminiscent of OpenAI customized messages,” Khlaaf said. “If it is the case that DeepSeek carried out distillation partially using OpenAI models, it would not be surprising.”

More likely, however, is that a lot of ChatGPT/GPT-4 data made its way into the DeepSeek V3 training set. That means the model can’t be trusted to self-identify, for one. But what’s more concerning is the possibility that DeepSeek V3, by uncritically absorbing and iterating on GPT-4’s outputs, could exacerbate some of the model’s biases and flaws.

