Excellence and Expertise

Farmer School ISA professor weighs in on DeepSeek AI

New artificial intelligence model could have big implications for what we know about AI and how to create and train one

Arthur Carvalho lecturing in class

While it may seem like announcements about artificial intelligence models and what they’re capable of happen all the time, the recent disclosure of the DeepSeek AI model hit the cyberworld as a bit of a bombshell.

Farmer School of Business information systems and analytics professor Arthur Carvalho said there are several interesting things about the DeepSeek model:

  • The claimed cost to train the model, a little less than $6 million, is considerably less money than U.S. companies are spending to train their own AI models.
  • DeepSeek is open source, yet it scores about as well on benchmarks as better-known and more expensive proprietary models.
  • It’s a “thinking” AI that thoroughly explains its thought process to the user, while most thinking AIs are less transparent about their process.

In general terms, creating and training an AI involves defining a problem, gathering relevant data, preparing that data to be used, selecting an appropriate algorithm, training the model with the data, evaluating its performance, and then deploying the AI for practical application.
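For readers who want to see those steps in miniature, here is a rough sketch in Python. It is a toy, not how DeepSeek or any large language model is built: the task (flagging a message as "long" from its word count), the synthetic data, and every function name are assumptions made up for illustration.

```python
import random

# Step 1: define the problem (toy task, assumed for illustration):
# predict label 1 if a message is "long", 0 otherwise, from its word count.

def gather_data(n=200, seed=0):
    """Step 2: gather data. Here it is synthetic (word_count, label) pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        words = rng.uniform(0, 100)
        rows.append((words, 1 if words > 50 else 0))
    return rows

def prepare(rows, seed=0):
    """Step 3: prepare the data by shuffling and splitting 80/20."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]

def accuracy(threshold, rows):
    """Share of rows where 'words > threshold' matches the label."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

def train(train_rows):
    """Steps 4 and 5: the 'algorithm' is a single cutoff; training picks
    the cutoff that is most accurate on the training rows."""
    candidates = sorted(x for x, _ in train_rows)
    return max(candidates, key=lambda t: accuracy(t, train_rows))

def deploy(threshold):
    """Step 7: wrap the trained cutoff in a function others can call."""
    return lambda words: int(words > threshold)

rows = gather_data()
train_rows, test_rows = prepare(rows)
threshold = train(train_rows)
test_accuracy = accuracy(threshold, test_rows)  # Step 6: evaluate on unseen rows
predict = deploy(threshold)

print(f"learned cutoff: {threshold:.1f} words")
print(f"accuracy on held-out data: {test_accuracy:.2f}")
```

Every stage here takes a fraction of a second. At the scale of a model like DeepSeek, each one involves large teams, enormous datasets, and rented computing time, which is where the costs come from.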

All that costs money, Carvalho said.

“There are estimates about ChatGPT that put this number at well over $100 million and there are discussions that for the next ChatGPT version, that number could very well, if we continue as it is, hit $1 billion,” Carvalho said.

To demonstrate the “thinking” aspect of DeepSeek, Carvalho asked it to “create a poem about love, but it must have elements of sadness, as well as death and resilience.”

The resulting poem was “not bad,” Carvalho said. But what makes DeepSeek unique is that it also gave him several sentences of explanation about what it considered when creating the poem.

“‘How do we represent love? Maybe through imaginary roses or stars.’ It keeps going. ‘The poem should have a balance between the melancholy of death and the strength of resilience. These are the kind of words it should perhaps use or avoid.’ It's absolutely amazing. It's absolutely outstanding, because that's the first model out there that gives you details of the thought process. It's almost like there's a human here struggling to come up with an answer to my query,” he said.

How long did DeepSeek think about that request? Nine seconds.

“In my personal experimentation, the model is good. It's on par with some of the top ones, maybe slightly worse. Some people could say slightly better -- but it was cheap to train,” Carvalho said.

DeepSeek is also an open-source model, reportedly created by graduate students and young professionals in China. “I can take this model, download it to my computer and run it locally if I have a powerful enough computer,” Carvalho said. “But they also have what's called a distilled version of the model, which are smaller versions that are less powerful. So, I can have less-powerful versions of this model for free, completely running on my laptop right now.”

The Monday after the announcement saw a considerable stock price dip among the companies known as “The Magnificent Seven” -- Apple, Meta, Nvidia, Alphabet, Amazon, Tesla, and Microsoft -- all of which have direct investments in AI and in AI-related hardware. “You can see the risk of all these companies investing a lot of money in these big, closed models, and then there's something much smaller coming that's cheaper and almost equally good,” Carvalho said.

Carvalho said we can expect a lot of scrutiny about that less-than-$6 million cost. “Be careful about looking at these numbers, because we don't know how true it is. Number one, it could be someone trying to promote a model and saying, ‘Hey, it's incredibly cheap. We can do it,’ that will attract a lot of investors and eyes. I'm not saying that this is what's happening here, but you must always be suspicious,” he said. “Number two, the number does not really tell the whole story. The story so far is about renting computers for a low cost to train the model, but how about the price of people, all the people doing all the data work, the data scientists, the price of acquiring data?”

“But again, all things being equal, this number is amazing, truly amazing. If this number is true, $5.6 million in terms of computational power, that's way cheap, and that's why everybody is looking at this company very carefully now, thinking ‘Maybe there’s something really, really cool here,’” he said.
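To put the quoted figures side by side, the short calculation below compares them. It is only a back-of-the-envelope sketch: the dollar amounts are the estimates cited in this article, the variable names are made up for illustration, and, as Carvalho cautions, the DeepSeek figure covers computing time only.

```python
# Figures as quoted in the article, in millions of U.S. dollars.
deepseek_compute = 5.6      # claimed DeepSeek training compute cost
chatgpt_estimate = 100.0    # "well over $100 million" (a lower bound)
next_version_guess = 1000.0 # "could very well ... hit $1 billion"

# How many times larger each estimate is than the DeepSeek claim.
ratio_current = chatgpt_estimate / deepseek_compute
ratio_next = next_version_guess / deepseek_compute

print(f"ChatGPT estimate vs. DeepSeek claim: about {ratio_current:.0f}x")
print(f"Projected next version vs. DeepSeek claim: about {ratio_next:.0f}x")
```

Because the $100 million figure is a floor and the DeepSeek number leaves out salaries and data acquisition, the ratios are rough indicators rather than a like-for-like comparison.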

Arthur Carvalho is the Dinesh & Ila Paliwal Innovation Chair & Associate Professor & FSB Faculty Fellow of Information Systems & Analytics at the Farmer School of Business at Miami University.