Didn’t it turn out that they used 10,000 Nvidia cards with the 100-series chips, and that the “low-level success” and “low cost” story is a lie?
Not exactly sure what “dominating” a market means, but the title makes a good point: innovation requires much more cooperation than competition. And the ‘AI race’ between nations is an antiquated framing pushed by the media.
I hate to disagree, but IIRC DeepSeek is not an open-source model but open-weight?
It’s tricky. There is code involved, and the code is open source. There is a neural net involved, and it is released as open weights. The part that is not available is the “input” that went into the training. This seems to be a common way in which models are released as both “open source” and “open weights”, but you wouldn’t necessarily be able to replicate the outcome with $5M or whatever it takes to train the foundation model, since you’d have to guess about what they used as their input training corpus.
I view it as the source code of the model is the training data. The code supplied is a bespoke compiler for it, which emits a binary blob (the weights). A compiler is written in code too, just like any other program. So what they released is the equivalent of the compiler’s source code, and the binary blob that it output when fed the training data (source code) which they did NOT release.
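To make that analogy concrete, here’s a toy sketch (all names hypothetical, nothing to do with DeepSeek’s actual code): the released training code is the “compiler”, but without the “source” (the training corpus) you can’t reproduce the “binary” (the weights).

```python
# Toy illustration of the analogy (hypothetical names, not anyone's real API).
def train(training_data: list[str]) -> dict:
    """The released training code: a 'compiler' from data to weights."""
    weights = {}
    for doc in training_data:
        for word in doc.split():
            # Stand-in for real training: just count word occurrences.
            weights[word] = weights.get(word, 0) + 1
    return weights

# The 'binary blob' they released is equivalent to the output of:
#   released_weights = train(secret_corpus)
# but secret_corpus itself was never published, so you can't rerun the build.
released_weights = train(["the cat sat", "the dog ran"])  # demo corpus only
```

Under this framing, shipping the weights plus the trainer without the corpus is like shipping a compiler and a prebuilt `.exe` without the program’s source tree.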
Wall Street’s panic over DeepSeek is peak clown logic—like watching a room full of goldfish debate quantum physics. Closed ecosystems crumble because they’re built on the delusion that scarcity breeds value, while open source turns scarcity into oxygen. Every dollar spent hoarding GPUs for proprietary models is a dollar wasted on reinventing wheels that the community already gave away for free.
The Docker parallel is obvious to anyone who remembers when virtualization stopped being a luxury and became a utility. DeepSeek didn’t “disrupt” anything—it just reminded us that innovation isn’t about who owns the biggest sandbox, but who lets kids build castles without charging admission.
Governments and corporations keep playing chess with AI like it’s a Cold War relic, but the board’s already on fire. Open source isn’t a strategy—it’s gravity. You don’t negotiate with gravity. You adapt or splat.
Cheap reasoning models won’t kill demand for compute. They’ll turn AI into plumbing. And when’s the last time you heard someone argue over who owns the best pipe?
Governments and corporations still use the same playbooks because they’re still oversaturated with Boomers who haven’t learned a lick since 1987.
DeepSeek shook the AI world because it’s cheaper, not because it’s open source.
And it’s not really open source either. Sure, the weights are open, but the training materials aren’t. Good luck looking at the weights and figuring things out.
I think it’s both. OpenAI was valued at a certain point because of a perceived moat of training costs. The cheapness killed the myth, but open sourcing it was the coup de grace as they couldn’t use the courts to put the genie back into the bottle.
True, but they also released a paper that detailed their training methods. Is the paper sufficiently detailed such that others could reproduce those methods? Beats me.
Pretty sure Valve has already realized the correct way to be a tech monopoly is to provide a good user experience.
Idk, I kind of disagree with some of their updates, at least in the UI department.
They treat customers well, though.
Yeah. Steam and I are getting older. Would be nice to adjust simple things like text size in the tool.
Also that ‘Live’ shit bothers me. Live means live. Not ‘was recorded live, and now presented perpetually as LIVE’
Time to dust off this old chestnut
I remember this being some sort of Apple meme at some point. Hence the gum drop iMac.
Personally, I think Microsoft open sourcing .NET was the first clue open source won.
Deepseek is not open source.
The model weights and research paper are, which is the accepted terminology nowadays.
It would be nice to have the training corpus and RLHF too.
the accepted terminology
No, it isn’t. The OSI specifically requires that the training data be available, or at the very least that the source and fee for the data be given, so that a user could get the same copy themselves. Because that’s the purpose of something being “open source”. Open source doesn’t just mean free to download and use.
https://opensource.org/ai/open-source-ai-definition
Data Information: Sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system. Data Information shall be made available under OSI-approved terms.
In particular, this must include: (1) the complete description of all data used for training, including (if used) of unshareable data, disclosing the provenance of the data, its scope and characteristics, how the data was obtained and selected, the labeling procedures, and data processing and filtering methodologies; (2) a listing of all publicly available training data and where to obtain it; and (3) a listing of all training data obtainable from third parties and where to obtain it, including for fee.
As per their paper, DeepSeek R1 required a very specific training data set, because when they tried the same technique with less curated data, they got “R-Zero”, which basically ran fast and spat out a gibberish salad of English, Chinese and Python.
People are calling DeepSeek open source purely because they called themselves open source, but they seem to just be another free-to-download, black-box model. The best comparison is to Meta’s Llama, which weirdly nobody has decided is going to up-end the tech industry.
In reality, “open source” is terrible terminology here, a very loose fit, when what it should basically mean is that anyone could recreate or modify the model because they have the exact ‘recipe’.
The training corpus of these large models seems to be “the internet, YOLO”. Apparently it’s fine for them to download every book and paper under the sun, but if a normal person does it, they’re a criminal.
Believe it or not:
I wouldn’t call it the accepted terminology at all. Just because some rich assholes try to will it into existence doesn’t mean we have to accept it.
The model weights and research paper are
I think you’re conflating “open source” with “free”
What does it even mean for a research paper to be open source? That they release a docx instead of a pdf, so people can modify the formatting? Lol
The model weights were released for free, but you don’t have access to their source, so you can’t recreate them yourself. Like Microsoft Paint isn’t open source just because they release the machine instructions for free. Model weights are the AI equivalent of an exe file. To extend that analogy, quants, LORAs, etc are like community-made mods.
To be open source, they would have to release the training data and the code used to train it. They won’t do that because they don’t want competition. They just want to do the facebook llama thing, where they hope someone uses it to build the next big thing, so that facebook can copy them and destroy them with a much better model that they didn’t release, force them to sell, or kill them with the license.
the accepted terminology nowadays
Let’s just redefine existing concepts to mean things that are more palatable to corporate control why don’t we?
If you don’t have the ability to build it yourself, it’s not open source. Deepseek is “freeware” at best. And that’s to say nothing of what the data is, where it comes from, and the legal ramifications of using it.
A lot of other AI models can say the same, though. Facebook’s is. Xitter’s is. Still wouldn’t trust those at all, or any other model that publishes no reproducible code.
Llama has several restrictions making it quite a bit less open than Grok or DeepSeek.
The term open source is not free to redefine, nor has it been redefined.
But then, people would realize that you got copyrighted material and stuff from pirating websites…
They are trying to make it accepted, but it’s still contested. Unless the training data is provided, it’s not really open.
Well, if they really are, and the methodology can be replicated, we are surely about to see a crazy number of DeepSeek competitors, because imagine how many US companies in the AI and finance sectors are out there in possession of an even larger number of chips than the Chinese claimed to have trained their model on.
Although the question arises: if the methodology is so novel, why would these folks make it open source? Why would they share the results of years of their work with the public, losing their edge over the competition? I don’t understand.
Can somebody who actually knows how to read a machine learning codebase tell us something about DeepSeek after reading their code?
Hugging Face already reproduced DeepSeek R1 (called Open R1) and open sourced the entire pipeline.
Did they? According to their repo it’s still a WIP: https://github.com/huggingface/open-r1
Apparently DeepSeek is lying: they were collecting thousands of Nvidia chips against the US embargo, and it’s not about the algorithm. The model’s good results come just from sheer chip volume and energy used. That’s the story I’ve heard, and honestly it sounds legit.
Not sure if this question has been answered, though: if it’s open sourced, can’t we see what algorithms they used to train it? If we could, then we would know the answer. I assume we can’t, but if we can’t, then what’s so cool about it being open source, on the other hand? What parts of the code are valuable there besides the algorithms?
The open paper they published details the algorithms and techniques used to train it, and it’s been replicated by researchers already.
So are these techniques so novel and such a breakthrough? Will we now have a burst of DeepSeek-like models everywhere? Because that’s what absolutely should happen if the whole story is true. I would assume there are dozens or even hundreds of companies in the USA, especially in the finance sector and in AI research, that are in possession of a similar, and surely larger, number of chips than the Chinese folks claimed to have trained their model on.
So are these techniques so novel and such a breakthrough?
The general concept, no. (it’s reinforcement learning, something that’s existed for ages)
The actual implementation, yes. (Training a model to think inside a separate XML section, and reinforcing with the highest-quality results from previous iterations, using reinforcement learning that naturally pushes responses toward the highest-rewarded outputs.) Most other companies just didn’t assume this would work as well as throwing more data at the problem.
This is actually how people believe some of OpenAI’s newest models were developed, but the difference is that OpenAI was under the impression that more data would be necessary for the improvements, and thus had to continue training the entire model with additional new information, and they also assumed that directly training in thinking times was the best route, instead of doing so via reinforcement learning. DeepSeek decided to simply scrap that part altogether and go solely for reinforcement learning.
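The core loop described above can be sketched in a few lines. This is a heavily simplified illustration of the idea (the reward values, exact-match answer check, and function names are my own assumptions, not DeepSeek’s actual reward model): score each sampled completion for having a `<think>` section and a correct answer, then compute group-relative advantages so the best completions get reinforced.

```python
import re

# Hedged sketch of an R1-style reward, NOT DeepSeek's real code:
# a format bonus for <think>...</think> tags plus a correctness bonus.
def reward(completion: str, expected: str) -> float:
    """Score one sampled completion."""
    r = 0.0
    if re.search(r"<think>.*?</think>", completion, re.DOTALL):
        r += 0.5  # reward the thinking format itself
    answer = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL).strip()
    if answer == expected:
        r += 1.0  # reward the right final answer
    return r

def group_advantages(rewards: list[float]) -> list[float]:
    """Group-relative advantage: reward minus the group's mean reward,
    so the best completions in each sampled group get pushed up."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

# Sample a group of completions for the same prompt and score them;
# a policy-gradient step would then upweight positive-advantage samples.
group = [
    "<think>2+2 is 4</think>4",  # good format, right answer
    "4",                         # right answer, no thinking tags
    "<think>hmm</think>5",       # good format, wrong answer
]
rewards = [reward(c, "4") for c in group]
advs = group_advantages(rewards)
```

The point of the group-relative trick is that no learned reward model is needed: completions only compete against their own group’s average, which is what “naturally pushes responses to the highest-rewarded outputs” means in practice.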
Will we now have a burst of deepseek like models everywhere?
Probably, yes. Companies and researchers are already beginning to use this same methodology. Here’s a writeup about S1, a model that performs up to 27% better than OpenAI’s best model. S1 used supervised fine-tuning and did something so basic that people hadn’t previously thought to try it: just making the model think longer by modifying the terminating XML tags.
This was released days after R1, based on R1’s initial premise, and creates better quality responses. Oh, and of course, it cost $6 to train.
So yes, I think it’s highly probable that we see a burst of new models, or at least improvements to existing ones. (Nobody has a very good reason to make a whole new model of a different name/type when they can simply improve the one they’re already using and have implemented)
Note that s1 is transparently a distilled model instead of a model trained from scratch, meaning it inherits knowledge from an existing model (Gemini 2.0 in this case) and doesn’t need to retrain its knowledge nearly as much as training a model from scratch. It’s still important, but the training resources aren’t really directly comparable.
True, but I’m of the belief that we’ll probably see a continuation of the existing trend of building and improving upon existing models, rather than always starting entirely from scratch. For instance, nearly any newly released model will talk about the performance of its Llama variant, because combining it with the existing quality of Llama just produces better results.
I think we’ll see a similar trend now, just with R1 variants instead of Llama variants being the primary new type used. It’s just fundamentally inefficient to start over from scratch every time, so it makes sense that newer iterations would be built directly on previous ones.
There’s so much misinfo spreading about this, and while I don’t blame you for buying it, I do blame you for spreading it. “It sounds legit” is not how you should decide to trust what you read. Many people think the earth is flat because the conspiracy theories sound legit to them.
DeepSeek probably did lie about a lot of things, but their results are not disputed. R1 is competitive with leading models, it’s smaller, and it’s cheaper. The good results are definitely not from “sheer chip volume and energy used”, and American AI companies could have saved a lot of money if they had used those same techniques.
Sauce?
It’s open sauce.
internet
Elaborate? Link? Please tell me this is not just an “allegedly”.
extra time which Im not sure I want to spend
It’s your burden of proof, bud.
https://www.youtube.com/watch?v=RSr_vwZGF2k This is what I watched; I base my opinion on it. I’m not saying this is true. It just sounded legit enough, and I didn’t have time to research more. I will gladly follow some links that lead me to content that destroys this guy’s arguments.
My god, the preamble for that thing is so dang long. It’s 13:30, with some AI sponsorship the comments mention (which I may have accidentally skipped over), and only 10:27–11:37 deals with what you’re talking about. The video makes a good point that they have existing operating infrastructure. However, for the stockpiling accusation, the statements it cites are from the CEO of big competitor “Chips AI”, who cites nothing except “only costing $6 million is impossible, therefore it actually cost more and they must have cheated! I think they have 50,000 illegally imported Nvidia GPUs!”, which just sounds like the behavior of a cult ringleader trying to maintain power to me. The other source it cites for this claim is Elon Musk, whose reasoning was “Obviously”.
This is after all, a court of law.
I just think that no matter whether DeepSeek smuggled or not, an investigation into whether or not they smuggled is of course going to be launched. I do want more transparency regarding where the Singapore billing goes, but that alone is too shaky for conclusions.
No one here is going to be involved with any of it.
It’s time for you to do some serious self-reflection about the inherent biases you believe about ~~Asians~~ Chinese people.
WTF dude. You mentioned Asia. I love Asians. Asia is vast. There are many countries, not just China, bro. I think you need to do these reflections. I’m talking about the very specific case of Chinese DeepSeek devs potentially lying about the chips. The assumptions and generalizations you are thinking of are crazy.
And how do your feelings stand up to the fact that independent researchers find the paper to be reproducible?
Well, maybe. Apparently some folks are already doing that, but it’s not done yet. Let’s wait for the results. If everything is legit, we should have not one but plenty of similar and better models in the near future. If the Chinese did this with 100 chips, imagine what can be done with the 100,000 chips that Nvidia can sell to a US company.
“China bad”
*sounds legit
“Sounds legit” is what one hears about the FUD spread by anglophone media every time the US oligarchy is caught with their pants down.
Snowden: “US is illegally spying on everyone”
Media: Snowden is Russia spy
*Sounds legit
France: US should not unilaterally invade a country
Media: Iraq is full of WMDs
*Sounds legit
DeepSeek: Guys, distillation and mixture of experts are a way to save money and energy; here’s a paper on how to do the same.
Media: China bad, deepseek must be cheating
*Sounds legit
This is your brain on Chinese/Russian propaganda.
Can you point out any factual inaccuracies or is it just that your wittew fee-fees got hurt?
Ah, cool, a new account to block.
I don’t like this. Everything you’re saying is true, but this argument isn’t persuasive, it’s dehumanizing. Making people feel bad for disagreeing doesn’t convince them to stop disagreeing.
A more enlightened perspective might be “this might be true or it might not be, so I’m keeping an open mind and waiting for more evidence to arrive in the future.”
Not the original commenter, but what they’re saying holds true. The issue of “sounds legit” is the main driving force of misinformation right now.
The only way to combat it is to truly gain the knowledge yourself. Accepting things at face value has led to massive disagreements on objective information and allowed anti-science mindsets to flourish.
Podcasts are the medium I give the most blame to. Just because someone has a camera and a microphone, viewers believe them to be an authority on a subject, and pairing this with the “sounds legit” mindset has set back critical thinking skills for an entire population.
More people need to read Jurassic Park.
It’s just my opinion, based on a few sources I saw on the web. Should I attach them as links to the comment? I guess I could, but that’s extra time which I’m not sure I want to spend. Imagine a discussion where both sides provide links and sources for everything they say. Would it be great? I guess? But at the same time it would be very difficult for both sides and time-consuming. Nobody does that on today’s internet. Nobody ever did that in casual conversations, actually, in real life or on the internet. Providing evidence is generally for court talk.
You are right. We are all on our own in the pursuit of truth. And with the rise of AI and fake reality, things are going to get crazier each year. Pair that with the fact that our brains have limited storage capacity for information and knowledge, and it doesn’t look bright for humans. I stay optimistic despite that, though.
I disagree with you, links are not that long to share. It is a bit more time consuming obviously, but everyone can choose whether to read quickly or really dive in the sources. I see a lot of people doing it today on internet. I see a lot of people doing it in casual conversation (opening a book or internet to check smthg). It’s not evidence, it’s hints to avoid launching a whole discussion that entirely lies or bullshit (or not).
Here are some links I found about smuggled chips.
- Reuters: DeepSeek said they used legally imported old and new Nvidia chips (H800s and H20s). There are suspicions and investigations about illegal smuggling of export-banned Nvidia chips, targeting DeepSeek directly. One CEO of an American AI startup said it is likely DeepSeek used smuggled chips.
- The Diplomat: exactly the same, citing Reuters directly. Adds that the H800 (now banned from export) and H20 were designed by Nvidia especially for the Chinese market. Adds that smuggling could go through Singapore, which leaped from 9% to 22% of Nvidia’s revenue in 2 years. Nvidia and Singapore representatives deny this.
- Foxbusiness : same.
So it is likely there are smuggled chips in China, if we believe this. Now, to say they have been used by DeepSeek, and even more, that they have been decisive, is still very unclear.
Damn you sound like bot you know that? No typos, perfect answer. Seriously. Can you prove you’re a human? xd
There’s actually a typo, i wrote “relies or bullshit” instead of " on bullshit"
Sounds legit
Yup. That’s the internet nowadays. Full of comments like this. Can’t do much about it.
We already have all the evidence. This isn’t some developing story, the paper is reproducible. What’s dehumanizing is assuming that Asians can’t make good software.
Snowden really proved he wasn’t a Russian spy when he… check notes… immediately fled to Russia with troves of American secrets…
Cope be strong in this one lol