My point was that a Mixture of Experts model could struggle with generalization. Although, reading more, I’m not sure whether it’s the newer R model that has the MoE element.
Yeah, even the cherry-picked examples they provide look only okay.
To be honest everything with this company feels like an ad campaign more than anything else.
So I’m still on the fence about the AI arms race in general. However, reading up on DeepSeek it feels like they built a model specifically to work well on the benchmarks.
I say this cause it’s a Mixture of Experts approach, so only parts of the model are used at any given point. The drawback is that generalization can suffer.
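For anyone wondering what "only parts of the model are used" looks like, here is a rough toy sketch of top-k expert routing in Python. Everything in it is made up for illustration (the scalar "experts", the router scores, the names `route_top_k` and `moe_forward`); it is not how any particular model implements it, just the general idea that a router picks a few experts per input and the rest sit idle.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Combine the outputs of only the selected experts; the others never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical toy experts: each is just a scalar function.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
gate_scores = [0.1, 2.0, 1.5, -1.0]  # made-up router scores for one input

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(selected, output)
```

With these made-up scores only experts 1 and 2 run; experts 0 and 3 contribute nothing for this input.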
Additionally, it isn’t a multimodal model, and the only place I’ve seen real opportunity for workflow automation is with multimodal models. I guess you could use a combination of models, but that’s definitely a step back from the grand promise of these foundation models.
Overall, I’m just not sure if this is lay people getting caught up in hype or actually a significant change in the landscape.
I like that your assumption is cheese-covered corn rather than mac and cheese.
I mean, Meta opened up Llama for free a while ago. But at the end of the day, the AI models poised to actually impact things are those integrated or integrable into workflows, and those are all still more or less locked down.
LLMs are general-purpose models trained on text. The thought is they should be able to address anything that can be represented in a textual format.
While you could focus the model by only providing specific types of text, the general notion is they should be able to handle tasks ranging across different domains/disciplines.
Then why mention his name at all? It’s just like the covid checks: Trump demands/wants the attention.
Just some quick Google searches so not sure how reputable, but didn’t feel like copying random links.
But yeah, that’s why I called them out as estimates as I suspect there is a lot of room for error in those numbers.
With these kinds of models you can’t ever stop training them, otherwise you reach a point where the data becomes dated and thus the model is dated.
Think world events. Say another bird flu became a pandemic. The model can only know about that if it’s trained on those events.
There are systems (RAG, retrieval-augmented generation) that can answer questions based on additional content, but that would only work on a subset of problems/situations.
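To make the RAG idea a bit more concrete, here is a bare-bones sketch in Python. It is only an illustration under heavy assumptions: real systems typically use embedding search rather than the crude word-overlap ranking below, the documents are placeholders I made up, and the helper names (`retrieve`, `build_prompt`) are hypothetical. The point is just that fresh content gets looked up and pasted into the prompt instead of being trained into the model.

```python
def tokenize(text):
    """Lowercase and split into a set of words, dropping basic punctuation."""
    for ch in ".,?":
        text = text.replace(ch, " ")
    return set(text.lower().split())

def retrieve(question, documents, top_n=1):
    """Rank documents by how many words they share with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_n]

def build_prompt(question, documents, top_n=1):
    """Paste the retrieved text in front of the question for the model to read."""
    context = "\n".join(retrieve(question, documents, top_n))
    return f"Context:\n{context}\n\nQuestion: {question}"

# Placeholder documents standing in for content newer than the training data.
documents = [
    "Placeholder note about a hypothetical bird flu outbreak reported after the training cutoff.",
    "Placeholder note about quarterly chip shipments.",
]

prompt = build_prompt("What is known about the bird flu outbreak?", documents)
print(prompt)
```

This also shows the limitation: it only helps when the answer is sitting in a document you can retrieve.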
I tried a reverse image search and couldn’t find a source. I’m assuming/hoping it’s out of context, but definitely a pretty crazy frame.
I had to look this one up, but I missed the “galaxy” vs. “universe” distinction. There are an estimated 3 trillion trees and 100-400 billion stars in the Milky Way galaxy, but potentially 1 septillion stars in the universe.
However, all three of these are estimates, so who actually knows.
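For a sense of scale, here is the quick arithmetic on those figures in Python. The numbers are just the rough estimates quoted above, so treat the ratios as ballpark at best.

```python
# Rough estimates quoted above; all three carry large uncertainty.
trees = 3 * 10**12                     # ~3 trillion trees
milky_way_low = 100 * 10**9            # low-end star count for the Milky Way
milky_way_high = 400 * 10**9           # high-end star count for the Milky Way
universe_stars = 10**24                # ~1 septillion stars in the universe

trees_per_star_min = trees / milky_way_high   # trees per Milky Way star, high star count
trees_per_star_max = trees / milky_way_low    # trees per Milky Way star, low star count
stars_per_tree = universe_stars / trees       # universe stars per tree

print(trees_per_star_min, trees_per_star_max, stars_per_tree)
```

So trees outnumber Milky Way stars by roughly 7.5x to 30x, while the universe has on the order of hundreds of billions of stars per tree.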
I’m actually not sure how you’d label the axis here. The info being conveyed is the relationship between two separate things.
Another bad one is reddit. The amount it pushes the mobile interface while on mobile is painful, but switch to desktop mode and it all goes away.
I don’t go on it a lot anymore, but when I do it’s typically from my phone, and it’s really gotten worse.
Wasn’t claiming it was a good or bad call, just that the Supreme Court is about legality, and there is a history of the US banning software and a global history of banning this specific app.
I think it’s more that there is a precedent of the government blocking software they think is a threat - https://cmmcinfo.org/list-of-hardware-software-and-services-banned-by-us-government/
We give the government a lot of leeway when it comes to “national security”.
Also, we’re not the first country to ban it - https://www.pbs.org/newshour/world/heres-what-happened-when-india-banned-tiktok
The power bill side is also not clear-cut. The longer processing time for slower chips sometimes ends up resulting in higher costs. It’s surprisingly not as simple as “a lower-wattage chip is cheaper to operate.”
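A quick back-of-the-envelope in Python shows why: the bill depends on energy (power times time), not wattage alone. The wattages, runtimes, and electricity price here are numbers I made up purely for illustration, not measurements of any real chip.

```python
def energy_cost(power_watts, hours, price_per_kwh):
    """Cost of one job: kW drawn * hours run * price per kWh."""
    return power_watts / 1000 * hours * price_per_kwh

PRICE_PER_KWH = 0.15  # hypothetical electricity price

# Hypothetical fast chip: draws more power but finishes the job in 1 hour.
fast_chip_cost = energy_cost(power_watts=300, hours=1.0, price_per_kwh=PRICE_PER_KWH)

# Hypothetical slow chip: half the wattage but needs 3 hours for the same job.
slow_chip_cost = energy_cost(power_watts=150, hours=3.0, price_per_kwh=PRICE_PER_KWH)

print(f"fast: {fast_chip_cost:.4f}  slow: {slow_chip_cost:.4f}")
```

In this made-up case the lower-wattage chip uses 0.45 kWh versus 0.30 kWh for the faster one, so it costs more per job. Flip the runtime ratio and the conclusion flips too, which is the whole point.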
Yeah, it’s not particularly interesting or impressive. We’ve been able to do this for years, just in the news cause it plays into a lot of people’s fears about AI.
Just in case you aren’t aware, it’s satire. IsGlitch isn’t real news.
It’s not about doing the things; it’s about the optics of saying you’ll do them. The wall isn’t done and never will be. Even if it were, it would never deliver what was promised.
It’s all about optics, and it’s why Trump sabotaged the immigration reform bill. It would show Democrats are capable of actually executing/delivering in a way he can’t.
The UK went through industrialization leading up to its empire, and the US was the industrial power during its ascent. Same thing with Japan before WWII.
Many imperialistic powers seem to go through big industrial growth before expansion.