Holotypic Occlupanid Research Group - A Database of Bread Clip Research

ArchRecord@lemm.ee · 13 minutes ago

It actually does exist, at least on Mastodon, but is still very janky (e.g. old posts aren’t moved over due to “technical limitations”)

It automatically makes people unfollow your old account and re-follow your new account, then makes your old instance’s link redirect to your new instance’s one.

ArchRecord@lemm.ee · 11 hours ago

I don’t personally think it’s because of that. Sure, federation as a concept outside of email has a bit of a messaging problem for explaining it to newbies, but… everyone uses email, and knows how that works. This is identical, just with it being posts instead of emails. Users aren’t averse to federation, in concept or practice.

Bluesky was directly created as a very close clone of Twitter’s UI, co-governed and subsequently pushed by the founder of Twitter himself, who will obviously have more reach than randoms promoting something like Mastodon, and, in my opinion, kind of just had better branding.

“Bluesky” feels like a breath of fresh air, while “Mastodon” just sounds like… well, a mastodon, whatever that makes the average person think of at first.

So when you compare Bluesky, with a familiar UI, nice name, and consistent branding, not to mention algorithms, which Mastodon lacks, all funded by large sums of money, to Mastodon, with unfamiliar branding, minimal funding, and substantially less reach from promoters, which one will win out, regardless of the technology involved?

ArchRecord@lemm.ee · 15 hours ago

To anyone bemoaning Bluesky’s lack of federation, check out Free Our Feeds.

It’s a campaign to create a public interest foundation independent from the Bluesky team (although the Bluesky team has said they support them). The foundation would build independent infrastructure, like a secondary “relay” as an alternative to Bluesky’s that can still communicate across the same protocol (the “AT Protocol”), while also funding developer grants for further social applications built on open protocols like the AT Protocol or ActivityPub.

They have the support of an existing 501(c)(3), and their open letter has been signed by people you might find interesting, such as Jimmy Wales (founder of Wikipedia).

ArchRecord@lemm.ee · 20 hours ago

Because, on average, black people are more economically disadvantaged than white people.

Choosing to explicitly buy from black farmers will, on average, tend to support those with the least financial means out of the general population of farmers, whereas choosing to explicitly buy from white farmers will, on average, tend to support those who are already more financially advantaged.

One side is directly choosing to help those most likely to be economically disadvantaged, the other would be explicitly ignoring those with the least means in order to help those who already have the most, thus the situations are not quite comparable.

I personally would prefer an index that directly assessed farmers based on overall wealth to determine who you should buy from, but because that’s extraordinarily difficult to constantly update, maintain, verify, etc., it can just be easier to divide along racial lines since that still tends to produce a grouping that is relatively similar.

ArchRecord@lemm.ee · 2 days ago

An infamous former U.S. National Security Agency (NSA) contractor and whistleblower has unexpectedly shared his opinion on the state of the graphics card market.

Man who did big cool thing once also has opinions on unrelated thing, news at 11.

ArchRecord@lemm.ee · 2 days ago

It’s not a boycott, nor is it illegal. He’s literally just being a crybaby and believes that anybody not pandering to his business model should be forced by the courts to give him money regardless.

ArchRecord@lemm.ee · 4 days ago

What other proprietary software is necessary to use model weights?

ArchRecord@lemm.ee · 7 days ago

the company states that it may share user information to "comply with applicable law, legal process, or government requests."

Literally every company’s privacy policy here in the US basically just says that too.

Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”

Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.

Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.

They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?

In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.

This is the only thing that seems disturbing to me, compared to what we’d like to expect based on the context of what DeepSeek is. Of course, this was proven recently in practice to be terrible policy, so I assume they might shore up their defenses a bit.

All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China”

ArchRecord@lemm.ee · 7 days ago

Corporations are allowed to steal, just not from each other, that’s bad. /s

ArchRecord@lemm.ee · 8 days ago

I doubt that will be the case, and I’ll explain why.

As mentioned in this article,

SFT (supervised fine-tuning), a standard step in AI development, involves training models on curated datasets to teach step-by-step reasoning, often referred to as chain-of-thought (CoT). It is considered essential for improving reasoning capabilities. DeepSeek challenged this assumption by skipping SFT entirely, opting instead to rely on reinforcement learning (RL) to train the model. This bold move forced DeepSeek-R1 to develop independent reasoning abilities, avoiding the brittleness often introduced by prescriptive datasets.

This totally changes the way we think about AI training, which is why, while OpenAI spent $100m on training GPT-4, running an expected 500,000 GPUs, DeepSeek used about 50,000, and likely spent roughly that same 10% of the cost.
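
To make that ratio concrete, here’s a quick back-of-the-envelope sketch in Python. It only uses the rough figures quoted above, which are estimates rather than verified numbers, and it assumes training cost scales linearly with GPU count (a simplification on my part). The variable names are just illustrative.

```python
# Back-of-the-envelope check using the rough estimates quoted in this comment.
# These are unverified figures, not official numbers.
openai_gpus = 500_000              # expected GPU count cited for OpenAI
deepseek_gpus = 50_000             # approximate GPU count cited for DeepSeek
gpt4_training_cost = 100_000_000   # ~$100m cited for training GPT-4

gpu_ratio = deepseek_gpus / openai_gpus

# Assumption: cost scales linearly with the number of GPUs used.
estimated_deepseek_cost = gpt4_training_cost * gpu_ratio

print(f"GPU ratio: {gpu_ratio:.0%}")
print(f"Estimated cost at that ratio: ${estimated_deepseek_cost:,.0f}")
```

Under that linear assumption, a tenth of the GPUs lines up with roughly a tenth of the spend, which is all the comparison above is meant to convey.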

So while operation, and even training, is now cheaper, it’s also substantially less compute intensive to train models.

And not only is there less data than ever to train models on that won’t cause them to get worse by regurgitating other worse quality AI-generated content, but even if additional datasets were scrapped entirely in favor of this new RL method, there’s a point at which an LLM is simply good enough.

If you need to auto generate a corpo-speak email, you can already do that without many issues. Reformat notes or user input? Already possible. Classify tickets by type? Done. Write a silly poem? That’s been possible since pre-ChatGPT. Summarize a webpage? The newest version of ChatGPT will probably do just as well as the last at that.

At a certain point, spending millions of dollars for a 1% performance improvement doesn’t make sense when the existing model just already does what you need it to do.

I’m sure we’ll see development, but I doubt we’ll see a massive increase in training just because the cost to run and train the model has gone down.

ArchRecord@lemm.ee · 8 days ago

That set of tokens/s is the performance, or response time if you’d like to call it that. GPT-o1 tends to get anywhere from 33-60, whereas in the example I showed previously, a Raspberry Pi can do 200 on a distilled model.
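
For a rough sense of what those throughput numbers mean as response time, here’s a minimal sketch. The 500-token response length is a hypothetical value I picked for illustration, and `seconds_to_generate` is just a helper name for this example. It also assumes a steady generation rate and ignores any startup latency.

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to stream num_tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


response_tokens = 500  # hypothetical response length, chosen for illustration

# Throughput figures are the ones mentioned in this comment.
for label, rate in [("33 tokens/s", 33), ("60 tokens/s", 60), ("200 tokens/s", 200)]:
    print(f"{label}: {seconds_to_generate(response_tokens, rate):.1f} s")
```

This only compares speed. It says nothing about the quality of the output at each rate.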

Now, granted, a distilled model will produce worse performance than the full one, as seen in a benchmark comparison done by DeepSeek here (I’ve outlined the most distilled version of the newest DeepSeek model, which is likely the kind that is being run on the Raspberry Pi, albeit likely with some changes made by the author of that post, as well as OpenAI’s two most high-end models of a comparable distillation)

The gap in quality is relatively small for a model that is likely distilled far past what OpenAI’s “mini” model is. When you consider that even regular laptop/PC hardware is orders of magnitude more powerful than a Raspberry Pi, or that an external AI accelerator can be bought for as little as $60, the quality in practice could be very comparable with even slightly less distillation, especially with fine-tuning for a given use case (e.g. a local version of DeepSeek in a code development platform would be fine-tuned specifically just to produce code-related results).

If you get into the region of only cloud-hosted instances of DeepSeek that are running at-scale on GPUs like OpenAI’s models are, the performance is only 1-2 percentage points off from OpenAI’s model, at about 3-6% of the cost, which effectively means 3-6% of the total amount of GPU power being paid for compared to the amount of GPU power OpenAI is paying for.

ArchRecord@lemm.ee · 8 days ago

Here’s someone doing 200 tokens/s (for context, OpenAI doesn’t usually get above 100) on… A Raspberry Pi.

Yes, the “$75-$120 micro computer the size of a credit card” Raspberry Pi.

If all these AI models can be run directly on users’ devices, or on extremely low-end hardware, who needs large quantities of top-of-the-line GPUs?

ArchRecord@lemm.ee · 11 days ago

Plugins are also keeping me on Obsidian as opposed to using LogSeq, but I’m essentially keeping it in my back pocket as a “fire exit” in the case of Obsidian enshittifying, since of course all Obsidian notes are in markdown and cross-compatible.

ArchRecord@lemm.ee · edit-2 11 days ago

Practically every single FOSS application I use is highly useful to me, and of course, free, so I’ll just list them all here.

Immich - A full-featured replacement for Google Photos, has a sleek UI, face detection, albums, a timeline, etc.
Paperless-ngx - Document management system, saves me a ton of paper hoarding, and makes everything easily searchable with OCR.
Syncthing - Simple file synchronization between my devices, on my terms. Doesn’t share data with big tech companies about my files, and hooks up extremely fast P2P connections that beat cloud-based services by a long shot.
Metube & Seal - Simple interfaces for downloading with yt-dlp, can download from YouTube, but also many other sites. Doesn’t spam you with popup ads or junk redirects like those “youtube downloader” type sites. Seal is my favorite of the two, but is only on Android.
Image Toolbox - Insanely feature-packed app for doing practically anything you could want to an image. Converting formats, clearing EXIF data, removing backgrounds, feature-packed editing, OCR, convert to SVG, create color palettes, converting PDFs to images, decode and encode Base64 to and from images, extract frames from gifs, encrypt & decrypt files, make zip files, and a lot more. All local.
Rustdesk - No-nonsense remote desktop, tons of features, simple file transfer, cross-platform compatibility, and P2P communication without needing a third party server if you so choose.
LibreOffice - Essentially everything you’d get with Office 365 (e.g. Word, Excel, PowerPoint) but without the $150 price point. Compatible with the same file formats, and has the same functionality.
Cashew - Feature rich financial app for budgeting, tracking purchases, saving for goals, etc. Doesn’t have automatic import, but I find that manually putting every transaction in keeps me aware of my spending much better than before, so for me it’s quite worth it. Install directly from the APK, or use on web though. The version on the app stores has some features locked behind a paywall.
Linkwarden - Bookmark manager with cross-platform support, a web interface, automatic tagging, automatic archiving of any saved links in multiple formats, collaborative sharing capabilities, and more. It’s free, but you can also pay $3/mo if you want them to host it for you.

Edit: And Umbrel (on Raspberry Pi) if you want to host things more easily. Basically just a much more hands-off, user-friendly Docker for people who don’t want to tinker as much.

Edit 2: Non-FOSS, but Obsidian is the best note taking app I’ve ever used. Great selection of community-made plugins (which are FOSS) for additional functionality, and all notes are in standard cross-software-compatible Markdown. No locked-in proprietary formats.

ArchRecord@lemm.ee · 15 days ago

Proton does this.

ArchRecord@lemm.ee · 18 days ago

I hate community notes, it’s a cost free way of fact checking with no accountability.

I don’t think it’s necessarily bad, but it can be harmful if done on a platform that has a significant skew in its political leanings, because it can then lead to the assumption that posts must be true because they were “fact checked” even if the fact check was actually just one of the 9:1 ratio of users that already believes that one thing.

However, on platforms that have more general, less biased overall userbases, such as YouTube, a community notes system can be helpful, because it directly changes the platform incentives and design.

I like to come at this from the understanding that the way a platform is designed influences how it is used and perceived by users. When you add a like button but not a dislike button, you only incentivize positive fleeting interactions with posts, while relegating stronger negative opinions to the comments, for instance. (see: Twitter)

If a platform integrates community notes, that not only elevates content that had any effort at all made to fact check it (as opposed to none at all) but it also means that, to get a community note, somebody must at least attempt to verify the truth. And if someone does that, then statistically speaking, there’s at least a slightly higher likelihood that the truth is made apparent in that community note than if none existed to incentivize someone to fact check in the first place.

Again, this doesn’t work in all scenarios, nor is it always a good decision to add depending on a platform’s current design and general demographic political leanings, but I do think it can be valuable in some cases. (This also heavily depends on who is allowed access to create the community notes, of course)

ArchRecord@lemm.ee · 18 days ago

There is some logic to using crypto, but solely using it as « haha numbers go up, profit, profit! » is stupid

I heavily agree with this. I see too much blanket anti-crypto sentiment regardless of the possible use case.

When I pay for my VPN, paying in XMR means they can’t tie my real-world name and address from my card to my account. That’s objectively beneficial compared to my VPN knowing my exact name and address in conjunction with my browsing activity.

If I want to donate to a creative in a different country but they can’t use traditional banking rails that connect to my country, how else do I send them money online?

Sure, there’s a ton of issues with crypto not just in practice, but even in concept, but as you said, there is some logic to using crypto.

ArchRecord@lemm.ee · 18 days ago

This makes sense to me from a framing perspective. As an American myself, despite my best efforts, I still fall into the same trap of sort of assuming everything is much more American centric than it actually is, including other people’s opinions on American politics from outside America.

His post does come off as wildly tone deaf, but seeing how he would have perceived it, it makes a lot of sense. He endorses policy by a party that shared his values, and then gets pushback for it from people who support his values. I’d probably be as confused as him if I was in his shoes.

ArchRecord@lemm.ee · 29 days ago

Local AI Tagging (Optional)

Finally a good use of AI that doesn’t overpromise functionality it can’t actually provide. Just a system for rewriting page text as tags. This is a feature I’ll legitimately use.

ArchRecord@lemm.ee · 3 months ago

Holotypic Occlupanid Research Group - A Database of Bread Clip Research

ArchRecord@lemm.ee · 5 months ago

‘Right to Repair for Your Body’: The Rise of DIY, Pirated Medicine
