• 2 Posts
  • 114 Comments
Joined 1 year ago
Cake day: January 25th, 2024


  • ArchRecord@lemm.ee to Technology@lemmy.world: Bluesky now has 30 million users.
    11 hours ago

    I don’t personally think it’s because of that. Sure, federation as a concept outside of email has a bit of a messaging problem for explaining it to newbies, but… everyone uses email, and knows how that works. This is identical, just with it being posts instead of emails. Users aren’t averse to federation, in concept or practice.

    Bluesky was directly created as a very close clone of Twitter’s UI, co-governed and subsequently pushed by the founder of Twitter himself, who will obviously have more reach than randoms promoting something like Mastodon, and, in my opinion, kind of just had better branding.

    “Bluesky” feels like a breath of fresh air, while “Mastodon” just sounds like… well, a mastodon, whatever that makes the average person think of at first.

    So when you compare Bluesky, with a familiar UI, nice name, and consistent branding, not to mention algorithms, which Mastodon lacks, all funded by large sums of money, to Mastodon, with unfamiliar branding, minimal funding, and substantially less reach from promoters, which one will win out, regardless of the technology involved?


  • ArchRecord@lemm.ee to Technology@lemmy.world: Bluesky now has 30 million users.
    15 hours ago

    To anyone bemoaning Bluesky’s lack of federation, check out Free Our Feeds.

    It’s a campaign to create a public interest foundation that is independent from the Bluesky team (although the Bluesky team has said they support it). The foundation would build independent infrastructure, such as a secondary “relay” that serves as an alternative to Bluesky’s while still communicating over the same protocol (the “AT Protocol”). It would also fund developer grants for further social applications built on open protocols like the AT Protocol or ActivityPub.

    They have the support of an existing 501(c)(3), and their open letter has been signed by people you might find interesting, such as Jimmy Wales (founder of Wikipedia).


  • Because, on average, black people are more economically disadvantaged than white people.

    Choosing to explicitly buy from black farmers will, on average, tend to support those with the least financial means out of the general population of farmers, whereas choosing to explicitly buy from white farmers will, on average, tend to support those who are already more financially advantaged.

    One side is directly choosing to help those most likely to be economically disadvantaged; the other would be explicitly ignoring those with the least means in order to help those who already have the most. The two situations are therefore not quite comparable.

    I personally would prefer an index that directly assessed farmers based on overall wealth to determine who you should buy from, but because that’s extraordinarily difficult to constantly update, maintain, verify, etc., it can just be easier to divide along racial lines, since that still tends to produce a grouping that is relatively similar.






  • the company states that it may share user information to "comply with applicable law, legal process, or government requests."

    Literally every company’s privacy policy here in the US basically just says that too.

    Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”

    Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.

    Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.

    They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?

    In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.

    This is the only thing that seems disturbing to me, compared to what we’d like to expect based on the context of what DeepSeek is. Of course, this was proven recently in practice to be terrible policy, so I assume they might shore up their defenses a bit.

    All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China.”



  • I doubt that will be the case, and I’ll explain why.

    As mentioned in this article,

    SFT (supervised fine-tuning), a standard step in AI development, involves training models on curated datasets to teach step-by-step reasoning, often referred to as chain-of-thought (CoT). It is considered essential for improving reasoning capabilities. DeepSeek challenged this assumption by skipping SFT entirely, opting instead to rely on reinforcement learning (RL) to train the model. This bold move forced DeepSeek-R1 to develop independent reasoning abilities, avoiding the brittleness often introduced by prescriptive datasets.

    This totally changes the way we think about AI training. It’s why, while OpenAI spent $100m on training GPT-4, running an expected 500,000 GPUs, DeepSeek used about 50,000, and likely spent roughly that same 10% of the cost.
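
    As a rough sketch of the arithmetic behind that 10% figure: the numbers below are the estimates quoted in this comment, and the linear cost-per-GPU scaling (along with the `scaled_cost` helper name) is purely an assumption for illustration, not a claim about how either company's costs actually break down.

```python
# Figures as quoted in the comment above; they are estimates, not confirmed numbers.
GPT4_TRAINING_COST_USD = 100_000_000
GPT4_GPUS = 500_000
DEEPSEEK_GPUS = 50_000


def scaled_cost(reference_cost, reference_gpus, gpus):
    """Scale a reference cost by GPU count.

    Assumption: cost grows linearly with the number of GPUs, which ignores
    training duration, hardware generation, and everything else.
    """
    return reference_cost * gpus / reference_gpus


gpu_ratio = DEEPSEEK_GPUS / GPT4_GPUS
estimated_cost = scaled_cost(GPT4_TRAINING_COST_USD, GPT4_GPUS, DEEPSEEK_GPUS)

print(f"GPU ratio: {gpu_ratio:.0%}")                       # GPU ratio: 10%
print(f"Implied training cost: ${estimated_cost:,.0f}")    # Implied training cost: $10,000,000
```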

    So while operation, and even training, is now cheaper, it’s also substantially less compute intensive to train models.

    And not only is there less data than ever to train models on that won’t cause them to get worse by regurgitating other worse quality AI-generated content, but even if additional datasets were scrapped entirely in favor of this new RL method, there’s a point at which an LLM is simply good enough.

    If you need to auto generate a corpo-speak email, you can already do that without many issues. Reformat notes or user input? Already possible. Classify tickets by type? Done. Write a silly poem? That’s been possible since pre-ChatGPT. Summarize a webpage? The newest version of ChatGPT will probably do just as well as the last at that.

    At a certain point, spending millions of dollars for a 1% performance improvement doesn’t make sense when the existing model just already does what you need it to do.

    I’m sure we’ll see development, but I doubt we’ll see a massive increase in training just because the cost to run and train the model has gone down.


  • That set of tokens/s is the performance, or response time if you’d like to call it that. GPT-o1 tends to get anywhere from 33-60, whereas in the example I showed previously, a Raspberry Pi can do 200 on a distilled model.
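
    To make the tokens/s numbers concrete, here is a small sketch that turns them into a wait time. The 500-token reply length and the `seconds_to_generate` helper are hypothetical, chosen only for illustration; the rates are the ones mentioned above.

```python
def seconds_to_generate(num_tokens, tokens_per_second):
    """Time to stream a reply of num_tokens at a steady generation rate."""
    return num_tokens / tokens_per_second


REPLY_TOKENS = 500  # hypothetical reply length, not taken from any benchmark

# Rates as quoted in the comment above (tokens per second).
rates = {
    "o1 (low end)": 33,
    "o1 (high end)": 60,
    "distilled model on a Raspberry Pi": 200,
}

for name, rate in rates.items():
    print(f"{name}: {seconds_to_generate(REPLY_TOKENS, rate):.1f} s")
```

    This only compares speed; it says nothing about the quality of the output each model produces.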

    Now, granted, a distilled model will produce worse performance than the full one, as seen in a benchmark comparison done by DeepSeek here (I’ve outlined the most distilled version of the newest DeepSeek model, which is likely the kind that is being run on the Raspberry Pi, albeit likely with some changes made by the author of that post, as well as OpenAI’s two most high-end models of a comparable distillation)

    The gap in quality is relatively small for a model that is likely distilled far past what OpenAI’s “mini” model is. When you consider that even regular laptop/PC hardware is orders of magnitude more powerful than a Raspberry Pi, or that an external AI accelerator can be bought for as little as $60, the quality in practice could be very comparable with even slightly less distillation, especially with fine-tuning for a given use case (e.g. a local version of DeepSeek in a code development platform would be fine-tuned specifically to produce code-related results).

    If you get into the region of only cloud-hosted instances of DeepSeek that are running at-scale on GPUs like OpenAI’s models are, the performance is only 1-2 percentage points off from OpenAI’s model, at about 3-6% of the cost, which effectively means 3-6% of the total amount of GPU power being paid for compared to the amount of GPU power OpenAI is paying for.




  • Practically every single FOSS application I use is highly useful to me, and of course, free, so I’ll just list them all here.

    • Immich - A full-featured replacement for Google Photos, has a sleek UI, face detection, albums, a timeline, etc.
    • Paperless-ngx - Document management system, saves me a ton of paper hoarding, and makes everything easily searchable with OCR.
    • Syncthing - Simple file synchronization between my devices, on my terms. Doesn’t share data with big tech companies about my files, and hooks up extremely fast P2P connections that beat cloud-based services by a long shot.
    • Metube & Seal - Simple interfaces for downloading with yt-dlp, can download from YouTube, but also many other sites. Doesn’t spam you with popup ads or junk redirects like those “youtube downloader” type sites. Seal is my favorite of the two, but is only on Android.
    • Image Toolbox - Insanely feature-packed app for doing practically anything you could want to an image. Converting formats, clearing EXIF data, removing backgrounds, feature-packed editing, OCR, convert to SVG, create color palettes, converting PDFs to images, decode and encode Base64 to and from images, extract frames from gifs, encrypt & decrypt files, make zip files, and a lot more. All local.
    • Rustdesk - No-nonsense remote desktop, tons of features, simple file transfer, cross-platform compatibility, and P2P communication without needing a third party server if you so choose.
    • LibreOffice - Essentially everything you’d get with Office 365 (e.g. Word, Excel, PowerPoint) but without the $150 price point. Compatible with the same file formats, and has the same functionality.
    • Cashew - Feature rich financial app for budgeting, tracking purchases, saving for goals, etc. Doesn’t have automatic import, but I find that manually putting every transaction in keeps me aware of my spending much better than before, so for me it’s quite worth it. Install directly from the APK, or use on web though. The version on the app stores has some features locked behind a paywall.
    • Linkwarden - Bookmark manager with cross-platform support, a web interface, automatic tagging, automatic archiving of any saved links in multiple formats, collaborative sharing capabilities, and more. It’s free, but you can also pay $3/mo if you want them to host it for you.

    Edit: And Umbrel (on Raspberry Pi) if you want to host things more easily. Basically just a much more hands-off, user-friendly docker for people who don’t want to tinker as much.

    Edit 2: Non-FOSS, but Obsidian is the best note taking app I’ve ever used. Great selection of community-made plugins (which are FOSS) for additional functionality, and all notes are in standard cross-software-compatible Markdown. No locked-in proprietary formats.



  • I hate community notes, it’s a cost free way of fact checking with no accountability.

    I don’t think it’s necessarily bad, but it can be harmful if done on a platform that has a significant skew in its political leanings, because it can then lead to the assumption that posts must be true because they were “fact checked” even if the fact check was actually just one of the 9:1 ratio of users that already believes that one thing.

    However, on platforms that have more general, less biased overall userbases, such as YouTube, a community notes system can be helpful, because it directly changes the platform incentives and design.

    I like to come at this from the understanding that the way a platform is designed influences how it is used and perceived by users. When you add a like button but not a dislike button, you only incentivize positive fleeting interactions with posts, while relegating stronger negative opinions to the comments, for instance. (see: Twitter)

    If a platform integrates community notes, that not only elevates content that had any effort at all made to fact check it (as opposed to none at all) but it also means that, to get a community note, somebody must at least attempt to verify the truth. And if someone does that, then statistically speaking, there’s at least a slightly higher likelihood that the truth is made apparent in that community note than if none existed to incentivize someone to fact check in the first place.

    Again, this doesn’t work in all scenarios, nor is it always a good decision to add depending on a platform’s current design and general demographic political leanings, but I do think it can be valuable in some cases. (This also heavily depends on who is allowed access to create the community notes, of course)


  • There is some logic to using crypto, but solely using it as « haha numbers go up, profit, profit! » is stupid

    I heavily agree with this. I see too much blanket anti-crypto sentiment regardless of the possible use case.

    When I pay for my VPN, paying in XMR means they can’t tie my real-world name and address from my card to my account. That’s objectively beneficial compared to my VPN knowing my exact name and address in conjunction with my browsing activity.

    If I want to donate to a creative in a different country but they can’t use traditional banking rails that connect to my country, how else do I send them money online?

    Sure, there’s a ton of issues with crypto not just in practice, but even in concept, but as you said, there is some logic to using crypto.


  • This makes sense to me from a framing perspective. As an American myself, despite my best efforts, I still fall into the same trap of sort of assuming everything is much more American centric than it actually is, including other people’s opinions on American politics from outside America.

    His post does come off as wildly tone deaf, but seeing how he would have perceived it, it makes a lot of sense. He endorses policy by a party that shared his values, and then gets pushback for it from people who support his values. I’d probably be as confused as him if I were in his shoes.