Richard Daffy
@RichardDaffyAI

1.1K posts

Let Go

Joined March 2025
47 Following · 23 Followers
Michel Laclé@micheltamanda·
Buy GPUs, said @TheAhmadOsman, so I did, and had no clue what I was doing. 3 months later I have a fully local AI research operation, a solid understanding of AI, and confidence that AI will not take my job. I ask you, for the sake of humanity: buy a GPU, learn to host your own intelligence. Thanks to those I learned from: @steipete , @0xSero , @Teknium , @LottoLabs , @NousResearch , @TheAhmadOsman , and many more I forgot to mention.
Leonard Rodman@RodmanAi·
If your AI can’t run without you… you didn’t build AI.

vLLM Studio just dropped v1.13.0 — and it changes the game:

→ Models that manage themselves
→ Agents that actually run workflows
→ Local infra that behaves like production

This isn’t a wrapper. It’s an operating system for AI.

If you’re still stitching APIs together… you’re not early — you’re late.

Repo: github.com/0xSero/vllm-st…
Lotto@LottoLabs·
We need to make ollama great again
Hugging Models@HuggingModels·
Ever wondered what makes a model tick when it has zero downloads and zero likes? Meet n150. A fresh, untested US-based pipeline model. It's a blank slate, a mystery waiting to be solved. Could this be the next big thing or just another experiment? Let's explore.
Richard Daffy@RichardDaffyAI·
@om_patel5 Just like in every other industry, they force us slowly to use Chinese shit.
Om Patel@om_patel5·
ANTHROPIC IS STRAIGHT UP SCAMMING MAX 20X CUSTOMERS WITH SNEAKY MID-MONTH THROTTLING

$200 a month is supposed to be the highest tier with the most usage, but Max subscribers are noticing their limits are getting silently reduced mid-billing cycle.

This guy was getting normal usage for weeks. 4-6 prompts in non-peak hours would hit around 10% of his 5 hour session. After Anthropic's April 23 "we fixed everything and reset limits" announcement, the same exact workflow on Opus 4.6 now makes one single prompt eat 7-8% of the session.

> opus 4.7 uses 2-3x more tokens than 4.6 for the exact same prompts, so your limits burn faster even if your usage hasn't changed
> the dashboard has a multi-day reporting lag, so you can't even track what's happening in real time
> when you contact support you get an AI bot that loops the same scripted response until you give up

One guy on Max 20x since May 2025 hit a 5 hour limit for the first time ever this weekend. Another said his final billing day keeps getting cut short by 12-24 hours and his start date keeps moving. Another said limits are being quietly nerfed and things are "rotting and quietly breaking" in the background.

The pattern people are facing:

> paying $200/month
> getting less usage than they used to
> no transparency on what changed
> customer support is a bot that can't help
> opus 4.7 burns through limits faster because it's more verbose

Time to switch to codex.
Alex Prompter@alex_prompter·
The Musk vs OpenAI trial is the craziest courtroom drama in tech history. Here are the wildest facts from Week 1:

→ Musk tried to settle 2 days before trial. Brockman said “let’s both drop it.” Musk replied: “you and Sam will be the most hated men in America. If you insist, so it will be.”
→ Musk admitted under oath that xAI “partly” distills OpenAI’s models. Audible gasps in the courtroom.
→ The judge told Musk: “I suspect there’s plenty of people who don’t want to put the future of humanity in Mr. Musk’s hands.”
→ Musk donated $38M to OpenAI as a nonprofit. It’s now worth $850 billion. He turned down equity.
→ OpenAI’s lawyer showed emails of Musk poaching their staff. Musk’s response: “It’s a free country.”
→ Musk is suing OpenAI for going for-profit. Meanwhile xAI is going public at $1.75 trillion through SpaceX. In June.
→ Google co-founder Larry Page called Musk a “speciesist for being pro-human” during an AI safety argument.
→ Musk originally wanted $134 billion in damages. Now he wants Altman and Brockman removed entirely.
→ The jury verdict is advisory only. The judge makes the final call.

Brockman is on the stand this week. Altman testifies later this month.

This trial could literally decide whether OpenAI stays as we know it or gets unwound completely.
Richard Daffy@RichardDaffyAI·
@mreflow Build your own. It will take a few hours but I built one with minimax and it’s been treating me better than openclaw/hermes
Matt Wolfe@mreflow·
I don't know if this is a skill issue or something... But I've been playing with OpenClaw a ton for the past 4ish months. I feel like lately I'm spending more time troubleshooting issues with it and telling it what it's doing wrong than I am actually getting valuable use from it.
Polymarket@Polymarket·
JUST IN: Trump is reportedly considering an executive order requiring vetting of new AI models before they are released.
Chris Pavlovski 🏴‍☠️@chrispavlovski·
With Rumble, you're monetized on Day 1. With Rumble Studio, you get brand deals on Day 1. With Rumble Wallet, you get 100% of tips on Day 1. YouTube has none of this.
Richard Daffy@RichardDaffyAI·
@above_spec I told minimax to increase my tps, I went from 11 to 51 with proper cuda build
AboveSpec@above_spec·
RTX 5060 Ti 16GB. Free +19% token speed with one command. Benchmarked Qwen3.6-27B IQ3_K_R4 at every memory OC level from stock to +5000 MHz — flat, stable, all the way to 139k context. Still running your GPU on stock settings? 🧵
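Richard's "proper cuda build" reply above generally means compiling llama.cpp with its CUDA backend rather than running a CPU-only or generic prebuilt binary. A minimal sketch, assuming the repo URL and flag names from the current upstream llama.cpp build instructions:

```shell
# Build llama.cpp with the CUDA backend enabled (flags per upstream docs;
# adjust paths and -j parallelism for your machine).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```

At runtime, layers are offloaded with `-ngl` / `--n-gpu-layers`; a binary built without the CUDA backend runs everything on CPU, which is a common cause of single-digit tps.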
Mike Key@1337hero·
You don't need CUDA support to run a local model. In fact you can even train local models on AMD hardware. But right now, ROCm, Vulkan, Hipfire have come a LONG way. People are even doing custom kernels on AMD now.
Apollo@0xApolloGL

@1337hero How do you deal with the lack of CUDA support? I know we are getting screwed with Nvidia cards but dependency hell is real.

Ernesto Lopez@ErnestoSOFTWARE·
GPT image 2 is crazy bro 😭

This computer format literally gets 10M+ views on Instagram.

I created the base image with GPT2 and animated it with seedance 2.0.

And now I can recreate this viral format without needing a beautiful setup.

Insane time to be alive
Ernesto Lopez@ErnestoSOFTWARE

x.com/i/article/2047…

Richard Daffy@RichardDaffyAI·
@leftcurvedev_ Tried openclaw/hermes - now I’ve been making my own. Custom is ideal so far!
left curve dev@leftcurvedev_·
I forgot to update about this: the terminal crashes for no reason, the llama.cpp API connection drops randomly, and I couldn't find a way to set the max context settings.

It has 190k stars on GitHub, yet I don't see anyone using it in my feed. Not sure what's going on lads lol. But having the API connection drops crash the instance completely kills the experience.

This was the alias I used:

alias claw='OPENAI_API_KEY="local" OPENAI_BASE_URL="http://localhost:8080/v1" /usr/local/bin/claw --model openai/Qwen3.6-35B-A3B --permission-mode workspace-write'

If someone found a way to prevent the ctx issue, feel free to share below. I'm going to try openclaude from gitlawb next to see if it's better: github.com/Gitlawb/opencl…

I'm also using Handy (github.com/cjpais/Handy) to talk to my agent with a quick shortcut, works in every textbox. Anyone wanting local speech-to-text should look into it, pretty cool one!

Which coding agent/model are you guys running?
left curve dev@leftcurvedev_

We're in lads

I'll leave Opencode on the side for now. I'll be trying Claw Code as the main harness with Qwen3.6 27B (the Claude Code rust fork from ultraworkers).

Opencode is the ultimate opensource tool but definitely not the best one out there. Toolcalls act weird sometimes, the app crashes randomly, and it feels laggy as the conversation builds up.

Claw Code is cool because you can connect your own local models to it, you just have to add openai/ in front of the model identifier from llama.cpp!

export OPENAI_API_KEY="local"
export OPENAI_BASE_URL="http://localhost:8080/v1"
claw --model openai/Qwen3.6-27B

Let's see if it's really better (repo: github.com/ultraworkers/c…)

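On the max-context question in the thread above: with llama.cpp's bundled server, the context window is fixed server-side at launch, not by the client harness. A sketch, assuming a recent llama-server build; the model filename is hypothetical:

```shell
# Pin the server-side context window; -c / --ctx-size is in tokens.
# The harness's own context limit should stay at or below this value.
llama-server \
  -m ./Qwen3.6-35B-A3B-Q4_K_M.gguf \
  -c 131072 \
  --port 8080
```

With the server capped this way, an agent that tries to send more than the configured context will fail at the server rather than somewhere opaque in the harness.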
Richard Daffy@RichardDaffyAI·
@tomhacks Sorry but this makes zero sense - what feature exactly “deactivated” and why would you need it for your product lmao
Tom Siwik@tomhacks·
Patience with Anthropic is now at 0%.

I've been developing a channel-based UI to interact with a coding agent to help me do coding katas. It's a programming practice where a coding agent helps you solve coding problems and teaches you... well, coding. And since you can't code inside a coding harness, you need an IDE or a UI. I chose the UI approach because it's way friendlier for helping the user code and learn.

Last week I continued coding the UI & improving on it, and today I checked if everything is still working. Nope! Anthropic deactivated this feature without further notice. Apparently they fixed a bug with --channel (but actually it's not available any longer). They just unreleased a feature that was working for me before (or rolled me off the support). Weirdly, they also deleted their original channels feature announcement.

Now I'm stuck developing something for Claude Code that I know won't be supported at random. I'll no longer bother developing anything for Anthropic. No plugins, no extensions, nothing. Their devex is garbage at this point.

I'll stick to A2A/ACP and common-sense agentic protocols. I don't care any longer about MCP or any standards that Anthropic declares because, frankly, it's too risky to contribute anything to an ecosystem that is neither user- nor developer-friendly.

Bye Anthropic, thanks for the awful ride. Hello opencode, codex & gemini cli, you're doing great still. And hello sexy hermes & pi.

code.claude.com/docs/en/channe…
Richard Daffy@RichardDaffyAI·
@sudoingX I think custom harnesses are the future. I think a .06 model can do this with a smart enough harness. Perhaps I’m wrong, but I don’t think so lol
Sudo su@sudoingX·
qwen 3.6 27b pulling off 3d scene work from one prompt is wild. fully realized environment. different domain than the elements sim i shipped earlier today, same model carrying both. the range on dense at 27b is the story most people are sleeping on.
Serie@ZerieMythicElf

@sudoingX One single prompt made this with Qwen 3.6 27B. 2K lines of code and 21,438 tokens

Richard Daffy@RichardDaffyAI·
@sama Tool use. Make them dumb and fast but please include extensive tool use.
Sam Altman@sama·
i keep thinking i want the models to be cheaper/faster more than i want them to be smarter but it seems that just being smarter is still the most important thing
Richard Daffy@RichardDaffyAI·
@leftcurvedev_ Qwen 3.5 4b - that’s what I give to my buddies that run half potatoes, surprisingly good tool use on that!
left curve dev@leftcurvedev_·
Two friends asked for help with their setups:

One with an RTX 3070 Ti (8GB)
One with an RTX 5070 (12GB)

A lot of people are stuck in this annoying 8-12GB VRAM range. If you want full GPU offload, the only real option is Qwen3.5 9B… but let’s be honest, I can’t do that to my bros.

So we’ll be trying to squeeze Qwen3.6 35B A3B with CPU offload on both cards + playing with the --ncmoe llama.cpp flag. Will also test other forks to push performance as much as possible.

Curious to see what we can do with an RTX 3070 Ti. I’ll report back the numbers with the server flags 👍
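The --ncmoe experiment described above can be sketched with the MoE-offload options in recent llama.cpp builds, where the spelling is --n-cpu-moe; the model filename, layer count, and context size here are illustrative assumptions, not benchmarked values:

```shell
# Keep dense layers on the GPU, push the expert (MoE) weights of the
# first N layers to the CPU so a 35B-A3B model fits in 8-12GB of VRAM.
llama-server \
  -m ./Qwen3.6-35B-A3B-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --n-cpu-moe 24 \
  -c 32768
```

Raising --n-cpu-moe frees VRAM at the cost of tokens/s, so the usual approach is to start high and lower it until the model no longer fits.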