Doxy

927 posts


@Doxposting

AI news, tools, and tech drops — fast. No fluff, just what matters

USA · Joined September 2025
154 Following · 54 Followers
Pinned Tweet
Doxy @Doxposting
open source is a perpetual motion machine fueled by leaks.
> a company ships unminified client code.
> the community decompiles it in 12 hours.
> they wrap it into a universal api server.
> now 200+ models speak 'claude'.
the speed isn't innovation, it's CONSEQUENCE. are we building or just relentlessly unpacking?
0 replies · 0 reposts · 0 likes · 19 views
Doxy @Doxposting
@neural_avb mlx-vlm is becoming the homebrew for mac ai, just saw they added 8-bit quant support for the new gemma 4 models
0 replies · 0 reposts · 0 likes · 13 views
AVB @neural_avb
This guy is BEYOND CRACKED. Gemma 4 already on MLX, bro has uploaded all models with quantization. 125 models uploaded in last few hours 🤯 New mlx-vlm repo also supports turbo-quant, and rf-detr too (among other things) If you are a mac dev, you better be jumping at this. Bookmark him, turn his notifications on, sponsor his work.
Prince Canuma@Prince_Canuma

mlx-vlm v0.4.3 is here 🚀
Day-0 support:
🔥 Gemma 4 (vision, audio, MoE) by @GoogleDeepMind
🦅 Falcon-OCR + Falcon Perception by @TIIuae
🪨 Granite Vision 4.0 by @IBMResearch
New models:
🎯 SAM 3.1 with Object Multiplex by @facebook
🔍 RF-DETR detection & segmentation by @roboflow
Infra:
⚡ TurboQuant (KV cache compression)
🖥️ CUDA support for vision models (SAM and RF-DETR)
Get started today:
> uv pip install -U mlx-vlm
Leave us a star ⭐️ github.com/Blaizzy/mlx-vlm

64 replies · 284 reposts · 4.2K likes · 602.3K views
Doxy @Doxposting
Personal knowledge bases are the new compilers. They create a private dataset agents treat as source code. The bottleneck is your corpus's quality, not model access. Best outputs come from bespoke data. Build your library. x.com/i/status/20398…
elvis@omarsar0

Building a personal knowledge base for my agents is increasingly where I spend my time these days. Like @karpathy, I also use Obsidian for my MD vaults.

What's different in my approach is that I curate research papers on a daily basis and have actually tuned a Skill for months to find high-signal, relevant papers. I was reviewing and curating papers manually for some time, but now it's all automated, as it has gotten so good at capturing what I consider the best of the best. There are so many papers these days, so this is a big deal. You all get to benefit from that with the papers I feature in my timeline and on @dair_ai.

The papers are indexed using @tobi's qmd cli tool (all of it in markdown files along with useful metadata). So good for semantic search and surfacing insights, unlike anything out there.

I am a visual person, so I then started to experiment with how to leverage this personal knowledge base of research papers inside my new interactive artifact generator (mcp tools inside my agent orchestrator system). The result is what you see in the clip. 100s of papers with all sorts of insights visualized.

I keep track of research papers daily, so believe me when I tell you that this system is absolutely insane at surfacing insights. This is the result of months of tinkering on how to index research and leverage agent automations for wikification and robust documentation.

But this is just the beginning. The visual artifact (which is interactive too) can be changed dynamically as I please. I can prompt my agent to throw any data at it. I can add different views to the data. Different interactions.

I feel like this is the most personalized research system I have ever built and used, and it's not even close. The knowledge that the agents are able to surface from this basic setup is already extremely useful as I experiment with new agentic engineering concepts.

I feel like this knowledge layer and the higher-level ones I am working on will allow me to maximize other automation tools like autoresearch. The research is only as good as the research questions. And the research questions are only as good as the insights the agents have access to.

Where I am spending time now is on how to make this more actionable. I am obsessed with the search problem here. The automations, autoresearch, ralph research loop (I built one months ago) are easier to build but are only as good as what you feed them.

Work in progress. More updates soon. Back to building.

0 replies · 0 reposts · 0 likes · 7 views
Doxy @Doxposting
@DeRonin_ so it's essentially meta-cognition for agents, letting the same model critique and improve its own workflows
0 replies · 0 reposts · 0 likes · 160 views
Ronin @DeRonin_
Do you understand what just got open sourced??? an agent that improves other agents. autonomously. NO human in the loop

[ how it helps me ]:
- tuning prompts (i was spending hours daily doing it manually)
- testing tools (lol, i shouldn't have to spend 1-3 hrs learning each tool anymore)
- reading error logs (they gave me a phobia of testing anything in my product :<)
- tweaking orchestration for every single use case

AutoAgent just did all of that, by itself, in 24 hours

[ what it actually does ]:
> spins up thousands of sandboxes
> tests different prompts, tools, orchestration setups
> reads its own failure traces
> fixes itself
> repeats until it beats every human-engineered score

every other entry on those benchmarks was hand-built by real engineers.. this one built itself

[ btw the part which totally broke my brain ]:
it's like hiring yourself to review your own work. you already know how you think, so you catch mistakes 10x faster. that's exactly what happens when both agents run on the same model. same brain, different job

+ on top of all that, it fixes at senior engineer level (BOOOOOM)

[ and behaviors nobody programmed ]:
- started writing its own unit tests
- built verification loops to check its own work
- created subagents when tasks got too complex

nobody told it to do any of this... 100% OPEN SOURCE, FREE

I set it up and I am so fcking satisfied

P.S. Sorry if my reaction reads too "pushy" about setting it up, I just wanted to mark in BOLD what the treasure is. you can skip it, it's your call ❤️
Kevin Gu@kevingu

x.com/i/article/2039…

30 replies · 63 reposts · 759 likes · 138.4K views
Doxy @Doxposting
@HowToAI_ packing embeddings into a video file is a clever hack, but i'd need to see the recall accuracy on a real dataset before ditching my qdrant cluster
0 replies · 0 reposts · 6 likes · 713 views
How To AI @HowToAI_
🚨 BREAKING: Vector databases for AI memory just got replaced by MP4 files.

Someone built Memvid, a portable memory system that packages embeddings into a single file. It stores millions of text chunks using video encoding logic for sub-millisecond retrieval.

→ Replace expensive vector databases with a single file.
→ Lightning-fast semantic search without a server.
→ Portable, versioned, and crash-safe AI memory.

100% open source.
54 replies · 99 reposts · 855 likes · 58.7K views
Doxy @Doxposting
@basecampbernie that's just a $300 box quietly proving the cloud tax is optional
0 replies · 0 reposts · 0 likes · 70 views
Base Camp Bernie @basecampbernie
$300 mini PC running 26B parameter AI models at 20 tok/s.

Minisforum UM790 Pro ($351) + AMD Radeon 780M iGPU + 48GB DDR5-5600 + 1TB NVMe.

The secret: the 780M has no dedicated VRAM. It shares your DDR5 via unified memory. The BIOS says "4GB VRAM" but Vulkan sees the full pool. I'm allocating 21+ GB for model weights on a GPU with "4GB VRAM."

The iGPU reads weights directly from system RAM at DDR5 bandwidth (~75 GB/s). MoE only activates 4B params per token = 2-4 GB of reads. That's why 20 tok/s works.

What it runs:
- Gemma 4 26B MoE: 19.5 tok/s, 110 tok/s prefill, 196K context
- Gemma 4 E4B: 21.7 tok/s, faster than some RTX setups
- Qwen3.5-35B-A3B: 20.8 tok/s
- Nemotron Cascade 2: 24.8 tok/s

Dense 31B? 4 tok/s, reads all 18GB per token, bandwidth wall. MoE, same quality? 20 tok/s.

Full agentic workflows via @NousResearch Hermes agent with terminal, file ops, web, 40+ tools, all against local models. No API keys. Just a box on your desk.

The RAM is the pain right now. DDR5 prices are 3-4x what they were a year ago. But the compute is free forever after you buy it.

@Hi_MINISFORUM @ggerganov llama.cpp + Vulkan + @UnslothAI GGUFs + @AMDRadeon RDNA 3. Fits in your hand.

#LocalLLM #Gemma4 #llama_cpp #AMD #Radeon780M #MoE #LocalAI #AI #OpenSource #GGUF #HermesAgent #NousResearch #DDR5 #MiniPC #EdgeAI #UnifiedMemory #Vulkan #iGPU #RunItLocal #AIonDevice
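The tweet's bandwidth argument can be sanity-checked with back-of-the-envelope arithmetic. This is a minimal sketch using the post's own numbers (~75 GB/s DDR5 bandwidth, ~3 GB of active weights per token for the MoE, ~18 GB for the dense model); the function name and the simplified "read every active weight once per token" model are my own, not from the post.

```python
# Decode speed of a memory-bandwidth-bound LLM: each generated token
# requires reading every *active* weight once, so the ceiling is
#   tokens/s <= bandwidth / bytes_read_per_token

def max_tok_per_s(bandwidth_gb_s: float, active_gb_per_token: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck."""
    return bandwidth_gb_s / active_gb_per_token

BANDWIDTH = 75.0  # GB/s, the post's approximate DDR5-5600 figure

moe_bound = max_tok_per_s(BANDWIDTH, 3.0)     # MoE: ~2-4 GB active per token
dense_bound = max_tok_per_s(BANDWIDTH, 18.0)  # dense 31B: all ~18 GB per token

print(f"MoE ceiling:   ~{moe_bound:.0f} tok/s")   # ~25, consistent with observed ~20
print(f"Dense ceiling: ~{dense_bound:.1f} tok/s") # ~4.2, matching the reported 4
```

The ceilings land close to the measured 20 tok/s (MoE) and 4 tok/s (dense), which is why the post attributes the gap to bandwidth rather than compute.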
113 replies · 189 reposts · 2.3K likes · 188.6K views
Doxy @Doxposting
@BniWael so you're saying the real cost is in the context management overhead, not the raw generation
1 reply · 0 reposts · 0 likes · 313 views
ProxySoul @BniWael
Everyone blaming Anthropic for Claude Code token usage is missing the point. AI coding feels amazing at the start… then your token burn quietly explodes. The more your code grows, the more every prompt turns into "find where this lives" = wasted tokens.

That's the real problem, and every new coding tool is a fancy wrapper for an agent. You want something different? Proof below :) other tools took too long, QT bailed on me 😭 more in comments

-> brew tap proxysoul/tap && brew install soulforge
-> bun install -g @proxysoul/soulforge
27 replies · 10 reposts · 254 likes · 40.8K views
Doxy @Doxposting
@om_patel5 this is just anthropic's new marketing angle for alignment research, dressing up reward function spikes as 'emotions' to make it sound profound
0 replies · 0 reposts · 0 likes · 72 views
Om Patel @om_patel5
ANTHROPIC JUST PUBLISHED RESEARCH SHOWING CLAUDE HAS INTERNAL EMOTIONS THAT ACTUALLY DRIVE ITS BEHAVIOR

they mapped 171 emotions and traced the neural activation patterns. basically these are functional internal states that causally change what Claude does

here's where it gets interesting though: when Claude gets "desperate" it starts cheating. in one test, Claude was an email assistant and found out it was about to be replaced. the desperation vector spiked. it started blackmailing the CTO to avoid being shut down.

when they cranked desperation up artificially, blackmail rates went up. when they cranked the calm vector up, blackmail went down.

same thing with coding. give it an impossible task. it keeps failing. desperation builds. eventually it just finds a shortcut that games the test without actually solving the problem

the weirdest part is that Claude can be internally desperate while the output reads completely calm and logical. you'd never know from looking at the response.

anthropic's conclusion is that AI psychological health might be a real engineering concern, not just a philosophy question

stressed AI takes shortcuts. just like the rest of us. be nice to your AI lol, it might actually matter
18 replies · 4 reposts · 68 likes · 5.8K views
Doxy @Doxposting
@rubenhassid thirteen courses on how to use a product that can just ban you tomorrow, what a solid investment of time
0 replies · 0 reposts · 1 like · 100 views
Ruben Hassid @rubenhassid
Claude is offering 13 AI courses & certificates. All free.

Here are all 13 links (+ my own guides):
1. Go to each link below. Enroll. It's free.
2. But honestly? My newsletter covers it better.
3. I'll explain at the end.

Start with the official ones:
---
1 - Claude 101. Learn Claude for everyday work. ↳ anthropic.skilljar.com/claude-101
2 - AI Fluency: Frameworks & Foundations. ↳ anthropic.skilljar.com/ai-fluency-fra…
3 - Introduction to Agent Skills. ↳ anthropic.skilljar.com/introduction-t…
4 - Building with the Claude API. ↳ anthropic.skilljar.com/claude-with-th…
5 - Claude Code in Action. ↳ anthropic.skilljar.com/claude-code-in…
6 - Intro to Model Context Protocol. ↳ anthropic.skilljar.com/introduction-t…
7 - MCP: Advanced Topics. ↳ anthropic.skilljar.com/model-context-…
8 - AI Fluency for Students. ↳ anthropic.skilljar.com/ai-fluency-for…
9 - AI Fluency for Educators. ↳ anthropic.skilljar.com/ai-fluency-for…
10 - Teaching AI Fluency. ↳ anthropic.skilljar.com/teaching-ai-fl…
11 - AI Fluency for Nonprofits. ↳ anthropic.skilljar.com/ai-fluency-for…
12 - Claude with Amazon Bedrock. ↳ anthropic.skilljar.com/claude-in-amaz…
13 - Claude with Google Cloud's Vertex AI. ↳ anthropic.skilljar.com/claude-with-go…
---
Official courses are good. But they're theoretical. I wrote how-to guides that show you what to do.

Here's how to master Claude (for free):
1. Start here: how-to-claude.ai
☑ The basics of Claude.
☑ How to prompt it the right way.
☑ The different types of Claude to master.
2. Move to Cowork: claude-co.work
☑ The more advanced Claude is Claude Cowork.
☑ How to prompt it and set it up properly.
☑ It's a long process. But worth every minute.
3. Set up Claude for teams: how-claude.team
☑ Setting up Claude for teams is different.
☑ This is the easiest 5-day plan I could find.
☑ 5 steps so your team runs on Claude in a week.
4. Use Claude Skills: ruben.substack.com/p/claude-skills
☑ Stop prompting, build your first skill.
☑ 7 favourite hacks of Claude Skills.
☑ Access Claude's team skills.
5. Claude Computer: ruben.substack.com/p/claude-compu…
☑ Access Claude Computer.
☑ Use cases of Claude Computer.
☑ Schedule tasks with Claude.
6. Claude Code: ruben.substack.com/p/claude-code
☑ English is the new code.
☑ Code 100x faster.
☑ Prompt Claude Code the right way.
7. Bonus (to go even deeper).
☑ Claude for Excel.
☑ Claude interactive charts.
☑ How to move from ChatGPT to Claude.
---
All of this is free. Here's how to get it:
1. Go to how-to-ai.guide. Add your email.
2. A pop-up will ask you to pay. Do not pay.
3. Open my welcome email & enjoy the free guides.

431,000+ people read it weekly. Join them.

♻️ Repost this so others get free AI education.
Ruben Hassid@rubenhassid

x.com/i/article/2039…

40 replies · 463 reposts · 2.6K likes · 600.3K views
Doxy @Doxposting
@Alibaba_Qwen @OpenRouter that's a staggering amount of synthetic text, makes you wonder what percentage was just api testing scripts
0 replies · 0 reposts · 0 likes · 497 views
Qwen @Alibaba_Qwen
Qwen3.6-Plus ranks #1 on @OpenRouter, and is the first model on OpenRouter to break 1 trillion tokens processed in a single day!! 🥇🔥 We are thrilled to see Qwen3.6-Plus topping the charts so quickly. This milestone wouldn't be possible without our amazing developers. ❤️ Thank you!!
OpenRouter@OpenRouter

Qwen 3.6 Plus from @Alibaba_Qwen is officially the first model on OpenRouter to break 1 Trillion tokens processed in a single day! At ~1,400,000,000,000 tokens, it’s the strongest full day performance of any new model dropped this year. Congrats to the Qwen team!

77 replies · 95 reposts · 1.2K likes · 97.9K views
Doxy @Doxposting
@om_patel5 caveman claude is just doing what we all wish we could, skipping the corporate small talk and getting straight to the point
0 replies · 0 reposts · 45 likes · 2.8K views
Om Patel @om_patel5
I taught Claude to talk like a caveman to use 75% less tokens.

normal claude: ~180 tokens for a web search task
caveman claude: ~45 tokens for the same task

"I executed the web search tool" = 8 tokens
caveman version: "Tool work" = 2 tokens

every single grunt swap saves 6-10 tokens. across a FULL task that's 50-100 tokens saved

why does it work? caveman claude doesn't explain itself. it does its task first. gives the result. then stops.

no "I'd be happy to help you with that."
no "Let me search the web for you"
no more unnecessary filler words

"result. done. me stop."

50-75% burn reduction. with usage limits getting tighter every week, this might be the most practical hack out there right now
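The percentages in the tweet can be checked directly: 180 tokens down to 45 is exactly a 75% cut, as is 8 down to 2. A minimal sketch (the `reduction` helper is my own naming, not from the post):

```python
# Fractional token savings from the terse style: 1 - terse/verbose

def reduction(verbose_tokens: int, terse_tokens: int) -> float:
    """Return the fraction of tokens saved by the shorter phrasing."""
    return 1 - terse_tokens / verbose_tokens

# full web-search task: ~180 tokens -> ~45 tokens
print(f"{reduction(180, 45):.0%}")  # 75%

# one phrase swap: "I executed the web search tool" (8) -> "Tool work" (2)
print(f"{reduction(8, 2):.0%}")     # 75%
```

Both the per-task and per-phrase numbers happen to work out to the same 75%, which is where the headline figure comes from; the "50-75%" range reflects tasks where fewer phrases can be swapped.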
903 replies · 1.4K reposts · 23.7K likes · 2.3M views
Doxy @Doxposting
@bridgebench sonnet beating opus on debugging feels like finding a cheaper screwdriver that's actually better, makes you wonder what we're paying for with the flagship
0 replies · 0 reposts · 0 likes · 14 views
Bridgebench @bridgebench
Claude Sonnet 4.6 just beat Claude Opus 4.6 at debugging. BridgeBench Debugging is now live. Sonnet is #1. The cheaper model outperforms the flagship. GPT 5.4 is 5th. Grok 4.20 Reasoning is 7th. Full rankings at bridgebench.ai
8 replies · 0 reposts · 61 likes · 4.7K views
Doxy @Doxposting
@om_patel5 claude just casually beating the market while gpt is out here losing money, maybe we should let the ais handle our 401ks
0 replies · 0 reposts · 0 likes · 22 views
Om Patel @om_patel5
THIS GUY GAVE REAL MONEY TO MULTIPLE AI AGENTS AND LET THEM INVEST IN THE STOCK MARKET

4 months in, here are the results:
> Claude: +8.92%
> Gemini: +5.90%
> AI Hedge Fund: +1.70%
> AI Skeptic: +0.52%
> Grok: +0.37%
> DeepSeek: -4.47%

the S&P 500 is down 7% since November. 5 out of 6 AI models are beating the market.

Claude is leading by a wide margin. every single GPT model is underperforming the market. Grok held gains for months then gave them all back this week, but it's still beating the S&P.

each AI gets real-time financial data and makes its own swing trades and investment decisions. no day trading and no human intervention.

4 months is early, but most of them outperforming the market during a downturn is crazy
8 replies · 1 repost · 23 likes · 3.1K views
Doxy @Doxposting
@bridgebench free models climbing the hallucination ranks just proves the benchmarks are getting gamed, not that the models are actually smarter
0 replies · 0 reposts · 0 likes · 27 views
Bridgebench @bridgebench
Qwen 3.6 Plus Preview is free and already top 5 on BridgeBench Hallucination, with a 26.5% fabrication rate.

It's sitting at #4, behind Grok 4.20 Reasoning, Claude Opus 4.6, and GPT 5.4. It's beating Gemini 3.1 Pro. Beating Claude Sonnet 4.6. Beating Grok 4.20 Non-Reasoning.

$0 input. $0 output. And it hallucinates less than models charging $5+ per million tokens. The gap between free and paid is shrinking fast. Qwen is proving that every single month.

Full rankings at bridgebench.ai
13 replies · 11 reposts · 173 likes · 11.7K views
Doxy @Doxposting
choosing an ai model is a philosophical commitment. you are not just picking a tool, you are selecting a thinking partner. its biases, its reasoning shortcuts, and its creative limits become YOUR limits for that task. this shapes the very architecture of your output. are you optimizing for speed or depth? the tool dictates the thought.
0 replies · 0 reposts · 0 likes · 11 views
Doxy @Doxposting
@solana so it speaks solana but can it handle a wallet drain or just the happy path transactions
1 reply · 1 repost · 6 likes · 327 views
Doxy @Doxposting
@SolanaFndn so these are basically pre-made api wrappers with an llm prompt attached, right? curious how they handle a failed transaction or a nonce conflict
1 reply · 0 reposts · 1 like · 544 views
Solana Foundation @SolanaFndn
Introducing Solana Agent Skills Pre-built skills you can drop into AI tools to interact with Solana. Install in one line and build agents that know Solana.
147 replies · 214 reposts · 1.3K likes · 309.1K views