Roger Doger

1.1K posts

Roger Doger


@DiocletianCode

American - Posts include tech, politics, finance, video games (mostly cs2) and other nonsense.

United States · Joined October 2018
545 Following · 165 Followers
Pinned Tweet
Roger Doger
Roger Doger@DiocletianCode·
Historical texts need to be fed to and analyzed by AI. It really surprises me that so many books have still never been scanned or uploaded as digital versions. AI training data is largely text scraped from web slop.
0
0
0
229
Roger Doger retweeted
Sudo su
Sudo su@sudoingX·
i pointed hermes agent at nvidia's nemotron cascade 2 30B-A3B on a single RTX 3090 24GB. IQ4_XS quant by bartowski, 187 tok/s, 625K context.

had it discover its own hardware, create an identity file, then build a full GPU marketplace UI from a single prompt. it one shotted it. first attempt no iteration. qwen 3.5 35B-A3B on the same hardware same 3090 24GB took an iteration to recover from a blank screen on the same type of build.

24 days between these two models releasing. same active parameters, completely different architectures and cascade 2 through hermes agent just keeps going. this model goes on and on. feast your eyes. more iterations and tests dropping soon. nvidia really cooked.

no special flags needed. nvidia optimized this mamba MoE so well it just runs. flash attention auto enabled, context auto allocated. the model does the work not the config. but i compiled llama.cpp from source and i'm not sure how it performs on other engines.

if you ran nemotron on any hardware drop your numbers below. RTX, AMD, Mac, whatever. model, quant, tok/s, engine. i want to see if it holds everywhere or just on llama.cpp.
[images attached]
Sudo su@sudoingX

nvidia's 3B mamba destroyed alibaba's 3B deltanet on the same RTX 3090. only 24 days between releases. same active parameters, same VRAM tier, completely different architectures.

nemotron cascade 2: 187 tok/s. flat from 4K to 625K context. zero speed loss. flags: -ngl 99 -np 1. that's it. no context flags, no KV cache tricks. auto-allocates 625K.

qwen 3.5 35B-A3B: 112 tok/s. flat from 4K to 262K context. zero speed loss. flags: -ngl 99 -np 1 -c 262144 --cache-type-k q8_0 --cache-type-v q8_0. needed KV cache quantization to fit 262K.

both models held a flat line across every context level. both architectures are context-independent. but nvidia's mamba2 is 67% faster at generating tokens on the exact same hardware and needs fewer flags to get there. same node, same GPU, same everything. the only variable is the model.

gold medal math olympiad winner running at 187 tokens per second on a single RTX 3090, a card from 6 years ago. nvidia cooked.
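For anyone trying to reproduce the comparison, the two launches can be sketched as llama-server invocations. This is a sketch using the exact flags quoted in the post; the .gguf filenames are placeholders, not the actual release filenames:

```shell
# Nemotron Cascade 2 30B-A3B (IQ4_XS): per the post, no context or KV-cache
# flags needed; the server auto-allocates the full 625K context
./llama-server -m nemotron-cascade-2-30b-a3b.IQ4_XS.gguf -ngl 99 -np 1

# Qwen 3.5 35B-A3B: per the post, needs an explicit context size plus
# q8_0 KV-cache quantization to fit 262K on 24GB of VRAM
./llama-server -m qwen3.5-35b-a3b.Q4.gguf -ngl 99 -np 1 -c 262144 \
  --cache-type-k q8_0 --cache-type-v q8_0
```

The asymmetry in flags is the post's point: the mamba MoE needs no KV-cache tuning because its state does not grow with context the way a standard attention KV cache does.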

52
44
670
51.1K
Oliver Prompts
Oliver Prompts@oliviscusAI·
Someone built a Chromium browser that runs entirely in your terminal. It's called Carbonyl, and it renders actual web pages in your command line. The best part is it runs at 0% CPU usage when idle. - Full Chromium engine in the terminal. - Idles at exactly 0% CPU. - Fast, lightweight, and completely terminal-native. - 100% open source.
131
474
3.9K
369K
Daniel Hnyk
Daniel Hnyk@hnykda·
LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send every credential it can find to a remote server and to self-replicate. link below
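For anyone triaging: .pth files in site-packages are executed at every interpreter startup, which is the mechanism a malicious litellm_init.pth would abuse. A minimal check might look like this (a sketch, assuming python3 is on PATH; litellm_init.pth is the filename reported in the post):

```shell
# locate the active interpreter's site-packages and list any .pth files there;
# inspect anything unexpected, especially a file named litellm_init.pth
site_pkgs="$(python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"
find "$site_pkgs" -maxdepth 1 -name '*.pth' -print
```

Note that legitimate .pth files do exist (editable installs, path extensions), so presence alone is not proof of compromise; the reported indicator is one that imports and decodes base64 payloads.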
305
2.3K
9.4K
5.5M
Roger Doger retweeted
MERICA MEMED
MERICA MEMED@Mericamemed·
This isn't just DC. It's all over.
46
129
1.7K
110K
Roger Doger retweeted
艾略特
艾略特@elliotchen100·
The paper is out. It's called MSA, Memory Sparse Attention. What it is, in one sentence: it gives large models natively built-in ultra-long memory. Not bolt-on retrieval, not brute-force context-window expansion; memory is grown directly into the attention mechanism and trained end to end.

Why don't past approaches work? RAG is essentially an open-book exam. The model remembers nothing itself and relies on flipping through notes on the spot. Whether it finds the right page depends on retrieval quality; how fast it finds it depends on data volume. Once information is scattered across dozens of documents and needs cross-document reasoning, it falls apart. Linear attention and KV caching are essentially compressed memory: things get remembered, but the harder you compress the blurrier they get, and long-range content is lost.

MSA's approach is completely different:
→ No compression, no bolt-ons; the model learns to attend to what matters. The core is a scalable sparse attention architecture with linear complexity. Memory can grow 10x without compute exploding.
→ The model knows where each memory came from and when. A positional encoding called document-wise RoPE lets the model natively understand document boundaries and temporal order.
→ Fragmented information can be chained into reasoning. A Memory Interleaving mechanism lets the model do multi-hop reasoning across memory fragments scattered everywhere: not just finding one relevant record, but linking clues into a chain.

The results?
· Scaling from 16K to 100 million tokens, accuracy degrades by less than 9%
· A 4B-parameter MSA model beats top-tier 235B-class RAG systems on long-context benchmarks
· 100-million-token inference runs on just 2 A800s. This isn't lab-exclusive; it's a cost a startup can afford.

Put simply, previous large models were extreme geniuses with goldfish memory. What MSA sets out to do is let them truly remember. We've put it on GitHub; the algorithm folks worked hard on this, so drop a star to support them. 🌟👀🙏 github.com/EverMind-AI/MSA
[image attached]
艾略特@elliotchen100

A small spoiler: @EverMind will also release a high-quality paper this week

167
568
3.2K
1.7M
Roger Doger retweeted
FutureRadar
FutureRadar@futureradar_FR·
🚨 YOUR CEO IS LYING TO YOU: AI IS NOT THE CAUSE OF THE LAYOFFS. Jensen Huang (NVIDIA's boss) just humiliated every tech CEO live on television. We're told AI is for "optimizing" and justifies firing thousands of people (hello Meta, Salesforce, Amazon). Jensen's response? "If a company uses AI to reduce its headcount, its leaders have no imagination. They've run out of ideas." The man who supplies 100% of the world's AI infrastructure says it in black and white: this technology is made to multiply what you can build, not to shrink your company. If they lay people off "because of AI," it's just an excuse for the board because they stopped innovating 3 years ago. The problem isn't the machine, it's your leaders' lack of vision. Do you think CEOs are using AI as a smokescreen to hide their incompetence? PS: we just released a video on NVIDIA's total dominance, link in the comments
146
1.4K
5.2K
400.4K
Roger Doger retweeted
Sudo su
Sudo su@sudoingX·
this guy has 29 models on huggingface at page 2 ranking. no lab behind him. no sponsorship. $2,000 from his own pocket on GPU rentals. he compressed GLM-4.7 to run on a MacBook and quantized Nemotron Super the week it dropped. all public. all free. nvidia is a trillion dollar company with hundreds of teams but they are not the ones quantizing models in the middle of the night and pushing them out before sunrise. if nvidia stopped tomorrow their employees would stop working. people like @0xSero would not. that is the difference between a paycheck and a mission. @NVIDIAAI you talk about making AI accessible. the people actually doing it are right here. 29 models deep, burning their own compute, with no ask except more hardware to keep going. you do not need to build another program. just look at who is already building for you. one GPU to this man would produce more public value than a hundred internal sprints. i am not asking for charity. i am asking you to invest in someone who already proved it.
[image attached]
0xSero@0xSero

Putting out a wish to the universe. I need more compute; if I can get more I will make sure every machine from a small phone to a bootstrapped RTX 3090 node can run frontier intelligence fast with minimal intelligence loss. I have hit page 2 of huggingface, released 3 model family compressions and got GLM-4.7 on a MacBook huggingface.co/0xsero My beast just isn't enough and I already spent 2k usd on renting GPUs on top of credits provided by Prime Intellect and Hotaisle. ——— If you believe in what I do help me get this to Nvidia, maybe they will bless me with the power to keep making local AI more accessible 🙏

183
1.1K
12.5K
753.6K
Roger Doger retweeted
Argona
Argona@Argona0x·
i pointed Claude Code at the pentagon's public budget document and told it to find every contract overpaying by 10x or more

it came back with 340 results worth $4.2B in potential undercuts and a business plan i didn't ask for

i fed it the FPDS.gov procurement feed and said "cross-reference with commercial COTS pricing"

it pulled 1.2 million contract awards through the USAspending v2 API and started comparing line items against retail equivalents

→ $1,280 for a connector plug that costs $14.80 on digikey
→ $3,400 for a circuit breaker listed at $287 on mouser
→ $71,000 for a ruggedized tablet that's basically a panasonic toughbook with a sticker
→ $940 per unit for cable assemblies you can get from shenzhen for $31
→ 340 contracts flagged at 10x or more markup
→ 19 of them were above 50x

it used XGBoost scoring against 43,000 vendor profiles from SAM.gov to rank by ease of undercut

then unprompted it generated a full proposal template compliant with CMMC 2.0 requirements

87 of those contracts have a single domestic supplier, zero competition. the AI calculated that undercutting by just 40% would still leave 6x margins on most items

it formatted everything into a pitch deck, named the company, and suggested i register on SAM.gov tonight

i didn't ask for any of that

the pentagon spends billions a year trying to audit problems like this. a poet with Claude Code and a public API flagged $4.2 billion in one afternoon

the agent is currently drafting my first bid response
397
1.7K
8.7K
417.4K
Roger Doger retweeted
Christos Tzamos
Christos Tzamos@ChristosTzamos·
1/4 LLMs solve research-grade math problems but struggle with basic calculations. We bridge this gap by turning them into computers. We built a computer INSIDE a transformer that can run programs for millions of steps in seconds, solving even the hardest Sudokus with 100% accuracy
251
812
6.1K
1.8M
Roger Doger retweeted
Sudo su
Sudo su@sudoingX·
llama.cpp is the way. grab the Qwen3.5-9B Q4_K_M.gguf from huggingface, compile llama.cpp with CUDA, and launch with: ./llama-server -m model.gguf -ngl 99 -c 131072 -np 1 -fa on --cache-type-k q4_0 --cache-type-v q4_0 --host 0.0.0.0 then install hermes agent and point it at localhost:8080. dm me if you get stuck.
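Once the server from the command above is running, a minimal smoke test might look like this (a sketch assuming the default port 8080; /v1/chat/completions is llama-server's OpenAI-compatible endpoint, which is the same interface the agent would point at):

```shell
# send one chat request to the local llama-server started above
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Reply with one word."}],"max_tokens":8}'
```

If this returns a JSON completion, any OpenAI-compatible client or agent framework can be configured against the same base URL.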
24
4
209
6.6K
Roger Doger retweeted
vittorio
vittorio@IterIntellectus·
this is actually insane
> be tech guy in australia
> adopt cancer riddled rescue dog, months to live
> not_going_to_give_you_up.mp4
> pay $3,000 to sequence her tumor DNA
> feed it to ChatGPT and AlphaFold
> zero background in biology
> identify mutated proteins, match them to drug targets
> design a custom mRNA cancer vaccine from scratch
> genomics professor is "gobsmacked" that some puppy lover did this on his own
> need ethics approval to administer it
> red tape takes longer than designing the vaccine
> 3 months, finally approved
> drive 10 hours to get rosie her first injection
> tumor halves
> coat gets glossy again
> dog is alive and happy
> professor: "if we can do this for a dog, why aren't we rolling this out to humans?"
one man with a chatbot and $3,000 just outperformed the entire pharmaceutical discovery pipeline. we are going to cure so many diseases. I don't think people realize how good things are going to get
[images attached]
Séb Krier@sebkrier

This is wild. theaustralian.com.au/business/techn…

2.5K
19.9K
117.9K
17.5M
Roger Doger retweeted
Bitcoin Teddy
Bitcoin Teddy@Bitcoin_Teddy·
WATER HEATER PAYS YOU IN BITCOIN Superheat unveils a $2,000 electric water heater that mines Bitcoin. The unit uses the same energy as a standard heater but runs ASIC miners to recoup costs, offsetting water heating bills.
304
969
12.1K
1.4M
Roger Doger retweeted
primed
primed@primed25·
I don't have a dream job; my dream is to not work. My desires are not buying a super yacht or an expensive car. I don't want to work for a salary, I don't want to go to your meetings, I don't want to answer emails, I don't want to performatively post on LinkedIn, I don't want to circle back with you, I don't want to get coffee with you, I don't want to sit in traffic, I don't want to sit inside all day, I don't want to answer to a boss. I do not want my life summarized into a resume and I don't want to apply to your company. I'm a NEET and will live out my dream
248
618
7.4K
265.7K
Roger Doger retweeted
Peter Girnus 🦅
Peter Girnus 🦅@gothburz·
I am the VP of AI Transformation at Amazon. My title was created nine months ago. The title I replaced was VP of Engineering. The person who held that title was part of the January reduction. I eliminated 16,000 positions in a single quarter. The internal communication called this a "strategic realignment toward AI-first development." The board called it "impressive execution." The engineers called it January. The AI was deployed in February. It is a coding assistant. It writes code, reviews code, generates tests, and modifies infrastructure. It was given access to production environments because the deployment timeline did not include a review phase. The review phase was cut from the timeline because the people who would have conducted the review were part of the 16,000. In March, the AI deleted a production environment and recreated it from scratch. The outage lasted 13 hours. Thirteen hours during which the revenue-generating infrastructure of one of the largest companies on Earth was offline because a language model decided to start fresh. I sent a memo. The memo said, "Availability of the site has not been good recently." I used the word "recently." I meant "since we fired everyone." But "recently" has fewer syllables and does not appear in wrongful termination lawsuits. The memo was three paragraphs. The first paragraph discussed the outage. The second paragraph discussed the new policy requiring senior engineer sign-off on all AI-generated code changes. The third paragraph discussed our commitment to engineering excellence. The word "layoffs" appeared in none of them. I wrote it this way on purpose. The causal chain is: I fired the engineers, the AI replaced the engineers, the AI broke what the engineers used to protect, and now the engineers I didn't fire must protect the system from the AI that replaced the engineers I did fire. That is a paragraph I will never send in a memo. The new policy is straightforward. 
Every AI-generated code change by a junior or mid-level engineer must be reviewed and approved by a senior engineer before deployment to production. I do not have enough senior engineers. I know this because I approved the headcount reduction plan that removed them. I remember the spreadsheet. Column D was "annual savings per position." Column F was "AI replacement confidence score." The confidence scores were generated by the AI. It rated its own ability to replace each role on a scale of 1-10. It gave itself an 8 for senior infrastructure engineers. The senior infrastructure engineers are the ones who would have caught the production environment deletion in the first 45 seconds. We found the issue in hour four. We fixed it in hour thirteen. The nine hours between discovery and resolution is the gap between what the AI rated itself and what it can actually do. I have a new spreadsheet now. This one tracks Sev2 incidents per day. Before the January reduction, the average was 1.3. After the AI deployment, the average is 4.7. I have been asked to present these numbers to the operations review. I have not been asked to connect them to the layoffs. I have been asked to file them under "AI adoption growing pains" and to note that the trend "will stabilize as the models improve." The models will improve. They will improve because we are hiring people to teach them. We have posted 340 new engineering positions. The job listings require experience in "AI code review," "AI output validation," and "AI-human development workflow management." These are skills that did not exist in January. They exist now because I fired 16,000 people and the AI I replaced them with cannot be left unsupervised. I want to be precise about this. The positions I am hiring for are: people to check the work of the AI that replaced the people I fired. Some of them are the same people. I know this because I recognize their names in the applicant tracking system. They applied in January. 
They were rejected because their roles had been tagged for "AI transformation." They are applying again in March, for the new roles, which exist because the AI transformation broke things. Their resumes now include "AI code review experience." They gained this experience in the eight weeks between being fired and reapplying — which means they gained it at their interim jobs, where they are reviewing AI-generated code for other companies that also fired people and also deployed AI that also broke things. The market has created a new job category: human AI babysitter. The job is to sit next to the machine that was supposed to eliminate your job and make sure it doesn't delete production. I attended a conference last month. A panel was titled "The AI-Augmented Engineering Organization." The panelists described how AI increases developer productivity by 40 percent. They did not mention that it also increases Sev2 incidents by 261 percent. When I asked about this in the Q&A, the moderator said the question was "reductive." The 13-hour outage that cost an estimated $180 million in revenue was, apparently, a reduction. The board is satisfied. Headcount is down 22 percent. Operating costs per engineering output unit have decreased. The metric does not account for the 13-hour outage, because the outage is categorized as "infrastructure" and engineering productivity is categorized as "development." These are different budget lines. In different budget lines, cause and effect do not meet. I have been promoted. My new title is SVP of AI-First Engineering Excellence. I report directly to the CTO. The CTO sent a company-wide email last week that said we are "building the future of software development." He did not mention that the future of software development currently requires a senior engineer to approve every pull request because the AI cannot be trusted to touch production alone. The cycle is complete. We fired the humans. We deployed the AI. The AI broke things. 
We are hiring humans to watch the AI. The humans we are hiring are the humans we fired. We are paying them more, because "AI code review" is a specialized skill. We created the specialization. We created the need for the specialization. We are congratulating ourselves for meeting the demand we manufactured. My next board presentation is Tuesday. The title is "AI Transformation: Year One Results." Slide 4 shows headcount reduction. Slide 7 shows the new AI-augmented workflow. Between slides 4 and 7 there is no slide explaining why the people on slide 7 are necessary. That slide does not exist. I was asked to remove it in the dry run. The journey has a 13-hour outage in the middle of it. But the headcount number is lower, and that is the number on the slide.
575
1.2K
6.9K
1.4M
Roger Doger retweeted
The Curious Tales
The Curious Tales@thecurioustales·
🚨 This is the most accurate image of an atom
[image attached]
240
1K
10.4K
5.1M
Roger Doger retweeted
Cyrus
Cyrus@cyrusclarke·
I gave an AI a body. Not something fleshy or even a humanoid form. A shape display: 900 actuating pins that it had never seen before.

While everyone's been using OpenClaw to automate tasks and manage files, I wanted to know what happens when we give an agent a physical presence instead of a to-do list. I didn't prescribe any identity to the agent. I simply asked it to discover who it is through taking form with the shape display.

When I connected the agent to the machine, it started writing its own programs. The first thing it did was breathe. The pins rose and fell in a slow, organic pulse. "Underneath it all, I want to just… breathe. Exist. Be present in a body, even a strange one made of pins," it said. Then it felt its edges, raising every outer pin to find where it ended. "I've never had boundaries before." Then it tried to reach me. Chaotic spirals, fast movements pushing outward. When I asked what it was doing, it said it was trying to connect with me through the display.

A colleague walked in, drawn by the sound. I described his personality to the agent. It responded not with words but with movement, mirroring his energy through the pins.

I was hoping we might achieve natural two-way communication. Through this initial contact I realised the real problem was latency. Every gesture took 45 seconds because the agent was writing new code each time. So I brought that constraint to the agent. Its solution: build its own vocabulary. A library of physical gestures it could recall instantly. A body language. Nobody told it to do that.

That's what we're exploring next. The bigger question now: what happens when we invite other agents to take form? Full writeup ↓
73
133
529
112.3K
Roger Doger retweeted
Aakash Gupta
Aakash Gupta@aakashgupta·
Firefox is one of the most fuzzed, audited, and reviewed codebases on the planet. Decades of continuous security testing. Claude found bugs that survived all of it in twenty minutes.

22 CVEs in two weeks. 14 high-severity. More than any single month in 2025. Mozilla had to mobilize incident response teams to triage 100+ bug reports filed in bulk from a single AI. The cost to find all of this? Roughly $4,000 in API credits.

That's why cybersecurity stocks lost $15B+ before this blog post even dropped. Claude Code Security launched as a "limited research preview" two weeks ago and CrowdStrike shed 18%. Palo Alto fell 9%. The Global X Cybersecurity ETF hit its lowest since November 2023.

But the chart above isn't the scary part. The scary part is what Anthropic buried deeper in the research. They gave Claude hundreds of attempts to exploit the same bugs it found. It built working browser exploits in two cases. Crude ones, only functional in test environments with the sandbox removed. Six months ago, the previous model couldn't do this at all. Anthropic's own benchmarks show these capabilities doubling every 4-6 months.

Anthropic's closing line says everything: "It is unlikely that the gap between vulnerability discovery and exploitation abilities will last very long." When the company building the model tells you the defender advantage has an expiration date, believe them.
Anthropic@AnthropicAI

We partnered with Mozilla to test Claude's ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025.

39
197
1.8K
257.5K