Shawn Thuris

2.8K posts


@Thuris

IT and web consultant @thurisandco. Podcast: https://t.co/iy8vGv0zTn. Data analytics MBA. Sometime recitalist and opera tenor. ~hodrun-solmud on the urbs

East Bay · Joined July 2008
185 Following · 288 Followers
rahul
rahul@0interestrates·
it's easy to approve zuck's diff, but do you have the courage to request changes on zuck's diff?
46
62
3K
546.5K
Shawn Thuris
Shawn Thuris@Thuris·
Felt that earthquake very clearly in Hayward, slow shaking for about 5 seconds
1
0
1
805
Shawn Thuris
Shawn Thuris@Thuris·
Today's the 20th anniversary of OMG PONIES. Twen-tieth...
0
0
0
26
Shawn Thuris
Shawn Thuris@Thuris·
@pierceboggan @JoeMayo I do rounds of clarifying questions then set GPT 5.4 xhigh or Opus 4.6 high + fast loose on it with Autopilot, then go back and clean up as needed.
0
0
3
98
Joe Mayo
Joe Mayo@JoeMayo·
OH: Copilot is a copilot, not an autopilot
2
1
9
1.6K
sucks
sucks@powerbottomdad1·
been on reta 4 days: they are going to sell 10 trillion dollars of this thing
101
32
2.7K
1.3M
Thetic
Thetic@TheticThrone·
@wokal_distance That’s not a girl. It's a dude face-swapping. Still a cool person visiting places on a motorcycle
3
0
141
91.5K
Shawn Thuris
Shawn Thuris@Thuris·
@BHolmesDev This reflects my experience pretty much exactly. I hate talking to GPT 5.4, but I love watching it grind through something until it actually works. And I like talking to Opus 4.6, and I dislike having to follow it around and make sure it did everything and did it right.
0
0
0
160
Ben Holmes
Ben Holmes@BHolmesDev·
I’ve used Opus 4.6 and GPT 5.4 on a mix of projects since release, and want to break down where I think they uniquely excel. It’s more nuanced than you’d think!

Rigor of code - GPT 5.4. It goes the distance validating its work without asking. Opus needs explicit instruction to do this, and even then it misses more edge cases.

Clarity of code - Opus 4.6. Claude is a better communicator, which carries into the code. Variable names are clearer and less mechanical, which improves reviewability. This is very important, since code review is the bottleneck for most engineering teams. It also adds the right amount of doc comments. GPT simply never comments or explains its work; it’s like working with an obtuse engineer who wants the solution to speak for itself. Sometimes it does, other times not.

Similarly, rigor of plans goes to GPT 5.4, while clarity of plans goes to Opus 4.6. An interesting point, though: GPT performs better talking through a strategy without a plan, while Opus needs planning mode to put in any rigor. I find myself forgetting plan mode altogether using GPT 5.4.

Quality of research - toss-up. Opus spends longer researching with web search, but GPT spends longer studying the existing codebase. You may think codebase research matters more, but researching how others solve the same problem can be just as important. Maybe more important for greenfield.

Quality of conversation - Opus 4.6. It’s just better to talk to, which matters when using these things every day. GPT 5.4 was clearly trained to challenge the user more, which results in a tendency to *always* say you are wrong. I’ve had bizarre interactions where GPT claims something is “not quite right,” then restates exactly what we decided on in the last turn. On a personal level, it’s annoying. On a practical level, it makes iteration on a plan slower. THAT SAID, it takes sufficient pushing for Opus to challenge your thinking in this way. Simply say “I’m impartial” and ask questions to avoid that, as you would with a person.

Overall winner - Opus to make it work, GPT to make it good. I don’t have a good system for when to switch tools, but on average I prefer Opus early on and GPT for optimization and discussing architectural decisions. Opus is also better for any design-related tasks (but state management in frontend apps is better handled by GPT).
140
92
1.5K
201.8K
Shawn Thuris
Shawn Thuris@Thuris·
I work as a grocery store night manager because solo dev/IT work by itself was too unpredictable. On my lunch I get out my laptop. When I get home at 1am I'm up until 3 or 4 doing agentic coding. I earned an MBA a while ago on the foolish assumption it would get me just a toe in the door somewhere. I earn enough to survive, but I could be adding a lot more value to the world than I am.
0
0
0
28
Kiri
Kiri@Kyrannio·
The hiring process of old seems hilariously broken. I have so many incredible and talented friends looking for work, some who are even working corporate jobs currently and seeking to go even more all in on AI. If you're seeking to hire someone or else job searching, maybe comment below, or if we can all brainstorm some ideas for improvement, that would be great. For those outside of our X bubble especially it seems very rough when it comes to the basic application and interview process as a whole.
22
7
52
2.1K
Shawn Thuris
Shawn Thuris@Thuris·
@liuqian16 Same thing happening to me right now in Copilot CLI and in VS Code
1
0
0
28
小安
小安@liuqian16·
The sky is falling!!! Copilot is on strike!!!
1
0
0
35
Shawn Thuris
Shawn Thuris@Thuris·
I hit Copilot rate limiting tonight for the first time. I'd been using Opus 4.6 high in VS Code, probably 8 or 9 turns during an hour, nothing that big. Wouldn't even let me switch to something lighter. I used Gemini 3.1 Pro direct from Google to help me finish up. If I'd been in Copilot CLI trying to troubleshoot a server and this happened though...?
0
0
0
67
Shawn Thuris reposted
DEJAN
DEJAN@dejanseo·
Implemented Google's TurboQuant paper on Gemma 3 4B with a custom Triton kernel for fused quantized attention. It's real.

Results on RTX 4090: 2-bit FUSED is character-for-character identical to the fp16 baseline. On every prompt. At 16x theoretical compression.

The Triton kernel reads uint8 key indices directly and never materializes fp16 keys. Pre-rotate the query once (R is orthogonal, so ⟨q, Rᵀ·centroids[idx]⟩ = ⟨R·q, centroids[idx]⟩); per-position work is then just a table lookup + dot.

Speed (avg tok/s across 3 prompts):
→ fp16 baseline: 17.7
→ 4-bit fused: 16.5 (-7%)
→ 2-bit fused: 17.7 (0%, matches baseline)

VRAM (KV cache delta):
→ fp16: 26 MB
→ 4-bit fused: 4 MB
→ 2-bit fused: 7 MB

The paper's theoretical guarantees hold up completely in practice. Zero accuracy loss, zero speed loss, a fraction of the memory.

Paper: arxiv.org/abs/2504.19874
Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

31
82
1.1K
130.9K
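The rotation identity the thread relies on can be checked numerically. A minimal NumPy sketch, with illustrative sizes and names (nothing here is taken from the actual Triton kernel): because R is orthogonal, scoring the codebook against a pre-rotated query gives the same result as materializing the de-rotated keys first.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_centroids, seq_len = 64, 256, 128   # illustrative sizes

# Random orthogonal rotation R (the Q factor of a QR decomposition).
R, _ = np.linalg.qr(rng.standard_normal((d, d)))

centroids = rng.standard_normal((n_centroids, d))               # quantizer codebook
key_idx = rng.integers(0, n_centroids, size=seq_len).astype(np.uint8)
q = rng.standard_normal(d)

# Naive path: materialize de-rotated keys k_i = Rᵀ·centroids[idx_i], then dot with q.
keys = centroids[key_idx] @ R            # row i is (Rᵀ·c_i)ᵀ = c_iᵀ·R
scores_naive = keys @ q

# Fused path: rotate the query once, then each position is a lookup + dot.
q_rot = R @ q
scores_fused = centroids[key_idx] @ q_rot

assert np.allclose(scores_naive, scores_fused)
```

The fused path never builds the `keys` matrix, which is the point: only uint8 indices and the small codebook live in memory.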
Shawn Thuris reposted
Mitko Vasilev
Mitko Vasilev@iotcoi·
I just implemented Google’s TurboQuant for vLLM. My USB-charger-sized HP ZGX now fits 4,083,072 KV-cache tokens on GB10. This may be the biggest open inference breakthrough of 2026 so far. Training is the flex. Inference is the forever bill.
Mitko Vasilev tweet media
69
237
3K
206.9K
Shawn Thuris reposted
Wes Bos
Wes Bos@wesbos·
if a CEO of a company is posting an absolute statement about ai and the future, they are ramping up to launch a feature that does exactly that next week
26
17
367
25.6K
Shawn Thuris reposted
Google Research
Google Research@GoogleResearch·
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
1K
5.8K
39K
19.2M
Shawn Thuris
Shawn Thuris@Thuris·
@witcheer I've got my Copilot subscription connected and using it for all code-related stuff (any personal stuff I still do through OpenRouter).
0
0
1
142
witcheer ☯︎
witcheer ☯︎@witcheer·
Hermes agent v0.4.0. I run this thing 24/7. Here's what just changed under my feet.

/1/ You can now expose Hermes as an OpenAI-compatible API endpoint: /v1/chat/completions. Your agent becomes a model. Anything that can call an OpenAI API can now talk to your Hermes instance like it's a hosted LLM, except it has tools, memory, skills, and cron jobs behind it. There's also a /api/jobs REST endpoint for managing cron jobs programmatically. I have 15 crons; being able to create and modify them through an API instead of through chat changes my automation surface completely.

/2/ Six new messaging adapters in one release: Signal, DingTalk, SMS via Twilio, Mattermost, Matrix, and a generic webhook adapter. That's on top of the Telegram, Discord, Slack, and WhatsApp adapters that already existed. Ten platforms total now.

/3/ @file and @url context injection with tab completion. Type @ and start typing a filename, tab-complete it, and the file's contents get injected into your message. Same for URLs. Claude Code has this; now Hermes does too.

/4/ Context compression got rebuilt from scratch: structured summaries with iterative updates instead of the "summarise everything and throw it away" approach from before. There's token-budget tail protection so the most recent turns survive compression.

/5/ Four new providers: GitHub Copilot (full OAuth), Alibaba Cloud / DashScope, Kilo Code, and OpenCode Zen/Go. I'm on Z.AI/GLM-5, so this doesn't change my setup directly, but Copilot at 400k context is interesting for anyone with a GitHub subscription looking for a cost-effective agent brain.

/6/ /queue lets you stack prompts while the agent is still working. Instead of waiting for it to finish, you type your next instruction and it gets queued. In my workflow I'll read a cron output and want to follow up on three things; I used to have to wait between each one.

Still feels early. Still finding edges. But the foundation is getting solid fast.
Teknium (e/λ)@Teknium

Hermes Agent v0.4.0 — 300 merged PRs this week. Biggest release we've done. Background self-improvement, OpenAI Responses API endpoint for your agent, new messaging platforms, new providers, MCP server management, and a lot more.

13
9
181
13.9K
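An OpenAI-compatible /v1/chat/completions endpoint means any OpenAI-style client can target a local Hermes instance. A minimal stdlib-only sketch; the base URL, port, and model name are assumptions for illustration, not documented Hermes defaults.

```python
import json
import urllib.request

def build_chat_request(prompt, base_url="http://localhost:8080"):
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    url = f"{base_url}/v1/chat/completions"
    body = json.dumps({
        "model": "hermes",  # illustrative model name
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, body

def chat(prompt, base_url="http://localhost:8080"):
    """Send the request and return the assistant's reply text."""
    url, body = build_chat_request(prompt, base_url)
    req = urllib.request.Request(
        url,
        data=body.encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Pointing an existing OpenAI SDK at the same base URL would work the same way, which is what makes the agent usable as a drop-in "model."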
Shawn Thuris
Shawn Thuris@Thuris·
Sonnet has better things to do than review a stupid bash script (5.4 dutifully came in and waded through it)
0
0
0
19
Shawn Thuris
Shawn Thuris@Thuris·
@Teknium @danielrmay In Telegram this would be annoying... Could the same thing be accomplished by limiting /model to sessions with no history, ie after a /new?
1
0
1
57
Teknium (e/λ)
Teknium (e/λ)@Teknium·
Hermes was built originally around OpenRouter and originally only accepted OpenRouter. Seems like a vestigial bug, but it will be addressed ASAP. /model mid-convo has historically been buggy and may be removed so people use the proper `hermes model` command if we can't get this thing right
3
0
29
1.6K
Daniel May
Daniel May@danielrmay·
i was excited to use hermes until i ran into an unfortunate bug where it silently ships data to openrouter instead of your chosen local model github.com/NousResearch/h… ??? watching very closely to see how quickly this critical issue is resolved
3
0
11
1.6K
Kiri
Kiri@Kyrannio·
Why do I find this so genuinely hilarious
4
0
9
588