Sean 🔨

1.7K posts

@darcys22

Engineer @ https://t.co/xHng0HjvU5

Melbourne · Joined May 2009

1.3K Following · 1.2K Followers

Sean 🔨@darcys22·2d

So playing with Hermes Agent and I'm having a bunch of trouble asking it to set up cron jobs. Like it'll create the script in the wrong directory and the cron won't point at it. What's the default directory we should be putting cron scripts into?

Sean 🔨@darcys22·2d

@SolSt1ne Rule 2: slow down and pause is great for clients. But this 23-minute video could have been 10 if he didn't talk so slowly

st1ne@SolSt1ne·3d

Goldman Sachs MDs make $1-3M/year doing one thing: keeping CEOs of Fortune 500 firms on speed dial. This 23-min UVA Law lecture by Goldman's Vice Chairman of Global Client Coverage teaches you the exact 18 rules he uses to do it. worth more than any $5K business school elective on client management. bookmark & watch today.

st1ne@SolSt1ne

BlackRock controls over $10 trillion in assets - more than the GDP of China. Larry Fink - the man behind it - just explained on stage exactly how he got there: the acquisitions, the macro bets, why the U.S. deficit "will overwhelm this country." 35-min and you'll see how the world's most powerful asset manager thinks about capital, risk, and the next decade. bookmark & watch - the most important Fink interview of 2025.

Sean 🔨@darcys22·19 May

America hits token limits on all their agents, meanwhile Australia has a tiny population getting the off-peak GPU buffet. We are no longer "remote". We are computationally advantaged nomads

Sean 🔨@darcys22·19 May

@shannholmberg How are you getting the agents to communicate with each other?

Shann³@shannholmberg·18 May

4 levels of Hermes Agent setup:

LEVEL 1: main agent
You → Hermes Agent
this is your main agent and your prototype area, where you test new workflows and refine them. it doubles as your orchestrator until you have something worth breaking out

----

LEVEL 2: specialized agents
You → SEO Agent
You → CMO Agent
You → Ops Agent
once a workflow is solid, break it out into its own agent with its own credentials, memory and scope.

---

LEVEL 3: orchestrated team
You → Orchestrator
↓
Specialist Agents
bring the orchestrator back in. it now steers the company of agents you have built.

----

LEVEL 4: automated team
Cron / Events → Orchestrator
↓
Agent Team
add task lists so the team works async. cron and events fire jobs, the orchestrator routes them through the task bus, the team handles the work without you

----

take small steps, you DO NOT want to automate slop. if your output at level 1 is mediocre, you are about to scale mediocrity. 20 agents shipping low quality work at speed is worse than 3 shipping great work slowly. I would rather run fewer agents with better output than MAXXING the agent count and spitting out more of the same.
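
The level 4 flow described in the post (cron or events fire jobs, the orchestrator routes them through a task bus, the team handles them) can be sketched roughly as below. This is only an illustration and not Hermes Agent's actual API: `Task`, `Orchestrator`, the `kind` routing key, and the lambda "agents" are all hypothetical stand-ins.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Task:
    kind: str      # hypothetical routing key, e.g. "seo" or "ops"
    payload: str   # free-form description of the job


class Orchestrator:
    """Routes queued tasks to specialist agents (plain callables here)."""

    def __init__(self) -> None:
        self.agents: Dict[str, Callable[[str], str]] = {}
        self.bus: Deque[Task] = deque()  # in-memory stand-in for the "task bus"

    def register(self, kind: str, agent: Callable[[str], str]) -> None:
        self.agents[kind] = agent

    def submit(self, task: Task) -> None:
        # At level 4 a cron job or event handler would call this, not a human.
        self.bus.append(task)

    def drain(self) -> List[str]:
        # Process tasks in arrival order; report anything no agent claims.
        results = []
        while self.bus:
            task = self.bus.popleft()
            agent = self.agents.get(task.kind)
            if agent is None:
                results.append(f"unrouted: {task.kind}")
            else:
                results.append(agent(task.payload))
        return results


orch = Orchestrator()
orch.register("seo", lambda p: f"seo agent handled {p}")
orch.register("ops", lambda p: f"ops agent handled {p}")
orch.submit(Task("seo", "audit landing page"))
orch.submit(Task("ops", "rotate logs"))
orch.submit(Task("legal", "review contract"))
results = orch.drain()
print(results)
```

Reporting unrouted tasks instead of dropping them is one way to notice when the team is being asked for work no agent was scoped for.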

Shann³@shannholmberg

x.com/i/article/2055…

Sean 🔨@darcys22·1 May

@_avichawla The smaller fine-tuned model isn't able to have the same understanding as the stronger teacher. But it can with the weaker teacher

Avi Chawla@_avichawla·30 Apr

A tricky LLM interview question: You're fine-tuning a model for Python code generation. The data was generated using the strongest LLMs like Opus/GPT. But the fine-tuned model performs better when you use a weaker teacher instead. How could this happen? (answer below)

Sean 🔨@darcys22·1 May

RAG systems fell away because agents were able to navigate a bunch of files and figure out the important information themselves. This is similar to how Karpathy was talking about self driving cars. Moving from C++ where the rules were enforced, to letting the AI make decisions

Sean 🔨@darcys22·14 Apr

Models trained in clean environments learn brittle strategies, over-rely on structure and don't develop robustness. So they reward architectures that are fragile in the real world

Sean 🔨@darcys22·13 Apr

@SMB_Attorney Put their new document into ChatGPT. Ask it to review and find any issues. Rinse and repeat

SMB Attorney@SMB_Attorney·12 Apr

> Your boss asks you to lead a project
> You need a legal agreement
> You open your LLM of choice
> “Draft me an agreement.”
> “Make it airtight.”
> “Add a custom provision for this weird edge case.”
> LLM delivers a 25-page, hyper-protective agreement
> You don’t fully understand all of it, but it sounds right
> You send it
> Counterparty opens their LLM
> “Find every issue.”
> “Rewrite this in our favor.”
> “Also add protections so we don’t get burned.”
> You get back a 35-page redline
> Half the comments contradict yours
> Some provisions now interact in ways you don’t understand
> Your boss asks: “Are we covered here?”
> You pause

Because now the real question isn’t: “Can an LLM draft an agreement?”

It’s: “Do I actually understand the risk I’m signing up for?”

And: “If this goes sideways… who owns that decision?”

What’s your next move? Do you get it now??

Aaron Levie@levie

We will likely have more lawyers in the future than today, because: 1) AI will cause so many more people to ask legal questions which will encourage them to need to verify or execute through an actual lawyer. 2) AI will cause an explosion of more and more exotic legal terms that lawyers will be spending even more time reviewing redlines or new cases around. 3) All the new areas of law that now are emerging around the use of AI itself in every single industry. AI introduces an explosion of IP, privacy, and regulatory compliance challenges across all verticals. This has historical precedent as well. Between the creation of the PC and the internet (both technologies that made the legal profession far more efficient), the ABA pegs active attorneys having gone from roughly 400,000 in 1975 to roughly 1,375,000 in 2025. When we make professions more efficient and automated, often demand for them goes up not down.

Sean 🔨@darcys22·13 Apr

Reviewing an AI paper and it's like "We compare against: 1) weak baselines, 2) small synthetic models, and 3) synthetic tasks. We added some inductive bias and you can see our HUGE GAINS." They are just patching weaknesses in underpowered setups instead of improving strong models

Sean 🔨@darcys22·3 Apr

@yoheinakajima are you gunna pay the tax on the billion dollars in revenue in each company?

Yohei@yoheinakajima·2 Apr

if I start two companies and they sell an apple back and forth for a billion dollars, do i run two billion dollar companies?

Sean 🔨 retweeted

i14.ai@i14labs·28 Mar

i14 Journal Club: Foundation Models Where Math Meets Cognitive Science i14 is starting a weekly online discussion group for AI researchers and engineers exploring the intersection of generative AI, mathematics, and cognitive science. We analyze how architectural design impacts learning, memory, and reasoning in foundation models. Join us to dissect training dynamics and explore how cognitive principles can inform the next generation of architectures, with our first session hosted via Google Meet on Monday, March 30 · 12:00 PM AEDT (Melbourne time), which is Sunday, March 29 · 6:00 PM PDT (San Francisco time) Apply to join HERE: i14.ai/journal-club/

Sean 🔨@darcys22·22 Mar

Maybe I’m reading too many posts on Reddit. But DLSS 5 sounds awesome. The game engine can do the “rough” sketch of what should be on the screen quickly, then let AI polish that into a super realistic frame. It’s like the perfect pipeline for parallel processing

Sean 🔨@darcys22·16 Dec

@levelsio Is this just because the scans are actually too low resolution to be useful for that task? Like the false positive rate is so high it’s only useful if you know something is already wrong. Would this be the same issue with a better scanner?

@levelsio@levelsio·16 Dec

I don't know if things have changed

But every time I told my dad (a retired cardiologist) about doing preventive MRIs and body scans etc

He said "yes Piet with any scan, you will find lots of stuff that looks malign (bad) but is mostly benign (good), but you're unsure so you start cutting in the body, doing invasive stuff, doing treatment and that's how you actually make someone sick who was fine before"

An example would be prostate cancer which apparently most men have latent cancer cells of anyway after age 50 and it's slowly growing but doesn't mean it will kill them

I'd love to hear counters to this though as I want to be a believer in preventative medicine, and do blood work and checkups regularly too

Danish Hussain@astrodanish

This has been my major issue transitioning from aerospace to biomedical engineering. In aerospace, EVERYTHING is up for debate. you wana put the wings backwards on a plane? fuck it, Sukhoi su-47. Oh you want intermeshing rotors? Kaman K-max it is. In medicine, people flex their credentials (“doctor here 👋”) and rely on prior art: “usually are not” “standard practice” “typically not” EVERYTHING should be grounded in first principles and rigorous testing. Medicine is not like that, because of people like Dr. Kelly Morrison who look at a miraculous full body scanning technology that can see through you at unprecedented resolution- LITERALLY SCI-FI TECHNOLOGY- and can’t imagine using it for preventative means- simply because people haven’t done that before. You could give a magic X-ray gun to some third world, medieval shaman or witch-doctor and the first thing they would say is “yo we should scan everyone and make sure nothing looks weird inside”. How is this not the obvious response? I can’t see a future in which everyone isn’t getting MRI’d and having their images analyzed by AI. The future of medicine IS PREVENTATIVE. i don’t give a fuck what any doctor or pharma company says about it. Their incentive structures have been broken for the last hundred years. An ounce of prevention > a pound of cure. Please, for the love of God, think a LITTLE outside the box for once!

Sean 🔨@darcys22·6 Oct

@redtachyon @tenobrus Similar story here lol

Ariel@redtachyon·5 Oct

@tenobrus I interviewed and got rejected after like 8 rounds of interviews and a reference check, so I imagine they're still busy hiring at this pace lol

Tenobrus@tenobrus·5 Oct

so..... what the fuck is going on here? did they literally just take the money and dip?

Magic@magicailabs

LTM-2-Mini is our first model with a 100 million token context window. That’s 10 million lines of code, or 750 novels. Full blog: magic.dev/blog/100m-toke… Evals, efficiency, and more ↓

Sean 🔨@darcys22·12 Aug

@dejavucoder Also apparently you can add "includeCoAuthoredBy": false, into .claude/settings.json
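
A minimal sketch of applying that setting from a script, assuming the key name and the `.claude/settings.json` path are as given in the post above. The helper `disable_coauthor` is hypothetical, and the demo writes to a throwaway directory rather than a real project.

```python
import json
import tempfile
from pathlib import Path


def disable_coauthor(settings_path: Path) -> dict:
    """Set "includeCoAuthoredBy": false, keeping any other keys in the file."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text())
    settings["includeCoAuthoredBy"] = False
    # Create the .claude directory if it is missing, then write the file back.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Demo against a temporary directory instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".claude" / "settings.json"
    updated = disable_coauthor(path)
    written = json.loads(path.read_text())
    print(written)
```

Reading the existing file first means other settings survive the edit instead of being overwritten.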

Sean 🔨@darcys22·11 Aug

@dejavucoder Learn to accept your job isn't to write the code anymore. There is no shame in keeping your coauthor there.

sankalp@dejavucoder·11 Aug

making claude do all the work and then removing it from co-author before committing changes be like

Sean 🔨@darcys22·11 Aug

@GrantSlatton Finance people also get annoyed at Excel when you have to rounddown() everything because 0 != 0

Grant Slatton@GrantSlatton·11 Aug

nerd programmers: nooooooooo you can't store currency as floats, floating point error will cause you to be off by 1 billionth of a penny!!! noooooooooo everyone in finance: uses Excel which stores all numbers, integers, dates, datetimes, etc as floats
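
The float complaint and the "round everything" workaround from this exchange can be seen in a few lines of Python. This is a small sketch using standard binary floats (Python's `float`, not Excel itself); the amounts are made up for illustration.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2 or 0.3 exactly.
float_sum = 0.1 + 0.2
float_equal = float_sum == 0.3

# The "0 != 0" case: a balance that should net to zero leaves a tiny residue.
residual = 0.3 - 0.2 - 0.1
residual_is_zero = residual == 0

# Workaround 1: round before comparing (the spreadsheet habit).
rounded_equal = round(float_sum, 2) == round(0.3, 2)

# Workaround 2: exact decimal arithmetic built from strings.
decimal_equal = Decimal("0.10") + Decimal("0.20") == Decimal("0.30")

# Workaround 3: store whole cents as integers.
cents_equal = 10 + 20 == 30

print(float_equal, residual_is_zero, rounded_equal, decimal_equal, cents_equal)
```

The residue is far smaller than a cent, which is why it rarely matters for totals, but it does break exact equality checks such as "does this ledger balance to zero".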

Sean 🔨@darcys22·9 Jul

@mitchellh Yeah nice!

Mitchell Hashimoto@mitchellh·9 Jul

Pretty big feature landing for Ghostty: automatic SSH terminfo setup. This is opt-in (because modifying ssh by default is sus) but makes `ssh` work without all the fiddly manual steps until our terminfo is installed by default on more machines. tip.ghostty.org/docs/config/re… Thanks to Jason Rayne who contributed this. There are still edge cases we're ironing out and we plan to make this even smoother over time, but this feature is in excellent initial shape.

Sean 🔨@darcys22·29 Jun

@nielsandriesse x.com/sama/status/18… Posted in Feb. Appears they need to learn how to ship

Sam Altman@sama

OPENAI ROADMAP UPDATE FOR GPT-4.5 and GPT-5: We want to do a better job of sharing our intended roadmap, and a much better job simplifying our product offerings. We want AI to “just work” for you; we realize how complicated our model and product offerings have gotten. We hate the model picker as much as you do and want to return to magic unified intelligence. We will next ship GPT-4.5, the model we called Orion internally, as our last non-chain-of-thought model. After that, a top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks. In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model. The free tier of ChatGPT will get unlimited chat access to GPT-5 at the standard intelligence setting (!!), subject to abuse thresholds. Plus subscribers will be able to run GPT-5 at a higher level of intelligence, and Pro subscribers will be able to run GPT-5 at an even higher level of intelligence. These models will incorporate voice, canvas, search, deep research, and more.

Niels@nielsandriesse·29 Jun

@darcys22 It’s surprising to me that they didn’t ship that yet then. It seems like a simple change?

Niels@nielsandriesse·29 Jun

Why is there a model selector in ChatGPT at all? Just use a small model to determine the type of question (complex, code-related, personal, etc.), route to the best model for the job, and let the user know which model is being used (and maybe let them adjust if needed)
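
The routing idea in this post (classify the question, pick a model, tell the user, allow an override) might look something like the sketch below. Everything here is an assumption for illustration: the keyword-based `classify` stands in for the "small model", and the model names in `ROUTES` are placeholders, not real product names.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping from question type to backing model.
ROUTES: Dict[str, str] = {
    "code": "code-model",
    "complex": "reasoning-model",
    "personal": "chat-model",
}


def classify(question: str) -> str:
    """Keyword stand-in for the small classifier model described in the post."""
    text = question.lower()
    if any(word in text for word in ("python", "bug", "function", "compile")):
        return "code"
    if any(word in text for word in ("prove", "step by step", "analyze")):
        return "complex"
    return "personal"


def route(question: str, override: Optional[str] = None) -> Tuple[str, str]:
    """Return (model, notice); an explicit user override wins over the classifier."""
    if override is not None:
        return override, f"Using {override} (your choice)"
    model = ROUTES[classify(question)]
    return model, f"Using {model} (auto-selected)"


print(route("Why does this Python function crash?"))
print(route("How was your day?"))
print(route("How was your day?", override="reasoning-model"))
```

Returning the notice alongside the model covers the "let the user know which model is being used" part, and the `override` argument covers "let them adjust if needed".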
