Pinned Tweet
Michael
3.5K posts

Michael
@mslshao
Unapologetically 🇨🇦| Staff SWE | Asian | aspiring traveler/📸/writer/hiker/EDMartist🎶 | @uWaterloo CS' 14 alum | #ModelY owner | @PlayCrucible world champ
Seattle, WA. Joined March 2016
142 Following, 76 Followers
Michael retweeted

> be me, applied scientist at amazon
> spend 6 months building ML model that actually works
> ready to ship
> manager asks "but does it Dive Deep?"
> show him 37 pages of technical documentation
> "that's great anon, but what about Customer Obsession?"
> model literally convinces customers to buy more stuff they don't need
> "okay but are you thinking Big Enough?"
> mfw I am literally increasing sales
> okay lets ship it
> PM says there's not enough Disagree and Commit
> we need to disagree about something
> team spends 2 hours debating whether the config file should be YAML or JSON
> engineering insists on XML "for backwards compatibility"
> what backwards compatibility, this is a new service
> doesn't matter, we disagree and commit to XML
> finally get approval to deploy
> "make sure you're frugal with the compute costs"
> model runs on a potato, costs $2/month
> finance still wants a cost breakdown
> write 6-pager about why we need $2/month
> include bar raiser in the review
> bar raiser asks "but can we do it for $1.50? we need to be Frugal"
> spend another month optimizing to hit $1.50
> ready to deploy again
> VP decides we need to "Invent and Simplify"
> requests we rebuild the entire thing using a new framework
> framework doesn't exist yet
> "show some Ownership and build it yourself"
> 3 months later, framework is half done
> org restructure happens
> new manager says this doesn't align with team goals anymore
> project cancelled
> model never ships
> manager gets promoted to L8 for "successfully reallocating resources"
> team celebrates with 6-pager retrospective about what we learned
> mfw we delivered on all 16 leadership principles
> mfw we delivered nothing else
> amazon.jpg
Michael retweeted

@jiayuan_jy ? what do you do while your LLM agent is writing all your code
Michael retweeted

ICE Confirms Agents Do Not Have Faces Beneath Masks theonion.com/ice-confirms-a…
Michael retweeted

Trump: ‘Another Thing Epstein And I Never Did Is Play Nude Charades’ theonion.com/trump-another-…
Michael retweeted

Victor Wembanyama Reports To Training Camp Having Added 25 Pounds Of Hair theonion.com/victor-wembany…

How is this ok, @AskAmex? I have been a customer for 10 years, I have a clearly fraudulent charge from Mississippi, and you force me to pay $14k for something I didn't even buy? How is this ok? I'm not even in the same city, state, or country where this happened (never been!)



@karpathy I've been making my "default" (assumed?) prompts be more pointed, so that unless I ask for a generic research or comprehensive task to take place, it'll stay focused on the one area I asked about.
It somewhat helps that I never jumped ahead beyond Sonnet 3.7.
It's nice here.

I'm noticing that due to (I think?) a lot of benchmarkmaxxing on long horizon tasks, LLMs are becoming a little too agentic by default, a little beyond my average use case.
For example, in coding, the models now tend to reason for a fairly long time, they have an inclination to start listing and grepping files all across the entire repo, they do repeated web searches, they over-analyze and over-think little rare edge cases even in code that is knowingly incomplete and under active development, and often come back ~minutes later even for simple queries.
This might make sense for long-running tasks but it's less of a good fit for more "in the loop" iterated development that I still do a lot of, or if I'm just looking for a quick spot check before running a script, just in case I got some indexing wrong or made some dumb error. So I find myself quite often stopping the LLMs with variations of "Stop, you're way overthinking this. Look at only this single file. Do not use any tools. Do not over-engineer", etc.
Basically as the default starts to slowly creep into the "ultrathink" super agentic mode, I feel a need for the reverse, and more generally good ways to indicate or communicate intent / stakes, from "just have a quick look" all the way to "go off for 30 minutes, come back when absolutely certain".
Michael retweeted

Trump Urges Supporters To Move On From Societal Disdain For Pedophilia theonion.com/trump-urges-su…
Michael retweeted

🤔 Wondering what MCP is? AWS Labs released MCP servers for interacting with your databases! Learn more 👉 go.aws/44OJk5d

@SDOTbridges Can you stop letting boats through? There's a ton of cars waiting to cross, and this backup is creating aftereffects.

#WBOps Symphony Express is currently on standby. Stay tuned for updates.

#WBOps Peak Express is currently on standby. Stay tuned for updates.
Michael retweeted

Curious about what happened with DynamoDB in 2024?
From performance improvements to new developer tools, we've been busy making your database experience better.
Read our year in review to explore all the innovations. 👉 go.aws/4jJYSh1