Derek Boonstra
770 posts

@DerekBoonstra
I tweet into the abyss regularly.
Joined April 2022
109 Following · 74 Followers

BREAKING: The Minnesota Senate passed a major gun control bill that would ban federally licensed firearms dealers in Minnesota from selling many different kinds of firearms including the AR-15.
The bill would also force Minnesotans to register their AR-15s and other "semiautomatic military-style assault weapons" with the Minnesota Bureau of Criminal Apprehension (BCA).
The vote was 34-33. Democrats have a one-vote majority in the Senate. Sen. Grant Hauschild, a key swing vote, voted for the bill.

Derek Boonstra retweeted

Judging by my tl there is a growing gap in understanding of AI capability.
The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.
But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.
So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.
TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
staysaasy@staysaasy
The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

@PatrickTapLaine It's gotta be close to the 1 year anniversary

Here is Bedard facing Hellebuyck in OT btw
Great roster Doug Armstrong 👍
Chris Johnston@reporterchris
Donald Trump just announced that Connor Hellebuyck will be presented with the Presidential Medal of Freedom

@danallison And there goes my monthly allowance of copilot tokens

claude code: I finished the feature you asked me to build. All tests are passing. Would you like me to commit these changes?
me: Please review your changes to make sure there are no mistakes.
cc: [working] … I found 5 mistakes and fixed them. All tests are passing. Ready to commit.
me: Please review your changes to make sure there are no mistakes.
cc: [working] … I found 3 mistakes and fixed 2. The third was pre-existing and unrelated to my changes. Ready to commit.
me: Fix the “pre-existing” mistake.
cc: [working] … I fixed the pre-existing mistake. Ready to commit.
me: Please review your changes to make sure there are no mistakes.
cc: [working] … No mistakes found. There is one failing test that was pre-existing, unrelated to my changes. Would you like me to commit these changes?
me: Fix the failing test.
cc: [compacting] … [working] … All tests are passing. Ready to commit.
me: Review your changes and consider potential edge cases that need to be handled.
cc: [working] … I found 2 edge cases that were not being handled. Both are now handled properly. Ready to commit.
me: Do those edge cases have tests?
cc: [working] … Both edge cases now have test coverage. Would you like me to commit these changes?
me: Yes.

@JeffKirdeikis Every conversation I have with people not on this app makes them feel like an NPC.

@vivoplt Lucid dreaming is the best for solving coding issues.

last night was unable to solve a bug. tried claude, gemini, deepseek all failed. slept thinking will debug tomorrow.
had a weird dream. In my dream, was debugging the code and finally was able to find the bug, it was just a one line fix.
Finally woke up tired thinking did I really solve the bug or not. opened my codebase, changed that very line and the code worked.
how is this possible? what do I call this type of coding? dream-fix coding?

@cryptopunk7213 I gave ChatGPT a simple SQL request to copy table data. I said I want to move table "x -> y". It gave me the SQL, I ran it, and it moved table y to x ⚰️

i pay $500+ for max subscriptions to claude and chatgpt - haven't touched gpt in a month now. claude has taken over my life (for good reason):
- claude is my mental sparring partner. it works WITH me, points out weaknesses in my thinking vs. blindly agreeing with everything i say (gpt)
- the power of a tool like claude code has been life-changing tbh - as someone who didn't code everyday, i can now go from thought to prototype in minutes - that feels fucking awesome.
- i like that the team is hyperfocused on shipping shit i'd actually use and that brings value to my day-to-day life e.g. cowork is a game changer and it's only version 1!
- i prefer the persona of claude, doesn't kiss my ass, keeps things objective but with a warm tone.
- claude points out when im wrong way more than gpt. the sycophancy levels of gpt are actually off the chart - didn't fully realize that until i compared the two.
- i find claude gives more comprehensive answers in research mode.
sharing this in case it's helpful for some of you

@WomanDefiner My mom lived in those the year they opened and I was always amazed as a kid she ever lived there because I could never wrap my head around that they were at some point nice.

Real Minnesotans call these the crack stacks by the way.
Omar Fateh@OmarFatehMN
Cedar Strong. White Supremacists aren’t welcome here. We protect our own.

@KimKatieUSA Usury is not allowed by their culture. If we take away potentially fraudulent income this is the answer.
