Zsolt Ero

898 posts

@hyperknot

Building https://t.co/CUfyhT0Ura and https://t.co/GTLrvnmS0h Writing on https://t.co/irgNrwubhY Loves paragliding

Europe · Joined July 2012

1K Following · 2.5K Followers

Zsolt Ero@hyperknot·2d

@arvidkahl @OpenRouter :flex endpoint when?

263

Arvid Kahl@arvidkahl·2d

If you do AI inference via OpenAI’s API, you should use the flex tier for half price. My requests always try to use flex tier first, and on 429 / 500 errors, I use the default service tier. 95% of my requests are flex. 2 tries flex, then fall back to standard. Massive cost cut.

172

19.2K
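
The flex-first pattern in the post above can be sketched roughly as follows. This is a minimal illustration, not Arvid Kahl's actual code: `send` is a hypothetical callable that performs one request on the given tier, and `RetryableError` is a hypothetical stand-in for whatever your client raises on 429 / 500 responses. With the OpenAI API the tier would presumably be passed as the `service_tier` request parameter; treat that as an assumption and check the current API reference.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical stand-in for a 429 / 500 response from the API client."""


def call_with_flex_fallback(send: Callable[[str], T], flex_attempts: int = 2) -> T:
    """Try the flex tier up to `flex_attempts` times, then use the default tier."""
    for _ in range(flex_attempts):
        try:
            return send("flex")
        except RetryableError:
            continue  # flex was unavailable or errored: try flex again
    # Flex attempts exhausted: fall back to the full-price default tier.
    # An error raised here is not caught and propagates to the caller.
    return send("default")


def relative_cost(flex_share: float, flex_discount: float = 0.5) -> float:
    """Blended cost relative to all-default pricing (failed attempts assumed free)."""
    return flex_share * flex_discount + (1.0 - flex_share)
```

Under the numbers quoted in the post (95% of requests served by flex at half price), `relative_cost(0.95)` works out to about 0.525, so roughly a 47% saving rather than a full 50%.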

Zsolt Ero retweeted

NIK@ns123abc·3d

🚨 MICROSOFT ABOUT TO SUE OPENAI & AMAZON
>be microsoft
>invest $1B in openai
>gets exclusive azure cloud deal
>invest another $10B+
>gets rights to 49% of profits +IP
>Azure goes brrrrrr
>Altman lies to board, quietly launches ChatGPT
>board fires him for being a lying manipulative snake
>Satya goes to war for Altman. saves his entire career
>Altman retvrns in 5 days
>immediately purges everyone who purged him
>full control. no oversight. thanks Satya!
>fast forward to 2025
>OpenAI restructures from non-profit to PBC
>MSFT $13.8B is now worth $135B. 10x return
>plus 27% of OpenAI
>but gives up cloud exclusivity + profit share
>KEEPS API clause
>all API calls contractually MUST route through Azure
>Satya thinks life is good lol
>5 months later
>Sam Altman becomes strong enough to betray you
>"raises $110B round"
>doesn't need satya daddy's money anymore
>announces $50B deal with AMAZON
>$138B in AWS cloud commitments
>amazon and openai claim they built some cope called a "Stateful Runtime Environment"
>Microsoft lawyers hmmm
>Altman: it's not what it looks like. i can totally explain
>so it's technically not an API call because it's "stateful"
>and it's a... "Runtime Experience"
>totally different thing
>pls ignore the TCP packets lol
>Microsoft engineers look at the SRE architecture
>"THIS IS NOT TECHNICALLY POSSIBLE without violating the contract."
*Satya finds out he's been cucked*
Microsoft exec literally tells FT: "We know our contract. We will sue them if they breach it."
>AWS quietly gives employees a memo on which words are legally safe lmao
>can say: "powered by" or "enabled by" or "integrates with" OpenAI
>cannot say: "enables access to" or "calls on" ChatGPT
>also cannot suggest frontier models are "available on AWS"
Microsoft: "If Amazon and OpenAI want to take a bet on the creativity of their contractual lawyers, I would back us, not them."
Scam Altman strikes AGAIN.

Financial Times@FT

Microsoft weighs legal action over $50bn Amazon-OpenAI cloud deal ft.trib.al/6LZe39E

474

1.6K

14.3K

2.1M

Zsolt Ero@hyperknot·13 Mar

@Rasmic @romainhuet They usually say the one you give is better. Ask them in a new conversation which one is better from the two.

Micky@Rasmic·12 Mar

I got both gpt-5.4 and opus 4.6 to generate a plan... I gave gpt's plan to opus and it admitted that it was a better plan lol

585

67.7K

Zsolt Ero@hyperknot·3 Mar

@OfficialLoganK 3.1 is unusable for creative writing. We have an email drafting pipeline carefully tuned over months, which worked perfectly on 3.0. 3.1 is unusable, it's a codemaxxed model like some of the GPT 5 series. Where 3.0 writes 4 beautiful paragraphs, 3.1 writes 4 bullet points.

281

Logan Kilpatrick@OfficialLoganK·3 Mar

PSA: we are turning down Gemini 3 Pro next Monday March 9th. You can upgrade to 3.1 Pro Preview which improves on lots of the things folks gave feedback about on the first Gemini 3 rev. Please keep the feedback coming : )

267

1.8K

591.3K

Zsolt Ero retweeted

Justin Starner@Justin_Starner·14 Feb

I am excited to announce that BigBoy Charging is now officially powered by OpenFreeMap! This latest addition, combined with the incredible mapcn, completes my goal of being able to provide this service for free to all EV drivers 🔋⚡ Huge thanks to @hyperknot and @sainianmol16 for making this dream of mine a reality 🫶

341

Zsolt Ero retweeted

Sam Rose@samwhoo·1 Feb

Reminder before Sonnet 5 drops: SWE-bench tests a model’s ability to fix small Python bugs in 12 repos in one-shot with appropriate context fed to it. It’s not a measure of agentic coding ability. I wrote in detail about a bunch of benchmarks and what they mean, link below.

5.6K

Zsolt Ero@hyperknot·29 Jan

@benhylak ??

457

ben@benhylak·29 Jan

it's silly that google charges ~50% more when you call a model through vertex

14.6K

Zsolt Ero@hyperknot·16 Dec

@itsolelehmann UK was it

142

Ole Lehmann@itsolelehmann·16 Dec

what is the country in europe with the most pro-entrepreneur/pro-growth mindset? talking about general sentiment, not tech bubble sentiment in these countries

228

122

48.7K

Zsolt Ero retweeted

Neal Agarwal@nealagarwal·10 Dec

Made a site comparing the sizes of living things :) The great Julius Csotonyi spent 5 months painting over 60 illustrations for the site, no ai used

336

6.3K

51.9K

1.1M

Zsolt Ero@hyperknot·9 Dec

@somewheresy Half of their business is API, where they cannot insert ads otherwise all workloads would break.

∿@somewheresy·9 Dec

this is going to continue to drive so many people in tech to local models, and when distributed inference is solved, communities providing endpoints for their euclidean-distanced makerspace

Kalshi@Kalshi

JUST IN: Google to launch ads on Gemini

2.4K

Zsolt Ero@hyperknot·7 Dec

I think the "you are an expert in ..." hasn't been useful for quite some time, the results are indeed the same. What is still not explored though is putting the result back into the same model by "An LLM answered X for the prompt Y, evaluate it". Interestingly this manages to find critical points missed in the first run.
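
The second-pass idea in the post above can be sketched like this. It is only an illustration under stated assumptions: `ask` is a hypothetical callable that sends one prompt to a model in a fresh conversation and returns its text, and the exact wording of the review prompt is a guess at the "An LLM answered X for the prompt Y, evaluate it" template, not the author's actual prompt.

```python
from typing import Callable, Tuple


def build_review_prompt(original_prompt: str, answer: str) -> str:
    """Wrap a first-pass answer so the model reviews it as another LLM's work."""
    return (
        "An LLM answered the following for the prompt below. Evaluate it.\n\n"
        f"Prompt:\n{original_prompt}\n\n"
        f"Answer:\n{answer}\n\n"
        "List any critical points the answer missed."
    )


def answer_then_review(ask: Callable[[str], str], prompt: str) -> Tuple[str, str]:
    """Run the prompt once, then feed the result back for evaluation."""
    first_answer = ask(prompt)
    # The review is a separate call, so the model sees the answer as
    # third-party text rather than as its own earlier turn.
    review = ask(build_review_prompt(prompt, first_answer))
    return first_answer, review
```

Keeping `ask` as a plain function means the same model can be used for both passes, as the post describes, or a different one swapped in for the review.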

Benjamin De Kraker@BenjaminDEKR·7 Dec

If you ran tests, I don't think there would be a measurable quality or accuracy difference in the responses between "what do you think" (with context of what you're discussing) vs. "What would someone who xyz think". It's a neat concept to understand but I highly doubt there is any measurable difference in results (with a modern, Thinking generation SOTA LLM.)

440

Benjamin De Kraker@BenjaminDEKR·7 Dec

Counterpoint: It's fine. An LLM, especially a Thinking model, has no problem extrapolating "what do you think" into the more zoomed-out version of "what would someone who ___ likely think." (Which Karpathy basically says later in the post.) Saying "you" to an LLM is fine.

Andrej Karpathy@karpathy

Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz"? There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many perspectives but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you", it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do, but there is a lot less mystique to it than I find people naively attribute to "asking an AI".

2.2K

Zsolt Ero@hyperknot·7 Dec

This is the state of the art on ChatGPT when asked to create a map (topic was hot springs in Cyprus). I guess interactive maps are safe for now.

451

Zsolt Ero@hyperknot·3 Dec

Finally it zero-shot a Python script, which generated this beautiful dashboard and ran an optimization problem in the console (which I should copy-and-paste back). For the first time, I feel an LLM really understands it's an LLM and communicates like a partner. Opus 4.5 is something special.

234

Zsolt Ero@hyperknot·3 Dec

It drew a comparison chart in ASCII with the exact values of the Physics problem we are discussing (I didn't ask for it, yet it's amazing, even with the spacing bug!)

247

Zsolt Ero@hyperknot·3 Dec

I've been working with AI models daily since ChatGPT came out, and there is something new in Opus 4.5 that I haven't seen in anything else before. Something in Opus 4.5 feels almost as big of a step as OpenAI o1 was. I'm asking about a hypothetical experiment of making a vented paragliding airbag, which doesn't exist yet, and for the first time, I feel an LLM can "imagine" the world and communicate how physics behaves by making ASCII drawings and Python simulations. GPT-5.1 is better at doing calculations internally, but it doesn't recognize its own limitations. Opus 4.5 doesn't even try, it makes a Python script instead, and asks me to run it and give back the results. I didn't even ask for illustrations or a script, or mention Python or coding! Some examples:

302

Zsolt Ero@hyperknot·2 Dec

@Noahpinion Why is it good? Do you mean it'll make people use the app less and less over time and hopefully find better sources / move conversations offline?

443

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion·2 Dec

You can tell this app is Dead Internet because you don't get people wandering into conversations anymore. Even the pile-ons are obviously just foreign influence ops running on fumes. Everyone's trapped in their algorithmic feed -- it's TikTok without video. (and that's good)

214

17.1K

Zsolt Ero@hyperknot·2 Dec

@mitsuhiko @badlogicgames @ExaAILabs or @p0

Zsolt Ero@hyperknot·2 Dec

@mitsuhiko @badlogicgames or @ExaAILabs

Mario Zechner@badlogicgames·2 Dec

Want your coding agent to use Google for web searches, pull down readable page content as markdown, and get full observability WTF is inserted from the web into your agent's context? Clone this repo and point your agent at the README.md github.com/badlogic/agent…

103

13.9K
