Pinned Tweet
Andre Infante
24.2K posts

Andre Infante
@AndreTI
Making games and robots and sometimes other things. (Formerly: 1X, Meta)
Bay Area · Joined January 2009
354 Following · 1.5K Followers

@ChadNotChud x86 assembly has many of the same problems as these esoteric languages but is definitely in the dataset. So you could try that to see if it's a data problem. I bet it does better than brainfuck, but not a lot better.

@ChadNotChud Yeah, I think humans would also do way worse on this benchmark. Even given a meaningful amount of time to practice. A better benchmark would be one that randomizes superficial aspects of the language within the space of reasonable languages that a person might use.

in fairness, if you asked me to find the longest common subsequence in brainfuck I would shoot you
Lossfunk@lossfunk
🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

@AussieFuckery @shadowmight1 @kmcnam1 It performed similarly to physicians as a whole and worse than expert physicians in a meta-analysis of studies that date back to *2018* (pre GPT-2). Doing as well as typical physicians for ~free is a huge deal and the large time window predictably worsens the AI's comparison.

@AndreTI @shadowmight1 @kmcnam1 This one literally says in the opening paragraph that AI performed significantly worse than expert physicians.
You actually didn't read this, did you?

@AndreTI @shadowmight1 @kmcnam1 >claims wide margins
>literally no statistical difference at all, entirely within margin of error, paper itself literally says the results are non-significant
You didn't even fucking read it

@AussieFuckery @shadowmight1 @kmcnam1 *Sigh*.
That's the AI + physician group vs the AI plus conventional resources group. The AI plus physician group didn't do any better because the physicians didn't actually use it. If you read past the introduction, the AI on its own (no physician) outperformed both groups.

@AndreTI @shadowmight1 @kmcnam1 Except your articles say "no difference" and "significantly worse" so my statement was entirely true. There is not a single article that states AI is better than physicians. You've done the opposite - provided a paper proving AI is objectively worse in every way
Andre Infante retweeted

@shadowmight1 @AussieFuckery @kmcnam1 I'm sure that's true to an extent, but some people will read these threads who are *not* die-hard partisans, so if someone posts something that is literally not true ("not a single peer reviewed article"), it's worth providing evidence that this is not the case.

@AndreTI @AussieFuckery @kmcnam1 You can't argue with AI luddites. Twitter told them that AI is bad and evil and will kill us all and steal all jobs and destroy the water.
They look at the tool that excels at pattern recognition, then they look at a task of recognizing patterns, and all they see is mecha Satan.

@NigelKBaker @DJSnM Yeah, impossible is the wrong objection. We've been putting computers in space for a long time. The issue is entirely economic competitiveness.

Rather than arguing based on 'feels' or asking AI to make up an answer that sounds plausible, I actually did some math on cooling data centers in space, while explaining the basics of thermal balance.
patreon.com/posts/cooling-…
Shame I overestimated the size of the spacecraft, but the numbers are at least in the right ballpark.

@kmcnam1 If the dog could write working software and pass most graduate-level college courses, I think that's a pretty different situation.

@AussieFuckery @shadowmight1 @kmcnam1 Here's a meta-analysis showing comparable performance between generative AI tools and physicians. And note that meta-analyses are inherently downward biased compared to the cutting edge because most of the data is older models.
nature.com/articles/s4174…

@AussieFuckery @shadowmight1 @kmcnam1 Here's a study where ChatGPT alone outperforms physicians using Google *and* physicians using ChatGPT by wide margins at diagnostic tasks.
jamanetwork.com/journals/jaman…

@0xglitchbyte AI coding models are still not that smart and it's tempting to use them at an excessive level of abstraction that causes issues. But if you can't get time savings *and* correctness out of them, it's a skill issue, not a permanent feature of the technology.

@0xglitchbyte Taken literally, this thesis implies that delegation in software can never happen and every line of code should be manually implemented by the CTO. After all, how could he possibly communicate intent via emails or design documents? Only code will do.

@NigelKBaker @DJSnM To some extent, I suspect Musk believes it makes sense because it's necessary to create a nominal synergy between SpaceX and xAI, so the more successful company can bail out the less successful.

@NigelKBaker @DJSnM It's not impossible, but it's expensive. Power is like 6% of the all-in cost of running a GPU over its lifetime, iirc. And launching into space increases all of your other costs by a lot more than the 6% savings from free solar.
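
The arithmetic behind that reply can be sketched roughly as follows. This is only an illustration: the 6% power share is the tweet's own from-memory figure, the function names are hypothetical, the 50% overhead is an arbitrary placeholder rather than an estimate, and the model assumes power in orbit is completely free while every other cost scales by one uniform factor.

```python
def break_even_overhead(power_share: float) -> float:
    """Fractional rise in non-power costs at which free power stops paying off.

    Ground cost is normalized to 1.0. In orbit, power is assumed free, so the
    cost is (1 - power_share) * (1 + overhead). Setting that equal to 1.0 and
    solving for overhead gives power_share / (1 - power_share).
    """
    return power_share / (1.0 - power_share)


def relative_cost_in_space(power_share: float, overhead: float) -> float:
    """Lifetime cost in orbit relative to a ground baseline of 1.0."""
    return (1.0 - power_share) * (1.0 + overhead)


POWER_SHARE = 0.06          # figure quoted in the tweet ("iirc")
PLACEHOLDER_OVERHEAD = 0.5  # hypothetical value, for illustration only

print(f"break-even overhead: {break_even_overhead(POWER_SHARE):.4f}")
print(f"relative cost at 50% overhead: "
      f"{relative_cost_in_space(POWER_SHARE, PLACEHOLDER_OVERHEAD):.2f}")
```

Under these assumptions, free power only pays off if going to orbit raises all other costs by less than about 6.4%.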

@wg_alen I'm sure it's a lot of money, but it also seems like he's just having fun doing them.

Jim Carrey being tired of Hollywood but coming back just to play Eggman is just so funny to me
The Hollywood Handle@HollywoodHandle
Jim Carrey is reportedly set to return in ‘SONIC. 4.’ Currently shooting his scenes. (Via: @DanielRPK)

@kbean511 @KelseyTuoc The point is to stop US citizens from voting, which is why if you propose a compromise that achieves the goal but doesn't disenfranchise any citizens, the chuds freak out. Because disenfranchising people was the whole point.

@kbean511 @KelseyTuoc Millions of Americans don't have a valid ID they could vote with, and it's wrong to disenfranchise them or charge them money to rectify this. The fact that fixing this is a problem for people kind of proves that election integrity was never the point of this.

This is the correct stance. Photo voter ID is a good idea. The SAVE act is a bad idea because it's *not* just photo voter ID, but photo voter ID is a good idea. Also probably advantages Dems politically, but it's also good on the merits.
Steven Dennis@StevenTDennis
Schumer tells press Democrats are *not* opposed to photo voter ID.
