Bill Leoutsakos (@Bi11Leou) - Twitter Profile

Pinned Tweet

Just built an open-source real-time AI assistant using the newest gpt-realtime API from @OpenAI (watch the demo below) Check out the repo and feel free to reach out for any questions & take ideas from it to build SOTA real-time voice agents! github.com/BillLeoutsakos… Here are some of the challenges I faced while building it, how I solved them and what I learned: 👇 (also pls like / comment / repost, it's my first launch on Twitter and I wanna go a bit viral 😁)

English

12

5

33

6.2K

Bill Leoutsakos@Bi11Leou·15h

@Polymarket Years??? Why did bro say that 😭. The stock market is cooked rn

English

0

1

8

Bill Leoutsakos@Bi11Leou·1d

@AndrewCurran_ The thing is... gpt 4.5 was a huge model but the result underperformed. Maybe the methods have advanced a lot since then

English

0

2

190

Andrew Curran@AndrewCurran_·1d

Three weeks ago there were rumors that one of the labs had completed its largest ever successful training run, and that the model that emerged from it performed far above both internal expectations and what people assumed the scaling laws would predict. At the time these were only rumors, and no lab was attached to them. But in light of what we now know about Mythos, they look more credible, and the lab was probably Anthropic. Around the same time there were also rumors that one of the frontier labs had made an architectural breakthrough. If you are in enough group chats, you hear claims like this constantly, and most turn out to be nothing. But if Anthropic found that training above a certain scale, or in a certain way at that scale, produces capabilities that sit far above the prior trendline, then that is an architectural breakthrough. I think the leaked blog post was real, but still a draft. Mythos and Capybara were both candidate names for the new tier, though Mythos may now have enough mindshare that they end up keeping it. The specific rumor in early March was that the run produced a model roughly twice as performant as expected. That remains unconfirmed. What is confirmed is that Anthropic told Fortune the new model is a 'step change,' a sudden 2x would certainly fit the definition. We will find out in April how much of this is true. My own view is that the broad shape of this is correct even if some of the numbers are wrong. And if it is substantially accurate, then it also casts OpenAI's recent restructuring in a new light. If very large training runs are about to become essential to staying in the game, then a lot of their recent decisions, like dropping Sora, make even more sense strategically. For the public, this would mean the best models in the world are about to become much more expensive to serve, and therefore much more expensive to use. 
That will put pressure on rate limits, pricing, and subscription plans that are already subsidized to some unknown degree. Instead of becoming too cheap to meter, frontier intelligence may be about to become too expensive for most of humanity to afford. Second-order effects: compute, memory, and energy are about to become much more important than they already are. In the blog they describe the new model as not just an improvement, but having 'dramatically higher scores' than Opus 4.6 in coding and reasoning, and as being 'far ahead' of any other current models. If this is the new reality, then scale is about to become king in a whole new way. It would also mean, as usual, that Jensen wins again.

English

171

306

3.9K

836.7K

Bill Leoutsakos@Bi11Leou·1d

How me and the boys be chilling in 🇬🇷 before big release

English

0

1

54

Bill Leoutsakos@Bi11Leou·1d

@ns123abc This guy built agi btw

English

0

39

NIK@ns123abc·1d

Dario Amodei’s sister loved stuffed animals so much that her fiancé proposed via a movie of her dolls coming to life. Dario wore a panda suit to their wedding. Their clique at OpenAI then became “the pandas”.

English

58

31

1.1K

111.2K

Bill Leoutsakos@Bi11Leou·1d

"Getting in is the hard part" is what everyone, including me, thought after I got in. It's utter nonsense. The university exams and content are 100x harder than any test you'll take before uni. In high school I was hitting 90 and 100% with modest preparation; now I am grinding and hoping for a 60 or a 70%. If you are in a STEM course at a good uni, doing the course is a pain in the ass.

English

0

254

vas@vasuman·1d

There is no alpha in graduating from college. Getting in is the hard part; pretty much everyone graduates. Therefore the optimal path is to get into the best college you can, then drop out immediately while using that name brand as signal and getting real world experience at a very fast paced and high growth job opportunity. Or idk just have fun while making friends and memories like a normal college student. Just a thought.

English

27

3

316

76.7K

Bill Leoutsakos@Bi11Leou·1d

@simdotai 50cm

1

0

1

22

Bill Leoutsakos@Bi11Leou·1d

@j_dekoninck @hyhieu226 Damn gpt 5.4 brutally terramogged previously #1 ranked LLM Opus 4.6 here

English

0

1

111

Jasper Dekoninck@j_dekoninck·1d

Last year, models miserably failed on USAMO 2025. This year, GPT-5.4 scores an amazing 95%, essentially saturating the benchmark. Yes, LLMs still make many mistakes, but overall, one can be nothing but amazed at what they are achieving and how steep progress in AI4Math is.

English

27

67

578

65.4K

Bill Leoutsakos@Bi11Leou·1d

@creepydotorg If they put 1% of this creativity in a job interview they would have been employed a long time ago

English

0

3

239

Creepy.org@creepydotorg·1d

German climate activists stood on melting blocks of ice with ropes around their necks, warning that “time is running out.”

English

3.1K

1.1K

28K

9.4M

Bill Leoutsakos@Bi11Leou·2d

@vasuman I think bro is beyond cooked

English

0

2

303

vas@vasuman·2d

> the rumors are not true
> *comments off*

Karun Kaushik@karunkaushik_

Over the past week, you may have seen an anonymous post about Delve. While we responded to it in a day, we want to provide more details about what’s true, what's not, and some changes we’ve made. There’s one question behind everything: did Delve fabricate compliance evidence or issue fraudulent audit reports? No. We did not. → Delve is an AI compliance platform that connects customers with independent auditors. We are not an auditor, just as tax preparation software is not an accountant. We have never signed an audit report. → Using default templates for our customers, just like any other compliance platform, is not “faking evidence.” These are meant to serve as a starting point for customers. → Delve does have automation in the platform, with 600+ automated integration tests, an AI Copilot to guide customers through compliance, AI code scanning, and more. -- We built Delve to accelerate innovation by bringing AI to compliance. In doing that, we pushed hard on automation. However, we now realize we didn’t provide enough clarity about what is automated, what is customer-provided, and what is independently audited. We have been working relentlessly to make improvements over the last week. -- On our auditor network: Delve connects customers with independent auditors. Some customers choose their own auditors, but many use firms in our network. Questions have been raised about some of those firms, including ones used by other platforms. Going forward we will set a higher bar in how our auditor relationships are structured and how the process is experienced by customers. Delve is rebuilding our auditor network, removing firms that don’t meet our standards, and offering complimentary re-audits and penetration tests to every customer. On platform templates for our customers: Delve provides default templates, just like many other platforms, for policies, board meetings, risk assessments, and more. These are designed to be starting points only. 
We should have been more explicit about how they are meant to be reviewed and customized by customers. We are making that indisputably clearer within the platform. On draft audit reports: Third-party auditors are responsible for independently reviewing all evidence and issuing final reports. We built automation that interacts closely with independent audit workflows to help expedite the process on behalf of our customers. However, this contributed to confusion about where automation ends and independent judgment begins. From now on, Delve will no longer automate these parts of the process. Furthermore, customers have a direct line of communication with their auditor to enhance transparency in any audit communications. -- We started Delve because we went through compliance ourselves and saw how slow, expensive, and manual it was. To anyone that wants to sit down and discuss our product philosophy and improvements, please reach out and let’s chat about it.

English

15

16

1.1K

96.8K

Bill Leoutsakos@Bi11Leou·2d

Why did bro limit the answers tho 🤔

Karun Kaushik@karunkaushik_


English

0

3

766

Bill Leoutsakos@Bi11Leou·2d

@cursor_ai Plot twist: Cursor drops continual learning before the labs

English

0

26

Cursor@cursor_ai·3d

Earlier this week, we published our technical report on Composer 2. We're sharing additional research on how we train new checkpoints. With real-time RL, we can ship improved versions of the model every five hours.

English

97

122

1.6K

454.1K

Bill Leoutsakos@Bi11Leou·2d

@scaling01 Claude 4.5 was already phenomenal... imagine what Claude 5 is gonna be. It's over for OpenAI and Google

English

0

304

Lisan al Gaib@scaling01·2d

APRIL IS GOING TO BE SICK
GPT-5.5
CLAUDE 5
MYTHOS
DEEPSEEK-V4

English

120

154

3.8K

158.5K

Bill Leoutsakos@Bi11Leou·2d

Chatgpt reachout final boss 🤦‍♂️

English

0

62

Bill Leoutsakos@Bi11Leou·3d

@DEhnts All the growth beforehand came from building houses anyway. After the 70s we never invested in anything important in tech or otherwise.

English

0

89

Dirk Ehnts@DEhnts·4d

Greece used to be at 95 percent of the EU average when it comes to GDP per capita. Now it stands at 68%. Those that talk about a "Greek recovery" need to get their facts straight. Greece stabilized at a low level of economic activity, then grew a bit, but it did not "recover".

EU_Eurostat@EU_Eurostat

The preliminary 2025 results show that gross domestic product (GDP) per capita — expressed in purchasing power standards — ranged between 68% of the EU average in 🇬🇷Greece and 🇧🇬Bulgaria and 239% in 🇱🇺Luxembourg. Read more 👉link.europa.eu/94N43x

English

52

389

1.2K

57K

Bill Leoutsakos@Bi11Leou·4d

@chatgpt21 Now they will see what it has, put it in the training data, and a year from now we'll all be like "wow, ARC AGI 3 is saturated, that's insane"

English

0

107

Chris@chatgpt21·4d

WOW! Models perform HORRIBLY on ARC AGI 3
Gemini 3.1 pro 0.37%
GPT 5.4 (High) 0.26%
Opus 4.5 (Max) 0.25%
I wonder how long it’ll take for this benchmark to be solved

English

145

85

1.7K

165.5K

Bill Leoutsakos@Bi11Leou·4d

@FT Desperatemaxxing

Português

0

1

49

Financial Times@FT·5d

The billionaire’s legal team said that because Chancellor Kathaleen McCormick had liked a post celebrating his recent legal defeat and thereby created ‘a perception of bias against Mr. Musk in these cases, recusal is necessary and warranted’. ft.trib.al/6h9vFAH?

English

95

135

987

261.3K

Bill Leoutsakos@Bi11Leou·4d

@Drakon1c Wait wtf, so how much is it in august 💀

English

1

0

1

10

Adarsh@Drakon1c·4d

@Bi11Leou come ends

English

1

0

14

Bill Leoutsakos@Bi11Leou·4d

POV: You moved to Athens but the weather is the same as Cambridge 🤦‍♂️. When is it gonna start heatmaxxing here? When I was a kid it was like 25 degrees by the start of April at least.

English

1

0

1

48

Bill Leoutsakos@Bi11Leou·4d

The 2 most goated companies ngl

Disclose.tv@disclosetv

NEW - Anduril and Palantir to develop software to run Trump’s Golden Dome antimissile shield — WSJ

English

0

1

31

Bill Leoutsakos@Bi11Leou·5d

@TheBTCTherapist +getting sued by microsoft

English

0

684

The ₿itcoin Therapist@TheBTCTherapist·5d

- OpenAI shutting down Sora
- Disney backing out from OpenAI deal
- OpenAI backlash from Pentagon deal
- $11.5 billion in quarterly losses
- $207 billion funding gap
- No profitability before 2030
The bubble is bursting

DiscussingFilm@DiscussingFilm

OpenAI is shutting down its AI video slop-making platform Sora.

English

379

6K

65K

2.8M
