LDJ

466 posts

@ldjconfirmed

Currently: Doing some stuff with AI. Prev founding team of: @NousResearch (2023) and @TTSLabsAI (2020). DM for interesting conversations.

Joined March 2021

709 Following · 6K Followers

Pinned post

LDJ@ldjconfirmed·16 Apr

Moore's law created AI to save itself.

10.6K

LDJ@ldjconfirmed·2d

@kelstar_ @NousResearch @OpenRouter Not sure why that would impact the rankings in this way. But here's a further update: Hermes Agent is now #1 on the monthly global charts too, and it is so far ahead that it has more usage than OpenClaw, Kilo Code and Claude Code combined.

Vlad Gaevsky@kelstar_·13 May

@ldjconfirmed @NousResearch @OpenRouter Do you think it has anything to do with the codex subscription being hyped and so good right now?

Nous Research@NousResearch·9 May

Hermes Agent is now #1 on the Global @OpenRouter token rankings. While our journey together has just begun, we'd like to take this opportunity to thank our contributors, supporters, and users for all they have done to get us this far.

439

727

7.2K

LDJ@ldjconfirmed·25 May

@willdepue x.com/ldjconfirmed/s…

LDJ@ldjconfirmed

GODMAAX is the new FAANG
G = Google(Deepmind)
O = OpenAI
D = Deepseek
M = Meta
A = Anthropic
A = Alibaba(Qwen)
X = XAI

LDJ@ldjconfirmed·12 May

@NousResearch @kelstar_ @OpenRouter It's now #1 on the weekly chart too 😎

Nous Research@NousResearch·9 May

@kelstar_ @OpenRouter It's the daily global chart as clearly indicated in the screenshot. Don't worry, we'll do an all time #1 post when we get there too!

5.5K

LDJ@ldjconfirmed·28 Apr

@anko_979 @BlackHC It’s a common misconception that it was pulled. It wasn’t; its placement was simply moved from the preface of the code of conduct to the closing section, where it still exists today.

AnKo@anko_979·28 Apr

@BlackHC It's almost like this fundamental slogan was pulled for some reason

133

Andreas Kirsch 🇺🇦@BlackHC·28 Apr

I'm speechless at Google signing a deal to use our AI models for classified tasks. Frankly, it is shameful. For HR, I'm not speaking on behalf of Google but in my personal capacity, quoting public information from a well-sourced article of a reputable publication

214

202

1.3K

253K

LDJ@ldjconfirmed·28 Apr

Roughly related, but I don't think they intend to use any GDPval score as evidence of AGI being achieved. They say in the paper themselves that it's largely work with a time horizon of only a few hours, and that the context is much more assisted than in a real job. GDPval is also far easier and more saturated than something like RemoteLaborIndex, which consists of real Upwork tasks (but still not typical employment positions). Current GDPval SOTA is over 80%. Current RemoteLaborIndex SOTA is less than 5%.

Dan McAteer@daniel_mac8·28 Apr

@ldjconfirmed @deredleritt3r It's related to GDPval too

prinz@deredleritt3r·27 Apr

As to the timeline for OpenAI declaring AGI, recall that Amazon has committed to invest $35B in OpenAI upon the earlier of: (1) OpenAI IPO, or (2) OpenAI meeting "specified milestones" - which Reuters reported means OpenAI declaring AGI. Importantly, Amazon's commitment to invest $35B *expires* on December 31, 2028. Timing for the IPO depends heavily on market conditions. Thus, if Reuters' reporting was accurate, OpenAI must have felt very confident that AGI will be declared within the next ~2 years.

149

26.2K

LDJ@ldjconfirmed·28 Apr

@daniel_mac8 @deredleritt3r In Microsoft's October 2025 blog post about their latest partnership terms with OpenAI: "Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel."

Dan McAteer@daniel_mac8·27 Apr

@deredleritt3r true! hard for me to imagine how you can base an agreement on such vague language (unless, as you mention Prinz, it's more precise in the private agreement)

LDJ@ldjconfirmed·28 Apr

@deredleritt3r @daniel_mac8 "Economically valuable work" is further defined by people internally at OpenAI as the jobs tracked by the US Bureau of Labor Statistics. So I suppose it's a majority of those jobs that they mean.

prinz@deredleritt3r·27 Apr

Not just "most"! What is "outperforming humans" - do we mean the average human, a Ph.D, a child...? What is "economically valuable work" - jobs or tasks? And do we include physical labor or not? Some of these things may be explained further in the agreement, but again that's never been made publicly available.

169

LDJ@ldjconfirmed·28 Apr

@deredleritt3r In Feb 2026, Sam Altman said at a Stanford hackathon: “If you are a sophomore now, you will graduate into a world with AGI in it.” Sophomores in Feb 2026 are set to graduate around mid-2028. I believe this is the first and only time he's stated such a near-term AGI prediction.

510

LDJ@ldjconfirmed·10 Apr

@ChaosEmergent @haider1 GPT-4.5 started training around May 2024, almost exactly 2 years ago now. (This is based on official OpenAI statements at the time that mentioned starting training on their next-generation model, along with corroboration from The Wall Street Journal and others.)

155

Nikhil Shinday@ChaosEmergent·10 Apr

@haider1 wait then what was gpt-4.5 I assumed that the 5-series were distilled from 4.5 then RLVR'd as smaller models

Haider.@haider1·10 Apr

greg brockman recently confirmed that "spud" is openai's first new pre-train model in two years
since gpt-5.x models seem to build on gpt-4o/4-turbo, if openai RL can push a weaker base model like gpt-4o close to gpt-5.4-x-high-level intelligence, then openai clearly has a secret sauce

1.1K

97.2K

LDJ@ldjconfirmed·21 Mar

@zephyr_z9 That quote is not true to what he said. His statement was the direct opposite of what you put in quotation marks. Here is his actual quote on that topic: "Even by 2028, I don’t expect that we’ll get systems as smart as people in all ways"

183

Zephyr@zephyr_z9·20 Mar

"AI systems will be as smart as humans in all ways by 2028"
Multimodality will still be a bottleneck

prinz@deredleritt3r

New interview with Jakub Pachocki in the MIT Technology Review:
- The automated AI researcher (planned for 2028) is described as a "multi-agent" system, and will be able to "tackle problems that are too large or complex for humans to cope with". This is a clear indication that OpenAI expects the automated AI researcher to be superhuman at AI research.
- But it won't be used only for AI research. "In theory, you would throw such a tool any kind of problem that can be formulated in text, code or whiteboard scribbles." This includes math, physics, biology, chemistry, "or even business and policy dilemmas".
- Pachocki: "I think we are getting close to a point where we'll have models capable of working indefinitely in a coherent way just like people do... we will get to a point where you... have a whole research lab in a data center."
- Saving the world by solving its hardest problems is the stated mission of all the top AI firms. "Pachocki says OpenAI now has most of what it needs to get there."
- The automated AI research intern (targeted for September 2026) will be able to take on tasks "that would take a person a few days". Consider what this means with regard to METR's time horizon.
- Pachocki: "If we really wanted to, we could build an amazing automated mathematician. We have all the tools, and I think it would be relatively easy. But it's not something we're going to prioritize now... there's much more urgent things to do. We are much more focused now on research that's relevant in the real world."
- Pachocki does not expect that AI systems will be as smart as humans in all ways by 2028, "but I don't think it's absolutely necessary... you don't need to be as smart as people in all their ways in order to be very transformative."

124

16.7K

LDJ@ldjconfirmed·17 Mar

@juristr L9: You have the AI itself write the optimal coordination layer on the fly for spawning, routing and managing agents programmatically, in the way that works best for a given project and your preferences.

106

Juri Strumpflohner@juristr·16 Mar

What's your AI adoption level? (according to Steve Yegge)

287

989

2.2M

LDJ@ldjconfirmed·13 Mar

@otium33 @BasedBiohacker @bryan_johnson He has already publicly talked about results of his personal peptide experimentation prior to doing his shroom experiments.

otium@otium33·13 Mar

@BasedBiohacker @bryan_johnson I’m honestly very confused why he is taking shrooms but not peptides

1.8K

BasedBiohacker@BasedBiohacker·13 Mar

bryan johnson was NOT tracking 100+ biomarkers, doing blood transfusions with his son and turning down sex to sleep at 8:30 when he was building braintree/venmo
he was dogging it on cheeseburgers, stimulants and zero sleep
he was fat. unhappy. suffered from existential doubt.
the state of madness that births greatness.
he only got into longevity because he lost purpose when he sold his company to paypal for $800 million
so no bro, you should not be comparing yourself to nor model your life after what lizard johnson is doing.
you can focus on that when you sell your biz for just short of a billion fucking dollars.
focus on output and results, not optimizing for 50 years from now.
yes - don't nuke your brain with amphetamines and coke, but also don't get so neurotic about health that you sacrifice your potential.
max output. max results.
you can track your nighttime boners and do total plasma exchanges when you have 100 mill.

527

98.6K

LDJ@ldjconfirmed·9 Mar

@haider1 It's been confirmed that some devs outside of OpenAI had early access to GPT-5.4 for at least "a few weeks" prior to public release. Exhibit A:

Pietro Schirano@skirano

This model is absolutely insane. I’ve been using it for a few weeks, and it’s the first model that made the impossible feel possible for me. Particularly the pro version, it’s capable of solving even the hardest problems.

6.1K

Haider.@haider1·9 Mar

openai probably didn't have the model just sitting there for a while
instead, gpt-5.4 was likely a checkpoint from a model that was still training, released when they felt pressure
even logan from google told me in DM that: "google doesn't hold models back because the competition is high"

530

83.8K

LDJ@ldjconfirmed·8 Mar

@thebasedcapital I think it did well. It lasted nearly 3 full years.

1.2K

basedcapital@thebasedcapital·8 Mar

@ldjconfirmed benchmarks built to prove ai can't do something have a short shelf life. the gaia authors probably expected it to last years, not months. at some point you have to stop moving the goalposts and accept that general capability is here even if it's not agi

1.5K

LDJ@ldjconfirmed·8 Mar

In November 2023, Yann LeCun, Thomas Wolf and others from Meta and Hugging Face created a benchmark called GAIA, which described itself as: "A benchmark for General AI Assistants that, if solved, would represent a milestone in AI research." Most of the solutions were kept private rather than released online. It proposed 466 "real-world questions that require a set of fundamental abilities such as reasoning, multi-modality handling, web browsing, and generally tool-use proficiency." On the hardest level, the average human score was 87%, while the leading systems scored less than 3%. Ten months later, OpenAI released o1-preview, which reached ~30% on that level. Now, in 2026, the human baseline for the hardest level has officially been surpassed: the best agent systems are scoring 88.9% on GAIA's hardest level (level 3).

798

78.7K

LDJ@ldjconfirmed·8 Mar

@ThomasScialom Unfortunately I can’t find any human baselines for GAIA 2.

1.2K

Thomas Scialom@ThomasScialom·8 Mar

@ldjconfirmed There is Gaia 2 now which is a very good eval for environments a la Openclaw

LDJ@ldjconfirmed·8 Mar

@xundecidability @WaveTheoryAI The difference here is that GAIA is real world questions involving highly specific information that exists amongst human civilization across a diverse set of modalities, not an abstract puzzle.

xundecidability@xundecidability·8 Mar

@WaveTheoryAI @ldjconfirmed "designed" is aspirational, it's possible their design was defeated. ARC also aspired to this, but the labs soon learned to beat it. O1 was the first model they fine-tuned on it, but they didn't release that version because it was dumber at everything else.

118

LDJ@ldjconfirmed·8 Mar

@nithin_k_anil The human baseline score was also matched/surpassed by GPT-5 and Gemini-3-Pro working together without any specialized orchestrator in the loop; that pairing scored only ~2% below Nvidia's top score. I imagine Opus 4.6, GPT-5.4 and Gemini-3.1 together would get an even better score.

194

LDJ@ldjconfirmed·8 Mar

The current highest level 3 score was achieved by Nvidia, leveraging a multi-agent system that includes Nvidia's own tool orchestrator model. It scores 89.8% on level 3 (even higher than the 88.9% typo I wrote above). The public leaderboard can be seen here: huggingface.co/spaces/gaia-be…

6.5K

LDJ@ldjconfirmed·27 Feb

This is the best AI-generated short-film-esque media I’ve seen thus far. Not just visually, but with actual storytelling to go with it too.

Next on Now@next_on_now

1:1 is coming, will you even be ready?

681
