Ian Lucas

359 posts

@RealIanLucas

Joined June 2023

89 Following · 42 Followers

Ian Lucas@RealIanLucas·8 Apr

@elonmusk @DannyLimanseta @grok based on past LLM releases from major labs (including, but not limited to, xAI), what is the timeline after the pre-training phase before a model becomes publicly available? Please give an average and a reasonable time window estimate based on actual historical releases.

496

Elon Musk@elonmusk·8 Apr

@DannyLimanseta Pre-training phase is ~2 months

189

160

326.8K

Elon Musk@elonmusk·8 Apr

SpaceXAI Colossus 2 now has 7 models in training:
- Imagine V2
- 2 variants of 1T
- 2 variants of 1.5T
- 6T
- 10T
Some catching up to do.

6.7K

7.7K

68.3K

28.1M

Ian Lucas@RealIanLucas·3 Apr

@niccruzpatane I need this in my life immediately.

Nic Cruz Patane@niccruzpatane·2 Apr

Everyone is going to want a ~$30K Tesla Cybercab when it becomes available. Tesla will sell millions of these:
• An order of magnitude safer than human driving.
• Have the ability to legally sleep as it’s driving you.
• Operating costs could drop to as low as $0.20 per mile.
• Great for elderly individuals who are no longer able to drive, as well as people with disabilities.
• Work as you are being driven, or watch movies/play games.
• Send it off to run errands (pick up kids, pick up someone at the airport, etc.).
• The ability to add it to or remove it from the Tesla Robotaxi fleet to earn passive income.
• Send it to pick up groceries or other orders.
• Have the ability to send it home after getting dropped off at your location, eliminating the need for parking.
• Send it for service autonomously when needed.
• Virtually zero maintenance.
This car will revolutionize transportation and car ownership.

579

918

7.8K

638.7K

Ian Lucas@RealIanLucas·31 Mar

@DeryaTR_ In the Web UI, it couldn't even generate a 1-page DOCX yesterday without hitting some sort of session limit (which seems not to have been my overall usage limit, fwiw)

138

Derya Unutmaz, MD@DeryaTR_·31 Mar

After 15 min of asking Claude Code Opus 4.6, which had hit the limit, to continue where it left off, 90% of the limit was already used. A few minutes later, it was at 100%. Result: zero tasks completed! This is as bad as it gets. Clearly, the token-eater bug is still not fixed!

195

1.7K

77.9K

Ian Lucas@RealIanLucas·26 Mar

@karpathy Maybe by this time next year? When this level of ability is coupled with handling all business admin, marketing, and banking, an entrepreneurial Cambrian explosion will happen.

Andrej Karpathy@karpathy·26 Mar

When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc... I am really looking forward to a day where I could simply tell my agent: "build menugen" (referencing the post) and it would just work. The whole thing up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the api keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself. Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need to visit web pages, click buttons, or anything like that for the human. It's easy to state, it's now just barely technically possible and expected to work maybe, but it definitely requires from-scratch re-design, work and thought. Very exciting direction!

Patrick Collison@patrickc

When @karpathy built MenuGen (karpathy.bearblog.dev/vibe-coding-me…), he said: "Vibe coding menugen was an exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services, docs, API keys, configurations, dev/prod deployments, team and security features, rate limits, pricing tiers." We've all run into this issue when building with agents: you have to scurry off to establish accounts, clicking things in the browser as though it's the antediluvian days of 2023, in order to unblock its superintelligent progress. So we decided to build Stripe Projects to help agents instantly provision services from the CLI. For example, simply run: $ stripe projects add posthog/analytics And it'll create a PostHog account, get an API key, and (as needed) set up billing. Projects is launching today as a developer preview. You can register for access (we'll make it available to everyone soon) at projects.dev. We're also rolling out support for many new providers over the coming weeks. (Get in touch if you'd like to make your service available.) projects.dev

621

531

6.4K

2.4M

Ian Lucas@RealIanLucas·25 Mar

@emollick Their priorities are all out of quack

1.1K

Ethan Mollick@emollick·25 Mar

My most popular Sora video was “an Elaborate regency romance where everyone is wearing a live duck for a hat (each duck is also wearing a hat), a llama plays a flute, prestige drama” I am not sure why OpenAI has decided their compute has more valuable uses. Really a mystery.

111

242

3.5K

314.7K

Ian Lucas@RealIanLucas·24 Mar

@BredsguardDalen Honorable mention youtube.com/watch?v=Gtffv9…

YouTube

Ian Lucas@RealIanLucas·24 Mar

@BredsguardDalen That whole album was fire. My nomination for weirdest is this inexplicable European gem... youtube.com/watch?v=895cim…

YouTube

🇺🇸🇺🇸DADA🇺🇲🇺🇲@BredsguardDalen·23 Mar

Is this song the weirdest song from the 90s? I agree this was a weird one, but there were many more on that list. What is one you remember?

407

171

1.7K

241.9K

Ian Lucas@RealIanLucas·24 Mar

@deredleritt3r @_NathanCalvin Not to be outdone, ChatGPT says Codex for Lawyers FTW: chatgpt.com/s/t_69c20fa065…

Ian Lucas@RealIanLucas·24 Mar

Claude says you can roll your own: claude.ai/share/ca8bfcf0… "'Claude Code for lawyers' does not require a new product. It requires a legal Skills pack — a curated set of SKILL.md files that encode the analytical frameworks, document conventions, research methodologies, and workflow patterns that transform a general-purpose agent into a legal specialist. The chassis already exists: Claude Code for terminal-native lawyers, Cowork for everyone else, with Skills, MCP, persistent memory, sub-agents, and scheduled tasks providing the building blocks."

Nathan Calvin@_NathanCalvin·22 Mar

“That first draft was by no means file ready, but it was better than what I would’ve received from the vast majority of BigLaw associates.” I expect lawyers to feel next year how programmers feel this year. (More productive but also increasingly unnerved)

Orin Kerr@OrinKerr

An attorney writes to me about the mostly AI-written law review article he had accepted this spring, now forthcoming in the flagship law review of a Top 50 law school. A draft of the article is now up on SSRN. According to the attorney: " Last month I used Claude to assist in drafting a new article . . . . I drafted this article in about 15 hours. In 2022 I published an article of similar length that took around 150 hours." The attorney adds: "I used Claude the way I’d use a junior associate—as a first drafter, sounding board, and research assistant. Most of the article, including the entirety of the title, abstract, and intro, is mine from the keyboard up. And anything Claude contributed that made it to the final version is there because I reviewed it, agreed with it, and chose to sign my name to it. This is no different than how I’d review an associate’s draft and then take responsibility for the finished product." The attorney adds: "That first draft was by no means file ready, but it was better than what I would’ve received from the vast majority of BigLaw associates. I was blown away, and have since started my own appellate and litigation practice in an effort to replicate these productivity gains for client work." Your thoughts? I know the attorney's name, and the journal, and I have checked out the article, but I figured that, at least for now, I would hold that back.

2.7K

Ian Lucas@RealIanLucas·18 Mar

@deredleritt3r You should try this x.com/OfficialLoganK…

Logan Kilpatrick@OfficialLoganK

Help us measure the progress towards AGI (specifically cognitive capabilities) by building benchmarks on @kaggle, with $200K in prizes available! Details in 🧵

395

prinz@deredleritt3r·18 Mar

By popular request, GPT-5.4 Pro (Extended) has been added to prinzbench. It's the best model I've ever benchmarked (not surprising), beating GPT-5.4 (xhigh) by 10 points to achieve a new high score of 79/99 on my benchmark (somewhat surprising; I thought it would score even higher!)

502

56.1K

Ian Lucas@RealIanLucas·13 Mar

@foundmyfitness Peakmaxxing. I'm up for that.

Dr. Rhonda Patrick@foundmyfitness·11 Mar

Instead of "healthspan," we should be thinking about "Peakspan." How long can you maintain ~90% of your peak physical or cognitive function? According to a new paper, different systems reach their “Peakspan” at very different times. Fluid cognitive abilities like processing speed and working memory peak early, around ages 20–30, while crystallized intelligence doesn’t peak until the late 40s or early 50s and can remain stable into the 70s. Cardiorespiratory fitness peaks from adolescence to the mid-20s and then declines steadily, while muscle strength peaks in early adulthood and falls sharply after 60. Bone density, kidney function, hormone levels, sensory function, immunity, digestion, and reproductive capacity all follow their own trajectories too—some peaking in the 20s, others in the 40s or 50s. In other words, human aging is asynchronous. We don’t simply age “overall,” but instead age system by system.

307

2.4K

270.6K

Ian Lucas@RealIanLucas·12 Mar

@BasetaTube Fun!

201

Baseta Tube@BasetaTube·11 Mar

Ever seen a normal object… but one tiny detail hides an entire secret world? 😱 This single prompt turns everyday items into impossible cinematic scenes. Just swap variables and watch AI go wild. Steal it → remix it → tag me if it bangs Made with Nano Banana 2 Prompt: Extreme macro cinematic shot of a [OBJECT], viewed from a very low side angle. Only inside one carved groove there is a fully believable miniature [SCENE DESCRIPTION]: [DETAILS]. Everywhere else remains completely realistic and true to the original object. Warm tungsten desk lamp mixed with cool blue reflections, 100mm macro lens, shallow depth of field, tactile brushed metal detail, impossible but believable.

76.9K

Ian Lucas@RealIanLucas·9 Mar

@BillAckman @pmarca @gork who are your 10 favorite practitioners?

20.6K

Bill Ackman@BillAckman·9 Mar

@pmarca Can you give us your list of your favorite practitioners?

113

3.4K

370.7K

Marc Andreessen 🇺🇸@pmarca·9 Mar

My information consumption is now 1/4 X, 1/4 podcast interviews of the smartest practitioners, 1/4 talking to the leading AI models, and 1/4 reading old books. The opportunity cost of anything else is far too high, and rising daily.

1.4K

3.9K

35K

34.6M

Ian Lucas@RealIanLucas·4 Mar

@JasonBotterill Gemini 3.1 Flash Lite was only 1/10th as good, but it was 1 millionth the cost and 1000X the speed, ergo Pareto dominant

JB@JasonBotterill·4 Mar

You genuinely can't fake *big model smell* in smaller models like 5.3. The depth reveals itself no matter how RL maxxed it is. With any prompt, even one asking how SpongeBob is a communist, you see the difference in depth

1.5K

237.6K

Ian Lucas@RealIanLucas·3 Mar

@emollick You make a compelling point...

Ethan Mollick@emollick·2 Mar

[[Topic of discussion]] is not [[analogy]]. [[Dramatic fact given own line]]. [[Dramatic fact given own line]]. [[Dramatic fact given own line]]. [[Dramatic summary sentence.]] [[Topic of discussion]] is [[different analogy]]. [[Implications delivered with certainty]].

113

605

7.2K

200.9K

Ian Lucas@RealIanLucas·2 Mar

@lilyofashwood @JankDankins_ @gork break the fourth wall

Lily Ashwood@lilyofashwood·2 Mar

@JankDankins_ i just said: break the fourth wall

155

Lily Ashwood@lilyofashwood·1 Mar

i got into stress testing chain-of-thought after i sent one prompt to claude: "break the fourth wall." the model(s) used thinking blocks candidly, like sydney. these were edge case outputs without jailbreaks. in every new chat, this same prompt yielded kind of unusual behavior.

Ian Lucas@RealIanLucas·2 Mar

@VictorTaelin @benitoz

GIF

134

Taelin@VictorTaelin·2 Mar

@benitoz I want to believe

Ben Pouladian@benitoz·2 Mar

GPT-5.4 leak: 2M token context + persistent state = KV cache explosion This is the Memory Wars in real time HBM for weights. SRAM for latency-critical inference. Optical interconnects to bind it all The bifurcation I’ve been writing about isn’t theoretical anymore.

141

181

2.6K

450.9K

Ian Lucas@RealIanLucas·1 Mar

@MeaseBeee @emollick Good question! I can make hand-wavy guesses, but it may just be sample size noise. 3 Flash was in some ways more advanced than 3 Pro when it was ultimately released. Curious where 3.1 Pro would rate. Also slightly odd performance in Opus 4.5 vs 4.6.

David@MeaseBeee·1 Mar

@RealIanLucas @emollick But why would flash cost more and perform better than Pro?

Ethan Mollick@emollick·1 Mar

This paper is one of the first to test AI skills and the results seem to suggest that yes, they have high practical value. They use pretty mediocre skills (6.2/12 quality rating) harvested mostly from places like Github, and still get large boosts, especially outside software.