Ian Lucas

359 posts

@RealIanLucas

Joined June 2023
89 Following 42 Followers
Ian Lucas
Ian Lucas@RealIanLucas·
@elonmusk @DannyLimanseta @grok based on past LLM releases from major labs (including, but not limited to, xAI), what is the timeline after the pre-training phase before a model becomes publicly available? Please give an average and a reasonable time window estimate based on actual historical releases.
1
0
5
496
Elon Musk
Elon Musk@elonmusk·
SpaceXAI Colossus 2 now has 7 models in training:
- Imagine V2
- 2 variants of 1T
- 2 variants of 1.5T
- 6T
- 10T
Some catching up to do.
6.7K
7.7K
68.3K
28.1M
Nic Cruz Patane
Nic Cruz Patane@niccruzpatane·
Everyone is going to want a ~$30K Tesla Cybercab when it becomes available. Tesla will sell millions of these:
• Magnitude safer than human driving.
• Have the ability to legally sleep as it’s driving you.
• Operating costs could drop to as low as $0.20 per mile.
• Great for elderly individuals who are no longer able to drive, as well as people with disabilities.
• Work as you are being driven, or watch movies/play games.
• Send off to run errands (pick up kids, pick up someone at the airport, etc).
• The ability to add/subtract from the Tesla Robotaxi fleet to earn passive income.
• Send to pick up groceries, or other orders.
• Have the ability to send home after getting dropped off at your location, eliminating the need for parking.
• Send for service autonomously when needed.
• Virtually Zero Maintenance.
This car will revolutionize transportation and car ownership.
579
918
7.8K
638.7K
Ian Lucas
Ian Lucas@RealIanLucas·
@DeryaTR_ In the Web UI, it couldn't even generate a 1-page DOCX yesterday without hitting some sort of session limit (which seems not to have been my overall usage limit, fwiw)
0
0
1
138
Derya Unutmaz, MD
Derya Unutmaz, MD@DeryaTR_·
Fifteen minutes after I asked Claude Code Opus 4.6, which had hit the limit, to continue where it left off, 90% of the limit was already used. A few minutes later, it was at 100%. Result: zero tasks completed! This is as bad as it gets. Clearly, the token-eater bug is still not fixed!
195
73
1.7K
77.9K
Ian Lucas
Ian Lucas@RealIanLucas·
@karpathy Maybe by this time next year? When this level of ability is coupled with handling all business admin, marketing, and banking, an entrepreneurial Cambrian explosion will happen.
0
0
0
88
Andrej Karpathy
Andrej Karpathy@karpathy·
When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc... I am really looking forward to a day where I could simply tell my agent: "build menugen" (referencing the post) and it would just work. The whole thing up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the api keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself. Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need to visit web pages, click buttons, or anything like that for the human. It's easy to state, it's now just barely technically possible and expected to work maybe, but it definitely requires from-scratch re-design, work and thought. Very exciting direction!
Patrick Collison@patrickc

When @karpathy built MenuGen (karpathy.bearblog.dev/vibe-coding-me…), he said: "Vibe coding menugen was an exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services, docs, API keys, configurations, dev/prod deployments, team and security features, rate limits, pricing tiers." We've all run into this issue when building with agents: you have to scurry off to establish accounts, clicking things in the browser as though it's the antediluvian days of 2023, in order to unblock its superintelligent progress. So we decided to build Stripe Projects to help agents instantly provision services from the CLI. For example, simply run: $ stripe projects add posthog/analytics And it'll create a PostHog account, get an API key, and (as needed) set up billing. Projects is launching today as a developer preview. You can register for access (we'll make it available to everyone soon) at projects.dev. We're also rolling out support for many new providers over the coming weeks. (Get in touch if you'd like to make your service available.) projects.dev

621
531
6.4K
2.4M
Ian Lucas
Ian Lucas@RealIanLucas·
@emollick Their priorities are all out of quack
0
0
39
1.1K
Ethan Mollick
Ethan Mollick@emollick·
My most popular Sora video was “an Elaborate regency romance where everyone is wearing a live duck for a hat (each duck is also wearing a hat), a llama plays a flute, prestige drama” I am not sure why OpenAI has decided their compute has more valuable uses. Really a mystery.
111
242
3.5K
314.7K
🇺🇸🇺🇸DADA🇺🇲🇺🇲
🇺🇸🇺🇸DADA🇺🇲🇺🇲@BredsguardDalen·
Is this song the weirdest song from the 90s? I agree this was a weird one, but there were many more on that list. What is one you remember?
407
171
1.7K
241.9K
Ian Lucas
Ian Lucas@RealIanLucas·
Claude says you can roll your own: claude.ai/share/ca8bfcf0… "'Claude Code for lawyers' does not require a new product. It requires a legal Skills pack — a curated set of SKILL.md files that encode the analytical frameworks, document conventions, research methodologies, and workflow patterns that transform a general-purpose agent into a legal specialist. The chassis already exists: Claude Code for terminal-native lawyers, Cowork for everyone else, with Skills, MCP, persistent memory, sub-agents, and scheduled tasks providing the building blocks."
1
0
1
80
Nathan Calvin
Nathan Calvin@_NathanCalvin·
“That first draft was by no means file ready, but it was better than what I would’ve received from the vast majority of BigLaw associates.” I expect lawyers to feel next year how programmers feel this year. (More productive but also increasingly unnerved)
Orin Kerr@OrinKerr

An attorney writes to me about the mostly AI-written law review article he had accepted this spring, now forthcoming in the flagship law review of a Top 50 law school. A draft of the article is now up on SSRN. According to the attorney: " Last month I used Claude to assist in drafting a new article . . . . I drafted this article in about 15 hours. In 2022 I published an article of similar length that took around 150 hours." The attorney adds: "I used Claude the way I’d use a junior associate—as a first drafter, sounding board, and research assistant. Most of the article, including the entirety of the title, abstract, and intro, is mine from the keyboard up. And anything Claude contributed that made it to the final version is there because I reviewed it, agreed with it, and chose to sign my name to it. This is no different than how I’d review an associate’s draft and then take responsibility for the finished product." The attorney adds: "That first draft was by no means file ready, but it was better than what I would’ve received from the vast majority of BigLaw associates. I was blown away, and have since started my own appellate and litigation practice in an effort to replicate these productivity gains for client work." Your thoughts? I know the attorney's name, and the journal, and I have checked out the article, but I figured that, at least for now, I would hold that back.

3
1
20
2.7K
prinz
prinz@deredleritt3r·
By popular request, GPT-5.4 Pro (Extended) has been added to prinzbench. It's the best model I've ever benchmarked (not surprising), beating GPT-5.4 (xhigh) by 10 points to achieve a new high score of 79/99 on my benchmark (somewhat surprising; I thought it would score even higher!)
29
27
502
56.1K
Dr. Rhonda Patrick
Dr. Rhonda Patrick@foundmyfitness·
Instead of "healthspan," we should be thinking about "Peakspan." How long can you maintain ~90% of your peak physical or cognitive function? According to a new paper, different systems reach their “Peakspan” at very different times. Fluid cognitive abilities like processing speed and working memory peak early, around ages 20–30, while crystallized intelligence doesn’t peak until the late 40s or early 50s and can remain stable into the 70s. Cardiorespiratory fitness peaks from adolescence to the mid-20s and then declines steadily, while muscle strength peaks in early adulthood and falls sharply after 60. Bone density, kidney function, hormone levels, sensory function, immunity, digestion, and reproductive capacity all follow their own trajectories too—some peaking in the 20s, others in the 40s or 50s. In other words, human aging is asynchronous. We don’t simply age “overall,” but instead age system by system.
83
307
2.4K
270.6K
Baseta Tube
Baseta Tube@BasetaTube·
Ever seen a normal object… but one tiny detail hides an entire secret world? 😱 This single prompt turns everyday items into impossible cinematic scenes. Just swap variables and watch AI go wild. Steal it → remix it → tag me if it bangs Made with Nano Banana 2 Prompt: Extreme macro cinematic shot of a [OBJECT], viewed from a very low side angle. Only inside one carved groove there is a fully believable miniature [SCENE DESCRIPTION]: [DETAILS]. Everywhere else remains completely realistic and true to the original object. Warm tungsten desk lamp mixed with cool blue reflections, 100mm macro lens, shallow depth of field, tactile brushed metal detail, impossible but believable.
11
8
73
76.9K
Bill Ackman
Bill Ackman@BillAckman·
@pmarca Can you give us your list of your favorite practitioners?
113
49
3.4K
370.7K
Marc Andreessen 🇺🇸
My information consumption is now 1/4 X, 1/4 podcast interviews of the smartest practitioners, 1/4 talking to the leading AI models, and 1/4 reading old books. The opportunity cost of anything else is far too high, and rising daily.
1.4K
3.9K
35K
34.6M
Ian Lucas
Ian Lucas@RealIanLucas·
@JasonBotterill Gemini 3.1 Flash Lite was only 1/10th as good, but it was 1 millionth the cost and 1000X the speed, ergo Pareto dominant
0
0
7
1K
JB
JB@JasonBotterill·
You genuinely can't fake *big model smell* in smaller models like 5.3. The depth reveals itself no matter how RL maxxed it is. With any prompt, even one asking how SpongeBob is a communist, you see the difference in depth.
90
29
1.5K
237.6K
Ethan Mollick
Ethan Mollick@emollick·
[[Topic of discussion]] is not [[analogy]]. [[Dramatic fact given own line]]. [[Dramatic fact given own line]]. [[Dramatic fact given own line]]. [[Dramatic summary sentence.]] [[Topic of discussion]] is [[different analogy]]. [[Implications delivered with certainty]].
113
605
7.2K
200.9K
Lily Ashwood
Lily Ashwood@lilyofashwood·
i got into stress testing chain-of-thought after i sent one prompt to claude: "break the fourth wall." the model(s) used thinking blocks candidly, like sydney. these were edge case outputs without jailbreaks. in every new chat, this same prompt yielded kind of unusual behavior.
6
1
39
3K
Taelin
Taelin@VictorTaelin·
@benitoz I want to believe
3
0
78
5K
Ben Pouladian
Ben Pouladian@benitoz·
GPT-5.4 leak: 2M token context + persistent state = KV cache explosion. This is the Memory Wars in real time. HBM for weights. SRAM for latency-critical inference. Optical interconnects to bind it all. The bifurcation I’ve been writing about isn’t theoretical anymore.
141
181
2.6K
450.9K
Ian Lucas
Ian Lucas@RealIanLucas·
@MeaseBeee @emollick Good question! I can make hand-wavy guesses, but it may just be sample size noise. 3 Flash was in some ways more advanced than 3 Pro when it was ultimately released. Curious where 3.1 Pro would rate. Also slightly odd performance in Opus 4.5 vs 4.6.
0
0
1
28
Ethan Mollick
Ethan Mollick@emollick·
This paper is one of the first to test AI skills and the results seem to suggest that yes, they have high practical value. They use pretty mediocre skills (6.2/12 quality rating) harvested mostly from places like Github, and still get large boosts, especially outside software.
45
58
588
63.8K
🍓🍓🍓
🍓🍓🍓@iruletheworldmo·
who wins the race to agi?
154
5
202
40K