alex rudloff

401 posts

@alexrudloff

Business + Product + Tech + Art

Beach in FL / Mountains in NC · Joined October 2006
884 Following · 6.5K Followers
alex rudloff @alexrudloff ·
I’m not going to blindly believe any LLM intelligence breakthrough unless it’s made by an actress with one GitHub commit
0 replies · 0 retweets · 0 likes · 13 views
alex rudloff @alexrudloff ·
This is becoming a familiar pattern. Thanks codex.
[attached image]
0 replies · 0 retweets · 0 likes · 36 views
alex rudloff retweeted
dealign.ai @dealignai ·
The secret? Native macOS memory compression. Used with the perfect timing, you can compress unused routed experts. I'm now getting a literal 70% decrease in RAM usage for every single model, with no hit to coherency or speed. I have a feeling this is going to be really big.
6 replies · 2 retweets · 42 likes · 1.1K views
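The mechanism this post leans on is at least observable: macOS reports memory-compression activity through `vm_stat`. A minimal sketch of reading those counters, saying nothing about the model-level claim — the sample output below uses made-up numbers purely for illustration:

```python
# Parse macOS `vm_stat`-style output to see how much RAM the compressor
# currently occupies and how many logical pages it packs per physical page.
import re

# Sample `vm_stat` output (real field names, illustrative numbers).
SAMPLE = """\
Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free:                              102345.
Pages occupied by compressor:            524288.
Pages stored in compressor:             1835008.
"""

def compressor_stats(vm_stat_text):
    """Return (ram_used_by_compressor_gb, compression_ratio)."""
    page_size = int(re.search(r"page size of (\d+) bytes", vm_stat_text).group(1))
    occupied = int(re.search(r"Pages occupied by compressor:\s+(\d+)", vm_stat_text).group(1))
    stored = int(re.search(r"Pages stored in compressor:\s+(\d+)", vm_stat_text).group(1))
    ram_gb = occupied * page_size / 2**30
    ratio = stored / occupied  # logical pages packed per physical page
    return ram_gb, ratio

ram_gb, ratio = compressor_stats(SAMPLE)
print(f"{ram_gb:.1f} GB of RAM holds {ratio:.1f}x that much logical memory")
```

On a real Mac you would feed `compressor_stats` the live output of `vm_stat`; the post's 70% figure would only hold if the compressed pages really were cold expert weights that never need fast decompression.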
alex rudloff retweeted
Lisa Forte @LisaForteUK ·
Learning lessons from Jurassic Park
[attached image]
56 replies · 873 retweets · 10.4K likes · 570.5K views
alex rudloff retweeted
Sandro @pupposandro ·
We just released something new: Luce PFlash

Long-context prefill is a silent killer for throughput speed. llama.cpp takes ~257 seconds to prefill 128K tokens of Qwen3.6-27B on a single RTX 3090. So we tried to solve the problem.

A small Qwen3-0.6B drafter loads in-process, scores token importance across the whole prompt, and the heavy 27B target only prefills the spans that matter. 128K prompt in 24.8 seconds, ~10.4x faster TTFT, NIAH retrieval preserved at every measured context.

It is a clean C++/CUDA port of FlashPrefill wired through Block-Sparse Attention, with a custom Qwen3-0.6B BF16 forward so drafter and target share one ggml allocator. The whole thing is a single daemon command (compress) in front of the existing dflash spec-decode stack.

More details here: github.com/Luce-Org/luceb…
[attached GIF]
Quoted post — Sandro @pupposandro: x.com/i/article/2050…
39 replies · 90 retweets · 707 likes · 113K views
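The core idea — a cheap drafter scores every prompt token so the expensive target only prefills the spans that clear a threshold — can be sketched in a few lines. This is a toy illustration, not Luce's actual implementation; the function names and the linear-cost assumption are mine:

```python
# Toy sketch of drafter-guided selective prefill (not Luce's real code).
def select_spans(scores, threshold):
    """Merge indices of important tokens into contiguous [start, end) spans."""
    spans, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(scores)))
    return spans

def prefill_speedup(scores, threshold):
    """Ideal TTFT speedup under the naive assumption that prefill cost
    is linear in the number of tokens kept."""
    kept = sum(end - start for start, end in select_spans(scores, threshold))
    return len(scores) / max(kept, 1)

# Drafter scores for a 10-token prompt: only two spans matter.
scores = [0.9, 0.8, 0.1, 0.1, 0.1, 0.1, 0.7, 0.9, 0.1, 0.1]
print(select_spans(scores, 0.5))    # [(0, 2), (6, 8)]
print(prefill_speedup(scores, 0.5))  # 2.5
```

Under this naive linear model, the reported 257 s → 24.8 s (~10.4x) would correspond to the target prefilling roughly a tenth of the prompt tokens; real attention costs are not linear, so the actual mechanics are more involved.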
alex rudloff retweeted
AboveSpec @above_spec ·
"You need a 24 GB GPU for serious local LLMs in 2026." Everyone repeats this. It's not true anymore.

Just ran a 35B-parameter model on an RTX 4060 Ti 8 GB:
• 41 tok/s at 16k context
• 24 tok/s at 200k context

Recipe + benchmarks below 🧵
[attached image]
134 replies · 232 retweets · 2.8K likes · 273.7K views
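The thread's actual recipe isn't reproduced above, but the standard way to run a model bigger than VRAM is heavy quantization plus partial layer offload to the GPU. A back-of-the-envelope sketch — every number here (layer count, bit width, reserve) is my own illustrative assumption, not AboveSpec's:

```python
# Estimate how many transformer layers of a quantized model fit in VRAM.
def layers_on_gpu(n_params_b, n_layers, bits_per_weight, vram_gb, reserve_gb=1.5):
    """How many equal-sized layers fit after reserving VRAM for
    activations and KV cache."""
    weights_gb = n_params_b * bits_per_weight / 8  # 1B params at 8 bits ≈ 1 GB
    per_layer_gb = weights_gb / n_layers
    budget_gb = vram_gb - reserve_gb
    return min(n_layers, int(budget_gb / per_layer_gb))

# 35B model, 64 layers, 3-bit quantization, 8 GB card:
print(layers_on_gpu(35, 64, 3.0, 8.0))  # 31 — roughly half the layers on GPU
```

So even at 3 bits, a 35B model (~13 GB of weights) only half-fits in 8 GB; the remaining layers run from system RAM, which is consistent with throughput dropping as context grows.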
alex rudloff retweeted
Daniel Jeffries @Dan_Jeffries1 ·
All the Doomers and hawks are lining up behind this distillation "attack" farce because they want to see open source banned. It's really as simple as that. They want to take away your right to choose, and take away businesses' rights to fine-tune and make your products cheaper and better.

The end state here, if we let these short-sighted people win, is a horrible place for America: they will look to ban Chinese models under the guise of national security and conveniently leave only proprietary American companies standing in the USA. The only real "attack" happening is closed-source companies attacking open source the same way Microsoft once attacked Linux to create regulatory capture. "If you can't win in the market, win in Washington" is their strategy.

Do not be fooled by this regulatory capture and saber rattling nonsense. It's a bait and switch. The goal is to rob you of choice. That's it. These short-sighted policies will make America weaker, not stronger.

These are the very folks whose shoot-us-in-the-foot policies lost NVIDIA its market share in China, driving it to basically 0%, while kickstarting the moribund Chinese chip ecosystem. It was dead in the water, and now it has awakened from its deep slumbers. Old state-sponsored dinosaurs are reborn as emerging chip powerhouses. The demand for Chinese chips is accelerating, and it will only get stronger.

When Jensen is proven right a few years from now (he's the best long-term thinker in business today), and you have hundreds of cheap Chinese models running optimized on Chinese chips, and those models are hard to run on NVIDIA hardware, you can thank these folks. If you're banned in the USA from using these models and these chips, do you think the rest of the world will be? Nope. They'll happily adopt the cheaper, faster, good-enough models that we kickstarted with our short-sightedness.

1 billion people in the West will be banned from open models and stuck using closed/gated/sluggish/censored/surveilled models that destroy your privacy, while 6 billion other people use the now-dominant Chinese ecosystem and your NVIDIA retirement shares lose money. When you can't use open source anymore because it gets banned for Americans, you can thank these short-sighted, foolish folks. When your API bill is a billion dollars and burns your budget in three months instead of 12, you can thank these folks. When all your intimate, personal data flows through a few tight gateways and choke points mandated by law, you can thank these folks.
Quoted post — Chris McGuire @ChrisRMcGuire:

Sorry but that just isn’t true—distillation attacks are illicit activity, not an industry standard. They are against the terms of service of all frontier AI labs. There is a reason OpenAI, Anthropic, and Google all put out reports warning about it: none of them do it.

17 replies · 38 retweets · 154 likes · 47.9K views
Ivan Raszl @iraszl ·
@alexrudloff @jun_song It’s based on discussion with people who tried running locally. Not AI. The issue apparently is that you can’t dedicate all the memory in your Mac to the LLM, because at the very least you need to run the OS and typically a few other apps.
1 reply · 0 retweets · 0 likes · 131 views
Ivan Raszl @iraszl ·
Thinking of running a local LLM on a new MBP? Here is the level of intelligence you can get with various memory configurations on open models:
🐹 16–24GB RAM → ≈ GPT-3.5
🐕 32–48GB RAM → ≈ higher-end GPT-3.5
🐅 64GB RAM → ≈ lower-end GPT-4
🐉 96–128GB RAM → ≈ mid-tier GPT-4
All still below newer GPT or Claude models.
[attached image]
51 replies · 6 retweets · 171 likes · 38.6K views
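One way to sanity-check tiers like these is plain footprint arithmetic: quantized weights need roughly params × bits ÷ 8 bytes, and, as the reply above notes, the OS and other apps claim part of the unified memory. This is my own rough sketch; the 6 GB overhead figure is an assumption, and KV cache would eat further into the budget:

```python
# Rough check: do the quantized weights of a model fit in a Mac's unified
# memory once OS overhead is subtracted?
def fits_in_ram(n_params_b, bits_per_weight, ram_gb, os_overhead_gb=6.0):
    """True if the quantized weights fit after reserving RAM for the OS."""
    weights_gb = n_params_b * bits_per_weight / 8
    return weights_gb <= ram_gb - os_overhead_gb

# A 70B model at 4-bit needs ~35 GB of weights:
print(fits_in_ram(70, 4, 48))  # True  — fits on a 48 GB Mac
print(fits_in_ram(70, 4, 32))  # False — too big for a 32 GB Mac
```

This lines up with the shape of the tier list: each RAM step roughly doubles the largest model (at a fixed quantization) you can hold resident.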
alex rudloff retweeted
Gergely Orosz @GergelyOrosz ·
OpenClaw - the agentic software spreading like wildfire - was built on top of Pi, a minimalist, self-modifying agent. I sat down with Pi's creator, @badlogicgames, and longtime Pi user (+ the creator of Flask) @mitsuhiko to talk Pi, and their (very grounded!) takes on building with AI.

Timestamps:
00:00 Intro
07:30 How Mario, Armin, and Peter Steinberger met
15:15 How 30 dev teams use AI agents: learnings
21:50 The importance of judgment
24:26 Challenges when non-engineers write code
28:30 Downsides of over-automation
32:18 Pi
48:09 OpenClaw + Pi
50:54 “Clankers”
57:32 Open source and AI
1:00:22 Complexity as the enemy
1:02:50 Building an AI-native startup
1:11:52 “Slow the F down”
1:16:40 MCPs vs. CLI
1:25:03 Predictions and staying up to date

• YouTube: youtu.be/n5f51gtuGHE
• Spotify: open.spotify.com/episode/1fDw9c…
• Apple: podcasts.apple.com/us/podcast/bui…

Brought to you by:
• @statsig – The unified platform for flags, analytics, experiments, and more. statsig.com/pragmatic
• @SonarSource – The makers of SonarQube, the industry standard for code verification and automated code review. Try it out for yourself. sonarsource.com/plans-and-pric…
• @WorkOS – WorkOS gives you APIs to ship enterprise features – SSO, directory sync, RBAC, audit logs – in days, not months. Visit WorkOS.com to learn more.

---

Three parts I found especially interesting in this discussion:

1. New trend: AI makes it harder for senior engineers to reject pointless complexity. Historically, senior engineers kept software complexity at bay simply by saying “no” a lot. But Armin observes that these days, more junior engineers and product managers deploy agent-scripted counterarguments when a senior colleague kicks an idea to the curb. This makes decision-making exhausting, and more bad ideas make it into production as a result.

2. It should be MUCH easier to build specialized tools for specific tasks. Different projects need different harness types because, as Mario points out, the same hammer is not ideal for every single construction job. As such, Pi is built with the goal of allowing the creation of specialized harnesses. It can modify itself so that a user can create the bespoke harness needed for any task. Mario believes it’s a preview of how self-modifiable software might look in the future.

3. Automation bias is one of the biggest risks of working with AI agents. Once devs confirm that an AI agent can produce acceptable code, they start to review its output less often, even though agents can – and do! – produce slop. Mario advises being far more sceptical with agents, and cautions that the quality of their output isn’t guaranteed, however well they performed previously.
20 replies · 115 retweets · 1.2K likes · 169.5K views
alex rudloff retweeted
Aaron Levie @levie ·
Will keep saying this, but software jobs aren’t going away. Agents are the single biggest form of leverage for anyone technical in history. There has probably never been a better time to be technical in terms of being able to accomplish something solo, in a team, or as a company.

We think that most of the world’s software has already been built and that agents will just reduce work from an existing pie. In fact, we are about to experience 100X more software than before. Think about how many apps you regularly use that need to get better. How many legacy on-prem systems have to get replatformed for the cloud. How many SMBs never could hire developers. How many security issues are about to be uncovered and need to get patched. How many IT organizations are about to bring automation to workflows they never could have automated. How much data is about to be processed and connected in most organizations.

This is all what the agents will be working on. And every one of those agents will need a person to kick them off, manage their work, orchestrate them, and get their output into a workable and useful form. That person will generally need to be technical (or become technical quickly), and this will create a huge amount of opportunity for anyone up to the task.
Quoted post — Shay Boloor @StockSavvyShay:

$AMZN AWS CEO pushed back on the idea that AI is killing software jobs by saying Amazon is hiring as many developers as ever. He said AI agents are “exploding” across every industry & moving faster than expected changing the developer job rather than eliminating it.

61 replies · 92 retweets · 628 likes · 104.8K views
alex rudloff retweeted
Brooks Otterlake @i_zzzzzz ·
This is just like being alive in the 1600s when they got good at making complicated clocks and deduced that every complicated thing in the universe probably functioned exactly like a clock
Quoted post — Dwarkesh Patel @dwarkesh_sp:

There's a quadrillion-dollar question at the heart of AI: why are humans so much more sample-efficient than LLMs?

There are three possible answers:
1. Architecture and hyperparameters (aka transformer vs whatever ‘algo’ cortical columns are implementing)
2. Learning rule (backprop vs whatever the brain is doing)
3. Reward function

@AdamMarblestone believes the answer is the reward function. ML likes to use pretty simple loss functions, like cross-entropy. These are easy to work with. But they might be too simple for sample-efficient learning. Adam thinks that, in humans, the large number of highly specialised cells in the ‘lizard brain’ might actually be encoding information for sophisticated loss functions, used for ‘training’ the more sophisticated areas like the cortex and amygdala.

Like: the human genome is barely 3 gigabytes (compare that to the TBs of parameters that encode frontier LLM weights). So how can it include all the information necessary to build highly intelligent learners? Well, if the key to sample-efficient learning resides in the loss function, even very complicated loss functions can still be expressed in a couple hundred lines of Python code.

107 replies · 1K retweets · 13.1K likes · 805.7K views
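The closing claim in the quoted post — that even very complicated loss functions fit in a couple hundred lines — is easy to make concrete. Here is a toy composite loss entirely of my own construction, mixing cross-entropy with two hand-crafted shaping terms in a handful of lines:

```python
# Toy composite loss: cross-entropy plus an entropy bonus (discourage
# overconfidence) and a margin bonus (reward separating top-1 from top-2).
# The point is only that elaborate reward shaping stays tiny as code.
import math

def composite_loss(probs, target, confidence_penalty=0.1, margin_bonus=0.05):
    """Cross-entropy minus weighted entropy and top-1/top-2 margin terms."""
    ce = -math.log(probs[target])
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    top2 = sorted(probs, reverse=True)[:2]
    margin = top2[0] - top2[1]
    return ce - confidence_penalty * entropy - margin_bonus * margin

probs = [0.7, 0.2, 0.1]
print(round(composite_loss(probs, 0), 4))
```

A genome-scale version could stack many such terms, each tuned to a different drive, and still be minuscule next to terabytes of weights — which is the asymmetry the post is pointing at.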