Benjamin Wiley
@prof_wiley
2K posts
Prof at Duke | CTO, Sparta Biomedical
Durham, NC · Joined March 2013
659 Following · 838 Followers

Benjamin Wiley @prof_wiley ·
@garrytan Dope. I also hate the Playwright MCP; made a separate agent just to use it, but CLI is better. Look forward to trying this.
0 replies · 0 reposts · 0 likes · 86 views
Garry Tan @garrytan ·
MCP sucks, honestly. It eats too much context window, you have to toggle it on and off, and the auth sucks. I got sick of Claude in Chrome via MCP and vibe coded a CLI wrapper for Playwright tonight in 30 minutes, only for my team to tell me Vercel already did it lmao. But it worked 100x better and was like 100 LOC as a CLI.
Morgan @morganlinton

The cofounder and CTO of Perplexity, @denisyarats just said internally at Perplexity they’re moving away from MCPs and instead using APIs and CLIs 👀

431 replies · 212 reposts · 3.8K likes · 1.3M views
Dwarkesh Patel @dwarkesh_sp ·
I've learned a ton from @reinerpope and @MikeGunter_. Very bullish on these guys.
Reiner Pope @reinerpope

We’re building an LLM chip that delivers much higher throughput than any other chip while also achieving the lowest latency. We call it the MatX One.

The MatX One chip is based on a splittable systolic array, which has the energy and area efficiency that large systolic arrays are famous for, while also getting high utilization on smaller matrices with flexible shapes. The chip combines the low latency of SRAM-first designs with the long-context support of HBM. These elements, plus a fresh take on numerics, deliver higher throughput on LLMs than any announced system, while simultaneously matching the latency of SRAM-first designs. Higher throughput and lower latency give you smarter and faster models for your subscription dollar.

We’ve raised a $500M Series B to wrap up development and quickly scale manufacturing, with tapeout in under a year. The round was led by Jane Street, one of the most tech-savvy Wall Street firms, and Situational Awareness LP, whose founder @leopoldasch wrote the definitive memo on AGI. Participants include @sparkcapital, @danielgross and @natfriedman’s fund, @patrickc and @collision, @TriatomicCap, @HarpoonVentures, @karpathy, @dwarkesh_sp, and others. We’re also welcoming investors across the supply chain, including Marvell and Alchip.

@MikeGunter_ and I started MatX because we felt that the best chip for LLMs should be designed from first principles with a deep understanding of what LLMs need and how they will evolve. We are willing to give up on small-model performance, low-volume workloads, and even ease of programming to deliver on such a chip.

We’re now a 100-person team with people who think about everything from learning rate schedules, to Swing Modulo Scheduling, to guard/round/sticky bits, to blind-mated connections—all in the same building. If you’d like to help us architect, design, and deploy many generations of chips in large volume, consider joining us.

11 replies · 9 reposts · 321 likes · 60K views
Benjamin Wiley @prof_wiley ·
Limits of the LLMs, as good as it gets.
0 replies · 0 reposts · 0 likes · 107 views
Benjamin Wiley @prof_wiley ·
Not worth it. DRAM is trivially cheap and light — 1 TB costs ~$2,500 and weighs ~1 kg, negligible on a 1,500 kg V3 satellite. The vacuum delay-line gives you ~42 GB across a 10-satellite ring, but at 83 ms loop latency vs. nanoseconds for onboard DRAM. The engineering cost for ring-mode firmware, WDM upgrades, and deterministic timing would run $50–500M to replace $10K of memory per satellite.

For models too large for one satellite, you just shard across satellites using the ISLs as a normal network interconnect, which SpaceX is already planning for their orbital data centers.

The one interesting angle is raw bandwidth — a 4 Tbps ISL matches HBM bandwidth — but you still need to source the weights from somewhere (flash, DRAM) before injecting them into the beam, so you’ve just relocated the bottleneck. Energy savings from avoiding DRAM refresh (~300W for 1 TB) are real but modest against transceiver and amplifier power at each relay node.

The better investment is the same engineering dollars spent on space-hardened HBM or flash-pipelined inference (Carmack’s actual practical recommendation) directly on each orbital datacenter satellite.
0 replies · 0 reposts · 0 likes · 116 views
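The ~42 GB and 83 ms figures in the tweet are consistent with a simple bandwidth-delay product. A quick back-of-envelope sketch, assuming vacuum light speed and a single 4 Tbps inter-satellite link (the ring circumference is inferred from the stated loop time, not given in the tweet):

```python
# Back-of-envelope check of the orbital delay-line memory numbers.
# Assumptions (not stated in the tweet): light in vacuum c = 3.0e8 m/s,
# one 4 Tbps inter-satellite link, 83 ms loop latency for a 10-satellite ring.

C_VACUUM = 3.0e8       # m/s, speed of light in vacuum
LINK_RATE = 4e12       # bits/s, 4 Tbps inter-satellite link
LOOP_LATENCY = 0.083   # s, stated round-trip time around the ring

# Ring circumference implied by the loop latency
ring_km = C_VACUUM * LOOP_LATENCY / 1e3            # ~24,900 km

# Data "stored" in flight = bandwidth-delay product
in_flight_gb = LINK_RATE * LOOP_LATENCY / 8 / 1e9  # ~41.5 GB

print(f"ring circumference ~ {ring_km:,.0f} km")
print(f"data in flight ~ {in_flight_gb:.1f} GB")
```

The ~42 GB claim is exactly the bandwidth-delay product of one 4 Tbps link held in flight for one 83 ms loop.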
Elon Musk @elonmusk ·
@ID_AA_Carmack Interesting idea. You could slow down light even more and increase data stored per km by using higher refractive index materials. Or just use vacuum, which costs nothing, over a longer distance … 🤔
1.2K replies · 854 reposts · 10.6K likes · 1.3M views
John Carmack @ID_AA_Carmack ·
256 Tb/s data rates over 200 km distance have been demonstrated on single mode fiber optic, which works out to 32 GB of data in flight, “stored” in the fiber, with 32 TB/s bandwidth.

Neural network inference and training can have deterministic weight reference patterns, so it is amusing to consider a system with no DRAM, and weights continuously streamed into an L2 cache by a recycling fiber loop. The modern equivalent of the ancient mercury echo tube memories. You would need to pipeline a bunch of them to implement modern trillion parameter models, but fiber transmission may have a better growth trajectory than DRAM does today, so it might someday become viable.

Much more practically, you should be able to gang cheap flash memory together to provide almost any read bandwidth you require, as long as it is done a page at a time and pipelined well ahead. That should be viable for inference serving today if flash and accelerator vendors could agree on a high speed interface.
468 replies · 694 reposts · 10.2K likes · 1.7M views
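The 32 GB / 32 TB/s figures above check out with the same bandwidth-delay arithmetic. A minimal sketch, assuming a group velocity in fiber of roughly 2/3 c (~2.0e8 m/s, an assumption not stated in the tweet):

```python
# Back-of-envelope check of the fiber delay-line numbers.
# Assumption (not in the tweet): group velocity in fiber ~ 2.0e8 m/s (~2/3 c).

FIBER_VELOCITY = 2.0e8   # m/s, approximate speed of light in silica fiber
LINK_RATE = 256e12       # bits/s, demonstrated 256 Tb/s aggregate rate
DISTANCE = 200e3         # m, 200 km fiber span

transit_s = DISTANCE / FIBER_VELOCITY            # ~1 ms transit time
in_flight_gb = LINK_RATE * transit_s / 8 / 1e9   # bytes held "in flight"

# Reading the whole loop back each transit gives the effective bandwidth
effective_read_tbps = in_flight_gb * 1e9 / transit_s / 1e12

print(f"transit time = {transit_s * 1e3:.1f} ms")
print(f"data in flight = {in_flight_gb:.0f} GB")
print(f"effective read bandwidth = {effective_read_tbps:.0f} TB/s")
```

One millisecond of transit at 256 Tb/s is 32 GB circulating in the fiber, and recycling that 32 GB every millisecond yields the quoted 32 TB/s.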
Benjamin Wiley reposted
Oliur @UltraLinx ·
Can you read 900 words per minute? Try it.
4.9K replies · 29.6K reposts · 212.1K likes · 31.5M views
Scott Adams @ScottAdamsSays ·
A Final Message From Scott Adams
[images attached]
13.2K replies · 32.2K reposts · 193K likes · 43M views
Benjamin Wiley @prof_wiley ·
“Authority is a red herring.” – Claude
0 replies · 0 reposts · 0 likes · 60 views
Benjamin Wiley @prof_wiley ·
Saw a lot of posts about Wispr Flow. Tried it, deleted it. Then I set up the free voice dictation tool on my Mac. Loving it.
0 replies · 0 reposts · 1 like · 79 views
Jeffrey Emanuel @doodlestein ·
My answer to this very reasonable and important question from the great @badlogicgames about my agent coding process:

“Could you share a real world project + the plan you came up with? I've been building software for over 25 years, and I was never able to do "hyper-waterfall", as in: preplan everything to a detail level that allows mechanical execution like that. As you work on a project, problems you didn't anticipate pop up. I don't understand how that is solved.”
Jeffrey Emanuel @doodlestein

Here’s a recent example from my cass memory project (see quoted post for the whole process which I posted about live as I did it): github.com/Dicklesworthst…

Once you’ve implemented the entire plan (after turning it into beads and so forth), you have a version 1 that should be usable if you’ve done things well. It will probably require some bug fixes and UI polishing, but that’s just part of my workflow.

Then after using it, you might decide that you missed some things or have ideas for other features. Well, then you create another big plan, like I did here in the same project; nothing says you can only ever do one plan and then that’s it: github.com/Dicklesworthst…

5 replies · 0 reposts · 16 likes · 5.4K views
Jeffrey Emanuel @doodlestein ·
I've been working on various web apps using Claude Code and Codex from a remote Ubuntu machine via SSH, and every time I see a problem or glitch on my iPhone browser on the live site, I like to take a screenshot of it on my iPhone. If you're using CC or Codex locally on your Mac, it's not too bad to use that. But if the machine is a remote one, it's super annoying to deal with.

I was wasting time sending myself the images on Telegram or email and then manually copying them over to the remote machine using scp (Cyberduck on Mac). Finally, I realized that the cumulative time wasted on this exceeded the time to make a custom CLI tool just for this, so introducing giil (get_icloud_image_link). You can get it here: github.com/Dicklesworthst…

Even though it's a simple thing, I tried to do it in the slickest and best possible way. Hopefully you'll like all the polishing.
[image attached]
26 replies · 9 reposts · 156 likes · 26.3K views
Jeffrey Emanuel @doodlestein ·
I really pulled out all the stops for the new version of beads_viewer (bv). The original version of bv was made in a single day and was just under 7k lines of Golang. This new version is… 80k lines. I added an insane number of great features. Your agents will love it (you, too).
[images attached]
41 replies · 35 reposts · 590 likes · 37.7K views
Benjamin Wiley @prof_wiley ·
And when you think about it, automation is just a way to increase efficiency.
0 replies · 0 reposts · 1 like · 74 views
Benjamin Wiley @prof_wiley ·
There are only three paths to greater physical wealth: (1) automation - doing more with less labor, (2) efficiency - doing more with less materials or energy inputs, (3) expansion - more material/energy inputs.
1 reply · 0 reposts · 2 likes · 81 views
Benjamin Wiley @prof_wiley ·
To have continuous improvement, you first need a process.
0 replies · 0 reposts · 5 likes · 132 views
Benjamin Wiley @prof_wiley ·
while (alive) { try(); experience(); reflect(); learn(); }
0 replies · 0 reposts · 2 likes · 92 views