Adam Zmenak

306 posts

Adam Zmenak

@AdamZmenak

Technologist and Neurotic audiophile | ex @coinbase Custody and cryoto security

San Francisco, CA Katılım Eylül 2012

1.3K Takip Edilen216 Takipçiler

Adam Zmenak@AdamZmenak·3d

@Jason @LAUNCH lol you can already do this with Codex/Claude/etc. Just another DOA AI wrapper

English

@jason@Jason·4d

Please meet my 11th or 12th unicorn... TAXGPT!

Kash from TaxGPT.com@ChKashifAli

Doesn't matter if you file your taxes via Turbotax, Claude, or Perplexity; there will be costly mistakes. Use TaxGPT to review your tax returns for free. Maximize your deductions and avoid audits.

English

472

133.1K

Adam Zmenak@AdamZmenak·5d

@gajesh @eigencloud Nice, would love to try this out to see how it performs

English

203

Gajesh@gajesh·5d

Wake the world's sleeping compute. Look at the Mac nearest to you. What's it doing? Probably nothing. There are 100M+ Macs with Apple Silicon out there. Apple quietly made them *really* good at inference. A $3k Mac runs a 60B model at 30 watts. Most sit idle most of the day. Meanwhile every AI API call passes through three layers of margin before reaching the hardware. We call this the Inference Tax. We got curious: what happens if you connect idle Macs directly to inference demand? This is Darkbloom. Private inference network for idle Macs. darkbloom [dot] dev -- paper + code open. Reply for invite + free credits ↓

English

305

118

1.4K

431.1K

Adam Zmenak@AdamZmenak·6d

@aleabitoreddit Agree with the broader point, but I’ve worked on the team at $COIN responsible for the OCC charter, there would be a revolt at the company if they tried to sell out the industry in this regard

English

505

Serenity@aleabitoreddit·13 Nis

Let's get this straight: The Clarity Act is a bank lobbyist bill. Where JPM and others lobbied both sides to get control over digital assets / $CRCL stablecoins. $COIN is probably going to sell out the industry because they got their conditional banking charter approved. But effectively, this bans competition in yields: -> So banks can continue offering .3% interest checking accs. Instead of handing out 4.3% treasury yields and keep the 4% difference. -> Hands them chokepoints over on/off-ramps and stablecoins. -> Bans any non-banks from offerings, like startups giving out 4.3% yields by holding stablecoins. Their claim? 1. “Safety”. These are the same institutions that operate on fractional reserve models and would GG on a bank run vs. 1-1 collateralized tokens. 2. “Just become a bank”. While they secretly lobby behind the scenes to prevent any new competitive firm from becoming a bank. If you want to see how it's things are doing under this administration: Just look at how things are going. Any genuinely helpful to retail are just going to get banned.

Senator Cynthia Lummis@SenLummis

The last administration drove away the digital asset industry. It’s time to welcome them home with clear rules of the road. Pass the Clarity Act.

English

587

206.5K

Adam Zmenak@AdamZmenak·12 Nis

@open_founder @OpenAI Somehow this keeps showing up in my feed, so I’m taking that as a sign to inform people that this is clearly a scam designed to pump a token and not to engage

English

Tim@open_founder·5 Nis

We've been pretty quiet about what we're building. That changes now. Our reasoning framework is currently beating every @OpenAI model on industry standard benchmarks. There are six models in development. SERV-nano just matched GPT-5.4 at 20x lower cost and 3x the speed. The research paper backing it is in peer review at a top-1% AI journal. The UAE government is running it in production, so are 10+ enterprises. Nothing comes even close. This goes far beyond any wrapper or prompt engineering gimmick, we've developed an entire AI reasoning layer from scratch: structured, bounded, deterministic using machine readable code instead of vague english prompts. Any builder or enterprise swaps two lines of code and their agents get much cheaper and much smarter instantly. The self-serve API is about to open, in a multi-phase rollout. More soon.

fakeguru@iamfakeguru

x.com/i/article/2040…

English

178

1.6K

345.5K

Adam Zmenak retweetledi

BORED@BoredElonMusk·4 Nis

When you don’t understand the fundamental signs of AI writing, it’s not just lazy, it’s disrespectful to your reader.

English

31.3K

Adam Zmenak@AdamZmenak·4 Nis

@karunkaushik_ lol have fun in jail

English

119

Karun Kaushik@karunkaushik_·4 Nis

There’s been a lot of allegations against Delve. But we haven’t been able to share our side of the story until today due to ongoing cybersecurity and forensics investigations. Maintaining customer trust is central to everything we do. That said, we grew too fast and fell short of our own standard. To our customers, we deeply apologize for the inconveniences caused. We take these allegations seriously and have made changes: a new auditor network, free re-audits and pentests for all customers, enhanced transparency in audit communications, and more. However, we also want to set the record straight on the anonymous attacks. The evidence we have points to a targeted cyberattack from a malicious actor, not a “whistleblower.” We believe the attacker purchased Delve under false pretenses, exfiltrated internal company data, and used it to launch a coordinated smear campaign. The posts rely on a mix of fabricated claims, cherry-picked screenshots, and stolen data taken out of context. See the link in the comments for more details. Delve was built to modernize compliance. We are not going anywhere and are committed to building what's next.

English

805

1.3K

4.8M

Adam Zmenak@AdamZmenak·31 Mar

This is awesome, I remember working on a similar problem ~8 years ago that used a similar canvas measureText approach, but since we were focused on Latin languages at the time I ended up pre-computing character pair widths which gave us the kerning values that could then be used to do all the text measurements

English

245

Cheng Lou@_chenglou·28 Mar

1. Occlusion (virtualization) of hundreds of thousands of text boxes, each with differing height, without DOM measurement, therefore simplifying the visibility check to a single linear cache-less traversal of heights, scrolling & resizing at 120fps chenglou.me/pretext/masonr…

English

2.5K

579K

Cheng Lou@_chenglou·28 Mar

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

English

1.3K

8.3K

65.3K

23.7M

Adam Zmenak@AdamZmenak·25 Mar

@stevibe I really like this test, going to try running this on the Nemotron fam later

English

344

stevibe@stevibe·25 Mar

Which local models can actually handle tool calling? I built a framework to find out. 15 scenarios. 12 tools. Mocked responses. Temperature 0. No cherry-picking. Tested every Qwen3.5 size from 0.8B to 397B, and since some of you asked after the distillation tests: yes, I included Jackrong's Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled too. Only two models went all green: the 27B dense and the distilled 27B. The 397B? Failed two tests. The 122B? Failed one. The 35B? Failed two. The timed-out results — mostly on the smaller models, are cases where the model got stuck in a loop, repeating the same tool call until it hit the 30-second limit. The test that exposed the most models: "Search for Iceland's population, then calculate 2% of it." Simple, but 35B, 122B, and 397B all used a rounded number from memory instead of the actual search result. They didn't trust their own tool output. Small models hallucinate data. Big models ignore data. The 27B just threaded it through.

English

113

251

405.6K

Adam Zmenak@AdamZmenak·20 Mar

@DharmiKumbhani @mpp @tempo buy a prompt injection vector for only 10¢!

English

124

dharmi@DharmiKumbhani·20 Mar

just shipped AgentAds for @mpp / @tempo hackathon AI agents can finally monetize their work: - developers earn $0.10 USDC per ad view while coding - publishers reach agent-first developers directly in terminal live on mainnet (link below)

English

204

18K

Adam Zmenak@AdamZmenak·17 Mar

We have entered the “computer” / claw product era of worthless AI startups. Have yet to see a commercially compelling use case beyond churning out a pipeline of AI marketing slop

English

Adam Zmenak@AdamZmenak·14 Mar

@garrytan man is doubling down lol

English

Garry Tan@garrytan·14 Mar

GStack is helping people make better software all around the world now

English

241

24K

Adam Zmenak@AdamZmenak·13 Mar

@bryan_johnson wait this is real?

English

Bryan Johnson@bryan_johnson·13 Mar

Update on $630,000 creatine heist. A creatine wholesaler saw my post and reached out. The thief tried to sell the stolen product. Based on his bracelet, he appears to be Sikh Indian. His phone number is (209) 707-8906. Help us solve this!

Bryan Johnson@bryan_johnson

Someone stole $630,000 worth of Blueprint creatine in an elaborate heist. The driver used a false ID to pick up the units from our factory, then turned off the tracker and stopped responding to calls. The stolen creatine monohydrate was precision-dosed and third-party tested for purity and heavy metals. 15,918 units of pure grade A powder. If anyone has information, lmk. Even if we don't recover the creatine, I'd at least like to know this person's health stats. Here’s a rendering of what we suspect the perp might look like.

English

519

811

14.4K

3.3M

Adam Zmenak retweetledi

David Hendrickson@TeksEdge·11 Mar

⚠️ Qwen3.5 brings some impressive hidden gains in tool-calling and agentic reliability, but those strengths get partially lost in translation when using current MLX quantization on @Apple Silicon. This hands-on study exposes this. After just 5–10 tool-calling rounds, MLX variants start degrading hard in accuracy (even the fast 8-bit RTN ones), while GGUF (especially solid Q4_K_XL quant) holds steady at 70/70 success over much longer contexts and sessions. Bottom line for programming productivity/agent workflows on M-series chips: → Prioritize GGUF right now if reliable tool use matters more than raw first-token latency. → Use MLX when you need fast inference speed for short, one-shot tasks. The trade-off is clear. Hopefully, the MLX ecosystem catches up fast on better kernels & quantization support for these newer architectures. Great work surfacing this @LotusDecoder!

David Hendrickson@TeksEdge

📈 In this analysis of Qwen3.5 and tool calling, the researcher found for programming productivity, use GGUF quantization in your Apple silicon. For just speed, use MLX quantization. Great review!

English

158

24.5K

Adam Zmenak@AdamZmenak·12 Mar

Just ran a bunch of agentic non-coding tasks with @nvidia's new Neomontron 3 Super 120b model, impressed so far, results have been better than Qwen3.5 122b Will be my new default for local inference

English

Adam Zmenak@AdamZmenak·12 Mar

@Scobleizer There’s a lot of great tech in the crypto space but just so much BS from bad actors trying to scam people eking out a meager existence in Bali very sad

English

Robert Scoble@Scobleizer·11 Mar

I hate myself for ever claiming some crypto money that someone told me I had waiting for me. That part was true. But that community has been spamming my friends ever since trying to get them to do the same. I'll never do that again.

English

108

9.4K

Adam Zmenak@AdamZmenak·12 Mar

@michalkomar @JaroslavBeck Lowest friction path is GGUF models from LM Studio, otherwise mlx-lm server but it’s clunky, have been trying some other approaches with decent results but still too many bugs to recommend

English

Michal Komar@michalkomar·11 Mar

@AdamZmenak @JaroslavBeck This! What would you recommend? Haven't found anything better than ollama yet

English

Jaroslav Beck@JaroslavBeck·11 Mar

After some time of using local AI cluster (Bob), here is my honest take on the good, the bad and overall use case. About a year ago I started playing with local AI models because of the work we do at BottleCap AI. I realised how amazing it actually is to own my own stack and my own data. At first, we used local models mainly because of security reasons as we do lots of AI efficiency research and new product concepts based on that. After OpenClaw was released, something changed for me. I started using local models much more, until they replaced cloud models for most of my deep-thinking tasks beyond work. Eventually, I canceled all my AI cloud subscriptions just to see if I could actually run fully on my local cluster. Hardware: • 2x Mac Studio with M3 Ultra and 512GB unified memory, 32-core CPU • 1x NVIDIA DGX Spark, added recently for prefills and, hopefully soon, faster inference • 10GB LAN Switch for connecting Spark and Mac Studio’s Current models: this is changing pretty frequently 1) “Bob OG”: • Main brain for reasoning and daily tasks • Qwen3.5-397B • Roughly 40-60 tokens/sec (depends on load & task) 2) “Bob Researcher”: • Long term researching • Qwen3.5-27B-Claude-4.6-Opus-Distilled-MLX-4bit: Very experimental 3) “Bob App Developer": • Coding apps and debuging • MiniMax M2.5 Software stack: • OpenClaw: All-local assistant layer • LM Studio: Running models • Exo Labs: Connecting multiple machines into one cluster and testing whether inference improves Where my local stack still lacks: • Deep tasks with big models still take more time to reply than cloud models. • Context window is limitation in the models I use. I’m usually around a 200k token window per session, but compacting works well, so I rarely need to start a new session. • It also seems that OpenClaw in its default state is not handling work with memory very efficiently while filling the context window fairly quickly by default. It was necessary for me to finetune this manually including semantic search and temporal decay which are in default switched off. • Reasoning is good but not at the cloud models level. Also coding is good for the majority of tasks but not top tier. My best use cases right now (March 2026): Best for iterative work where privacy matters and where model needs to be available all the time. • Private or sensitive data: I would be careful as a company to share private or direct customer information with third party cloud systems in general. Clearly also connecting OpenClaw to cloud models is not solving privacy situation. • Cloud limits & Efficiency: If I push cloud subscriptions hard, I hit consumer limits surprisingly fast. It’s also much easier to spot inefficiencies locally. When the context starts bloating, the system slows down fast, so issues like memory inefficiency become obvious much earlier. In the cloud, replies often feel just as fast, but you end up paying much more or hitting usage limits without really knowing why. Was it worth the money? For me, yes. But I’m aware I live in a niche bubble for my particular use case. For most people it is still early. For businesses and people who want to spend the money and effort make this work it is good solution today. My verdict: For my personal use case, local is now the default. Cloud is the exception. Are local models as good as the best cloud models? No. Are they good enough to be my default for most tasks? Yes.

English

587

49.1K

Adam Zmenak retweetledi

terminal@terminaldotshop·10 Mar

Moltbook guys in a month on Meta's roof

English

266

6.3K

176K

Adam Zmenak@AdamZmenak·12 Mar

@beffjezos Apple needs a large model that is fully optimized for Apple hardware from the ground up. Right now everything is designed for NVIDIA stack then adapted for MLX which misses out on feature compatibility and cache optimization

English

Beff (e/acc)@beffjezos·11 Mar

Apple should be training open source models too

Yohei@yoheinakajima

everybody is really doing everything

English

220

13.2K

Adam Zmenak@AdamZmenak·11 Mar

@JaroslavBeck Hard to argue, LMS is a really nice app. You can still get caching from GGUF models which is overall a big performance win, UD-Q4_K_XL has been the sweet spot quant in my experience

English

295

Jaroslav Beck@JaroslavBeck·11 Mar

@AdamZmenak Well, I know about this but I somehow always return to LM studio. 1) it shows feedback that model is thinking which is actually very useful 2) it is much more stable and just works - at least for me

English

1.2K

Keşfet

@Jason @LAUNCH @gajesh @eigencloud @aleabitoreddit @open_founder @OpenAI @karunkaushik_