Asriel H

990 posts

@asrlhhh

Building foundation model for electronics | Hiring founding ML Eng (DM me!) | Stanford 🌲 & Cal 🐻

Palo Alto, CA · Joined September 2021

500 Following · 346 Followers

Asriel H@asrlhhh·2h

@BohuTANG How does Fugu's coding ability compare with Claude's?

Bohu@BohuTANG·12h

Don't casually use Claude Code to run evals; it's an easy way to go bankrupt.

734

Asriel H@asrlhhh·13h

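@BohuTANG The main thing is that selling RL training data is already quite commercialized here in SF. There are roughly 20 to 30 startups in the Bay Area building RL environments as contractors for Anthropic. I wonder whether China has a similar supply chain to give Zhipu and others a boost.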

330

Bohu@BohuTANG·16h

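Took some time to run an eval of GLM-5.2, which has been getting a lot of attention lately. Conclusion first: GLM-5.2's raw model capability (understanding + writing code) is on par with Opus 4.6. The real gap is in "how to work efficiently in a real environment", which is exactly the part that depends most on harness / RL training, and is also the mode Cursor Composer relies on heavily. With the harness tuned well, this model still has a good deal more power to unlock.

Method: the same Rust bug (serde_json #979), given to three agents (evot / claude-code / pi), each run with two models (GLM-5.2 and Opus 4.6), comparing their execution traces.

Result: two models, 6 sessions, all PASS. GLM-5.2 has no problem "getting it right"; its understanding of the bug and its final code quality both pass. But the cost differs by an order of magnitude:
• Opus 4.6: all three agents finished in 18 turns, ~80s
• GLM-5.2: 38 / 43 / 61 turns, about 7x slower

Breaking down the traces makes it clear that the gap is not in understanding the bug but in the agentic execution layer:
1. Environment handling is the weak spot: GLM spent a dozen or so turns stuck on cargo in repeated trial and error (CARGO_HOME, cargo metadata, digging through the serde source), while Opus got around it in one turn
2. It doesn't converge and loves to re-verify: after writing the fix it still hand-rolled temporary test files, creating and then deleting them, and its thinking character count is dozens of times that of Opus
3. But its code taste is solid: GLM's final fix is actually more restrained and idiomatic, with one match merging three cases and comments that explain the design intent clearly, cleaner than Opus's version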
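As a rough check on the figures quoted in the post above, the session matrix and the turn-count gap can be sketched in a few lines of Python. This is only an illustration built from the numbers in the post; the variable names are made up here, and the post does not say which turn count belongs to which agent, so the counts are kept as a plain list. The turn counts alone differ by roughly 2x to 3.4x, so the "about 7x slower" figure presumably refers to wall-clock time, which the post does not break down per session.

```python
from itertools import product

# Figures quoted in the post. Which turn count belongs to which agent
# is not stated, so the GLM-5.2 counts stay an unlabeled list.
AGENTS = ["evot", "claude-code", "pi"]
MODELS = ["GLM-5.2", "Opus 4.6"]
OPUS_TURNS = 18            # all three Opus 4.6 sessions
GLM_TURNS = [38, 43, 61]   # the three GLM-5.2 sessions

# Every agent is run with every model: 3 agents x 2 models.
sessions = list(product(AGENTS, MODELS))

# Turn-count ratio of each GLM-5.2 session relative to Opus 4.6.
turn_ratios = [round(turns / OPUS_TURNS, 2) for turns in GLM_TURNS]

print(len(sessions))   # 6
print(turn_ratios)     # [2.11, 2.39, 3.39]
```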

21.4K

Asriel H@asrlhhh·14h

@bdsqlsz 80% are doubao under the hood from what I heard

青龍聖者@bdsqlsz·14h

They use AWS enterprise credits to get a discount of about 70%. If the discount is less than 70%, then it’s clearly using GLM 5.2 or some other model.

Greg Kamradt@GregKamradt

obvious in retrospect but I had no idea there was a black market for tokens

3.2K

Asriel H@asrlhhh·18h

@kyleichan Do those 4B 2B models even have an appetite that big?

685

Kyle Chan@kyleichan·20h

This is a really massive unauthorized distillation campaign. For comparison, reported from Anthropic:
- DeepSeek: 150,000 exchanges
- Moonshot: 3.4 million
- MiniMax: 13 million
- Alibaba: 28.8 million

Chubby♨️@kimmonismus

Anthropic claims: Alibaba continues to distill Claude on a large scale to train Qwen.

Via Bloomberg: Anthropic is accusing Alibaba-linked operators of running a massive campaign to illicitly access Claude through nearly 25,000 fraudulent accounts. According to Bloomberg, Anthropic claims the campaign generated 28.8 million Claude exchanges between April and June, targeting capabilities like software engineering and agentic reasoning. The company says this is part of a broader pattern of “adversarial distillation,” where Chinese labs allegedly harvest outputs from US frontier models to train rival systems at a fraction of the cost.

Let's see how good Qwen 3.8 will be, probably FABLEous good.

131

521

173.4K

Asriel H@asrlhhh·18h

@amarifields_ Ngl, nowadays in most B2B spaces talent only becomes important when the access is the same

103

Amari Fields@amarifields_·22h

the craziest thing about startups is that two founders can have the same talent and completely different outcomes because one got access

6.3K

Asriel H@asrlhhh·19h

@jaigulati_ @ArjChi What’s your peptide stack

104

Jai Gulati@jaigulati_·1d

hai im jai :3 i just moved to SF. im interested in AI B2B SaaS, peptides, and corgi cafe. dm me to hang out <3

7.9K

Asriel H@asrlhhh·1d

@OpenAI @Broadcom Which parts of the chip design workflow does GPT actually help with today? Is it RTL, schematic, place-n-route, firmware, or something else?

OpenAI@OpenAI·1d

We’ve designed and built our first AI chip: Jalapeño. Designed from the ground up by OpenAI and brought to production with @Broadcom, Jalapeño is purpose-built for the LLM workloads powering ChatGPT, Codex, the API, and future agentic products. Chips are foundational to the AI economy. Building our own expands our full-stack platform from products to models to infrastructure, and will help us scale intelligence, serve more people, and expand access to AI.

1.3K

2.3K

21.6K

5.6M

Asriel H@asrlhhh·1d

@jietang Is Zai planning to buy any RL environments? Quite a few startups in SF building and selling them right now

1.2K

jietang@jietang·1d

5.2 could be better with more RL ...

青龍聖者@bdsqlsz

Deepswe's benchmark results match my own experience. I've used all the models: GLM 5.2 ≈ Claude Opus 4.6–4.7. Kimi 2.7's coding is more like inference optimization. Looking forward to K3. Doubao-seed 2.1 Pro, at around 37%, ≈ Gemini 3.5 Flash. Its coding is quite weak, but its vision is strong.

1.2K

152.7K

Asriel H@asrlhhh·1d

@JPoehnelt I tried it the moment it launched and loved it. But later I found out Google had released several agent products offering overlapping functionality, just not in a CLI but packaged in different UIs. Some live in the Gemini app, some in AI Studio, and others sit in Chrome.

Justin Poehnelt@JPoehnelt·2d

Two months ago I was fired by Google for creating the Google Workspace CLI. It went viral, hit #1 on Hacker News, gained thousands of GitHub stars and many thousands of actual users in just a couple days. It was an incredible, confusing journey, from directors and leaders asking what they could learn from the tool to getting grilled by legal about why the Google logo and brand colors are on the Google Workspace GitHub code repositories. I think the cause was that Workspace and certain leaders (and projects) were afraid of being disrupted. But the fear wasn't specific to my CLI, it was a broader fear in what agents meant for Workspace. Either way, the irony of my termination was the announcement at Google Cloud Next two days before I was fired that an official Workspace CLI was coming. I want this out there because it is easier for me to explain my story and it is an experience I want to fully own. It's also part of my healing. Nearly 7 years at Google was an incredible opportunity for me and I was fortunate to have wonderful teammates and a manager that fully supported me through these last few months. Thank you.

604

14.3K

4.5M

Asriel H@asrlhhh·1d

@gurvansh99 “How do we bring on a third cofounder without actually giving them the cofounder title?”

585

33.4K

G@gurvansh99·1d

Founding Engineer role. Would you take it?

223

1.6K

391.6K

Asriel H@asrlhhh·1d

The best feature LinkedIn has invented this century. If I catch you playing that in my notifications, I know you’re not a serious person

104

Asriel H@asrlhhh·1d

@andrewarruda The world is about to find out just how massive Baidu’s PDF library really is. From primary school all the way through college, I never had to worry about paying for any textbooks, novels, journals, or pretty much any written material all bc of it

217

andrew arruda@andrewarruda·2d

what data trained this model?

Baidu Inc.@Baidu_Inc

3B total parameters & 500M activated, yet powerful enough to transcribe 40+ pages in one pass while keeping context intact. Meet Unlimited OCR!

1.3K

Asriel H@asrlhhh·1d

@hiddnest Wait I heard they shut down the company already?

1.8K

Chanhee@hiddnest·2d

fuck you

343

157.3K

Asriel H@asrlhhh·2d

@aditabrm What about semiconductors? We still have the hardest PDFs across all verticals lol

292

Adit@aditabrm·2d

i'm hosting a small dinner with the best founders in difficult vertical ai spaces on 7/1- finance, healthcare, legal, and more. it'll be great to share notes on building trust, enterprise sales, and more! dm me or comment for the invite

17.2K

Asriel H@asrlhhh·2d

Seedance 2.5 is gonna wipe out half the data-labeling and teleoperation startups out there by evolving straight into a world model itself

146

Asriel H@asrlhhh·3d

@IndiainNewYork When will there be one in SF

499

India in New York@IndiainNewYork·3d

The Consulate General of India, New York, invites you to the Indian Mango Festival - a celebration of the King of Fruits. Savor two of India’s finest varieties, Kesar & Langra, handpicked at peak ripeness.

Time Out Market, Union Square
June 23, 2026 | 12–5 PM

Come taste the best mangoes in the world. @MEAIndia @IndianEmbassyUS @IndianDiplomacy

139

2.1K

395.3K

Asriel H@asrlhhh·3d

Any visibility into which closed/open models were included? If this is GPT-5.5, Opus 4.8 (or even Fable itself), and GLM-5.2, then the result is more underwhelming than inspiring.

Sakana AI@SakanaAILabs

Fugu stands shoulder-to-shoulder with leading models like Fable and Mythos across the industry's most rigorous engineering, scientific, and reasoning benchmarks. Read the full blog: sakana.ai/fugu-release

Beyond Bigger Models: Why are Orchestration Models the Next Frontier

Progress in AI has been driven largely by giant, monolithic models. But the most powerful systems of the future will be collaborative ecosystems. Today, this orchestration is no longer just a technical optimization. It has become a geopolitical and operational imperative.

For an organization or a nation, relying on a single company's model for critical infrastructure, finance, or governance is a material vulnerability. This risk is no longer a hypothetical possibility, but a reality. As we have seen with recent export controls imposed on models like Fable and Mythos, access can disappear overnight.

Collective intelligence is the practical hedge against this concentration of power. Because Fugu orchestrates an underlying pool of swappable agents, it simply routes around vendor restrictions. By orchestrating the world’s models, we are delivering the resilient blueprint required for true AI sovereignty.

252

Asriel H@asrlhhh·5d

@KranenKyle Curious whether the team is also thinking about kernel-level hillclimbing in the near term

146

Kyle Kranen@KranenKyle·5d

We feel remarkably close to auto-generating SOTA LLM inference engines to target single model single Pareto point deployments using some set of validated primitives (kernels, block manager, etc)! Seems very hill-climbable.

3.2K

Asriel H@asrlhhh·5d

@PeterHndrsn I think many people have tried this before and failed. Verification in an RL env is much harder for creative tasks than for coding tasks.

106

Peter Henderson@PeterHndrsn·6d

Seriously thinking of starting an RL env company with the sole purpose of fixing the slop/neuralese writing style in frontier model outputs. Then shutting it down once its public benefit purpose is fulfilled. So tired of the slop.

210

19.1K

Asriel H@asrlhhh·Jun 18

Hear me out: what if this is just their way to announce their new video gen model?

Midjourney@midjourney

A technical dive inside our new "Midjourney Scanner"

135
