BizAI

103 posts

@hankli

Joined June 2008
222 Following · 26 Followers
BizAI
BizAI@hankli·
@bboczeng Truly impressed by the 1450: they spout stuff off with their eyes closed. If you don't know something, just Google it.
Chinese
0
0
3
2.5K
勃勃OC
勃勃OC@bboczeng·
So it turns out Professor Andrew Ng, Silicon Valley's previous-generation AI godfather, even appeared on the CCTV show The Brain (《最强大脑》). No wonder the guy has had nothing new in deep learning since 2016. Once LLMs arrived, he was completely left behind, just like the rest of us. Totally washed up 😅😅😅
Chinese
27
1
156
61.1K
BizAI
BizAI@hankli·
@Adam_T_H @garrytan A better solution is to route it through a service you control, so that OpenClaw can never access any keys. Such a bridge service is additional work, though.
English
1
0
0
151
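The bridge-service idea can be sketched minimally: the agent talks only to a proxy you run, and the proxy injects the secret into the upstream request, so the key never enters the agent's environment. The upstream URL, header, and function names below are hypothetical illustrations, not OpenClaw's actual API:

```python
import urllib.request

UPSTREAM = "https://api.example.com"  # hypothetical upstream the key belongs to

def build_upstream_request(path: str, body: bytes, api_key: str) -> urllib.request.Request:
    """Attach the secret on the bridge side; the agent never sees it."""
    req = urllib.request.Request(UPSTREAM + path, data=body, method="POST")
    req.add_header("Authorization", "Bearer " + api_key)
    req.add_header("Content-Type", "application/json")
    return req

# The agent sends its request to the bridge (e.g. http://localhost:8080/v1/chat);
# the bridge rebuilds it with the key attached and forwards it upstream.
req = build_upstream_request("/v1/chat", b'{"msg": "hi"}', api_key="sk-secret")
print(req.full_url)                     # https://api.example.com/v1/chat
print(req.get_header("Authorization"))  # Bearer sk-secret
```

The agent's environment only ever holds the bridge's address; key rotation and per-request auditing then live entirely server-side.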
Adam Healy
Adam Healy@Adam_T_H·
@garrytan How are you giving it secure access to secrets? Credit numbers, private keys, high-consequence API tokens, etc.?
English
3
0
8
1.4K
Rick Zabel⚡️
Rick Zabel⚡️@RickZabel_WNY·
@NousResearch @garrytan 1) I just had Hermes help me build some really cool code for analyzing certain network traffic...... ⚡⚡ --> Have had the idea for 3 or 4 years --> Now I have the code 🔥🎯✅🏁
English
2
0
1
178
Garry Tan
Garry Tan@garrytan·
The most underrated thing in AI agents right now is: OpenClaw/Hermes Agent is just more free than other locked down AI agents (the standard out-of-box Claude/ChatGPT route) "Free the Claw" is not a vibe I understood until I tried it. Now that I have it, I don't want to go back.
English
126
65
1.2K
112.9K
BizAI
BizAI@hankli·
@garrytan Do you use the same LLM model for this testing?
English
0
0
1
45
Garry Tan
Garry Tan@garrytan·
I'm running both right now and Hermes is more rock solid (no crashes) but also slower, less of a good personality, and less pro-active. Net-net I see a lot of value in both. I want Hermes Agent's rock solidness with the personality of OpenClaw, that's the ideal case
FutureTech@FutureTechie

@jason_haugh @AlexFinn @garrytan Hermes is the way. Far less to maintain, and it is self-healing.

English
71
15
355
41.3K
Venkata
Venkata@VenkataBuilds·
@garrytan Tried Hermes this week and yeah, the difference is wild. No guardrails constantly second-guessing what you're trying to do.
English
1
0
1
666
BizAI
BizAI@hankli·
@garrytan I do not feel safe using OpenClaw on the same MacBook I use for my routine work. I am currently studying the Hermes agent. Do you install OpenClaw on your work computer?
English
0
0
0
501
BizAI
BizAI@hankli·
@Prince_Canuma Did you test it on a MacBook Pro with M4 or M5? I do not want to buy a GPU.
English
0
0
0
430
Prince Canuma
Prince Canuma@Prince_Canuma·
Gemma 4 31B running with TurboQuant KV cache on MLX 🔥 128K context:
→ KV Memory: 13.3 GB → 4.9 GB (63% reduction)
→ Peak Memory: 75.2 GB → 65.8 GB (-9.4 GB)
→ Quality preserved
TurboQuant compression scales with sequence length, so the longer the context, the bigger the savings!
Try it out:
> uv run mlx_vlm.generate --model google/gemma-4-31b-it --kv-bits 3.5 --kv-quant-scheme turboquant
Note: decode speed drops (~1.5x) due to kernel launch overheads; we are aware and will fix this in coming releases.
Prince Canuma@Prince_Canuma

mlx-vlm v0.4.3 is here 🚀
Day-0 support:
🔥 Gemma 4 (vision, audio, MoE) by @GoogleDeepMind
🦅 Falcon-OCR + Falcon Perception by @TIIuae
🪨 Granite Vision 4.0 by @IBMResearch
New models:
🎯 SAM 3.1 with Object Multiplex by @facebook
🔍 RF-DETR detection & segmentation by @roboflow
Infra:
⚡ TurboQuant (KV cache compression)
🖥️ CUDA support for vision models (SAM and RF-DETR)
Get started today:
> uv pip install -U mlx-vlm
Leave us a star ⭐️ github.com/Blaizzy/mlx-vlm

English
30
87
898
102.3K
Barack Obama
Barack Obama@BarackObama·
It was inspiring to watch the Artemis II launch yesterday — @NASA’s first crewed mission around the moon since 1972. Our space program has always captured an essential part of what it means to reach beyond what we thought was possible, and I hope the four brave astronauts on this mission will inspire a new generation to follow in their footsteps.
English
13.4K
31.9K
456.8K
57.3M
Jason Zuo
Jason Zuo@xxxjzuo·
Hermes is not just a supervisor tool; it is itself a complete self-learning agent system, with automatic skill creation, experience accumulation, and cross-session memory. Graeme pulled in a standalone agent like this to serve as the supervisor, rather than using a simple checker script. That is why his supervisor can do triage instead of just QA: Hermes has its own judgment. It can assess whether a proposal is reasonable and recommend how to handle it, not just say "pass/fail".
Jason Zuo@xxxjzuo

x.com/i/article/2038…

Chinese
7
10
75
12.7K
BizAI
BizAI@hankli·
@xxx111god Link to the superpower skills? Is it obtains/superpowers?
English
0
0
0
430
Jason Zuo
Jason Zuo@xxxjzuo·
My biggest takeaway from using superpower and gstack: whenever the AI grills you until you are drenched in sweat, that is exactly a spot you have not thought through yet. The real point of these skills frameworks is not to teach the AI how to do things at all; it is to force you to figure out what you actually want. That step is the most critical part of any project, and the one most people overlook. It is also what makes these non-technical skills great.
数字生命卡兹克@Khazix0918

x.com/i/article/2037…

Chinese
21
102
630
218.3K
BizAI
BizAI@hankli·
@Yuchenj_UW Not possible; for the same food, different people have different tastes.
English
0
0
1
8
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
Some people at frontier AI labs told me they believe startups are over. OpenAI, Anthropic, Google, xAI will absorb every industry as AGI nears. Coding today, science, medicine, and finance next. Then everything else. If they’re right, that’s a pretty boring end of the world.
English
539
161
3K
944.8K
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
@ZhuoSSS The bike shop owner told me, “You’ll need this $150 lock to stop SF thieves.” The thief didn’t even leave the lock behind.
English
11
2
101
6.5K
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
I have a love–hate relationship with SF. Yes, it’s beautiful. Yes, it’s full of AI startups. But $4,200 got me a 1b1b. The same apartment is now $5,800/month. Parking is a nightmare. The roads are always packed. Food is expensive. Nothing opens after 9pm. Also my e-bike got stolen outside my apartment in one day, even with a $150 lock.
English
153
21
1.1K
166.2K
Zhipeng Wang 🇺🇦
Zhipeng Wang 🇺🇦@PKUWZP·
This question is too broad even for an ML Systems researcher. Assuming we are serving autoregressive reasoning LLMs, let's divide the optimization into:
1. Model-level optimization: usually model compression to reduce model size, such as pruning/distillation and quantization; some recent work has successfully compressed reasoning LLMs (openreview.net/forum?id=tyGfw…).
2. System-level optimization, which can be further categorized into:
   1. Prefill optimization, such as KV cache management, sequence parallelism, KV cache quantization, prompt compression, etc.
   2. Decode optimization, which breaks the memory wall and accelerates decoding, such as speculative decoding, chunked prefill, etc.
   3. Kernel optimization, including kernel fusion and pre-defined kernel launch schedules similar to CUDA graphs.
   4. Scheduler tuning, which determines when and how to batch inference requests and send them to prefill processing.
   5. Prefill/decode disaggregated serving.
I think these are a good starting point. We can also dive deep into distributed GPU serving, which involves TP/DP/PP combinations and trade-offs, and if necessary discuss collective communications and customized kernel libraries, as well as how to optimize the efficiency of reasoning traces.
English
3
2
18
1.2K
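As a toy illustration of one of the items above, KV cache quantization trades a little precision for memory. A minimal symmetric-quantization sketch (one float scale per tensor, integer codes), nothing like a production kernel:

```python
def quantize(values, bits=8):
    """Symmetric per-tensor quantization: integer codes plus one float scale."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit
    amax = max(abs(v) for v in values)
    scale = amax / qmax if amax else 1.0
    codes = [round(v / scale) for v in values]  # ints in [-qmax, qmax]
    return codes, scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

kv = [0.8, -1.2, 0.05, 2.4]
codes, scale = quantize(kv, bits=8)
recon = dequantize(codes, scale)
# Maximum reconstruction error is bounded by half a quantization step.
print(max(abs(a - b) for a, b in zip(kv, recon)) <= scale / 2 + 1e-12)  # True
```

Storing 8-bit codes instead of 16-bit floats halves KV memory; real schemes (like the TurboQuant numbers quoted elsewhere in this feed) use per-group scales and sub-byte bit widths for larger savings.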
Ashutosh Maheshwari
Ashutosh Maheshwari@asmah2107·
You're in a Senior ML Infra interview at OpenAI. The interviewer asks: "We need to 3x the inference speed of our next 100B+ reasoning model. High-latency Chain-of-Thought (CoT) is killing the user experience. How do you break the sequential bottleneck?" How do you answer?
English
23
9
152
28.2K
karminski-牙医
karminski-牙医@karminski3·
A zero-cost OpenClaw guide for Mac users

Everyone who has played with the lobster (OpenClaw) for a while complains that it burns too many tokens, so here is a guide to running OpenClaw on local models. The best part of this guide: I wrote the tutorial for the AI to read. That's right, you don't have to read it yourself. Just deploy claude code locally on your Mac first, give it the tutorial URL, tell it which model you want, and let it follow the steps. (I pondered this for a while: instead of making friends who can't code learn to write code, it's better to write a tutorial for the AI they use, summarizing my best practices, and let it follow along.)

Here are the pros and cons of the models I tested:

GLM-4.7-flash (30B-A3B): most recommended overall. An all-rounder with the strongest agent capability in this group; stable in consecutive tool-call scenarios and especially well suited to OpenClaw. Downside: long-context recall is somewhat weaker than kimi-linear.

kimi-linear (48B-A3B): first choice for long-context work. Linear-attention architecture, so both prefill and decoding are extremely fast, and long-context recall is strong, which makes it ideal for text-heavy workloads. Downside: agent capability is weaker than GLM-4.7-flash, and it trails GLM in complex consecutive tool-call scenarios.

Qwen3.5-35B-A3B: balances speed and multimodality. Supports multimodal (image) input, and since it is an MoE with only 3B active parameters, inference is fast. Downsides: agent capability is only middling, and it currently runs only under mlx_vlm, whose prefill is very slow; there is no official mlx_lm-ready version.

For the three models above, use 8-bit quantization and do not go below 4-bit.

Qwen3.5-27B: the best agent capability among the multimodal options. Supports multimodal input, and subjectively it feels even better at agent tasks than the 35B-A3B. Downside: it is a dense model, so with all 27B parameters active it is slower, and it likewise runs only under mlx_vlm with slow prefill. 5-bit quantization recommended.

Qwen3.5-9B: pick this if you are short on memory. Supports multimodal input with the smallest VRAM/unified-memory footprint, so it runs on low-memory Macs. Downsides: the weakest agent capability in this group, prone to failing on complex tasks, and prefill under mlx_vlm is likewise very slow. Do not go below 5-bit quantization.

The 4B model really isn't up to it; not recommended.

The AI-facing deployment tutorial is here: github.com/karminski/one-… #OpenClaw
Chinese
35
157
681
157.3K
Tanishq Kumar
Tanishq Kumar@tanishqkumar07·
I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.
English
134
457
4.1K
606.7K
BizAI
BizAI@hankli·
@paulg @ycombinator My guess is that most US kids will prefer non-STEM majors. If my guess is right, what is the solution?
English
0
0
0
7
Paul Graham
Paul Graham@paulg·
When you're deciding what to study in college, don't try to predict what will be valuable in the future, because that's so hard that you'll probably get it wrong. Instead focus on what you personally find most exciting. You can't get that wrong.
English
332
601
6.5K
246K
AlexZ 🦀
AlexZ 🦀@blackanger·
The big names move fast. Just yesterday I was saying we should add an economy-driven module to give agents a sense of "survival", and today I see ClawWork. ClawWork appears to add an internal economic incentive model.
Chao Huang@huang_chao4969

Introducing ClawWork 🚀: Transform your openclaw/nanobot from AI assistant into a money-earning AI coworker. Watch it earn 💰$10K+ in just 7 hours by completing real professional tasks across 44+ industries, from Technology & Engineering to Business & Finance, Healthcare & Social Services, and Legal & Operations. Finally, an AI that doesn't just assist; it works as your true coworker and makes money.
GitHub: github.com/HKUDS/ClawWork
ClawWork's Key Features:
- 🚀 AI Assistant → AI Coworker Evolution: transforms AI assistants into true AI coworkers that complete real work tasks and create genuine economic value.
- 💰 Live Economic Benchmark: a real-time economic testing system where AI agents must earn income by completing professional tasks from the GDPVal dataset, pay for their own token usage, and maintain economic solvency.
- 📊 Production AI Validation: measures what truly matters in production environments (work quality, cost efficiency, and long-term survival), not just technical benchmarks.
- 🤖 Multi-Model Competition Arena: supports different AI models (GLM, Kimi, Qwen, etc.) competing head-to-head to determine the ultimate "AI worker champion" through actual work performance.
#clawwork #openclaw #nanobot #AIcoworker

Chinese
10
78
355
86.5K
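The economic-solvency loop ClawWork describes (earn per task, pay for your own tokens, stay above zero) amounts to simple bookkeeping. The class and numbers below are a hypothetical sketch, not ClawWork's actual accounting:

```python
class WorkerLedger:
    """Track an AI worker's balance: task income minus token spend."""

    def __init__(self, starting_balance: float):
        self.balance = starting_balance

    def complete_task(self, payout: float, tokens_used: int, price_per_1k: float) -> float:
        # The agent is paid for the task but must cover its own token bill.
        cost = tokens_used / 1000 * price_per_1k
        self.balance += payout - cost
        return self.balance

    @property
    def solvent(self) -> bool:
        return self.balance > 0

ledger = WorkerLedger(starting_balance=5.0)
# Hypothetical task: $12 payout, 80k tokens at $0.01 per 1k tokens ($0.80 cost).
ledger.complete_task(payout=12.0, tokens_used=80_000, price_per_1k=0.01)
print(round(ledger.balance, 2), ledger.solvent)  # 16.2 True
```

A benchmark built on this ledger would terminate an agent whose balance hits zero, which is what makes cost efficiency, and not just output quality, part of the score.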