Trần Ngọc Sơn
@nomiss194

12 posts
Joined November 2025
52 Following · 0 Followers
Trần Ngọc Sơn reposted
CyrilXBT @cyrilXBT
EVERYONE IS OBSESSING OVER WHICH AI MODEL IS FASTEST. They are measuring the wrong thing.

Someone just ran Gemma 4 and Qwen 3 on the exact same coding task with the exact same hardware and the exact same prompt. Here are the raw numbers:

Gemma 4 31B: 27 tokens per second. Finished in 3 minutes 51 seconds. Used 6,209 tokens.
Qwen 3 27B: 32 tokens per second. Finished in 18 minutes 4 seconds. Used 33,946 tokens.

Qwen was FASTER per token. Gemma finished 14 MINUTES EARLIER. Because Gemma used 5.5x fewer tokens to reach a complete answer.

This is the insight almost nobody in AI benchmarking understands. Tokens per second is not the metric that matters. Tokens to completion is the metric that matters.

A model that thinks more efficiently gets you to the answer faster even if it generates each token more slowly. A model that thinks out loud for 33,000 tokens when 6,000 tokens would have done the job is not smarter. It is more verbose. And verbosity costs you time, money, and context window.

The fastest model is not the one with the highest tokens per second. It is the one that needs the fewest tokens to be right. That distinction is going to matter enormously as agent loops get longer and API costs compound across thousands of iterations.

Screenshot this and come back to it every time someone shows you a tokens per second benchmark. Follow @cyrilXBT for every AI insight that changes how you think about the tools you are building with.
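The arithmetic behind the comparison can be sketched as follows. The figures are the ones quoted in the tweet; the helper name `time_to_completion` is illustrative, not from any benchmark tool:

```python
# Wall-clock time to a complete answer depends on total tokens generated,
# not just on generation speed (tokens per second).

def time_to_completion(total_tokens: int, tokens_per_second: float) -> float:
    """Seconds to finish, given throughput and total output length."""
    return total_tokens / tokens_per_second

# Figures quoted in the tweet:
gemma = time_to_completion(6_209, 27)    # ~230 s of pure generation
qwen = time_to_completion(33_946, 32)    # ~1061 s of pure generation

# Qwen generates each token faster, yet finishes far later,
# because it emits ~5.5x as many tokens.
print(f"Gemma: {gemma:.0f}s, Qwen: {qwen:.0f}s, ratio: {33_946 / 6_209:.1f}x")
```

Note the pure-generation time for Qwen (~17.7 minutes) is slightly below the reported 18 minutes 4 seconds; the remainder would be overhead outside token generation.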
Trần Ngọc Sơn reposted
Emre Savcı @mstrYoda_
There are plenty of local LLMs out there, and if you are wondering which one your machine can handle, there is a nice site that answers exactly that: Can I Run AI. Based on your system specs, it shows which models will run and at what tokens/second 👇 canirun.ai
Trần Ngọc Sơn reposted
Bruno Pinheiro @brunopinheiroms
What I miss most about Claude Code compared to my current OpenCode Go setup is the speed. In exchange, I can use it without blowing through my limits, and the quality is good. Harness matters, folks! I went from a cost of $200 per month to $10 (personal challenge).

My setup is orchestrated to use the best of each model:
- Kimi 2.6 orchestrates
- MiMo V2.5 Pro writes code
- DeepSeek V4 Flash handles the operational work
- GLM-5 does review and RCA
- Qwen 3.5 Plus for MCPs (Kimi has not been working very well there yet)

And I am taking advantage of OpenCode offering 3x usage for Kimi. Great model! #bolhadev #ia
Bruno Pinheiro @brunopinheiroms

I abandoned Claude 100%. I thought I would not get used to it, but now I do not plan to go back. Claude is very good, but it is expensive and slow.

I have been using:
- OpenCode Go: Kimi 4.6 / GLM 5.1
- Codex 5.4

I enabled Caveman with Rtk to save tokens. I went from $200 per month to $30 (OpenCode + GPT) and I am not missing anything so far. github.com/juliusbrussee/… github.com/rtk-ai/rtk #bolhadev

Trần Ngọc Sơn reposted
Suryansh Tiwari @Suryanshti777
This shouldn’t be possible… but Microsoft just did it. They made a 100B parameter model run on a single CPU. No GPU. No insane setup. Just math rewritten from scratch.

Here’s the part most people are missing 👇

Traditional LLMs = 16-bit floats. Every weight = messy decimal (0.0023, -1.47…). Inference = billions of float multiplications → that’s why GPUs exist.

BitNet flips the entire game. Instead of floats, it uses ternary weights: {-1, 0, 1}. Not compression. Not optimization. A completely different computing primitive.

Now look what happens:
→ ×1 = keep value
→ ×(-1) = flip sign
→ ×0 = ignore

That’s it. No multiplications. Just add, subtract, skip. Matrix multiplication (the core of AI) becomes cheap integer ops on a CPU.

And the results?
→ 2x–6x faster on CPUs
→ Up to 82% less energy usage
→ Scales BETTER with bigger models

Let that sink in: a 100B model running at 5–7 tokens/sec on a single CPU. This is not “optimization”. This is a paradigm shift.

And the craziest part? These models are NOT quantized later. They are trained like this from day one. No precision loss. No quality drop. The model literally learns inside the constraint.

Why 1.58 bits? Because log₂(3) ≈ 1.58: three possible values → maximum efficiency per weight.

We’re watching the hardware bottleneck disappear in real time. AI is not getting smaller. The math is getting smarter.

Bookmark this. In a year, running LLMs locally won’t be impressive. Not running them locally will be.
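The add/subtract/skip idea can be shown with a toy dot product. This is not BitNet's actual kernel (real implementations pack weights and use vectorized integer ops); it is only a sketch of why ternary weights eliminate multiplications:

```python
import math

# Toy dot product with ternary weights {-1, 0, 1}:
# every "multiply" collapses to an add, a subtract, or a skip.

def ternary_dot(weights: list[int], activations: list[float]) -> float:
    acc = 0.0
    for w, x in zip(weights, activations):
        if w == 1:       # x1: keep the value -> add
            acc += x
        elif w == -1:    # x(-1): flip the sign -> subtract
            acc -= x
        # w == 0: ignore -> skip entirely (no work at all)
    return acc

w = [1, -1, 0, 1]
x = [0.5, 2.0, 3.0, 1.5]
print(ternary_dot(w, x))  # 0.5 - 2.0 + 1.5 = 0.0

# Three possible values per weight carry log2(3) bits of information each:
bits_per_weight = math.log2(3)  # ~1.58, hence the "1.58-bit" name
```

A full matrix multiply is just this loop repeated per output row, which is why the whole operation reduces to integer additions on a CPU.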
Trần Ngọc Sơn reposted
divyansh tiwari @DivyanshT91162
CAPTCHA IS OFFICIALLY OUTDATED.

A new open-source library called Cap is changing how websites stop bots. No puzzles. No traffic lights. No “select all bikes” anymore. Instead, it uses a SHA-256 proof-of-work system: simple, silent, and fast.

Why devs are switching:
• Only ~20KB in size
• Zero tracking, zero data collection
• No images, no user friction
• Works with any JS runtime
• Fully customizable (visible, invisible, floating modes)
• Zero dependencies
• Can be deployed instantly via Docker

This is a full replacement for traditional CAPTCHA systems. Cleaner UX. Faster websites. Better privacy. 100% open-source on GitHub. Link in comments.
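The proof-of-work idea behind this style of CAPTCHA alternative can be sketched as follows. This is a generic SHA-256 proof-of-work loop, not Cap's actual protocol; the difficulty setting and function names are illustrative:

```python
import hashlib
import secrets

DIFFICULTY = 12  # leading zero bits required; illustrative, not Cap's real setting

def solve(challenge: str, difficulty: int = DIFFICULTY) -> int:
    """Find a nonce so SHA-256(challenge + nonce) starts with `difficulty` zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
        # Treat the digest as a big integer; shifting off all but the top
        # `difficulty` bits must leave zero.
        if int.from_bytes(digest, "big") >> (256 - difficulty) == 0:
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int = DIFFICULTY) -> bool:
    """Verification is a single hash: cheap for the server, costly to mass-produce."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty) == 0

challenge = secrets.token_hex(16)  # server-issued random challenge
nonce = solve(challenge)           # client silently burns a little CPU
assert verify(challenge, nonce)    # server checks with one hash call
```

The asymmetry is the whole trick: a human's browser pays a few milliseconds once, while a bot farm has to pay it for every single request, with no image puzzles involved.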
Trần Ngọc Sơn reposted
Kirill @kirillk_web3
A SINGLE CLAUDE.md FILE JUST HIT #1 ON GITHUB TRENDING. 82,100 stars. 7.8k forks. zero dependencies. Bookmark this before you forget. And your Claude will start working differently.

4 principles. one file. Karpathy's LLM coding habits, distilled:
> think before coding.
> simplicity first.
> surgical edits only.
> goal-driven targets before starting.

swap it into your CLAUDE.md today. your Claude Code becomes a different tool. Read it today. Link below.

Claude → Skills → CLAUDE.md → Better Code → Better Systems → Money
Kirill @kirillk_web3

🚨 do you understand what the Head of Anthropic Coding Agents just dropped. 30 minutes. more value than 100 paid courses. not a course. not a tutorial. how top AI researchers actually build.

here's the part nobody is talking about:
> real workflows. not theory.
> vibe coding from the source.
> how they think, build, and ship with agents.

watch this before you write another prompt. before you build another agent. before you touch another tool. 30 minutes. bookmark it. watch it today. this one changes how you use AI for good.
Google AI Developers @googleaidevs
Gemini Embedding 2 is now generally available in the Gemini API and Vertex AI! Start building with our first natively multimodal embedding model, now equipped with the stability and optimizations required for production apps.
Trần Ngọc Sơn reposted
0xMarioNawfal @RoundtableSpace
Someone gave Claude Code permanent memory and it hit 46k stars in 48 hours. 95% less token consumption per session. Never hits context limits. Picks up exactly where you left off. One-command install. Completely free.
Trần Ngọc Sơn @nomiss194
@elonmusk X now auto-translating with Grok is really convenient. Thank you, Elon.
Elon Musk @elonmusk
If only we’d trained Grok on just these 2 books, we’d be done already!