Northon Torga

149 posts

Northon Torga

@northontorga

CPTO @goinfinitenet

São José dos Campos, Brazil Katılım Ağustos 2016

217 Takip Edilen100 Takipçiler

Sabitlenmiş Tweet

Northon Torga@northontorga·9m

everyone’s cramming AIs with more tokens, bigger context, and longer prompts thinking it’ll make them smarter reality? it usually backfires. hard. 121 experiments: Kimi found 4 bugs at 16k tokens… and zero at 48k. ntorga.com/overfed-overth…

English

Northon Torga@northontorga·6h

hot take: smaller models are underrated the value that Haiku provides for reviewing and reducing cognitive burden, handling branching, and other tricky steps in a harness is unfathomable

English

Northon Torga@northontorga·1d

i’m reviewing the Kimi vs Qwen vs Seed vs Sonnet benchmark and boy, the AIs went bananas over the result even with the haters perspective, looks like this article is something else

English

Northon Torga@northontorga·1d

Why the benchmarks, the jobs data, and the economics already tell a very different story. ntorga.com/intelligence-c…

English

Northon Torga@northontorga·1d

@guillermode20 @0xSero needs more testing but LGTM x.com/northontorga/s…

Northon Torga@northontorga

@BytePlusGlobal will publish a full article on thinking budget, scope and models, but Seed 2.0 Pro is a strong contender for the orchestration role, it's really obedient, go for it

English

Guillermomo@guillermode20·1d

@northontorga @0xSero Went to sub today and it seems I can't make a personal account from the UK? Pretty odd. Might try a VPN.

English

0xSero@0xSero·2d

Best subscriptions on market based on your budget: 10$ a month: #1 - Opencode go #2 - GLM basic #3 - MiniMax basic #4 - Huggingface Pro 20$ a month: #1 - GPT-Plus #2 - Kimi-Code #3 - Mix and match from 10$ tier 50$ a month #1 - Qwen Code #2 - Mix and match from above

English

111

1.1K

68.5K

Northon Torga@northontorga·1d

@BytePlusGlobal will publish a full article on thinking budget, scope and models, but Seed 2.0 Pro is a strong contender for the orchestration role, it's really obedient, go for it

English

Northon Torga@northontorga·1d

looks like "dola-seed-2.0-pro" is available on @BytePlusGlobal ModelArk... if it's Qwen3.5+ level, i'm in for a treat, will report back in a few

English

144

Northon Torga@northontorga·1d

@BytePlusGlobal got rated limited faster than @alibaba_cloud Model Studio with Qwen3.5+, i was unable to run the full suite at one session

English

Northon Torga@northontorga·1d

@guillermode20 @0xSero got it working just fine here, starting my personal benchmarks now

English

Northon Torga@northontorga·1d

@TheAhmadOsman but doesn’t MoE arch improves memorization more than reasoning? Kimi is 1T but it reasons like a 32B model

English

168

Ahmad@TheAhmadOsman·1d

Fundamentals of LLMs: MoE vs Dense > many popular releases have been sparse MoEs > so when a dense model drops, everyone starts asking why it feels so much slower > that’s the cost of full activation > Dense = tokens run through every parameter of the model weifhts > MoE = tokens selectively activate a subset of the parameters of the model weights > Dense models (Qwen 3.5 27B, Gemma 4 31B) > every parameter fires on every token > ~27B ops per token, every time > MoE models (MiniMax M2, Kimi K2.5) > router + many experts > per token: activate top-k (usually 2) > the rest do nothing > this one design choice changes everything > inference speed > Dense is slower: all weights, every token > MoE is faster: a 675B model might only run ~40B active params > big model, small compute footprint > memory / VRAM > Dense: lower usage, only store what you execute (~140GB for 70B BF16) > MoE: all experts must live in memory (Kimi K2.5 is ~600GB in NVFP4) > compute / FLOPs > Dense: high compute burn per token > MoE: cheap per token, expensive to host in memory though

English

314

24K

Northon Torga@northontorga·1d

@0xSero got it... i'll benchmark them once i feel we have tried all possible approaches but screenshot-based is my fav just to avoid prompt injection

English

452

0xSero@0xSero·1d

@northontorga I’ve never tried it

English

0xSero@0xSero·1d

Agent-browser is the best CLI tool I have given to my agents. It lets them control the my browser, and all my electron apps (discord, vscode, slack, etc..) It barely consumes any tokens compared to things like playwright, has a great skill and the agents seem very comfortable with it Some workflows: - e2e testing and application by using it - setting up complicated sites for me - scanning through tons of messages Thanks Vercel github.com/vercel-labs/ag…

English

171

2.3K

126.3K

Northon Torga@northontorga·1d

looks like acqui-hiring is on fire right now

NIK@ns123abc

BREAKING: Anthropic Acquires 9-Person Biotech Startup For $400 Million >be coefficient bio >founded the startup 6 months ago >build AI platform for biotech >less than 10 employees >acquired by anthropic for ~$400 million > = $40+ million per head Coefficient Bio was building an AI platform for biotech tasks: planning drug R&D, managing clinical regulatory strategy, identifying new drug opportunities Team is joining Anthropic’s healthcare life sciences group led by Eric Kauderer-Abrams. Anthropic is building specialized tools for industries that actually pay enterprise rates: >software engineering >cybersecurity >life sciences >healthcare >finance Meanwhile OpenAI is buying media companies to control narratives LMAO

English

Northon Torga@northontorga·1d

@XiaomiMiMo @openclaw @opencode @kilocode @cline @roocode are u sure that’s not a per day limit?

English

524

Xiaomi MiMo@XiaomiMiMo·1d

Xiaomi MiMo Token Plan is here. One subscription. All modalities. Build with MiMo-V2-Pro, Omni, and TTS. No 5-hour limits. No throttling. Transparent usage and billing. Just ship. Works with whatever you use: @openclaw, @opencode, @kilocode, @cline, @roocode Includes priority beta access to our newest models. 12% off your first purchase. TTS is free for now. Subscribe now → platform.xiaomimimo.com

English

816

94.5K

Northon Torga@northontorga·1d

@ThePrimeagen simplifying app containerisation for solo devs: github.com/goinfinite/os

English

497

ThePrimeagen@ThePrimeagen·1d

Hey, you got a cool project that you are building? Link it I want to yap about cool projects

English

1.7K

2.5K

209.5K

Northon Torga@northontorga·1d

@guillermode20 @0xSero was waiting byteplus release seed-2.0-pro on the coding plan… looks like they did 5d ago and i wasn’t aware… i just signed up (i’m in Brazil), will run some tests tomorrow and post my findings

English

Guillermomo@guillermode20·1d

@northontorga @0xSero Can you sub to byteplus from western countries (like the UK)? It seems like good value for money. Can you give a review?

English

Northon Torga@northontorga·2d

@thdxr an autocomplete feat on opencode was considered already? smaller model trying to guess what you want to say so you just "tab"

English

Northon Torga@northontorga·2d

@alibaba_cloud @OpenRouter but not on Model Studio😢

English

Alibaba Cloud@alibaba_cloud·2d

Qwen3.6-Plus is now live on @OpenRouter and free to try for a limited time! ● Top-tier Performance: Leading benchmarks in coding and reasoning. ● 1M Context: Default window for repository level tasks. ● Free Trial: Limited access to our new flagship. ● Agentic: Autonomously plans and iterates code. Try it on OpenRouter: openrouter.ai/qwen/qwen3.6-p…

English

Northon Torga@northontorga·2d

@KingBreton78876 @Alibaba_Qwen was about to ask that, haha, pls pls 🙏

English

105

King Le Breton@KingBreton78876·2d

@Alibaba_Qwen Any chance of Qwen3.6-Plus being on the Fire Pass?

English

1.1K

Qwen@Alibaba_Qwen·2d

Strategic Partnership Launch: Qwen 3.6-Plus Debuts on Fireworks.ai Fireworks is known for its proven track record of serving state-of-the-art models at best-in-class performance and cost across the industry. We're thrilled to announce a landmark collaboration between Alibaba Qwen and Fireworks — bringing Qwen's flagship model to Fireworks' high-performance inference platform. Qwen3.6-Plus will be soon available on Fireworks AI for inference and fine-tuning, featuring: ● Industry-leading inference performance and cost efficiency powered by Fireworks' optimized serving stack ● Fine-tuning support to customize Qwen3.6-Plus for your specific use cases Built for scale, engineered for speed. US and global developers: your access is coming soon. Stay tuned！ @FireworksAI_HQ

English

386

23.7K

Keşfet

@guillermode20 @0xSero @BytePlusGlobal @alibaba_cloud @TheAhmadOsman @XiaomiMiMo @openclaw @opencode