Northon Torga

149 posts

Northon Torga banner
Northon Torga

Northon Torga

@northontorga

CPTO @goinfinitenet

São José dos Campos, Brazil Katılım Ağustos 2016
217 Takip Edilen100 Takipçiler
Sabitlenmiş Tweet
Northon Torga
Northon Torga@northontorga·
everyone’s cramming AIs with more tokens, bigger context, and longer prompts thinking it’ll make them smarter reality? it usually backfires. hard. 121 experiments: Kimi found 4 bugs at 16k tokens… and zero at 48k. ntorga.com/overfed-overth…
English
0
0
0
1
Northon Torga
Northon Torga@northontorga·
hot take: smaller models are underrated the value that Haiku provides for reviewing and reducing cognitive burden, handling branching, and other tricky steps in a harness is unfathomable
English
0
0
0
25
Northon Torga
Northon Torga@northontorga·
i’m reviewing the Kimi vs Qwen vs Seed vs Sonnet benchmark and boy, the AIs went bananas over the result even with the haters perspective, looks like this article is something else
English
0
0
0
98
Guillermomo
Guillermomo@guillermode20·
@northontorga @0xSero Went to sub today and it seems I can't make a personal account from the UK? Pretty odd. Might try a VPN.
English
2
0
0
19
0xSero
0xSero@0xSero·
Best subscriptions on market based on your budget: 10$ a month: #1 - Opencode go #2 - GLM basic #3 - MiniMax basic #4 - Huggingface Pro 20$ a month: #1 - GPT-Plus #2 - Kimi-Code #3 - Mix and match from 10$ tier 50$ a month #1 - Qwen Code #2 - Mix and match from above
0xSero tweet media
English
111
41
1.1K
68.5K
Northon Torga
Northon Torga@northontorga·
@BytePlusGlobal will publish a full article on thinking budget, scope and models, but Seed 2.0 Pro is a strong contender for the orchestration role, it's really obedient, go for it
Northon Torga tweet media
English
0
0
0
35
Northon Torga
Northon Torga@northontorga·
looks like "dola-seed-2.0-pro" is available on @BytePlusGlobal ModelArk... if it's Qwen3.5+ level, i'm in for a treat, will report back in a few
English
2
0
0
144
Northon Torga
Northon Torga@northontorga·
@TheAhmadOsman but doesn’t MoE arch improves memorization more than reasoning? Kimi is 1T but it reasons like a 32B model
English
0
0
0
168
Ahmad
Ahmad@TheAhmadOsman·
Fundamentals of LLMs: MoE vs Dense > many popular releases have been sparse MoEs > so when a dense model drops, everyone starts asking why it feels so much slower > that’s the cost of full activation > Dense = tokens run through every parameter of the model weifhts > MoE = tokens selectively activate a subset of the parameters of the model weights > Dense models (Qwen 3.5 27B, Gemma 4 31B) > every parameter fires on every token > ~27B ops per token, every time > MoE models (MiniMax M2, Kimi K2.5) > router + many experts > per token: activate top-k (usually 2) > the rest do nothing > this one design choice changes everything > inference speed > Dense is slower: all weights, every token > MoE is faster: a 675B model might only run ~40B active params > big model, small compute footprint > memory / VRAM > Dense: lower usage, only store what you execute (~140GB for 70B BF16) > MoE: all experts must live in memory (Kimi K2.5 is ~600GB in NVFP4) > compute / FLOPs > Dense: high compute burn per token > MoE: cheap per token, expensive to host in memory though
Ahmad tweet media
English
30
21
314
24K
Northon Torga
Northon Torga@northontorga·
@0xSero got it... i'll benchmark them once i feel we have tried all possible approaches but screenshot-based is my fav just to avoid prompt injection
English
0
0
8
452
0xSero
0xSero@0xSero·
Agent-browser is the best CLI tool I have given to my agents. It lets them control the my browser, and all my electron apps (discord, vscode, slack, etc..) It barely consumes any tokens compared to things like playwright, has a great skill and the agents seem very comfortable with it Some workflows: - e2e testing and application by using it - setting up complicated sites for me - scanning through tons of messages Thanks Vercel github.com/vercel-labs/ag…
English
61
171
2.3K
126.3K
Xiaomi MiMo
Xiaomi MiMo@XiaomiMiMo·
Xiaomi MiMo Token Plan is here. One subscription. All modalities. Build with MiMo-V2-Pro, Omni, and TTS. No 5-hour limits. No throttling. Transparent usage and billing. Just ship. Works with whatever you use: @openclaw, @opencode, @kilocode, @cline, @roocode Includes priority beta access to our newest models. 12% off your first purchase. TTS is free for now. Subscribe now → platform.xiaomimimo.com
Xiaomi MiMo tweet media
English
68
56
816
94.5K
ThePrimeagen
ThePrimeagen@ThePrimeagen·
Hey, you got a cool project that you are building? Link it I want to yap about cool projects
English
1.7K
39
2.5K
209.5K
Northon Torga
Northon Torga@northontorga·
@guillermode20 @0xSero was waiting byteplus release seed-2.0-pro on the coding plan… looks like they did 5d ago and i wasn’t aware… i just signed up (i’m in Brazil), will run some tests tomorrow and post my findings
English
1
0
0
23
Guillermomo
Guillermomo@guillermode20·
@northontorga @0xSero Can you sub to byteplus from western countries (like the UK)? It seems like good value for money. Can you give a review?
English
1
0
0
39
Northon Torga
Northon Torga@northontorga·
@thdxr an autocomplete feat on opencode was considered already? smaller model trying to guess what you want to say so you just "tab"
English
0
0
0
9
Alibaba Cloud
Alibaba Cloud@alibaba_cloud·
Qwen3.6-Plus is now live on @OpenRouter and free to try for a limited time! ● Top-tier Performance: Leading benchmarks in coding and reasoning. ● 1M Context: Default window for repository level tasks. ● Free Trial: Limited access to our new flagship. ● Agentic: Autonomously plans and iterates code. Try it on OpenRouter: openrouter.ai/qwen/qwen3.6-p…
Alibaba Cloud tweet media
English
5
9
60
7K
Qwen
Qwen@Alibaba_Qwen·
Strategic Partnership Launch: Qwen 3.6-Plus Debuts on Fireworks.ai Fireworks is known for its proven track record of serving state-of-the-art models at best-in-class performance and cost across the industry. We're thrilled to announce a landmark collaboration between Alibaba Qwen and Fireworks — bringing Qwen's flagship model to Fireworks' high-performance inference platform. Qwen3.6-Plus will be soon available on Fireworks AI for inference and fine-tuning, featuring: ● Industry-leading inference performance and cost efficiency powered by Fireworks' optimized serving stack ● Fine-tuning support to customize Qwen3.6-Plus for your specific use cases Built for scale, engineered for speed. US and global developers: your access is coming soon. Stay tuned! @FireworksAI_HQ
Qwen tweet media
English
16
30
386
23.7K