fos

492 posts


@fosbix

life ends with the one that burns the candle

Joined February 2022
206 Following · 17 Followers
Petri Kuittinen
Petri Kuittinen@KuittinenPetri·
@trikcode I know that is a sped-up video, but imagine that in the not-so-distant future parallel code generation could happen that fast: 1000+ bugs fixed in less than half an hour, tasks that would normally take a human coder many weeks.
English
2
0
0
286
Wise
Wise@trikcode·
Fun fact. It takes 32 minutes to burn through 5 hours worth of Opus credits on the Claude Code Max 20x plan. 12 agents trying to fix 3,528 Typescript errors.
English
100
74
1.3K
82K
송준 Jun Song
송준 Jun Song@songjunkr·
@juliandropsit I analyze the model's weak points in depth with AI agents, then extract datasets for the areas that need them from other, stronger models.
Korean
4
0
39
1.2K
송준 Jun Song
송준 Jun Song@songjunkr·
Testing SuperQwen3.6 35b.. something really powerful has been made 🤔
Korean
32
27
351
12.3K
fos
fos@fosbix·
@LottoLabs @DataPlusEngine I don’t see any benefit to using LM Studio over llama.cpp directly. You just lose out on a ton of features.
English
0
0
1
29
Lotto
Lotto@LottoLabs·
@DataPlusEngine Lots of casuals will never even attempt the CLI. Ollama is typically the recommended easy route. LM Studio is far better and far easier. I’ve used sglang, llama.cpp, vllm, etc. Right tool for the job.
English
2
0
13
767
fos
fos@fosbix·
Just you guys wait till an Afmoe model releases. Gemma 4 and Qwen 3.6 are just the start
English
0
0
0
22
Jorwhol
Jorwhol@jorwhol·
@gosrum What setup is needed to run this model? Two RTX 3090s sufficient?
English
3
0
1
2.3K
金のニワトリ
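Qwen3.6-35B-A3B is too strong!!! ・I ran ts-bench with it paired with opencode, vibe-local, GitHub Copilot, qwencode, and claude code, and it got a perfect score on all of them ・On top of that, it completes tasks about as fast as Claude sonnet 4.6 and Opus 4.6. Qwen3.5-27B was impressive too, but Qwen3.6-35B-A3B, like the Red Comet, runs inference 3x faster than the 27B, so the big win, as the benchmark results show, is that time to task completion is drastically shorter
金のニワトリ@gosrum

It's been overshadowed by Claude Opus 4.7 and isn't getting much attention, but isn't Qwen3.6-35B-A3B a pretty amazing model?

Japanese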
21
102
620
168.8K
fos
fos@fosbix·
@griffisu As soon as the guy’s videos stop going viral, we’ve achieved AGI
English
0
0
2
767
fos
fos@fosbix·
@stevibe @0xkeenz I would try Unsloth’s UD Q4_K_XL/IQ4_NL/NVFP4
English
0
0
0
47
stevibe
stevibe@stevibe·
Qwen3.6 35B-A3B: smarter, but forgot how to use tools? Running 6 Bench Packs on BenchLocal across 3 open-source Qwen models. ✅ ReasonMath: 92 vs 85 vs 86 — 3.6 wins ✅ InstructFollow: 97 / 97 / 97 — tied ❌ ToolCall: 83 vs 97 vs 100 — 3.6 tanks Qwen3.5 27B still the tool-calling champ. 3.6 clearly leveled up reasoning, but tool use took a hit. DataExtract live now. BugFind + StructOutput next.
English
33
28
392
33.4K
First We Feast
First We Feast@firstwefeast·
david blaine being just as surprised as us 😩#hotones
English
7
55
1.1K
172.7K
fos
fos@fosbix·
@eliebakouch Probably every frontier model is a MoE
English
0
0
0
231
elie
elie@eliebakouch·
yeah you know.... moe model are fundamentally limited... dense model are way better look at gemma4 and qwen3.5... you don't get it this is just a trend... moe are dead!!!
Qwen@Alibaba_Qwen

⚡ Meet Qwen3.6-35B-A3B:Now Open-Source!🚀🚀 A sparse MoE model, 35B total params, 3B active. Apache 2.0 license. 🔥 Agentic coding on par with models 10x its active size 📷 Strong multimodal perception and reasoning ability 🧠 Multimodal thinking + non-thinking modes Efficient. Powerful. Versatile. Try it now👇 Blog:qwen.ai/blog?id=qwen3.… Qwen Studio:chat.qwen.ai HuggingFace:huggingface.co/Qwen/Qwen3.6-3… ModelScope:modelscope.cn/models/Qwen/Qw… API(‘Qwen3.6-Flash’ on Model Studio):Coming soon~ Stay tuned

English
31
10
438
60.8K
fos
fos@fosbix·
@stevibe @0xkeenz Which q4 are you using, qwen’s UD variant or NVFP4?
English
1
0
0
43
fos
fos@fosbix·
@songjunkr 2 months? It’s been less than 48 hours
English
0
0
1
219
송준 Jun Song
송준 Jun Song@songjunkr·
Wow, qwen3.6-35b beat the existing 27b and sonnet-4.5. That this is possible with MoE is an unbelievable leap. And it hasn't even been 2 months since release. Hugging Face ⬇️
Korean
19
38
477
44.9K
Jonas Čeika
Jonas Čeika@Jonas_Ceika·
ChatGPT glazing experiment #2
English
108
386
13.2K
910.8K
fos
fos@fosbix·
@theo @robinebers There’s just no way you fail to acknowledge your own bias that egregiously. Cursor have just as many resources as Anthropic do. Just because they claim to be putting the work in, that’s enough cause for you to avoid holding them accountable in public? You’re being tricked.
English
0
0
0
239
Theo - t3.gg
Theo - t3.gg@theo·
@robinebers Oh I crash out at Cursor all the time in our private slack. It's 10x worse than anything I post here. The difference is that they listen and they're trying. I'd do similar to Google but I gave up long ago on using anything they produce lmao
English
11
0
248
25.2K
Theo - t3.gg
Theo - t3.gg@theo·
I feel bad dunking on them so much but it's genuinely absurd how bad the new Claude Code desktop app is. You can feel the vibe code leaking everywhere. Every "feature" is barely integrated and full of edge cases that weren't considered. Every menu feels barren, stuffed in last second for some random toggle. Every hotkey breaks as soon as you try to do anything else. I've lost track of how many bugs I've encountered. I found at least 40 in under an hour. And it's all truly absurd arcane shit. Stuff like voice mode typing in all input boxes instead of just the one you have focused. Any one of these issues would have been enough for me to do a massive post-mortem and likely fire someone. A $400b company shipping this is absurd. I feel like I'm going mad. How does anyone seriously use this?? It is broken on fundamental levels that are hard to comprehend. How are we supposed to trust the code these models produce if Anthropic's official showcases are absolute slop? Dedicated video on this coming tomorrow. Just needed to get this off my chest.
English
448
230
5.6K
1.2M
fos
fos@fosbix·
@whatever Face 5.5, Body 6.5, Total 5. Brains must’ve been a 3
English
0
0
0
146
whatever
whatever@whatever·
LOOKS RATINGS! He RATES them, they rate HIM?!
English
631
192
15.6K
586.4K
Guri Singh
Guri Singh@heygurisingh·
NVIDIA just dropped a 120B parameter model that only uses 12B at inference. It's called Nemotron 3 Super.
60.47% on SWE-Bench Verified, highest open-weight model ever for real-world coding.
85.6% on PinchBench, best open model as an AI agent brain.
91.75% on RULER at 1M tokens while GPT-OSS-120B collapses to 22.3%.
2.2x faster than GPT-OSS-120B. 7.5x faster than Qwen3.5-122B.
Here's what makes this different from every other open model: It fuses 3 architectures into one:
→ Mamba-2 layers for linear-time sequence processing
→ LatentMoE, a new expert routing system with 512 total experts, 22 active per token
→ Strategic Transformer attention layers as "global anchors"
LatentMoE is the real breakthrough. It compresses tokens into a latent space before routing to experts. This cuts memory bandwidth and communication costs by 4x while activating MORE experts per token. More experts. Less compute. Better accuracy.
The model was trained on 25 TRILLION tokens. Natively in 4-bit precision (NVFP4) from the very first gradient update. Not quantized after training. Trained in 4-bit from day one.
Post-training used 21 different RL environments across math, code, STEM, safety, tool use, and long-horizon agentic tasks.
It also has built-in speculative decoding via Multi-Token Prediction. Average acceptance length of 3.45 tokens per step, beating DeepSeek-R1's 2.70 across every category. No external draft model needed. The speed is baked into the architecture.
CodeRabbit, Factory, and Greptile already shipped integrations.
Open weights. Open datasets. Open training recipes. All on HuggingFace. 100% Open Source.
English
45
67
445
39.1K
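The "512 total experts, 22 active per token" figure in the post above describes sparse expert routing. As a rough illustration only, here is a minimal sketch of generic top-k gating in plain Python. It is not NVIDIA's LatentMoE (the latent-space compression step is not modeled), and the function name `route_top_k` and the random router scores are hypothetical stand-ins.

```python
import math
import random


def route_top_k(router_logits, k):
    """Generic top-k gating: keep the k highest-scoring experts and
    softmax-normalize their scores into mixing weights.
    Hypothetical helper for illustration, not LatentMoE itself."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Expert counts taken from the post; the logits are random placeholders
# standing in for a learned router's output for one token.
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(512)]
chosen = route_top_k(logits, 22)

# Only the chosen experts run for this token; the other 490 are skipped.
active_fraction = len(chosen) / len(logits)
```

Only the selected experts' feed-forward blocks are evaluated for the token, which is why active parameters can be a small fraction of total parameters.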
fos
fos@fosbix·
@denizdd33 @yasinaktimur At best they use A*, much more likely something like Contraction Hierarchies, especially since they account for traffic
English
0
0
1
243
Deniz Dede
Deniz Dede@denizdd33·
@fosbix @yasinaktimur Yes, pure Dijkstra would be too sluggish, so it isn't used directly. Optimized derivatives are preferred instead, such as goal-directed A* or CH, which prioritizes main roads. But the logic still draws on that 'shortest path' foundation.
Turkish
1
0
17
1K
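The exchange above contrasts plain Dijkstra with goal-directed A*. A minimal sketch of that difference, assuming a toy 4-connected grid with unit edge weights (nothing like a real road network, and Contraction Hierarchies are not shown): the same search loop behaves as Dijkstra when no heuristic is given and as A* when a consistent heuristic is supplied. The names `shortest_path_cost` and `grid_graph` are hypothetical.

```python
import heapq


def shortest_path_cost(graph, source, target, heuristic=None):
    """Return (cost, nodes_expanded).
    heuristic=None gives Dijkstra; a consistent heuristic gives A*."""
    h = heuristic or (lambda node: 0)
    best = {source: 0}
    frontier = [(h(source), source)]
    done = set()
    expanded = 0
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in done:
            continue  # stale queue entry
        done.add(node)
        expanded += 1
        if node == target:
            return best[node], expanded
        for neighbor, weight in graph.get(node, ()):
            candidate = best[node] + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(frontier, (candidate + h(neighbor), neighbor))
    return float("inf"), expanded


def grid_graph(size):
    """Toy size x size grid; each cell links to its 4 neighbors at cost 1."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (x, y): [((x + dx, y + dy), 1) for dx, dy in steps
                 if 0 <= x + dx < size and 0 <= y + dy < size]
        for x in range(size) for y in range(size)
    }


graph = grid_graph(20)
start, goal = (0, 0), (19, 0)
manhattan = lambda n: abs(goal[0] - n[0]) + abs(goal[1] - n[1])

dijkstra_cost, dijkstra_expanded = shortest_path_cost(graph, start, goal)
astar_cost, astar_expanded = shortest_path_cost(graph, start, goal, manhattan)
```

Both runs return the same path cost, but the A* run expands fewer nodes because the heuristic steers the search toward the goal.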
Rich kids of claude
Rich kids of claude@yasinaktimur·
🚨 breaking: how navigation apps find the shortest route has been leaked.
Turkish
171
452
14.1K
5.3M
Deniz Dede
Deniz Dede@denizdd33·
@yasinaktimur It's not a leak; Dijkstra's algorithm is one of the most basic algorithms taught in computer engineering departments
Turkish
3
0
353
21.4K
Elena
Elena@elenacute01·
He bit a blue-ringed octopus and the neurotoxin literally inflated his head into two giant orbs... ocean life is absolutely wild
English
367
1.1K
13.8K
3.7M
fos
fos@fosbix·
@V1RACY @gezine_dev There is a zero-day in every piece of software in the world. Knowing that one exists is a nothingburger
English
1
1
54
3K
Gezine
Gezine@gezine_dev·
Finally, after a year and a half since I started PlayStation hacking, I have achieved my goal. PS4/PS5 zero-day kernel exploit. Obviously, no plan to release it.
English
1K
458
9.2K
2.1M