Emiliano Conti

1.7K posts

@Rekeyea

Software Engineer. Lawful Good. #WindowsInsiders He/Him

Joined September 2013
956 Following · 141 Followers

Tech Dev Notes @techdevnotes
We need a way to use SuperGrok subscription in Cursor ...

Emiliano Conti retweeted
Sandro @pupposandro
Excited to launch Luce Spark: now a 35B MoE runs on a 16 GB GPU, with no offload tax.
An A3B model fires ~8 of its 256 experts per token, but to keep it resident you pay VRAM for all 256. Spark pins the experts your traffic actually hits, offloads the rest to CPU, and decodes the whole token in one fused graph, so offload stops costing speed.
▸ Qwen3.6 35B-A3B: ~20.5 → 13.3 GiB
▸ Laguna XS.2 33B-A3B: 18.8 → 14.6 GiB
Decode holds ~100 tok/s, close to the 119 you get with every expert resident on a 24 GB card.
No calibration step. It tunes itself from live traffic.
Sandro@pupposandro

x.com/i/article/2063…
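
The pinning idea described in the post above (keep the experts that live traffic actually hits on the GPU, leave the rest on CPU) can be pictured with a toy sketch. This is only an illustration under stated assumptions: it uses plain hit counting with a fixed number of GPU slots, and the class `ExpertPinner` and its methods are hypothetical names. The post does not say how Luce Spark actually selects experts, so none of this should be read as its implementation.

```python
from collections import Counter


class ExpertPinner:
    """Toy frequency-based expert pinning (hypothetical, not Luce Spark's code)."""

    def __init__(self, num_experts: int, gpu_slots: int):
        if not 0 < gpu_slots <= num_experts:
            raise ValueError("gpu_slots must be between 1 and num_experts")
        self.num_experts = num_experts
        self.gpu_slots = gpu_slots  # assumed VRAM budget, counted in experts
        self.hits = Counter()       # expert id -> times the router picked it

    def record(self, routed_experts):
        """Count the experts the router selected for one token."""
        self.hits.update(routed_experts)

    def pinned(self):
        """Experts that would stay resident in VRAM: the most-hit ones."""
        return {expert for expert, _ in self.hits.most_common(self.gpu_slots)}

    def hit_rate(self, routed_experts):
        """Fraction of one token's experts that are already pinned."""
        routed = list(routed_experts)
        if not routed:
            return 1.0
        resident = self.pinned()
        return sum(expert in resident for expert in routed) / len(routed)


# Example: 256 experts, room for only 4 on the GPU.
pinner = ExpertPinner(num_experts=256, gpu_slots=4)
for _ in range(10):
    pinner.record([0, 1, 2, 3])   # traffic that keeps hitting the same experts
pinner.record([4, 5, 6, 7])       # a rare token that routes elsewhere
print(sorted(pinner.pinned()))    # the four most frequently hit experts
```

Experts outside the pinned set would be the ones served from CPU memory in this picture; the more skewed the traffic, the higher the share of tokens whose experts are all resident.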


Taelin @VictorTaelin
5.5 is unbelievable
Yesterday night I, once again, left 4 codex tabs optimizing the new HVM5 (nothing to do with Bend2). This time I was sure I covered every form of reward hack it could possibly do. I defined what "general" means, I put a max perf cap so it couldn't just hardcode the answers, I locked the tests, I put clear time (not interaction) metrics. I went to bed confident it couldn't do anything other than optimize the interpreter.
... the interpreter, huh?
I never wrote "interpreter". I just asked it to make HVM5 faster.
... ... ...
It built a compiler. It built a complete functioning compiler. Overnight. It works. HVM5 is compiled now. It overshot the target 10-fold.
But it is a compiler. For SupGen, that doesn't work because it generates functions dynamically. We need a fast interpreter. It didn't touch the interpreter.
...

skcd @skcd42
Bug fixes shipping to Grok Build 0.1.220 (release notes will be available in the TUI)
- Support gt and git in /execute-plan
- Always-approve is now an option during permission selection
- Fix routing for hook commands starting with tilde
- Make group collapse header an independent selectable entry
- Fix copy/paste on Linux Wayland (Omarchy, CachyOS, Hyprland)
- Skip KKP for unknown terminals with no multiplexer (fixes broken Shift)
- Paste file path text instead of [Image #1] for non-image files
- Improve legibility on legacy Windows Console Host
- Delete misleading post-compaction todo reseed reminder
- Auto-background long running user-triggered bash-mode commands when invoked via `!`

Emiliano Conti @Rekeyea
@LottoLabs Also for the 3090 Ti? I'm a little confused because NVLink isn't supported.

Lotto @LottoLabs
It’s very simple
Find a 3090 or two
Get any mobo that supports 2 pcie x16 ports (at least x16x4 for lanes)
Get a 1200W+ PSU
Buy the cheapest ddr4 ram 64gb+ (you’re not using it anyways)
Install Linux, vLLM, Llama.cpp, SGlang, tailscale
Download any flavour of qwen 3.7 27b
You are now localmaxxing

skcd @skcd42
Bug fixes shipping to Grok Build (release notes will be available in the TUI)
0.1.217
• Proactive system-reminders to reduce laziness
• Improve compaction via prompt tuning and context management
• Add /export
• Fix rendering on extra-large monitors
• Fix grok -w crash when git empty-index
• Ctrl+X as default shortcuts help binding on Windows
• Show path on image and video generation
• Fix image pasting on Linux
• Show first tool call when group expanded
• /config-agents modal
• expose session agent name in session/info
• Laziness detector and todo reminder
• Validate image bytes to prevent retries
• Fix UTF-8 truncation in tool call output causing crashes

Emiliano Conti @Rekeyea
@sudoingX I'm not finding much info (or don't know where to look). Would you say a dual 3090 Ti is a good local AI rig (I already have one) or should I go for a bigger GPU (or more than one, like the Arc B70)?

0xSero @0xSero
Best tools for AImaxing
Harness:
- Codex best Desktop App
- Droid best CLI
- Pi best building block
- Opencode best TUI
Models:
- GPT-5.5 best model
- GLM-5.1 & Kimi best reverse engineers
- Deepseek Pro/Flash best cost to intelligence
- Opus-4.7 best for UI / Charts / LLMOps
- Qwen3.6-27B / 35B best local agents
- Gemma-4-31B best local intelligence
Mobile control:
- termius
- codex & ChatGPT
- kittylitter
Service and networking
- tailscale
- cliproxyapi
Tracking usage:
- automation in codex
- codexbar
Plugins, CLIs and MCP:
- computer-use (codex)
- chrome (codex)
- agent-browser (droid)
- Figma MCP (all)
- GitHub CLI (all)
- GMAIL/CAL plugins (codex)
- grill me skill
ADE:
- Warp
- Zed
Current meta:
- vLLM-studio for local agents
- Codex app for /goal and non-coding work
- Droid for coding
- Zed/Warp if I need to read the code

mrciffa @davideciffa
Luce PFlash now runs @poolsideai’s Laguna-XS.2 (33B-A3B MoE) on a single RTX 3090.
- 111 tok/s decode @ short ctx
- 128K TTFT in 15.91s, 5.4x faster prefill vs llama.cpp
- NIAH passes every (ctx, keep) point up to 131K
- first MoE target supported by PFlash
- hand-rolled CUDA, ggml only, no libllama
Great collab w/ @eisokant, @erc, and the rest of the team. Looking forward to working more on their great coding models. 🏎️
repo + GGUF in first comment

Emiliano Conti @Rekeyea
@orca_build is everything I wanted for an ADE. Loving the ergonomics and the freedom to use the best tool for the job while adapting perfectly to my workflow

Emiliano Conti @Rekeyea
@thsottiaux Obviously Frontend. Also I miss my workflow in Pi where I can tell the harness to self-correct whenever I find something that is missing or needs configuring. Worktree workflows could be improved too, to make them clearer. And lastly, an editor with LSP.

Tibo @thsottiaux
What are we obviously not getting right with Codex?

Emiliano Conti @Rekeyea
@0xSero Amazing! I will try to replicate it then. Any guidance on this would also be awesome. And thanks for the work you've been doing lately 🙏

0xSero @0xSero
How I work.
I typically have 4-8 workspaces
- autoresearch
- vllm-studio
- whatever i'm doing for work
- blog
-------
I prefer file editor ADEs, I don't want the code to be abstracted away from me.
-------
I run vertical panels for dealing with bugs as I run into them
-------
For larger work, I have a session which writes tickets and 1 which just does the work. (New session per ticket)
The only apps that have been able to support my style comfortably.
1. Zed
2. Warp

Kirill Skrygan @kskrygan
Would you be interested if JetBrains releases a totally local AI agent, working 100% on your laptop, using our code insight engine and deeply integrated into the IDE? Yes, it will probably be 1 month behind the most recent frontier models, but no token blood bath anymore. WDYT?

Emiliano Conti @Rekeyea
@shehab_amins Wow! I think Sail is going in the right direction. Would you say it is ready for streaming capabilities?

Shehab Amin @shehab_amins
For too long, the composable data stack has lacked a solid distributed compute layer. Our latest blog post covers why we believe Sail is the last missing piece, definitely worth a read. Read the full post: lakesail.com/blog/sail-comp…