g023

5K posts

@g023dev

developer/programmer/ai nerd

Canada · Joined October 2023

2.3K Following · 511 Followers

Pinned Tweet

g023@g023dev·25 Apr

So I optimized the model, I optimized the harness, and now I'm optimizing the endpoint: an OpenAI-API-to-DeepSeek proxy with some context compression features built in to try to save $$$ (works well with Copilot): gist.github.com/g023/c2bb7b540…

g023 retweeted

Viv@Vtrivedy10·10 Mar

x.com/i/article/2031…

g023@g023dev·2h

@antoniolupetti I'm working on a concept: an agent that maintains a large, external, sparse key-value memory (not a vector database, but a differentiable memory like a sparse Transformer memory layer) that is updated during a single long session, compressing the past into memory tokens and retrieving them with attention.
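
To make the retrieval half of that idea concrete, here is a minimal sketch, not the author's design: a key-value memory that is read with scaled dot-product attention restricted to the top-k best-matching slots. The function names `write_slot` and `top_k_attention_read` and the parameter `k` are hypothetical, and the sketch is plain Python, so it is not differentiable; a real version would presumably live inside an autodiff framework and learn how to compress the session into slots.

```python
import math


def write_slot(keys, values, key, value):
    """Append one (key, value) pair, standing in for a compressed 'memory token'."""
    keys.append(list(key))
    values.append(list(value))


def top_k_attention_read(query, keys, values, k=2):
    """Read from the memory with attention over only the k best-matching slots."""
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the query and every stored key.
    scores = [sum(q * x for q, x in zip(query, key)) / scale for key in keys]
    # Sparsity: keep only the indices of the k highest-scoring slots.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected slots (subtract the max for numerical stability).
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    total = sum(weights)
    weights = [w / total for w in weights]
    # The read-out is the weighted sum of the selected value vectors.
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]
    return out, top


if __name__ == "__main__":
    keys, values = [], []
    write_slot(keys, values, [1.0, 0.0], [10.0, 0.0])
    write_slot(keys, values, [0.0, 1.0], [0.0, 10.0])
    print(top_k_attention_read([0.0, 3.0], keys, values, k=1))
```

With `k` much smaller than the number of slots, each read touches only a few entries, which is the sense in which the memory is sparse here.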

Antonio Lupetti@antoniolupetti·13h

"Graph Memory for LLM Agents" is a recent paper that explores an idea that I find quite interesting. Most AI memory systems treat remembering as a retrieval problem (the model searches its memory, retrieves relevant information, and then reasons about it). This paper argues that the process may be more dynamic than that and, instead of simply retrieving memories, an AI agent could reconstruct them during reasoning, following clues, associations, and intermediate evidence as they emerge. What I find interesting is the possibility that memory and reasoning may not be separate processes at all, but that remembering itself could be part of reasoning. arxiv.org/abs/2606.06036

g023@g023dev·2h

@dosco Try the LFM2.5 models (especially the 8B-A1B MoE).

spacy@dosco·9h

my whole feed is local models after the big drops last week. excited for this future. it’s also exactly where DSPy and RLM win

Alok@analogalok

a new 8GB VRAM GPU dense Local LLM leader was born yesterday
runs on: RTX 4060 / RTX 3070 / RTX 2080. any 8GB card
Qwen 3.5 9B (dense) was the go to for 6-8GB VRAM builds. Gemma 4 12B QAT (dense) just changed that.
same llama.cpp + cuda 13.2. i7 12700H. 16GB RAM. same -ngl 99 flags. same 48k context.
unsloth gemma-4-12b-it-Q4_K_M.gguf → 15 tok/sec @ 48k ctx
unsloth gemma-4-12B-it-qat-UD-Q4_K_XL.gguf → 32 tok/sec @ 48k ctx → 26 tok/sec @ 64k ctx
64k context is a big deal. Hermes 3 agent requires 64k minimum to run. you're now getting full hermes compatible context on a budget consumer GPU at 26 tok/sec locally.
2.1x faster on identical hardware.
and here's the part that breaks your brain: the QAT-UD-Q4_K_XL is actually SMALLER than the Q4_K_M "XL"
why? QAT = Quantization Aware Training
Google didn't train the model first and compress it later
they trained it to be quantized from day one
the weights already know how to survive low precision
that's why you get more quality per byte
llamacpp flags: -m gemma-4-12B-it-qat-UD-Q4_K_XL.gguf -cnv -ngl 99 -c 48000 -v
fits in 8GB VRAM clean. no API. no cloud. no subscription.
and this isn't even the MTP variant yet
Gemma-4-E2B QAT runs on 3GB RAM, E4B on 5GB, 12B on 7GB, 26-A4B on 15GB and 31B on 18GB.
I have benchmarked the 26b and 31b qat as well on a single RTX 4090, checkout the comments for details.
If you have a 6GB or 8GB VRAM GPU, post your numbers. more benchmarks and configs coming soon

g023@g023dev·2h

@ThePeterMick Haven't got me yet.

Peter Mick@ThePeterMick·15h

If you’re verified on X, I want to follow you back. Let me know if I haven’t followed you back.

g023@g023dev·2h

@Hikari_07_jp Proxmox?

Hikari∣LocalLLM⚡@Hikari_07_jp·4h

I'm in Tokyo for an AI-related conference. I'm 400 kilometers away from my home lab, but I can remotely connect using my Macbook and run experiments using VRAM anytime. To put it mildly, it's awesome✨

g023@g023dev·3h

@TomTSEC the government is stealing money from the majority to give to a certain class of voters to buy their vote.

Tom Quiggin@TomTSEC·19h

Things have gotten so bad in Canada that the government is handing out money to people so they can afford groceries.

g023@g023dev·3h

@SolaTheAnalyst Try owning one in Calgary lol. Can't live without it, but you'll get taken to the cleaners.

Sola 🇨🇦🇳🇬@SolaTheAnalyst·1d

Owning a car in Toronto is a personality disorder. 🇨🇦 $200 insurance before you move it. $300 parking if you work downtown. The 401 on a Friday. The TTC is $156 a month. But sure. Keep the car.

g023@g023dev·3h

@Sean_Speer Well considering AI is now being used in Alberta and BC to write all the police reports, guess what you'll be up against in court? These datacenters are for them, not you, but they'll be used against you for sure.

Sean Speer@Sean_Speer·17h

The Carney government gets it wrong on AI

This week, the Carney government released AI for All, its long-awaited national artificial intelligence strategy. Although there are some useful aspects to the strategy, including the government’s recognition that Canada suffers too little AI adoption, its central premise is basically wrong.

The document repeatedly frames AI through the lens of “sovereignty,” including the need for greater control over AI infrastructure, data, and advanced models. But sovereignty is a poor organizing principle for Canadian AI policy.

Frontier AI development is increasingly concentrated among a handful of American and Chinese firms with capital budgets that exceed the annual spending of most national governments. The hyperscalers are investing hundreds of billions of dollars in chips, data centres, models, and talent. The notion that Ottawa can engineer a domestically controlled frontier AI ecosystem capable of competing head-to-head with those firms is an unserious starting point for Canadian policy.

University of Toronto economist @Afinetheorem has made the point particularly well. In his view, countries such as Canada face a simple strategic choice: they must find a way to become essential to either the American or Chinese AI stack. Attempting to recreate a fully sovereign stack of our own is neither economically realistic nor technologically plausible.

That insight exposes the main weakness of the government’s approach. The strategy contains pages of discussion about Canadian leadership, sovereignty, and domestic capacity. Yet it says comparatively little about how Canada will position itself within the global AI ecosystem that’s already emerging. There’s little discussion of guaranteed access to frontier models, Canada’s role in AI supply chains, or how Canadian firms can become indispensable partners to the companies building the world’s most advanced systems.

Canada has genuine advantages. We possess abundant energy resources, a strong research base, world-class universities, significant mineral assets, and geographic proximity to the United States. The goal should be to leverage those strengths to attract investment, host infrastructure, develop specialized applications, and deepen our integration into the North American AI economy.

Put simply: Canada’s AI future is more likely to depend on integration than independence. Yet if policymakers become so preoccupied with the political goal of sovereignty, they risk undermining the country’s place in the AI economy that is taking shape.

The Hub@TheHubCanada

.@Sean_Speer: The Carney government gets it wrong on AI thehub.ca/2026/06/05/the…

g023@g023dev·3h

@ryangerritsen TFW program shouldn't have ever existed.

Ryan Gerritsen🇨🇦🇳🇱@ryangerritsen·15h

Let me take you back to 1993 in Toronto Canada. McDonald’s had 23 outlets in Skydome and guess who they employed? 1600 Teenagers aged 16 to 18 years old. These jobs were not meant to be careers, they were a way for teens to earn some money for the summer and post secondary school.

g023@g023dev·3h

@guansi No one should gate keep this tech. Let it free.

管四@guansi·6h

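The biggest problem with Anthropic's Mythos right now isn't that it's not strong enough; it's that it's too strong, so strong that maintainers are starting to beg it to slow down. The biggest bottleneck in cybersecurity used to be finding vulnerabilities; now the problem is becoming that AI finds them faster than humans can fix them. The strangest part is that Anthropic's first reaction wasn't to open it up, but to bring in Apple, Microsoft, and Google to patch holes together first. Because everyone suddenly realized a somewhat frightening fact: the attack surface has started growing at machine speed, while the defense system still runs at human speed. Many people are still debating whether AI will replace programmers, while another group has already started researching how to defend against AI attacking AI. That's a kind of technological gap, I suppose. The most ruthless thing about a technological revolution has never been replacing jobs; it's directly changing the rules of the game.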

g023@g023dev·3h

@QuinnyPig 4.8 with a proper /goal command seems to be working fine for me.

Corey Quinn@QuinnyPig·12h

Is this why Claude keeps saying it’s time to stop working?

g023@g023dev·3h

@Tech2Wild Opus as orchestrator with DeepSeek as a subagent can be a pretty decent combination and a good way to stretch those limits.

Tech2Wild@Tech2Wild·7h

Made a decision today, and I am going 80% local. Moving ALL my agents to DSv4 Flash & Qwen 3.6 27B. Running my orchestrator, ONE Agent on Opus 4.8 to lead the charge. We will build automations & Skills that will help teach and shape the team. Workflows = Success. Wish Me Luck !

g023@g023dev·3h

@satnam6502 We gotta go back, Marty... backwards into the future!

Satnam Singh@satnam6502·5h

Our BMW i4 was bricked after a failed software update. Time to restart our search for a nice electric car, preferably one with no computers in it.

g023@g023dev·3h

@jakevin7 Try this one that I used to use with ChatGPT 3.5 (pre DeepSeek R1); it works best with non-"reasoning" models:

kabikabi@jakevin7·11h

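DeepSeek V4's "Think Max" mode essentially just adds one sentence to the start of the prompt: "You must think every step through clearly; no shortcuts allowed." So is reasoning ability actually emergent, or... scolded into existence?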

g023@g023dev·5h

@aidangomez BC and Alberta have started doing all the police reports using AI, so now that's what you are up against when you go to court. Get the government out of our AI.

Aidan Gomez@aidangomez·1d

Nick, Ivan, and I wrote a short piece on Canada’s role in artificial intelligence in the decades prior and the decades ahead. We as a country need to choose to compete to build, and resource ourselves to do that. Our nation has the talent and ambition to succeed. Our former technology strategies across capital deployment and market adoption have failed. It’s time we take a more aggressive and strategic approach to capability development to ensure we control our destiny, protect our home, and build a foundation that gives the next generation advantage to build up. There’s a bright future to be built if we’re willing to do what it takes to build it.

g023@g023dev·5h

@yacineMTB I did a bunch of robotics stuff a few years ago, but burning out the hardware was hitting me in the wallet too much, so I went back to the virtual world. Arduinos and Raspberry Pis definitely lowered the bar for entry and made it pretty fun.

kache@yacineMTB·5h

The fact that robotics isn't unilaterally solved is just an algorithms gap. All of the silicon valley companies doing "robotics" right now with their data collection meme are on the wrong path entirely. It's surprising that robotics isn't really, actually solved. It's not hard

g023@g023dev·6h

@liquidai (Uses small bit of C++ in tokenizer.h)

g023@g023dev·7h

Started out working on a structured sparse-attention idea and ended up focusing on a pure C inference project with flash-decoding, so here is my glorious attempt for anyone else to use as they wish (for LFM2.5-8B-A1B @liquidai ). ~105 tok/s on an RTX 3060 12GB. github.com/g023/cuda_inf

g023@g023dev·7h

@rozzabuilds Right now Claude and DeepSeek. Copilot can eat it.

Rozzabuilds@rozzabuilds·22h

Devs, be honest. Are you paying for Claude, Codex... or both?

g023@g023dev·7h

@mert_0_0_ a server rack

Mert@mert_0_0_·16h

founders, if your bank account suddenly hit $100M tomorrow... what are you building next?
