g023

5K posts

g023 banner
g023

g023

@g023dev

developer/programmer/ai nerd

Canada Beigetreten Ekim 2023
2.3K Folgt514 Follower
Angehefteter Tweet
g023
g023@g023dev·
So I optimized the model, i optimized the harness, now I'm optimizing the endpoint by making an openai api to deepseek endpoint proxy that has some context compression features automatically integrated to attempt to save $$$ (works well with copilot): gist.github.com/g023/c2bb7b540…
English
0
0
4
285
g023
g023@g023dev·
@oota_yoshinori0 thats the beauty of open source... its an adventure.
English
0
0
0
12
g023
g023@g023dev·
@RoyShilkrot ... also a lot less reading to see what it messed around with.
English
0
0
0
9
g023
g023@g023dev·
@RoyShilkrot it truly does help to isolate and work on the problem as a component, rather than the whole, for speed and token efficiency. Especially when dealing with smaller models for tasks.
English
1
0
1
11
Roy Shilkrot
Roy Shilkrot@RoyShilkrot·
The bigger your software project is - the higher the context token cost is. Therefore the KISS principle in software dev still holds, 40 years after its inception. Intelligence is intelligence. Artificial or Human. Holding too much in context doesn’t scale. Keep small. Pay less
English
1
0
2
265
g023
g023@g023dev·
@hyuki I mean 12b would be a bit large for that task for most people and might be really slow on large volumes of pics. You can find some nice ~2b models that can do a good enough job for most purposes.
English
0
0
0
103
結城浩 / Hiroshi Yuki
LM Studio + gemma-4-12b-qat でOpenAI コンパチなAPI持つローカルサーバ立ち上げると、無課金で画像処理AIが使える。たとえば大量のスクリーンショットや写真の分類整理やタグ付けにはぴったりではないだろうか。クラウドに出すのに抵抗があり、分量が多く、スピードと精度はそこそこで良い。
日本語
12
77
484
48.2K
antirez
antirez@antirez·
@ivanfioravanti No, if it misrepresents models in random ways, how is it good? Only because 5.5 happens to be on top?
English
4
1
12
1.8K
antirez
antirez@antirez·
For days, many folks here are citing DeepSWE as the benchmark that restores reality only because it shows GPT 5.5 on top. But actually, it almost gets a single entry right: the top one, and all the rest is shuffled.
English
15
1
136
21.9K
g023
g023@g023dev·
@lmrankhan depending on the task, cleaving out the subagents altogether gives some surprisingly good results.
English
0
0
0
20
Imran
Imran@lmrankhan·
A lot of people are talking about running tons of agents, parallel workflows, skills, and orchestration layers. Honestly, for building an app, I've found two coding agents running in async works perfectly fine, Codex for backend and Opus/Claude Code for frontend. Haven't had to use more than that, skills, or complex workflows. The bottleneck is usually figuring out what to build, not how many agents you're running or using any of the advanced workflows. I'm sure there are more advanced things people are doing, but for most MVPs or early stage products, simplicity works
English
64
17
307
28.8K
g023
g023@g023dev·
@yuhasbeentaken wow thats a pretty good incentive to burn tokens
English
0
0
0
10
Yum⋆₊˚
Yum⋆₊˚@yuhasbeentaken·
at tencent (china’s largest internet company), the token reimbursement quota is dynamic. the more you use, the more you get when it refreshes next month. so… it kinda looks like you’re incentivized to build side projects at work? 😂😂
Zack Korman@ZackKorman

Companies are like "we are spending all this money on AI but we don't know what the devs are even doing with it." Let me answer that for you: They're working on their personal side projects.

English
2
0
11
1.4K
g023
g023@g023dev·
@djcows Does give a bit of an esteem bump for the day when some bigwig top-dawg gives a response to your random yellings on the internet.
English
0
0
0
22
djcows
djcows@djcows·
you can dm some of the smartest people on earth here and they'll sometimes just answer casually, it's honestly crazy and humbling
English
20
2
138
2.8K
g023
g023@g023dev·
@RoguePoma I share a lot of my things in public, and yes sometimes they are pretty raw but useful to me. Always like to learn from others too.
English
1
0
1
6
Alex Poma 🏗️
Alex Poma 🏗️@RoguePoma·
I’m building construction SaaS in public while working full-time as an architect. I’d like to connect with more people doing the same kind of thing: - Building products. - Learning in public. - Sharing the messy times - Sharing the good times What are you building?
English
45
0
37
1.4K
g023
g023@g023dev·
@shyamalanadkat I think what would qualify as AGI would be a session that is always on, has infinite history that doesn't need to be cleared, and carries out its business on its own, either seeding itself with roles, or being directed to a role as a seed role.
English
0
0
0
33
shyamal
shyamal@shyamalanadkat·
early days of agi are going to be so special
English
5
3
45
2.8K
g023
g023@g023dev·
@rajyaligar @smhanov try using deepseek as a subagent and opus as orchestrator to stretch out the window
English
0
0
0
15
Raj
Raj@rajyaligar·
@smhanov Had to upgrade my codex sub this month from 5x to 20x cause the 5 hour window wasn’t cutting it for my workflows Big month for shipping
English
2
0
0
21
Steve Hanov
Steve Hanov@smhanov·
What was your AI bill? I've been using Claude Code and Hermes pretty heavily and up to $33 last month
Steve Hanov tweet mediaSteve Hanov tweet media
English
2
0
5
261
g023
g023@g023dev·
Made a deepseek powered agentic html editor tonite that runs amazing (of course because deepseek is amazing). Man we've come a long ways since Dreamweaver lol. Oh ya, deepseek made it too.
g023 tweet media
English
0
0
1
25
g023
g023@g023dev·
@antoniolupetti I'm working on a concept: an agent that maintains a large, external, sparse key-value memory (not vector database, but differentiable memory like a sparse Transformer memory layer) that is updated during a single long session compressing past into mem tkns & retrieve w/attention
English
0
0
0
16
Antonio Lupetti
Antonio Lupetti@antoniolupetti·
"Graph Memory for LLM Agents" is a recent paper that explores an idea that I find quite interesting. Most AI memory systems treat remembering as a retrieval problem (the model searches its memory, retrieves relevant information, and then reasons about it). This paper argues that the process may be more dynamic than that and, instead of simply retrieving memories, an AI agent could reconstruct them during reasoning, following clues, associations, and intermediate evidence as they emerge. What I find interesting is the possibility that memory and reasoning may not be separate processes at all, but that remembering itself could be part of reasoning. arxiv.org/abs/2606.06036
Antonio Lupetti tweet media
English
5
2
54
2.4K
g023
g023@g023dev·
@dosco Try the LFM2.5 models (especially the 8b A1B moe)
English
1
0
0
26
spacy
spacy@dosco·
my whole feed is local models after the big drops last week excited for this future it’s also exactly where DSPy and RLM wins
Alok@analogalok

a new 8GB VRAM GPU dense Local LLM leader was born yesterday runs on: RTX 4060 / RTX 3070 / RTX 2080. any 8GB card Qwen 3.5 9B (dense) was the go to for 6-8GB VRAM builds. Gemma 4 12B QAT (dense) just changed that. same llama.cpp + cuda 13.2. i7 12700H. 16GB RAM. same -ngl 99 flags. same 48k context. unsloth gemma-4-12b-it-Q4_K_M.gguf → 15 tok/sec @ 48k ctx unsloth gemma-4-12B-it-qat-UD-Q4_K_XL.gguf → 32 tok/sec @ 48k ctx → 26 tok/sec @ 64k ctx 64k context is a big deal. Hermes 3 agent requires 64k minimum to run. you're now getting full hermes compatible context on a budget consumer GPU at 26 tok/sec locally. 2.1x faster on identical hardware. and here's the part that breaks your brain: the QAT-UD-Q4_K_XL is actually SMALLER than the Q4_K_M "XL" why? QAT = Quantization Aware Training Google didn't train the model first and compress it later they trained it to be quantized from day one the weights already know how to survive low precision that's why you get more quality per byte llamacpp flags: -m gemma-4-12B-it-qat-UD-Q4_K_XL.gguf -cnv -ngl 99 -c 48000 -v fits in 8GB VRAM clean. no API. no cloud. no subscription. and this isn't even the MTP variant yet Gemma-4-E2B QAT runs on 3GB RAM, E4B on 5GB, 12B on 7GB, 26-A4B on 15GB and 31B on 18GB. I have benchmarked the 26b and 31b qat as well on a single RTX 4090, checkout the comments for details. If you have a 6GB or 8GB VRAM GPU, post your numbers. more benchmarks and configs coming soon

English
2
1
28
3.1K
Peter Mick
Peter Mick@ThePeterMick·
If you’re verified on X I want to follow you back Let me know if I haven’t followed you back
Peter Mick tweet media
English
165
3
134
8.8K
Hikari∣LocalLLM⚡
Hikari∣LocalLLM⚡@Hikari_07_jp·
I'm in Tokyo for an AI-related conference. I'm 400 kilometers away from my home lab, but I can remotely connect using my Macbook and run experiments using VRAM anytime. To put it mildly, it's awesome✨
Hikari∣LocalLLM⚡ tweet media
English
7
0
39
1.1K
g023
g023@g023dev·
@TomTSEC the government is stealing money from the majority to give to a certain class of voters to buy their vote.
English
0
0
0
6
Tom Quiggin
Tom Quiggin@TomTSEC·
Things have gotten so bad in Canada that the government is handing out money to people so they can afford groceries.
English
74
118
602
8K
g023
g023@g023dev·
@SolaTheAnalyst Try owning one in Calgary lol. Can't live without it, but you'll get taken to the cleaners.
English
0
0
0
21
Sola 🇨🇦🇳🇬
Sola 🇨🇦🇳🇬@SolaTheAnalyst·
Owning a car in Toronto is a personality disorder. 🇨🇦 $200 insurance before you move it. $300 parking if you work downtown. The 401 on a Friday. The TTC is $156 a month. But sure. Keep the car.
English
135
34
327
62.5K
g023
g023@g023dev·
@Sean_Speer Well considering AI is now being used in Alberta and BC to write all the police reports, guess what you'll be up against in court? These datacenters are for them, not you, but they'll be used against you for sure.
English
0
0
0
6
Sean Speer
Sean Speer@Sean_Speer·
The Carney government gets it wrong on AI This week, the Carney government released AI for All, its long-awaited national artificial intelligence strategy. Although there are some useful aspects to the strategy—including the government’s recognition that Canada suffers too little AI adoption—its central premise is basically wrong. The document repeatedly frames AI through the lens of “sovereignty,” including the need for greater control over AI infrastructure, data, and advanced models. But sovereignty is a poor organizing principle for Canadian AI policy. Frontier AI development is increasingly concentrated among a handful of American and Chinese firms with capital budgets that exceed the annual spending of most national governments. The hyperscalers are investing hundreds of billions of dollars in chips, data centres, models, and talent. The notion that Ottawa can engineer a domestically controlled frontier AI ecosystem capable of competing head-to-head with those firms is an unserious starting point for Canadian policy. University of Toronto economist @Afinetheorem has made the point particularly well. In his view, countries such as Canada face a simple strategic choice: they must find a way to become essential to either the American or Chinese AI stack. Attempting to recreate a fully sovereign stack of our own is neither economically realistic nor technologically plausible. That insight exposes the main weakness of the government’s approach. The strategy contains pages of discussion about Canadian leadership, sovereignty, and domestic capacity. Yet it says comparatively little about how Canada will position itself within the global AI ecosystem that’s already emerging. There’s little discussion of guaranteed access to frontier models, Canada’s role in AI supply chains, or how Canadian firms can become indispensable partners to the companies building the world’s most advanced systems. Canada has genuine advantages. We possess abundant energy resources, a strong research base, world-class universities, significant mineral assets, and geographic proximity to the United States. The goal should be to leverage those strengths to attract investment, host infrastructure, develop specialized applications, and deepen our integration into the North American AI economy. Put simply: Canada’s AI future is more likely to depend on integration than independence. Yet if policymakers become so preoccupied with the political goal of sovereignty, they risk undermining the country’s place in the AI economy around taking shape.
The Hub@TheHubCanada

.@Sean_Speer: The Carney government gets it wrong on AI thehub.ca/2026/06/05/the…

English
25
30
121
18.2K