g023

5K posts

g023 banner
g023

g023

@g023dev

developer/programmer/ai nerd

Canada انضم Ekim 2023
2.3K يتبع517 المتابعون
تغريدة مثبتة
g023
g023@g023dev·
So I optimized the model, i optimized the harness, now I'm optimizing the endpoint by making an openai api to deepseek endpoint proxy that has some context compression features automatically integrated to attempt to save $$$ (works well with copilot): gist.github.com/g023/c2bb7b540…
English
0
0
4
311
antirez
antirez@antirez·
I find myself telling coding agents on different machines to take a log about the work the are doing, and to use different skill files for certain processes. It's time to start using Redis arrays with ARGREP I guess, to have all centralized. I'll share a skill file and a video.
English
6
1
80
3.8K
g023
g023@g023dev·
@binomen96 depends on your usage. For coding, probably not as the price of energy makes api costs cheaper than running local. For small agentic tasks, its nice because you don't have to worry about rate limits and can hammer the hell outta them.
English
0
0
0
1
binomen
binomen@binomen96·
I've been thinking about running local AI models instead of paying for API subscriptions. anyone actually doing this? is the quality gap real or overhyped?
English
2
0
3
37
g023
g023@g023dev·
@amirmxt Try being in Calgary. Like a ghost town for tech talent. Not sure what the plan is for this country, but it ain't looking good.
English
0
0
0
15
amirmxt
amirmxt@amirmxt·
had a call with company hiring for a FDE (their portfolio includes 190+ companies) salary: $130K how does canada expect to compete for talent
English
10
1
43
7.7K
g023
g023@g023dev·
@auroter add deepseek v4 flash to that mix
English
1
0
1
18
Gandalf Stormdrain
Gandalf Stormdrain@auroter·
Frontier AI is BRAINDEAD. GPT5.5 xHigh in Codex thinks I should use Tensor Parallelism to deploy Qwen 3.6 27B on my system which has 4x RTX 6000 Pro Blackwell cards. Why, you ask? It's reasoning is that without Tensor Parallelism, "we would be forced to serve the model across 4 separate ports, which would confuse OpenCode." Yes, it's suggesting I run Tensor Parallelism to deploy a model which easily fits in BF16 on a single card. Because ports are scary, and you couldn't possibly listen on one of them and direct traffic accordingly. ... In other news, I am doing a shootout this morning, giving the same problem to GPT 5.5 xHigh, Opus 4.8 Max, Qwen3.5 397b-a17b, Qwen3.6 27B and Nemotron 3 Ultra. The larger models will be quantized to NVFP4, and the 27B will be run in BF16. As you may have noticed, we are not off to a good start with GPT5.5. It's struggling to figure out how to set up the shootout without my explicit guidance. So far I am seeing that its superiority over Opus 4.8 is marginal at best. Stay tuned. This topic of open source models vs frontier has come up a few times in recent conversations with people on X, so I want to do a real life comparison of these models and their ability to problem-solve real scenarios on my ongoing project.
English
4
0
24
1.9K
g023
g023@g023dev·
In Canada they're using this technology in Alberta and BC... the future is already being judged by robots eff.org/deeplinks/2025…
English
0
0
0
2
g023 أُعيد تغريده
DailyPapers
DailyPapers@HuggingPapers·
dMoE: Block-level routing for diffusion LLMs Reduces uniquely activated experts from 69.5 to 14.6 while retaining 99.11% performance, cuts memory by up to 80%, and delivers up to 1.66× speedup.
DailyPapers tweet media
English
1
17
100
5.1K
g023
g023@g023dev·
@AdrianaTX1m I mean you're 70. How long do you think your body is going to look pristine? At that point you don't really care so much i'm sure.
English
0
0
0
179
Adriana M
Adriana M@AdrianaTX1m·
¿Qué cambio tendrán estos tatuajes cuando su piel tenga 70 años?
Español
1.4K
814
32.7K
7.3M
g023
g023@g023dev·
@AmandineFlachs I like keeping my bench specs minimal. Easy to move up, not so easy to revert back. I mainly use 3060s@12GB for my dev work as a good baseline. The frustration/challenge makes it fun.
English
0
0
1
28
Amandine Flachs
Amandine Flachs@AmandineFlachs·
Running local llms is humbling. Whatever gpu you have, it always feels like your gpu poor.
English
3
0
4
224
g023
g023@g023dev·
Some HAM radio guy showed me how to communicate with satellites using some scrap copper wire and a broomstick the other day. I need more of those people in my life.
English
0
0
0
17
g023
g023@g023dev·
@porterstansb Stop putting PHDs on a pedestal and this would be a non-issue. Go back to hiring nerds that just like to grind.
English
0
0
0
14
Porter Stansberry
Porter Stansberry@porterstansb·
My son's elite private high school school held graduation last week. There were roughly 100 kids in the graduating class. Lots of extremely bright kids. Probably 20 IB diplomas. And maybe a dozen of those kids were Cum Laude Society too, including my son. These kids are incredibly smart and hard working. They had straight A's throughout high school, at an elite school, in the most challenging curriculum and earned the highest SAT scores. Out of the entire class, there was only one student who was accepted into an Ivy League school. One. But he didn't earn an IB diploma. He wasn't Cum Laude Society. Or on the Head Master's List. He is a great kid. And I'm happy for him. This isn't, in any way, a criticism of him. But, you already know -- he had something none of the highest achieving kids had. Something has gone terribly wrong with our society when the very brightest kids are systematically barred from the most elite academic colleges in the country. And especially when the board of the school and its staff see absolutely nothing wrong with this outcome. In fact they will, undoutably, call me a bigot -- again. But when will they fight for our sons? They won't. Our sons are the wrong color. P.S. In the middle of the ceremony, the American flag left of the stage fell over and hit the ground. The school's headmaster picked it up. But he didn't know how to fix the mount. So, he just stood it back up and let it fall over again. Twice. Finally a man who was obviously a former member of the armed services came over and, standing at attention, held the flag upright. Tells you 100% everything you need to know about the headmaster and the culture of that school.
English
867
1.7K
20K
1.2M
Liquid AI
Liquid AI@liquidai·
Training LFMs at scale means solving parallelism across every layer of the architecture. And not all layers are the same. Our CTO Mathias Lechner (@mlech26l) sits down with Liquid's founding engineer Paul Pak (@paulpak__) to talk training infrastructure: Data, tensor, pipeline, expert, and context parallelism, and how they make context parallelism work across hybrid architectures with both attention and convolution operators.
English
8
15
149
15.1K
g023
g023@g023dev·
@shaun_on_x ya sometimes I just run it to see what it does and then just take the changes I want and manually implement them in a different copy.
English
0
0
0
7
Shaun
Shaun@shaun_on_x·
@g023dev I just don't have it in me to ship code approved by another "Review agent" I wanna know what the 65 file code change does myself because bruh if the entire system fails, it's so over
English
1
0
1
36
Shaun
Shaun@shaun_on_x·
Uhh, how do I review and actually check all of this code myself 😭 65 files bruh, 5385 new lines of code HELLLLLP
Shaun tweet media
English
4
0
10
1.2K
g023
g023@g023dev·
Other calgary devs, do you even exist out there? Like a wasteland for tech from calgary here. We should change that.
English
0
0
1
19
g023
g023@g023dev·
@buildwithhassan Do you really need V4 Pro? I find a lot of tasks can be handled with V4 Flash with relatively minimal issues.
English
0
0
0
532
Hassan
Hassan@buildwithhassan·
update: opencode published their full model pricing table. deepseek V4 pro still showing $1.74 input / $3.48 output on opencode. deepseek official price after the permanent discount: $0.41 / $0.83. that's 4x more expensive on opencode than going direct. still waiting on the sync.
Hassan tweet media
English
65
21
661
78.9K
g023
g023@g023dev·
@_oliveiradanilo Sounds like an issue that can be solved moving to DeepSeek
English
1
0
28
4.6K
Danilo Oliveira
Danilo Oliveira@_oliveiradanilo·
We will go bankrupt if these Gemini cache costs don't stop. Gemini is currently charging us $1K per hour due a bug with the explicit cache feature, and I am unable to delete the cache from my end. I have been trying to resolve this with billing support for over 24 hours. We spend more than $30k a month on GCP more than 8 years and always pay our bills on time without a single day of delay. They only generate AI SLOP messages in my case. There is not a real person looking this. Please help me escalate this so someone can fix this issue immediately. Please, help me share this post to someone help me to fix this my case #72026129 @Gemini @jastephx @OfficialLoganK #bolhadev @sseraphini @acgfbr
Danilo Oliveira tweet media
English
112
123
1.7K
208.8K
Tristan Rhee
Tristan Rhee@Tristanrhee3·
how do you know if you’re building stuff people actually want?
English
332
5
260
41.6K
g023
g023@g023dev·
They should have an Opus slow version at half price by utilizing more available hardware even if it results in slower performance.
English
0
0
0
14
g023
g023@g023dev·
@merlinaudio_ I prefer it for remembering what was done where, but still like to make excuses to burn my opus rips.
English
0
0
1
107
merlin
merlin@merlinaudio_·
in the age of AI at least some of us still code by hand
English
54
52
1.1K
112K