g023

5K posts

@g023dev

developer/programmer/ai nerd

Canada · Joined October 2023

2.3K Following · 517 Followers

Pinned post

g023@g023dev·Apr 25

So I optimized the model, I optimized the harness, and now I'm optimizing the endpoint by making an OpenAI API to DeepSeek endpoint proxy with context compression features automatically integrated to try to save $$$ (works well with Copilot): gist.github.com/g023/c2bb7b540…

g023@g023dev·1m

@antirez this might "peek" your curiosity.. ( arxiv.org/abs/2605.19932 )

antirez@antirez·1h

I find myself telling coding agents on different machines to keep a log of the work they are doing, and to use different skill files for certain processes. It's time to start using Redis arrays with ARGREP I guess, to have it all centralized. I'll share a skill file and a video.

g023@g023dev·5m

@binomen96 Depends on your usage. For coding, probably not, as the price of energy makes API costs cheaper than running local. For small agentic tasks, it's nice because you don't have to worry about rate limits and can hammer the hell outta them.

binomen@binomen96·5h

I've been thinking about running local AI models instead of paying for API subscriptions. Anyone actually doing this? Is the quality gap real or overhyped?

g023@g023dev·7m

@amirmxt Try being in Calgary. Like a ghost town for tech talent. Not sure what the plan is for this country, but it ain't looking good.

amirmxt@amirmxt·2h

Had a call with a company hiring for an FDE (their portfolio includes 190+ companies). Salary: $130K. How does Canada expect to compete for talent?

g023@g023dev·9m

@auroter Add DeepSeek V4 Flash to that mix

Gandalf Stormdrain@auroter·3h

Frontier AI is BRAINDEAD. GPT5.5 xHigh in Codex thinks I should use Tensor Parallelism to deploy Qwen 3.6 27B on my system which has 4x RTX 6000 Pro Blackwell cards. Why, you ask? Its reasoning is that without Tensor Parallelism, "we would be forced to serve the model across 4 separate ports, which would confuse OpenCode." Yes, it's suggesting I run Tensor Parallelism to deploy a model which easily fits in BF16 on a single card. Because ports are scary, and you couldn't possibly listen on one of them and direct traffic accordingly. ... In other news, I am doing a shootout this morning, giving the same problem to GPT 5.5 xHigh, Opus 4.8 Max, Qwen3.5 397b-a17b, Qwen3.6 27B and Nemotron 3 Ultra. The larger models will be quantized to NVFP4, and the 27B will be run in BF16. As you may have noticed, we are not off to a good start with GPT5.5. It's struggling to figure out how to set up the shootout without my explicit guidance. So far I am seeing that its superiority over Opus 4.8 is marginal at best. Stay tuned. This topic of open source models vs frontier has come up a few times in recent conversations with people on X, so I want to do a real-life comparison of these models and their ability to problem-solve real scenarios on my ongoing project.
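
The "listen on one port and direct traffic accordingly" idea in the post above can be sketched as a round-robin backend picker. This is only an illustration, not the poster's setup: the port numbers and the names `BACKEND_PORTS`, `make_round_robin`, and `pick_backend` are hypothetical, and a real deployment would more likely put a reverse proxy in front of the four servers.

```python
import itertools

# Hypothetical ports: one independent model server per GPU (assumption).
BACKEND_PORTS = [8001, 8002, 8003, 8004]

def make_round_robin(ports):
    """Return a function that yields the next backend port in rotation."""
    rotation = itertools.cycle(ports)

    def pick():
        # Each call advances the rotation by one backend.
        return next(rotation)

    return pick

# A single front-end listener would call this once per incoming request
# and forward the request to the chosen port.
pick_backend = make_round_robin(BACKEND_PORTS)
```

Clients such as OpenCode would then only ever see the single front-end port, while requests spread evenly across the four single-GPU servers.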

g023@g023dev·11m

In Canada they're using this technology in Alberta and BC... the future is already being judged by robots eff.org/deeplinks/2025…

g023 reposted

DailyPapers@HuggingPapers·1d

dMoE: Block-level routing for diffusion LLMs Reduces uniquely activated experts from 69.5 to 14.6 while retaining 99.11% performance, cuts memory by up to 80%, and delivers up to 1.66× speedup.

g023@g023dev·11h

@nikitabier Optimizing > Building

Nikita Bier@nikitabier·16h

Some of the most impactful work we do at X is invisible to the user. In the last 12 months, we have rewritten almost every core part of the app. We will soon be shipping a 90% reduction in our web app’s load times.

Yagiz Nizipli@yagiznizipli

Performance improvements to upcoming versions of X.com (under slow / 4G internet connection).

g023@g023dev·12h

@AdrianaTX1m I mean you're 70. How long do you think your body is going to look pristine? At that point you don't really care so much, I'm sure.

Adriana M@AdrianaTX1m·19h

How will these tattoos change when their skin is 70 years old?

g023@g023dev·16h

@AmandineFlachs I like keeping my bench specs minimal. Easy to move up, not so easy to revert back. I mainly use 3060s@12GB for my dev work as a good baseline. The frustration/challenge makes it fun.

Amandine Flachs@AmandineFlachs·1d

Running local LLMs is humbling. Whatever GPU you have, it always feels like you're GPU poor.

g023@g023dev·16h

Some HAM radio guy showed me how to communicate with satellites using some scrap copper wire and a broomstick the other day. I need more of those people in my life.

g023@g023dev·16h

@porterstansb Stop putting PhDs on a pedestal and this would be a non-issue. Go back to hiring nerds that just like to grind.

Porter Stansberry@porterstansb·1d

My son's elite private high school held graduation last week. There were roughly 100 kids in the graduating class. Lots of extremely bright kids. Probably 20 IB diplomas. And maybe a dozen of those kids were Cum Laude Society too, including my son. These kids are incredibly smart and hard working. They had straight A's throughout high school, at an elite school, in the most challenging curriculum and earned the highest SAT scores. Out of the entire class, there was only one student who was accepted into an Ivy League school. One. But he didn't earn an IB diploma. He wasn't Cum Laude Society. Or on the Head Master's List. He is a great kid. And I'm happy for him. This isn't, in any way, a criticism of him. But, you already know -- he had something none of the highest achieving kids had. Something has gone terribly wrong with our society when the very brightest kids are systematically barred from the most elite academic colleges in the country. And especially when the board of the school and its staff see absolutely nothing wrong with this outcome. In fact they will, undoubtedly, call me a bigot -- again. But when will they fight for our sons? They won't. Our sons are the wrong color. P.S. In the middle of the ceremony, the American flag left of the stage fell over and hit the ground. The school's headmaster picked it up. But he didn't know how to fix the mount. So, he just stood it back up and let it fall over again. Twice. Finally a man who was obviously a former member of the armed services came over and, standing at attention, held the flag upright. Tells you 100% everything you need to know about the headmaster and the culture of that school.

g023@g023dev·19h

@liquidai @mlech26l @paulpak__ Implemented flash-decoding in a pure C/C++/CUDA inferencing engine for LFM2.5-8B-A1B if anyone's interested in trashing: github.com/g023/cuda_inf

Liquid AI@liquidai·1d

Training LFMs at scale means solving parallelism across every layer of the architecture. And not all layers are the same. Our CTO Mathias Lechner (@mlech26l) sits down with Liquid's founding engineer Paul Pak (@paulpak__) to talk training infrastructure: Data, tensor, pipeline, expert, and context parallelism, and how they make context parallelism work across hybrid architectures with both attention and convolution operators.

g023@g023dev·19h

@shaun_on_x ya sometimes I just run it to see what it does and then just take the changes I want and manually implement them in a different copy.

Shaun@shaun_on_x·19h

@g023dev I just don't have it in me to ship code approved by another "Review agent" I wanna know what the 65 file code change does myself because bruh if the entire system fails, it's so over

Shaun@shaun_on_x·1d

Uhh, how do I review and actually check all of this code myself 😭 65 files bruh, 5385 new lines of code HELLLLLP

g023@g023dev·19h

Other Calgary devs, do you even exist out there? It's like a tech wasteland here in Calgary. We should change that.

g023@g023dev·19h

@buildwithhassan Do you really need V4 Pro? I find a lot of tasks can be handled with V4 Flash with relatively minimal issues.

Hassan@buildwithhassan·1d

Update: OpenCode published their full model pricing table. DeepSeek V4 Pro is still showing $1.74 input / $3.48 output on OpenCode. DeepSeek's official price after the permanent discount: $0.41 / $0.83. That's about 4x more expensive on OpenCode than going direct. Still waiting on the sync.
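
The "4x" figure in the post above can be checked with a few lines of arithmetic. This is a minimal sketch using only the four prices quoted in the post; the pricing unit is not stated there, and the names `listed`, `official`, and `markup` are illustrative.

```python
# Prices in USD as quoted in the post (the unit is not stated there).
listed = {"input": 1.74, "output": 3.48}    # shown on OpenCode
official = {"input": 0.41, "output": 0.83}  # DeepSeek direct, after discount

def markup(listed_prices, official_prices):
    """Ratio of listed price to official price, per token direction."""
    return {kind: listed_prices[kind] / official_prices[kind]
            for kind in listed_prices}

ratios = markup(listed, official)
for kind, ratio in ratios.items():
    print(f"{kind}: {ratio:.2f}x")
```

Both ratios come out slightly above 4, so "4x" is a fair rounding of the quoted prices.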

g023@g023dev·19h

@_oliveiradanilo Sounds like an issue that can be solved by moving to DeepSeek

Danilo Oliveira@_oliveiradanilo·1d

We will go bankrupt if these Gemini cache costs don't stop. Gemini is currently charging us $1K per hour due to a bug with the explicit cache feature, and I am unable to delete the cache from my end. I have been trying to resolve this with billing support for over 24 hours. We have spent more than $30k a month on GCP for more than 8 years and have always paid our bills on time, without a single day of delay. They only generate AI SLOP messages in my case. There is no real person looking at this. Please help me escalate this so someone can fix this issue immediately. Please help me share this post so someone can help fix my case #72026129 @Gemini @jastephx @OfficialLoganK #bolhadev @sseraphini @acgfbr