Lee Higgins

8.9K posts

@Depthperpixel

AI-augmented Flutter dev. We have entered the age of AI. What a time to be alive! #flutter #flutterDev

Barcelona, Spain · Joined May 2010
919 Following · 963 Followers
Pinned Tweet
Lee Higgins@Depthperpixel·
birdfeedgames.com We are looking for commissions. Let's build a beautiful game together.
Lee Higgins tweet media
1 reply · 0 reposts · 14 likes · 2.9K views
Jesse Ezell@jezell·
Tried every audio package there is; they all either don't support Flutter web or suck on it. By suck I mean none of them can do basic audio streaming without clicks and artifacts every few milliseconds, especially in the debugger. Take soloud for example: it's a really cool library, probably amazing on native, but you just can't pump those JS byte arrays through to WASM fast enough or something (byte array copy perf from JS to WASM is the bane of a lot of things). So, after wasting hours trying everything on pub.dev and getting crap results with all of them, I just asked Codex to write me one that uses JS interop and Web Audio. Worked great on the first try. Codex is the best package manager.
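Not the package's actual code, but a minimal TypeScript sketch of the approach the tweet describes: hand PCM chunks straight to the Web Audio API and schedule each buffer back-to-back on the AudioContext clock, so playback stays gapless instead of clicking at chunk boundaries. The class name, mono Float32 chunk format, and 50 ms underrun margin are illustrative assumptions, not anything from the tweet.

```typescript
// Minimal gapless streaming sketch on top of the Web Audio API.
// Assumes mono Float32 PCM chunks arriving from the app (e.g. over a JS interop boundary).
class PcmStreamPlayer {
  private ctx = new AudioContext();
  private nextStartTime = 0; // AudioContext time at which the next chunk should begin

  // Queue one chunk of samples; chunks are scheduled back-to-back so there are
  // no gaps (clicks) between them, even if this is called at irregular intervals.
  enqueue(samples: Float32Array): void {
    const buffer = this.ctx.createBuffer(1, samples.length, this.ctx.sampleRate);
    buffer.copyToChannel(samples, 0);

    const source = this.ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(this.ctx.destination);

    // If we have fallen behind (underrun), restart slightly in the future.
    const now = this.ctx.currentTime;
    if (this.nextStartTime < now) {
      this.nextStartTime = now + 0.05;
    }
    source.start(this.nextStartTime);
    this.nextStartTime += buffer.duration;
  }
}
```

Scheduling against ctx.currentTime rather than reacting to JS callbacks is what avoids the every-few-milliseconds artifacts: the audio thread, not the JS event loop, decides when each buffer plays.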
6 replies · 2 reposts · 42 likes · 2K views
Sudo su@sudoingX·
do you understand what's happening here? if this doesn't excite you about local ai nothing will. my dgx spark is writing custom CUDA kernels to optimize its own inference. the agent studied the triton-proven algorithm, understood the dispatch chain, and is now writing a native CUDA kernel as a fast path for Q8 matmul decode. this is a machine improving itself. autonomously. powered by hermes agent /goal running qwen 27B locally. no human wrote this. no api was called. just local silicon teaching itself to run faster.
Sudo su tweet media
Sudo su@sudoingX

my dgx spark is writing custom CUDA kernels to make itself faster. let that sink in. hermes agent running qwen 3.6 27B Q8 autonomously decided to port its own triton kernel to native CUDA C++ for llama.cpp integration. it understood the dispatch chain. studied the mmq kernel structure. now it's writing the port itself. this machine is literally optimizing its own inference pipeline. no human in the loop. i set a /goal last night and woke up to a 12.91x speedup on SSM and 9.66x on Q8 matmul. now it wants another 2-3x through FP8 tensor cores. local ai. autonomous agents. self-improving inference. this is not science fiction. this is my friday.
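For context on what a "Q8 matmul decode" computes: in llama.cpp-style Q8 quantization, weights are stored as blocks of int8 values with one scale per block, and the decode step is essentially a matrix-vector product against those blocks. Below is only a plain TypeScript scalar reference under an assumed q8_0-like layout (32 quants per block plus a scale); the speedups described in the thread come from fusing this into a native CUDA kernel, which this sketch does not attempt.

```typescript
// Scalar reference for a q8_0-style quantized mat-vec (one decode step).
// Layout assumption: each weight row is stored as blocks of 32 int8 quants
// plus one float scale per block; dequantized weight = quant * scale.
const BLOCK = 32;

interface QuantizedRow {
  quants: Int8Array;    // length = cols
  scales: Float32Array; // length = cols / BLOCK
}

function q8MatVec(rows: QuantizedRow[], x: Float32Array): Float32Array {
  const y = new Float32Array(rows.length);
  for (let r = 0; r < rows.length; r++) {
    const { quants, scales } = rows[r];
    let acc = 0;
    for (let b = 0; b < scales.length; b++) {
      // Accumulate one block, then apply its scale once.
      let blockSum = 0;
      for (let i = 0; i < BLOCK; i++) {
        const col = b * BLOCK + i;
        blockSum += quants[col] * x[col];
      }
      acc += blockSum * scales[b];
    }
    y[r] = acc;
  }
  return y;
}
```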

15 replies · 15 reposts · 181 likes · 14.3K views
kitze@thekitze·
@elonmusk how many electron apps can it run
5 replies · 1 repost · 47 likes · 4.7K views
Lee Higgins@Depthperpixel·
@dakshgup @greptile Fine with usage-based; it's just that not all PRs are the same. We have many small PRs that cost the same as big PRs.
0 replies · 0 reposts · 0 likes · 27 views
Daksh Gupta@dakshgup·
hey! it's $30 per month with 50 reviews, and you can turn off usage-based pricing from the dashboard if you'd prefer. it's unlikely we'll move off a usage model any time soon. we want to be able to continue using the best frontier models in our product, and it's hard to do so without it.
1 reply · 0 reposts · 0 likes · 30 views
Lee Higgins@Depthperpixel·
@greptile is a great product, but the pricing is far too high. $90 per month per person with only 30 reviews, then $1 per review... $250 in extra spend a week into this month. The pace of PRs with AI makes you too expensive. And I can't seem to find a way to turn off the overage, and I waste a bunch of my reviews on tiny PRs. You need a better pricing model. Let me know when you have one and I might come back.
2 replies · 0 reposts · 1 like · 121 views
Lee Higgins@Depthperpixel·
@sudoingX Vibe code as fast as possible -> improve structure -> Image gen 2 chat loop with Codex on the codebase, then build a design system with a separate Storybook app. Worked shockingly well for me today.
0 replies · 0 reposts · 0 likes · 43 views
Sudo su@sudoingX·
ill just say it. chatgpt 5.5 frontend skills are retarded. great at agentic backend, terrible at design execution.
24 replies · 3 reposts · 67 likes · 4K views
ℏεsam@Hesamation·
> 12M context window (read it again)
> 52x faster than FlashAttention
> beats Opus 4.6 on SWE-Bench
> 5% the cost of Opus
BUT WAIT A MINUTE:
> technical blog not technical
> access coming soon
> paper coming soon
> "Built by researchers from Meta, Google, Oxford, Cambridge, BYU" doesn't name a single one of them
if this is not a scam and the numbers aren't dishonest, it's still disgustingly promotional.
ℏεsam tweet media
Alexander Whedon@alex_whedon

Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.

55 replies · 47 reposts · 1.3K likes · 123.1K views
Noah@NoahKingJr·
TELL ME SOMETHING YOU CAN DO THAT CLAUDE CANNOT
3.1K replies · 71 reposts · 1.8K likes · 896.7K views
Alexander Whedon@alex_whedon·
Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.
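That last paragraph is the standard sparse-attention pitch: full attention scores every query against every key, while a sparse scheme keeps only the few pairs that matter. SubQ's actual SSA architecture is not public, so the sketch below is just a generic toy of the idea (keep the top-k scores per query, drop the rest), not the announced method.

```typescript
// Toy single-head attention where each query attends only to its top-k keys.
// Generic illustration of "only a small fraction of pairs matter"; it is NOT
// the unpublished SSA architecture, and it still scores all pairs first;
// a genuinely sub-quadratic method must avoid that step too.
function sparseTopKAttention(
  Q: number[][], K: number[][], V: number[][], k: number
): number[][] {
  const dot = (a: number[], b: number[]) => a.reduce((s, ai, i) => s + ai * b[i], 0);
  const scale = 1 / Math.sqrt(Q[0].length);

  return Q.map(q => {
    // Score this query against every key, then keep only the k largest scores.
    const scores = K.map(key => dot(q, key) * scale);
    const keep = [...scores.keys()].sort((a, b) => scores[b] - scores[a]).slice(0, k);

    // Softmax over the kept scores only.
    const maxS = Math.max(...keep.map(i => scores[i]));
    const exps = keep.map(i => Math.exp(scores[i] - maxS));
    const Z = exps.reduce((s, e) => s + e, 0);

    // Weighted sum of the kept value vectors.
    const out = new Array(V[0].length).fill(0);
    keep.forEach((idx, j) => {
      const w = exps[j] / Z;
      for (let d = 0; d < out.length; d++) out[d] += w * V[idx][d];
    });
    return out;
  });
}
```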
1.5K replies · 2.9K reposts · 23K likes · 12.6M views
Lee Higgins@Depthperpixel·
I have run both for a few months; they catch things each other misses. And Greptile catches more things. I'd be happy to have them both if the price were reasonable. CodeRabbit is less than half the price for us. We can do 50 PRs a day, and at $1 a PR that gets expensive. Some PRs are small, so the pricing does not work for us.
0 replies · 0 reposts · 0 likes · 30 views
Sudo su@sudoingX·
if you want mac portability and you want to learn cuda, the dgx spark is the silent king nobody is talking about. 128gb unified memory in a form factor that fits on a desk corner, full cuda stack, runs nemotron 30b q8 at 56 tok/s on hermes agent, multimodal + tool calls. nobody has written custom kernels for this specific silicon yet. spark has its own architecture (gb10 blackwell, aarch64), and the whole ecosystem of model-specific kernel work for 3090 / 4090 / 5090 has not been ported here. that is an open lane for builders who want the territory. i expect nvidia to focus on its ecosystem more this year. the hardware is in front of builders, the software needs to catch up to make spark the developer-default for portable ai workstations. if you have one and you have not written or tested anything model-specific on it yet, you are sitting on the most underexplored consumer AI silicon shipping right now.
Sudo su tweet media
33 replies · 8 reposts · 175 likes · 15.8K views
Lee Higgins@Depthperpixel·
@XorDev Yeah this is what makes it hard. You need a contribution heatmap that looks at conditions etc. Nightmare.
0 replies · 0 reposts · 1 like · 7 views
Xor@XorDev·
@Depthperpixel Essentially every bit of the code affects every pixel
3 replies · 0 reposts · 0 likes · 49 views
Xor@XorDev·
for(float i,z,d,f;i++<1e2;o+=vec4(4,6,8.+z,0)/f-min(dFdx(z)*r.y+z,0.)/exp(d*d/.1)){vec3 p=z*(FC.rgb*2.-r.xyy)/r.y,c=p;p.z+=8.;c.z*=3.;for(f=1.;f++<9.;c+=sin(c.yzx*f+z+t*.5)/f);z+=min(f=.1+abs(.2*c.y+abs(p.y+.8)),d=max(length(p)-3.,.9-length(p-vec3(-1,1,3))))/7.;}o=tanh(o/2e3);
Sam Altman@sama

80 replies · 177 reposts · 3.2K likes · 166.3K views
Lee Higgins@Depthperpixel·
Interesting thought experiment for anyone following the OpenAI case. Replace the name "OpenAI" with "Starving Baby Food Aid". Does your opinion change?
0 replies · 0 reposts · 0 likes · 56 views
Lee Higgins@Depthperpixel·
Goblins 😂
0 replies · 0 reposts · 0 likes · 39 views
Sky@code_coded·
@Depthperpixel @LinusEkenstam Happy days! Turns out there's a recently formed fencing club here in Pattaya! Gonna check it out soon! Pretty sure I'll feel like a fat knight in armour though when I put the kit back on 🤣
1 reply · 0 reposts · 1 like · 62 views
Linus ✦ Ekenstam@LinusEkenstam·
I was skeptical, but now I’m completely convinced. Fencing will become super popular due to this one very particular improvement to the sport. “Sword tip visualization” It’s going to debut at the summer olympics. Every single duel will look like a bloody lightsaber fight
972 replies · 4.2K reposts · 66.1K likes · 3.9M views
Lee Higgins@Depthperpixel·
@sama The correct answer is to use them all in a diverse team. Works the same as it does with humans.
0 replies · 0 reposts · 0 likes · 18 views
Sam Altman@sama·
you know what, all of these "which is better" polls are silly
use codex or claude code, whatever works best for you
i am grateful we live in a time with such amazing tools, and grateful there is a choice
2.2K replies · 1.1K reposts · 23K likes · 1.6M views
Lee Higgins@Depthperpixel·
@sudoingX How does the output compare to the frontier models? Benchmarks say one thing, but every time I've used local models in the past there's been a massive gap.
0 replies · 0 reposts · 0 likes · 516 views
Sudo su@sudoingX·
most of you don't know how big a deal it is that a single rtx 3090 from 2020 runs qwen 27b dense q4 with 256k context at 40 tok/s, full agentic loops on hermes agent, zero tool call failures. the more i build on this card the more i think nobody really knows how untapped it actually is. the silicon was always capable, the models finally caught up.
45 replies · 31 reposts · 569 likes · 243.6K views
PeterSweden@PeterSweden7·
They did it. Poland has enacted ZERO income tax for parents who have at least two children. Parents will pay no income tax on income up to around €33,000. This is being done to increase the birthrate. Very good 🇵🇱👍
382 replies · 1.4K reposts · 12.4K likes · 250.9K views