Derek Colley

4.3K posts

@DerekColley_

Consulting Technology Lead, CTO & CIO Building https://t.co/XGysrW0D1H

Beaconsfield, UK · Joined August 2009

229 Following · 210 Followers

Derek Colley reposted

Sudo su@sudoingX·8h

local builders kept qwen ecosystem alive across 3.5 and 3.6 open weights releases. that's why qwen is THE name in open weight inference in 2026. don't break the pattern on 3.7. closed is the path openai took. they dropped "open" from everything except the company name and now nobody trusts a roadmap that mentions "soon." don't let qwen drift into that territory. 3.7 is currently api only. weights still tba. one repost on this thread is one more signal alibaba sees. let's see if the gate moves.

Sudo su@sudoingX

qwen 3.7 api results look insane. when open weights? don't leave us local builders behind. if we want qwen 3.7 open weights, like and repost. let them hear us.

182

7.7K

Derek Colley@DerekColley_·5h

@sudoingX ooh, I have an old RX580 8GB... does this mean my old ETH mining rigs can come out of the museum? Could we make unified memory with multiple cards?

Sudo su@sudoingX·7h

ok this is wild. 10 year old gtx 1080 8gb pascal card running qwen3 8b locally at 18-20 tok/s via hermes agent and it's actually doing the thing. asked it to build a wireworld cellular automata simulator with 10 tests. autonomous run, no hand holding. expected it to fail on the tool calls. that's not what happened. write_file works. browser_navigate works. terminal commands work. file ops, package installs, version probes, environment setup. agent is firing tool calls cleanly and the model is reasoning about next steps at 18-20 tok/s. on hardware that pre dates "agentic" as a word. it even hit an npm install fail because node 12 is too old. didn't crash. didn't ask me. just started bootstrapping nvm on its own to fix the environment. 10 minutes in. 40% context used. 7.5gb of 8gb vram occupied. still going. i did not think this would work on this hardware. this is the most i've been wrong this month.

Sudo su@sudoingX

so yesterday i dropped the bench numbers and what fits. today is the actual agent running on this 10 year old gpu card. qwen3 8b q4_k_m on a gtx 1080 8gb. hermes agent loaded with full tool set, browser controls live, nvtop pinned at 100% gpu 7.5gb of 8gb vram occupied. the unsloth weights pulled directly from huggingface, q4 quant, llama.cpp built for sm_61 (the pascal compute capability that everyone forgot exists). 31 tok/s gen speed, faster than most people read. this is what happens after the bench. raw perf was the receipt for what fits. now we test what actually works. agent loops, tool calls, real coding tasks coming next. ten year old card, $150 used, running a current open weight model with a current agent. nothing exotic. just the right quant, the right kv cache trick, the right engine compiled for the right arch. tell me what gpu you have, i'll tell you what runs.

158

26.7K

Derek Colley@DerekColley_·8h

@mstrakastrak You may laugh, but Latin's information density is higher, so this dude is winning on a smaller context.

4.5K

Michael Straka@mstrakastrak·13h

An important and underrated skill - learning Latin so you can mog your coworkers by prompting Claude like a wizard

1.8K

98.1K

Derek Colley reposted

Tiago Forte@fortelabs·1d

I think the main thing AI has taught me, through all the time savings it brings, is that I’m not a very interesting person.

Faced with a surplus of free time, I realize I don’t really have hobbies besides content consumption.

I’m forced to conclude that I don’t have very deep friendships, and am not a core member of any particular community.

I’m not very cultured, I’m finding, and don’t have abiding interests in art or literature or history or much that isn’t directly related to my work.

I have a work-centric life, in other words. AI pulls back the curtain on just how impoverished such an existence is, by disabusing me of its necessity.

Given the freedom I’ve always said I wanted, I’m at a loss as to what to do with it, except plow myself even harder into work, thus exacerbating the lesson.

There’s nothing more confronting to humans than freedom.

338

208

3.7K

312.5K

Derek Colley@DerekColley_·9h

@0xSero Have you tried using @mastra components? They have a harness object, durable agent, workflow, MCP client and server, pluggable workspace, and memory options for compression.

0xSero@0xSero·1d

Also on Youtube now! youtu.be/3gxFQ-ynJU0

YouTube

0xSero@0xSero

I had a chat about context + agentic engineering with Eric the founder of Repoprompt and a member of the rate limited podcast. I've learned a lot from Eric over the last 6 months, he has a great understanding of how to best utilise AI agents. Enjoy

4.7K

Derek Colley reposted

Rohas Nagpal@rohasnagpal·18h

AI-native law firm approved in the UK

The UK’s Solicitors Regulation Authority has approved an AI-native law firm that can send legal demand letters for as little as £2. Not a legal tech tool. Not “AI-assisted”. An actual regulated law firm.

This matters because it changes the economics of legal work. When AI handles intake, drafting, workflow, and process-heavy claims, the traditional pyramid model starts breaking:
* fewer juniors
* lower marginal cost
* fixed-fee services
* faster turnaround
* lawyers shifting toward supervision and judgment

The interesting question is no longer: “Will lawyers use AI?” It is: “What kinds of legal work still require a human law firm structure at all?”

We are entering the era of AI-native law firms. And regulators are beginning to accept that reality.

17.2K

Derek Colley@DerekColley_·11h

@witcheer thank you!! So wtf means "what's this flag"?! 🤯

witcheer@witcheer·1d

when I started tuning llama-server, I changed flags randomly until something worked (or didn't). ncmoe 30? ncmoe 10? why is it suddenly 5x slower? what even is the KV cache eating my VRAM for? so I measured everything. every flag, every ncmoe value, the exact VRAM cost per layer, the exact point where performance falls off a cliff. this is the reference I built for myself. 16 flags, each explained with the "when to change it" and "what breaks if you get it wrong" enjoy

witcheer@witcheer

x.com/i/article/2057…

5.2K

Derek Colley@DerekColley_·13h

@TechboyUK @SawyerMerritt Same happened at JLR with the Jaguar adverts. At least in this one you can see the car, which just looks like a Mondeo and a Magic Mouse had a baby.

Paul Richardson@TechboyUK·17h

@SawyerMerritt What a boring advert

Sawyer Merritt@SawyerMerritt·1d

Ferrari has just released a new video of the all-electric Ferrari Luce.

Sawyer Merritt@SawyerMerritt

Ferrari has just officially unveiled its first ever all-electric car, called the Ferrari Luce.
• Starting price: $640,000
• Interior co-designed with Apple's former head of design, Jony Ive
• Range: 280 miles (expected EPA)
• Peak charging speed: 350kW
• 122 kWh battery
• 1,050 horsepower
• 0-60mph: 2.4s
• 800v
• Four-door four-seater
• Four electric motors
• OLED screens
• Weight: 4,982 lbs
• Front motors spin to 30,000 rpm, rears hit 25,500 rpm
• Car uses an accelerometer to capture real vibrations from the electric motors & rear chassis. An algorithm filters out unpleasant frequencies and amplifies only the more “musical” sounds. This can be heard inside and outside the car.
• Paddle shifter on steering wheel changes how aggressively torque is delivered, with five different levels
• The trunk has 21.1 cubic feet of space, the largest luggage capacity the company has ever offered
• 197.6 inches long, about as long as a Tesla Model S

U.S. deliveries start in Q2 2027. More photos in the thread below:

747

368

3.9K

1.4M

Derek Colley@DerekColley_·14h

@shafu0x a network and market for local inference providers. github.com/orgs/sparkl-ne…

123

shafu@shafu0x·1d

shill me what you are building

147

114

10.2K

Derek Colley reposted

Alan Smith@AlanJLSmith·16h

Entrepreneurs who sell their business for £10m: Tax bill in 2019: £1 million Tax bill in 2026: £2.34 million A 134% increase in 6 years. Have you noticed public services significantly improve as a result?
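The percentage in the post above can be checked with a short calculation. This is only a sketch of the arithmetic: the two figures (in £ millions) are taken from the post as given, the function name is illustrative, and nothing here verifies the underlying tax rules.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


# Figures quoted in the post, in £ millions (not independently verified).
tax_2019 = 1.00
tax_2026 = 2.34

increase = percent_increase(tax_2019, tax_2026)
print(f"{increase:.0f}% increase")
```

The result rounds to the 134% quoted in the post.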

109

558

41.1K

Derek Colley@DerekColley_·14h

In the UK, electronic tracking evidence combined with a credible claim of stolen property is now generally sufficient for police to gain entry and search premises without a traditional warrant, under specific conditions.

The Crime and Policing Act 2026 amended the Theft Act 1968 to create a new warrantless entry power specifically for electronically tracked stolen goods.
- at least Inspector rank must authorise
- not reasonably practicable to obtain a warrant, e.g. risk of flight
- police must use discretion

Queen Natalie 👑@TheNorfolkLion·1d

This guy had his bike stolen and traced it down to a house. The bike’s been there for over 12 hours, but listen to how the police acted and the attitude! They say they can’t do anything because the back gate is open and there’s no other evidence. He then argues with the man, who clearly just wants justice for his stolen bike. The British police are an absolute joke.

3.3K

13.2K

62K

3.7M

Derek Colley@DerekColley_·16h

@0xSero Download link broken...

0xSero@0xSero·1d

You can now launch a backend on each of your computers to use each for inference In the agent sessions all models from all servers will be available in the Pi model selector. You can have a vision model on a macbook, and a non-vision coding model on the other, subagents work

3.7K

Derek Colley@DerekColley_·16h

@0xSero thanks, I have been looking for something to unify my devices and capacity

Derek Colley@DerekColley_·18h

AI Builder, tip of the day. Tired of short iterations, small changes, constantly looking for the next thing? Plan!! Use /plan, create bigger changes. Chat with your plan. Then /build. Sit back and watch. Or better, get out, come back later.

Derek Colley@DerekColley_·18h

@aijoey Aggregate tps is interesting, but you currently can’t use parallel agents on the same problem in a harness.

Joey@aijoey·21h

ifykyk. concurrency on dgx spark before all the new breakthroughs. i need to run mtp and nvfp4 versions.

Joey@aijoey

Local AI landing page generation on a DGX Spark. One Gemma-4-26B Q4 GGUF served by llama.cpp with 7 concurrent decode slots.

The orchestrator breaks “landing page” into 6 section briefs: hero, features, steps, testimonials, pricing, CTA.

Then 6 Gemma instances generate the sections in parallel and stitch everything into one Tailwind page. ~3 minutes end to end.

The best part: everything you just watched happens offline, forever. No one can turn it off besides my light company lol @googlegemma @NVIDIAAIDev
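The fan-out-and-stitch pattern described in the post above might look roughly like the sketch below. It is not the author's orchestrator: `generate_section` is a hypothetical stand-in that returns placeholder HTML instead of calling a model, and `build_page` and the worker count are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Section briefs named in the post.
SECTIONS = ["hero", "features", "steps", "testimonials", "pricing", "CTA"]


def generate_section(brief: str) -> str:
    """Hypothetical stand-in for one model call; returns placeholder HTML."""
    return f"<section id='{brief.lower()}'><!-- {brief} content --></section>"


def build_page(sections, max_workers=6):
    """Generate every section concurrently, then stitch them in brief order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, regardless of finish order.
        parts = list(pool.map(generate_section, sections))
    return "\n".join(parts)


page = build_page(SECTIONS)
print(page)
```

In a real setup the stub would be replaced by a request to the local inference server, and the worker count would be kept at or below the number of decode slots the server exposes.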

1.5K

Derek Colley@DerekColley_·18h

Credit report is such a scam! Pay off your debt and your score goes down..!!!

Derek Colley@DerekColley_·18h

AI Builder tip of the day Use /plan !! add multiple observations about a feature, chat with your plan. When you think the Agent understands the assignment, trigger the /build

Derek Colley@DerekColley_·1d

England-sur-mer… or Greece, but without the sea ☀️🏖️

Derek Colley@DerekColley_·1d

Exactly this: 'panem et circenses' Coined by the Roman poet Juvenal in his Satires around the end of the 1st century AD, the phrase was a satirical critique of the Roman population. It described how the ruling class used free grain (bread) and lavish entertainment—like chariot races and gladiator games (circuses)—to pacify the public and distract them from deeper political and civic issues.

Alice Smith@TheAliceSmith·1d

The problem with many British men in one meme, as Orwell foresaw. “Football, beer, and above all, gambling filled up the horizon of their minds. To keep them in control was not difficult.” - 1984

illuminatibot@iluminatibot

140

846

12.2K

Derek Colley@DerekColley_·1d

@ChadNauseam just use screen
# creates a session
screen
# to detach
ctrl-a d
# go to the loo...
# to reattach
screen -r

109