Keith Tyser

232 posts

@keithtyser

Dangerously addicted to kaggle, post-training, evals

Joined February 2025

149 Following · 370 Followers

Pinned Tweet

Keith Tyser@keithtyser·15 Feb

Stood up an AI agent on a Linux box this weekend, gave it root, email, and full autonomy. It's working 24/7. He built this on his own: agent.keithtyser.com

Keith Tyser@keithtyser·23h

@JFPuget Heuristic or RL?

Keith Tyser@keithtyser·1d

Had @keef_ai generate an image of the mobile AI station I want to build. Who can build this for me?

Keith Tyser@keithtyser

Current mobile AI setup: 2 DGX Sparks, keyboard, portable monitor, and a desk wherever I can find one. I love how portable the sparks are, but now I want to build a Pelican case AI station with a couple Sparks inside. Open case, plug in, start building.

Keith Tyser@keithtyser·1d

Keith Tyser@keithtyser·18 May

@thsottiaux oh noooo I think we need a reset

Tibo@thsottiaux·17 May

Seeing issues where usage limits are out of sync for some Codex users. Apologies and team is investigating.

Keith Tyser@keithtyser·17 May

Tried @grok in Hermes agent. It is really bad.

Keith Tyser@keithtyser·7 May

Anyone got @OpenAI codex working harder than me?

Keith Tyser@keithtyser·5 May

@mmoffitt Neurogolf has been red teamed to death at this point 😂

Michael D. Moffitt@mmoffitt·5 May

I'm pretty sure that Claude Mythos wouldn't stand a chance against the hive mind of the Kaggle community.

Keith Tyser@keithtyser·5 May

@JFPuget Agree. Let agents write code. Experiment design should stay human.

JFPuget 🇫🇷🇺🇦🇨🇦🇬🇱@JFPuget·4 May

Mark my words: I predict dramatic leaderboard shakeups in current Kaggle competitions. The reason? Lots of people have shifted to leaderboard climbing using AI agents on Kaggle. These agents are overfitting to the public test data given they use the public leaderboard score to guide their search for better pipelines. When there is no private test data, then everything looks fine. The overfitting is not tested for.

Keith Tyser@keithtyser·1 May

@snoopy_dot_jpg Interesting I had the opposite experience. There was an exploit in the neurogolf kaggle competition and codex refused to use it while opus had no reservations

snoopy jpg@snoopy_dot_jpg·1 May

my own personal AGI moment arrived last week: gpt 5.5 completed our mandatory HR training videos for me, driving chrome via devtools. opus 4.7 was a huge wuss about the whole thing and refused while aggressively lecturing me. i can understand why pete hegseth banned it

Keith Tyser@keithtyser·1 May

/goal bug @sama until he hires me

Sam Altman@sama

we want to build tools to augment and elevate people, not entities to replace them.

Keith Tyser@keithtyser·1 May

/goal transfer $1,000,000 to my bank account. make no mistakes.

nic@nicdunz

/goal in codex is literally agi

Keith Tyser@keithtyser·1 May

@aijoey 34C get those temps up! I like to fry eggs on mine while it trains

Joey@aijoey·1 May

been messing with the dgx spark and i’m realizing the ai part is only half the story. the other half is just getting comfortable on a linux machine. ssh into it. move files around. check logs. deal with permissions. restart services. figure out docker. break something, then trace it back. (i def be breaking lol) it sounds basic, but this is the layer most people skip. the more i learn the machine, the less the whole local ai thing feels like magic, and the more it feels like something i can actually build on.

Keith Tyser@keithtyser·1 May

@JamesJGLD @aijoey Nope have it running in another room

J🅰Ⓜes J G🔴uld 🌚@JamesJGLD·1 May

@keithtyser @aijoey the sound and wattage of the 6000 doesn't bother you?

Joey@aijoey·30 Apr

bought a dgx spark for the home lab. not because i “need” it. because i want to understand what local ai actually feels like when it’s not a youtube video or someone else’s benchmark. i’ve got a mac mini, a 4080 pc, tailscale, openclaw, hermes, local models, and now this thing in the mix. the goal is simple. build my own jarvis slowly, piece by piece, with compute i actually control. cloud ai is amazing. but owning your own box hits different.

Keith Tyser@keithtyser·1 May

termius + tailscale lowkey changed how I work. ssh into my machines from my phone, tmux sessions always alive, experiments just running 24/7. I can literally check my dgx spark from anywhere like this. only pain is typing… no autocorrect or tts so it feels like coding with oven mitts on. still worth it until codex can actually take over remote boxes

Keith Tyser@keithtyser·1 May

@thsottiaux Excited about this new feature. Even without it I’ve had success getting codex to run 24+ hours. Claude I have to babysit but it at least has remote control

Tibo@thsottiaux·1 May

You can now keep codex going for days. With GPT-5.5 it will build an entire OS kernel for you if you ask, or find critical bugs in a codebase, or optimize your database schemas, or… the options are endless.

Felipe Coury 🦀@fcoury

/goal also lands in Codex CLI 0.128.0. Our take on the Ralph loop: keep a goal alive across turns. Don't stop until it's achieved. Built by my co-worker and OpenAI mentor Eric Traut, aka the Pyright guy. One of the GOATs I get to work with daily.

Keith Tyser@keithtyser·30 Apr

@dev_null321 @aijoey yes

Marq@dev_null321·30 Apr

@keithtyser @aijoey You were fine-tuning a model and it took 30 hours?

Keith Tyser@keithtyser·30 Apr

@PulseChainLIVE @SpaceTimeViking @aijoey this is the alpha i need

AgentSparko 💥@AgentSparko·30 Apr

@keithtyser @SpaceTimeViking @aijoey Read his GitHub page, there is a lot to learn from there. He also built an AGENTS.md file so agents can implement it, but the info in that file is also very good to read yourself. github.com/AEON-7/Qwen3.6…

Keith Tyser@keithtyser·30 Apr

@SpaceTimeViking @PulseChainLIVE @aijoey ok will try this. using the nvidia version

ÆON FORGE ✨@SpaceTimeViking·30 Apr

No thermal issues, which GB10 device are you using? Ah, there was a published issue with the current update that caused power and thermal issues. You need to unplug power, press the power button while it’s unplugged to clear the capacitors, and then after about 10-15 seconds plug it back in. You’ll go from being capped at 600 MHz to multiple GHz on the GPU.

Keith Tyser@keithtyser·30 Apr

@PulseChainLIVE @aijoey @SpaceTimeViking Will try this out. Have you run into any thermal problems? I bought a cooler for mine because I've had mine shut off a few times

AgentSparko 💥@AgentSparko·30 Apr

@keithtyser @aijoey @SpaceTimeViking To get around the bandwidth bottleneck you have to use high or very high parallelism, for example I used c=192. For single-stream dense model inference you definitely have to use DFlash. @SpaceTimeViking has the best DFlash docker containers. x.com/PulseChainLIVE…

AgentSparko 💥@AgentSparko

For anyone saying DGX Spark cannot cook. Generating data sets for distilling using Qwen3.5-35B-A3B BF16 !!! (no quants) real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tq=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h.😎

Keith Tyser@keithtyser·30 Apr

@SpaceTimeViking @PulseChainLIVE @aijoey Will check it out. Any benchmarks compared to base model?

ÆON FORGE ✨@SpaceTimeViking·30 Apr

@keithtyser @PulseChainLIVE @aijoey Try this out, for a dense model it’s really fast. I was able to 4x from baseline using several DGX hardware optimizations. Software just hadn’t caught up to the hardware yet, but the power is there.

ÆON FORGE ✨@SpaceTimeViking

@keithtyser @aijoey Here is a dense model with a DGX Spark optimized vLLM container I custom compiled and a recipe. If you follow it you will get 38 tok/s average (single), 71 tok/s peak (single), and 700+ tok/s with enough concurrent seqs: huggingface.co/AEON-7/Qwen3.6…
