Koen

3.9K posts

@koenvaneijk

ai augmented senior swe - ex automagica (acquired '20)

Austin, TX · Joined June 2009
124 Following · 8.9K Followers
Koen@koenvaneijk·
@LottoLabs have you tried the claude distillation? i haven't yet but it's supposed to be good
Lotto@LottoLabs·
Qwen 3.5 27b never degrades, never stops running, never has token limits, never refuses, never logs my prompts, never trains on my data, never sells my data, never runs up my credit card
Koen@koenvaneijk·
I think the next evolution of *claws and other agents is that they're integrated in the AI apps and there will be a decoupling of user initiative and agentic action. There will be a heartbeat at which appropriate context is built and action is taken without user initiative.
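
(For concreteness, a minimal sketch of such a heartbeat-driven loop; the interval and the build_context/agent_step helpers below are hypothetical placeholders for illustration, not any particular product's implementation.)

```python
import time

HEARTBEAT_SECONDS = 300  # hypothetical interval between autonomous ticks

def build_context():
    # Gather whatever the agent should see this tick: inbox, calendar,
    # repo state, notes from earlier runs. Placeholder for illustration.
    return {"now": time.time(), "events": []}

def agent_step(context):
    # Decide whether anything is worth acting on without user initiative.
    if not context["events"]:
        return None  # nothing to do this tick
    return f"acting on {len(context['events'])} events"

def heartbeat_loop():
    while True:
        action = agent_step(build_context())
        if action:
            print(action)
        time.sleep(HEARTBEAT_SECONDS)

if __name__ == "__main__":
    heartbeat_loop()
```
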
Koen@koenvaneijk·
localcode now available on gh, a coding agent that does not suck, in a single .py file
[GIF attached]
Koen@koenvaneijk·
In my opinion, Hermes tries to do too many things. It could be better if it followed the UNIX philosophy and did one thing great. I installed it and it pulls in too many dependencies out of the box; I would never consider using this in a real production scenario, while I'm convinced the core agentic loop could be very lightweight. I'd rather introduce dependencies based on the usage scenario, but maybe Hermes is not for me. Appreciate all your efforts!
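
(The "lightweight core agentic loop" point can be made concrete with a stdlib-only sketch against an OpenAI-compatible chat endpoint; the URL, model name, single shell tool, and plain-text SHELL:/DONE: protocol are assumptions for illustration, not Hermes internals.)

```python
import json
import subprocess
import urllib.request

API_URL = "http://localhost:8000/v1/chat/completions"  # assumed local OpenAI-compatible server
MODEL = "qwen-27b"                                      # assumed model name

def run_shell(command: str) -> str:
    # The one example tool; a real agent would gate this behind approval.
    result = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=60)
    return (result.stdout + result.stderr)[-4000:]  # crude output compaction

def chat(messages):
    body = json.dumps({"model": MODEL, "messages": messages}).encode()
    req = urllib.request.Request(API_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def agent_loop(task: str, max_turns: int = 10) -> str:
    messages = [
        {"role": "system", "content": "Reply with SHELL: <command> to run a command, or DONE: <answer> when finished."},
        {"role": "user", "content": task},
    ]
    for _ in range(max_turns):
        reply = chat(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("DONE:"):
            return reply[len("DONE:"):].strip()
        if reply.startswith("SHELL:"):
            output = run_shell(reply[len("SHELL:"):].strip())
            messages.append({"role": "user", "content": f"OUTPUT:\n{output}"})
    return "max turns reached"
```
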
Sudo su@sudoingX·
@NousResearch try it right now. if you're already running it, show your setup and tell me your experience. pros, cons, all welcome.
[image attached]
Sudo su@sudoingX·
been getting this question in DMs and comments so let me be clear. hermes agent is built by @NousResearch. when a research lab builds an agent framework the whole point is to get the full capability out of the model. 11 parsers, 30+ tools, skills system, all designed to extract everything the model can give. for the best experience use it with frontier models through nous research. the difference between hermes + frontier and any other framework is not even close. i've used it with opus 4.6 and it was on another level. it also runs beautifully on local models. i run on my gpus daily. but if you want the full power, pair it with the best model you can access. if you're still on openclaw it's time. i will personally help anyone migrating. you deserve better tools.
mika@bladgolem

@sudoingX what do you think about using codex on hermes?

Sudo su@sudoingX·
i just became a mod of x/LocalLLaMA. if you're running local models on your own hardware and want in, the community is open. pinned and highlighted on my profile. approving members starting today. drop your setup below and i'll get you in. 3060, 3090, 4090, 5090, AMD, whatever you're running. all welcome. if you're hitting issues with hermes agent, llama.cpp, model selection, configs, i'm here. let's make local AI accessible for everyone.
[image attached]
Sudo su@sudoingX

let me get you started in local AI and bring you to the edge. if you have a GPU or thinking about diving into the local LLM rabbit hole, first thing you do before any setup is join x/LocalLLaMA. this is the community that will help you at every step. post your issue and we will direct you, debug with you, and save you hours of work.

once you're in, follow these three:

@TheAhmadOsman the oracle. this is where you consume the latest edges in infrastructure and AI. if something dropped you hear it from him first. his content alone will keep you ahead of most.

@0xsero one man army when it comes to model compression, novel quantization research, new tools and tricks that make your local setup better. you will learn, experiment, and discover things you didn't know existed.

@Teknium maker of Hermes Agent, the agent i use every day from @NousResearch. from Teknium you don't just stay at the frontier, you get your hands on the tools before everyone else. this is where things are headed.

if you follow me follow these three and join the community. you will be ahead of most people in this space. if you run into wrong configs, stuck debugging hardware, or can't get a model to load, post there so we can help.

get started with local AI now. not only understand the stack but own your cognition. don't pay openai fees on top of giving them your prompts, your research, and your most valuable thinking to be monitored and metered. buy a GPU and build your own token factory.

Koen@koenvaneijk·
@TwatterLester @sudoingX It replaced Opus 4.6 for me personally; it feels at the same level in terms of agency, but it needs context. It's definitely much worse for non-English. Just try the APIs or Runpod if you want to be sure before you buy hardware.
Lester@TwatterLester·
@koenvaneijk @sudoingX How good is that model? Is it worth the cost of a 5090? I'm thinking of the same setup, but worried the model isn't that good at coding.
Koen@koenvaneijk·
@sudoingX I've spent over 3k on opus 4.5. Yet now I use Qwen 3.5 27B on a 3k RTX 5090 and honestly I prefer it over Opus 4.6
Sudo su@sudoingX·
opus is not for everyone
Koen@koenvaneijk·
@juliafedorin Please, please, just @tool wrappers over Python functions! You don't need more. Actually, just a simple if elif else will do.
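
(A sketch of what "@tool wrappers over Python functions" can boil down to; the registry and the example tools are made up for illustration.)

```python
import os

TOOLS = {}

def tool(fn):
    # Register a plain Python function as a callable tool by name.
    TOOLS[fn.__name__] = fn
    return fn

@tool
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

@tool
def list_dir(path: str = ".") -> str:
    return "\n".join(os.listdir(path))

def dispatch(name: str, **kwargs) -> str:
    # The decorator is just sugar over the "simple if elif else":
    # if name == "read_file": ... elif name == "list_dir": ...
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**kwargs)

if __name__ == "__main__":
    print(dispatch("list_dir", path="."))
```
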
Julia Fedorin@juliafedorin·
come by Marina Green RIGHT NOW and tell us your opinion!
[image attached]
Koen@koenvaneijk·
@BubCasto @LottoLabs Yes, my setup allows for 90k context; I never get close to that with localcode as it auto-prunes tool calls and compacts shell output etc.
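
(I don't know localcode's internals, but the auto-pruning and compaction described could be as simple as the sketch below; the message format and thresholds are assumptions.)

```python
MAX_TOOL_CHARS = 2000       # assumed cap on a freshly returned tool output
KEEP_RECENT_TOOL_MSGS = 4   # assumed number of recent tool results kept in full

def compact(text: str, limit: int = MAX_TOOL_CHARS) -> str:
    # Keep the head and tail of long shell output, drop the middle.
    if len(text) <= limit:
        return text
    half = limit // 2
    return text[:half] + "\n...[truncated]...\n" + text[-half:]

def prune_history(messages: list[dict]) -> list[dict]:
    # Shrink older tool results aggressively; leave the most recent ones intact.
    tool_indices = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
    for i in tool_indices[:-KEEP_RECENT_TOOL_MSGS]:
        messages[i]["content"] = compact(messages[i]["content"], limit=200)
    return messages
```
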
Bub@BubCasto·
@koenvaneijk @LottoLabs So the trade off is context, got a 5090 also so just trying to wrap my head around all this, I’ll give q6 (and maybe q8) a shot!
[image attached]
Lotto@LottoLabs·
Okay tonight we do vLLM vs LM Studio (llama.cpp) checks w/ qwen 27b
Koen@koenvaneijk·
@brah_ddah @LottoLabs Yes, this maxes the VRAM. My cofounder wrote a script that grid-searched the max Qwen 3.5 we could run on an RTX 5090, and 27B Q6 from Unsloth hits the spot with 90k context.
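
(The grid-search idea generalizes; a hedged sketch of the shape of such a script is below. The quant/context candidates and the load_check.sh helper are placeholders to swap for your own runtime launch command.)

```python
import itertools
import subprocess

QUANTS = ["Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]   # candidate GGUF quants, smallest to largest
CONTEXTS = [30_000, 60_000, 90_000, 120_000]    # candidate context windows

def try_load(quant: str, ctx: int) -> bool:
    # Placeholder: launch your runtime with this quant/context and treat a
    # clean start as success. load_check.sh is a hypothetical helper script.
    try:
        return subprocess.run(["./load_check.sh", quant, str(ctx)], timeout=600).returncode == 0
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False

def grid_search():
    fits = [(q, c) for q, c in itertools.product(QUANTS, CONTEXTS) if try_load(q, c)]
    # Prefer the largest quant that fits, then the longest context at that quant.
    return max(fits, key=lambda qc: (QUANTS.index(qc[0]), qc[1])) if fits else None

if __name__ == "__main__":
    print("best combo that fit in VRAM:", grid_search())
```
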
Koen@koenvaneijk·
@bearlyai token machine man says you need more token machines
Bearly AI@bearlyai·
Jensen says he will be upset if he finds out his $500k engineer is *not* using at least $250k in tokens
Koen@koenvaneijk·
qwen 27b is at least opus 4 but at home
Koen@koenvaneijk·
qwen 27b on an rtx5090 with localcode is a beast and you can't convince me otherwise
> 97% cache hits
> 60 tokens/s
this is it. the 100k/pa SWE in a 5k box.