Everlier

8.7K posts

@Everlier

Building LLM agents & tools https://t.co/ZAIKtgvw8D - Harbor / Facts / Mi @tryjitera

Joined April 2010

465 Following · 1.3K Followers

Pinned Tweet

Everlier@Everlier·20 Mar

You don't even need Kimi 2.5 for a decent local LLM setup.
- llama.cpp
- Unsloth's Qwen 3.5 35B A3B with UD Q4 K XL quants
- OpenCode
- av/harbor
It'll take a while to download/install, but otherwise it's something that mid-range hardware (>32GB RAM, ~8GB VRAM) can run today.

Fynn@fynnso

was messing with the OpenAI base URL in Cursor and caught this: accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast. so composer 2 is just Kimi K2.5 with RL. at least rename the model ID

Everlier@Everlier·45m

@BlockedPaths @loktar00 I think the trend recently is that if there's too much of it but none at all at the same time - that's fake

BlockedPath@BlockedPaths·46m

@Everlier @loktar00 Half the time you can't tell what's real online anymore, I'm constantly thinking something is fake. hahaha

Everlier@Everlier·1h

Building a tool to help catch stale skills

Everlier@Everlier·46m

@SHELLEYBLEND Yep, give your clanker a breather :)

Massively Parallel Procrastinator@SHELLEYBLEND·56m

Maybe it's time for late spring cleaning!

Everlier@Everlier

Do you know which agent skills are useless in your setup? The ones you installed just in case and have probably forgotten by now?

Everlier@Everlier·47m

@BlockedPaths @loktar00 The whole ecosystem has been too much for years already. I have a feeling some of my childhood memories are being replaced with knowledge of some of the AI projects

BlockedPath@BlockedPaths·1h

@loktar00 @Everlier Skills, extensions, plugins, prompts, it's getting to be too much 😩

Everlier@Everlier·48m

@Art_If_Ficial I only hear people installing skills, never about people cleaning them up :)

Artificially Inclined™@Art_If_Ficial·1h

This would save a lot of tokens.

Everlier@Everlier

Building a tool to help catch stale skills

Everlier@Everlier·48m

@morganlinton It's just going to create a mini universe to harvest some energy from it to proceed with this problem :)

Morgan@morganlinton·1h

Uh oh. Grok Build just asked if I'm okay with it building this over the course of multiple years, or if I would like it to finish the build in 6 - 9 months 😳

Everlier@Everlier·1h

@badlogicgames "I Thought the Future Would Be Cooler", amazing song by YACHT

Mario Zechner@badlogicgames·1h

> The reason is simple: everybody is being inundated by the slop machine. the future is glorious. the future also sucks. turso.tech/blog/the-wonde…

Everlier@Everlier·1h

@Art_If_Ficial @LLMJunky @poetengineer__ Likewise mate, I'm always on the hunt for good novel orchestration solutions

Artificially Inclined™@Art_If_Ficial·1h

@Everlier @LLMJunky @poetengineer__ Awesome, you're already tapped in. I'll keep an eye out for what you're working on

am.will@LLMJunky·3h

The next BIG Grok model is (partly) done training, and appears to be quite strong. Elon in the past has not sugarcoated Grok models not being the strongest when it comes to coding. This change in tone bodes well, and Cursor's post training will only make it better. I'll be excited to see a new entrant into the competition.

Elon Musk@elonmusk

@beffjezos Our recently completed Grok V9 1.5T run is looking great and that is before Cursor data is added in supplemental training

Everlier@Everlier·1h

@loktar00 I'll quote you on the launch if you don't mind :)

Loktar 🇺🇸@loktar00·1h

@Everlier lol love it!

Everlier@Everlier·1h

@Art_If_Ficial @LLMJunky @poetengineer__ I do enjoy reading her posts and the experiments too! The visuals for the isometric series inspired some of my upcoming projects

Artificially Inclined™@Art_If_Ficial·1h

@Everlier @LLMJunky 100%. That's why I naturally drift towards the dreamers like @poetengineer__. These types of less-traditional UI/UX explorers see a task and don't simply 'build it': they imbue it with a fragment of their soul and make it beautiful as well as highly functional

Everlier@Everlier·1h

@loktar00 That sounds really cool! I'll appropriate "your agent got a skill issue" from your idea :D

Loktar 🇺🇸@loktar00·1h

@Everlier Missed opportunity to call it Skill Issue

Everlier@Everlier·1h

@Art_If_Ficial @LLMJunky I think many are clinging to the previous anchors of UI design and information architecture, whereas they are just not true anymore. There's also a massive gap in what kind of software is reachable

Artificially Inclined™@Art_If_Ficial·1h

@Everlier @LLMJunky You're right - now it's about finding the perfect combo of papers, concepts, abstract flotsam, tiny agents, giant agents... and seeing what works best. There's never been a better time to run everything through an infinite amount of matrices in a short time

Everlier@Everlier·1h

@bullpaid I'll pause during this weekend :)

Bullpaid@bullpaid·1h

@Everlier You are going too fast, can't catch up on everything...

Everlier@Everlier·1h

@tekbog You can just make things up

terminally onλine εngineer@tekbog·2h

imagine being in the exec team and every day you get to hear new schemes

terminally onλine εngineer@tekbog·2h

i love this scheming twink

Everlier@Everlier·1h

@bullpaid All the markdowning is getting out of hand

Bullpaid@bullpaid·1h

@Everlier Legend

Everlier@Everlier·1h

@PavelSnajdr @ItsAlexhere0 pg is like a Volkswagen Passat B6. There are folks over here that'll try to convince you that sqlite is a great choice :)

Pavel Snajdr@PavelSnajdr·1h

@Everlier @ItsAlexhere0 i'd flip the table and ragequit if they told me they run pg :D unserious people

𝘼𝙡𝙚𝙭@ItsAlexhere0·2h

Senior backend interview question: Database CPU hits 95% every single day at 2:43AM. No backups. No scheduled reports. No heavy queries deployed. Where do you start debugging?

Everlier@Everlier·1h

@Art_If_Ficial @LLMJunky Best of luck! Orchestration is far from solved, lots of people try but no one seems to be able to crack it so far

Artificially Inclined™@Art_If_Ficial·1h

@Everlier @LLMJunky I see... It's crazy how they can get out of phase. I'm dusting off my Garmr project now that all of the SOTAs have CLI/TUI. The orchestration is designed to maintain coherent order. When finished, I'll drop it as my first official os project.

Everlier@Everlier·1h

@bettercallsalva @subquadratic Nah, it was just slopsy

Thiago Salvador@bettercallsalva·1h

@Everlier @subquadratic the 7-point jump from 80.9 to 87.6 on SWE Bench Verified is unusual. either Opus 4.7 closed a real gap or the benchmark is leaking signal at this percentile. independent review of 100 more should settle it. single-benchmark jumps tend to be noisy past a certain percentile.

Everlier@Everlier·1d

@subquadratic published independent testing results. For context, Opus 4.6 is 80.8% on SWE Bench Verified, Opus 4.5 is 80.9%, Opus 4.7 is 87.6%. They are saying they partnered for independent review of 100 more benchmarks. I'm still on the fence about sparse attention, especially a masked one like here, but these numbers and how open they are about independent verification give me some confidence.