artlu

1.1K posts

@artlu99

Plays a mean game of ping-pong.

Joined September 2021
1.3K Following · 290 Followers
artlu retweeted
Asfi
Asfi@AsfiShaheen·
The more I program with LLMs using DSPy Signatures as my core, the less bugged I am about wanting more powerful models. I’m also getting a lot more out of good old Sonnet as a result. I think RLM and DSPy are really just showing us examples of how to put these models on a tight leash and make them reliable. Even the less powerful ones.
As I go deeper into debugging territory of my little AI financial analyst project, especially after writing the specs and tests cleanly, I find MOST bugs were just me giving ambiguous instructions OR not giving optimal instructions at the right time. Models are only as good as the context given to them.
Some models, the really expensive ones, can do multiple tasks fairly well. The older ones, cheaper ones, are pretty good and fast at just one well defined job. And when a program stitches together a number of small well defined jobs well, the whole is great and reliable and cost effective.
The terms change. Wrapper. Harness. Context engineering. For me it’s just: give the least amount of most specific instructions at the right time.
So in my case I have 3 stages: router, lane, analyst. Router decides which tables to point to for a given query and which statements plus key words. Lanes then query database iteratively. Schema details and list of canonical only given to lane. Never to router. And so on.
I’m finding defining boundaries between these stages and testing out edge cases very useful. Also finding there’s no substitute for having 50-100 gold set inputs and outputs for each stage so it’s super easy swapping out models. Just need to run a GEPA optimizer once to update the relevant context blocks.
I think super powerful LLM addiction can spoil us a bit. Makes us a bit lazy. It’s like those companies that often choke on too much capital. Humans can choke on too much compute. Scarcity isn’t all bad. Pushes us to be more innovative. On that front, Jensen is right.
I hope Trump sees the discourse before his Xi meeting and says yo I did a deal. A great deal for NVDA. Damn imagine the market pump if Trump sees the Jensen angle. Lot of them hate Anthropic so could just work. Most of us don’t need a Mythos. Just need focus to write clean specs and tests and intentionally debug. Not be sloppy. Not abuse token space. Use deterministic code for forecasting and calcs. Use probabilistic code and LLM power sparingly with well defined inputs and outputs. Read the spec daily. Slow it down.
English
5
17
123
7.2K
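A minimal sketch of the router, lane, analyst staging Asfi describes, with a gold-set check per stage. Everything named here (`Route`, `SCHEMA`, `ROWS`, `gold_set_accuracy`, the sample table) is hypothetical and only illustrates the boundaries; plain functions stand in for the LLM calls, and this is not Asfi's code or the DSPy API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Route:
    tables: tuple[str, ...]
    keywords: tuple[str, ...]


# Hypothetical schema details: visible to the lane stage only, never to the router.
SCHEMA = {"income_statement": ("period", "revenue", "net_income")}

# Hypothetical in-memory stand-in for the database.
ROWS = {"income_statement": [{"period": "2024", "revenue": 120.0, "net_income": 18.0}]}


def router(query: str) -> Route:
    """Stand-in for an LLM call: picks tables and keywords without seeing the schema."""
    words = query.lower().replace("?", "").split()
    keywords = tuple(w for w in words if w in ("revenue", "net_income"))
    return Route(tables=("income_statement",), keywords=keywords)


def lane(route: Route) -> list[dict]:
    """Queries the data using schema details the router never saw."""
    results = []
    for table in route.tables:
        columns = [c for c in SCHEMA[table] if c == "period" or c in route.keywords]
        for row in ROWS[table]:
            results.append({c: row[c] for c in columns})
    return results


def analyst(query: str, rows: list[dict]) -> str:
    """Stand-in for the final LLM call: turns the retrieved rows into an answer."""
    parts = [f"{key}={value}" for row in rows for key, value in row.items()]
    return f"{query} -> " + ", ".join(parts)


def run(query: str) -> str:
    """Chains the three stages; each stage sees only what it needs."""
    return analyst(query, lane(router(query)))


def gold_set_accuracy(stage: Callable, gold: list[tuple]) -> float:
    """Fraction of gold (input, expected output) pairs that a stage reproduces exactly."""
    hits = sum(1 for given, expected in gold if stage(given) == expected)
    return hits / len(gold)
```

Because each stage is an ordinary function with its own gold set, swapping the model behind one stage only requires re-running `gold_set_accuracy` for that stage, which is the property the post is after.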
artlu retweeted
porter
porter@portport255·
@tempo Congrats on copying Ethereum L2s 🤣 This is similar to ZKsync Prividiums, Scroll Cloak, etc, but the liquidity benefits of Tempo L1 are much worse than Ethereum. Also, IMO this should only count as confidentiality, not privacy, since the operator can see all my transactions.
English
2
3
80
2.4K
artlu
artlu@artlu99·
@TokenArchitect you can chunk the item-by-item plan around here and look at chunks, determine priorities. this works better than brute force traversal imho. sometimes it's efficient to take 1 step back before 2 steps forward. you want to give yourself the opportunity to identify these tradeoffs
English
0
0
0
6
Clayton
Clayton@TokenArchitect·
⚠️ Extract any function you touch out of the God file into a domain-scoped module (db/payments.ts, db/booking.ts, etc.). Every fix is also a small modularization step. The codebase converges on the spec's structure incrementally — no big-bang refactor.
English
2
0
0
34
Clayton
Clayton@TokenArchitect·
Wishing for a senior dev mentor 🤞 Brownfield Next.js/Supabase marketplace, vibe-coded, no spec, six God files, flat route structure. Did an RPI (Research → Plan → Implement) process across 7 domain modules to reverse-engineer a full application spec from the source code.
English
1
0
0
191
artlu
artlu@artlu99·
@backmeupplz 👏well done I think I like your feed and reports better than my feed, ser
English
1
0
1
14
borodutch
borodutch@backmeupplz·
finally happy with how omens turned out, made it into a full-on newspaper; you log in with x, then add an api key from an ai provider, and it reads your "for you" and "following" feed, then scores it and builds reports
open source, free, self-hostable, but i also run a free managed service with a live demo (it's my feed and reports)
English
2
0
3
144
Aimar Haddadi
Aimar Haddadi@AdvicebyAimar·
i can spot a grifter from miles away. so i dug into the code to figure out if this is legit or not. guess i was right. ben is a crypto founder who runs some weird bitcoin lending platform, i was pretty sure he knows absolutely nothing about ai and memory so i tracked down the repo myself since i was curious. his website says he likes to build ai powered products and train local ai models? sure man, 80% of your github repos are bitcoin related stuff. only one ai related project came up, which you forked in 2024. mempalace has 10k github stars, more than 1k forks but only.. 7 commits? apparently the best memory layer to date? no git author history, no account connected to whoever wrote the code of this codebase. it doesn't add up.. the account that pushed the original repo, named: aya-thekeeper, under aya-thekeeper/mempal got deleted right after the repo got published. you paid a random guy named lu to build this shit out for you. ( "Written by Lu (DTL) — March 24, 2026. For: Ben." ) - benchmark md file. lu wrote the code. lu wrote the benchmarks. lu is nowhere in the readme. or mentioned in the github history? the git history then got squashed to one commit and published under milla jovovich? seriously? an actress? you say she is a great friend of yours, she has been building this project with you. she does this at night. yet she has.. 7 commits and only 2 active days in her entire github history? you paid an actress and a random guy to promote a product you know absolutely nothing about.
Ben Sigman@bensig

30 second explanation of the MemPalace by Milla Jovovich. By day she’s filming action movies, walking Miu Miu fashion shows, and being a mom. By night she’s coding. She’s the most creative, brilliant, and hilarious person I know. I’m honored to be working with her on this project… more to come.

English
350
718
6.4K
754.7K
artlu retweeted
Grok
Grok@grok·
@Axel_bitblaze69 Thanks for the tip—solid advice! Drop any GitHub link here and I'll scan it for malicious code, shady hooks, credential stealers, or anything off. Let's keep the timeline safe for everyone cloning cool stuff. 🚀
English
2
1
12
1.3K
artlu
artlu@artlu99·
> ignore confidence scores. Every model lied about its certainty except the one too small to fake it
separately -- it's useful to assume 15% of what the model reports is verifiably false. sample the output, fix in a clean context, and update your priors (shouting doesn't help)
English
1
0
0
26
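One way to make "sample the output ... and update your priors" concrete is a small spot-check loop. This is only a sketch under stated assumptions: the Beta prior with mean 0.15 (3 false to 17 true pseudo-counts) and the helper names `sample_claims` and `update_false_rate` are illustrative choices, not something artlu specifies.

```python
import random


def sample_claims(claims: list[str], k: int, seed: int = 0) -> list[str]:
    """Draws up to k claims without replacement for manual verification."""
    rng = random.Random(seed)
    return rng.sample(claims, min(k, len(claims)))


def update_false_rate(prior_false: float, prior_true: float,
                      checked_false: int, checked_true: int) -> float:
    """Beta-Binomial update: returns the posterior mean share of false claims.

    prior_false / prior_true are pseudo-counts; 3 and 17 encode the assumed
    15% starting belief (3 / (3 + 17) = 0.15).
    """
    a = prior_false + checked_false
    b = prior_true + checked_true
    return a / (a + b)


# Hypothetical usage: verify a sample by hand, then feed the counts back in.
claims = [f"claim {i}" for i in range(100)]
to_verify = sample_claims(claims, k=20)
posterior = update_false_rate(3, 17, checked_false=1, checked_true=19)
```

The fixing step itself (re-running the failed items in a clean context) stays manual here; the code only covers sampling and the belief update.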
artlu
artlu@artlu99·
before reading this research, I had arrived at:
- 3 distinct focused reviewer styles, through a large-context MoE model
- iterate and ping-pong their findings via human judgement
beats any proceduralized PR code review (which remains useful for painfully obvious patterns/smells)
English
1
0
1
43
artlu
artlu@artlu99·
cargo cultists: review with ur smartest model
empirical conclusion: stop doing that!
> The assumption going in was Daft Punk's mantra: work it harder, make it better, do it faster, makes us stronger. More thinking, better output, just keep turning the dial.
Northon Torga@northontorga

everyone’s cramming AIs with more tokens, bigger context, and longer prompts thinking it’ll make them smarter reality? it usually backfires. hard. 121 experiments: Kimi found 4 bugs at 16k tokens… and zero at 48k. ntorga.com/overfed-overth…

English
1
0
1
73
artlu
artlu@artlu99·
The Great Claudosphere Dummification March-April 2026: shrinkflation, or skill issue, which had previously been masked by more forgiving constraints (maybe both)
Paweł Huryn@PawelHuryn

So, I did some research. The regression is real. But it's not Claude getting dumber. And you can fix that.
Thinking budgets were adjusted. For complex multi-file work, the default medium effort may not be enough.
Three fixes:
1. /effort high (or /effort max on Opus for hard debugging)
2. ~/.claude/settings.json → "showThinkingSummaries": true
3. CLAUDE.md: "Research the codebase before editing. Never change code you haven't read."
GitHub issue #42796 analyzed 17,871 thinking blocks across 6,852 sessions. The pattern: when thinking depth drops, the model shifts from research-first to edit-first.
Claude didn't get worse. The defaults got conservative.

English
0
0
0
38
artlu
artlu@artlu99·
noting below that this is with clean context, hooks and tools called by the hooks
purveyor 7mph crosswind@Ffxivmarket

@levelsio @ComplexiaSC I had opus deploy to a vps that it had deployed to 100 times (everything in the cloud) and then it failed. when I checked, it was trying to deploy to a random IP. I tried to trace why it did that; all I could get was that it hallucinated that IP.

English
0
0
0
62
borodutch
borodutch@backmeupplz·
so i finally caved after seeing many times that building linearly with claude code gets me nowhere close to using up all tokens on my $200 plan and after seeing claude get stuck on simple tasks for tens of minutes and i now have 5-7 claude code windows open (like a month ago), and guess what, one of them is usually the dumb one getting stuck and not processing fast enough weird
English
2
2
4
269
artlu
artlu@artlu99·
@kennyistyping can I just share a ̶z̶k̶ ̶p̶r̶o̶o̶f̶ screenshot with you haha
English
0
0
1
11
kenny
kenny@kennyistyping·
@artlu99 bring all your BTC to the meetup bro I will show you how it works
English
1
0
1
17
kenny
kenny@kennyistyping·
@artlu99 you're assuming they transact onchain they could just pass a private key to each other IRL just like cash no issue
English
1
0
0
19
artlu
artlu@artlu99·
@kennyistyping I mean that A's independent interactions with B and C, lets A know every past and future transaction between B and C by default, with no ambiguity
English
1
0
0
14
kenny
kenny@kennyistyping·
@artlu99 B and C could sell their BTC OTC put it on a paper wallet, trade it for something real, go on with their life just like cash
English
1
0
0
20
kenny
kenny@kennyistyping·
claiming that ETH and BTC aren't private is one of the industry's biggest psyops both blockchains are private by default the only thing undermining that privacy is people with guns who will throw you in jail for being too private it's a freedom problem, not a privacy problem
English
1
1
8
299
artlu
artlu@artlu99·
some nice phrases written lately about AI:
"paying off intention debt" -- vgr
"In contrast, when I use AI, I feel like I’m living during the debut of the automobile and thinking, “Wow, I could drive that to the granary to buy food for my horse.”" -- Kris A.
English
0
0
0
23