kailash
2.7K posts

kailash @hsaliak
(╯°□°)╯︵ ┻━┻
Joined August 2009
323 Following · 265 Followers

Pinned Tweet
kailash @hsaliak·
Can technological singularity arise in a Boltzmann brain?

kailash @hsaliak·
@antirez Yup! They also ban and then retrospectively update TOS, like Gemini did. Codex is the only plan that has not yet eroded trust.

antirez @antirez·
@hsaliak Yep, especially if certain usages are denied by the TOS for subscriptions, as is happening more and more.

antirez @antirez·
First impressions of DeepSeek v4 pro used via Claude Code. It is great, but not so cheap compared to how many tokens you get with the $200 OpenAI subscription. More or less I burned $1 per hour of intense usage.

kailash @hsaliak·
@antirez Ok, I see where you are coming from, “comparing current offerings” for dev. However, comparing “deploying agents on the backends” may have a different equation, since at that point metered APIs must be compared.

antirez @antirez·
@hsaliak Not practically. There is no DeepSeek sub AFAIK, so if for $200 you get a given amount of GPT 5.5 tokens, you have to understand if you could go for DeepSeek v4 saving a lot of money or not.

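The comparison in this exchange is simple arithmetic: a flat subscription beats metered API pricing only past a certain number of hours of use. A minimal sketch, using the figures mentioned in the thread (a $200/month subscription versus roughly $1 of metered spend per hour of intense use); the numbers are the thread's estimates, not official pricing.

```python
# Back-of-envelope break-even between a flat subscription and metered usage.
# Inputs are the thread's rough figures, not published rates.

def break_even_hours(subscription_per_month: float, metered_per_hour: float) -> float:
    """Hours of use per month at which the flat subscription starts paying off."""
    return subscription_per_month / metered_per_hour

hours = break_even_hours(200.0, 1.0)
print(hours)  # 200.0 -> below ~200 hours/month of intense use, metered is cheaper
```

At $1/hour, the subscription only wins if you burn more than 200 intense hours a month, which is the gist of the "you have to understand if you could go for DeepSeek v4 saving a lot of money" point.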
kailash @hsaliak·
GPT-5.5 seems to be exactly what it was advertised as. A visible step up.

kailash @hsaliak·
@jtregunna Makes sense.. LSP is key. The problem with repo maps without LSP is that they get stale every turn. Another possible idea is a context 'sieve' that sifts in relevant info from a large context for an LLM query, perhaps implemented with an SLM? Many ideas unexplored..

Jeremy Tregunna @jtregunna·
Yeah, it's on my reading list; at work right now, so will check later. Also thinking of a major rewrite of ctrl+code just to slim it down, and to make a context router out of one of the libraries I use in ctrl+code called harnessutils: not just my approach, but essentially a bunch of different approaches you pick and choose from, plus ontologies of types of interactions and a cheap way to classify (TBD), using that ontology to pick which of the approaches makes sense... continuation type? Just take the last prompt + the minimal delta of whatever the user's one-word-ish prompt said... debugging? Focus your retrieval: error logs, recent changes, test failures, LSP symbols in the failing module, etc... it'll be robust.

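The "context router" idea above can be sketched in a few lines: classify the turn into an interaction type, then pick a retrieval strategy per type. This is a minimal illustration only; the tweet leaves the real classifier TBD, so a keyword heuristic stands in for it, and all function names here are made up for the sketch, not from harnessutils or any real library.

```python
# Context router sketch: cheap per-turn classification -> per-type retrieval.
# The classifier is a stand-in heuristic; a real one might be a small model.

DEBUG_HINTS = ("error", "traceback", "failing", "fails", "broken", "crash")

def classify_turn(prompt: str) -> str:
    """Map a user turn onto a crude interaction ontology."""
    p = prompt.lower()
    if any(hint in p for hint in DEBUG_HINTS):
        return "debugging"
    if len(p.split()) <= 3:  # one-word-ish follow-up
        return "continuation"
    return "general"

def build_context(prompt: str, last_prompt: str, logs: list[str]) -> list[str]:
    """Pick the retrieval approach the classified type calls for."""
    kind = classify_turn(prompt)
    if kind == "continuation":
        # just the last prompt plus the user's minimal delta
        return [last_prompt, prompt]
    if kind == "debugging":
        # focused retrieval: error logs, recent changes, failing symbols, ...
        return logs + [prompt]
    return [prompt]

print(classify_turn("tests are failing again"))  # debugging
```

A usage example: `build_context("go on", "refactor the parser", [])` routes through the continuation branch and returns just the prior prompt plus the delta.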
Jeremy Tregunna @jtregunna·
Coding harnesses listen up: Integrate LSPs, you can cover many languages as there are many quality LSPs. Use those LSPs to build indexes and local semantic understanding around things like invariants, contracts, call graphs, etc., and where they're located. When you use the edit/update tool, track what file was changed, what offsets, update those in the indexes. You can probably take it from here, but you'll want some sort of scoring system on the code you index. Then, on the next turn, don't just shit this turn's response into the context window and continue... reset the context window (you know, what some harnesses have a /clear command for), and recreate it based on the user's next turn prompt. Rinse, repeat. Yes it adds latency... yes it improves accuracy and context window management... no almost nobody does it this way because they're bloody moronic. Use your own human brain from time to time at least when working on your harnesses.

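The loop described above (index as edits land, score what you index, then rebuild the context window from scratch each turn instead of appending) can be sketched as follows. The `Index` class and its naive substring relevance are stand-ins invented for this sketch; a real harness would populate the index from an LSP with offsets, call graphs, and a proper scoring system.

```python
# Per-turn context reset sketch: the window is recreated from an index on
# every turn rather than accumulated. Index contents are toy stand-ins for
# LSP-derived data.

from dataclasses import dataclass, field

@dataclass
class Index:
    # symbol -> (file, score); a real index would also track offsets,
    # invariants, contracts, call graphs, etc.
    symbols: dict = field(default_factory=dict)

    def update(self, file: str, changed_symbols: list[str]) -> None:
        """Called from the edit/update tool after each change lands."""
        for sym in changed_symbols:
            score = self.symbols.get(sym, (file, 0))[1] + 1
            self.symbols[sym] = (file, score)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        """Naive relevance: symbols mentioned in the query, ranked by score."""
        hits = [(s, f, sc) for s, (f, sc) in self.symbols.items() if s in query]
        hits.sort(key=lambda h: -h[2])
        return [f"{f}:{s}" for s, f, sc in hits[:k]]

def next_turn_context(index: Index, user_prompt: str) -> list[str]:
    # Reset: build the window fresh from the index + the new prompt,
    # instead of appending this turn's response to the old window.
    return index.retrieve(user_prompt) + [user_prompt]

idx = Index()
idx.update("parser.py", ["parse_expr"])
print(next_turn_context(idx, "why does parse_expr choke on unary minus?"))
```

The design trade-off is exactly the one the tweet names: rebuilding costs a little latency per turn, but the window only ever contains what the current prompt needs.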
kailash @hsaliak·
@jtregunna I'd love to hear your take on the approach I've taken (doc linked earlier). It's a turn-based rolling window that just tells the LLM where to look for history. I've found it effective. I'll check out ctrl+code.

Jeremy Tregunna @jtregunna·
@hsaliak Yeah, my approach is not quite fully complete in ctrl+code, but it's in this direction. Don't even need an embedding model if you're ok burning a turn in the loop on it though, but yup. The key is to shed messages like we'd shed traffic during a DoS attack on a website.

kailash @hsaliak·
@jtregunna But your approach will certainly work. 1. Build a code map based on the user request. 2. Use an embedding model to obtain relevant data from it. 3. Ignore the irrelevant (or consider the previous N turns' user messages). Has legs. It could work!

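The three numbered steps above can be sketched end to end. This is a toy illustration under loud assumptions: a bag-of-words cosine similarity stands in for the real embedding model, and the code map, file names, and threshold are all invented for the example.

```python
# Steps 1-3 sketched: score each code-map entry against the request with a
# toy "embedding" (bag-of-words cosine), then drop everything irrelevant.

import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def relevant_entries(code_map: dict[str, str], request: str,
                     threshold: float = 0.1) -> list[str]:
    """Steps 2 and 3: rank map entries against the request, ignore the rest."""
    q = embed(request)
    scored = [(cosine(embed(desc), q), path) for path, desc in code_map.items()]
    return [path for score, path in sorted(scored, reverse=True) if score > threshold]

# Step 1 would build this map from the repo for the current request;
# the entries here are made up.
code_map = {
    "auth/login.py": "handles user login and session tokens",
    "billing/invoice.py": "generates monthly invoices",
}
print(relevant_entries(code_map, "why does login fail for new users"))  # ['auth/login.py']
```

Extending step 3 to "consider the previous N turns' user messages" would just mean concatenating those messages into the request before embedding it.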
kailash @hsaliak·
@jtregunna I've noticed that a 'message group' (a user request through to the final response) often varies; it's like moving from one scene to another. I take advantage of this insight in designing the context. You can cover multi-turn context by tuning the number of groups to consider.

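The message-group windowing described above can be sketched directly: split the transcript into groups that each start at a user request, then keep only the last N groups in the context. A minimal sketch with invented message structure; the real design presumably also tells the model where to find the dropped history.

```python
# Message-group rolling window: each group runs from a user request to the
# final response; the context keeps only the most recent N groups ("scenes").

def group_messages(messages: list[dict]) -> list[list[dict]]:
    """Split a transcript into groups, each starting at a user message."""
    groups: list[list[dict]] = []
    for msg in messages:
        if msg["role"] == "user" or not groups:
            groups.append([])
        groups[-1].append(msg)
    return groups

def rolling_context(messages: list[dict], n_groups: int = 2) -> list[dict]:
    """Keep only the last N groups; earlier scenes fall out of the window."""
    groups = group_messages(messages)
    return [m for g in groups[-n_groups:] for m in g]

transcript = [
    {"role": "user", "content": "add a parser"},
    {"role": "assistant", "content": "done"},
    {"role": "user", "content": "now fix the lexer"},
    {"role": "assistant", "content": "fixed"},
    {"role": "user", "content": "write tests"},
]
print(len(rolling_context(transcript, n_groups=2)))  # 3: the last two groups
```

Tuning `n_groups` is the knob mentioned in the tweet: more groups covers longer multi-turn tasks, fewer keeps the window tight when each request is a fresh scene.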
kailash @hsaliak·
The multi-agent feature announced by @zeddotdev totally works with std::slop's ACP support! I don't think I'll merge this feature to main because of the complexity of the protocol, but it's been an interesting learning experiment.

kailash @hsaliak·
@bkaradzic @olson_dan No arguments there, but this nuance is lost on a non-technical audience with a blanket policy.

Dan Olson @olson_dan·
Allowing C++20 in the codebase is something I wish I'd fought against.

kailash @hsaliak·
@bkaradzic @olson_dan Many companies or teams face an uphill battle the moment discussion of vendoring a GPLv3 codebase starts, regardless of use. It’s a hurdle.

kailash @hsaliak·
@olson_dan No-exceptions Abseil and C++17 work for me. My problem with C++20 is not really the language changes, but the standard library bloat with features unsuitable for high-performance code. Ranges in C++20, for example.

kailash @hsaliak·
Spent time this weekend on a new release of std::slop. Native OAuth flow, so you can get going with a single binary download. Revised documentation github.com/hsaliak/std_sl… UX polish. github.com/hsaliak/std_sl… ACP support remains in a separate branch for now. I’m not convinced.

kailash @hsaliak·
The “zig is better than rust” arguments imply that the Swiss cheese model for safety was always better than safety at compile time. Perhaps what the world needed was a no-gc language with modern ergonomics and packaging after all.

kailash @hsaliak·
@jorandirkgreef The value of compounding engineering work. It beats haste every time.

Joran Dirk Greef @jorandirkgreef·
I asked the TigerBeetle team yesterday: “What are the things that accelerate us existentially, by orders of magnitude?” Everyone said: “Exponential quality” “First principles understanding” “Systems thinking” “A methodology that’s 2nd order remarkable” Guess what nobody said?

Bryan Johnson @bryan_johnson·
Friends, stop drinking alcohol. Not cut back. Eliminate.
> alcohol increases cortisol
> disrupts REM sleep
> accelerates epigenetic aging
> shrinks hippocampal volume
> elevates resting heart rate
> raises inflammatory markers
> impairs glucose metabolism for 16 hrs
One drink does that.

kailash @hsaliak·
@olson_dan Please don’t get me wrong, I know I can still do this, but the pricing model is not consistent with what I perceive to be the ideal workflow. Pricing has to make the right things easy and the wrong things hard; hence the questions.

kailash @hsaliak·
@olson_dan Ok, I see: use the SOTA model for one-shot work but something simpler for conversational workflows. I still feel this is the opposite of how LLMs work; I need to converse and “close the loop” in a fundamentally non-deterministic workflow, but I will try it!

Dan Olson @olson_dan·
I am a Microsoft employee if you go down the org chart far enough, so take that as a disclaimer. But I think GitHub Copilot is one of the best values in AI all around right now for personal stuff. OpenCode and Copilot CLI both support it, and both are not bad. Another disclaimer: I haven't tried Codex or Claude Code to see how they compare.