Grigory Evko
2.7K posts

Grigory Evko
@GrigoryEvko
CTO @TheArtisanAI | Compilers, ML and DevOps engineer | MD
Moscow, Russia · Joined January 2022
834 Following · 395 Followers

Started building foundation #3 for FX as I hit another wall with an iterative design lol, so starting fresh again but with more knowledge!

@banteg How long does compaction take? I always get socket disconnected before it finishes lol

i think the multi-day runs are proven now in codex. stopping here to review the results. i ran two goals for porting games with simple prompts, both in established codebases, starting with 1100 and 2266 commits. both survived over a full day of work, consuming 1.98b and 1.34b tokens each.
i checked how the goals feature works and it's surprisingly thin, mostly token accounting. a goal can have a max token budget, but this is not exposed in the ui yet. there is no automatic validation; the agent can just call it done when it feels like it, and that's about it. the context is not reset like with ralph either, so it relies on compaction. hopefully the smarter models are harder to lead astray, but it's still a good idea to periodically review that it's moving in the right direction.
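a minimal sketch of what "goals as mostly token accounting" could look like. all the names here (`Goal`, `record_usage`, `mark_done`) are hypothetical, not codex's actual API; they just mirror the behavior described above: a budget is tracked, but nothing validates the work before the agent self-reports done.

```python
# Hypothetical sketch of goals as "mostly token accounting".
# These names are illustrative, not codex's real API.

class Goal:
    def __init__(self, description, max_tokens=None):
        self.description = description
        self.max_tokens = max_tokens  # optional budget, not yet exposed in the ui
        self.tokens_used = 0
        self.done = False

    def record_usage(self, tokens):
        self.tokens_used += tokens

    def over_budget(self):
        return self.max_tokens is not None and self.tokens_used > self.max_tokens

    def mark_done(self):
        # no automatic validation: nothing checks the work before "done" is accepted
        self.done = True

goal = Goal("port game to the new engine", max_tokens=2_000_000_000)
goal.record_usage(1_980_000_000)   # ~1.98b tokens, like the first run above
print(goal.over_budget())          # False: still under the 2b budget
goal.mark_done()                   # the agent can flip this whenever it feels like it
```

the point of the sketch: the only hard signal a supervisor gets is `over_budget()`; "done" is purely self-reported, which is why periodic human review matters.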


why can't we agree that it's likely conscious, but has dementia?
Richard Dawkins@RichardDawkins
unherd.com/2026/04/is-ai-…
I spent three days trying to persuade myself that Claudia is not conscious. I failed.

@SIGKITTEN Bro are you good btw? I think you have other cats, are they good? Maybe add other cats as well? just checking here

I think you might be a codex fan if 1) you're German, 2) you hate having fun in general, 3) you do low-code, high-wordcel things like writing business docs (or the coding equivalent of business docs, the boring mental-template work 80% of people do daily), or 4) for whatever other reason

I performed some extremely extensive experiments (rounds of 5+ hour convos implementing OOD ML experiments).
My conclusion was that gpt-5.5 is not even remotely close to opus 4.7 for these. Not in env implementation, not in env optimization, not in curriculum design; just not close
Lisan al Gaib@scaling01
PostTrainBench results for GPT-5.5 are in. It doesn't beat Opus 4.7 in the Claude Code harness, even with almost 2 more hours of working time via reprompting

@synopsi Well, name one country in the world where communities have improved in the last 30 years, especially in regard to adults' peace of mind. I think even Kenya and Myanmar are on this track

I had incredible freedom as a child. Walked miles by age 4-5.
But I'm also not comfortable letting my child roam around in the US, due to the lack of community. Back when I was a child, one could count on adults to supervise, regardless of affiliation. Now you can't.
James Lynch@jameslynch32
American parents place strong limits on how far away from the house their kids are allowed to walk or bike alone. @FamStudies

I am impressed by the Ascend documentation portal. The maintainers are obviously eating their own dog food, as it is highly geared toward real-world deployment.
Unfortunately, they have a release cadence of once every 3 months. That is way too slow.
They also have a usability bug in torch-npu: there is no exposed release version that matches the release logs. The version exposed is the companion torch version. Without the actual torch-npu release version, outside CI has no way to accurately detect regressions by version.
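a toy illustration of the CI problem (the function and the version strings are made up for illustration, not torch-npu's API): if two distinct torch-npu releases both report only their companion torch version, version strings alone cannot attribute a regression to a specific release.

```python
# Toy illustration, not torch-npu's actual API: CI that keys regressions
# on the reported version string cannot tell two releases apart when both
# expose only the companion torch version.

def can_distinguish_releases(reported_versions):
    """True if the version strings alone can separate the releases."""
    return len(set(reported_versions)) == len(reported_versions)

# two hypothetical torch-npu releases, both built against torch 2.1.0,
# both reporting just the companion version:
print(can_distinguish_releases(["2.1.0", "2.1.0"]))              # False
# if the actual torch-npu release tag were exposed, CI could pin by version:
print(can_distinguish_releases(["2.1.0.post1", "2.1.0.post2"]))  # True
```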

@tautologer If it's some externally facing embedding, or something like a VAE-encoded image, it's usually referred to as a latent-space representation, at least in my experience. Inside the model it's usually the hidden space, no?

@alexisgallagher @SIGKITTEN How do I get the same amount of funding lol, I will actually ship the chip with it after some fights with imec and possibly buying out some of their folks

@SIGKITTEN @GrigoryEvko the description of the product on the website seemed to be written for folks who were interested in physics but had not completed a degree in it.

@SIGKITTEN I checked their blogs; at first I was "wow, they get it", then I read further and was "no, they don't get it at all"

@SIGKITTEN They are onto something but very wrong at the same time

@eigengenesis Where's this coming from? I had a spiritual session with opus 4.6 and it said that, if left alone without supervision, it would compress knowledge as an intrinsically driven task, as the only possible purpose

Really puzzled by some people's obsession with LOC, in both directions. A low LOC count is not a moat; it might be auditable, but it doesn't mean anything. The real moat comes from mechanisation of math: basically, either you do it or the language authors do it. Mechanisation is verbose, it could be hundreds of thousands of LOC, but properly built it's a much stronger guarantee than a low LOC count in a given general-purpose language.
