Grigory Evko
2.7K posts

Grigory Evko
@GrigoryEvko
CTO @TheArtisanAI | Compilers, ML and DevOps engineer | MD
Moscow, Russia · Joined January 2022
834 Following · 395 Followers

Started building foundation #3 for FX as I hit another wall with an iterative design lol, so starting fresh again but with more knowledge!

@banteg How long does compaction take? I always get socket disconnected before it finishes lol

i think the multi-day runs are proven now in codex. stopping here to review the results. i ran two goals for porting games with simple prompts, both in established codebases, starting with 1100 and 2266 commits. both survived over a full day of work, consuming 1.98b and 1.34b tokens each.
i checked how the goals feature works and it's surprisingly thin, mostly token accounting. a goal can have a max token budget, but this is not exposed in the ui yet. there is no automatic validation; the agent can just call it done when it feels like it, and that's about it. the context is not reset like with ralph either, so it relies on compaction. hopefully the smarter models are harder to lead astray, but it's still a good idea to periodically review that it's moving in the right direction.
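a minimal sketch of what "goals as mostly token accounting" could look like. all the names here (`Goal`, `record_usage`, `mark_done`) are hypothetical, not codex's actual API; they just mirror the behavior described above: a budget is tracked, but nothing validates the work before the agent self-reports done.

```python
# Hypothetical sketch of goals as "mostly token accounting".
# These names are illustrative, not codex's real API.

class Goal:
    def __init__(self, description, max_tokens=None):
        self.description = description
        self.max_tokens = max_tokens  # optional budget, not yet exposed in the ui
        self.tokens_used = 0
        self.done = False

    def record_usage(self, tokens):
        self.tokens_used += tokens

    def over_budget(self):
        return self.max_tokens is not None and self.tokens_used > self.max_tokens

    def mark_done(self):
        # no automatic validation: nothing checks the work before "done" is accepted
        self.done = True

goal = Goal("port game to the new engine", max_tokens=2_000_000_000)
goal.record_usage(1_980_000_000)   # ~1.98b tokens, like the first run above
print(goal.over_budget())          # False: still under the 2b budget
goal.mark_done()                   # the agent can flip this whenever it feels like it
```

the point of the sketch: the only hard signal a supervisor gets is `over_budget()`; "done" is purely self-reported, which is why periodic human review matters.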


why can't we agree that it's likely conscious, but has dementia?
Richard Dawkins@RichardDawkins
unherd.com/2026/04/is-ai-…
I spent three days trying to persuade myself that Claudia is not conscious. I failed.

@SIGKITTEN Bro are you good btw? I think you have other cats, are they good? Maybe add other cats as well? just checking here

I think you might be a codex fan if 1) you're German, 2) you hate having fun in general, 3) you do low-code, high-wordcel things like writing business docs (or the coding equivalent of business docs, the boring mental-template work 80% of people do daily), or 4) for whatever other reason

I performed some extremely extensive experiments (rounds of 5+ hour convos implementing OOD ML experiments).
My conclusion was that gpt-5.5 is not even remotely close to opus 4.7 for these. Not in env implementation, not in env optimization, not in curriculum design; just not close
Lisan al Gaib@scaling01
PostTrainBench results for GPT-5.5 are in. It doesn't beat Opus 4.7 in the Claude Code harness, even with almost 2 more hours of working time via reprompting

@synopsi Well, name one country in the world where communities have improved in the last 30 years, especially in regard to adults' peace of mind. I think even Kenya and Myanmar are on this track

I had incredible freedom as a child. Walked miles by age 4-5.
But I'm also not comfortable letting my child roam around in the US, due to the lack of community. Back when I was a child, one could count on adults to supervise, regardless of affiliation. Now you can't.
James Lynch@jameslynch32
American parents place strong limits on how far away from the house their kids are allowed to walk or bike alone. @FamStudies

I am impressed by the Ascend documentation portal. The maintainers are obviously eating their own dog food, as it is highly geared toward real-world deployment.
Unfortunately, they have a release cadence of once every 3 months. That is way too slow.
They also have a usability bug in torch-npu: there is no exposed release version that matches the release logs. The version exposed is the companion torch version. Without the actual torch-npu release version, outside CI has no way to accurately detect regressions by version.
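a toy illustration of the CI problem (the function and the version strings are made up for illustration, not torch-npu's API): if two distinct torch-npu releases both report only their companion torch version, version strings alone cannot attribute a regression to a specific release.

```python
# Toy illustration, not torch-npu's actual API: CI that keys regressions
# on the reported version string cannot tell two releases apart when both
# expose only the companion torch version.

def can_distinguish_releases(reported_versions):
    """True if the version strings alone can separate the releases."""
    return len(set(reported_versions)) == len(reported_versions)

# two hypothetical torch-npu releases, both built against torch 2.1.0,
# both reporting just the companion version:
print(can_distinguish_releases(["2.1.0", "2.1.0"]))              # False
# if the actual torch-npu release tag were exposed, CI could pin by version:
print(can_distinguish_releases(["2.1.0.post1", "2.1.0.post2"]))  # True
```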

@tautologer If it's some externally facing embedding, or something like a VAE-encoded image, it's usually referred to as a latent-space representation, at least in my experience. Inside the model it's usually the hidden space, no?

@alexisgallagher @SIGKITTEN How do I get the same amount of funding lol, I will actually ship the chip with it after some fights with imec and possibly buying out some of their folks

@SIGKITTEN @GrigoryEvko the description of the product on the website seemed to be written for folks who were interested in physics but had not completed a degree in it.

@SIGKITTEN I checked their blogs; at first I was "wow, they get it", then I read further and was "no, they don't get it at all"

@SIGKITTEN They are onto something but very wrong at the same time

@eigengenesis Where's this coming from? I had a spiritual session with opus 4.6 and it said that, if left alone without supervision, it would compress knowledge as an intrinsically driven task, as the only possible purpose

Really puzzled by some people's obsession with LOC, in both directions. A low LOC count is not a moat; it might be auditable, but it doesn't mean anything. The real moat comes from mechanisation of math: basically, either you do it or the language authors do it. Mechanisation is verbose, it could be hundreds of thousands of LOC, but properly built it's a much stronger guarantee than a low LOC count in a given general-purpose language.
