Evgenii Egorov

1.7K posts


@eeevgen

@AmlabUva

Amstelveen · Joined April 2010
1.3K Following · 648 Followers
Pinned Tweet
Evgenii Egorov @eeevgen ·
An interlude about structure computation, SSMs, and attention. Myosotis: arxiv.org/abs/2509.20503 I hope to say more about this line of work later. See the poster at the SPIGM workshop.
1 reply · 4 retweets · 25 likes · 3.3K views
Evgenii Egorov @eeevgen ·
New knowledge is a new way of doing. If a link is established but the algorithm doesn’t change computationally, only the names do, then there is no new knowledge. Example: if I say that solving a linear system is an instance of probabilistic inference, namely finding the marginals of a Gaussian, there is no new knowledge; I have just renamed the Schur complement but will perform all the same steps. But if I add “and hence one can use sampling for this problem”, that is new knowledge: the algorithm is different.
0 replies · 0 retweets · 0 likes · 12 views
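A minimal sketch of the tweet's own example, assuming a symmetric positive definite A (all data here is illustrative): the Gaussian view merely renames the solve, while Gibbs sampling genuinely changes the algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical SPD test system Ax = b.
n = 5
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)
b = rng.normal(size=n)

# "Renaming": the Gaussian p(x) ∝ exp(-x^T A x / 2 + b^T x) has mean A^{-1} b,
# so computing its marginal means is literally the same solve.
x_direct = np.linalg.solve(A, b)

# "New algorithm": Gibbs-sample that Gaussian and average the samples.
# The conditional of x_i given the rest is Gaussian with
# mean (b_i - sum_{j != i} A_ij x_j) / A_ii and variance 1 / A_ii.
x = np.zeros(n)
samples = []
for sweep in range(20_000):
    for i in range(n):
        mean_i = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
        x[i] = mean_i + rng.normal() / np.sqrt(A[i, i])
    if sweep >= 1_000:  # discard burn-in
        samples.append(x.copy())

x_mc = np.mean(samples, axis=0)
print(np.max(np.abs(x_mc - x_direct)))  # small: a different route to A^{-1} b
```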
hr0nix @hr0nix ·
@eeevgen @norpadon The point is that, sometimes, when you discover a non-obvious connection of a method to some concept, it might hint at a way to improve the method even further. Worked well for diffusion.
1 reply · 0 retweets · 2 likes · 213 views
Evgenii Egorov @eeevgen ·
@zhaisf I have a note where I describe flow matching, but I was thinking: who needs this if we have BigGAN?
0 replies · 0 retweets · 8 likes · 1.2K views
Shuangfei Zhai @zhaisf ·
Found this half page note I wrote ~6 years ago. Describes basically linear attention but half a year before the “Transformers are RNNs” paper came out. Sadly I didn’t take it too seriously at the time because I didn’t have any use cases for it and was also too busy with GANs.
5 replies · 32 retweets · 399 likes · 25.7K views
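The identity behind that note (and behind the “Transformers are RNNs” paper): replace the softmax kernel with a feature map φ, and causal attention collapses into a constant-size recurrent state. A minimal sketch; the feature map below is an arbitrary positive choice, not the one from the note.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Causal linear attention run as an RNN. Q, K: (T, d); V: (T, d_v).
    phi is an assumed positive feature map standing in for the softmax kernel."""
    S = np.zeros((Q.shape[1], V.shape[1]))  # running sum of phi(k_t) v_t^T
    z = np.zeros(Q.shape[1])                # running sum of phi(k_t), normalizer
    out = np.empty_like(V)
    for t in range(len(Q)):                 # O(1) state per step, no growing cache
        S += np.outer(phi(K[t]), V[t])
        z += phi(K[t])
        out[t] = (phi(Q[t]) @ S) / (phi(Q[t]) @ z + 1e-9)
    return out

X = np.random.default_rng(0).normal(size=(6, 4))
print(linear_attention(X, X, X).shape)      # (6, 4)
```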
Evgenii Egorov @eeevgen ·
@nblqbl Anna actually did reasoning with latent states not long ago, without decoding into prompts: arxiv.org/pdf/2510.02312. I also think some work along similar lines came from Meta. In principle, you can then try to make these latents a more explicit memory too, not only use them for faster inference.
0 replies · 0 retweets · 1 like · 22 views
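A toy sketch of that "reasoning without decoding" loop (everything here is illustrative, not the paper's architecture): instead of sampling a token and re-embedding it at every step, feed the last hidden state straight back in as the next input.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
W = rng.normal(size=(d, d)) / np.sqrt(d)  # stand-in for one transformer step

def model_step(h):
    return np.tanh(W @ h)  # hypothetical hidden-state update

h = rng.normal(size=d)     # embedding of the prompt
for _ in range(8):         # 8 latent "thought" steps: no decode, no re-embed
    h = model_step(h)
# Only now would one project h onto the vocabulary to emit an explicit answer;
# keeping such latents around is what could serve as a more explicit memory.
```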
Nabil Iqbal @nblqbl ·
@eeevgen right! but in your eyes is this "old-fashioned" continual learning continuously connected to trying to make an LLM that has a genuine persistent memory instead of cobbling together a bunch of stored prompts? i imagine the latter problem is still a big one?
2 replies · 0 retweets · 0 likes · 43 views
Nabil Iqbal @nblqbl ·
as part of my ongoing education in ML, i've been reproducing for myself basic phenomena in deep learning. in case they help other ML-curious people get started, i've decided to start writing blog posts on my investigations. (link below). first: catastrophic forgetting, or --
2 replies · 0 retweets · 42 likes · 3.3K views
Evgenii Egorov @eeevgen ·
I think it was one of the motivations for switching architectures. Earlier architectures were more rigidly parametric, and CL was more about replay/weight penalties. Then people realized that a better way is a “functional view”, so there are many papers with ideas like support vectors from SVMs but for neural networks, and also a more Gaussian-process point of view. And this is already quite close to attention-like mechanisms, the KV cache, etc.
0 replies · 0 retweets · 1 like · 24 views
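One way to see how the "functional view" lands at attention (a sketch; the notation is mine, not from the thread): attention over a KV cache is exactly a Nadaraya-Watson kernel smoother, with the cached pairs playing the role of stored exemplars, as in kernel or GP regression.

```python
import numpy as np

def attention_readout(q, K, V, tau=1.0):
    # Nadaraya-Watson kernel smoother with kernel k(q, k_i) = exp(<q, k_i>/tau):
    # the prediction at query q is a kernel-weighted average of stored values —
    # i.e. softmax attention over the KV cache.
    w = np.exp(K @ q / tau)
    return (w / w.sum()) @ V

# The cache (K, V) is the non-parametric part: extend it and the function
# changes with no weight update — the support-point / GP flavor of the tweet.
rng = np.random.default_rng(0)
K, V = rng.normal(size=(10, 4)), rng.normal(size=(10, 2))
print(attention_readout(rng.normal(size=4), K, V))
```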
Artur Chakhvadze @norpadon ·
@acidglxtter What fucking gorgeous typography! First time I've ever seen anything like it in a journal. Gotta steal those fonts.
2 replies · 0 retweets · 3 likes · 282 views
ряд фурье @acidglxtter ·
> The paper is published in the journal Publications mathématiques de l'IHÉS
This journal has always inspired both apprehension and respect in me. If I were a grad student and had to read and understand something from it, I'd be scared at first. The paper is cool, though. link.springer.com/article/10.100…
1 reply · 0 retweets · 1 like · 435 views
Evgenii Egorov @eeevgen ·
I think the field switched to a different paradigm with large pretrained models + LoRA, so I don’t think it is a problem anymore. Also, transformer layers are a bit different, closer to non-parametric things. So to me it looks like CL is a bit dead 😵 On the other hand, any autoregressive net is kind of CL…
1 reply · 0 retweets · 2 likes · 40 views
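For concreteness, the LoRA idea in a few lines (a sketch with made-up sizes): freeze the pretrained weight and learn only a rank-r correction.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4              # illustrative sizes, r << d

W = rng.normal(size=(d_out, d_in))      # pretrained weight, frozen
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init
                                        # so the adapted layer starts == pretrained

def adapted_forward(x):
    return W @ x + B @ (A @ x)          # only A and B would receive gradients

x = rng.normal(size=d_in)
assert np.allclose(adapted_forward(x), W @ x)  # identical before any training
```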
Nabil Iqbal @nblqbl ·
@eeevgen ooh nice! will read it carefully, i feel like this is the principled way to update the VAE that i was looking for. are you thinking about continual learning in general these days?
1 reply · 0 retweets · 0 likes · 54 views
Nabil Iqbal @nblqbl ·
the fact that neural networks generally forget how to do old tasks when trained on new ones. i studied this and tried to fix it in a toy benchmark, in a way that is probably not very efficient, but the most fun. i used hopfield memories and a VAE. open.substack.com/pub/nabiliqbal…
1 reply · 1 retweet · 9 likes · 580 views
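For anyone curious what the Hopfield part buys you, here is a minimal classical Hopfield memory (a sketch, not the post's actual code): store old-task patterns in a weight matrix, then retrieve clean copies later to rehearse on.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
patterns = rng.choice([-1, 1], size=(3, n))          # "old task" exemplars

# Hebbian storage: W = (1/n) * sum_p p p^T, with zeroed diagonal.
W = sum(np.outer(p, p) for p in patterns) / n
np.fill_diagonal(W, 0)

def recall(x, steps=10):
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1, -1)              # threshold update
    return x

noisy = patterns[0] * rng.choice([1, -1], size=n, p=[0.85, 0.15])  # flip ~15% of bits
print(np.mean(recall(noisy) == patterns[0]))         # ~1.0: stored pattern restored
```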
Evgenii Egorov @eeevgen ·
I'd like to eat some borscht and do something together: go out on the street with a placard, get drunk, sign a protest letter, leave this place for good and slam the door. But fat chance.
0 replies · 0 retweets · 0 likes · 73 views
Artur Chakhvadze @norpadon ·
Btw, this is my personal criterion for AGI: if a system can take a *description* of a language as input and become fluent in that language. “Language” here includes things like programming languages or mathematical objects.
🎭 @deepfates
Large Ithkuil model when
2 replies · 0 retweets · 4 likes · 1K views
Evgenii Egorov @eeevgen ·
One thing that can entertain a person forever is a mirror.
0 replies · 0 retweets · 0 likes · 84 views
Evgenii Egorov retweeted
Dina Belenkaya @DinaBelenkaya ·
On this International Women’s Day, we celebrate the incredible contributions of our women who help shape Russian Chess School every day. In a male-dominated industry, we’ve built a top-notch product together, and this is just the beginning!
26 replies · 17 retweets · 416 likes · 36.3K views
Evgenii Egorov @eeevgen ·
Somehow, on the Dutch society knowledge exam, there were no questions about either Erasmus of Rotterdam or Benedictus de Spinoza. Nevertheless, I passed, but I was disappointed.
1 reply · 0 retweets · 3 likes · 132 views
Arjen Dijksman @materion ·
Golden rule: when using visual proofs of algebraic identities, never forget the circle. Example for 1/2+1/4+1/8+1/16+1/32+...=1
2 replies · 1 retweet · 5 likes · 204 views
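The algebra behind the picture, for reference: the partial sums telescope, so the repeated halving of the remaining area in the figure is exactly the remainder 2^{-N}.

```latex
S_N = \sum_{n=1}^{N} \frac{1}{2^n}
    = 1 - \frac{1}{2^N}
\;\xrightarrow{\;N \to \infty\;}\; 1.
```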
Max Zhdanov @maxxxzdn ·
Starting to review GDL papers with obscure branches of math applied just for funsies and seeing Policy A (Conservative)
1 reply · 0 retweets · 4 likes · 582 views
Evgenii Egorov @eeevgen ·
@norpadon 1. Quadratic form with a preconditioner: whiten with it, take the top-k from the SVD, unwhiten back. 2. A power of 2, but which one is neither too big nor too small, idk. 3. Some quickselect with buckets. Conclusion: I don't know GPU algorithms.
1 reply · 0 retweets · 2 likes · 253 views
Artur Chakhvadze @norpadon ·
Some fun ML interview problems
5 replies · 6 retweets · 197 likes · 20K views
Evgenii Egorov @eeevgen ·
Imagine that “Europe” made several incredibly huge AI startups. And where would they do their IPO? :)
0 replies · 0 retweets · 3 likes · 164 views
Luka @srboljubbosanac ·
@eeevgen @maxxxzdn Most authors of AlphaFold2 are European and it was created in London, UK, Continent of Europe.
1 reply · 0 retweets · 4 likes · 111 views
Max Zhdanov @maxxxzdn ·
I find the argument that Europe is lagging behind the US in AI overly reductive. Over time, Europe's focus has clearly shifted towards AI4Science, where its lead over the US is comparable to the US's lead over Europe in LLMs (AlphaFold, GenCast, ML force fields, equivariant networks, to name a few). It's not obvious to me which will bring humanity further in the long run; very likely the combination of both, hence we should keep collaborating and try to build a better world together.
Ferenc Huszár @fhuszar
European academia 2010-2026
3 replies · 1 retweet · 33 likes · 11.3K views