Sergey Pozdnyakov

139 posts

@spozdn

Postdoc @SchwallerGroup @EFFL | Geometric deep learning on 3D point clouds for atomistic modeling

Lausanne · Joined May 2023
513 Following · 352 Followers
Sergey Pozdnyakov retweeted
Jean-Philip Piquemal@jppiquem·
💫 As promised, we just released on GitHub the weights of the #FeNNixBio1 foundation machine learning model for drug design! 💫 Weights: github.com/FeNNol-tools/F… FeNNol GPU code: github.com/FeNNol-tools/F… The models are distributed under the open-source ASL license (i.e. restricted to non-commercial academic research). You can also check the updated version of the preprint, which includes a unified transformer architecture as well as the full computation of the FreeSolv hydration free energies dataset, etc. doi.org/10.26434/chemr… Happy holidays and merry Christmas everyone! 🎅 🎄 Sorbonne Université / CNRS @qubit_pharma #machinelearning #moleculardynamics #drugdesign #compchem #GPU #biophysics
Alex Shtoff@AlexShtf·
@spozdn @HannesStaerk I think some of the speedup can come from just using the right memory layout, i.e., which dimensions of the learned parameters come in which order. Same for the argument.
Sergey Pozdnyakov@spozdn·
@AlexShtf @HannesStaerk I would say that the main speedup comes from using shared memory (L1 cache), which is hardly achievable in pure Python. Nevertheless, there could indeed be some opportunities to speed up, if not so dramatically.
Alex Shtoff@AlexShtf·
@spozdn @HannesStaerk Care to contribute to torchcurves? I'd like to keep it pure Python (no custom kernels) at this stage, but I'm pretty sure you have some learnings and expertise that transfer across implementation types.
Alex Shtoff@AlexShtf·
@spozdn @HannesStaerk I see. Nice! I think it would be interesting to see a comparison of your custom kernels versus: - a direct vectorized implementation of the Cox-de Boor algorithm in PyTorch (torch.searchsorted for the knot-span lookup, then Cox-de Boor for the computation); - a torch.compile variant of the above.
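The vectorized PyTorch baseline suggested above might look roughly like this (a minimal sketch assuming a strictly increasing knot vector, so no zero-width spans; `bspline_basis` is a hypothetical helper, not torchcurves API):

```python
import torch

def bspline_basis(x, knots, degree):
    """All B-spline basis functions at points x via the Cox-de Boor
    recursion, vectorized over the batch.

    x:      (N,) query points inside [knots[degree], knots[-degree-1])
    knots:  (K,) strictly increasing knot vector
    returns (N, K - degree - 1) basis values
    """
    K = knots.shape[0]
    # degree-0 bases: indicator of the knot span containing x
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).to(x.dtype)
    for p in range(1, degree + 1):
        # Cox-de Boor: B_{i,p} = w1_i * B_{i,p-1} + w2_i * B_{i+1,p-1}
        w1 = (x[:, None] - knots[: K - p - 1]) / (knots[p : K - 1] - knots[: K - p - 1])
        w2 = (knots[p + 1 : K] - x[:, None]) / (knots[p + 1 : K] - knots[1 : K - p])
        B = w1 * B[:, :-1] + w2 * B[:, 1:]
    return B

knots = torch.linspace(0.0, 1.0, 11)                  # uniform, strictly increasing
x = torch.tensor([0.35, 0.52, 0.65])                  # inside the valid domain [0.3, 0.7)
B = bspline_basis(x, knots, degree=3)                 # (3, 7)
span = torch.searchsorted(knots, x, right=True) - 1   # knot-span index per point
```

Inside the valid domain the bases are nonnegative and sum to 1 (partition of unity), a handy correctness check; only the `degree + 1` bases around each `span` index are nonzero, which is what a sparse `torch.searchsorted`-based implementation exploits.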
Sergey Pozdnyakov@spozdn·
(*) CUDA kernels so that lmKANs > MLPs on modern GPUs, and (**) a multivariate extension. GPUs and the current ecosystem are built for dense matrix multiplications, not for lookup tables, so a naive implementation is very slow. With custom CUDA kernels, we made lmKANs Pareto-optimal compared to MLPs on modern GPUs. Also, the proposed multivariate extension of KANs, relying on 2D functions, performs much better than standard 1D KANs.
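To see why a naive lookup-table layer fights the hardware, here is a toy piecewise-linear function layer in pure PyTorch (an illustrative sketch, not the lmKAN implementation; `PiecewiseLinearLayer`, the grid range, and the bin count are all made up for the example). The forward pass is dominated by gather-style indexing into the table rather than by one dense matmul:

```python
import torch

class PiecewiseLinearLayer(torch.nn.Module):
    """Each (input, output) pair owns a learnable piecewise-linear
    function tabulated on a uniform grid over [lo, hi]."""
    def __init__(self, d_in, d_out, n_bins=16, lo=-3.0, hi=3.0):
        super().__init__()
        self.lo, self.hi, self.n_bins = lo, hi, n_bins
        # function values at the n_bins + 1 grid nodes
        self.table = torch.nn.Parameter(0.1 * torch.randn(d_in, d_out, n_bins + 1))

    def forward(self, x):                                   # x: (batch, d_in)
        t = (x.clamp(self.lo, self.hi) - self.lo) / (self.hi - self.lo) * self.n_bins
        idx = t.floor().long().clamp(max=self.n_bins - 1)   # bin index per entry
        frac = (t - idx.to(t.dtype)).unsqueeze(-1)          # position inside the bin
        rows = torch.arange(self.table.shape[0], device=x.device)
        left = self.table[rows, :, idx]                     # (batch, d_in, d_out):
        right = self.table[rows, :, idx + 1]                # scattered memory reads,
        y = (1.0 - frac) * left + frac * right              # not a dense matmul
        return y.sum(dim=1)                                 # (batch, d_out)

layer = PiecewiseLinearLayer(d_in=4, d_out=3, n_bins=8)
out = layer(torch.randn(5, 4))                              # (5, 3)
```

Every sample triggers two scattered table reads per (input, output) pair, so arithmetic intensity is low and the GPU's matmul units sit idle; closing that gap is what custom shared-memory CUDA kernels are for.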
Alex Shtoff@AlexShtf·
By the way, I do not understand the main contributions of the work. Tensor-product splines have been known for decades. Efficient algorithms for parametric spline surfaces have been written since the 90s in games and CAD apps. So is it the PyTorch port of these efficient algorithms? Or is it the observation that tensor-product splines actually work well in practice for KANs? Or is it something else I missed? The paper's text isn't very explicit about it.
Sergey Pozdnyakov retweeted
Adil Kabylda@kabylda_·
Our new work, “QCell: Comprehensive Quantum-Mechanical Dataset Spanning Diverse Biomolecular Fragments,” is now out on arXiv! 🌱 1/7
Sergey Pozdnyakov retweeted
Philippe Schwaller (he/him)
Exciting postdoc opportunity in the @SchwallerGroup at EPFL! We're hiring a postdoc to advance ML-driven synthesis planning after Zlatko Joncev’s successful exit to co-found B-12 (YC '25) 🚀 Work on: - LLMs for strategic synthesis planning - Chemical reasoning at scale - Building the next-gen framework for retrosynthesis Our recent preprint shows that LLMs can guide synthesis planning with natural-language strategies — combining AI reasoning with traditional chemical tools (arxiv.org/abs/2503.08537). Join us at the intersection of chemistry & AI. Up to 2 years. Based in Lausanne 🇨🇭 Apply: forms.fillout.com/t/nnxVE3RcPpus #ChemTwitter #MachineLearning #SynthesisPlanning #PostdocPosition
Sergey Pozdnyakov@spozdn·
@chaitjo Well, you need your NN to be able to express many-body features of sufficiently high order to be expressive enough. Also, I don't think that restricting the body order really works as a preconditioning. Thus, I would say, the more the better, as long as it stays computationally cheap.
Chaitanya K. Joshi@chaitjo·
ML force fields people - what is the current consensus on the importance of many-body features? All the top models use only 2 GNN layers. Some decouple feature body order from layers, but most don't…
Sergey Pozdnyakov@spozdn·
Interesting question. The Kolmogorov-Arnold representation theorem also requires only one hidden layer, similarly to Cybenko's. The difference is that the former states this for a finite (2n+1) number of neurons in the hidden layer. So I would look into whether spline lookup-table KANs can fold the space similarly (and I'm nearly certain they can), without requiring an infinite number of neurons as in the MLP case.
Andrei Mircea@mirandrom·
@spozdn Cool idea! Makes me think of this visualization of how stacked ReLU linear layers fold a 2D space: youtube.com/watch?v=qx7hir… Is there a way in which spline lookup tables are doing something similar with a single layer?
Sergey Pozdnyakov@spozdn·
High-dimensional linear mappings, or linear layers, dominate both the parameter count and the inference cost of most deep learning models. We propose a general-purpose drop-in replacement with a substantially better capacity-to-inference-cost ratio. Check it out! 🧵
Sergey Pozdnyakov retweeted
Xuan-Vu Nguyen@XuanVuNguyen18·
You don’t like molecular dynamics? We get it. That’s why at this year’s LLM hackathon for Chemistry and Materials Science, we built not one, but ✨two✨ AI agents for molecular dynamics 👇
Sergey Pozdnyakov@spozdn·
We did it for Convolutional Neural Networks, and it works too. lmKAN-based CNNs are Pareto optimal on both the CIFAR-10 and ImageNet-1k datasets, achieving 1.6-2.1× reduction in inference FLOPs at matched accuracy.