caithmac

311 posts

@caithmac

I will build in public! What tho? My life. PhD Computer Science (ongoing): Computer Vision, ML/DL, Philosophy, History.

Joined March 2024

393 Following · 36 Followers

Pinned Tweet

caithmac@caithmac·2 Apr

We just published our work on an explainable active learning framework for ligand–protein binding affinity prediction in Digital Discovery. 🔗 pubs.rsc.org/en/content/art… Here’s a quick breakdown of what we did and why 👇

caithmac@caithmac·1 May

@DdelAlamo In our paper... 650M works. I was co-author of this paper. I just work with ESM35M. I even prefer just tokenising amino acid with 20 letters. And honestly speaking internally it works. Will need to get compute and try ESM15B now.
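
For readers unfamiliar with the "20 letters" remark: a minimal sketch of a plain per-residue tokeniser, assuming the 20 canonical one-letter amino acid codes. The names `AMINO_ACIDS`, `TOKEN_ID`, and `tokenize` are illustrative and are not taken from the paper or from any ESM tooling.

```python
# The 20 canonical amino acids as one-letter codes (assumed vocabulary).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence):
    """Map a protein sequence to integer token ids, one per residue.

    Raises KeyError for any letter outside the 20 canonical codes.
    """
    return [TOKEN_ID[aa] for aa in sequence.upper()]


print(tokenize("ACDY"))  # → [0, 1, 2, 19]
```

A vocabulary this small carries no learned context, which is the trade-off against a pretrained protein language model.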

Diego del Alamo@DdelAlamo·1 May

@caithmac You jest but I've been on many interview panels and seen all kinds of workflows you wouldn't believe

Diego del Alamo@DdelAlamo·1 May

FTFY

Imran S. Haque (@ihaque@{bsky,genomic}.social)@ImranSHaque

just sayin

caithmac@caithmac·11 Apr

@stu_dying6 No.

lehen ki bodi@stu_dying6·10 Apr

if you are an indian man in academia there's a 75% chance that you are wearing a half sleeve cotton printed shirt right now which is probably from fabindia

caithmac@caithmac·11 Apr

@yajnadevam Easy peasy.

yajnadevam@yajnadevam·11 Apr

Week 1: you get a distance r, which means it is ON a circle of radius r of which you are the center. Week 2: move to any other point in the circle. Now you get a second value n; the check is at an intersection of the two circles. There are exactly two such points, so look at both.
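
The two-circle argument above can be sketched in code. This is a minimal illustration of the standard circle-circle intersection formula, assuming flat 2D coordinates; the function name `circle_intersections` and the sample positions are made up for the example.

```python
import math


def circle_intersections(c0, r0, c1, r1):
    """Return the points where two circles meet (0, 1, or 2 points)."""
    (x0, y0), (x1, y1) = c0, c1
    d = math.hypot(x1 - x0, y1 - y0)
    # No usable answer: same centre, too far apart, or one circle inside the other.
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    # a: distance from c0 to the chord joining the intersections, along the centre line.
    a = (r0**2 - r1**2 + d**2) / (2 * d)
    # h: half the chord length.
    h = math.sqrt(max(r0**2 - a**2, 0.0))
    mx = x0 + a * (x1 - x0) / d
    my = y0 + a * (y1 - y0) / d
    p1 = (mx + h * (y1 - y0) / d, my - h * (x1 - x0) / d)
    p2 = (mx - h * (y1 - y0) / d, my + h * (x1 - x0) / d)
    return [p1] if h == 0 else [p1, p2]


# Week 1: observer at (0, 0) measures r = 5. Week 2: observer at (3, 0) measures n = 4.
print(circle_intersections((0.0, 0.0), 5.0, (3.0, 0.0), 4.0))
# → [(3.0, -4.0), (3.0, 4.0)]
```

A third measurement from a point off the line through the first two positions would remove the remaining two-way ambiguity.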

Hazel Appleyard@HazelAppleyard

Absolutely not????

caithmac@caithmac·9 Apr

@WhateverVishal There is.

Oxygen 💨@WhateverVishal·7 Apr

This shade of blue and green is nowhere in Vishwaguru

DreamyCityScapes✨@TheDreamyCity

London, England

caithmac@caithmac·8 Apr

@OwainEvans_UK Have applied, and gave your name as a mentor! Hope to work with you.

Owain Evans@OwainEvans_UK·8 Apr

This is the best way to collaborate with my group. Previous fellows have been hired for full-time roles.

🚀Henry is leading AI Safety Research Programs@sleight_henry

🚀 Applications are now open: Constellation's Astra Fellowship 🚀 Fully funded, 5-month fellowship at our Berkeley research institute. Pair with mentors across empirical AI safety research, strategy, and governance at @ConstellOrg! 📅 Apply by May 3rd (begins Sep 2026) 🔗 constellation.org/programs/astra…

caithmac@caithmac·5 Apr

@kllatr4x No.

Æ@kllatrzx·5 Apr

Take the L, you're the one cvcking yourself. You grind your ass raw for the republic & in return it just rawdogs your gaping asshole. My one piece of advice to GCs: loot this shithole as hard as you can, squeeze every last ₹ out of the oppression machine. IND can go fuck itself.

Awasthi@Awasthiii18

Meanwhile that deshbhakt GC who worked his ass off to build the republic.

caithmac@caithmac·2 Apr

SHAP analysis revealed chemically meaningful features driving predictions. The model learns to focus on SAR-relevant motifs over time. We identified key fragments for high affinity (e.g., halogens for TYK2).
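
To make the attribution idea concrete: SHAP values are built on Shapley values, which average each feature's marginal contribution over all orderings. Below is a minimal sketch that computes exact Shapley values for a toy scoring function. The three "fragment" features and their scores are invented for illustration and are not results from the paper; real SHAP tooling uses approximations rather than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every feature ordering.

    value_fn takes a frozenset of present feature indices and returns a score.
    Cost grows factorially, so this is only usable for a handful of features.
    """
    orders = list(permutations(range(n_features)))
    phi = [0.0] * n_features
    for order in orders:
        present = set()
        prev = value_fn(frozenset())
        for f in order:
            present.add(f)
            cur = value_fn(frozenset(present))
            phi[f] += cur - prev  # marginal contribution of f in this ordering
            prev = cur
    return [p / len(orders) for p in phi]


def toy_affinity(present):
    """Hypothetical score: 0 = halogen, 1 = aromatic ring, 2 = irrelevant group."""
    score = 0.0
    if 0 in present:
        score += 1.0
    if 1 in present:
        score += 0.5
    if 0 in present and 1 in present:
        score += 0.4  # interaction term, shared equally by features 0 and 1
    return score


print(shapley_values(toy_affinity, 3))  # roughly [1.2, 0.7, 0.0]
```

The attributions sum to the difference between the full score and the empty-set score, which is the property that makes them readable as a per-feature breakdown of one prediction.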

caithmac@caithmac·2 Apr

One interesting observation: the samples selected by our method are not just “uncertain”; they often correspond to meaningful interaction patterns. This gives more confidence in the active learning loop.

caithmac@caithmac·2 Apr

We evaluate the framework across multiple settings and compare against standard baselines. Key takeaway: We can maintain (or improve) performance while gaining explainability.

caithmac@caithmac·2 Apr

Big thanks to all collaborators and reviewers who helped improve the work. @gorantlarohan @ppxasjsm If you’re working on: • drug discovery • active learning • explainable ML we’d love to hear your thoughts!

caithmac@caithmac·2 Apr

This is not a final solution, but a step toward: • More transparent ML models • Better human–AI collaboration in science • Active learning systems that scientists can actually trust

caithmac@caithmac·2 Apr

Instead of just predicting affinity, the model provides insight into: 👉 Which parts of the ligand and protein matter 👉 Why a sample is selected during active learning This helps move from “prediction” → “understanding”.

caithmac@caithmac·2 Apr

Our goal was simple: 👉 Build an active learning framework that is not only effective 👉 But also explainable So that model decisions can be inspected, trusted, and potentially acted upon.

caithmac@caithmac·2 Apr

Most active learning methods are black boxes. They may pick good samples, but don’t tell us why those samples are useful. In drug discovery, that lack of interpretability is a real limitation.

caithmac@caithmac·2 Apr

Predicting binding affinity is central to drug discovery, but data is expensive. Active learning helps by selecting which experiments to run next, instead of blindly collecting more data. But there’s a problem…
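
As a rough illustration of "selecting which experiments to run next", here is a minimal sketch of generic uncertainty sampling: score each unlabeled candidate by how much an ensemble of models disagrees, then pick the top k. This is a common baseline, not the method from the paper; `select_most_uncertain` and the prediction values are hypothetical.

```python
from statistics import pvariance


def select_most_uncertain(ensemble_preds, k):
    """Pick the k pool indices where the ensemble disagrees most.

    ensemble_preds: one list of predictions per model, all the same length.
    """
    n_pool = len(ensemble_preds[0])
    # Variance across models for each candidate serves as the uncertainty score.
    scores = [
        pvariance([model[i] for model in ensemble_preds]) for i in range(n_pool)
    ]
    ranked = sorted(range(n_pool), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Made-up predicted affinities from three models for four candidate ligands.
preds = [
    [6.1, 7.0, 5.2, 8.0],
    [6.0, 7.9, 5.1, 6.5],
    [6.2, 6.1, 5.3, 7.2],
]
print(select_most_uncertain(preds, 2))  # → [1, 3]
```

In a full loop the selected candidates would be measured, added to the training set, and the models refit before the next round of selection.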

caithmac retweeted

bioRxiv Biophysics@biorxiv_biophys·20 Dec

Explainable Active Learning Framework for Ligand Binding Affinity Prediction biorxiv.org/content/10.648… #biorxiv_biophys

caithmac@caithmac·1 Apr

@TheOneKloud @SchmidhuberAI When I read BYOL, I was impressed and at first glance saw how similar it was to JEPA (or rather, how similar JEPA was to BYOL), so now I know the whole lore!

Pierre Richemond 🇪🇺@TheOneKloud·31 Mar

1/8:Huge fan of @SchmidhuberAI's foundational work. Since BYOL is being discussed, I wanted to share some thoughts as a co-author who helped shape its intellectual foundations on where it comes from and why it works — which I think involves a confluence of three distinct pillars. 🧵

Jürgen Schmidhuber@SchmidhuberAI

Dr. LeCun's heavily promoted Joint Embedding Predictive Architecture (JEPA, 2022) [5] is the heart of his new company. However, the core ideas are not original to LeCun. Instead, JEPA is essentially identical to our 1992 Predictability Maximization system (PMAX) [1][14]. Details in reference [19] which contains many additional references.

Motivation of PMAX [1][14]. Since details of inputs are often unpredictable from related inputs, two non-generative artificial neural networks interact as follows: one net tries to create a non-trivial, informative, latent representation of its own input that is predictable from the latent representation of the other net’s input.

PMAX [1][14] is actually a whole family of methods. Consider the simplest instance in Sec. 2.2 of [1]: an auto encoder net sees an input and represents it in its hidden units (its latent space). The other net sees a different but related input and learns to predict (from its own latent space) the auto encoder's latent representation, which in turn tries to become more predictable, without giving up too much information about its own input, to prevent what's now called “collapse." See illustration 5.2 in Sec. 5.5 of [14] on the "extraction of predictable concepts."

The 1992 PMAX paper [1] discusses not only auto encoders but also other techniques for encoding data. The experiments were conducted by my student Daniel Prelinger. The non-generative PMAX outperformed the generative IMAX [2] on a stereo vision task.

The 2020 BYOL [10] is also closely related to PMAX. In 2026, @misovalko, leader of the BYOL team, praised PMAX, and listed numerous similarities to much later work [19].

Note that the self-created “predictable classifications” in the title of [1] (and the so-called “outputs” of the entire system [1]) are typically INTERNAL "distributed representations” (like in the title of Sec. 4.2 of [1]). The 1992 PMAX paper [1] considers both symmetric and asymmetric nets. In the symmetric case, both nets are constrained to emit "equal (and therefore mutually predictable)" representations [1]. Sec. 4.2 on “finding predictable distributed representations” has an experiment with 2 weight-sharing auto encoders which learn to represent in their latent space what their inputs have in common (see the cover image of this post).

Of course, back then compute was a million times more expensive, but the fundamental insights of "JEPA" were present, and LeCun has simply repackaged old ideas without citing them [5,6,19].

This is hardly the first time LeCun (or others writing about him) have exaggerated LeCun's own significance by downplaying earlier work. He did NOT "co-invent deep learning" (as some know-nothing "AI influencers" have claimed) [11,13], and he did NOT invent convolutional neural nets (CNNs) [12,6,13], NOR was he even the first to combine CNNs with backpropagation [12,13]. While he got awards for the inventions of other researchers whom he did not cite [6], he did not invent ANY of the key algorithms that underpin modern AI [5,6,19].

LeCun's recent pitch:

1. LLMs such as ChatGPT are insufficient for AGI (which has been obvious to experts in AI & decision making, and is something he once derided @GaryMarcus for pointing out [17]).

2. Neural AIs need what I baptized a neural "world model" in 1990 [8][15] (earlier, less general neural nets of this kind, such as those by Paul Werbos (1987) and others [8], weren't called "world models," although the basic concept itself is ancient [8]).

3. The world model should learn to predict (in non-generative "JEPA" fashion [5]) higher-level predictable abstractions instead of raw pixels: that's the essence of our 1992 PMAX [1][14].

Astonishingly, PMAX or "JEPA" seems to be the unique selling proposition of LeCun's 2026 company on world model-based AI in the physical world, which is apparently based on what we published over 3 decades ago [1,5,6,7,8,13,14], and modeled after our 2014 company on world model-based AGI in the physical world [8].

In short, little if anything in JEPA is new [19]. But then the fact that LeCun would repackage old ideas and present them as his own clearly isn't new either [5,6,18,19].

FOOTNOTES

1. Note that PMAX is NOT the 1991 adversarial Predictability MINimization (PMIN) [3,4]. However, PMAX may use PMIN as a submodule to create informative latent representations [1](Sec. 2.4), and to prevent what's now called “collapse." See the illustration on page 9 of [1].

2. Note that the 1991 PMIN [3] also predicts parts of latent space from other parts. However, PMIN's goal is to REMOVE mutual predictability, to obtain maximally disentangled latent representations called factorial codes. PMIN by itself may use the auto encoder principle in addition to its latent space predictor [3].

3. Neither PMAX nor PMIN was my first non-generative method for predicting latent space, which was published in 1991 in the context of neural net distillation [9]. See also [5-8].

4. While the cognoscenti agree that LLMs are insufficient for AGI, JEPA is so, too. We should know: we have had it for over 3 decades under the name PMAX! Additional techniques are required to achieve AGI, e.g., meta learning, artificial curiosity and creativity, efficient planning with world models, and others [16].

REFERENCES (easy to find on the web):

[1] J. Schmidhuber (JS) & D. Prelinger (1993). Discovering predictable classifications. Neural Computation, 5(4):625-635. Based on TR CU-CS-626-92 (1992): people.idsia.ch/~juergen/predm…

[2] S. Becker, G. E. Hinton (1989). Spatial coherence as an internal teacher for a neural network. TR CRG-TR-89-7, Dept. of CS, U. Toronto.

[3] JS (1992). Learning factorial codes by predictability minimization. Neural Computation, 4(6):863-879. Based on TR CU-CS-565-91, 1991.

[4] JS, M. Eldracher, B. Foltin (1996). Semilinear predictability minimization produces well-known feature detectors. Neural Computation, 8(4):773-786.

[5] JS (2022-23). LeCun's 2022 paper on autonomous machine intelligence rehashes but does not cite essential work of 1990-2015.

[6] JS (2023-25). How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23.

[7] JS (2026). Simple but powerful ways of using world models and their latent space. Opening keynote for the World Modeling Workshop, 4-6 Feb, 2026, Mila - Quebec AI Institute.

[8] JS (2026). The Neural World Model Boom. Technical Note IDSIA-2-26.

[9] JS (1991). Neural sequence chunkers. TR FKI-148-91, TUM, April 1991. (See also Technical Note IDSIA-12-25: who invented knowledge distillation with artificial neural networks?)

[10] J. Grill et al (2020). Bootstrap your own latent: A "new" approach to self-supervised Learning. arXiv:2006.07733

[11] JS (2025). Who invented deep learning? Technical Note IDSIA-16-25.

[12] JS (2025). Who invented convolutional neural networks? Technical Note IDSIA-17-25.

[13] JS (2022-25). Annotated History of Modern AI and Deep Learning. Technical Report IDSIA-22-22, arXiv:2212.11279

[14] JS (1993). Network architectures, objective functions, and chain rule. Habilitation Thesis, TUM. See Sec. 5.5 on "Vorhersagbarkeitsmaximierung" (Predictability Maximization).

[15] JS (1990). Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical Report FKI-126-90, TUM.

[16] JS (1990-2026). AI Blog.

[17] @GaryMarcus. Open letter responding to @ylecun. A memo for future intellectual historians. Substack, June 2024.

[18] G. Marcus. The False Glorification of @ylecun. Don’t believe everything you read. Substack, Nov 2025.

[19] J. Schmidhuber. Who invented JEPA? Technical Note IDSIA-3-22, IDSIA, Switzerland, March 2026. people.idsia.ch/~juergen/who-i…
