Tanmoy Sanyal

250 posts

@hiddenvariable2

Protein design @Amgen. Previously, @novonordisk, @salilab_ucsf, @UCSBCHE, @IITKgp.

San Francisco Bay Area, CA · Joined June 2016

753 Following · 286 Followers

Tanmoy Sanyal retweeted

Diego del Alamo@DdelAlamo·24 Sep

Retraining PLMs with newly deposited sequences doesn't guarantee better performance. Authors trained PLMs on each copy of UniRef100 from 2011 to 2024; the largest performance boost (2021->2022) coincided with the removal of an unusually large number of invalid sequences from UniRef.

Biology+AI Daily@BiologyAIDaily

Protein Language Models: Is Scaling Necessary?
- This paper challenges the prevailing belief that scaling up protein language models (pLMs) is essential for better performance, proposing that careful data curation can achieve comparable results at a fraction of the cost.
- The authors introduce AMPLIFY, a protein language model that outperforms state-of-the-art models like ESM2 15B, while being 43 times smaller in terms of parameters and 17 times more efficient in training.
- AMPLIFY’s success is attributed to using high-quality, curated datasets rather than simply increasing model size. This allows for better generalization and less overfitting, particularly in tasks like sequence recovery and protein design.
- By focusing on natural sequence space and eliminating noise from datasets, AMPLIFY reduces computational costs and energy consumption, democratizing pLM development for smaller research labs.
- The paper emphasizes that data quality is more important than model size, with findings showing that models trained on well-curated datasets significantly outperform models trained on larger but noisier datasets.
- AMPLIFY exhibits emergent behaviors in tasks like distinguishing real proteins from non-proteins, even in zero-shot settings. It can also handle intrinsically disordered proteins better than structure-based models like AlphaFold2.
- The authors call for a shift away from scaling as the main driver of improvement in pLMs, advocating for better dataset curation and efficient architectures to build robust, high-performing models.
@apsarathchandar @bnschlz
💻Code: github.com/chandar-lab/AM…
📜Paper: biorxiv.org/content/10.110…

101

33.4K

Tanmoy Sanyal retweeted

Rohit Singh@rohitsingh8080·21 Sep

Let me tell you a story. It'll end up at the current tech-bio and protein design scene. But the story starts about 25 years earlier. Did you know that, commercially, the human genome project precipitated the end and not the start of a genomics boom? 1/

424

116.1K

Tanmoy Sanyal retweeted

Ben Kellman bkell1123.bsky.social@bkell1123·16 May

1/5 Today, I’m sharing 7yrs of work. I believe we discovered a comprehensive mapping from protein structure to glycan structure, a genetic encoding for specific glycans, and a new paradigm in extracellular biology. biorxiv.org/content/10.110… #Glycotime

149

596

131.8K

Tanmoy Sanyal retweeted

Olexandr Isayev 🇺🇦🇺🇸@olexandr·24 Apr

Editor's rant: how many more GNN or message-passing architectures do we *really* need to score/predict protein-ligand interactions? A new one appears every day!!! 🤯 #compchem

Pittsburgh, PA 🇺🇸

9.7K

Tanmoy Sanyal retweeted

andrew blevins@Andrewdblevins·4 Apr

Working on ML for Drug Discovery I have been frustrated with the size and/or quality of publicly available datasets to train and benchmark models with, so when my co-founder and I started our company we swore we would open-source some data as quickly as possible.

211

42K

Tanmoy Sanyal retweeted

Shruthi Viswanath@shruthiLab·1 Feb

arxiv.org/abs/2401.17894 We wrote a review on recent developments in integrative modeling and frontiers in the field for a Springer Mol Bio Handbook. Comments are appreciated! Deep work by co-first authors @s_arvindekar @KartikMajila

2.6K

Tanmoy Sanyal retweeted

Chris Bakke@ChrisJBakke·12 Dec

The sad reality is that most people don't have what it takes to work in tech: Up at 4am. Post a pic of my new Eight Sleep in the group chat for sweet, sweet engagement. Hit the gym. Crush 8 jumping jacks. 35 minute cold plunge. Rip a My First Million episode at 2x speed. Drink a bottle of Bryan Johnson olive oil. Eat 4 bags of Athletic Greens powder. Feel sick. Power through. Open laptop. Knock out 7-8 emoji reactions to threads on Slack. Grab lunch. (A 2nd bottle of olive oil) Open Jira. Comment "any updates?" on 3 tickets. Wind down for the day. Open Substack and resume writing Part 4 of "Problems with European Work Ethic: a San Francisco perspective" - a banger that got 8 likes on Substack and two thumbs up in the group chat.

237

397

7.8K

Tanmoy Sanyal retweeted

Krishnan@cvkrishnan·4 Dec

WAIT!! WHAT?!!

725

4.8K

526.9K

Tanmoy Sanyal retweeted

India Research Watch (IRW)@IRWatchdog·27 Oct

Indian Research Crisis - Not just a Smoking Gun but a Dumpster Fire
If you had any doubts about Indian Research being in a deep crisis, this graph should definitively lay them to rest. (1/N) 🧵 #IndiaResearchCrisis

104

289

78.2K

Tanmoy Sanyal retweeted

ISRO@isro·23 Aug

Chandrayaan-3 Mission: 'India🇮🇳, I reached my destination and you too!' : Chandrayaan-3
Chandrayaan-3 has successfully soft-landed on the moon 🌖! Congratulations, India🇮🇳! #Chandrayaan_3 #Ch3

68.4K

269.7K

818.8K

71M

Tanmoy Sanyal@hiddenvariable2·23 Aug

In October 2008, I was back from college for a weekend visit to my parents when the Chandrayaan-1 mission happened. 15 years later, life comes full circle: my parents are visiting me on their vacation when Chandrayaan-3 soft-lands an unmanned rover on the moon!

519

Tanmoy Sanyal@hiddenvariable2·1 Aug

Excited to have nearly the entire integrative structural modeling community in one place: tremendous effort by @shruthiLab! Also, I am teaching a brief workshop on writing Python-based custom restraints within the Integrative Modeling Platform (IMP); please attend if interested.

Shruthi Viswanath@shruthiLab

Join us for this exciting conference in computational structural biology at IISER Pune next month!

429

Tanmoy Sanyal retweeted

Bojan Tunguz@tunguz·18 Jul

I wonder what it feels like to be a normie and work in a field where you don’t have to upskill every couple of hours.

107

1.4K

252.4K

Tanmoy Sanyal retweeted

Shruthi Viswanath@shruthiLab·14 Jun

Register by this week for the conference on macromolecular assemblies @IISERPune and the workshop on integrative modeling with @salilab_ucsf: bit.ly/462MmlN

4.7K

Tanmoy Sanyal retweeted

Ben Shor@ben_shor·17 May

1/8 Excited to introduce CombFold - an algorithm for predicting large multi-protein complexes using the combinatorial and hierarchical assembly of #AlphaFold models. @DinaSchneidman biorxiv.org/content/10.110…

179

28.1K

Tanmoy Sanyal retweeted

Krishnaswamy Lab@KrishnaswamyLab·25 Apr

Some perspective for scientists who may be asked to review computational methods: First, computational methods are indeed innovated by taking existing mathematical and algorithmic atoms and putting them together in a novel way. Second, small changes in the steps can have a large effect on the outcome! Examples below:
- The classic ISOMAP method could be considered a combination of shortest paths on graphs + MDS. Neither of these was novel, but ISOMAP was one of the first breakthrough manifold learning algorithms.
- Spectral clustering is just using graph Laplacian eigenvectors and K-means; again, the combo is what makes this able to detect arbitrarily shaped clusters.
- The difference between SNE and t-SNE was just the "t" (Student's t-distribution), but look at how much one is used over the other :).
- The difference between a GAN and a conditional GAN is just a conditioning input signal.
Please do not comment on anyone's paper that, because the atoms they used are not novel or the change doesn't seem vast, the entire method is not novel. This is completely unfair and ignores how computational innovation proceeds...

135

28.4K

Tanmoy Sanyal retweeted

Dmitry Kobak@hippopedoid·13 Apr

Really excited to present new work by @ritagonmar: we visualized the entire PubMed library, 21 million biomedical and life science papers, and learned a lot about -- THE LANDSCAPE OF BIOMEDICAL RESEARCH biorxiv.org/content/10.110… Joint work with @CellTypist and @benmschmidt. 1/n

380

1.3K

796.8K

Tanmoy Sanyal@hiddenvariable2·7 Apr

@ivivek87 What about Denmark? :)

111

Vivek Das@ivivek87·7 Apr

Italy’s ChatGPT ban spreads to France, Germany and Ireland. Not surprising at all given how data protection and privacy regulation is viewed in the EU as compared to the US. This too shall pass; however, monopolization needs a check! 😉 stealthoptional.com/news/italys-ch…

945

Tanmoy Sanyal retweeted

Ben Blaiszik@BenBlaiszik·6 Apr

We wrapped up the first LLM hackathon for applications in materials and chemistry last week. The results to me were astounding. We are at the point now where some tasks that took years can now be completed in days. Here is a list of the fantastic submissions!

355

2.2K

996.5K

Tanmoy Sanyal retweeted

Dr. Holly Walters@Manigarm·27 Mar

I'm serious. STEM without the Arts, Social Sciences, and Humanities will produce more "innovative" tech bros who giddily reinvent rent, roommates, taxes, and now...roller skates. With complete, straight-faced, sincerity. This is a problem. And I have a list (So, thread 🧵)

The Rundown AI@TheRundownAI

These new AI shoes can make you walk 250% faster. Moonwalkers use AI to learn your step gait/speed and adapt to you. The shoe has two modes: lock, and shift, and will only work when you move.

183

7.2K

31.8K

4.4M
