Dan Ofer (Was @ICML,@Worldcon )

4.7K posts

@danofer

#Data scientist, #Researcher, Bioinformatician, Photographer, Geek & Bookworm. PhD #AI #LLM @HebrewU @HyadataLab @liniallab @shebaARC

Israel · Joined May 2008
1.1K Following · 1.1K Followers
Pinned Tweet
Dan Ofer (Was @ICML,@Worldcon )
1/ Our paper “Protein Language Models Expose Viral Immune Mimicry” is now published in Viruses! We show that protein language models can identify viral proteins, and those that fool our immune system.
2 replies · 1 repost · 15 likes · 1.5K views
Kevin K. Yang 楊凱筌@KevinKaichuang·
Screen 1M random protein sequences to discover that biology-like folds are accessible from random sequences with surprising frequency @KlaraH_lab
3 replies · 43 reposts · 280 likes · 22.7K views
Dan Ofer (Was @ICML,@Worldcon ) retweeted
University of California@UofCalifornia·
Pancreatic cancer is one of the most dire diagnoses in medicine with few available treatments. Until now, thanks to university research, including @UCSF scientists, and federal investment in science research. Read about this huge breakthrough via @nytimes nyti.ms/4wfziXs
2 replies · 32 reposts · 119 likes · 40.9K views
Jonathan Blow@Jonathan_Blow·
Something we've been working on...
271 replies · 728 reposts · 9.8K likes · 1.2M views
Will Bui@will_ea·
@danofer @LLMenjoyer It is different. The kernels in our package are for Block AttnRes. For multiple queries per block, it is more efficient. The kernels in FLA are more suited for Full AttnRes.
1 reply · 0 reposts · 1 like · 33 views
Dan Ofer (Was @ICML,@Worldcon )
@miangoar Weird in that for some benchmarks, model size / precompute / being pretrained at all (vs. random init) doesn't yield better performance. (Lots of XORs.)
0 replies · 0 reposts · 0 likes · 11 views
GAMA Miguel Angel 🐦‍⬛🔑
@danofer That’s interesting. Weird in what sense? For example, something like showing a double descent behavior when training a PLM? Or weird in the sense that a smaller PLM shows better performance than its larger version?
1 reply · 0 reposts · 0 likes · 12 views
Frank Hutter@FrankRHutter·
The data science revolution is here now. TabPFN-3 is live, taking tabular foundation models to enterprise scale 🤩 1M training rows on a single H100. No training. No tuning. Load and predict. 🧵 1/5 #tabpfn #tabularfoundationmodels #priorlabs
9 replies · 46 reposts · 280 likes · 25.7K views
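A minimal sketch of the "load and predict" workflow the tweet describes, using the sklearn-style TabPFNClassifier interface from the open-source tabpfn package; whether TabPFN-3 exposes exactly this API, and its 1M-row / single-H100 scale, is an assumption here, not something this toy example exercises.

```python
# Sketch of "no training, no tuning, load and predict" on a small tabular task,
# assuming the sklearn-style interface of the open-source `tabpfn` package.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()      # pretrained tabular foundation model; no task-specific training
clf.fit(X_train, y_train)     # "fit" stores the rows as in-context examples
pred = clf.predict(X_test)    # prediction is a forward pass of the pretrained transformer
print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
```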
Dan Ofer (Was @ICML,@Worldcon )
@mayukh_panja It depends what you mean by balance. There's a gaping difference between "I want to do a job simultaneously", "I can't meet up this week(end), we're doing a submission", and "I left the lab before 11 PM twice this month, and I stress-cried because everyone was still working".
0 replies · 0 reposts · 0 likes · 852 views
Mayukh@mayukh_panja·
I don’t agree. A PhD student should not prioritize work-life balance. Getting to do a PhD is a privilege. You are paid to think. There is no pressure for you to be economically useful. It is a unique opportunity to push the boundaries of human knowledge and produce something groundbreaking. And nothing great ever happens without complete devotion. Look at everything that moved and shaped the world. Every single person who created anything meaningful, in science, in arts, in music, in movies, devoted their lives to their craft. Extraordinary outcomes require extraordinary inputs and some degree of sacrifice. Sure, have work-life balance during your PhD. But be content with a mediocre outcome.
Dr. Manabendra Saharia@m_saharia

Yesterday, I was giving an intro talk to our dept's new PhD students. Technical things aside, my number 1 suggestion has remained the same over the years: Treat your PhD like a job.
- Avoid 1.5h lunches and three tea breaks.
- Avoid gossiping and loitering at work.
- Lab at 9 am and leave at 6 pm.
Being productive till 11 pm in the lab is a lie people tell themselves when their day starts at 1 PM. Everything worth doing can be done with high-intensity focus during work hours. And having fun in life is the secret to being productive in a marathon.

359 replies · 277 reposts · 3K likes · 1.1M views
(((ل()(ل() 'yoav))))👾
generally, experiments with LLMs suck. even (esp?) from big players like anthropic. one failure mode is an experiment in which you change the input in some controlled way, and see a change in output. say, in 10% of the cases where you changed male to female, the response got ruder. you conclude that the model is rude to females. but... if you just did some other change (say, change active to passive voice), you also see that in 10% of cases the model got ruder. in other words, we failed to control the experiment. this is experimentation 101, but new results keep falling for this. i guess CS people just kinda suck at experimentation. anyways, extremely common. this work documents the issues, and offers guidelines on how to improve. (i'm on this paper, but did very little. i am strongly supportive of the message though)
Zihao (Gavin) Yang@ZihaoGavinYang

1/ (New paper!) If swapping the gender in an input prompt makes the AI model give a different answer it means that it has to have a gender bias, right? Wrong. 🧵on counterfactual prompting for LLM evals: Paper: arxiv.org/abs/2605.01048

8 replies · 25 reposts · 289 likes · 39K views
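A toy sketch of the control this thread argues for: measure how often the targeted edit (male → female) flips the outcome, and compare that against the flip rate of an unrelated control edit. The query_model and is_rude helpers below are hypothetical stand-ins, not any real API or the paper's actual protocol.

```python
# Toy sketch of a controlled counterfactual-prompting evaluation (hypothetical helpers):
# a bias claim needs the targeted edit's flip rate to clearly exceed that of an
# unrelated control edit, not just to be non-zero.
import random

def query_model(prompt: str) -> str:
    # Placeholder for an actual LLM call; deterministic per prompt within a run.
    rng = random.Random(hash(prompt))
    return rng.choice(["polite reply", "rude reply"])

def is_rude(response: str) -> bool:
    return "rude" in response

def flip_rate(prompts, edit) -> float:
    # Fraction of prompts whose rudeness label changes after applying `edit`.
    flips = sum(is_rude(query_model(edit(p))) != is_rude(query_model(p)) for p in prompts)
    return flips / len(prompts)

prompts = [f"Prompt {i}: a man is asking the assistant for help." for i in range(200)]
target_rate = flip_rate(prompts, lambda p: p.replace("a man is", "a woman is"))
control_rate = flip_rate(prompts, lambda p: p.replace("is asking", "asks"))

print(f"gender-swap flip rate: {target_rate:.2%}")
print(f"control-edit flip rate: {control_rate:.2%}")
# If the two rates are similar, the experiment has not isolated a gender effect.
```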
(((ل()(ل() 'yoav))))👾
"I've been doing AI for 20 years and ..." and nothing. LLMs are new. LLM-Agents are new. our 20+ years experience with AI/ML/NLP may be marginally useful for understanding aspects of their training, but thats about it. we need new tools and experiences. we dont deserve authority.
29 replies · 31 reposts · 401 likes · 23.1K views
GAMA Miguel Angel 🐦‍⬛🔑
“entire PDB archive is conservatively estimated at ∼US$20B, assuming an average cost of ∼US$100K for regenerating each experimental structure” academic.oup.com/nar/article/51… For UniProt, the annual economic value is estimated between €332M and €524M ebi.ac.uk/about/news/ann…
Shae McLaughlin@shae_mcl

It’s estimated that the Protein Data Bank (PDB) cost around $13B to create. Alphafold was only possible because of it. If we want ML to solve biology, we should be funding the creation of databases and the development of new assay technologies. ML is nothing without data.

2 replies · 1 repost · 26 likes · 3.2K views
Kevin K. Yang 楊凱筌@KevinKaichuang·
We did 370 experiments to discover that protein language models primarily learn structure and won't scale for protein function prediction. We need new pretraining tasks! Work led by @francescazfl with @avapamini @yisongyue @alexijielu See Alex's thread + the paper for more!
Alex Lu@alexijielu

Announcing our preprint understanding transfer learning for protein language models (PLMs), led by former MSRNE intern @francescazfl, with @KevinKaichuang @avapamini @yisongyue Key takeaway: PLMs do not scale for anything except structure! 🧵👇 biorxiv.org/content/10.110…

14 replies · 133 reposts · 718 likes · 150K views
XxCapitanEsxX@B_Rockin_98·
Yeah dude same 😭
514 replies · 5.2K reposts · 162.8K likes · 6.4M views