Ziheng Lu
@luzihen
34 posts

Ex. Principal Researcher @ Microsoft Research AI4Science; materials science, deep learning.

Joined March 2021
99 Following · 106 Followers

Ziheng Lu @luzihen·
@BenBlaiszik Good point. Big tech companies are hard to convince, as this involves too much commercial interest. Something doable might be calling for a special issue of a Nature sister journal dedicated to data… just to raise the incentive for academics.
Ben Blaiszik @BenBlaiszik·
The incentive structure in academia for sharing data has bothered me for years, and there's a clear first step we could take to make it a bit better.

What's the problem? We reward articles and citations - this is good. But we have a ~much~ lower reward for shared data, which is now central to new discovery models. With the rise of data-first research, shown via many recent unicorn-valued science startups, it's clear that datasets are no longer a side product but are foundational.

It's my contention that the lack of dataset-sharing incentives in academia is an artificial roadblock that reduces efficiency and holds back breakthroughs. Generating and curating TBs of raw experimental data, annotating high-fidelity simulations, preparing cleaned ML datasets for reuse: these are major intellectual contributions that take significant time, create significant value, and deserve to be rewarded and fully valued.

There's an easy first step I think we could take in the right direction: major aggregators like Google Scholar should change their stance and simply include datasets as first-class research objects. Yes, this might inflate the number of contributions, but citations would sort out which are the major ones, and dataset citations would become a lot easier with first-class indexing.

What other small or big steps could we take? How could we convince Google and other aggregators to adopt this shift?
Ziheng Lu @luzihen·
@ruben_laplaza eSEN trained on OMat24 performs nicely on Hessians. Not sure about UMA… But I did notice the repo by FAIR is a mess: old checkpoints are nowhere to be found, training code is missing, and no one responds to issues…
Rubén Laplaza @ruben_laplaza·
I keep seeing MACE OMol Hessians succeed where UMA OMol Hessians fail (many weird negative frequencies at minima, etc.), both backpropagated in the same way. I wonder what's going on there; is it something to do with the MoE or something deeper?
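Backpropagating twice through a potential's predicted energy to get the Hessian, as discussed in this thread, can be sketched with a toy harmonic energy standing in for the MLIP. The `energy` function below is a made-up stand-in (not MACE or UMA); only the double-differentiation mechanics are the point.

```python
import torch

def energy(flat_pos):
    # Toy harmonic energy standing in for an MLIP's predicted energy:
    # unit-stiffness springs between consecutive atoms, rest length 1.0.
    pos = flat_pos.view(-1, 3)
    bonds = torch.linalg.norm(pos[1:] - pos[:-1], dim=1)
    return 0.5 * ((bonds - 1.0) ** 2).sum()

# A 3-atom chain at its minimum-energy geometry (flattened to 9 coords).
pos = torch.tensor([0.0, 0.0, 0.0,
                    1.0, 0.0, 0.0,
                    2.0, 0.0, 0.0], dtype=torch.float64)

# Second derivatives of E w.r.t. all Cartesian coordinates: the same
# double backpropagation one would apply to a neural-network potential.
hess = torch.autograd.functional.hessian(energy, pos)

# Mass-weighting and diagonalizing this matrix gives vibrational modes;
# negative eigenvalues at a minimum show up as imaginary frequencies.
eigvals = torch.linalg.eigvalsh(hess)
print(hess.shape, eigvals.min().item())
```

At a true minimum all eigenvalues are non-negative (translations give zero modes), which is exactly the sanity check that fails when a model produces "weird negative frequencies in minima".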
Ziheng Lu @luzihen·
@jrib_ @Kavanagh_Sean_ @bkoz37 The comparison between NequIP and Allegro is very interesting. Were the same network parameters used (to make them comparable)?
Ziheng Lu @luzihen·
@BenBlaiszik @NSF @ENERGY Good stuff! Are we sure the price on MatterSim is correct, though? I would not expect it to be 10x cheaper than other MLFFs on a T4…
Ben Blaiszik @BenBlaiszik·
🚀 We asked: What if the time to set up and run the best Machine-Learned Interatomic Potentials (MLIPs) took seconds, not days? Today, we release the MLIP Garden v0.1. What you can do now:
- Experiment ~instantly
- Scale deployments on experimental @NSF and @ENERGY systems
Ziheng Lu @luzihen·
@jrib_ Amazing project! Good visualization leads to good discovery! You are now officially my favorite developer! XD
Janosh @jrib_·
Two updates to share, one personal and one about open source which I'm very excited about! Personal: My days in NYC are over. 4 weeks ago, Radical AI and I parted ways. Since then, I've gone into full hermit mode working on MatterViz (github.com/janosh/matterv…).
Sam Blau @SamMBlau·
The Open Molecules 2025 dataset is out! With >100M gold-standard ωB97M-V/def2-TZVPD calcs of biomolecules, electrolytes, metal complexes, and small molecules, OMol is by far the largest, most diverse, and highest quality molecular DFT dataset for training MLIPs ever made 1/N
Ziheng Lu @luzihen·
@jrib_ Also, a quick suggestion would be to provide a DFT-vs-DFT plot at the top (or the bottom) as a reference. It might help readers understand the task.
Janosh @jrib_·
here's what the symmetry mismatch Sankey diagrams look like: …tbench-discovery.materialsproject.org/tasks/geo-opt#sankey-diagrams-for-ml-vs-dft-spacegroups

another thing i've thought about plotting with Sankey diagrams is DFT vs MLIP stability mismatch, taking into account both thermodynamic and kinetic stability, which gives 4 options:
1. on convex hull, no imaginary phonons
2. above convex hull, no imaginary phonons
3. on convex hull, with imaginary phonons
4. above convex hull, with imaginary phonons

would be cool to see how different models fare at this classification task
Rhys Goodall @RhysGoodall

@jwt0625 Sankey diagrams are implemented via plotly in pymatviz github.com/janosh/pymatviz but beyond symmetry changes we haven't really found anything good to plot in this format

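The four-way classification described above can be sketched as a small labeling function; counting (DFT class → MLIP class) pairs gives exactly the link weights a Sankey diagram would draw. All numbers, thresholds, and the example structures here are hypothetical.

```python
from collections import Counter

def stability_class(e_above_hull, min_phonon_freq):
    """Combine thermodynamic stability (on/above the convex hull) with
    kinetic stability (real vs imaginary phonons) into one of 4 classes."""
    hull = "on hull" if e_above_hull <= 0.0 else "above hull"
    phonons = ("no imaginary phonons" if min_phonon_freq >= 0.0
               else "imaginary phonons")
    return f"{hull}, {phonons}"

# Hypothetical (e_above_hull in eV/atom, lowest phonon frequency in THz)
# pairs for three structures, as predicted by DFT and by an MLIP.
pairs = [
    ((0.00,  1.2), (0.00,  1.1)),   # both agree: fully stable
    ((0.05,  0.8), (0.00,  0.9)),   # MLIP wrongly puts it on the hull
    ((0.00, -0.3), (0.00,  0.2)),   # MLIP misses the imaginary mode
]

# Each (DFT class, MLIP class) count is one link in a Sankey diagram.
flows = Counter((stability_class(*d), stability_class(*m)) for d, m in pairs)
for (dft_cls, mlip_cls), n in flows.items():
    print(f"{n}x  DFT '{dft_cls}' -> MLIP '{mlip_cls}'")
```

Feeding `flows` into a plotting library's Sankey trace (e.g. plotly, which pymatviz already uses for the space-group version) would reproduce the proposed figure.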
Ziheng Lu @luzihen·
@jrib_ Cool plot! I guess the result will depend on the threshold used when determining symmetry?
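The threshold-dependence is easy to demonstrate. Below is a toy 1D version of what symmetry detectors like spglib do with their `symprec` tolerance: a structure nudged 0.001 off an ideal mirror-symmetric geometry is classified as asymmetric or symmetric depending purely on the chosen threshold. The function and coordinates are illustrative, not from any real symmetry code.

```python
import numpy as np

def has_mirror_symmetry(frac_coords, tol):
    """Toy symmetry test: does x -> -x (mod 1) map the structure onto
    itself within `tol`? Real detectors like spglib apply the same idea
    in 3D via their `symprec` parameter."""
    mirrored = (-frac_coords) % 1.0
    for m in mirrored:
        # Distance to the nearest original site, with periodic wrap-around.
        d = np.abs(frac_coords - m)
        d = np.minimum(d, 1.0 - d)
        if d.min() > tol:
            return False
    return True

# Fractional coordinates of a 1D chain, one site nudged off the ideal
# mirror-symmetric position by 0.001 (e.g. residual forces after relaxation).
coords = np.array([0.0, 0.249, 0.5, 0.75])

print(has_mirror_symmetry(coords, tol=1e-5))  # tight threshold: asymmetric
print(has_mirror_symmetry(coords, tol=1e-2))  # loose threshold: symmetric
```

The same geometry thus lands in different space groups under different tolerances, which is exactly why symmetry-mismatch statistics depend on the threshold.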
Ziheng Lu retweeted
Microsoft Research @MSFTResearch·
Thermal conductivity is critical in modern electronics, but in a post-Moore’s Law world, the need for novel structures that surpass the heat transfer properties of silicon is essential. Learn how AI is helping scientists discover these next-gen materials. msft.it/6013SnPAN
Adeesh Kolluru @AdeeshKolluru·
🚨 Thrilled to share Efficient Geometric Interatomic Potential (EGIP) - an efficient and accurate ML model for materials! Even though non-conservative MLIPs are not physically grounded, they offer significant efficiency. This work demonstrates that they are quite accurate on K_SRME as well. Blog - radical-ai.com/news/EGIP
Ziheng Lu @luzihen·
New work in collab w/ great @nanophononics - Saying “hey, stop searching for that material, it does not exist!” is often harder than finding the material itself. With AI and some careful constraints, we try to probe the upper limit of heat transfer in matter
Davide Donadio @nanophononics

Using @MSFTResearch MatterSim model, we have explored the upper limits of bulk materials' thermal conductivity. While we found several highly conductive materials, none has a thermal conductivity as high as diamond. @ZNanotheory @luzihen @HongxiaHao arxiv.org/abs/2503.11568

Janosh @jrib_·
@funroll_loops not atm, but potentially in future iterations of the dataset. there's definitely enough to do for years into the future. maybe we can aim for a new MatPES release every year from 2024 and successively implement more data variety
Janosh @jrib_·
that's exactly what the MatPES colab with @professorong is about! ofc i'm biased but I think that workflow makes good accuracy/cost tradeoffs, and using DFT only for statics means it's very efficient. could be used to generate an even higher quality dataset github.com/materialsproje…
Ziheng Lu retweeted
Frank Noe @FrankNoeBerlin·
The BioEmu-1 model and inference code are now public under MIT license!!! Please go ahead, play with it and let us know if there are issues. github.com/microsoft/bioe…
Frank Noe @FrankNoeBerlin

Super excited to preprint our work on developing a Biomolecular Emulator (BioEmu): Scalable emulation of protein equilibrium ensembles with generative deep learning from @MSFTResearch AI for Science. #ML #AI #NeuralNetworks #Biology #AI4Science biorxiv.org/content/10.110…

Xiang Fu @xiangfu_ml·
For existing MLIPs, lower test errors do not always translate to better performance in downstream tasks. We bridge this gap by proposing eSEN -- SOTA performance on compliant Matbench-Discovery (F1 0.831, κSRME 0.321) and phonon prediction. arxiv.org/abs/2502.12147 1/6