

Hong Ge

@Hong_Ge2
Senior Research Fellow at University of Cambridge




why did R1's RL suddenly start working, when previous attempts to do similar things failed? theory: we've basically spent the last few years running a massive acausally distributed chain of thought data annotation program on the pretraining dataset.

deepseek's approach with R1 is a pretty obvious method. they are far from the first lab to try "slap a verifier on it and roll out CoTs." but it didn't use to work that well. all of a sudden, though, it did start working. and reproductions of R1, even using slightly different methods, are just working too--it's not some super-finicky method that deepseek lucked out finding. all of a sudden, the basic, obvious techniques are... just working, much better than they used to.

in the last couple of years, chains of thought have been posted all over the internet (LLM outputs leaking into pretraining like this is usually called "pretraining contamination"). and not just CoTs--outputs posted on the internet are usually accompanied by linguistic markers of whether they're correct or not ("holy shit it's right", "LOL wrong"). this isn't just true for easily verifiable problems like math, but also fuzzy ones like writing.

those CoTs in the V3 training set gave GRPO enough of a starting point to start converging, and furthermore, to generalize from verifiable domains to the non-verifiable ones using the bridge established by the pretraining data contamination.

and now, R1's visible chains of thought are going to lead to *another* massive enrichment of human-labeled reasoning on the internet, but on a far larger scale... the next round of base models post-R1 will be *even better* bases for reasoning models.
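The "slap a verifier on it and roll out CoTs" recipe above can be sketched in a few lines. This is a toy illustration, not DeepSeek's actual pipeline: the `ANSWER:` format and the `verify` function are made-up stand-ins for a real verifier, and it shows only the GRPO-style group-relative advantage (each rollout's reward minus the group mean), not the full policy-gradient update.

```python
import re

def verify(completion: str, gold_answer: str) -> float:
    """Binary verifiable reward: 1.0 if the completion's final answer
    (in a hypothetical 'ANSWER: x' format) matches the gold answer."""
    match = re.search(r"ANSWER:\s*(\S+)", completion)
    return 1.0 if match and match.group(1) == gold_answer else 0.0

def group_relative_advantages(completions, gold_answer):
    """GRPO-style advantages: score a group of rollouts for the same
    prompt, then center each reward on the group's mean reward."""
    rewards = [verify(c, gold_answer) for c in completions]
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

# Four sampled CoT rollouts for one prompt whose gold answer is "42".
rollouts = [
    "let's see, 6*7... ANSWER: 42",
    "hmm, 6*7 is 41? ANSWER: 41",
    "six sevens make 42. ANSWER: 42",
    "I give up, no answer",
]
print(group_relative_advantages(rollouts, "42"))  # [0.5, -0.5, 0.5, -0.5]
```

The point of the centering is that only the *relative* quality within a group drives the update, which is why a base model already able to emit occasionally-correct CoTs (per the contamination theory) gives the algorithm a usable gradient from the start.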

✨Applications are now open for PhDs at the Cambridge Machine Learning Group!✨ We're looking for outstanding candidates interested in fundamental ML research and applications to scientific domains! More info: mlg.eng.cam.ac.uk/phd_programme_… 🧵Find more about PIs & focus areas below!




It's paper day! In a new paper, led by my colleague @hanyuzhang17 at @UWaterlooAstro, we work on improving the priors for EFTofLSS analysis by taking advantage of information coming from HOD galaxy mocks. Here are the main highlights in the 🧵!





Here's @torfjelde with the obligatory first Bayes slide with the proportional posterior. This talk is about @TuringLang, which is how I got into Julia seriously!


Nope. Like Gates himself said, we might see two more cycles of improvement but scaling what we have got will not get us to AGI. Don’t believe the hype.



globaltimes.cn/page/202406/13… If China achieves fusion power before we do, it's game over.

Matt Hoffman (Google) presenting on “Running Many-Chain MCMC on Cheap GPUs”. #AISTATS2024


