Pablo de Castro

246 posts

@pablodecm

Building https://t.co/5jQL7rMYRV (YC W25). Prev - Product and Tech @ Reforestum. Data and ML @ Treelogic. Physics PhD Statistical Learning and Inference @ LHC at CERN.

Spain, Europe · Joined April 2014
2.2K Following · 475 Followers
Pinned Tweet
Pablo de Castro @pablodecm
A bit over a year ago, we launched Awen. We’ve been improving it every single week since. It’s now much simpler, much stronger, and used by thousands of people every day. You can try it here: awen.ai Would love to hear what you think.
Y Combinator @ycombinator

awen (@awen_ai) translates intent into exceptional visuals through conversation. Congrats to the team on the launch! ycombinator.com/launches/Ppr-a…

English
0
0
2
236
Pablo de Castro @pablodecm
@ycombinator @awen_ai A bit over a year ago, we launched Awen. We’ve been improving it every single week since. It’s now much simpler, much stronger, and used by thousands of people every day. You can try it here: awen.ai Would love to hear what you think.
English
0
0
1
140
Pablo de Castro @pablodecm
Really powerful video to build (or refresh) some intuitions and scaling notions around Large Language Models (LLMs) like GPT. Slightly technical, yet accessible to people with expertise in adjacent disciplines (e.g. physics, statistics, engineering).
Sasha Rush @srush_nlp

LLMs in Five Formulas: A somewhat idiosyncratic tutorial on LLMs. The goal is to highlight five independent areas in LLMs that we kinda understand, while being humble about how hard the rest is. youtube.com/watch?v=KCXDr-…

English
0
0
0
380
Pablo de Castro @pablodecm
It is so easy to fall into a pure consumer/reader mode in this network if you are curiosity-driven, there is so much interesting stuff all the time... So, after a four-year hiatus, I am back to creating and sharing stuff here often! 🚀 Anyone in a similar situation? Any tips?
English
1
0
2
206
Lukas Heinrich @lukasheinrich_
1/3 It's official! I'll move from @CERN back to Germany to start as a Professor for "Data Science in Physics" at the @TU_Muenchen as part of the ORIGINS Excellence Cluster, where I'll be focusing on the intersection of Science, Computing & ML.
English
34
19
325
0
Pablo de Castro retweeted
Dan Casas @dancasas
📢 Open PhD positions👩‍🎓👨‍🎓 Fully-funded Ph.D. positions in the areas of Machine Learning, Computer Vision, and Physics-Based Simulation for 3D human modeling, human interaction, and understanding of crowds. All details: dancasas.github.io/jobs Likes and RTs highly appreciated!
English
8
178
401
0
Giles Strong @Giles_C_Strong
Ecstatic to say that I passed my PhD defence!
English
16
1
79
0
Pietro Vischia @pietrovischia
Very happy because I just won a "Chargé de Recherche" position from the Belgian FNRS!!! Will use (keep using, in fact) advanced statistical techniques to probe the Top-Higgs Yukawa couplings! frs-fnrs.be/fr/resultats-d…
English
12
0
46
0
Pablo de Castro retweeted
QuantStack @QuantStack
An open-source server for hosting conda packages. A fast replacement to the conda command line utility. Check out our new blog post on "Open Software Packaging for Science". medium.com/@QuantStack/op…
English
3
87
211
0
Pablo de Castro @pablodecm
@pietrovischia @Giles_C_Strong After a few private communications back and forth with the authors they updated the preprint and removed the claims about different applicability in the related work section. They still do not attribute ideas in intro or methodology though, hoping journal editors notice...
English
0
0
2
0
Pablo de Castro @pablodecm
So #AcademicTwitter, a preprint takes on the same problem with the same conceptual solution as one of my PhD publications, with only a minor change. It cites my work in the related work section, incorrectly claiming different applicability, and does not attribute the ideas in the intro or methodology. Any advice?
English
2
2
3
0
Gilles Louppe @glouppe
Assume training a model takes weeks, but testing is fast. Estimating the variance of the performance of this model is not realistically feasible using cross-validation. How would you proceed to evaluate the uncertainty around the performance estimate?
English
4
4
11
0
Pablo de Castro @pablodecm
@glouppe For a fixed model, I guess a bootstrap estimate will approximate the variance over test sets sampled from the same distribution. I see no showstoppers, but maybe I am missing something (as you said, it would likely be more widely used otherwise, given how cheap bootstrapping is).
English
1
0
3
0
Gilles Louppe @glouppe
@pablodecm Yes, the variance due to the training data would not be captured. But at least I'd like to know whether performance would change much had the test set been different? (for that fixed model)
English
1
0
1
0
Pablo de Castro @pablodecm
@glouppe Thinking out loud here, prompted by your tweet, but wouldn't that miss the part of the total variance due to the different model trainings that cross-validation could capture? I guess it is fine for a generalization error variance estimate of a given model, but not of a methodology.
English
1
0
5
0
Gilles Louppe @glouppe
I am thinking about bootstrapping the test set in order to build an empirical distribution of performance estimates. Would that make sense? I have not actually seen this anywhere, and probably there is a good reason for that?
English
6
1
6
0
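The test-set bootstrap discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from anyone in the conversation: the helper names (`bootstrap_metric`, `accuracy`, `demo`), the choice of accuracy as the metric, and the synthetic data are all hypothetical. As noted in the replies, it only captures variability from the test sample for one fixed model, not the variability from retraining.

```python
import numpy as np


def bootstrap_metric(y_true, y_pred, metric, n_resamples=1000, seed=0):
    """Recompute `metric` on test sets resampled with replacement.

    The model is never retrained: only the stored (label, prediction)
    pairs are resampled, so each resample is cheap.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    scores = np.empty(n_resamples)
    for b in range(n_resamples):
        # Draw n indices with replacement to form one bootstrap test set.
        idx = rng.integers(0, n, size=n)
        scores[b] = metric(y_true[idx], y_pred[idx])
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return float(np.mean(y_true == y_pred))


def demo():
    """Synthetic stand-in for a fixed model that is right about 80% of the time."""
    rng = np.random.default_rng(42)
    y_true = rng.integers(0, 2, size=2000)
    flip = rng.random(2000) < 0.2  # hypothetical 20% error rate
    y_pred = np.where(flip, 1 - y_true, y_true)

    scores = bootstrap_metric(y_true, y_pred, accuracy)
    low, high = np.percentile(scores, [2.5, 97.5])  # percentile interval
    return accuracy(y_true, y_pred), scores.std(ddof=1), (low, high)
```

Calling `demo()` returns the point estimate, the bootstrap standard deviation, and a 95% percentile interval. In practice `y_true` and `y_pred` would be the real test labels and the fixed model's predictions, and `metric` could be any function of the two arrays.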
Pietro Vischia @pietrovischia
@Giles_C_Strong @pablodecm It's complicated (e.g. the whole POG often claims authorship based on "yes but you build on our expertise"). Overall, the system is rigged towards merging individuals into "the collaboration"; it may be necessary but kills motivation and alienates people from their own analyses.
English
2
0
3
0
Pablo de Castro @pablodecm
Been reading lots of stats and ML methods papers in high-energy physics for a review lately. Maybe it is the clarity that comes from almost a year doing something else, but it seems the barrier to much better data analyses is mainly a (software) tooling and integration problem. Any takes?
English
4
2
13
0
Pablo de Castro retweeted
Gabriel Peyré @gabrielpeyre
I have updated my course notes on automatic differentiation (last section of the PDF). Now includes dual numbers, adjoint state method, argmin layers, envelope theorem, reversible architectures. Thx to @PierreAblin for the constructive criticisms :) mathematical-tours.github.io/book-sources/o…
English
6
184
818
0
Pablo de Castro @pablodecm
@phi_nate @Giles_C_Strong Yeah, that sucks... I have come to believe that it is a combination of code shame and intellectual protectionism bias when done in the context of scientific research. What is your take?
English
1
0
1
0
Giles Strong @Giles_C_Strong
Definitely this. I lost a good deal of time last year and this year disproving a flawed result. It would have taken minutes to show, had the authors released their code.
English
2
1
8
0