Alex Fischer

316 posts

@s3alfisc

data science @trivago. Working on #pyfixest and #rstats and #python packages for clustered errors

Düsseldorf, Germany · Joined February 2022
484 Following · 368 Followers
Pinned Tweet
Alex Fischer@s3alfisc·
PyFixest is looking for contributors 🚀 If you'd like to help build PyFixest, please reach out (DMs are open)! Please don't worry if you've never contributed to an OSS project - I'd be super happy to onboard you to the codebase/git/OSS development, or really anything else =)
Alex Fischer@s3alfisc·
With the new version of pyfixest, you can access it by simply running `pf.feols(fml, data, demeaner_backend="rust-cg")`.
Alex Fischer@s3alfisc·
If you work on very large and very sparse fixed effects problems (worker-firm panels, doctor-patient relations, trade networks), we'd love to learn if it works for you / speeds up your regressions!
Alex Fischer@s3alfisc·
I am very excited that PyFixest 0.50.0 is on PyPi, including a new graph-based solver for demeaning that makes fixed effects estimation in PyFixest significantly faster for "sparse" fixed effects structures.
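For readers unfamiliar with what "demeaning" means here: fixed effects estimators within-transform the data by subtracting group means for each fixed effect. A minimal numpy sketch of the classic alternating-projections baseline (the textbook algorithm such backends implement or improve on, not PyFixest's new graph-based solver):

```python
import numpy as np

def demean_alternating(x, fe1, fe2, tol=1e-10, max_iter=1000):
    """Within-transform x by two fixed effects via alternating projections.

    Textbook baseline only: repeatedly subtract group means for each
    fixed effect until the values stop changing.
    """
    x = x.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for fe in (fe1, fe2):
            # subtract the mean of x within each level of this fixed effect
            counts = np.bincount(fe)
            means = np.bincount(fe, weights=x) / np.maximum(counts, 1)
            x -= means[fe]
        if np.max(np.abs(x - x_old)) < tol:
            break
    return x

# toy worker-firm panel: outcome with additive worker and firm effects
rng = np.random.default_rng(0)
n = 1000
worker = rng.integers(0, 50, n)
firm = rng.integers(0, 20, n)
y = 0.1 * worker + 0.2 * firm + rng.normal(size=n)
y_dm = demean_alternating(y, worker, firm)
# after convergence, y_dm sums to ~0 within every worker and every firm
```

For sparse structures like worker-firm panels, this iteration can converge slowly, which is exactly the regime a specialized solver targets.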
Alex Fischer@s3alfisc·
Small tutorial here: py-econometrics.github.io/maketables/doc…
Alex Fischer@s3alfisc·
`maketables` now has plug-in tooling - this means you can support maketables with your library without changing the maketables code base. This will hopefully be pretty useful for package developers, but also researchers developing their own estimators / in their own codebase.
apoorva.lal@Apoorva__Lal·
kind of fascinating how small these models are. paper doesn't mention number of parameters, for some reason. you could bundle them with a package on pypi like smollm.
apoorva.lal tweet media
BURKOV@burkov

This paper really is groundbreaking. It solves a long-standing embarrassment in machine learning: despite all the hype around deep learning, traditional tree-based methods (XGBoost, CatBoost, random forests, etc.) have dominated tabular data, the most common data format in real-world applications, for two decades. Deep learning conquered images, text, and games, but spreadsheets remained stubbornly resistant.

This paper's main contribution (published in Nature, by the way) is a foundation model that finally beats tree-based methods convincingly on small-to-medium datasets, and does so very fast. TabPFN in 2.8 seconds outperforms CatBoost tuned for 4 hours, a 5,000× speedup. That's not incremental; it's a different regime entirely.

The training approach is also fundamentally different. GPT trains on internet text; CLIP trains on image-caption pairs. TabPFN trains on entirely synthetic data: over 100 million artificial datasets generated from causal graphs. TabPFN generates training data by randomly constructing directed acyclic graphs where each edge applies a random transformation (using neural networks, decision trees, discretization, or noise), then pushes random noise through the root nodes and lets it propagate through the graph. The intermediate values at various nodes become features, one becomes the target, and post-processing adds realistic messiness like missing values and outliers. By training on millions of these synthetic datasets with very different structures, the model learns general prediction strategies without ever seeing real data.

The inference mechanism is also unusual. Rather than finetuning or prompting, TabPFN performs both "training" and prediction in a single forward pass. You feed it your labeled training data and unlabeled test points together, and it outputs predictions immediately. There's no gradient descent at inference time; the model has learned how to learn from examples during pretraining.

The architecture respects tabular structure with two-way attention (across features within a row, then across samples within a column), unlike standard transformers that treat everything as a flat sequence. So, the transformer has basically learned to do supervised learning.

Talk to the paper on ChapterPal: chapterpal.com/s/a1899430/acc…
Download the PDF: nature.com/articles/s4158…
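The DAG-based synthetic data generation described above can be sketched in toy form (a loose illustration only, not the paper's actual data-generating prior; all choices below, such as the transformation set and node counts, are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(42)
n_nodes, n_samples = 8, 200

# random upper-triangular adjacency matrix => a DAG (edges i -> j, i < j)
adj = np.triu(rng.random((n_nodes, n_nodes)) < 0.4, k=1)

# a small pool of random edge transformations
transforms = [np.tanh, np.sin, lambda v: np.maximum(v, 0), lambda v: v**2]

values = np.zeros((n_nodes, n_samples))
for j in range(n_nodes):
    parents = np.flatnonzero(adj[:, j])
    noise = rng.normal(size=n_samples)
    if parents.size == 0:
        values[j] = noise                  # root node: pure noise
    else:
        # each incoming edge applies a randomly chosen transformation
        contrib = sum(transforms[rng.integers(len(transforms))](values[p])
                      for p in parents)
        values[j] = contrib + 0.1 * noise  # plus a little extra noise

# intermediate node values become features, one node becomes the target
feature_idx = [1, 3, 5]
X = values[feature_idx].T                               # (n_samples, n_features)
y = (values[-1] > np.median(values[-1])).astype(int)    # discretized target
```

Sampling millions of such datasets with different graphs and transformations is what lets the model meta-learn a general prediction strategy.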

apoorva.lal@Apoorva__Lal·
Is this enshittification?
apoorva.lal tweet media
apoorva.lal@Apoorva__Lal·
agents know ball
apoorva.lal tweet media
Alex Fischer@s3alfisc·
@Apoorva__Lal @johnjhorton @Uber Admittedly the Rust rewrite was at least 50% motivated by curiosity for the new shiny thing, but there was also a technical reason (numba is a "heavy" dependency that locks to last-gen numpy, and with Rust there's no JIT compilation step). But ofc "written in Rust" just sounds so cool ;-)
apoorva.lal@Apoorva__Lal·
@johnjhorton @Uber I'm susceptible to runtime golf as well (e.g. github.com/apoorvalal/pye…); @s3alfisc implemented the iterative demeaning parts of pyfixest in Rust. The "rewrite it in Rust" meme for venerable old C/C++ stuff is especially funny; it's basically a movement of computer puritans
Alex Fischer retweeted
apoorva.lal@Apoorva__Lal·
The entire paper (down to the manuscript, written entirely in typst) is replicable here and I have a PR out for pyfixest that makes the joint test easy to do github.com/apoorvalal/Tes…
apoorva.lal@Apoorva__Lal·
why is scipy.sparse a subpackage in numpy?
Alex Fischer@s3alfisc·
PyFixest 0.27: (Pre-Christmas 🎅) Release. Hi all, a new version of PyFixest (0.27) has found its way to PyPI! Highlights: we now support Gelbach's regression decomposition & Westfall and Young's multiple testing correction.
Alex Fischer@s3alfisc·
We've released a patch to pyfixest (0.26.2). Some highlights:
- feols and fepois have new split and fsplit arguments (just as in fixest!)
- you can check for separation in Poisson models via the "iterative rectifier"
- we've fixed a 🐞 in the predict method for WLS
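Conceptually, a split-style option fits the model separately within each level of a split variable, and an fsplit-style option additionally fits the full sample. A rough numpy sketch of that behavior (an illustration of the idea, not pyfixest's implementation):

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fit_split(X, y, groups, full_sample=False):
    """Fit OLS per subsample; full_sample=True mimics an fsplit-style call."""
    results = {}
    if full_sample:
        results["full sample"] = ols(X, y)
    for g in np.unique(groups):
        mask = groups == g
        results[g] = ols(X[mask], y[mask])
    return results

# simulated data where the slope differs by group: 2.0 in "a", -1.0 in "b"
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
groups = rng.choice(["a", "b"], size=n)
slope = np.where(groups == "a", 2.0, -1.0)
y = slope * x + 0.1 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # intercept + regressor

fits = fit_split(X, y, groups, full_sample=True)
```

Each subsample recovers its own slope, while the pooled fit averages across the two regimes, which is exactly why split-sample estimation is useful for heterogeneity checks.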