Francesco A. Fabozzi

27 posts

@FAFabozzi

Education & Research on LLMs and Quant Finance | Research Director @ Yale ICF | Managing Editor @ Journal of Financial Data Science | Data Science PhD

Joined May 2020

176 Following · 135 Followers

Pinned Tweet

Francesco A. Fabozzi @FAFabozzi · 7 Nov

x.com/i/article/1853…

Francesco A. Fabozzi retweeted

himanshu @himanshustwts · 3 Dec

this is lit. so satisfying to see conv nets in action man!

Francesco A. Fabozzi @FAFabozzi · 1 Dec

@svpino Have you checked out @ZeroGpt? I've found it useful for detection on academic work.

Santiago @svpino · 30 Nov

The more I think about the increase in LLM-generated content worldwide, the more convinced I am that "human proof" is necessary. I think we need a straightforward way to determine if a piece of content was generated by a human or not. Is this even possible?

Francesco A. Fabozzi @FAFabozzi · 16 Nov

@MattiasLamotte @bugabtc Instead of trimming out features, try PCA or other dimension reduction techniques!

Mattias Lamotte @MattiasLamotte · 15 Nov

a lot. The dataset I built had approximately 1650 features, and I usually try to trim them to between 100 and 200 features. Many of the 1650 features are correlated/cointegrated so part of the work is to find a method to trim them. Implied vols, realised vols, bond yield movements, underlying market movements etc...
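
A minimal sketch of the PCA suggestion in the reply above, assuming NumPy is available. The data is synthetic (a few latent drivers generating many correlated features) and only stands in for the kind of correlated feature set described; the function name `pca_reduce` and all sizes are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Share of total variance captured by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, explained[:n_components]

rng = np.random.default_rng(0)
# Synthetic stand-in: 3 latent drivers generating 50 highly correlated features.
latent = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 50))
X = latent @ loadings + 0.01 * rng.normal(size=(500, 50))

scores, explained = pca_reduce(X, n_components=3)
print(scores.shape, round(float(explained.sum()), 4))
```

With real market features on different scales (implied vols, yields, returns), standardizing each column before the decomposition is usually worth considering, since PCA is sensitive to units.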

Mattias Lamotte @MattiasLamotte · 15 Nov

My 5 day models have turned bearish in last 48hrs (presumably, futures traders are trading similar signals as mine given that ES is down -0.3% since the close). I took a small short bet on ES on the open of the last session.

Francesco A. Fabozzi @FAFabozzi · 13 Nov

Finding edgartools to be the best Python package for downloading financial statements from the SEC. This video provides a great overview👇 youtube.com/watch?v=mI6KDe…

Francesco A. Fabozzi @FAFabozzi · 12 Nov

I often see researchers reaching for LLMs when smaller, embeddings-based models would not only suffice but could also outperform at a fraction of the cost. Before choosing an approach, ask if you have a well-labeled training dataset—sometimes a simpler model is the better fit.

Francesco A. Fabozzi @FAFabozzi · 12 Nov

@quantscience_ Great thread! There’s also clustering for pricing illiquid bonds

Quant Science @quantscience_ · 12 Nov

7. Event-Based Trading: Re-cluster stocks after major events (earnings, economic announcements) to observe any shifts
8. Anomaly Detection: Use autoencoder embeddings to spot outliers or anomalies

Quant Science @quantscience_ · 12 Nov

Using Auto Encoders with Python for investing. This is how (Python Code):

Francesco A. Fabozzi @FAFabozzi · 8 Nov

On this point, an under-appreciated paper related to this work is the "Deep Regression Ensembles" paper by Didisheim, Kelly, and Malamud. Link: arxiv.org/abs/2203.05417

Francesco A. Fabozzi @FAFabozzi · 8 Nov

Financial markets are complex, and linear approximations of input variables to asset returns fail to capture this complexity. Predictability can be improved by examining large-scale interactions among predictive variables, even when fitting with a linear model—just don’t forget to regularize!

Clifford Asness @CliffordAsness

From my colleagues: “Can Machines Build Better Stock Portfolios?” aqr.com/Insights/Resea…
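
A small sketch of the point about interactions and regularization, assuming NumPy and a synthetic target. It is not the method of any paper mentioned here: it simply appends pairwise products to the inputs and fits a closed-form ridge regression. The helper names (`add_pairwise_interactions`, `ridge_fit`, `r2`) and the penalty `alpha=1.0` are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def add_pairwise_interactions(X):
    """Append all pairwise products x_i * x_j (i < j) to the original columns."""
    pairs = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.column_stack([X] + pairs)

def ridge_fit(X, y, alpha):
    """Closed-form ridge without intercept: (X'X + alpha*I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def r2(y_true, y_pred):
    """Out-of-sample R^2 relative to the test-set mean."""
    return 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
# Synthetic target driven purely by an interaction between two predictors.
y = X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=400)

# Ordered split: the first 300 rows train, the last 100 rows test.
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]
Z_train, Z_test = add_pairwise_interactions(X_train), add_pairwise_interactions(X_test)

r2_linear = r2(y_test, X_test @ ridge_fit(X_train, y_train, alpha=1.0))
r2_interact = r2(y_test, Z_test @ ridge_fit(Z_train, y_train, alpha=1.0))
print(Z_train.shape, r2_linear, r2_interact)
```

The model stays linear in its coefficients; only the feature set grows. With many predictors the number of pairwise terms grows quadratically, which is where the ridge penalty starts to matter.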

Francesco A. Fabozzi @FAFabozzi · 8 Nov

A sad reality... #QuantFinance #AI

Francesco A. Fabozzi @FAFabozzi · 8 Nov

For those looking to get started in #FinancialNLP, the Financial Phrasebank dataset on Huggingface is a great place to start playing with LLMs! huggingface.co/datasets/takal…

Francesco A. Fabozzi @FAFabozzi · 8 Nov

@quantscience_ Correct! ML usually comes with more risks😂 proper CV is a necessity

Quant Science @quantscience_ · 5 Nov

ML is not without risk.
- Universe selection is critical
- Volatility can be a double-edged sword
- Overfitting can kill profitability
That's why Time Series Cross-Validation is essential.
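
To illustrate what time series cross-validation means in practice, here is a dependency-free sketch of an expanding-window split, where every training set strictly precedes its test set. The function name and parameters are hypothetical; scikit-learn's `TimeSeriesSplit` offers a similar scheme.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_idx, test_idx) pairs where training data always precedes test data."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        # Training window grows with each fold; test window slides forward.
        train_idx = list(range(0, test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx

splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Unlike shuffled k-fold, no fold here trains on observations that come after the ones it is evaluated on, which is the leakage this kind of validation is meant to avoid.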

Quant Science @quantscience_ · 5 Nov

Why Machine Learning in Finance? This is why. 🧵

Francesco A. Fabozzi @FAFabozzi · 8 Nov

@pyquantnews I always tell beginners... start with a project, not syntax!

PyQuant News 🐍 @pyquantnews · 7 Nov

What EVERY Python beginner forgets: Python is a tool to get a job done. And the job is probably not printing "Hello World" to the screen. Solve problems you can use in real life. Here's a primer:

Francesco A. Fabozzi @FAFabozzi · 5 Nov

When using LLMs in financial backtesting, a common question is: "Doesn't ChatGPT have forward-looking bias?" 🤔

Paul Glasserman and Caden Lin dive into this in their paper "Assessing Look-Ahead Bias in Stock Return Predictions Generated by GPT Sentiment Analysis." Their study backtests ChatGPT-based sentiment trading strategies, comparing portfolios with and without anonymized headlines (masking company identifiers). Interestingly, portfolios using anonymized headlines outperform those with original headlines.

Why? 🧠 Their findings suggest that including company names creates a “distraction effect,” where the model fixates on names rather than sentiment, a stronger effect than any look-ahead bias!

📉 This work is critical for advancing GPT models in trading and portfolio construction. I’ve observed similar results in my own research too.

Paper: pm-research.com/content/iijjfd…
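
As a rough illustration of what "masking company identifiers" could look like, here is a standard-library sketch. It is not the procedure used in the paper: the function name, the placeholder text, and the identifier list are all assumptions made for the example.

```python
import re

def anonymize_headline(headline, identifiers, placeholder="the company"):
    """Replace each known company identifier (name or ticker) with a neutral placeholder."""
    if not identifiers:
        return headline
    # Longest identifiers first so "Apple Inc." is matched before "Apple".
    ordered = sorted(identifiers, key=len, reverse=True)
    # Lookarounds keep matches from firing inside longer words.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(s) for s in ordered) + r")(?!\w)"
    )
    return pattern.sub(placeholder, headline)

masked = anonymize_headline(
    "Apple Inc. beats estimates as AAPL shares jump",
    identifiers=["Apple", "Apple Inc.", "AAPL"],
)
print(masked)
```

A real pipeline would need a maintained mapping of names, tickers, and product names per firm; a fixed list like the one above is only enough to show the idea.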

Francesco A. Fabozzi @FAFabozzi · 5 Nov

Paper link: papers.ssrn.com/sol3/papers.cf… Associated repo: github.com/francescoafabo…

Francesco A. Fabozzi @FAFabozzi · 5 Nov

These @notebooklm_pods podcasts are crazy! @ProfJimLiew just sent me a podcast he generated based on my paper, "Cut the Chit-Chat". A bit too much embellishment but otherwise amazing!

Francesco A. Fabozzi @FAFabozzi · 4 Nov

6/ Follow me here for more insights into cutting-edge applications of LLMs for Finance, and stay tuned for tools, research, and practical tips! #GenAI #Finance #LLMs #Python #SentimentAnalysis

Francesco A. Fabozzi @FAFabozzi · 4 Nov

5/ 💻 To make Logit Extraction accessible, I've developed a Python package, TokenProbs, now available on GitHub and PyPi. PyPi: pypi.org/project/TokenP… GitHub: github.com/francescoafabo…

Francesco A. Fabozzi @FAFabozzi · 4 Nov

1/ 🚀 Sentiment analysis in finance is evolving. As we move from encoder-only (i.e., embeddings) models to generative language models (GLMs), we must update the research toolset for leveraging this new and rapidly growing class of large language models.
