The LLM Data Company

@llmdataco

Frontier models for critical domains

San Francisco · Joined October 2025

0 Following · 63 Followers

The LLM Data Company@llmdataco·4 Mar

Kos is the birthplace of Hippocrates

Daanish Khazi@bertgodel

We’re announcing Kos-1 Lite, a medical model that achieves SOTA on HealthBench Hard at 46.6%. As a medium-sized language model (~100B parameters), it achieves these results at a fraction of the serving cost of frontier trillion-parameter models.

The LLM Data Company reposted

Daanish Khazi@bertgodel·5 Feb

We're excited to partner with @perplexity_ai on their latest release. We were impressed with how performant their new Deep Research product was on early benchmarks and are thrilled that this work is being open sourced.

Aravind Srinivas@AravSrinivas

Today, we're rolling out an Advanced version of Perplexity Deep Research, achieving state-of-the-art performance on external and internal benchmarks, beating every other deep research tool on accuracy, usability, and reliability across all verticals.

The LLM Data Company@llmdataco·3 Dec

🙏🤝

Baseten@baseten

Baseten is proud to support training jobs for The LLM Data Company 💪

The LLM Data Company reposted

Daanish Khazi@bertgodel·3 Dec

(1/5) New post: "Mismatch Praxis: Rollout Settings and IS Corrections". We pressure-tested solutions for inference/training mismatch, which creates a hidden off-policy problem in modern RL frameworks. To resolve the mismatch, various engineering fixes (e.g., FP16 unification, deterministic kernels) and algorithmic fixes (e.g., importance sampling) have been proposed. In this work, we examine how rollout settings (temperature, top-p, and top-k) affect mismatch, and how importance sampling corrections hold up in practice. We find that while Sequence-TIS is theoretically optimal, it can succumb to catastrophic variance in long-horizon contexts. Additionally, non-standard rollout settings create subtle mismatch patterns that require careful engineering fixes. Token-TIS with default rollout settings proved to be the most robust configuration for long-horizon training.
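To make the Token-TIS versus Sequence-TIS contrast concrete, here is a minimal sketch, assuming that "TIS" means truncated importance sampling: the ratio of training-policy to rollout-policy probability, capped at a constant. The function names, the `cap=2.0` default, and the toy log-probabilities are all hypothetical and are not taken from the post; this is an illustration of the general idea, not the authors' implementation.

```python
import math

def _truncated_ratio(log_ratio, cap):
    # Truncate in log space so very large log-ratios cannot overflow exp().
    if log_ratio >= math.log(cap):
        return cap
    return math.exp(log_ratio)

def token_tis_weights(train_logprobs, rollout_logprobs, cap=2.0):
    """One truncated weight per token: min(pi_train / pi_rollout, cap)."""
    return [
        _truncated_ratio(t - r, cap)
        for t, r in zip(train_logprobs, rollout_logprobs)
    ]

def sequence_tis_weight(train_logprobs, rollout_logprobs, cap=2.0):
    """One truncated weight per sequence: min(product of token ratios, cap)."""
    log_ratio = sum(t - r for t, r in zip(train_logprobs, rollout_logprobs))
    return _truncated_ratio(log_ratio, cap)

# Toy example (hypothetical numbers): a small, consistent per-token mismatch
# of about -0.01 nats over a 2000-token rollout.
rollout = [-1.00] * 2000
train = [-1.01] * 2000

token_w = token_tis_weights(train, rollout)
seq_w = sequence_tis_weight(train, rollout)

print(min(token_w), max(token_w))  # every token weight stays close to 1
print(seq_w)                       # the sequence weight collapses toward 0
```

In this toy case each per-token weight stays near 1, while the sequence-level weight is the product of 2000 such ratios and shrinks to roughly exp(-20). A mismatch of the same size in the opposite direction would instead hit the cap. That exponential dependence on sequence length is one plausible reading of why sequence-level weights become high-variance over long horizons.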
