Jefferson Enrique Hernandez Cevallos

491 posts


@jefehern

Opinions are my own. Ph.D. student at @RiceCompSci in @vislang. Previously @RealityLabs @AdobeResearch, @InariAILab, @AdaVivInc.

Houston, Texas · Joined December 2017
345 Following · 66 Followers
Jefferson Enrique Hernandez Cevallos retweeted
Massimiliano Viola @massiviola01
Thread on V-JEPA 2.1 🤟 This DEFINITELY flew under the radar: just a few days ago, @AIatMeta released V-JEPA 2.1, taking a massive step toward closing the gap between the image and video domains. For a long time, image backbones were the only option for solving dense vision tasks. This model disagrees, showing that universal spatial understanding also emerges from large-scale video models! 🎥
Jefferson Enrique Hernandez Cevallos retweeted
Ji-Ha @Ji_Ha_Kim
Blog post: Transformers as Constrained Optimization. Rewriting pre-norm decoder-only transformers as solutions to regularized objectives. Changing the regularization to a hard constraint gives a canonical temperature, generalizing to KL divergence and to ideas of cross-layer interaction.
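The "canonical temperature" framing above builds on a textbook fact (I haven't read the linked post, so this is only the standard result it presumably starts from): softmax at temperature T is the exact maximizer of an entropy-regularized linear objective over the probability simplex. A minimal numpy check:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, temp):
    z = x / temp
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def objective(p, x, temp):
    # linear score plus temperature-weighted Shannon entropy
    return p @ x - temp * np.sum(p * np.log(p + 1e-12))

x = rng.normal(size=5)       # arbitrary logits
T = 0.7                      # an arbitrary temperature
p_star = softmax(x, T)
best = objective(p_star, x, T)

# softmax(x/T) is the exact maximizer of the regularized objective,
# so no other distribution on the simplex can beat it
for _ in range(1000):
    q = rng.dirichlet(np.ones(5))
    assert objective(q, x, T) <= best + 1e-9
```

Replacing the soft entropy penalty with a hard entropy constraint leaves the same family of solutions, with the temperature appearing as the constraint's Lagrange multiplier, which is presumably what makes one temperature "canonical" in the post's setup.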
Jefferson Enrique Hernandez Cevallos retweeted
Ksenia_TuringPost @TheTuringPost
Is RL dead for post-training? Of course not, but there are other interesting options for fine-tuning.

▪️ Evolution Strategies (ES) is a gradient-free optimization method that tests random parameter changes and moves the model toward the best-performing ones.
- It creates a small population of models by adding random perturbations to the parameters
- Perturbed models' outputs are scored with a reward function/verifier
- Model parameters are updated in the direction of the perturbations that achieved the best rewards

The best thing is that ES can scale to billion-parameter models and shows clear gains over RL:
• On the Countdown benchmark: ES raised Qwen-2.5-3B to 60.5% (vs 32.5% GRPO) and Llama-3.1-8B to 61.2% (vs ~51% RL)
• ARC-AGI: 0.2% → 29.5%, and Sudoku: 2.5% → 69.5%

And everything without computing gradients through backpropagation. So don't overlook other approaches in favor of RL alone - there is a lot to explore. Here we've gathered the new fine-tuning stack for LLMs with ES and the most promising LoRAs -> turingpost.com/p/beyondrl
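The three-step ES loop described above can be sketched in a few lines of numpy. The quadratic "verifier" and every hyperparameter here are illustrative stand-ins, not anything from the linked post:

```python
import numpy as np

# Toy verifier: reward is higher the closer the parameters are to a
# hidden target (a stand-in for scoring a perturbed model's outputs).
rng = np.random.default_rng(0)
dim, pop, sigma, lr = 10, 32, 0.1, 0.02
target = rng.normal(size=dim)

def reward(params):
    return -np.sum((params - target) ** 2)

params = np.zeros(dim)
for step in range(500):
    # 1) create a small population by adding random perturbations
    eps = rng.normal(size=(pop, dim))
    # 2) score each perturbed model with the reward function / verifier
    scores = np.array([reward(params + sigma * e) for e in eps])
    # 3) update parameters toward the best-rewarded perturbations
    adv = (scores - scores.mean()) / (scores.std() + 1e-8)
    params += lr / (pop * sigma) * eps.T @ adv
```

The normalized-advantage update is the standard OpenAI-style ES estimator; note that no gradient of `reward` is ever computed, which is what lets the method scale to non-differentiable verifiers.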
Jefferson Enrique Hernandez Cevallos retweeted
Seungwook Han @seungwookh
Can language models learn useful priors without ever seeing language? We pre-pre-train transformers on neural cellular automata — fully synthetic, zero language. This improves language modeling by up to 6%, speeds up convergence by 40%, and strengthens downstream reasoning. Surprisingly, it even beats pre-pre-training on natural text! Blog: hanseungwook.github.io/blog/nca-pre-p… (1/n)
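A hypothetical sketch of what "fully synthetic, zero language" pre-pre-training data from a neural cellular automaton could look like; the blog's actual data pipeline may differ in every detail. Each cell updates from its local neighborhood through a small random MLP, and the resulting states are discretized into a token stream:

```python
import numpy as np

rng = np.random.default_rng(0)
width, hidden, steps = 32, 16, 8

# randomly initialized "neural" update rule (illustrative, untrained)
W1 = rng.normal(scale=0.5, size=(hidden, 3))   # neighborhood -> hidden
W2 = rng.normal(scale=0.5, size=(1, hidden))   # hidden -> new cell state

def nca_step(state):
    # gather left/self/right neighborhoods with wrap-around
    neigh = np.stack([np.roll(state, 1), state, np.roll(state, -1)])
    return np.tanh(W2 @ np.tanh(W1 @ neigh)).ravel()

state = rng.uniform(-1, 1, size=width)
rows = []
for _ in range(steps):
    state = nca_step(state)
    rows.append(state)

# discretize continuous cell states into a small token vocabulary,
# yielding a language-free sequence a transformer could be trained on
tokens = np.digitize(np.concatenate(rows), np.linspace(-1, 1, 7))
```

The appeal of this kind of source is that it has nontrivial local structure and long-range dependencies (through repeated updates) while containing no natural-language statistics at all.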
Jefferson Enrique Hernandez Cevallos retweeted
Minh Dinh @minhinhtrng
Modern vision models lack robustness when objects appear in unusual poses. @StphTphsn1 and I study latent equivariant operators as a remedy and discuss the caveats of these operators. Below is a summary of the work, accepted at the GRaM Workshop at ICLR @iclr_conf 2026. 🧵
Jefferson Enrique Hernandez Cevallos retweeted
George Bredis @BredisGeorge
Most imagination-based world models learn representations by reconstructing pixels. But reconstruction may not be the right objective for control. In our new paper we explore a different idea: 👉 predict the next embedding instead of reconstructing observations. Introducing NE-Dreamer. Project page: corl-team.github.io/nedreamer/ Paper: arxiv.org/pdf/2603.02765 Code: github.com/corl-team/nedr…
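The objective shift the tweet describes can be contrasted in a minimal numpy sketch. This is not NE-Dreamer's actual architecture (see the linked paper and code for that); it only shows, with toy linear maps, the difference between decoding a predicted latent back to pixels and matching a target embedding directly:

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, latent_dim, act_dim = 64, 8, 2

W_enc = rng.normal(scale=0.1, size=(latent_dim, obs_dim))              # encoder
W_dyn = rng.normal(scale=0.1, size=(latent_dim, latent_dim + act_dim)) # dynamics
W_dec = rng.normal(scale=0.1, size=(obs_dim, latent_dim))              # decoder

def encode(o):
    return W_enc @ o

def predict_next_latent(z, a):
    return W_dyn @ np.concatenate([z, a])

o_t, o_next = rng.normal(size=obs_dim), rng.normal(size=obs_dim)
a_t = rng.normal(size=act_dim)

z_pred = predict_next_latent(encode(o_t), a_t)

# Reconstruction objective: decode the predicted latent back to pixels
# and match the next observation.
recon_loss = np.mean((W_dec @ z_pred - o_next) ** 2)

# Embedding-prediction objective: match the encoder's embedding of the
# next observation (treated as a fixed target during training).
embed_loss = np.mean((z_pred - z_pred * 0 + (z_pred - encode(o_next))) ** 2) if False else np.mean((z_pred - encode(o_next)) ** 2)
```

The second loss never touches pixel space, which is the point: the model is only asked to be predictive in its own representation, not to account for every reconstructable detail of the observation.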
Jefferson Enrique Hernandez Cevallos retweeted
Dimitris Papailiopoulos @DimitrisPapail
Two days ago I didn't know if a 10K transformer could add 10-digit numbers. AdderBoard is now at 36 hand-coded (🏆 @alexlitzenberge) and 311 trained (🏆 @reza_byt)!! The gap is very interesting: humans can still construct solutions that gradient descent can't find. For now.
Jefferson Enrique Hernandez Cevallos retweeted
Gabriele Berton @gabriberton
Apparently counterintuitive result, but obvious if you think about it: thinking longer is bad for OCR, just like thinking longer is bad for VQA. With too much thinking, the image is too far from the answer, and the answer's tokens attend to the reasoning trace, not the image. [1/3]
Generative History @HistoryGPT

Why the difference? Not sure, but it's what we've seen with other models too (Claude and OpenAI), as they are optimized more for thinking-based pipelines. Remember: increasing thinking budgets always decreases accuracy on handwriting recognition. (4/5)

Jefferson Enrique Hernandez Cevallos retweeted
Yehonathan Litman @yehonation
Excited to share our new work EditCtrl! We introduce a disentangled local-global control video inpainting framework that dynamically allocates compute where needed - achieving up to 10x compute savings over full-attention while matching or exceeding SOTA editing quality. 🧵
Jefferson Enrique Hernandez Cevallos retweeted
Amrith Setlur @setlur_amrith
I'll admit, going in I was not 100% sure this was possible: we trained a tiny 4B model (QED-Nano) to prove math theorems at the Olympiad level! Today, we release the full recipe, from the data curation done for SFT to our RL algorithm that explicitly optimizes for test-time scaling over millions of tokens (i.e., we train QED-Nano to continually improve as we apply modern day test-time scaffolds like DeepSeekMath-agent over it). 🧵⬇️
Jefferson Enrique Hernandez Cevallos retweeted
Vincent Sitzmann @vincesitzmann
In my recent blog post, I argue that "vision" is only well-defined as part of perception-action loops, and that the conventional view of computer vision, mapping imagery to intermediate representations (3D, flow, segmentation...), is about to go away. vincentsitzmann.com/blog/bitter_le…
Jefferson Enrique Hernandez Cevallos retweeted
Haoyu Han @Hyhan0118
Why is RL training with policy-gradient methods often unstable—especially near the optimum? Our new work studies this through the noise-to-signal ratio (NSR) of the REINFORCE gradient estimator (defined as the estimator variance (noise) normalized by the squared norm of the true gradient (signal)). We show that the NSR is highly non-uniform and often increases as training progresses, which can trigger instability or even policy collapse. Paper link: arxiv.org/abs/2602.01460 Long thread👇
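The NSR definition in the tweet (estimator variance over squared true-gradient norm) can be computed exactly on a toy problem. The two-armed bandit below is my own illustrative setup, not the paper's LLM experiments; with uncentered rewards it reproduces the qualitative claim that NSR blows up as the policy nears the optimum:

```python
import numpy as np

r = np.array([1.0, 0.9])  # both actions carry nonzero (uncentered) reward

def nsr(p):
    """NSR of the single-sample REINFORCE estimator
    g_k = r_a * (1[a == k] - pi_k) for a two-armed softmax policy."""
    pi = np.array([p, 1.0 - p])
    rbar = pi @ r
    true_grad = pi * (r - rbar)            # exact policy gradient wrt logits
    # exact second moment E[g g^T] of the estimator
    second = np.zeros((2, 2))
    for a in range(2):
        d = np.eye(2)[a] - pi
        second += pi[a] * r[a] ** 2 * np.outer(d, d)
    noise = np.trace(second) - true_grad @ true_grad   # trace of covariance
    return noise / (true_grad @ true_grad)

nsr_mid, nsr_near_opt = nsr(0.5), nsr(0.95)
```

As p → 1 the true gradient shrinks like (1 − p)², while the variance only shrinks like (1 − p), so the ratio diverges, which is one concrete way the non-uniform NSR the authors describe can emerge near an optimum.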
Jefferson Enrique Hernandez Cevallos retweeted
Ziming Liu @ZimingLiu11
🚨Transformers don't learn Newton's laws? They learn Kepler's laws! Like us, transformers don't predict a flying ball via a differential equation, but by fitting a curve. Moreover, reducing context length steers a transformer from Keplerian to Newtonian. Compression in play.
Jefferson Enrique Hernandez Cevallos retweeted
Zilin Xiao @ZilinXiao2
🚀 Two papers accepted to #ICLR2026 on test-time scaling for vision-language systems (retrieval + reasoning)!

1) MetaEmbed (Oral Presentation): Meta Tokens + Matryoshka multi-vector training → flexible late interaction; choose #vectors at test time for accuracy↔efficiency.
Paper: arxiv.org/abs/2509.18095
Work done at @AIatMeta with amazing collaborators: Qi Ma, @Mengting_Gu, Jason Chen, Xintao Chen, @vislang and @MohanVijaimohan!

2) ProxyThinker: training-free test-time guidance from small "slow-thinking" visual reasoners → self-verification / self-correction via distribution-level guidance.
Paper: arxiv.org/abs/2505.24872
Work done with @JaywonK17250, @Siru_Ouyang, @jefehern, @yumeng0818 and @vislang!

While I won't be able to travel to Brazil 🇧🇷, please say hi to the team :-) #MultimodalRetrieval #VisualReasoning #VisionLanguage #TestTimeCompute #Embeddings
Jefferson Enrique Hernandez Cevallos retweeted
Tanishq Mathew Abraham, Ph.D. @iScienceLuvr
Rewards as Labels: Revisiting RLVR from a Classification Perspective. "We propose Rewards as Labels (REAL), a novel framework that revisits verifiable rewards as categorical labels rather than scalar weights, thereby reformulating policy optimization as a classification problem."
Jefferson Enrique Hernandez Cevallos retweeted
Rishabh Agarwal @agarwl_
And RL on ImageNet is such a nice way to show this: every task can be formulated as RL, but should it be? Here they show how REINFORCE has terrible efficiency, but this MaxRL thingy with a *large number of rollouts* doesn't.