DailyPapers

5.2K posts

@HuggingPapers

Tweeting interesting papers submitted at https://t.co/rXX8x0HzXV. Submit your own at https://t.co/QhbJKXBd4Q, and link models/datasets/demos to it!

Anywhere · Joined March 2025
4 Following · 17.1K Followers
DailyPapers @HuggingPapers ·
Learning while Deploying: Fleet-Scale RL for Generalist Robot Policies

A new framework that turns robot deployment into a continuous training loop, enabling 16 dual-arm robots to improve from real-world experience and achieve 95% success on long-horizon tasks like brewing tea and making cocktails.
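The deployment-as-training idea above can be sketched as a loop: robots run the current policy, their episodes are pooled, and a central learner periodically refreshes the shared policy. This is a minimal illustration of that loop, not the paper's method; all names (`Policy`, `fleet_loop`) and the toy episode logic are assumptions.

```python
import random

class Policy:
    """Stand-in for a robot policy; only tracks its update version."""
    def __init__(self, version=0):
        self.version = version

    def act(self, observation):
        # Placeholder action selection (the observation is ignored here).
        return random.choice(["grasp", "pour", "stir"])

def run_episode(robot_id, policy):
    """One deployment episode; returns a logged experience record."""
    trajectory = [policy.act(obs) for obs in range(5)]  # fake observations
    return {"robot": robot_id, "actions": trajectory, "success": True}

def fleet_loop(num_robots=16, rounds=3):
    """Alternate fleet-wide deployment with centralized policy updates."""
    policy = Policy()
    replay = []
    for _ in range(rounds):
        # 1) Deploy: every robot collects real-world experience.
        replay.extend(run_episode(r, policy) for r in range(num_robots))
        # 2) Learn: update the shared policy from the pooled episodes.
        policy = Policy(version=policy.version + 1)
    return policy, replay

policy, replay = fleet_loop()
```

The key design point is that deployment never pauses for training: collection and learning alternate, so every robot always runs the latest policy.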
DailyPapers @HuggingPapers ·
UniVidX: A Unified Multimodal Framework for Versatile Video Generation

Enables omni-directional generation across RGB, intrinsic maps, and alpha channels using diffusion priors with stochastic condition masking, trained on fewer than 1,000 videos. (SIGGRAPH 2026)
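Stochastic condition masking, as summarized above, can be sketched in a few lines: during training, each conditioning channel is independently replaced by a null embedding with some probability, so one model learns to generate from any subset of conditions. This is a generic sketch of the technique, not UniVidX's implementation; the drop probability and channel names are assumptions.

```python
import random

NULL = None  # stands in for a learned "null" embedding

def mask_conditions(conditions, drop_prob=0.5, rng=random):
    """Independently replace each condition with NULL with prob drop_prob."""
    return {
        name: (NULL if rng.random() < drop_prob else value)
        for name, value in conditions.items()
    }

# Three conditioning channels matching the tweet's modalities.
cond = {"rgb": "rgb_latent", "intrinsics": "intr_latent", "alpha": "alpha_latent"}
masked = mask_conditions(cond, drop_prob=0.5, rng=random.Random(0))
# All keys survive; some values become the null embedding.
```

At inference, the same null embedding is fed for whichever conditions the user omits, which is what makes the "any subset" generation possible.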
DailyPapers @HuggingPapers ·
Discuss: huggingface.co/papers/2604.15…

Hallucinations arise from semantic interference during fine-tuning. Self-distillation mitigates this by regularizing output distributions.
DailyPapers @HuggingPapers ·
Fine-tuning increases hallucinations

New research shows SFT causes factual errors by interfering with pre-trained knowledge. The authors propose self-distillation to learn new facts without forgetting, plus selective parameter freezing to reduce hallucinations while preserving performance.
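The self-distillation idea above is commonly implemented as a KL penalty: while fine-tuning on new facts, the student's output distribution is kept close to the frozen pre-fine-tuning model's, limiting interference with pre-trained knowledge. This is a minimal numeric sketch of that loss, not the paper's exact objective; the `beta` weight and toy distributions are assumptions.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill_loss(task_loss, student_probs, teacher_probs, beta=0.1):
    """Task loss plus a KL regularizer toward the frozen base model."""
    return task_loss + beta * kl_divergence(student_probs, teacher_probs)

teacher = [0.7, 0.2, 0.1]  # frozen pre-fine-tuning token distribution
student = [0.6, 0.3, 0.1]  # distribution after an SFT step has drifted
loss = distill_loss(task_loss=1.5, student_probs=student,
                    teacher_probs=teacher, beta=0.1)
```

With `beta=0` this reduces to plain SFT; raising `beta` trades plasticity on new facts for retention of pre-trained knowledge.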
DailyPapers @HuggingPapers ·
Edit-R1: Reasoning verifier-based RL for image editing

Moves beyond simple scorers to chain-of-thought verifiers that break instructions into verifiable principles. Trains editing models via GRPO with fine-grained rewards, outperforming Seed-1.5-VL and scaling up to 7B parameters.
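The "GRPO with fine-grained rewards" combination can be sketched as follows: the verifier scores each sampled edit against several verifiable principles, those scores are aggregated into a reward, and advantages are computed relative to the sampled group (GRPO's defining step). The principle breakdown and aggregation here are illustrative assumptions, not Edit-R1's actual reward design.

```python
def fine_grained_reward(principle_scores):
    """Aggregate per-principle verifier scores into one scalar reward."""
    return sum(principle_scores) / len(principle_scores)

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: (r - mean) / std over the sampled group."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four candidate edits for one instruction, each scored on three
# hypothetical principles (e.g. identity kept, instruction followed, no artifacts).
group_scores = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
rewards = [fine_grained_reward(s) for s in group_scores]
advantages = grpo_advantages(rewards)
```

Because advantages are normalized within the group, no learned value model is needed; the fine-grained scores just make the reward signal less noisy than a single pass/fail verdict.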
DailyPapers @HuggingPapers ·
NVIDIA just released AETC on Hugging Face

44k multi-task video annotations with chain-of-thought reasoning for traffic anomaly detection.
DailyPapers @HuggingPapers ·
Recursive Multi-Agent Systems, Agentic World Modeling, and AI Organizations: Top Papers of the Week

- Recursive Multi-Agent Systems: A new framework scaling agent collaboration through recursive latent-space computation (242 upvotes)
- Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond: A comprehensive taxonomy for AI environment modeling (219 upvotes)
- Heterogeneous Scientific Foundation Model Collaboration (Eywa): Bridging language models with scientific domain foundation models (192 upvotes)
- From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company: The OneManCompany framework (116 upvotes)
- World-R1: Reinforcing 3D Constraints for Text-to-Video Generation (115 upvotes)
- GLM-5V-Turbo by Zhipu AI: Toward native foundation models for multimodal agents (90 upvotes)
DailyPapers @HuggingPapers ·
13 frontier models evaluated: Claude Opus 4.6 leads at 66.7%, GPT-5.4 at 63.8%, Gemini 3.1 Pro at 53.3%. The gap is clear: workspace repair is near-ceiling, but HR, finance, and multi-system orchestration remain unsolved.

Paper: huggingface.co/papers/2604.28…
Leaderboard: claw-eval-live.github.io
DailyPapers @HuggingPapers ·
Claw-Eval-Live

A live benchmark for workflow agents that refreshes quarterly from real marketplace signals. 105 tasks across CRM, HR, finance, and workspace repair show even the best models struggle: Claude Opus 4.6 hits just a 66.7% pass rate, with HR and management workflows failing most.
DailyPapers @HuggingPapers ·
Intern-Atlas traces 60 years of AI method evolution

Built from 1 million papers into a graph with 9 million causal edges, mapping how techniques emerge, relate, and advance across machine learning history.
DailyPapers @HuggingPapers ·
RoundPipe

Fully fine-tune 32B models or LoRA-fine-tune 235B models on a single 24GB GPU with 64K+ context length. Achieves 1.5-2.2× speedups over SOTA baselines by dynamically dispatching stages in a round-robin manner for near-zero pipeline bubbles.
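Round-robin stage dispatch, as described above, can be illustrated with a toy scheduler: pipeline stages are assigned to workers cyclically, so work stays balanced and every worker always has a next stage queued, which is what keeps pipeline bubbles small. This is an illustrative sketch only; RoundPipe's actual dynamic scheduler is more involved, and the stage/worker counts here are assumptions.

```python
def round_robin_schedule(num_stages, num_devices):
    """Map each pipeline stage to a device in cyclic (round-robin) order."""
    return {stage: stage % num_devices for stage in range(num_stages)}

def run_microbatches(num_microbatches, num_stages, num_devices):
    """Count per-device work items under the round-robin assignment."""
    placement = round_robin_schedule(num_stages, num_devices)
    work = {d: 0 for d in range(num_devices)}
    for _ in range(num_microbatches):
        for stage, device in placement.items():
            work[device] += 1  # each device executes its assigned stages
    return work

schedule = round_robin_schedule(num_stages=8, num_devices=4)
work = run_microbatches(num_microbatches=4, num_stages=8, num_devices=4)
```

Because the cyclic assignment spreads stages evenly, no device sits idle waiting for a disproportionately loaded peer, which is the intuition behind the near-zero-bubble claim.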
DailyPapers @HuggingPapers ·
Allen AI just released OlmPool architectural variants on Hugging Face

7-8B parameter models exploring how minor architectural choices impact long-context extension.