Ksenia_TuringPost

18.7K posts

@TheTuringPost

Newsletter exploring AI&ML - AI 101, Agentic Workflow, Business insights. From ML history to AI trends. Led by @kseniase_ Know what you are talking about👇🏼

Join over 102,000 readers · Joined June 2020

11.4K Following · 82.9K Followers

Ksenia_TuringPost@TheTuringPost·12h

A new paper from @ylecun and others – V-JEPA 2.1
It changes the recipe of V-JEPA so the model learns both:
• Global semantics – what is happening in the scene
• Dense spatio-temporal structure – where things are and how they move
The idea is to supervise not just masked tokens but the visible ones too.
There are 4 key ingredients for V-JEPA 2.1:
- Dense prediction loss on both masked and visible tokens
- Deep self-supervision across intermediate layers
- Modality-specific tokenizers (2D for images, 3D for videos) within a shared encoder
- Model + data scaling
The workflow turns into: masked image/video → encode visible tokens → predict latent representations for both masked and visible tokens → supervise at multiple layers
Here are the details:

145

23K
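To make the first two ingredients from the post above concrete, here is a minimal sketch of a loss that covers both masked and visible tokens and averages over several supervised layers. It is an illustration under stated assumptions, not the paper's implementation: the L1 distance, the `visible_weight` parameter, and all function names are hypothetical choices made for this example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def l1_distance(pred: Vector, target: Vector) -> float:
    """Mean absolute difference between two equal-length latent vectors."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def dense_multilayer_loss(
    preds_per_layer: List[List[Vector]],
    targets_per_layer: List[List[Vector]],
    masked: List[bool],
    visible_weight: float = 1.0,  # hypothetical knob, not taken from the paper
) -> float:
    """Average a per-token latent loss over all tokens and all supervised layers.

    preds_per_layer[l][i] is the predicted latent for token i at layer l.
    masked[i] is True if token i was hidden from the encoder.
    Visible tokens are supervised too, scaled by visible_weight.
    """
    total = 0.0
    for preds, targets in zip(preds_per_layer, targets_per_layer):
        layer_loss = 0.0
        for pred, target, is_masked in zip(preds, targets, masked):
            weight = 1.0 if is_masked else visible_weight
            layer_loss += weight * l1_distance(pred, target)
        total += layer_loss / len(masked)
    return total / len(preds_per_layer)


# Toy example: one supervised layer, two tokens, 2-dimensional latents.
loss = dense_multilayer_loss(
    preds_per_layer=[[[1.0, 1.0], [0.0, 0.0]]],
    targets_per_layer=[[[0.0, 0.0], [0.0, 0.0]]],
    masked=[True, False],
)
print(loss)
```

Setting `visible_weight=0.0` reduces this to a masked-only objective, which is one way to see what supervising the visible tokens adds.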

Ksenia_TuringPost@TheTuringPost·1d

NemoClaw – NVIDIA’s contribution to the emerging OpenClaw ecosystem and one of the biggest announcements at NVIDIA GTC
It's a framework for long-running autonomous agents.
▪️ The idea: Install OpenClaw together with Nemotron models and OpenShell (NVIDIA’s new security runtime) in a single command.
NemoClaw gives agents a sandboxed execution environment that:
- runs OpenClaw inside a secure container – OpenShell
- enforces policies on network, filesystem, and processes
- routes all model calls via NVIDIA cloud
- provides CLI tools to manage agents
In other words, NVIDIA is no longer aiming only to power the model. It wants to sit under the agent itself.

130

8.9K

Ksenia_TuringPost@TheTuringPost·1d

OpenViking – filesystem memory for AI agents
It gives agents a structured navigable context system that:
- replaces flat vector storage with a filesystem (viking://)
- unifies memory, resources, and skills
- loads context in layers (L0/L1/L2) to save tokens
- retrieves info via directory-aware search (not flat RAG)
- makes retrieval traceable and debuggable
→ So it's a combination of structured navigation + semantic (embedding-based) retrieval
This approach delivers better retrieval accuracy, up to 80–96% lower token cost and self-improving memory over time.

108

7.5K
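The layered loading idea from the post above (L0/L1/L2 to save tokens) can be sketched as a small budgeted loader. This is a toy illustration, not OpenViking's API: the `ContextNode` class, the `load_layered` function, the whitespace token counter, and the example `viking://` paths are all hypothetical, and the assumption that each deeper layer is a longer version of the same content is mine.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ContextNode:
    uri: str           # hypothetical path, e.g. "viking://memory/project-a"
    layers: List[str]  # index 0 = L0 (shortest), 1 = L1, 2 = L2 (fullest)


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def load_layered(nodes: List[ContextNode], token_budget: int) -> Dict[str, int]:
    """Pick a layer per node without exceeding token_budget.

    Nodes are assumed to be sorted by relevance. Every node first gets its
    cheapest layer (L0) if it fits; remaining budget is then spent upgrading
    nodes to L1 and finally L2, most relevant first.
    """
    chosen: Dict[str, int] = {}
    used = 0
    for node in nodes:
        cost = count_tokens(node.layers[0])
        if used + cost <= token_budget:
            chosen[node.uri] = 0
            used += cost
    for level in (1, 2):
        for node in nodes:
            if chosen.get(node.uri) != level - 1:
                continue  # node was skipped or not upgraded to the previous level
            extra = count_tokens(node.layers[level]) - count_tokens(node.layers[level - 1])
            if used + extra <= token_budget:
                chosen[node.uri] = level
                used += extra
    return chosen


nodes = [
    ContextNode("viking://a", ["a", "a b c", "a b c d e f"]),
    ContextNode("viking://b", ["x", "x y z", "x y z w v u"]),
]
print(load_layered(nodes, token_budget=9))
```

Because the result maps each path to the layer that was loaded, a caller can log exactly which nodes were expanded and how far, which is the kind of traceability the post describes.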

Ksenia_TuringPost@TheTuringPost·10h

@ylecun Follow @TheTuringPost for more. Get deep analysis, guides & breakdowns of what AI is about now. Join 100,000+ readers from top AI labs, VC funds & universities: turingpost.com/subscribe

488

Ksenia_TuringPost reposted

Ksenia_TuringPost@TheTuringPost·21h

NVIDIA's Nemotron 3 is an architectural response to two pressures:
- Long-context cost as agentic interactions scale
- Repeated reasoning cost from invoking full models for small subtasks
Nemotron 3 proposes several design decisions to solve this:
▪️ Hybrid architecture: Transformer + Mamba 2 layers for efficient long-context processing
▪️ Mixture-of-Experts (MoE) and LatentMoE on top of it to get cheaper experts
▪️ Multi-token prediction
▪️ NVFP4 precision = 4.75 bits used for inference and pre-training, allowing the Nemotron pre-training dataset to achieve up to 4× faster convergence than standard open web datasets.
This is all about one key idea – "Acceleration is intelligence"
Here is the tech stack explained and what the Nemotron Coalition is – NVIDIA has just announced that this alliance of leading players like Cursor, Mistral, Black Forest Labs, etc., is gathering to develop the Nemotron family of open models → turingpost.com/p/nemotroncoal…

4.9K

Ksenia_TuringPost@TheTuringPost·12h

V-JEPA 2.1 = V-JEPA made denser, deeper, more multimodal and more scalable arxiv.org/abs/2603.14482 Code: github.com/facebookresear…

393

Ksenia_TuringPost@TheTuringPost·12h

8. So V-JEPA 2.1 looks strong across both prediction and dense visual understanding (even with the encoder kept frozen)
Some of the results:
• +20% robot grasping success over V-JEPA 2 in zero-shot real-world manipulation
• 10× faster navigation planning, with 5.687 ATE on Tartan Drive
And new SOTA:
• 7.71 mAP on Ego4D short-term object interaction anticipation
• 40.8 Recall@5 on EPIC-KITCHENS action anticipation

430

Ksenia_TuringPost reposted

Ksenia_TuringPost@TheTuringPost·19h

It was a busy week @NVIDIAGTC! Celebrating my birthday on the road 🎉

2.3K

Ksenia_TuringPost@TheTuringPost·14h

As I see from the docs, yes, it is hierarchical, and the current retrieval strategy is branch-local, which does risk orphaning context spread across branches. Would love to hear your thoughts, @jerryjliu0, @hwchase17 and @simonw, on whether it really works like this and is worth addressing.

Hans C Nelson 🗽@HansCNelson·1d

From what I understand, the viking fs is purely hierarchical w/ no cross-folder referencing, correct? Does that lead to orphaned retrieval when relevant context is spread across multiple hierarchical branches? Is the lack of cross-branch memory retrieval a problem worth addressing?

137

Ksenia_TuringPost@TheTuringPost·15h

@tricalt It’s technically open (NVIDIA says agents like Claude and Codex can run). But NemoClaw routes inference through NVIDIA Cloud and defaults to Nemotron, so it will probably favor their own models more

Vasilije@tricalt·23h

@TheTuringPost will it be open to any model? or are they going to push for the ones that they work closely with

Ksenia_TuringPost reposted

Ksenia_TuringPost@TheTuringPost·3d

Must-read AI research of the week:
▪️ OpenClaw-RL
▪️ Meta-Reinforcement Learning with Self-Reflection for Agentic Search
▪️ Agentic Critical Training
▪️ Video-Based Reward Modeling for Computer-Use Agents
▪️ AutoResearch-RL
▪️ Neural Thickets
▪️ Training Language Models via Neural Cellular Automata
▪️ The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training
▪️ Lost in Backpropagation: The LM Head is a Gradient Bottleneck
▪️ IndexCache
▪️ Attention Residuals
▪️ REMIX: Reinforcement Routing for Mixtures of LoRAs in LLM Finetuning
▪️ Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
▪️ Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
▪️ How Far Can Unsupervised RLVR Scale LLM Training?
▪️ Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
▪️ Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs
▪️ Scale Space Diffusion
Find the full list and the main AI news and updates from NVIDIA GTC here: turingpost.com/p/fod144

118

552

26.9K
