Ksenia_TuringPost

18.7K posts

@TheTuringPost

Newsletter exploring AI & ML – AI 101, Agentic Workflow, Business insights. From ML history to AI trends. Led by @kseniase_. Know what you are talking about👇🏼

Join over 102,000 readers · Joined June 2020
11.4K Following · 82.9K Followers
Ksenia_TuringPost @TheTuringPost ·
A new paper from @ylecun and others – V-JEPA 2.1

It changes the recipe of V-JEPA so the model learns both:
• Global semantics – what is happening in the scene
• Dense spatio-temporal structure – where things are and how they move

The idea is to supervise not just the masked tokens but the visible ones too.

There are 4 key ingredients in V-JEPA 2.1:
- Dense prediction loss on both masked and visible tokens
- Deep self-supervision across intermediate layers
- Modality-specific tokenizers (2D for images, 3D for videos) within a shared encoder
- Model + data scaling

The workflow becomes: masked image/video → encode visible tokens → predict latent representations for both masked and visible tokens → supervise at multiple layers

Here are the details:
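The "supervise both masked and visible tokens, at multiple layers" idea can be sketched as a loss function. This is a toy numpy sketch under my own assumptions — the shapes, names, and the squared-error choice are illustrative, not the paper's actual loss or predictor:

```python
import numpy as np

def dense_jepa_loss(student_layers, teacher_layers, mask):
    """Toy dense JEPA-style loss: supervise BOTH masked and visible tokens,
    averaged over several intermediate layers (deep self-supervision).

    student_layers / teacher_layers: lists of (N, D) latent maps per layer
    mask: (N,) bool, True where the token was hidden from the student
    """
    total = 0.0
    for s, t in zip(student_layers, teacher_layers):
        per_token = np.mean((s - t) ** 2, axis=-1)  # (N,) regression in latent space
        masked_term = per_token[mask].mean()    # classic JEPA: predict hidden content
        visible_term = per_token[~mask].mean()  # the 2.1 twist: supervise visible tokens too
        total += masked_term + visible_term
    return total / len(student_layers)
```

Summing one term per intermediate layer is what "deep self-supervision" amounts to here: every layer, not just the last, gets a target to match.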
Ksenia_TuringPost @TheTuringPost ·
NemoClaw – NVIDIA’s contribution to the emerging OpenClaw ecosystem and one of the biggest announcements at NVIDIA GTC

It's a framework for long-running autonomous agents.

▪️ The idea: install OpenClaw together with Nemotron models and OpenShell (NVIDIA’s new security runtime) in a single command.

NemoClaw gives agents a sandboxed execution environment that:
- runs OpenClaw inside a secure container – OpenShell
- enforces policies on network, filesystem, and processes
- routes all model calls via NVIDIA cloud
- provides CLI tools to manage agents

In other words, NVIDIA is no longer aiming only to power the model. It wants to sit under the agent itself.
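The policy surface described — network, filesystem, processes — could look something like the sketch below. Every name here (class, fields, methods, the allowed host) is my own illustration, not NemoClaw's or OpenShell's real API:

```python
from dataclasses import dataclass

@dataclass
class SandboxPolicy:
    """Hypothetical policy object covering the three surfaces listed above.
    Field names and defaults are illustrative only."""
    allowed_hosts: frozenset = frozenset({"api.nvidia.com"})  # routed model calls
    writable_roots: tuple = ("/workspace",)                   # filesystem policy
    allow_subprocess: bool = False                            # process policy

    def permits_network(self, host: str) -> bool:
        return host in self.allowed_hosts

    def permits_write(self, path: str) -> bool:
        return any(path.startswith(root) for root in self.writable_roots)
```

The deny-by-default shape (everything blocked unless listed) is the standard design for agent sandboxes, since a long-running agent's tool calls are not known in advance.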
Ksenia_TuringPost @TheTuringPost ·
OpenViking – filesystem memory for AI agents

It gives agents a structured, navigable context system that:
- replaces flat vector storage with a filesystem (viking://)
- unifies memory, resources, and skills
- loads context in layers (L0/L1/L2) to save tokens
- retrieves info via directory-aware search (not flat RAG)
- makes retrieval traceable and debuggable

→ So it's a combination of structured navigation + semantic (embedding-based) retrieval.

This approach delivers better retrieval accuracy, up to 80–96% lower token cost, and self-improving memory over time.
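The layered-loading idea can be shown in a few lines: store a cheap gist (L0), a short abstract (L1), and the full content (L2) per path, then serve the richest layer that fits the budget. A minimal sketch under my own assumptions — the paths, layer contents, and function names are hypothetical, not OpenViking's API:

```python
# Hypothetical viking://-style store: one entry, three detail layers.
STORE = {
    "viking://memory/user/prefs": {
        "L0": "user prefers dark mode",               # one-line gist
        "L1": "UI prefs: dark mode, compact layout",  # short abstract
        "L2": "Full note: ... every preference ...",  # full content
    },
}

def load_context(uri, budget_tokens):
    """Return the richest layer that fits a rough budget (1 token ~ 1 word)."""
    entry = STORE[uri]
    for layer in ("L2", "L1", "L0"):  # prefer detail, fall back to summaries
        text = entry[layer]
        if len(text.split()) <= budget_tokens:
            return layer, text
    return "L0", entry["L0"]

def search(prefix):
    """Directory-aware lookup: match on the path hierarchy, not a flat index."""
    return [u for u in STORE if u.startswith(prefix)]
```

This is where the token savings come from: a tight budget gets the L0 gist, and the agent only pays for L2 when it actually descends into that path.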
Ksenia_TuringPost reposted
Ksenia_TuringPost @TheTuringPost ·
NVIDIA's Nemotron 3 is an architectural response to two pressures:
- Long-context cost as agentic interactions scale
- Repeated reasoning cost from invoking full models for small subtasks

Nemotron 3 proposes several design decisions to solve this:
▪️ Hybrid architecture: Transformer + Mamba 2 layers for efficient long-context processing
▪️ Mixture-of-Experts (MoE) and LatentMoE on top of it to get cheaper experts
▪️ Multi-token prediction
▪️ NVFP4 precision (4.75 bits), used for inference and pre-training, allowing the Nemotron pre-training dataset to achieve up to 4× faster convergence than standard open web datasets

This is all about one key idea – "Acceleration is intelligence"

Here is the tech stack explained, and what the Nemotron Coalition is – NVIDIA has just announced that this alliance of leading players like Cursor, Mistral, Black Forest Labs, etc., is gathering to develop the Nemotron family of open models → turingpost.com/p/nemotroncoal…
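Why "4.75 bits" rather than exactly 4: FP4 stores each value in 4 bits, but block-scaled formats like NVFP4 attach a higher-precision scale to each small block of values, adding a fractional bit per element. A toy round-trip sketch under my own assumptions (grid and block handling are illustrative, not NVIDIA's kernels):

```python
import numpy as np

# FP4 E2M1 representable magnitudes (plus sign): 0, 0.5, 1, 1.5, 2, 3, 4, 6.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def nvfp4_roundtrip(x, block=16):
    """Quantize to the FP4 grid with a per-block scale, then dequantize."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    for i in range(0, len(x), block):
        chunk = x[i:i + block]
        scale = np.abs(chunk).max() / FP4_GRID[-1]  # map the block max onto ±6
        if scale == 0.0:
            scale = 1.0
        scaled = np.abs(chunk) / scale
        nearest = np.abs(scaled[:, None] - FP4_GRID[None, :]).argmin(axis=1)
        out[i:i + block] = np.sign(chunk) * FP4_GRID[nearest] * scale
    return out
```

The per-block scale is what keeps 4-bit values usable for training: each block's dynamic range is renormalized before snapping to the tiny grid.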
Ksenia_TuringPost @TheTuringPost ·
8. So V-JEPA 2.1 looks strong across both prediction and dense visual understanding (even with the encoder kept frozen)

Some of the results:
• +20% robot grasping success over V-JEPA 2 in zero-shot real-world manipulation
• 10× faster navigation planning, with 5.687 ATE on Tartan Drive

And new SOTA:
• 7.71 mAP on Ego4D short-term object interaction anticipation
• 40.8 Recall@5 on EPIC-KITCHENS action anticipation
Ksenia_TuringPost reposted
Ksenia_TuringPost @TheTuringPost ·
It was a busy week @NVIDIAGTC! Celebrating my birthday on the road 🎉
Ksenia_TuringPost @TheTuringPost ·
As I see from the docs, yes, it is hierarchical, and the current retrieval strategy is branch-local, which does risk orphaning context spread across branches. Would love to hear your thoughts, @jerryjliu0, @hwchase17, and @simonw – does it really work like this, and is it worth addressing?
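The orphaning risk being discussed is easy to demonstrate: a query scoped to one branch never sees relevant entries filed under a sibling branch. A toy sketch — the paths, contents, and helper names are hypothetical, not OpenViking's API:

```python
# Two related notes filed under different top-level branches.
FS = {
    "viking://projects/site/auth-notes": "login uses the OAuth device flow",
    "viking://memory/decisions/auth": "we picked OAuth after the 2024 review",
}

def branch_local_search(branch, keyword):
    """Retrieval as described: only entries under the current branch."""
    return [p for p, text in FS.items() if p.startswith(branch) and keyword in text]

def cross_branch_search(keyword):
    """What a cross-branch pass would add: scan every branch."""
    return sorted(p for p, text in FS.items() if keyword in text)
```

Here a branch-local query under viking://projects finds one OAuth note and silently misses the decision record under viking://memory.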
Hans C Nelson 🗽 @HansCNelson ·
From what I understand, the viking fs is purely hierarchical with no cross-folder referencing, correct? Does that lead to orphaned retrieval when relevant context is spread across multiple hierarchical branches? Is the lack of cross-branch memory retrieval a problem worth addressing?
Ksenia_TuringPost @TheTuringPost ·
@tricalt It’s technically open (NVIDIA says agents like Claude and Codex can run). But NemoClaw routes inference through NVIDIA Cloud and defaults to Nemotron, so it will probably favor their own models.
Vasilije @tricalt ·
@TheTuringPost Will it be open to any model? Or are they going to push for the ones they work closely with?
Ksenia_TuringPost reposted
Ksenia_TuringPost @TheTuringPost ·
Must-read AI research of the week:
▪️ OpenClaw-RL
▪️ Meta-Reinforcement Learning with Self-Reflection for Agentic Search
▪️ Agentic Critical Training
▪️ Video-Based Reward Modeling for Computer-Use Agents
▪️ AutoResearch-RL
▪️ Neural Thickets
▪️ Training Language Models via Neural Cellular Automata
▪️ The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training
▪️ Lost in Backpropagation: The LM Head is a Gradient Bottleneck
▪️ IndexCache
▪️ Attention Residuals
▪️ REMIX: Reinforcement Routing for Mixtures of LoRAs in LLM Finetuning
▪️ Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
▪️ Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
▪️ How Far Can Unsupervised RLVR Scale LLM Training?
▪️ Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
▪️ Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs
▪️ Scale Space Diffusion

Find the full list and the main AI news and updates from NVIDIA GTC here: turingpost.com/p/fod144