Ankit Khandelwal

1.6K posts

@ankk98

AI x Robotics x Data • Founder https://t.co/TSPpuQCR5P • Ex-Founder @jiitodc • Writing at https://t.co/KeCSVzC46r

Bangalore, India · Joined October 2017
1.1K Following · 150 Followers
Ankit Khandelwal@ankk98·
@paulg Bullet points are great. They naturally provide logical breaks. I don't think optimising for whitespace provides any benefits.
Paul Graham@paulg·
The fact that AIs tend to answer you in bulleted lists tells us something important, though somewhat depressing: people can't read. They don't do this by accident. What you're seeing is an implicit portrait of the median user.
Harveen Singh Chadha@HarveenChadha·
all models suck at long-form writing. how tf are people writing books with LLMs? even to get a first draft, 10k-15k words is the practical limit
Ankit Khandelwal@ankk98·
@lossfunk Whenever I see a post starting with the word 'Shocking', my mind says it's clickbait.
Lossfunk@lossfunk·
🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵
Luke Metro@luke_metro·
Nvidia GTC 2026:
[image]
Eddy Xu@eddybuild·
today, we’re open sourcing the largest egocentric dataset in history.
- 10,000 hours
- 2,153 factory workers
- 1,080,000,000 frames
the era of data scaling in robotics is here. (thread)
Rohan Paul@rohanpaul_ai·
Egocentric-10K, the largest egocentric dataset from Build AI, just dropped on @huggingface: a real factory first-person video dataset with 10,000 hours, 1.08B frames, and 2,138 workers, licensed Apache 2.0 and hosted on Hugging Face.

Egocentric here means the camera is on a worker’s head, so the model sees hands, tools, and objects from the doer’s view, which is the right signal for learning manipulation and step-by-step tasks.

Prior large egocentric sets like Ego4D had around 3,000+ hours in daily-life scenes rather than production lines, so this is a big jump in both scale and domain specificity for factories. The dataset emphasizes high hand visibility and dense manipulation, which are exactly the pixels imitation learning and visuomotor policies need to map observations to actions.

Clips are 1080p at 30 fps, organized as WebDataset shards with paired JSON metadata, totaling 16.4 TB, and they can be streamed directly via Hugging Face for training without full downloads. An evaluation split with 30,000 frames is also provided to benchmark hand and object interaction tasks in this domain.
[image]
Quoting Eddy Xu @eddybuild’s dataset announcement above.
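The "WebDataset shards with paired JSON metadata" layout described above can be sketched with the standard library alone: a shard is a tar file in which the files of one sample share a basename (e.g. a clip plus its JSON sidecar). The file names and metadata fields below are made up for illustration; in practice you would stream real shards with the `webdataset` package or `datasets` with `streaming=True` rather than reading tars by hand.

```python
import io
import json
import os
import tarfile
import tempfile
from itertools import groupby

def iter_samples(tar_path):
    """Yield samples from a WebDataset-style tar shard.

    WebDataset groups the files of one sample by shared basename:
    clip_000001.mp4 + clip_000001.json form a single sample.
    """
    with tarfile.open(tar_path) as tar:
        members = sorted(tar.getmembers(), key=lambda m: m.name)
        for key, group in groupby(members, key=lambda m: m.name.rsplit(".", 1)[0]):
            sample = {"__key__": key}
            for m in group:
                ext = m.name.rsplit(".", 1)[1]
                sample[ext] = tar.extractfile(m).read()
            yield sample

def make_demo_shard(path):
    """Write a tiny demo shard (hypothetical field names, fake video bytes)."""
    with tarfile.open(path, "w") as tar:
        for key in ["clip_000001", "clip_000002"]:
            meta = json.dumps({"worker_id": 7, "fps": 30}).encode()
            for name, payload in [(key + ".json", meta), (key + ".mp4", b"\x00fake")]:
                info = tarfile.TarInfo(name)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))

shard = os.path.join(tempfile.gettempdir(), "demo-shard.tar")
make_demo_shard(shard)
samples = list(iter_samples(shard))
print(samples[0]["__key__"])           # clip_000001
print(json.loads(samples[0]["json"]))  # {'worker_id': 7, 'fps': 30}
```

Because tar members can be read sequentially, the same pairing logic is what lets training jobs stream shards over HTTP instead of downloading all 16.4 TB first.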
Eddy Xu@eddybuild·
today, build ai is open sourcing the largest egocentric dataset in the world
- 100,000 hours
- 14,228 workers
- 10,800,000,000 frames
each pixel is a unique video (try below)
Aritra 🤗@ariG23498·
When you run a @PyTorch model on a GPU, the actual work is executed through kernels. These are low-level, hardware-specific functions designed for GPUs (or other accelerators). If you profile a model, you'll see a sequence of kernel launches. Between these launches, the GPU can sit idle, waiting for the next operation. A key optimization goal is therefore to minimize the gaps between kernel executions and keep the GPU fully utilized.

One common approach is `torch.compile`, which fuses multiple operations into fewer kernels, reducing overhead and improving utilization. Another approach is to write custom kernels tailored to specific workloads (e.g., optimized attention or fused ops). However, this comes with significant challenges:
> requires deep expertise in kernel writing
> installation hell
> integration with the model is non-trivial

To address this, @huggingface introduces the `kernels` library. With it one can:
> build custom kernels (with the help of a template)
> upload them to the Hub (like models or datasets)
> integrate them into models with ease

Let's take a look at how the transformers team uses the kernels library to integrate it into already existing models. (more in the thread)
Ankit Khandelwal@ankk98·
Introducing **Kriya-Egocentric-100K**: Action100M-style, fully automatic video action annotations for a 5-video preview of Build AI’s Egocentric-100K dataset on Hugging Face. First-person manual-labor videos (head-mounted fisheye camera) with hierarchical temporal segments + LLM-generated captions & GPT summaries. Super promising for video understanding, robotics, world models & physical AI! huggingface.co/datasets/ankk9… #EgocentricVision #ActionRecognition #VideoAI #LLM #HuggingFace #PhysicalAI #Robotics
Ankit Khandelwal@ankk98·
Introducing Kriya-EPIC-KITCHENS 🎥 – Action100M-style, automatic video action annotations for a preview subset of EPIC-KITCHENS-100 on Hugging Face. Results on egocentric kitchen videos look very promising for downstream video understanding. huggingface.co/datasets/ankk9…
VraserX e/acc@VraserX·
V-JEPA 2 is Meta’s new world model. Trained on 1M+ hours of video + robot data, it doesn’t predict pixels but abstract states. That means it learns physics, causality, anticipation. Less guessing frames, more understanding reality.
[image]
AI at Meta@AIatMeta·
Introducing V-JEPA 2, a new world model with state-of-the-art performance in visual understanding and prediction. V-JEPA 2 can enable zero-shot planning in robots—allowing them to plan and execute tasks in unfamiliar environments. Download V-JEPA 2 and read our research paper ➡️ ai.meta.com/vjepa/
[GIF]
Ksenia_TuringPost@TheTuringPost·
A newcomer in the family of world models: V-JEPA 2. By combining 1M+ hours of internet videos and a little bit of robot interaction data, @AIatMeta built an AI that can:
• Watch
• Understand
• Answer questions
• Help robots plan and act in the physical world
V-JEPA 2 shows the true success of self-supervised learning and efficient scaling of everything. Here is how it actually works:
[image]
Quoting AI at Meta @AIatMeta’s V-JEPA 2 announcement above.
elvis@omarsar0·
NEW: Meta releases V-JEPA 2, their new world model! Foundation world models aim to accelerate physical AI, the next frontier. Why is this a big deal? Let's break it down:
[image]
Ankit Khandelwal@ankk98·
This version of the API focuses on replicating the Action100M pipeline. An improved pipeline that builds on it while addressing its shortcomings is coming soon.