Imama

634 posts

@Caffeinix_alche

Gpu heavy thoughts at 1024 threads, token-deep focus and a coffee devotee.

404👾 🌎 ☁️ Joined September 2021
1.2K Following 141 Followers
Imama
Imama@Caffeinix_alche·
@cwolferesearch @ManavGarkel Valid point on the logging approach! Also, do you find that evaluating the reasoning quality at each handoff changes how you structure your transcripts? Asking because Superbryn is tackling agent observability from that angle. Curious how the community is thinking about this.
English
0
0
0
3
Cameron R. Wolfe, Ph.D.
Cameron R. Wolfe, Ph.D.@cwolferesearch·
@ManavGarkel you need to log details of all agent actions / handoffs and then you can add checks to the transcript to verify aspects of how handoffs / problem solving occurred.
English
1
0
1
313
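A minimal sketch of the idea in the reply above: log every agent action and handoff to a transcript, then run checks over that transcript. The `Event` and `Transcript` classes, the `check_handoffs` function, the agent names, and the specific checks (allowed routes, a stated reason) are all hypothetical choices for illustration, not anything described in the thread or the linked guide.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass
class Event:
    """One logged step: an agent action or a handoff to another agent."""
    step: int
    agent: str
    kind: str  # "action" or "handoff" (hypothetical labels)
    detail: dict[str, Any] = field(default_factory=dict)


class Transcript:
    """Append-only log of everything the agents did."""

    def __init__(self) -> None:
        self.events: list[Event] = []

    def log(self, agent: str, kind: str, **detail: Any) -> Event:
        event = Event(step=len(self.events), agent=agent, kind=kind, detail=detail)
        self.events.append(event)
        return event

    def to_jsonl(self) -> str:
        # One JSON object per line, convenient for later offline checks.
        return "\n".join(json.dumps(asdict(e)) for e in self.events)


def check_handoffs(transcript: Transcript, allowed: set[tuple[str, str]]) -> list[str]:
    """Return a list of problems found in the logged handoffs."""
    problems = []
    for e in transcript.events:
        if e.kind != "handoff":
            continue
        target = e.detail.get("to")
        if (e.agent, target) not in allowed:
            problems.append(f"step {e.step}: unexpected handoff {e.agent} -> {target}")
        if not e.detail.get("reason"):
            problems.append(f"step {e.step}: handoff without a stated reason")
    return problems


# Example run with made-up agents and events.
transcript = Transcript()
transcript.log("planner", "action", tool="search", query="refund policy")
transcript.log("planner", "handoff", to="billing", reason="needs account lookup")
transcript.log("billing", "handoff", to="planner")  # not allowed, no reason given

problems = check_handoffs(transcript, allowed={("planner", "billing")})
for p in problems:
    print(p)
```

Keeping the log as plain structured events means new checks can be added later and replayed over old transcripts without rerunning the agents.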
Cameron R. Wolfe, Ph.D.
Cameron R. Wolfe, Ph.D.@cwolferesearch·
I just published a detailed guide on evaluating agents. It covers: 1. Agent fundamentals (everything from basic concepts to complex ideas like multi-agent systems). 2. Common evaluation patterns / frameworks observed in practice. 3. Case studies of popular agent benchmarks (e.g., Tau-Bench and Terminal-Bench series). Building high-quality evaluation capabilities is now more important than ever due to the growing adoption of agents in high-stakes applications like coding and medicine. Although evaluation is time-consuming and difficult, learning how to properly evaluate agents is incredibly valuable. Rigorously measuring performance and not relying on anecdotal checks allows us to rapidly improve agent capabilities.
English
9
45
348
36.6K
Imama
Imama@Caffeinix_alche·
@PereMartra @ManningBooks Really cool result. Reminds me of NVIDIA's Minitron approach: prune low-importance blocks first, then distill the capability back in. You're compressing redundancy, not intelligence. Has anyone tried pushing this further with layer-sharing before the distillation step?
English
1
0
1
48
Kaivan Shah
Kaivan Shah@fshadowbuilds·
Got an idea? Find your co-founder. Create your project and get matched with founders who align on equity, commitment, and vision. Takes 2 minutes.
English
41
22
469
787K
Imama
Imama@Caffeinix_alche·
@EliyaHabba Interesting work. Benchmarks are starting to look like distributed systems 😅 You can’t recompute the whole state every time something new arrives. Using anchor tasks + calibration feels a lot like versioning an API: evolve the system without breaking comparability.
English
1
0
0
22
Eliya Habba
Eliya Habba@EliyaHabba·
New datasets keep coming, New models keep coming. Frustrating! How can we evaluate everything on everything? How do we keep scores comparable over time? We propose a way to grow benchmark suites without losing comparability. Details:👇🧵
English
3
12
38
2K
Imama
Imama@Caffeinix_alche·
@nikkithashanker @FarzaTV @heyclicky Feels a lot like the shift from command lines to GUIs 😀 Once the interface becomes more natural, adoption becomes easier. For voice agents, latency is the new loading screen and reliability is the new UX.
English
1
0
1
31
Imama
Imama@Caffeinix_alche·
past few weeks taught me rest isn't a reward at the end of an episode it's part of the policy 🧠..survived on minimal sleep💀 laptop stays closed this weekend. one day left and today already feels like freedom fr (and yes my jokes are RL brained now..🤖)
English
0
0
3
28
Imama
Imama@Caffeinix_alche·
@sarahookr @adaption_ai hey, this sounds super exciting. The idea of shaping adaptive intelligence + environments is 🔥 What you’re building at Adaption AI feels really fresh. Curious... what kind of problems or research directions are you most focused on hiring for right now?
English
0
0
0
89
Sara Hooker
Sara Hooker@sarahookr·
Stop inheriting intelligence. Shape it. Join us to build the next era of intelligence @adaption_ai Hiring across research and engineering. Very exciting time to join. We cooking something very special for upcoming releases bridging adaptive intelligence + environments.
English
21
24
330
23.5K
Imama
Imama@Caffeinix_alche·
never knew writing LaTeX would be this frustrating… one small change and suddenly the whole doc decides to break 😭
English
1
0
5
88
yashika
yashika@YashikaChugh4·
i think the best part about being in office has been the gadgets > colorful keyboard > mini gaming laptop from an alley in SG > 3D printer in motion
English
24
1
175
8.6K
Imama
Imama@Caffeinix_alche·
@maharshii This is more like you summoned an ancient GPU deity. At this point if you whisper “wgmma” three times into your terminal Jensen Huang probably appears in a leather jacket and asks about your memory bandwidth. Go hydrate, your registers are spilling 🫠
English
0
0
3
208
maharshi
maharshi@maharshii·
triton, gluon, cutedsl, hopper, blackwell, tensorcores, layouts, composition, local_tile, partitionS, partitionD, wgmma, tcgen05, TMA, block scaling, coalesced access, ampere, ada lovelace, cutlass, cublas, cudnn, flash attention, gemm, sgemm, fp16, bf16, mxfp8, nvfp4, int4, quantization, mixed precision, occupancy, reductions, warp divergence, bank conflicts, memory coalescing, shared memory, global memory, texture memory, constant memory, unified memory, epilogues, kernel fusion, graph optimization, tensorrt, torch compile, dynamo, inductor, graph capture, thread blocks, warps, SIMT, streaming multiprocessors, L1 cache, L2 cache, register spilling, thread divergence, memory bandwidth, compute capability, CUDA cores, ldg, stg, ncu, nsys, atomic operations, syncthreads, cooperative groups, dynamic parallelism, persistent kernels, vectorized loads, static quantization, tensors, swizzling, predication, instruction throughput, memory latency hiding...
English
20
15
439
15K
Imama
Imama@Caffeinix_alche·
@abhisheknaironx Haha lowkey smart marketing 😄 people might look it up and realize it’s something else entirely.
English
0
0
0
100
Abhishek Nair
Abhishek Nair@abhisheknaironx·
my frnd saw the pink wispr flow autos in Koramangala and was convinced that it's a sanitary pads company 🙂 I didn't take any photos, so here's a stolen one.
English
63
29
1.2K
81.7K
Imama
Imama@Caffeinix_alche·
@DalRotiForLife Me too, still waiting for this 😭😭 Never would have thought that I'd wait this desperately in Blr
English
0
0
0
112
Kanika
Kanika@DalRotiForLife·
can’t wait for the first shower of the year in blr, and then rushing to nearest thindi for hot dosae & kapi🤤🤤🤤
English
15
2
166
7.3K
Imama
Imama@Caffeinix_alche·
@Ramneet_Singhh Try this: passing an empty TORCH_CUDA_ARCH_LIST makes vLLM auto-detect your current GPU and build only for that arch, which kills 60%+ of nvcc work. VLLM_USE_PRECOMPILED pulls pre-built kernel wheels. Use sccache over ccache. GPU-accelerated compilation of GPU code remains a cursed problem💀
English
0
0
1
38
Ramneet Singh
Ramneet Singh@Ramneet_Singhh·
VLLM build from source is reminding me of the nightmare that is LLVM build from source. GPU-accelerated compilation please, anyone?
English
2
0
4
314
Imama
Imama@Caffeinix_alche·
@nikkithashanker @NotTheCh05en1 @Parthjain_01 Looks like a casual ‘weekend chill’ turned into a full blown brainstorming session 🫡 Respect the hustle, but also… does the whiteboard at least get weekends off? 🙃
English
0
0
0
37
Nikkitha Shanker
Nikkitha Shanker@nikkithashanker·
The urge to write a LinkedIn style post about working on weekends is always there, on X
English
2
2
9
166
Ritu Joon
Ritu Joon@ritujoon2j·
I'm getting bored with this now. Suggest other instant coffees besides this.
English
230
10
335
227.8K