Imama

634 posts

@Caffeinix_alche

GPU-heavy thoughts at 1024 threads, token-deep focus, and a coffee devotee.

404👾 🌎 ☁️ Joined September 2021

1.2K Following, 141 Followers

Imama@Caffeinix_alche·13m

Aww cute ....

Vikram@Vikram10729557

His laugh was very cute and real😂❤

Imama@Caffeinix_alche·3h

@cwolferesearch @ManavGarkel Valid point on the logging approach! Also, do you find that evaluating the reasoning quality at each handoff changes how you structure your transcripts? Asking because Superbryn are tackling agent observability from that angle. Curious how the community is thinking about this.

Cameron R. Wolfe, Ph.D.@cwolferesearch·6h

@ManavGarkel you need to log details of all agent actions / handoffs and then you can add checks to the transcript to verify aspects of how handoffs / problem solving occurred.
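A minimal sketch of the idea in the reply above: log every agent action and handoff into a transcript, then run checks over that transcript. The `Event` and `Transcript` classes, the agent names, and both check functions are hypothetical illustrations, not part of any particular framework or of the guide being discussed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    agent: str                    # agent that produced this event
    kind: str                     # "action" or "handoff"
    detail: str                   # free-text description of what happened
    target: Optional[str] = None  # receiving agent (handoffs only)


@dataclass
class Transcript:
    events: List[Event] = field(default_factory=list)

    def log(self, agent: str, kind: str, detail: str,
            target: Optional[str] = None) -> None:
        self.events.append(Event(agent, kind, detail, target))


def handoffs_have_context(t: Transcript) -> bool:
    """Every handoff names a target and carries a non-empty detail."""
    return all(
        bool(e.target) and bool(e.detail.strip())
        for e in t.events
        if e.kind == "handoff"
    )


def receiver_acts_after_handoff(t: Transcript) -> bool:
    """After each handoff, the next logged event comes from the receiver."""
    for i, e in enumerate(t.events):
        if e.kind != "handoff":
            continue
        if i + 1 >= len(t.events) or t.events[i + 1].agent != e.target:
            return False
    return True


# Hypothetical two-agent run.
t = Transcript()
t.log("planner", "action", "split task into search + summarize")
t.log("planner", "handoff", "query: agent eval benchmarks", target="searcher")
t.log("searcher", "action", "ran search, kept 3 results")

print(handoffs_have_context(t), receiver_acts_after_handoff(t))
```

Checks like these only verify the structure of a handoff. Judging the reasoning quality at each step would need an additional grader on top of the same transcript.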

Cameron R. Wolfe, Ph.D.@cwolferesearch·18h

I just published a detailed guide on evaluating agents. It covers: 1. Agent fundamentals (everything from basic concepts to complex ideas like multi-agent systems). 2. Common evaluation patterns / frameworks observed in practice. 3. Case studies of popular agent benchmarks (e.g., Tau-Bench and Terminal-Bench series). Building high-quality evaluation capabilities is now more important than ever due to the growing adoption of agents in high-stakes applications like coding and medicine. Although evaluation is time-consuming and difficult, learning how to properly evaluate agents is incredibly valuable. Rigorously measuring performance and not relying on anecdotal checks allows us to rapidly improve agent capabilities.

Imama@Caffeinix_alche·4d

Interesting: LLMs rediscovered Fourier analysis to do arithmetic. Numbers are encoded as frequencies on a circle, and addition = phase shift. It's not memorizing 6+7, it's interfering wave patterns until the answer emerges. What other frameworks is it running that we haven't discovered yet?

Goodfire@GoodfireAI

Neural networks do math by rotating shapes. We found a shape-rotating calculator hidden inside an LLM – and it’s used for more than just math! (1/6)
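A toy illustration of "addition = phase shift", assuming a single frequency and a hypothetical modulus `P`: each number becomes a point on the unit circle, multiplying two points adds their angles, and the sum is read back from the resulting angle. This only sketches the arithmetic idea in the post above. It is not the mechanism reported in the quoted thread.

```python
import cmath
import math

P = 100  # hypothetical modulus: numbers live on a circle with P positions


def encode(n: int, p: int = P) -> complex:
    """Map n to a point on the unit circle with angle 2*pi*n/p."""
    return cmath.exp(2j * math.pi * n / p)


def decode(z: complex, p: int = P) -> int:
    """Read the nearest circle position back from the angle of z."""
    angle = cmath.phase(z) % (2 * math.pi)
    return round(angle * p / (2 * math.pi)) % p


def add_mod(a: int, b: int, p: int = P) -> int:
    """Add by rotation: multiplying unit complex numbers adds their phases."""
    return decode(encode(a, p) * encode(b, p), p)


print(add_mod(6, 7))    # 13
print(add_mod(60, 70))  # 30, because the sum wraps around the circle
```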

Imama@Caffeinix_alche·4d

@kalamazooooo @aigrantsindia @smallest_AI @nvidia @jarvislabsai @0xBosky @nova_residency Interesting 👀 any specific domains/projects you’re hoping people build around or is it fully open-ended chaos and creativity? The “Uber’s on us” part is actually wild 😂

Manasa Kalaimalai@kalamazooooo·4d

nova's 1st partner just dropped
@aigrantsindia is covering:
- voice ai credits @smallest_AI
- 90+ open models @nvidia
- 5000 GPU hrs @jarvislabsai
- everyone's ride to the hacker house covered
yes. your uber is on us.
thanks @0xBosky :D

Imama@Caffeinix_alche·5d

@PereMartra @ManningBooks Really cool result. Reminds me of NVIDIA's Minitron approach: prune low-importance blocks first, then distill the capability back in. You're compressing redundancy, not intelligence. Has anyone tried pushing this further with layer-sharing before the distillation step?

pere martra@PereMartra·5d

A Gemma model with 4 fewer Transformer blocks achieves better PPL and Winogrande than the original. That's what pruning + Knowledge Distillation achieve together. Chapters 5 & 6 of Rearchitecting LLMs dropped on @ManningBooks . hubs.la/Q040twqq0 #LLMs #MachineLearning

Imama@Caffeinix_alche·5d

@fshadowbuilds Interesting idea. Let's connect.

Kaivan Shah@fshadowbuilds·11 May

Got an idea? Find your co-founder. Create your project and get matched with founders who align on equity, commitment, and vision. Takes 2 minutes.

Imama@Caffeinix_alche·8 May

@EliyaHabba Interesting work. Benchmarks are starting to look like distributed systems 😅 You can’t recompute the whole state every time something new arrives. Using anchor tasks + calibration feels a lot like API versioning: evolve the system without breaking comparability.

Eliya Habba@EliyaHabba·7 May

New datasets keep coming, New models keep coming. Frustrating! How can we evaluate everything on everything? How do we keep scores comparable over time? We propose a way to grow benchmark suites without losing comparability. Details:👇🧵

Imama@Caffeinix_alche·8 May

@nikkithashanker @FarzaTV @heyclicky Feels a lot like the shift from command lines to GUIs 😀 Once the interface becomes more natural, adoption becomes easier. For voice agents, latency is the new loading screen and reliability is the new UX.

Nikkitha Shanker@nikkithashanker·8 May

Mark my words, voice will become the new interface for software We are already seeing this with @FarzaTV 's @heyclicky. Even companies like Zillow are building agents to help people buy homes Every interface shift needs two things: 1. Lower cost. 2. Higher reliability.

OpenAI@OpenAI

Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents. Voice agents are now real-time collaborators that can listen, reason, and solve complex problems as conversations unfold. Now available in the API alongside streaming models GPT-Realtime-Translate and GPT-Realtime-Whisper — a new set of audio capabilities for the next generation of voice interfaces.

Imama@Caffeinix_alche·7 May

past few weeks taught me rest isn't a reward at the end of an episode, it's part of the policy 🧠 survived on minimal sleep 💀 laptop stays closed this weekend. one day left and today already feels like freedom fr (and yes, my jokes are RL-brained now 🤖)

Imama@Caffeinix_alche·3 May

@miniapeur

[GIF]

Mathieu@miniapeur·3 May

It is NeurIPS deadline in 4 days so yeah it is not possible to take a day off.

The PhD Place@ThePhDPlace

Do NOT feel guilty for taking a day off at the weekend

Imama@Caffeinix_alche·1 May

@sarahookr @adaption_ai Hey, this sounds super exciting! The idea of shaping adaptive intelligence + environments is 🔥 What you’re building at Adaption AI feels really fresh. Curious: what kind of problems or research directions are you most focused on hiring for right now?

Sara Hooker@sarahookr·30 Apr

Stop inheriting intelligence. Shape it. Join us to build the next era of intelligence @adaption_ai Hiring across research and engineering. Very exciting time to join. We cooking something very special for upcoming releases bridging adaptive intelligence + environments.

Imama@Caffeinix_alche·1 May

never knew writing LaTeX would be this frustrating… one small change and suddenly the whole doc decides to break 😭

Imama@Caffeinix_alche·29 Apr

@YashikaChugh4 Damn the keyboard looks amazing 🤩 I am already sold @c_engines if this is the vibe

yashika@YashikaChugh4·29 Apr

i think the best part about being in office has been the gadgets
> colorful keyboard
> mini gaming laptop from an alley in SG
> 3D printer in motion

Imama@Caffeinix_alche·29 Apr

@maharshii This is more like you summoned an ancient GPU deity. At this point if you whisper “wgmma” three times into your terminal Jensen Huang probably appears in a leather jacket and asks about your memory bandwidth. Go hydrate, your registers are spilling 🫠

maharshi@maharshii·29 Apr

triton, gluon, cutedsl, hopper, blackwell, tensorcores, layouts, composition, local_tile, partitionS, partitionD, wgmma, tcgen05, TMA, block scaling, coalesced access, ampere, ada lovelace, cutlass, cublas, cudnn, flash attention, gemm, sgemm, fp16, bf16, mxfp8, nvfp4, int4, quantization, mixed precision, occupancy, reductions, warp divergence, bank conflicts, memory coalescing, shared memory, global memory, texture memory, constant memory, unified memory, epilogues, kernel fusion, graph optimization, tensorrt, torch compile, dynamo, inductor, graph capture, thread blocks, warps, SIMT, streaming multiprocessors, L1 cache, L2 cache, register spilling, thread divergence, memory bandwidth, compute capability, CUDA cores, ldg, stg, ncu, nsys, atomic operations, syncthreads, cooperative groups, dynamic parallelism, persistent kernels, vectorized loads, static quantization, tensors, swizzling, predication, instruction throughput, memory latency hiding...

Imama@Caffeinix_alche·28 Apr

@abhisheknaironx Haha lowkey smart marketing 😄 people might look it up and realize it’s something else entirely.

Abhishek Nair@abhisheknaironx·28 Apr

my frnd saw the pink wispr flow autos in Koramangala and was convinced that it's a sanitary pads company 🙂 I didn't take any photos, so here's a stolen one.

Imama@Caffeinix_alche·28 Apr

@DalRotiForLife Me too, still waiting for this 😭😭 Never would have thought I'd wait this desperately in Blr

Kanika@DalRotiForLife·28 Apr

can’t wait for the first shower of the year in blr, and then rushing to nearest thindi for hot dosae & kapi🤤🤤🤤

Imama@Caffeinix_alche·28 Apr

@Ramneet_Singhh Try this: passing an empty TORCH_CUDA_ARCH_LIST makes vLLM auto-detect your current GPU and build only for that arch, which kills 60%+ of nvcc work. VLLM_USE_PRECOMPILED pulls pre-built kernel wheels. sccache over ccache. GPU-accelerated compilation of GPU code remains a cursed problem 💀

Ramneet Singh@Ramneet_Singhh·27 Apr

VLLM build from source is reminding me of the nightmare that is LLVM build from source. GPU-accelerated compilation please, anyone?
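A rough sketch of how the build tips in the reply above could be wired up, under stated assumptions. `vllm_build_plan` is a hypothetical helper that only prepares an environment and a command, and never runs the build. `VLLM_USE_PRECOMPILED=1` with an editable install is the precompiled path described in vLLM's installation docs. The claim that an empty `TORCH_CUDA_ARCH_LIST` makes the build target only the detected GPU comes from the reply and is not verified here.

```python
import os
from typing import Dict, List, Optional, Tuple


def vllm_build_plan(
    precompiled: bool,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (env, command) for a vLLM source install without running it."""
    env = dict(os.environ if base_env is None else base_env)
    if precompiled:
        # Reuse pre-built kernels instead of compiling them locally.
        env["VLLM_USE_PRECOMPILED"] = "1"
    else:
        # Assumption taken from the reply, unverified: an empty arch list
        # limits the build to the GPU detected on this machine.
        env["TORCH_CUDA_ARCH_LIST"] = ""
    cmd = ["pip", "install", "--editable", "."]
    return env, cmd


env, cmd = vllm_build_plan(precompiled=True, base_env={})
print(env, " ".join(cmd))
```

From a vLLM source checkout, the pair could then be passed to `subprocess.run(cmd, env=env, check=True)`. Compiler caching with sccache or ccache is configured separately and is left out of this sketch.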

Imama@Caffeinix_alche·27 Apr

@cheshta_rajora @fmrbangalore @BangaloreRoomi @GruhamBot @FlatsnFlatmates @Flashmateshq Hey! I know a place that should fit your budget pretty well and is in a decent location too. DM me I’ll share the details 👍

Cheshta@cheshta_rajora·27 Apr

Hello Bengaluru! Looking for a 1/2bhk for myself to move in from June 1st. Budget up to ₹17k. Preferred within a 6–7 km radius of Lalbagh. Slightly further if metro is walking distance. Please DM with leads. @fmrbangalore @BangaloreRoomi @GruhamBot @FlatsnFlatmates @Flashmateshq

Imama@Caffeinix_alche·26 Apr

@nikkithashanker @NotTheCh05en1 @Parthjain_01 Looks like a casual ‘weekend chill’ turned into a full blown brainstorming session 🫡 Respect the hustle, but also… does the whiteboard at least get weekends off? 🙃
