Herumb Shandilya 🦀

1.7K posts


@krypticmouse

Research @ScalingIntelLab @HazyResearch | Incoming Research Engineer @mixedbreadai | Building DSRs | MSCS, ColBERT, DSPy @Stanford

Stanford · Joined December 2013
512 Following · 2.4K Followers
Pinned Tweet
Herumb Shandilya 🦀@krypticmouse·
DSRs, @DSPyOSS for Rust, is here🚀 Happy to finally announce the stable release of DSRs. Over the past few months, I’ve been building DSRs with incredible support and contributions from Maguire Papay, @tech_optimist, and @joshmo_dev. A big shout-out to @lateinteraction and @ChenMoneyQ, who were the first people to hear my frequent rants about this!! Couldn't have done this without all of them.

DSRs originally started as a passion project to explore true compilation, and as it progressed I saw it becoming more. I can’t wait to see what the community builds with it.

DSRs is a three-phase project:

1. API stabilization. We are nearly done with this; it was mostly implementing the API design. We kept the DSPy style in mind and stayed close to it so it's easier to onboard, while also trying to make it a bit more idiomatic and intuitive!

2. Performance optimization, with benchmarking vs DSPy. With the API design finalized, we want to benchmark LLM performance against DSPy and improve performance on every front: we'll cut latency and improve the templates and optimizers in DSRs.

3. True module compilation. Why optimize only the signature when you can optimize and fuse much more? That's the idea behind the final phase of DSRs: a true LLM workflow compiler. More on this after Phase 2.

Really grateful to @PrimeIntellect for offering compute to drive the Phase 2 and 3 experimentation! Big shoutout to them and @johannes_hage for this!!!

But what is DSRs? What does it offer? Let's see.
Herumb Shandilya 🦀@krypticmouse·
Some string compaction and unsafe code (might dump this)...
Herumb Shandilya 🦀@krypticmouse·
Kinda amazing what a simple data structure switch can do 🙂
Swayam Singh@swayaminsync·
Developing Benchmarks: A First-Time Parent's Guide
1️⃣ Think through everything that can go wrong during a run (out-of-context, invalid parsing, no-response, etc.) and raise loggable errors for each
2️⃣ If possible, make the setup able to run concurrently with multiple threads/processes
3️⃣ Implement checkpointing to resume a left-off run
4️⃣ Pin every dependency version, model checkpoint hash, and random seed
5️⃣ Log token counts (input/output) per sample
6️⃣ Log all events to a file (every single one)
7️⃣ Define a retry policy with exponential backoff for transient failures
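Items 3️⃣ and 7️⃣ above can be sketched in a few lines of Python. This is a minimal, hypothetical sketch, not code from any particular harness; the names `retry_with_backoff` and `Checkpointer` are my own:

```python
import json
import os
import random
import time


def retry_with_backoff(fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Retry a flaky call with exponential backoff plus jitter (item 7)."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            # Delay grows as base * 2^attempt, with random jitter added
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))


class Checkpointer:
    """Minimal append-only checkpoint file for resuming a run (item 3)."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.exists(path):
            # Reload previously completed sample ids on resume
            with open(path) as f:
                self.done = {json.loads(line)["id"] for line in f}

    def completed(self, sample_id):
        return sample_id in self.done

    def record(self, sample_id, result):
        # Append one JSON line per finished sample
        with open(self.path, "a") as f:
            f.write(json.dumps({"id": sample_id, "result": result}) + "\n")
        self.done.add(sample_id)
```

The append-only JSONL format means a crashed run loses at most the in-flight sample; on restart, `completed()` lets the loop skip everything already recorded.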
Herumb Shandilya 🦀 retweeted
Joel Dierkes@joeldierkes·
Mixedbread just made 115h of videos accessible to my agent. With the new @mixedbreadai v3 release, you can upload any video to your Mixedbread store and make its content accessible to your agent.
Dhravya Shah@DhravyaShah·
BTW, this is exactly what's going on in the memory/retrieval space. Everyone's freakin' lying; we're trying to fix it with memorybench.
Ara@arafatkatze

Turns out @openblocklabs is a complete fraud who gamed their Terminal bench SOTA score. They cheated by putting the result verifier values INSIDE the binary before running the eval and then publicly reported that score as their SOTA score. Read the breakdown here

Ben Clavié@bclavie·
I'm so excited to introduce this! We've worked on a million different moving parts to produce this. I'm fairly confident it's the best multimodal model that exists, period -- and it's not too shabby at pushing back the LIMITs of retrieval either...
Mixedbread@mixedbreadai

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

Herumb Shandilya 🦀@krypticmouse·
Working on OpenJarvis was the most fun I've had, and I learned so much from this project! So happy it's finally public! Give it a try and let us know any feedback you have!
Jon Saad-Falcon@JonSaadFalcon

Personal AI should run on your personal devices. So, we built OpenJarvis: a personal AI that lives, learns, and works on-device. Try it today and top the OpenJarvis Leaderboard for a chance to win a Mac Mini! Collab w/ @Avanika15, John Hennessy, @HazyResearch, and @Azaliamirh. Details in thread.

Herumb Shandilya 🦀 retweeted
Mixedbread@mixedbreadai·
Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.
Aamir@aaxsh18·
too many people claiming sota these days...
Herumb Shandilya 🦀@krypticmouse·
Idk much about vague posting but... `synapse apply examples/supermemory.mnm` `synapse apply examples/zep.mnm` `synapse apply examples/letta.mnm` Memory is Retrieval. soon.
Omar Khattab@lateinteraction·
judging by emails, seemingly every other lab is trying to hire leads for their search teams now; it kind of feels late for that?
Drew Breunig@dbreunig·
On March 18th, we're hosting another Bay Area DSPy Meetup featuring in-production case studies involving GEPA, tool use, and LLM judges from Dropbox and Shopify. (And we'll talk RLMs, too.) Join us! luma.com/je6ewmkx
Herumb Shandilya 🦀 retweeted
Jon Saad-Falcon@JonSaadFalcon·
With intelligence-per-watt (IPW), we propose a unified metric for measuring intelligence efficiency, capturing both the LM capabilities delivered and the energy required to power the AI stack, enabling a better understanding of how we scale local and cloud LLMs. Honored to be part of Slingshots // TWO! It's been a blast working with @LaudeInstitute on the IPW project. Big thanks to @andykonwinski @bradenjhancock @ChrisRytting and the whole Laude team for all the support!
Laude Institute@LaudeInstitute

Intelligence-Per-Watt/@JonSaadFalcon @Avanika15 John Hennessy @hazyresearch @Azaliamirh (@Stanford) - Most queries don't need frontier-model horsepower. This work makes "use the right model for the job" a measurable strategy, quantifying when smaller local models can match frontier quality while cutting energy, cost, and compute.
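Taking the metric's name at face value, intelligence-per-watt divides delivered capability by average power draw. The formula below is my own illustrative reading, not necessarily the paper's exact definition, and the numbers are made up:

```python
def intelligence_per_watt(accuracy: float, avg_power_watts: float) -> float:
    """Toy IPW: task capability per watt of average power.

    Illustrative only -- the actual IPW metric may be defined
    differently (e.g., over full-stack energy per query).
    """
    return accuracy / avg_power_watts


# Made-up numbers: a small local model vs a frontier cloud deployment.
local = intelligence_per_watt(accuracy=0.78, avg_power_watts=35.0)   # laptop-class
cloud = intelligence_per_watt(accuracy=0.85, avg_power_watts=700.0)  # server-class
# In this toy comparison, the local model delivers far more
# intelligence per watt despite its lower raw accuracy.
```

This is the intuition behind "use the right model for the job": a modest accuracy gap can be dwarfed by an order-of-magnitude power gap.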
