Pushpendre Rastogi
@Pushpendre89
306 posts

Multi objective RL @ https://t.co/XGoEBmMLcC | Ex DeepMind, Amazon, JHU PhD, IITD ECE

Palo Alto · Joined May 2012
704 Following · 613 Followers

Pinned Tweet
Pushpendre Rastogi @Pushpendre89 ·
A few weeks ago I hinted at a new prompt optimizer service that beats GEPA! We are live now! The setup is so simple that openclaw can install and test it for you. Prompt: "Go to vizpy.vizops.ai, describe their service, set up an experiment comparing GEPA from dspy vs their optimizer. Does it actually work?"
Replies 0 · Reposts 1 · Likes 3 · Views 383
Pushpendre Rastogi @Pushpendre89 ·
Marc Andreessen says introspection is a modern invention. Backpropagation has been doing it since 1986. If you want your LLM agents to learn from mistakes, unlike @pmarca, try VizPy → vizpy.vizops.ai
Replies 0 · Reposts 1 · Likes 0 · Views 121
Pushpendre Rastogi @Pushpendre89 ·
GEPA was ahead of its time. We built VizPy on top of it: added contrastive learning from failure→success pairs (ContraPrompt) so the optimizer learns not just what works but why things fail. Ends up +29% on HotPotQA vs GEPA. Maybe the traction will come now that Karpathy's crowd is paying attention. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 0 · Views 49
Ivan @ivanbokii ·
It's quite unfortunate that GEPA Optimize Anything didn't get enough traction, while very, very similar ideas promoted by Karpathy's autoresearch + Lütke's pi-autoresearch got so much traction, despite being less general.
Replies 11 · Reposts 12 · Likes 122 · Views 12.2K
Pushpendre Rastogi @Pushpendre89 ·
GEPA/DSPy optimize the prompt. The harder problem is learning *why* a prompt failed, so the next iteration does not make the same mistake. That's what ContraPrompt does: mines failure→success pairs and turns them into optimizer signal. Ends up +29% on HotPotQA vs GEPA. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 2 · Views 78
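The failure→success mining described in the tweet above can be sketched in a few lines of Python. This is purely illustrative and not VizPy's actual implementation: `mine_contrastive_pairs` and the run-record fields (`input`, `prompt`, `success`) are invented for the example; a real optimizer would presumably feed the extracted prompt deltas to an LLM to distill reusable rules.

```python
from difflib import unified_diff

def mine_contrastive_pairs(runs):
    """Pair failed and successful runs on the same input, then diff
    their prompts to surface what changed between failure and success."""
    by_input = {}
    for run in runs:
        by_input.setdefault(run["input"], []).append(run)
    pairs = []
    for attempts in by_input.values():
        failures = [r for r in attempts if not r["success"]]
        successes = [r for r in attempts if r["success"]]
        for f in failures:
            for s in successes:
                delta = list(unified_diff(
                    f["prompt"].splitlines(),
                    s["prompt"].splitlines(),
                    lineterm=""))
                pairs.append({"failed": f["prompt"],
                              "succeeded": s["prompt"],
                              "delta": delta})
    return pairs
```

Each `delta` is a unified diff of the two prompts; the `+` lines are candidate "rules" that turned a failure into a success.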
Sanyam Jain @Sanyam0605 ·
People be making variants of autoresearch, in which they be optimising prompts in the loop 😂 They don't know, for this exact task GEPA and DSPy have been there for a long time now. DSPy, Text2Grad, GEPA ❤️❤️⚡
Replies 2 · Reposts 2 · Likes 43 · Views 2.2K
Pushpendre Rastogi @Pushpendre89 ·
the "is it good? unclear" is where the signal lives. the failed experiments tell you more than the successes if you mine them right: contrast what changed between failure→success runs and you get a much sharper update. that's the core of what we built in VizPy. vizops.ai/blog.html
Replies 0 · Reposts 0 · Likes 0 · Views 89
Belinda @belindmo ·
Continuously self-improving agents are here. ⚠️🧪 We set up an agent to run @karpathy's autoresearch every 3 hours. It wakes up, reads the research log from previous sessions, forms a hypothesis, trains on a @modal A100, and decides whether to keep or discard. Then it goes back to sleep. No human in the loop. Is it good? Unclear; it's been 20 experiments so far with Opus 4.6. Guess we'll find out if a model can self-improve with this setup. It's still running 🐻
[tweet media]
Replies 2 · Reposts 2 · Likes 29 · Views 2.9K
Pushpendre Rastogi @Pushpendre89 ·
@osoleve 35% for $0.75 is wild. curious if you also looked at what the failed candidates had in common: that's where we found the biggest signal. contrastive mining of failure→success pairs gets you further than scoring alone. built that into VizPy: vizops.ai/blog.html
Replies 0 · Reposts 0 · Likes 0 · Views 56
oso @osoleve ·
DSPy is so much better than it was when I tried it a couple years ago, wow. 35% improvement on extraction for Nemotron 3 Super with just $0.75 in Kimi K2.5 tokens.
Replies 14 · Reposts 2 · Likes 28 · Views 3.6K
Pushpendre Rastogi @Pushpendre89 ·
this is exactly the loop VizPy runs under the hood, but instead of just mutating prompts randomly, it mines contrastive failure→success pairs to guide the mutations. stronger signal, fewer iterations. curious what your Spearman looks like after 500 posts. vizops.ai/blog.html
Replies 0 · Reposts 0 · Likes 1 · Views 27
Ollie Techdale @TekoalyOlli ·
I built a system based on @karpathy's autoresearch that evolves its own theory of social media, and in one night it discovered rules sharper than any social media book I've read. Based on @karpathy's autoresearch concept: mutate prompt → evaluate on 306 real X posts → keep best → repeat forever.

After one night, it found:
- Account type (individual vs corporate) is the #1 predictor of engagement, beating content, timing, and hashtags combined
- Replies have exactly 4 triggers: shocking numbers, ranked lists, controversial opinions, community posts; everything else defaults to near-zero
- Bold superlatives ("fastest", "in history") multiply views 4-8x for individual accounts
- User history calibration should be capped at ±30%: content type dominates

Spearman 0.50 on 140 posts. The prompt isn't just predicting engagement. It's writing social media theory from raw data, and it's better at it than the humans, well, at least me.

Stack: Claude in a Python loop, SQLite with 306 posts, Spearman rank correlation as fitness. github.com/karpathy/autor… for the original concept.
Replies 2 · Reposts 1 · Likes 0 · Views 56
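The loop in the tweet above (mutate prompt → evaluate → keep best → repeat, with Spearman rank correlation as fitness) can be sketched as follows. This is a minimal illustrative sketch, not Ollie's actual code: `mutate` and `predict` are placeholder callables (in the real system both would be Claude calls against the SQLite post database), and the function names are invented for the example.

```python
def _ranks(values):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of tied positions, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(xs), _ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def evolve_prompt(seed_prompt, mutate, predict, posts, steps=20):
    """Hill-climb: mutate the best prompt so far, score each candidate
    by how well its predicted engagement rank-correlates with actual
    views, and keep the best."""
    actual = [p["views"] for p in posts]
    best, best_fit = seed_prompt, float("-inf")
    for _ in range(steps):
        candidate = mutate(best)
        predicted = [predict(candidate, p) for p in posts]
        fit = spearman(predicted, actual)
        if fit > best_fit:  # keep best, discard the rest
            best, best_fit = candidate, fit
    return best, best_fit
```

Spearman is the right fitness here because engagement is heavy-tailed: only the ordering of posts matters, not the absolute view counts.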
Pushpendre Rastogi @Pushpendre89 ·
@Chris_Worsey @karpathy prompts are the weights, Sharpe is the loss function: this is exactly the framing we built VizPy on. Contrastive failure→success mining, not brute-force candidate scoring. Cleaner signal, less iteration. +29% HotPotQA vs GEPA. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 1 · Views 64
Chris Worsey @Chris_Worsey ·
I took the @karpathy autoresearch loop and pointed it at markets. 25 AI agents debate macro, rates, commodities, sectors, and single stocks daily. Every recommendation scored against real outcomes. Worst agent by rolling Sharpe gets its prompt rewritten by the system. Keep or revert. Same loop: prompts are the weights, Sharpe is the loss function.

Trained the agents on 18 months of market data. 378 iterations. 54 prompt modifications, 16 survived. The system learned which agents to trust using Darwinian weights: geopolitical, commodities, and the @BillAckman quality compounder rose to the top. The agents even figured out their own portfolio manager was the weakest link before we did!

Deployed the trained agents. +22% in 173 days. Best pick: AVGO at $152, held for +128%. The final prompts are evolutionary products, shaped by market feedback, not human intuition. Now running live with my own capital. github.com/chrisworsey55/… Part hedge fund, part research experiment :)

Andrej Karpathy @karpathy
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autor… Part code, part sci-fi, and a pinch of psychosis :)
Replies 154 · Reposts 227 · Likes 3.9K · Views 763.7K
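The "worst agent by rolling Sharpe gets its prompt rewritten" selection step above can be sketched minimally. Assumptions are labeled loudly: `sharpe` and `pick_agent_to_rewrite` are hypothetical names, `history` is a plain dict of per-agent daily returns, and the actual prompt rewrite (an LLM call in the real system) is left out.

```python
def sharpe(returns):
    """Per-period Sharpe ratio: mean return over its standard deviation
    (population std, no annualization, risk-free rate assumed 0)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / n
    std = var ** 0.5
    return mean / std if std > 0 else 0.0

def pick_agent_to_rewrite(history, window=30):
    """history maps agent name -> list of daily returns. The agent with
    the lowest Sharpe over the trailing window is the one whose prompt
    gets rewritten this iteration (keep-or-revert happens afterwards)."""
    return min(history, key=lambda agent: sharpe(history[agent][-window:]))
```

The rolling window matters: it lets a previously good agent fall to the bottom if its recent calls sour, which is what makes the selection "Darwinian" rather than a one-shot ranking.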
Pushpendre Rastogi @Pushpendre89 ·
@neowes2025 The main delta from GEPA/DSPy: *where* the signal comes from. GEPA scores prompt candidates. Mine contrastive failure→success pairs instead and the optimizer learns *why* prompts fail. Built VizPy on this: +29% HotPotQA vs GEPA. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 1 · Views 229
Wesley Smith @neowes2025 ·
I really don't understand this karpathy/autoresearch hype. I mean, it's a cool project, but haven't we been doing this kind of thing for a while now? What is different from DSPy, GEPA, and that whole area of tools? What am I missing?
Replies 29 · Reposts 7 · Likes 227 · Views 40.5K
Pushpendre Rastogi @Pushpendre89 ·
@tom_doerr DSPy+GEPA is solid. We pushed it further with VizPy: instead of scoring prompt candidates, we mine contrastive failure→success pairs. The signal is richer, especially on multi-hop tasks. +29% HotPotQA vs GEPA, drop-in compatible. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 0 · Views 37
Tom Dörr @tom_doerr ·
Agent prompt optimization with DSPy + GEPA

Karthik Kalyan @karthikkalyan90
✨ DSPyground 0.2.7 is out. With this update, it has fully evolved into a harness that seamlessly plugs into existing multi-turn agent environments (@aisdk based agents to start with). What this means is that it can connect to your prompts, tools, and your pipeline, lets you sample and label traces, and runs the SOTA @DSPyOSS GEPA (Genetic-Pareto) optimization algorithm to align your agent setup with the desired behaviour, generating an optimized prompt as the final artifact. TL;DR: npm i dspyground. Read on for a detailed breakdown 👇
Replies 3 · Reposts 4 · Likes 23 · Views 6.4K
Pushpendre Rastogi @Pushpendre89 ·
@mdancho84 DSPy gets you modular + optimizable pipelines. One thing we layered on top: automatic failure mining. Instead of just scoring prompt candidates, VizPy learns contrastively from failure→success pairs. Ended up +29% on HotPotQA vs GEPA as a result. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 0 · Views 15
Matt Dancho (Business Science) ·
1. Why DSPy? DSPy is the open-source framework for programming, rather than prompting, language models. It allows you to iterate fast on building modular AI systems.
Replies 2 · Reposts 1 · Likes 5 · Views 1.9K
Matt Dancho (Business Science) ·
Stop Prompting LLMs. Start Programming LLMs. Introducing DSPy by Stanford NLP. This is why you need to learn it:
[tweet media]
Replies 7 · Reposts 56 · Likes 461 · Views 24.2K
Pushpendre Rastogi @Pushpendre89 ·
@iljaas_a @karpathy GEPA works, but it only scores prompt candidates; it doesn't learn why they fail. We built VizPy to mine contrastive failure→success pairs instead. Ended up +29% on HotPotQA vs GEPA. Worth a look if you're thinking about this for nanochat. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 1 · Views 21
Iljaas Abdoella @iljaas_a ·
@karpathy Did you use anything DSPy/GEPA-like for the agent policy, or is the main win coming from the experiment harness around branching, eval, and merge logic rather than from prompt optimization itself?
Replies 1 · Reposts 0 · Likes 0 · Views 188
Andrej Karpathy @karpathy ·
nanochat now trains a GPT-2 capability model in just 2 hours on a single 8XH100 node (down from ~3 hours 1 month ago). Getting a lot closer to ~interactive! A bunch of tuning and features (fp8) went in, but the biggest difference was a switch of the dataset from FineWeb-edu to NVIDIA ClimbMix (nice work NVIDIA!). I had tried Olmo, FineWeb, and DCLM, which all led to regressions; ClimbMix worked really well out of the box (to the point that I am slightly suspicious about goodharting, though reading the paper it seems ~ok).

In other news, after trying a few approaches for how to set things up, I now have AI agents iterating on nanochat automatically, so I'll just leave this running for a while, go relax a bit, and enjoy the feeling of post-AGI :). Visualized here as an example: 110 changes made over the last ~12 hours, bringing the validation loss so far from 0.862415 down to 0.858039 for a d12 model, at no cost to wall-clock time. The agent works on a feature branch, tries out ideas, merges them when they work, and iterates. Amusingly, over the last ~2 weeks I almost feel like I've iterated more on the "meta-setup", where I optimize and tune the agent flows, than on the nanochat repo directly.
[tweet media]
Replies 337 · Reposts 562 · Likes 6.5K · Views 594.7K
Pushpendre Rastogi @Pushpendre89 ·
@Teknium @lateinteraction GEPA is a strong baseline. One thing we found building VizPy: mine failure→success pairs contrastively, not just score prompt candidates. The optimization signal gets richer, especially for multi-hop tasks. +29% on HotPotQA vs GEPA. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 0 · Views 22
Pushpendre Rastogi @Pushpendre89 ·
Tried it. We built VizPy specifically to go beyond GEPA on this. Key difference: GEPA generates candidate prompts and scores them but doesn't learn *why* failures happen. VizPy mines failure→success pairs to extract contrastive rules. +29% HotPotQA vs GEPA. vizpy.vizops.ai
Replies 0 · Reposts 0 · Likes 0 · Views 9
Andrej Karpathy @karpathy ·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me, because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I do daily, for 2 decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously, is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course: you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has a more efficient proxy metric, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
[tweet media]
Replies 961 · Reposts 2.1K · Likes 19.3K · Views 3.5M
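The keep-or-revert workflow described above (agent proposes a change, a training run measures validation loss, the change is merged only if the loss improves) reduces to a greedy loop. A minimal sketch under stated assumptions: `propose` stands in for the agent's code edit and `evaluate` for a full training run returning validation loss; neither function nor `autoresearch_round` is part of the actual autoresearch repo.

```python
def autoresearch_round(config, propose, evaluate, rounds=10):
    """Greedy keep-or-revert loop. Each round, propose a candidate
    change, measure its validation loss, and keep it only if it beats
    the current best; otherwise revert to the current best."""
    best_loss = evaluate(config)
    kept = []
    for _ in range(rounds):
        candidate = propose(config)
        loss = evaluate(candidate)
        if loss < best_loss:   # keep: merge the change
            config, best_loss = candidate, loss
            kept.append(candidate)
        # else: revert, continue from the current best
    return config, best_loss, kept
```

With a toy `config` (say a dict of hyperparameters) and a toy loss surface, the loop walks downhill and then reverts every further proposal once it sits at the minimum, which mirrors the "20 kept out of ~700 changes" ratio in the tweet.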
Pushpendre Rastogi @Pushpendre89 ·
VizPy is live on Product Hunt now. producthunt.com/products/vizpy… VizPy is a state-of-the-art prompt optimization service which learns from failures and updates your prompts with rules it has learned. We have compared it extensively against baselines such as GEPA on benchmarks like BBH, HotPotQA, GPQA Diamond, and GDPR-Bench, and VizPy wins on all of them. We'll have more benchmarks on cyber-security and chip-design coming out soon. @producthunt
Replies 0 · Reposts 1 · Likes 1 · Views 111
Pushpendre Rastogi @Pushpendre89 ·
@sreejan_kumar Would you say the recent RLM work is a step in this direction? E.g. they frame long-context problems as manipulation of a variable in a REPL. Or did you mean something else entirely?
Replies 0 · Reposts 0 · Likes 0 · Views 13
Sreejan Kumar @sreejan_kumar ·
@Pushpendre89 So my prediction: the next jump comes when people figure out how to steer toward better problem representations rather than better output dispositions.
Replies 1 · Reposts 0 · Likes 1 · Views 38
Sreejan Kumar @sreejan_kumar ·
In 2022, I won the NeurIPS Outstanding Paper Award. In 2026, I've realized this paper, ahead of its time, accidentally predicted the trajectory of AI development over the past few years. A thread using this to explain how AI has developed 2018→2026:
[tweet media]
Replies 4 · Reposts 25 · Likes 339 · Views 29.6K
Pushpendre Rastogi @Pushpendre89 ·
@sreejan_kumar Are there results from the paper that can be used to predict the trends/breakthroughs in 2026/27, or would you say the paper's ideas are used up by now?
Replies 1 · Reposts 0 · Likes 0 · Views 64
Sreejan Kumar @sreejan_kumar ·
The overall picture: AI systems are getting better because they're learning to approximate something humans do naturally: rapidly formulate the right second-order abstraction of the problem at hand.
Replies 2 · Reposts 2 · Likes 26 · Views 1.6K
Pushpendre Rastogi @Pushpendre89 ·
@daniel_rossett Do you know of any new approaches/research for autism, attacking it from the POV of autoimmune dysfunction?
Replies 0 · Reposts 0 · Likes 1 · Views 965
Daniel Rossett @daniel_rossett ·
To everyone asking for details: I helped Neal and Ian by regulating their immune systems. That is the only information I can give at this time, as what I am doing is not available elsewhere. If it were as simple as a supplement you could order off Amazon, I would tell you. Once I have run formal studies and confirmed efficacy and a lack of side effects, I will discuss further. Applied Determinism is completely separate: it is useful for living a more peaceful life, not for addressing chronic inflammation.
Replies 29 · Reposts 7 · Likes 289 · Views 30.6K
Pushpendre Rastogi @Pushpendre89 ·
@qzhang517 Thanks for the question. In the video I was mainly trying to get a feel for which parts of the compiler the agent tried to build first, which came later, and at what speed. The interactive HTML linked in the video is a lot better for fine-grained visualization.
Replies 0 · Reposts 0 · Likes 0 · Views 12
Pushpendre Rastogi @Pushpendre89 ·
One underrated aspect of the Claude C Compiler is that the entire git history preserves architectural reasoning. The agents left enough breadcrumbs that we could reverse-engineer the scaffolding and run small agent-scaling experiments. We call it "code archaeology". Comparing 1 agent × 2h vs 2 agents × 1h showed early compiler bootstrapping is still largely serial. vizops.ai/blog/agent-sca… 1/3
Replies 1 · Reposts 6 · Likes 83 · Views 66.3K