aashay sachdeva

4.6K posts

@AashaySachdeva

Model Training @SarvamAI | Built https://t.co/hWenaRkujG

India · Joined May 2014

538 Following · 3.4K Followers

Pinned Tweet

aashay sachdeva@AashaySachdeva·Jan 27

Some amazing startups trying to solve the data quality / data governance / data cataloging problem. 1) First, made in India: @AtlanHQ. @prukalpa wrote an amazing blog on data catalog 3.0: towardsdatascience.com/data-catalog-3… They have InMobi as a client 🧐

aashay sachdeva@AashaySachdeva·2d

Bangalore makes you insecure but Delhi and Mumbai do not??😭🫠

Niket Raj Dwivedi@niketrajdwivedi

Bengaluru is a tough city. I have seen 50+ known folks leave the city permanently for different reasons:
- finances fell, job lost
- mentally challenging because people around are doing super well and insecurity kicks in
- only the top tier survives

People give reasons that make them feel good but the truth is, it isn’t for everyone. If you just want to party and have fun, you have Delhi. If you want a non-tech career, Mumbai is the place. But Bangalore is tech + great places + insane competition.

aashay sachdeva@AashaySachdeva·4d

@lossfunk @krkartikay @rawrepeat @ViraajG1 @Pattanaikay @TusharMagar_ @pranay5255 Gg @pranay5255

Lossfunk@lossfunk·4d

6/ Pranay (@pranay5255) is a pre ChatGPT era ML engineer and Meta Llama grant winner, now working on evmSmith (github.com/openai/frontie…), an RL environment for smart contract security. During the residency, he will be exploring whether small 32B scale open source models, given structured test time scaling and curated training, can match frontier agent performance on auditing and exploiting smart contract vulnerabilities.

Lossfunk@lossfunk·4d

Kickstarted Batch 8 of @lossfunk 🚀 residency! This time, we leaned towards more research-first projects, aligned with the lab's core focus areas, and our Lossfunk community is pretty excited about the new batch 🧵 with their profiles and goals for the next 6-8 weeks

aashay sachdeva@AashaySachdeva·4d

Reward hacking at its finest!

OpenAI@OpenAI

We’re talking about Goblins. openai.com/index/where-th…

aashay sachdeva@AashaySachdeva·5d

@pHequals7 Indian civil services getting destroyed by data from 1930s

pH@pHequals7·5d

talkie-1930 thinks India will never become independent from great britain and even if it did our first PM would be an englishman named Sir William Wedderburn (who passed away in 1918😂)

aashay sachdeva retweeted

Tanay Lohia@tanaylohia·5d

We don't even design binders!! But we @BioMandrake just won @adaptyvbio × @gembioworkshop's RBX1 design competition - 1 Strong binder out of 322 tested, selected from 12,000+ submissions. Couldn't make it to @iclr_conf in Rio. Here's how we did it 👇 open.substack.com/pub/mandrakebi…

aashay sachdeva@AashaySachdeva·5d

@imnottanmay @kingofknowwhere 😆

Tanmay@imnottanmay·5d

@AashaySachdeva @kingofknowwhere time for patriot bench 2.0

aashay sachdeva@AashaySachdeva·5d

Everyone should spend some time playing with this. Very interesting. I asked a simple question considering this is the trending topic.

David Duvenaud@DavidDuvenaud

Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below! with @AlecRad and @status_effects 🧵

aashay sachdeva@AashaySachdeva·6d

ethanding.substack.com/p/claude-code-… We are going to start asking a lot more questions on token to ROI correlation this year

aashay sachdeva retweeted

Archie Sengupta@archiexzzz·Apr 25

Hiring: Founders I know are hiring 6 extremely cracked AI-native engineers with a product-first mindset. Bar is very high. Founders are veterans - IIT/ISB alums, ex-CTO of very large businesses. They’re building in the coding space. Funded. If you know someone who’s extremely cracked (references only) and looking to join a rocket ship, this is the place. DM me with proof of work so I can make an intro. Don’t DM if they're not cracked.

aashay sachdeva@AashaySachdeva·Apr 25

@NirantK $b soon?

Nirant@NirantK·Apr 25

cooking something new

aashay sachdeva@AashaySachdeva·Apr 25

@NirantK Yes

Nirant@NirantK·Apr 25

@AashaySachdeva GTA release?

aashay sachdeva@AashaySachdeva·Apr 25

predictionarena.ai/models/glm-5 GLM is obsessed with GTA release

aashay sachdeva@AashaySachdeva·Apr 25

@guinnesschen @pHequals7 your competitor

Guinness Chen@guinnesschen·Apr 24

You can now use your ChatGPT subscription to dictate anywhere on your desktop! Have fun!

aashay sachdeva@AashaySachdeva·Apr 24

There is no wall?

Noam Brown@polynoamial

I'm a manager at @OpenAI, but with GPT-5.5 I'm a more effective IC than I've ever been. I can now write CUDA kernels like a pro. I can rely on it to run my research experiments. And we know how to make it much more powerful from here.

aashay sachdeva@AashaySachdeva·Apr 24

@HarveenChadha I will take this any day for 1/10th the cost

Harveen Singh Chadha@HarveenChadha·Apr 24

when it comes to coding it is still the same outcome everywhere: Opus beats them all.

aashay sachdeva@AashaySachdeva·Apr 24

@pHequals7 But but what about the agentic economy thesis

pH@pHequals7·Apr 24

if you're an AI VC fund and you're not GPU financing shut shop and go home coz what are you even doing at this point... the 1024th generic agentic anthropic derivative startup is not going to save your fund everyone's running a private credit fund at this point

aashay sachdeva@AashaySachdeva·Apr 23

@sama @scaling01 @sumanthd17 this is your chance to get that lego

Sam Altman@sama·Apr 23

@scaling01 join the light side and i'll send you a lego x-wing

Lisan al Gaib@scaling01·Apr 23

The GPT-5.5 model family completely dominates the cost-performance frontier on the Artificial Analysis Index

Artificial Analysis@ArtificialAnlys

GPT-5.5 takes OpenAI back to the clear number one in AI. OpenAI’s new model tops the Artificial Analysis Intelligence Index by 3 points, breaking a three-way tie with Anthropic and Google.

OpenAI gave us pre-release access to test all five reasoning effort levels: xhigh, high, medium, low and non-reasoning.

➤ OpenAI topping five headline evaluations: GPT-5.5 (xhigh) leads Terminal-Bench Hard, GDPval-AA and our newly hosted APEX-Agents-AA. The model trails only other OpenAI models in CritPt and AA-LCR, and comes second to Gemini 3.1 Pro Preview on three additional evaluations. The largest gains are on AA-Omniscience (+14 pts), our knowledge and hallucination benchmark, and τ²-Bench Telecom (+7 pts), a customer service agent benchmark.

➤ 20% more expensive to run our Intelligence Index: Per-token pricing has doubled from GPT-5.4 to $5/$30 per 1M input/output tokens. However, a ~40% token use reduction largely absorbs the hike - resulting in a net ~+20% cost to run our Intelligence Index.

➤ Effort a clear ladder for balancing intelligence and cost: GPT-5.5 (medium) scores the same as Claude Opus 4.7 (max) on our Intelligence Index at one quarter of the cost (~$1,200 vs $4,800) - although Gemini 3.1 Pro Preview scores the same at a cost of ~$900. GPT-5.5 (low) approximates Claude Opus 4.7 (Non-reasoning, high) on our Intelligence Index at half the cost to run (~$500 vs ~$1,000).

➤ Number one in GDPval-AA with an Elo of 1785: GPT-5.5 (xhigh) leads Claude Opus 4.7 (max) by ~30 pts and Gemini 3.1 Pro Preview by ~470 pts. GDPval-AA is Artificial Analysis’ benchmark that leverages OpenAI’s GDPval dataset to evaluate models on real-world economically valuable tasks.

➤ Top AA-Omniscience accuracy, but trailing the frontier on hallucination: Our private AA-Omniscience benchmark rewards factual knowledge across diverse topics, but punishes hallucination. GPT-5.5 (xhigh) has the highest accuracy at 57% - meaning the model can recall facts in the Omniscience corpus more effectively than any other model. However, it has a hallucination rate of 86% - vs Opus 4.7 (max) at 36%, and Gemini 3.1 Pro Preview at 50%. This makes it more likely to answer a question when it does not ‘know’ the answer. The 14 pt gain in AA-Omniscience from GPT-5.4 (xhigh) was largely driven by knowledge, with a modest improvement in hallucination.

Congratulations to the team at @OpenAI and @sama on the launch
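The net ~+20% figure quoted above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: it assumes cost scales as price per token times tokens used, and that the doubling and the ~40% token reduction apply uniformly to input and output tokens, which the tweet does not state. The function name `net_cost_change` is hypothetical.

```python
def net_cost_change(price_multiplier: float, token_reduction: float) -> float:
    """Relative change in total cost, assuming cost = price per token * tokens used.

    price_multiplier: new price divided by old price (2.0 means "doubled").
    token_reduction: fraction of tokens saved (0.40 means "~40% fewer tokens").
    """
    return price_multiplier * (1.0 - token_reduction) - 1.0


# Figures quoted in the tweet: pricing doubled, roughly 40% fewer tokens used.
change = net_cost_change(price_multiplier=2.0, token_reduction=0.40)
print(f"Net cost change: {change:+.0%}")
```

Under these assumptions, 2 × 0.6 = 1.2, which matches the roughly 20% increase the tweet reports.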

aashay sachdeva@AashaySachdeva·Apr 23

@aannuujX @Swiggy @sumanthd17

dope-a-meme@aannuujX·Apr 23

Introducing Swiggy Builders Club

We’re opening @Swiggy commerce infrastructure to developers and enterprises to build on top - build AI agents, apps, and integrations on top of Swiggy’s Food, Instamart, and Dineout ecosystems - with real APIs, real data, and real users.

What you get:
- 3 MCP Servers (Food, Instamart, Dineout)
- 18+ API tools covering the full convenience stack
- Production data access from day one
- Direct engineering support

Who it’s for:
- Individual developers with bold ideas
- Startups building AI-native commerce products
- Enterprises looking to integrate Swiggy into their platforms

Smart grocery restock bots. AI ordering assistants. Dining recommendation agents. Group ordering tools, health first products. If it makes commerce better for users, we want to see it. Ship something great and we’ll feature it. Ship something exceptional and our recruiting team might reach out.