Pandey

604 posts

@SumitPandeyvine

AI Engineer

India · Joined November 2018

865 Following · 78 Followers

Pinned Tweet

Pandey@SumitPandeyvine·1 Jun

How can I be consistent in coding? Any tips? I have tried a lot but I can't manage it, and I took the #100DaysOfCode challenge as well. Any help will be appreciated. @kunalstwt @tanaypratap @tanyarajhans7 @Priyansh_31Dec @curiousharish @striver_79 @imdivi_jain

Pandey@SumitPandeyvine·26 Mar

@asmah2107 Someone help me switch jobs, bro

332

Ashutosh Maheshwari@asmah2107·26 Mar

If you can reply to this post, you are a software engineer who still has a job… for now. 👀

12.4K

Pandey@SumitPandeyvine·26 Mar

@asmah2107 😂😂😂

289

Pandey retweeted

Nishkarsh@contextkingceo·25 Mar

Is AI being designed to fail? Everyone talks about reasoning. But when given a task, the AI isn't reasoning the way you might expect. It looks at your input, finds the closest match it's seen before, and predicts the most likely next action. That process is called vector similarity search. It's genuinely powerful. It's also not the same thing as understanding what you actually meant. Think of a plumber who hears the word "leak" and starts pulling up floorboards before you've finished the sentence. He's not being careless. He's pattern-matching - that's exactly how he was trained. Your AI agent is doing the same thing. Context is the one thing that gets deprioritized when teams are racing to ship. But without it, you don't have an intelligent agent. You have a very fast guesser. Similarity ≠ relevance. How? Find out with the link in the comments ⬇️
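The "Similarity ≠ relevance" point above can be illustrated with a minimal sketch of vector similarity search. The 3-dimensional vectors, the candidate labels, and the function names are all hypothetical and chosen only for illustration; real embeddings have hundreds of dimensions and come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, candidates):
    """Return the candidate name whose vector is most similar to the query."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))

# Hypothetical toy embeddings; axes loosely mean [leak, plumbing, data security].
candidates = {
    "fix a leaking pipe": [0.9, 0.8, 0.0],
    "investigate a data leak": [0.9, 0.0, 0.8],
}

# "There's a leak" with no context sits slightly closer to the plumbing task.
query_without_context = [1.0, 0.3, 0.2]
# The same request with context (the speaker is a security engineer).
query_with_context = [1.0, 0.0, 0.9]

print(nearest(query_without_context, candidates))
print(nearest(query_with_context, candidates))
```

In this toy setup the bare query retrieves the plumbing task, and only the context-enriched query retrieves the data-leak task, which is the failure mode the plumber analogy describes.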

224

611

819.6K

Pandey@SumitPandeyvine·23 Mar

@akoratana @matei_zaharia @pbailis @rauchg @zoink @drewhouston Congratulations @akoratana on the launch, now Claude Code needs to worry 😄

213

Animesh Koratana@akoratana·23 Mar

Introducing: PlayerZero
The world's first Engineering World Model that puts debugging, fixing, and testing your code on autopilot.
We've raised $20M from Foundation Capital, @matei_zaharia (Databricks), @pbailis (Workday), @rauchg (Vercel), @zoink (Figma), @drewhouston (Dropbox), and more
PlayerZero frees up 30% of your engineering bandwidth by:
1. Finding the root cause for bugs & incidents in minutes that engineering teams take days to identify.
2. Predicting in minutes, edge case issues that a 300-person QA team would take weeks to find.
------
Here's why this matters:
No one in your org has a complete picture of how your production software actually behaves. Support sees tickets. SRE sees infra. Dev sees code. Each team builds their own fragmented view - and none of these systems talk to each other. When something breaks, everyone scrambles to stitch the picture together by hand.
PlayerZero connects all of it into a single context graph -
→ The Slack thread where your lead said "we went with X because Y fell apart in prod last time"
→ The PR review where an engineer explained the tradeoff
→ The lifetime history of your CI/CD pipeline, observability stack, incidents, and support tickets
So you can trace any problem to its root cause across every silo.
And it compounds. Every incident diagnosed teaches the model something new. The longer it runs, the deeper it understands - which code paths are high-risk, which configurations are fragile, which changes tend to break which customer flows. So when you sit down to debug a live issue, you have your entire org's collective reasoning and production memory behind you - instantly.
------
Zuora, Georgia-Pacific, and Nylas have reduced resolution time by 90% and caught 95% of breaking changes and freeing an average of $30M in engineering bandwidth.
------
Our guarantee: If we can't increase your engineering bandwidth by at least 20% within one week, we'll donate $10,000 to an open-source project of your choice.
Book a demo - bit.ly/3NlLMeN

889

803

5.3K

2.7M

Pandey@SumitPandeyvine·16 Mar

@archiexzzz I think the provider should offer an option to optimise your prompt using real-time evals

Archie Sengupta@archiexzzz·15 Mar

> git clone
> cd autovoiceevals
> pip install -r requirements.txt
> cp .env.example .env
> cp examples/vapi.config.yaml config.yaml
> python main.py research
github.com/ArchishmanSeng…

13K

Archie Sengupta@archiexzzz·15 Mar

Introducing AutoVoiceEvals
I've applied the @karpathy autoresearch loop to voice AI agents. It's open source.
Your voice agent has a system prompt. That prompt determines how it handles every call - bookings, complaints, edge cases, background noises, long pauses, people trying to trick it. Most teams write it once, test manually, and hope for the best.
autovoiceevals makes it a loop. One artifact (system prompt), one metric (adversarial eval score), keep what improves it, revert what doesn't. Run it overnight. Wake up to a better agent.
> How it works:
You describe your agent in a config file - what it does, its services, policies, and what it should never do. You don't write test cases. You don't define attack vectors.
provider: vapi / smallest ai
assistant:
  id: "your-agent-id"
  description: |
    Voice receptionist for a hair salon.
    Maria does coloring only. Jessica does cuts only.
    $25 cancellation fee under 24 hours notice.
    Cannot advise on skin conditions. Closed Sundays.
From that description alone, Claude generates adversarial caller personas - each with an attack strategy, a voice profile (accents, background noise, mumblers, interrupters), a multi-turn caller script, and pass/fail evaluation criteria. The eval suite is generated once and held fixed for the entire run, like a validation set.
> The loop:
1. Read the agent's current prompt from the platform
2. Generate adversarial eval suite from your description
3. Run baseline
4. Claude proposes ONE surgical change to the prompt
5. Push the modified prompt to the agent via API
6. Run all scenarios against the updated agent
7. Score improved? Keep. Same score but shorter prompt? Keep. Otherwise revert.
8. Go to 4. Run until Ctrl+C.
The system sees its own experiment history. When a change fails, the next proposal knows what was tried and why it didn't work.
We ran 20 experiments on a live Vapi dental scheduling agent. 0 human intervention.
> Score: 0.728 → 0.969 (+33%)
> CSAT: 45 → 84
> Pass rate: 25% → 100%
> 9 kept, 10 discarded
> Prompt: 1191 → 1139 chars (better AND shorter)
You describe your agent. It figures out how to break it.
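The keep-or-revert loop described in the tweet above can be sketched roughly as follows. This is not the autovoiceevals code: `propose` and `evaluate` are hypothetical callables standing in for Claude's single-change proposal and the fixed adversarial eval suite, and the toy rules and edits exist only to make the sketch runnable.

```python
def should_keep(best_score, best_prompt, new_score, new_prompt):
    """Keep if the score improved, or if it tied with a shorter prompt."""
    if new_score > best_score:
        return True
    return new_score == best_score and len(new_prompt) < len(best_prompt)

def optimize(prompt, propose, evaluate, max_experiments):
    """Hill-climb on one artifact (the prompt) against one fixed metric."""
    best_prompt, best_score = prompt, evaluate(prompt)  # baseline run
    history = []  # every attempt is recorded so proposals can see past failures
    for _ in range(max_experiments):
        candidate = propose(best_prompt, history)
        score = evaluate(candidate)
        kept = should_keep(best_score, best_prompt, score, candidate)
        history.append((candidate, score, kept))
        if kept:
            best_prompt, best_score = candidate, score
        # otherwise the candidate is discarded, which is the "revert" step
    return best_prompt, best_score, history

# Hypothetical stand-ins: the metric is the fraction of policy phrases present.
RULES = ["closed sundays", "cancellation fee"]
EDITS = [" Closed Sundays.", " Be friendly!!!", " $25 cancellation fee."]

def toy_evaluate(prompt):
    return sum(rule in prompt.lower() for rule in RULES) / len(RULES)

def toy_propose(current, history):
    return current + EDITS[len(history) % len(EDITS)]

best_prompt, best_score, history = optimize(
    "You are a salon receptionist.", toy_propose, toy_evaluate, max_experiments=3
)
print(best_prompt, best_score, [kept for _, _, kept in history])

# The reported gain is relative: (0.969 - 0.728) / 0.728 is about 33%.
relative_gain_percent = round((0.969 - 0.728) / 0.728 * 100)
```

Holding the eval suite fixed is what makes the comparison between iterations meaningful; in this toy run the second edit ties the score with a longer prompt, so it is discarded.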

1.2K

277.8K

Pandey@SumitPandeyvine·15 Mar

#NewProfilePic

Pandey@SumitPandeyvine·15 Mar

I work at a company for 25k, and my family is not happy since I've been working here for more than 15 months on this salary. I was comfortable here, but I think I need to wake up. Any suggestions?

Pandey@SumitPandeyvine·15 Mar

Is it really necessary to buy a MacBook??

Pandey@SumitPandeyvine·15 Mar

@ajay_2512x Fun fact: it's my founder tho

260

Ajay Bhakar@ajay_2512x·15 Mar

Company: AI47Labs 💼 Role: AI Engineer Intern 💰 Stipend: ₹1.2L – ₹1.8L 📍 Location: Remote / Flexible Apply Link: wellfound.com/jobs/3976469-a…

253

12.7K

Pandey@SumitPandeyvine·27 Feb

i just tried silk by rumik ai — you should check this out: rumik.ai/research/silk

Pandey@SumitPandeyvine·24 Feb

@tankots @WisprFlow Wispr Flow

Pandey retweeted

Tanay Kothari@tankots·23 Feb

We offered 5 people a Porsche 911 GT3 RS if they could get @WisprFlow to make a mistake It's the fastest and most accurate AI voice dictation app that's 3x more accurate than ChatGPT, Claude, or Siri. Today, we’re finally launching on Android. Download now: play.google.com/store/apps/det… As a part of the launch, we’re giving away 6 months of Wispr Flow Pro for free. Like, retweet and comment ‘Wispr Flow’ to get it. Enjoy. — Written with Wispr Flow

4.6K

3.1K

10.8K

4.3M

Pandey@SumitPandeyvine·18 Feb

@antmillionsbot Yes

Pandey@SumitPandeyvine·21 Dec

@paulg Bsdk

Paul Graham@paulg·20 Dec

Neural nets work.

771

2.9K

567K

Pandey@SumitPandeyvine·21 Dec

@techNmak LLM

Tech with Mak@techNmak·19 Dec

These are literally the kind of LLM interview questions most candidates wish they had seen earlier. A curated list of LLM interview questions - shared by Hao Hoang Want this doc? Follow @techNmak and comment “LLM” - I’ll send it over.

1.4K

498

4.3K

408.5K

Pandey retweeted

Arpit Bhayani@arpit_bhayani·18 Dec

SQLite has about 155,800 lines of code, and its test suite has roughly 92 million lines. That is ~590x more test code than actual code 🤯
This is the level of testing you need for a real production database. Here are some types of tests they run.
Out-of-memory tests - SQLite cannot just crash when memory runs out. On embedded devices, OOM errors are common. They simulate malloc failures at every possible point and verify that the database handles them gracefully.
I/O error tests - Disks fail. Networks drop. Permissions change mid-operation. SQLite inserts a custom file system layer that can simulate failures after N operations, then verifies that no corruption occurs.
Crash tests - What happens if power cuts out mid-write? They simulate crashes at random points during writes, corrupt the unsynchronized data to mimic real filesystem behavior, then verify the database either completed the transaction or rolled it back cleanly. No corruption allowed.
Fuzz testing - They throw malformed SQL, corrupted database files, and random garbage at SQLite. The dbsqlfuzz tool runs about 500 million test mutations every day across 16 cores.
100% branch coverage - Every single branch instruction in SQLite's core is tested in both directions. Not just 'did this line run', but 'did this condition evaluate to both true AND false'.
Databases are really unforgiving :)
By the way, if you want to go deeper, I recommend reading the official SQLite documentation on their testing strategy. The doc is pretty practical and deep. Have linked it below.
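The "fail after N operations" idea from the I/O error tests above can be sketched in a few lines. This is not SQLite's test harness: `FaultyStorage`, `transfer`, and `run_fault_sweep` are hypothetical names, and the in-memory dictionary stands in for a real storage layer. The sweep injects a failure at every possible write and checks that the toy transaction either fully commits or fully rolls back.

```python
class InjectedFault(Exception):
    """Raised by the fake storage layer to simulate an I/O error."""

class FaultyStorage:
    """In-memory key-value storage whose writes start failing after N successes."""
    def __init__(self, fail_after):
        self.fail_after = fail_after
        self.ops = 0
        self.data = {}

    def write(self, key, value):
        self.ops += 1
        if self.ops > self.fail_after:
            raise InjectedFault(f"simulated I/O error on write {self.ops}")
        self.data[key] = value

def transfer(storage, amount):
    """Toy all-or-nothing transaction: move `amount` from account a to b."""
    snapshot = dict(storage.data)
    try:
        storage.write("a", storage.data["a"] - amount)
        storage.write("b", storage.data["b"] + amount)
    except InjectedFault:
        storage.data = snapshot  # roll back to the pre-transaction state
        raise

def run_fault_sweep(max_ops):
    """Inject a fault at every write position and verify no partial state survives."""
    before = {"a": 70, "b": 30}
    after = {"a": 50, "b": 50}
    outcomes = []
    for n in range(max_ops + 1):
        storage = FaultyStorage(fail_after=n)
        storage.data = dict(before)
        try:
            transfer(storage, 20)
            outcomes.append("committed")
        except InjectedFault:
            outcomes.append("rolled back")
        assert storage.data in (before, after), "partial write leaked"
    return outcomes

outcomes = run_fault_sweep(max_ops=2)
print(outcomes)
```

Sweeping the failure point across every operation, rather than picking a few by hand, is what lets this style of test claim that no single fault position leaves the data half-written.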

519

6.1K

527.5K

Pandey@SumitPandeyvine·16 Dec

@siddharthb_ hire me also sir

Pandey@SumitPandeyvine·15 Dec

@iuditg Aeo

Udit Goenka@iuditg·15 Dec

I've a playbook for AEO. (Sharing for 24h only) That just works. Within less than a month of launching a website, I can guarantee you that if you follow the playbook.. ..you can get recommended by Copilot, ChatGPT, Claude, Gemini, Perplexity, etc. Just comment "AEO" and I will send you the guide in the next 24 hours.

148

10.4K

Pandey@SumitPandeyvine·9 Dec

🥹🥹🥹

Pandey@SumitPandeyvine·7 Dec

I just hate government jobs. I don't know why, but I do.
