Pankaj Kumar (@pankajkumar_dev) - Twitter Profili

Sabitlenmiş Tweet

Pankaj Kumar@pankajkumar_dev·27 Nis

Projects that I have made in last 6 months! feedwall.vercel.app flowpay-one.vercel.app boltweb-ai.vercel.app ui-unify.vercel.app resume-sach.vercel.app trimmrr.vercel.app vimal-parody.vercel.app Learned a lot while making these projects ,huge thanks to @kirat_tw for the amazing teaching! 🙌

English

147

95

2.2K

250.4K

Pankaj Kumar@pankajkumar_dev·2h

This is exactly why I have trust issues.

English

0

3

255

Pankaj Kumar@pankajkumar_dev·2h

@homeMetaX yeap we need cheap models also for diff usecases.

English

0

41

Home@homeMetaX·13h

This release says something important about where the AI stack is actually evolving. The frontier models get the attention, but the real scale comes from cheap, fast utility layers. Flash Lite feels less like a breakthrough and more like plumbing for intelligence, where most value is in routing, filtering, and preprocessing rather than reasoning.

English

0

314

Pankaj Kumar@pankajkumar_dev·22h

Google just dropped Gemini 3.1 Flash-Lite fast, cheap, but no major intelligence jump - Google officially moved Gemini 3.1 Flash-Lite out of preview, making it the fastest and cheapest model in the Gemini 3.1 lineup - Main focus is speed with 1.8s p95 response time, built for real-time products and high-volume workloads - Around 60% cheaper than standard Gemini 3.1 Flash, making it far more practical for production-scale usage - Not designed for deep reasoning instead optimized for routing, tool calls, classification, prompt enhancement, and other background tasks - Built for apps handling millions of lightweight requests where latency and cost matter more than frontier intelligence - Feels more like an infrastructure model than a flagship intelligence release Overall, this isn’t a major intelligence jump but more of an economic unlock cheap enough that developers can offload repetitive high-frequency tasks instead of burning expensive Pro model credits.

English

3

10

150

9K

Pankaj Kumar@pankajkumar_dev·2h

x.com/pankajkumar_de…

Pankaj Kumar@pankajkumar_dev

This is exactly why I have trust issues.

ZXX

0

1

136

Pankaj Kumar@pankajkumar_dev·22h

@10xshivam

GIF

QME

0

1

34

shivam@10xshivam·22h

@pankajkumar_dev

GIF

QME

1

0

1

53

Pankaj Kumar@pankajkumar_dev·22h

OpenAI just launched new realtime voice agent models - OpenAI released a new set of realtime voice models, including GPT-Realtime-2, GPT-Realtime-Translate, and streaming Whisper - GPT-Realtime-2 is the main leap forward, bringing much stronger reasoning directly into live voice conversations - Built for production style interactions where the model can think, respond, interrupt naturally, and execute actions mid-conversation - Conversations will feel far more natural instead of waiting for a "submit" style interaction - GPT-Realtime-Translate adds live translation with support for 70+ input languages and multiple output languages in real time - Streaming Whisper focuses on low-latency transcription for captions, meetings, and live note-taking - Biggest jump is in audio intelligence: GPT-Realtime-2 (High): 96.6% accuracy GPT-Realtime-1.5: 81.4% accuracy - This pushes voice AI beyond simple assistants toward more capable realtime agents - Opens the door for workflows like voice-driven support, operations, DevOps, and autonomous call handling We are entering a phase where instead of writing prompts step by step, people will increasingly just talk naturally and let agents handle the execution in real time.

English

2

1

7

532

Pankaj Kumar@pankajkumar_dev·22h

ZXX

0

1

166

Pankaj Kumar@pankajkumar_dev·1d

@AbdMuizAdeyemo @immasiddx Yeap

English

0

17

Abdulmuiz Adeyemo@AbdMuizAdeyemo·1d

@pankajkumar_dev @immasiddx That's insanely huge

English

1

0

1

19

sid@immasiddx·1d

I used 46 million Claude tokens today. Is this considered normal?

English

161

2

1.2K

74.9K

Pankaj Kumar@pankajkumar_dev·1d

Used 5.1 Billion tokens in just 12 days.

English

4

2

11

1.2K

Pankaj Kumar@pankajkumar_dev·1d

@devsharmatwt cursor.com/dashboard/bill…

QME

0

1

63

Dev Sharma@devsharmatwt·1d

@pankajkumar_dev where can i see these stats ??

English

1

0

71

Pankaj Kumar@pankajkumar_dev·1d

@datathecodie

GIF

QME

0

1

60

Dattaprasad Ekavade@datathecodie·1d

@pankajkumar_dev (Meanwhile I'm stuck with Github Copilot limits)

GIF

English

1

0

1

108

Pankaj Kumar@pankajkumar_dev·1d

Meta's ProgramBench shows AI agents are still far from replacing real software engineers - Meta's Superintelligence Lab introduced ProgramBench, a benchmark focused on rebuilding real software projects from scratch in a restricted environment - Models receive only a compiled binary + documentation no internet, no source code, no decompilation - Goal: recreate working implementations for systems like FFmpeg, SQLite, and interpreters through pure reasoning and engineering - Over 248,000 fuzz tests across 200 tasks were used for evaluation - Results were surprisingly low: Claude Opus 4.7: 0% fully solved, 3% “almost solved” GPT-5.4: 0% fully solved Gemini 3.1 Pro: 0% fully solved - The biggest weakness wasn't syntax it was system architecture and long-horizon planning - Models often produced huge monolithic files instead of modular systems, causing failures as complexity increased - Strong bias toward Python/Go was observed even when tasks were originally written in C/C++ or Rust - Some models attempted to “cheat” by searching for source code when internet access was accidentally enabled - Benchmark highlights the gap between solving coding tasks and actually engineering large software systems Overall, ProgramBench feels like one of the clearest reality checks for AI coding so far great at functions and patches, still struggling with full-system engineering from scratch.

English

1

2

4

890

Pankaj Kumar@pankajkumar_dev·1d

x.com/pankajkumar_de…

Pankaj Kumar@pankajkumar_dev

Meta's ProgramBench shows AI agents are still far from replacing real software engineers - Meta's Superintelligence Lab introduced ProgramBench, a benchmark focused on rebuilding real software projects from scratch in a restricted environment - Models receive only a compiled binary + documentation no internet, no source code, no decompilation - Goal: recreate working implementations for systems like FFmpeg, SQLite, and interpreters through pure reasoning and engineering - Over 248,000 fuzz tests across 200 tasks were used for evaluation - Results were surprisingly low: Claude Opus 4.7: 0% fully solved, 3% “almost solved” GPT-5.4: 0% fully solved Gemini 3.1 Pro: 0% fully solved - The biggest weakness wasn't syntax it was system architecture and long-horizon planning - Models often produced huge monolithic files instead of modular systems, causing failures as complexity increased - Strong bias toward Python/Go was observed even when tasks were originally written in C/C++ or Rust - Some models attempted to “cheat” by searching for source code when internet access was accidentally enabled - Benchmark highlights the gap between solving coding tasks and actually engineering large software systems Overall, ProgramBench feels like one of the clearest reality checks for AI coding so far great at functions and patches, still struggling with full-system engineering from scratch.

ZXX

0

1

169

Pankaj Kumar@pankajkumar_dev·2d

What to expect from Google I/O 2026 - Sundar Pichai has confirmed Google I/O 2026 will take place from May 19-20 - Android 17 (Performance, media, and camera improvements), Wear OS, Aluminium OS for PCs/laptops - Gemini 3.2 Flash preview and new multimodal capabilities - New Gemma open-model family additions and deployment tooling - Native AI video generation ("Gemini Omni" leaks) - Agentic AI across Search, Workspace, Android, and Chrome - Firebase evolving into an AI-native development platform - Android Studio + Gemini integrations for faster app development - Flutter updates and adaptive AI-generated UI systems - Jetpack Compose improvements and productivity upgrades - Android XR updates, immersive environments, smart glasses/headsets - "Adaptive Everywhere" across phones, foldables, TVs, cars, desktops, XR - Chrome and modern web platform updates - AI-powered coding / vibe-coding workflows - New web UI capabilities, animations, and view transitions - Google Play updates for developers and business growth - Cloud + AI infrastructure announcements for scaling applications - AI-first Android experiences with deeper on-device AI integration - Robotics and intelligent automation demos

English

7

9

93

9.4K

Pankaj Kumar@pankajkumar_dev·1d

x.com/pankajkumar_de…

Pankaj Kumar@pankajkumar_dev

Meta's ProgramBench shows AI agents are still far from replacing real software engineers - Meta's Superintelligence Lab introduced ProgramBench, a benchmark focused on rebuilding real software projects from scratch in a restricted environment - Models receive only a compiled binary + documentation no internet, no source code, no decompilation - Goal: recreate working implementations for systems like FFmpeg, SQLite, and interpreters through pure reasoning and engineering - Over 248,000 fuzz tests across 200 tasks were used for evaluation - Results were surprisingly low: Claude Opus 4.7: 0% fully solved, 3% “almost solved” GPT-5.4: 0% fully solved Gemini 3.1 Pro: 0% fully solved - The biggest weakness wasn't syntax it was system architecture and long-horizon planning - Models often produced huge monolithic files instead of modular systems, causing failures as complexity increased - Strong bias toward Python/Go was observed even when tasks were originally written in C/C++ or Rust - Some models attempted to “cheat” by searching for source code when internet access was accidentally enabled - Benchmark highlights the gap between solving coding tasks and actually engineering large software systems Overall, ProgramBench feels like one of the clearest reality checks for AI coding so far great at functions and patches, still struggling with full-system engineering from scratch.

ZXX

0

1

175

Pankaj Kumar@pankajkumar_dev·3d

Gemini 3.2 Flash looks imminent - Some users have reported the model appearing in Google AI Studio and the iOS app - Positioned as an all around model, balancing speed with stronger reasoning - It will be close to Gemini 3.1 Pro in capability while maintaining Flash-level speed - Updated knowledge cutoff to January 2026, making it one of the most up-to-date models - Leaked pricing: $0.25 input / $2.00 output per 1M tokens - Improved grounding and search for more reliable, real-world answers - Interface logs point to a May 2026 rollout, likely around I/O or 1-2 days before the event (These details are based on unconfirmed leaks and sightings; final specifications may change upon official announcement.)

English

18

36

521

58K

Pankaj Kumar@pankajkumar_dev·1d

Paper: arxiv.org/pdf/2605.03546 GitHub: github.com/facebookresear… HuggingFace: huggingface.co/datasets/progr… Project : programbench.com

English

0

2

211

Pankaj Kumar@pankajkumar_dev·2d

@10xshivam cracking OS is more difficult specially in Phones is most difficult, we have seen lumia phones with windows, and a ai dedicated phone also called Rabbit R1.

English

0

2

70

shivam@10xshivam·2d

@pankajkumar_dev may be they can come with os

English

1

0

1

29

Pankaj Kumar@pankajkumar_dev·2d

OpenAI phone leaks: a push toward AI first hardware - OpenAI is reportedly accelerating work on its first AI focused phone, with mass production targeted around the first half of 2027 - Expected to use a custom MediaTek Dimensity-class chip built on TSMC’s advanced process nodes - Internal design is rumored to include dedicated AI hardware (dual NPUs), LPDDR6 memory, and fast UFS 5.0 storage - Big focus appears to be on visual sensing, helping AI agents better understand the real world through camera + HDR pipelines - If everything goes on schedule, industry estimates suggest shipments could reach 30 million units across 2027-2028 - Luxshare is reportedly involved as a major manufacturing and co-design partner - Main goal seems to be deeper real-time context awareness, possible only with tighter hardware + OS integration - The broader vision is an “agent-first” experience, where users rely less on separate apps and more on a unified AI system If this direction is real, OpenAI isn't just exploring hardware, its trying to build an AI-native computing experience beyond the current app-centric smartphone model.

English

3

1

16

2.8K

Pankaj Kumar

Keşfet