
Pankaj Kumar
9.4K posts

@pankajkumar_dev
I build things | Dm for work/collab
Joined August 2024
637 Following · 8.3K Followers
Pinned Tweet

Projects I have made in the last 6 months!
feedwall.vercel.app
flowpay-one.vercel.app
boltweb-ai.vercel.app
ui-unify.vercel.app
resume-sach.vercel.app
trimmrr.vercel.app
vimal-parody.vercel.app
Learned a lot while making these projects. Huge thanks to @kirat_tw for the amazing teaching! 🙌

Claude Mythos vs GPT-5.5 Cyber: the security race is getting serious
- When Claude Mythos was first presented, it genuinely felt like Anthropic had something far ahead in cyber reasoning and long-horizon analysis
- The Firefox results were genuinely impressive: 423 bugs fixed in a month, 271 of them found with Mythos's help, including sandbox escapes and a 20-year-old XSLT issue
- What made Mythos stand out wasn't just bug finding, but validating exploit paths and reasoning through them dynamically
- But GPT-5.5 Cyber entering the picture changes things a lot, with a 71.4% expert cyber score vs Mythos's 68.6%, while also being dramatically cheaper and faster to run
- GPT-5.5 solving reverse engineering tasks in minutes at low compute cost is a pretty big deal for real production usage
- Mythos still seems stronger for extremely deep codebase analysis and longer reasoning chains, but the serving cost sounds massive
Honestly, it feels more like a race to make powerful cyber agents actually usable at scale without high cost or latency.


Pankaj Kumar retweeted

ERNIE 5.1 is one of the most efficient frontier models yet
- Baidu says ERNIE 5.1 reached near frontier-level performance at only 6% of the usual training cost, which is kind of wild.
- They heavily compressed the model too, cutting total params to 1/3 and active params to 1/2 while still improving performance
- It hit #4 globally on Arena Search (1223 score), becoming the strongest Chinese model there
- Scored 99.6 on AIME26 (with tools), putting it right behind Gemini 3.1 Pro for math reasoning
- Also beat DeepSeek-V4-Pro on some practical agent benchmarks like τ³-Bench and SpreadsheetBench-Verified
- Writing + knowledge performance is now getting close to Gemini 3.1 Pro territory as well
- Most of the gains reportedly come from smarter training methods rather than brute-force scaling


@Rindzay3210 Yep, it's performing well,
and we need more models of this type.

@pankajkumar_dev It really does search very well; on my tasks it outperformed Sonnet 4.6

Pankaj Kumar retweeted

Revamped UI of my most loved/hated project: Resume Sach 💀
A Hinglish AI Resume Roaster that judges your CV harder than Indian relatives.
Upload resume → AI scans the fluff → choose roast level → emotional damage delivered instantly.
Already 8,000+ resumes roasted
You are NOT surviving this one: resume.pankajk.tech
Disclaimer: Try at your own risk.

Such a beautiful rocket
SpaceX @SpaceX
Starship and Super Heavy V3 together at the Starbase launch pad for the first time

Meet Vijay Shekhar Sharma (Founder of Paytm)
Started Paytm in 2010
Faced years of losses & criticism
IPO crashed in 2021
Many called Paytm “finished”
But Vijay kept building.
Focused on growth + profitability
Improved the business year after year
After 14 years, Paytm finally reported profit
From being doubted publicly
To building India’s biggest fintech brand.
The comeback nobody saw coming.


@nikitabier Yes, I'm seeing about 90% of my reach coming from non-followers.


@Artofpuremind @devops_nk Nope, you should never share your env file with any AI or AI tool.

@devops_nk @pankajkumar_dev What about VS Code with Codex integrations? Can't they access the environment file if we use Codex? (I'm a newbie, btw)

@pankajkumar_dev No bro, it contains credentials, and our API keys are very sensitive data. We can't share it.