Mayur Naik

425 posts

@AI4Code

Professor @CIS_Penn. Neurosymbolic AI researcher and educator.

Philadelphia, PA · Joined January 2019
336 Following · 2.4K Followers
Mayur Naik reposted
Damek @damekdavis
Solve harder problems.
12 replies · 10 reposts · 111 likes · 6.1K views
Mayur Naik reposted
a16z @a16z
"Not having a coding experience is becoming an advantage." Replit CEO Amjad Masad: "You don't need any development experience. You need grit. You need to be a fast learner." "If you're a good gamer, if you can jump in a game and figure it out really quickly, you're really good at this." "Coders get lost in the details." "Product people, people who are focused on solving a problem, on making money, they're going to be focused on marketing, they're going to be focused on user interface, they're going to be focused on all the right things." "I think this year it's gonna flip, and I think not having a coding background is gonna be more advantageous for the entrepreneur." @amasad with @jackhneel
599 replies · 521 reposts · 4.7K likes · 2.5M views
Mayur Naik @AI4Code
4. The best uses of AI will come from those who did not create it.
0 replies · 0 reposts · 0 likes · 107 views
Mayur Naik @AI4Code
2. Too much CS knowledge can get in the way of trying. Even if I had been able to imagine such animations, I would never have believed today's AI can create them for me.
3. Share and learn. Even something as small as helping someone set up Claude Code could unlock a new capability.
1 reply · 0 reposts · 2 likes · 178 views
Mayur Naik @AI4Code
AI this year feels very different. A computer science student in my lab helped a math collaborator set up Claude Code to build a website I was procrastinating on. She built a stunning website with sophisticated math animations within a day. Many lessons here:
1 reply · 0 reposts · 7 likes · 461 views
Mayur Naik @AI4Code
Penn-goin! Elated that my amazing daughter Isha was admitted to Penn in the Mechanical Engineering program's Class of 2030! Lots of gratitude to her teachers, our family and friends, my colleagues, and even my students who helped in any way possible. We as a family got to experience the stressful process of US college admissions firsthand. Sending good wishes to others going through it right now. A reminder that admission decisions are never a measure of your worth as a person. thedp.com/article/2025/1… Isha's website with artwork and writings: ishanaikfineart.com #PennEngineeringProud @MEAM_Penn #classof2030
[image]
12 replies · 6 reposts · 234 likes · 20.5K views
Mayur Naik reposted
Google Open Source @GoogleOSS
Like a deep-sea anglerfish illuminating its surroundings, ESCA "illuminates" an agent's environment with structured scene graphs. This new framework from UPenn, accelerated by JAX, is a game-changer for embodied AI. goo.gle/esca-screen-gr… #AI #JAX #Robotics #Innovation
[image]
0 replies · 4 reposts · 16 likes · 2.2K views
Mayur Naik @AI4Code
@yanndubs @adxtyahq It seems common to underestimate the role of test-time inference in modern LLMs; I made a post about it: x.com/AI4Code/status…
Mayur Naik @AI4Code
[Quoted post: the full "test-time inference" thread, which appears later in this timeline.]
0 replies · 0 reposts · 0 likes · 180 views
Yann Dubois @yanndubs
@adxtyahq The model evaluated in the blog is the thinking model, try that (even at low thinking)
6 replies · 0 reposts · 74 likes · 4.3K views
aditya @adxtyahq
btw, GPT-5.2 claims to score 100% on AIME LMAOOOO
[image]
52 replies · 3 reposts · 250 likes · 23K views
Mayur Naik @AI4Code
✨✨Proud of my research group's NeurIPS 2025 Spotlight paper on improving zero-shot performance of embodied AI agents using neurosymbolic representations! Come see us at the conference in San Diego, play with our live demo / model / dataset on HuggingFace, and check out our code on GitHub. Led by my amazing student Jiani Huang @jiani_huang_ai, who is on the academic job market! Collaborators:
- PhD students Neelay Velingker @NeelayV and Mayank Keoliya @KeoliyaMayank
- undergrads Amish Sethi @sethi_amish, who is applying to PhD programs, and Matthew Kuo
- former PhD student Ziyang Li @_ziyang_, now faculty at JHU CS @JohnsHopkins
#neurosymbolic #embodiedAI #robotics #NeurIPS2025
Penn Engineering AI @PennEngAI

@PennEngineers doctoral student @jiani_huang_ai (@cis_penn) presents ESCA at @NeurIPSConf 2025, a system that helps embodied AI agents better understand their surroundings by creating context-aware descriptions of a scene. Research advised by Professor Mayur Naik (@AI4Code).

2 replies · 5 reposts · 12 likes · 1.5K views
Mayur Naik reposted
Jiani Huang @jiani_huang_ai
Announcing our ✨ NeurIPS’25 Spotlight ✨ paper: ESCA: Contextualizing Embodied Agents via Scene-Graph Generation TLDR: We introduce a framework that grounds multimodal embodied agents in scene graphs, leading to more reliable perception, stronger reasoning, and better actions.
[image]
1 reply · 7 reposts · 9 likes · 547 views
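The scene-graph idea in ESCA can be pictured with a tiny sketch (a hypothetical structure and query helper of my own devising, not ESCA's actual schema or API): detected objects become nodes, relations become edges, and the agent queries the graph instead of raw pixels.

```python
# Hypothetical minimal scene-graph: nodes are detected objects, edges
# are (subject, relation, object) triples the agent can reason over.
scene = {
    "objects": {
        "cup_1": {"class": "cup"},
        "table_1": {"class": "table"},
        "book_1": {"class": "book"},
    },
    "relations": [
        ("cup_1", "on", "table_1"),
        ("book_1", "on", "table_1"),
    ],
}

def objects_with_relation(scene, relation, target):
    """Return subjects standing in `relation` to `target`,
    e.g. everything that is on the table."""
    return [subj for (subj, rel, obj) in scene["relations"]
            if rel == relation and obj == target]

print(objects_with_relation(scene, "on", "table_1"))  # ['cup_1', 'book_1']
```

A structured query like this is what makes the agent's perception auditable: the answer is grounded in explicit graph edges rather than an opaque caption.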
Mayur Naik reposted
prof-g @prof_g
i've had mixed results in the past, but the new @grok model (grok-4-fast [beta]) is crushing really hard math problems that other models can't handle. it's amazing to me how fast it's improving.
4 replies · 6 reposts · 49 likes · 4K views
Mayur Naik reposted
prof-g @prof_g
my first technical job was a summer job for the computer division of a bank in cleveland ohio. i was 19 and into assembly, pascal, etc. couldn't wait to see what i would be assigned...
=== 1988 ===
my job is in the tape library. stand in front of a green-screen-matrix-terminal. wait for a 6-digit number to appear. memorize it. run to tape library & select tape. load it into the mainframe, replacing old tape. read off number of old tape and file correctly. go back to monitor & wait for a new number. at exciting moments, multiple numbers appear.
===
i was replaced by technology. my hard work rendered useless. none of my skills from then matter to the new tech. i am completely obsolete. & i do not want to do that job ever again :-)
===
i can't wait for AI to make much of my job obsolete:
> writing & grading tests
> creating curricula
> managing canvas sites
> scheduling meetings
> serving on committees
> formatting papers
> looking up references
> diagram chasing
> writing slide decks
> reading resumes & portfolios
>>>
2 replies · 5 reposts · 57 likes · 5.6K views
Mayur Naik reposted
Adam Stein @adamlsteinl
Announcing our NeurIPS paper: Once Upon an Input: Reasoning via Per-Instance Program Synthesis (PIPS)
📝: arxiv.org/abs/2510.22849
Why do LLMs (and LLM agents) still struggle on hard reasoning problems which should be solvable by writing and executing code? We find that the biggest problem with LLM-generated "programs" for reasoning is that they don't compute anything; they just hardcode the answer! PIPS fixes this by 1️⃣ abstracting the input into symbols, 2️⃣ generating code that maps symbols to the answer, and 3️⃣ refining the code with structural feedback. 🧵👇
[image]
2 replies · 8 reposts · 17 likes · 2.2K views
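The three steps in the announcement can be sketched as a toy loop. All function names and the arithmetic task below are stand-ins of mine, not the paper's API; in PIPS the synthesized program comes from an LLM and refinement is driven by structural feedback on the code.

```python
import re

def abstract_input(problem_text):
    # Step 1 (stand-in): map the concrete instance to symbols;
    # here, simply the integers mentioned in the question.
    return [int(tok) for tok in re.findall(r"\d+", problem_text)]

def synthesize_program():
    # Step 2 (stand-in): emit a program over the symbols instead of
    # hardcoding the answer; in PIPS this program is LLM-generated.
    return lambda symbols: sum(symbols)

def solve(problem_text, max_refinements=3):
    # Step 3 (stand-in): run the program and, on failure, re-synthesize;
    # the real system refines using structural feedback on the code.
    symbols = abstract_input(problem_text)
    for _ in range(max_refinements):
        program = synthesize_program()
        try:
            return program(symbols)
        except Exception:
            continue
    return None

print(solve("What is the sum of 3, 5, and 7?"))  # 15
```

The key property the thread highlights is that `solve` computes the answer from the abstracted symbols; a hardcoded `return 15` would pass this one instance but generalize to nothing.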
Mayur Naik reposted
prof-g @prof_g
@emollick i've spent serious efforts working on developing math problems with an unambiguous (e.g., numerical) answer that gpt-5-pro cannot solve. it is *nontrivial* to do so. it was totally different even 4-6 months ago.
12 replies · 30 reposts · 284 likes · 263.6K views
Mayur Naik @AI4Code
It is easy to overlook but hard to overstate how big a role "test-time inference" has played in modern LLM gains in reasoning. I myself wasn't sure, so I put GPT-5 and Gemini-2.5 to a classic programming puzzle called variable shadowing (en.wikipedia.org/wiki/Variable_…):

Both models in "Thinking" mode solved it correctly but took around 2 minutes. The thinking was elaborate (and correct): "I'm currently focused on rewriting the user's C-like code, aiming to replace all instances of 'y' with 'x'. However, I've hit a snag. The crucial part is to avoid this substitution in specific scenarios."

More surprisingly, both models in "Instant" mode got it consistently wrong, giving a glimpse into their pre-training levels of reasoning, which seem to have hardly improved over the years, if at all.

Is there a middle ground which reliably answers this simple question correctly without thinking for too long? The best solution today remains prompt engineering. Appending "Think step by step before answering" to the prompt causes both models to engage in a brief monologue which turns out to be crucial to subsequently generating the correct program.

The recent "Auto" mode of LLMs is intended to automatically strike this balance and forgo the need for prompt engineering. But it currently errs on the side of over-thinking, or worse, under-thinking. LLMs will undoubtedly get better at calibrating their "Auto" mode, but I am equally excited about emerging architectures like diffusion models, which are not constrained by left-to-right generation and can iteratively refine their answer the way humans typically do.
[3 images]
1 reply · 0 reposts · 11 likes · 807 views