Zhensu Sun
@v587su

122 posts

A Ph.D. student @sgSMU. My research focuses on improving coding productivity with AI techniques.

Joined November 2016
311 Following · 165 Followers
Zhensu Sun @v587su ·
How could two ICSE reviewers think my paper is novel while the remaining one thinks it is incremental? It doesn't make sense 😮‍💨
SQuirreL @IndianInExile ·
@alxfazio Where are these research papers published? Any website or any subscription?
Zhensu Sun retweeted
alex fazio @alxfazio ·
java is the most token-efficient language, let that sink in
Zhensu Sun retweeted
Rohan Paul @rohanpaul_ai ·
Stripping code formatting cuts LLM token cost without hurting accuracy. Average input tokens drop by 24.5%, with output quality basically unchanged.

The core issue is simple: indentation, spaces, and newlines help humans read, but they inflate the tokens that models pay to process. The authors remove only cosmetic formatting while keeping program meaning identical, checked by matching the abstract syntax tree of the code.

They test Fill-in-the-Middle code completion, where a model fills a missing block, across Java, C++, C#, and Python. Performance stays stable on unformatted input: big models barely move, smaller ones wobble a bit, and Python sees less savings because its layout is part of the language.

One surprise: models still print nicely formatted code even when given smashed input, so output token savings are small. To fix that, two cheap tactics work: explicit prompts that say "output without formatting", and light fine-tuning on unformatted samples. With clear instructions or tiny training, output length shrinks by 25% to 36% while pass rate on the first try holds.

They also ship a tool that strips formatting before inference, then restores it after, so humans read clean code while the model pays less.

Paper: arxiv.org/abs/2508.13666
Paper Title: "The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget"
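The trick described above can be sketched in a few lines. This is a minimal illustration, not the paper's tool: it assumes Python source, where only blank lines and trailing whitespace are safely cosmetic, and uses character counts as a rough stand-in for tokens.

```python
import ast

def strip_cosmetic_formatting(src: str) -> str:
    """Drop blank lines and trailing whitespace.

    For Python, indentation is semantic and must be kept; Java or C++
    would allow far more aggressive stripping, hence their bigger savings.
    """
    lines = [ln.rstrip() for ln in src.splitlines()]
    return "\n".join(ln for ln in lines if ln)

def same_meaning(a: str, b: str) -> bool:
    # Compare abstract syntax trees to confirm semantics are untouched,
    # mirroring the AST-equality check described in the tweet.
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))

src = "def add(a, b):\n\n    result = a + b   \n\n    return result\n"
stripped = strip_cosmetic_formatting(src)

assert same_meaning(src, stripped)    # program meaning unchanged
print(len(src), "->", len(stripped))  # fewer characters to tokenize
```

For whitespace-insensitive languages the stripper can also collapse indentation and newlines, which is where the larger savings reported in the paper come from.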
Zhensu Sun @v587su ·
Want to save your LLM budget without sacrificing performance? Here's a useful trick: removing non-essential code formatting, like indentation, newlines, and extra whitespace, cuts input tokens by an average of 24.5%! Check out our full study: arxiv.org/abs/2508.13666
David Lo @davidlo2015 ·
Thank you very much, @msrconf, award committee, nominator, and endorser for the help, support, and recognition, which are very much appreciated :) Thank you very much to all my mentors, students, and collaborators for the help and support :)
MSR 2025@msrconf

We are thrilled to announce that #MSR2025 Foundational Contribution Award 🏆💫goes to David Lo @davidlo2015 for pioneering, influential, and lasting contributions to transforming bug and test data into insights and automation that improve software quality and productivity.

Zhensu Sun retweeted
BNO News @BNONews ·
OpenAI whistleblower Suchir Balaji, who accused the company of breaking copyright law, found dead in apparent suicide
Zhensu Sun retweeted
FORGE @ConfForge ·
🎉 Exciting News! 🎉 We are thrilled to announce that ACM SIGSOFT has officially upgraded FORGE from an ICSE Special Event to an ICSE Co-Located Conference! 🚀 We can’t wait to see your submissions for FORGE 2025! See more below👇 #FORGE #FORGE2025 @ICSEconf
Zhensu Sun retweeted
Philipp Schmid @_philschmid ·
"AI is not making any progress"? Look closer. 🙄 GPT-4 level models got 240x cheaper in just 2 years!

AI progress isn't linear and isn't just about bigger models.

BERT -> DistilBERT
Llama 2 70B -> Llama 3 8B
GPT-4 -> GPT-4o-mini
Llama 3 405B → Llama 4 70B?? 🤔

Models get bigger, then smaller but equally powerful. It's a cycle of innovation. Today's quality per $ is the most expensive we'll see. Making it cheaper will lead to more people using, learning, and building with AI, which might unlock more potential and "goodput" for everyone than yet another Foundation Model!

AI's real progress: getting into more hands. 🤗

[Image credits: @davidtsong]
Zhensu Sun @v587su ·
Our recent work on self-healing software systems is available on arXiv now 🥳: [2408.01055] LLM as Runtime Error Handler: A Promising Pathway to Adaptive Self-Healing of Software Systems (arxiv.org)
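The idea in the paper title, letting an LLM step in when a runtime error would otherwise crash the program, can be sketched as a decorator. This is a minimal illustration under my own assumptions, not the paper's implementation; `ask_llm_for_recovery` is a hypothetical stand-in for a real model call.

```python
import functools

def ask_llm_for_recovery(func_name: str, error: Exception):
    # Hypothetical stub: a real system would prompt an LLM with the
    # function signature, arguments, and traceback, then parse its
    # suggested recovery value. Here we just return a safe default.
    return None

def llm_error_handler(func):
    """Wrap a function so unhandled exceptions defer to the LLM."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Fall back to the model-suggested value instead of crashing.
            return ask_llm_for_recovery(func.__name__, exc)
    return wrapper

@llm_error_handler
def parse_port(value: str) -> int:
    return int(value)  # raises ValueError on non-numeric input

print(parse_port("8080"))    # normal path -> 8080
print(parse_port("eighty"))  # handled path -> None (the recovery value)
```

The decorator shape keeps the healing logic out of the business code, so existing functions can opt in with one line.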
Zhensu Sun @v587su ·
Impressive
Robert Scoble@Scobleizer

Wow. @Jandodev just showed me a prompt humans can’t read but LLMs understand this language better. The San Francisco AI people are designing a new language. In stealth. You are first to see it.

Zhensu Sun retweeted
Robert Scoble @Scobleizer ·
Wow. @Jandodev just showed me a prompt humans can’t read but LLMs understand this language better. The San Francisco AI people are designing a new language. In stealth. You are first to see it.
Zhensu Sun retweeted
Guillaume Lample @ NeurIPS 2024 @GuillaumeLample ·
Today we are releasing two small models: Mathstral 7B and Codestral Mamba 7B.

On the MATH benchmark, Mathstral 7B obtains 56.6% pass@1, outperforming Minerva 540B by more than 20%. Mathstral scores 68.4% on MATH with majority voting@64, and 74.6% using a reward model.

Codestral Mamba is one of the first open-source models with a Mamba 2 architecture. It is the best 7B code model available, and is trained with a context length of 256k tokens.

Both models are released under the Apache 2 license. mistral.ai/news/mathstral/ mistral.ai/news/codestral…
Mistral AI@MistralAI

mistral.ai/news/mathstral/ mistral.ai/news/codestral…

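The "majority voting@64" metric in the announcement above can be sketched in a few lines: sample k final answers per problem and keep the most frequent one. This is a generic illustration of the voting step, not Mistral's evaluation code; the sampled answers below are made up, and k is reduced from 64 to 8 for brevity.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer among k sampled completions."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical: 8 sampled final answers for one MATH problem (k=8).
samples = ["42", "42", "41", "42", "40", "42", "41", "42"]
print(majority_vote(samples))  # "42" wins the vote
```

Voting over many samples smooths out individual bad generations, which is why maj@64 (68.4%) lands well above single-sample pass@1 (56.6%) on the same benchmark.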