Atharv Sonwane

41 posts

@twm_as

CS PhD @ Cornell. Prev. RF @ Microsoft Research India, CS @ BITS Goa. AI, PL, Robotics

Joined December 2018
948 Following · 226 Followers
Varshita Kolipaka @VarshitaKolipa1
teacher-forcing but for agents, nice
1
0
4
583
Atharv Sonwane retweeted
Jatin Prakash @bicycleman15
What to do when you have zero rewards during RL? We benchmarked RL baselines on a simple star-graph task where they underperform in zero reward scenarios. Turns out, a dead simple data-centric intervention of just adding easy samples of the task helps unlock RL training! 👇
Anirudh Buvanesh @AnirudhBuvanesh

Zero rewards after tons of RL training? 😞 Before using dense rewards or incentivizing exploration, try changing the data. Adding easier instances of the task can unlock RL training. 🔓📈 To learn more, check out our blog post here: spiffy-airbus-472.notion.site/What-Can-You-D…. Keep reading 🧵(1/n)

2
7
15
856
carlos @_carlosejimenez
We just updated the SWE-bench Lite leaderboard with SWE-agent GPT4o! It gets slightly worse accuracy (17%) than GPT4 (18%). Super interested in whether people can build out new tools for SWE-agent with GPT4o to make it better!
5
2
19
4.2K
Atharv Sonwane @twm_as
@paulgauthier Super interesting work. Quick question: while evaluating on SWE-bench, does aider make use of the "hints text" provided in the dataset?
1
0
1
166
Paul Gauthier @paulgauthier
Aider is SOTA on the main SWE Bench, scoring 18.9% vs Devin at 13.9%, AmazonQ at 13.8%. So aider is now SOTA on both SWE Bench & SWE Bench Lite. Achieved via static code analysis, reliable LLM code editing, auto-fixing lint/test errors; not slow, expensive "agentic" behaviors. aider.chat/2024/06/02/mai…
9
22
132
22.5K
Atharv Sonwane retweeted
Aditya Kanade @adityakanade0
Nice to see our work (CORE) on using LLMs to resolve code quality issues flagged by static analysis tools like CodeQL (Python) and Sorald (Java) accepted in FSE 2024! Thanks to all the collaborators for the great effort 👍 @FSEconf #FSE2024 Pre-print: arxiv.org/abs/2309.12938
Nalin Wadhwa @ ICLR 2026 @nalin_wadhwa

📢 Frustrated with code quality issues? LLMs can Help! 🚀 We introduce COde REvisions (CORE), a language agnostic tool that can help fix issues flagged by static analysis tools with minimal setup. Excited to share that our paper has been accepted at #FSE2024! 🎉

1
4
28
2.5K
Atharv Sonwane @twm_as
At #NeurIPS2023 and interested in automating repository level coding with LLMs? I'll be at our poster today on CodePlan at the Foundation Models for Decision Making Workshop! Venue: Hall E2 till 5:30 PM
Aditya Kanade @adityakanade0

LLMs are good at localized coding tasks. What if a task spans multiple inter-dependent files? These “repository-level coding tasks” cannot be solved directly using LLMs. We formulate these as a planning problem and design a task-agnostic, neuro-symbolic framework called CodePlan.

0
3
27
2.5K
Atharv Sonwane retweeted
Aditya Kanade @adityakanade0
LLMs are good at localized coding tasks. What if a task spans multiple inter-dependent files? These “repository-level coding tasks” cannot be solved directly using LLMs. We formulate these as a planning problem and design a task-agnostic, neuro-symbolic framework called CodePlan.
AK @_akhaliq

CodePlan: Repository-level Coding using LLMs and Planning
paper page: huggingface.co/papers/2309.12…

Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase, involve pervasively editing the entire repository of code. We formulate these activities as repository-level coding tasks. Recent tools like GitHub Copilot, which are powered by Large Language Models (LLMs), have succeeded in offering high-quality solutions to localized coding problems. Repository-level coding tasks are more involved and cannot be solved directly using LLMs, since code within a repository is inter-dependent and the entire repository may be too large to fit into the prompt. We frame repository-level coding as a planning problem and present a task-agnostic framework, called CodePlan, to solve it. CodePlan synthesizes a multi-step chain of edits (plan), where each step results in a call to an LLM on a code location with context derived from the entire repository, previous code changes and task-specific instructions. CodePlan is based on a novel combination of an incremental dependency analysis, a change may-impact analysis and an adaptive planning algorithm. We evaluate the effectiveness of CodePlan on two repository-level tasks: package migration (C#) and temporal code edits (Python). Each task is evaluated on multiple code repositories, each of which requires inter-dependent changes to many files (between 2-97 files). Coding tasks of this level of complexity have not been automated using LLMs before. Our results show that CodePlan has better match with the ground truth compared to baselines. CodePlan is able to get 5/6 repositories to pass the validity checks (e.g., to build without errors and make correct code edits) whereas the baselines (without planning but with the same type of contextual information as CodePlan) cannot get any of the repositories to pass them.

3
10
78
18.4K
Atharv Sonwane retweeted
SAiDL @SforAiDL
We are excited to present to you the third edition of "AI Symposium," in association with @appcair - the AI Research Lab of @BITSPilaniGoa ! [1/6]
1
15
25
0
Atharv Sonwane retweeted
Stats of India @Stats_of_India
Who exactly is Indian middle class? • 90% of Indians make less than 25,000 monthly. • If you're making > 1L a month, you're among the top 3%.
105
908
3.7K
0
Atharv Sonwane retweeted
Nithin Kamath @Nithin0dha
How large is the Indian market for B2C tech businesses in terms of users who can generate revenue? Maybe 15 crores max! Here's why, with Fintech as a reference, since some data is available. I guess it is important to know this, so we can all be rationally optimistic. 1/11
168
1.5K
6.4K
0
Atharv Sonwane retweeted
Fifty Two @FiftyTwoDotIn
It's time for another 🧵 IIT GIRLS: How women students at IIT Bombay in the 70s were radicalised by the sexism they faced on campus. And went on to change science in India forever.
26
887
3.6K
0
Atharv Sonwane retweeted
SAiDL @SforAiDL
Reminder! The Social session on Gathertown is starting in 15 mins. Please note that only accepted people would be able to join this event.
1
2
5
0
Atharv Sonwane retweeted
SAiDL @SforAiDL
Prof. Aaditeshwar's talk on: "Developing tech for AI and Social Development" is in progress right now. Link: us02web.zoom.us/j/84198462985?…
0
3
6
0