Padarn
@Padarn
282 posts
London, England · Joined May 2009
530 Following · 45 Followers
Padarn@Padarn·
@shi_weiyan Thanks! I don't see the recording at the link though?
1 reply · 0 reposts · 0 likes · 20 views
Weiyan Shi@shi_weiyan·
Recording: tinyurl.com/ymc5tady Hard to moderate a "multi-agent multi-turn human-AI interaction" session😂 so my answer to q1: “I’d love a moderation agent in 1yr🙏” Let's also put AI on more panels! As Claude said, AI can offer views as “someone who sits on both sides”🤣
2 replies · 2 reposts · 7 likes · 1K views
Weiyan Shi@shi_weiyan·
fun panel with @jaseweston @ysu_nlp @willccbb @xwang_lk @natashajaques
- What agents can/can't solve in 1 yr
- 1K+ step tasks
- Academia & long-horizon tasks
- Continual learning: in-context vs weights
- Human-AI co-evolution
Claude joined as our first AI panelist! Recording🧵

Weiyan Shi@shi_weiyan

Finally with a closing keynote by @ysu_nlp on “Computer Use: Modern Moravec’s Paradox”, we connect the history and the future 🙌 — “symbolic reasoning” vs “Perception & Mobility” in agents — future of AI — dragon-slaying on agent plasticity and reliability

9 replies · 13 reposts · 83 likes · 24K views
Weiyan Shi@shi_weiyan·
The afternoon starts with @jaseweston on “challenges in long-horizon tasks” and solutions 🤩
- failure from memory
- credit assignment in outcome-only reward
- lack of environment & generalization
1 reply · 5 reposts · 9 likes · 7.1K views
Padarn@Padarn·
@mitsuhiko OpenAI have a minimal implementation of the Responses API for gpt_oss github.com/openai/gpt-oss… This seems less necessary for open models though: if you have access to the full trace (thinking included), the missing state is the KV-cache, which I'd consider closer to an optimization.
0 replies · 0 reposts · 0 likes · 11 views
Armin Ronacher ⇌@mitsuhiko·
Followup to yesterday's post: I'm starting to think of agents and LLM APIs as a state synchronization problem, and that we might look into what the local-first folks are doing. Dumped my thoughts here: lucumr.pocoo.org/2025/11/22/llm…
15 replies · 19 reposts · 233 likes · 57.6K views
Padarn@Padarn·
@athyuttamre @mitsuhiko Is there any way to extend the retention? Does the whole thread expire at 30d, or just the stored state of some response_ids?
0 replies · 0 reposts · 0 likes · 28 views
Atty Eleti@athyuttamre·
All items are written to the DB and retained for 30d. If the request fails before it reached us, then no changes. If the request fails after it reaches us, you’ll have state mismatch, but your application will always send the previous_response_id that _it_ thinks was the last one, and allow you to continue from where you left off.
2 replies · 0 reposts · 2 likes · 384 views
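The retention and resume semantics described above can be sketched as a toy model. The class, function, and field names here are invented for illustration; this is not the real OpenAI API, just the 30-day retention and previous_response_id bookkeeping:

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # items are retained for 30 days

class ResponseStore:
    """Toy model of the server-side DB: each response item is written
    with a creation timestamp and expires after RETENTION."""
    def __init__(self):
        self.items = {}  # response_id -> (created_at, message history)

    def create(self, response_id, messages, now):
        self.items[response_id] = (now, list(messages))

    def get(self, response_id, now):
        created_at, messages = self.items[response_id]
        if now - created_at > RETENTION:
            raise KeyError(f"{response_id} has expired")
        return messages

def resume(store, previous_response_id, new_input, now):
    """The client only tracks the last response_id *it* saw; resuming
    means replaying the stored history plus the new input."""
    return store.get(previous_response_id, now) + [new_input]
```

In this model, a request that fails before reaching the store writes nothing, so resending with the same previous_response_id picks up cleanly, matching the failure behaviour described in the reply.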
Padarn@Padarn·
Great article from the Spotify Experimentation Team: Beyond Winning: Spotify’s Experiments with Learning Framework engineering.atspotify.com/2025/9/spotify… Would be really interested to hear what "powered" means for a metric. Is there an 'effect size of interest'?
0 replies · 0 reposts · 0 likes · 17 views
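One common reading of "powered" is via a minimum detectable effect: a metric is powered if the experiment's sample size is large enough to detect the effect size of interest with the desired power. A stdlib-only sketch of that calculation (the function name and defaults are my own, not Spotify's):

```python
from statistics import NormalDist

def n_per_group(mde, sd, alpha=0.05, power=0.8):
    """Sample size per arm for a two-sided, two-sample z-test to detect
    an absolute difference of `mde` (the effect size of interest) in a
    metric with standard deviation `sd`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # about 0.84 for 80% power
    return 2 * ((z_alpha + z_beta) * sd / mde) ** 2

# Detecting a 0.01 absolute lift in a metric with sd 0.5 needs ~39k users per arm.
```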
Padarn@Padarn·
@ezyang That’s a good point: I’ve not had this issue when using Cursor, presumably because there is a separate model call to fit the edit to the current code?
0 replies · 0 reposts · 0 likes · 16 views
Edward Z. Yang@ezyang·
@Padarn The main problem is the formatter is going to change the structure of the edited code, which means we have to tell the LLM that the code changed, and we are more at risk of the LLM hallucinating the old structure
1 reply · 0 reposts · 0 likes · 15 views
Edward Z. Yang@ezyang·
Interesting codemcp problem: if you have an autoformatter, when should it run? I've currently decided to make it run at the end of the task, so the LLM doesn't get sidetracked fixing formatting errors while it still has N other tasks to do. But maybe Sonnet is up to the task? idk
1 reply · 0 reposts · 3 likes · 975 views
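The "run the formatter once, at the end of the task" policy can be sketched as a small session object. The class, the black command, and the method names are hypothetical illustrations, not codemcp's actual design:

```python
import subprocess

class EditSession:
    """Defer autoformatting: record files the LLM edits during a task,
    then run the formatter exactly once when the task finishes, so the
    model never sees mid-task formatting churn."""
    def __init__(self, formatter_cmd=("black",)):  # hypothetical formatter
        self.formatter_cmd = list(formatter_cmd)
        self.dirty = set()

    def record_edit(self, path):
        self.dirty.add(path)

    def finish_task(self, run=subprocess.run):
        touched = sorted(self.dirty)
        if touched:
            # One formatter invocation over every touched file.
            run(self.formatter_cmd + touched, check=True)
        self.dirty.clear()
        return touched
```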
Padarn@Padarn·
@SlackHQ Hey Slack! I'm wondering if there is anywhere I can learn about the roadmap for the Slack API. In particular I want to know when I can get thread information from this event: api.slack.com/events/assista…
1 reply · 0 reposts · 0 likes · 73 views
Amazon Science@AmazonScience·
Anomaly detection on graphs is complex because of graph topologies, and training data is scarce. At @WSDMSocial, Amazon researchers showed how to generate anomalous graphs using a variational graph neural network and diffusion modeling in the latent space. amazon.science/blog/anomaly-d…
1 reply · 2 reposts · 10 likes · 1.5K views
Padarn@Padarn·
@ezyang I couldn’t find it :sad:
0 replies · 0 reposts · 0 likes · 48 views
Edward Z. Yang@ezyang·
Many thanks to the Rust Programming Languages Discord for helping me figure it out.
2 replies · 0 reposts · 3 likes · 859 views
Edward Z. Yang@ezyang·
Performance puzzle! In Rust, you are iterating lines of a file running the regex "\d{2}:\d{2}:\d{2} [A-Za-z0-9]+: (.+)". On production data, it is 20MB/s slow. But when you delete (.+) from the regex, your speed jumps up to 1GB/s. It doesn't repro on test data. What's the problem?
2 replies · 0 reposts · 7 likes · 4.2K views
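One hedged guess at the mechanism: without the capture group the engine only has to recognize the prefix, but `(.+)` forces it to delimit the rest of the line, and in Rust's regex crate resolving capture positions reportedly falls off the fast lazy-DFA path onto a slower engine. Production lines that actually match (and are long) pay that cost on every line; short or non-matching test data doesn't. The shape of the workload difference is visible even in Python:

```python
import re

# Patterns from the puzzle: identical prefix, with and without the
# trailing capture group.
WITH_CAPTURE = re.compile(r"\d{2}:\d{2}:\d{2} [A-Za-z0-9]+: (.+)")
PREFIX_ONLY = re.compile(r"\d{2}:\d{2}:\d{2} [A-Za-z0-9]+: ")

line = "12:34:56 worker1: " + "x" * 1_000_000  # long production-style line

m_full = WITH_CAPTURE.match(line)
m_prefix = PREFIX_ONLY.match(line)

# The prefix-only match stops right after ": "; the capturing match must
# walk the remaining megabyte to delimit group 1.
assert m_prefix.end() == len("12:34:56 worker1: ")
assert m_full.end() == len(line)
assert m_full.group(1) == "x" * 1_000_000
```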
Padarn@Padarn·
@rakyll Curious what, if any, tools you're using to do this? I've found Copilot only "okay" as an experience for test-driven LLM development.
0 replies · 0 reposts · 0 likes · 60 views
Jaana Dogan ヤナ ドガン@rakyll·
LLMs made software development difficult. I'm developing a fairly complicated state machine and my options are:
- Making LLMs manage the flow based on some descriptions of state and steps, and evaluating the hell out of it
- Generating the state machine code based on the same descriptions
- Generating a conformance test suite from the same descriptions and implementing the state machine myself
And various combinations of all three.
9 replies · 6 reposts · 85 likes · 34.8K views
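The second and third options above can share one artifact. A minimal sketch (the states and events are invented for illustration): the same declarative description both drives a table-driven state machine and generates the conformance suite any hand-written implementation must pass:

```python
# Declarative description of states and steps (invented example).
DESCRIPTION = {
    "start":   {"submit": "pending"},
    "pending": {"approve": "done", "reject": "start"},
    "done":    {},
}

def step(state, event, table=DESCRIPTION):
    """Table-driven implementation generated from the description."""
    try:
        return table[state][event]
    except KeyError:
        raise ValueError(f"no transition for {event!r} in state {state!r}")

def conformance_cases(table=DESCRIPTION):
    """Derive a conformance suite from the same description: every
    declared transition becomes a (state, event, expected_next) case."""
    return [(s, e, nxt) for s, ts in table.items() for e, nxt in ts.items()]
```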
Padarn@Padarn·
@ZheqingZhu @AIatMeta Fantastic stuff. Thanks a lot to you and your team for all the writing, really valuable for this field to have practical examples of where RL works today!
0 replies · 0 reposts · 1 like · 239 views
Zheqing (Bill) Zhu@ZheqingZhu·
2023 has been a really fruitful year for the Applied Reinforcement Learning Team at @AIatMeta! A quick summary of our external research contributions across open-source, recommender systems, ads, infra, experimentation and new algorithms:
1. Open-source software: we released Pearl (github.com/facebookresear…), our flagship OSS on production-ready reinforcement learning, which obtained 2K stars and 120+ forks on GitHub within 3 weeks of release and was presented at NeurIPS 2023. We were also nominated by @tryolabs as runners-up to top 2023 Python libraries.
2. Recommender systems: we've released four RL papers regarding recommender systems across exploration, on-policy RL, and offline learning:
- On-policy RL: Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning (RecSys 2023, arxiv.org/abs/2305.13747)
- Scalable neural bandit: Scalable Neural Contextual Bandit for Recommender Systems (CIKM 2023, arxiv.org/pdf/2306.14834)
- Scalable deep neural exploration: Deep Exploration for Recommendation Systems (RecSys 2023, arxiv.org/pdf/2109.12509)
- Offline learning: Learning to Bid and Rank Together (Machine Learning, Springer, ArXiv to be released)
3. Ads and auction systems: a new RL-based pacing algorithm is released that ensures interpretability and compatibility with classic controller-based pacing systems.
- Offline Reinforcement Learning for Optimizing Production Bidding Policies (in submission, arxiv.org/abs/2310.09426)
4. Data center and infrastructure: RL has been a really powerful tool for optimization and data center operations. We released two papers that leverage RL for network migration and resource allocation:
- Resource allocation: Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning (in submission, arxiv.org/abs/2306.17054)
- Network migration: Klotski - Efficient and Safe Network Migration of Large Production Datacenters (ACM SigComm 2023, dl.acm.org/doi/abs/10.114…)
5. Experimentation: to address training data leakage from test group to control group in A/B tests introduced by online exploration, we developed a new experimentation procedure to measure the true impact of exploration methods:
- Evaluating Online Bandit Exploration In Large-Scale Recommender System (KDD Workshop 2023, arxiv.org/abs/2304.02572)
6. New RL/Bandit algorithms: we also spent time advancing methods in non-stationary neural contextual bandits and model-based RL.
- Non-stationary neural contextual bandit: Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling (in submission, arxiv.org/abs/2310.07786)
- Offline model-based RL: IQL-TD-MPC - Implicit Q-Learning for Hierarchical Model Predictive Control (ICML workshop 2023 and in submission, arxiv.org/abs/2306.00867)
Looking forward to 2024, where we will bring more RL magic into the real world!
2 replies · 6 reposts · 60 likes · 10.5K views
Padarn@Padarn·
@eugeneyan Got it. It’s interesting it’s not shown up in the discussion at all when people worry about ordering the context in the prompt. Seems like it’d avoid having to guess about which documents were being conditioned on, and to what extent.
0 replies · 0 reposts · 1 like · 20 views
Eugene Yan@eugeneyan·
@Padarn yes that’s right. it turns out that the simpler approach you mention goes a long way, though there may be some gains from the techniques in the papers
1 reply · 0 reposts · 1 like · 57 views
Eugene Yan@eugeneyan·
Wrote abt patterns for LLM systems/products
• Evals: Track performance
• RAG: Add external knowledge
• Finetuning: Improve specific tasks
• Caching: Reduce latency & cost
• Guardrails: Ensure output quality
• Defensive UX: Anticipate & manage errors
eugeneyan.com/writing/llm-pa…
38 replies · 173 reposts · 790 likes · 141.2K views
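The caching pattern in the list above can be as simple as an exact-match lookup keyed on a hash of the request. A minimal sketch: the class name and keying scheme are my own, and real systems often add TTLs or semantic keys on top:

```python
import hashlib

class LLMCache:
    """Exact-match response cache: call the model only on a miss, so
    repeated prompts cost one API call instead of many."""
    def __init__(self, model_fn):
        self.model_fn = model_fn  # your actual LLM call goes here
        self.store = {}
        self.hits = 0

    def complete(self, model, prompt):
        key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
        if key in self.store:
            self.hits += 1
            return self.store[key]
        result = self.model_fn(model, prompt)
        self.store[key] = result
        return result
```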
Padarn@Padarn·
@thorstenball Seems like this got Amazon a lot of free advertising for a feature… maybe it’s all working out?
0 replies · 0 reposts · 0 likes · 11 views
Thorsten Ball@thorstenball·
So is the myth that every pixel you see on Amazon has been optimized and A/B tested for thousands of lifetimes before you laid your eyes on it — is that actually true? Because, man, I have a hard time accepting that *this* is the best way to present the add-to-wishlist button.
59 replies · 5 reposts · 291 likes · 71.7K views
Padarn@Padarn·
@airvistara Hello, can you help me with a lost Apple Watch on a flight? The flight was from Delhi to Singapore, UK115, departing 23:45 on 27 November.
0 replies · 0 reposts · 0 likes · 56 views
Padarn@Padarn·
@testingham (Off topic: sad those abstracts are not published)
0 replies · 0 reposts · 1 like · 13 views
tom cunningham@testingham·
NEW POST: Thinking about tradeoffs? Draw an ellipse. With applications to (1) experiment launch rules; (2) ranking weights in a recommender; and (3) allocating headcount in a company.
1 reply · 1 repost · 6 likes · 2.2K views
Padarn@Padarn·
@testingham I remember seeing this in a CODE@MIT abstract last year (I didn’t go but a colleague did) and loving the idea. Glad to see this write-up
0 replies · 0 reposts · 1 like · 15 views