Amir Khalesi

2.1K posts

@RetroMl

ML engineer - Trying to find out what is wrong with LLMs - e/acc - AI @ UT

Joined March 2020

516 Following · 530 Followers

Amir Khalesi@RetroMl·Feb 25

@rzdjafari Shit happens to the best of us


Reza Jafari@rzdjafari·Feb 24

Before the internet shutdown I wanted to get out of salaried work, so I started two joint business ventures, one in Iran and one in Germany, and the shutdown blew them both up. This time I went back to being an employee: I reached an agreement with a very good, large company and start the job after Eid. I just hope the war doesn't wreck it this time.

Reza Jafari@rzdjafari

There isn't much to say in this situation, but this internet shutdown destroyed all of my income streams and left me at absolute zero. I don't know when I'll be able to revive them.


Amir Khalesi@RetroMl·Oct 14

@_soeil Which hall are you in, and where exactly, so we can drop by?


گوگل‌کُنِ باحقوق@_soeil·Oct 13

Just look at the hardware, it's captivating. This year we're at North Star GITEX with Luka; if you're here, drop by. Isfahan, kilometer 900 of the Shiraz road, permanent site of the international exhibitions, Dubai Marina.


Amir Khalesi@RetroMl·Aug 5

@ontrader2022 @SarcasticPyDev I use it with a laptop, so I'd suggest searching for your tablet model and whether it supports this. In general, if the tablet can connect to a monitor or TV through a hub, it can connect with this cable too.


محمدرضا شاهی@ontrader2022·Aug 5

@RetroMl @SarcasticPyDev Bro, can I connect a tablet to a TV and buy a game controller for it?


Soroush Moosapour@SarcasticPyDev·Aug 5

Guys, has anyone used the HDMI output on a USB Type-C hub? I want to add more monitors and I'm looking around.


Amir Khalesi retweeted

atlas@creatine_cycle·Aug 3

here is what happens when you take creatine:
- 5gs: bigger muscles
- 15gs: bigger brain
- 70gs: replace sleep
- 88gs: remote viewing
- 120gs: agentic workflows


Amir Khalesi@RetroMl·Jul 22

@Mortal__98 Even the officer monitoring social media got upset hearing you say that :))


آشفته‌حال بیداربخت@Mortal__98·Jul 22

Now I have to shove that song up my ass

آشفته‌حال بیداربخت@Mortal__98

I've picked the right song for my sad departure-flight video. Now I just have to take the TOEFL, write my paper, get an admission, and finally get a visa 🏃‍♂️


Amir Khalesi@RetroMl·Jul 13

@karpathy @ibehnias

QAM

Andrej Karpathy@karpathy·Jul 13

Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly increase (/decrease) the probability of every action I took for the future". You get a lot more leverage from verifier functions than explicit supervision, this is great. But first, it looks suspicious asymptotically - once the tasks grow to be minutes/hours of interaction long, you're really going to do all that work just to learn a single scalar outcome at the very end, to directly weight the gradient? Beyond asymptotics and second, this doesn't feel like the human mechanism of improvement for majority of intelligence tasks. There's significantly more bits of supervision we extract per rollout via a review/reflect stage along the lines of "what went well? what didn't go so well? what should I try next time?" etc. and the lessons from this stage feel explicit, like a new string to be added to the system prompt for the future, optionally to be distilled into weights (/intuition) later a bit like sleep. In English, we say something becomes "second nature" via this process, and we're missing learning paradigms like this. The new Memory feature is maybe a primordial version of this in ChatGPT, though it is only used for customization not problem solving. Notice that there is no equivalent of this for e.g. Atari RL because there are no LLMs and no in-context learning in those domains. Example algorithm: given a task, do a few rollouts, stuff them all into one context window (along with the reward in each case), use a meta-prompt to review/reflect on what went well or not to obtain string "lesson", to be added to system prompt (or more generally modify the current lessons database). Many blanks to fill in, many tweaks possible, not obvious. 
Example of lesson: we know LLMs can't super easily see letters due to tokenization and can't super easily count inside the residual stream, hence 'r' in 'strawberry' being famously difficult. Claude system prompt had a "quick fix" patch - a string was added along the lines of "If the user asks you to count letters, first separate them by commas and increment an explicit counter each time and do the task like that". This string is the "lesson", explicitly instructing the model how to complete the counting task, except the question is how this might fall out from agentic practice, instead of it being hard-coded by an engineer, how can this be generalized, and how lessons can be distilled over time to not bloat context windows indefinitely. TLDR: RL will lead to more gains because when done well, it is a lot more leveraged, bitter-lesson-pilled, and superior to SFT. It doesn't feel like the full story, especially as rollout lengths continue to expand. There are more S curves to find beyond, possibly specific to LLMs and without analogues in game/robotics-like environments, which is exciting.
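
The "example algorithm" in the post above (a few rollouts, all placed in one context with their rewards, a reflection prompt that yields a string "lesson", and that lesson appended to the system prompt) could be sketched roughly as follows. This is only an illustration under stated assumptions: `lesson_step`, `compose_prompt`, `build_reflection_prompt`, and the three `toy_*` functions are hypothetical names, and the toy functions are deterministic stand-ins for real LLM and verifier calls, not anything described in the post.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rollout:
    transcript: str
    reward: float


def compose_prompt(system_prompt: str, lessons: List[str]) -> str:
    """Append the current lessons to the base system prompt."""
    if not lessons:
        return system_prompt
    bullets = "\n".join(f"- {lesson}" for lesson in lessons)
    return f"{system_prompt}\n\nLessons learned so far:\n{bullets}"


def build_reflection_prompt(task: str, rollouts: List[Rollout]) -> str:
    """Stuff every rollout and its reward into one context for review."""
    parts = [f"Task: {task}", ""]
    for i, r in enumerate(rollouts, start=1):
        parts.append(f"--- Rollout {i} (reward={r.reward}) ---")
        parts.append(r.transcript)
    parts.append("")
    parts.append("What went well? What didn't? Reply with ONE short lesson.")
    return "\n".join(parts)


def lesson_step(
    task: str,
    system_prompt: str,
    lessons: List[str],
    policy: Callable[[str, str], str],
    verifier: Callable[[str, str], float],
    reflector: Callable[[str], str],
    n_rollouts: int = 3,
) -> Tuple[List[str], List[Rollout]]:
    """Run rollouts, reflect on them, and return the updated lessons list."""
    prompt = compose_prompt(system_prompt, lessons)
    rollouts = []
    for _ in range(n_rollouts):
        transcript = policy(prompt, task)
        rollouts.append(Rollout(transcript, verifier(task, transcript)))
    lesson = reflector(build_reflection_prompt(task, rollouts)).strip()
    if lesson and lesson not in lessons:
        lessons = lessons + [lesson]
    return lessons, rollouts


# --- Hypothetical stand-ins for the model and the verifier (assumptions) ---
LESSON = "When counting letters, separate them by commas and keep an explicit counter."


def toy_policy(prompt: str, task: str) -> str:
    word, letter = "strawberry", "r"
    if "explicit counter" in prompt:  # the lesson is present in the prompt
        count = sum(1 for ch in word if ch == letter)
        return f"{','.join(word)} -> {count}"
    return "2"  # naive wrong guess without the lesson


def toy_verifier(task: str, transcript: str) -> float:
    return 1.0 if transcript.endswith("3") else 0.0


def toy_reflector(reflection_prompt: str) -> str:
    return LESSON


task = "How many times does 'r' appear in 'strawberry'?"
base = "You are a helpful assistant."
lessons, first = lesson_step(task, base, [], toy_policy, toy_verifier, toy_reflector)
lessons, second = lesson_step(task, base, lessons, toy_policy, toy_verifier, toy_reflector)
```

In this toy run the first batch of rollouts fails, the reflection step adds one lesson, and the second batch succeeds because the lesson is now in the prompt. The open questions raised in the post (how lessons generalize, and how to distill them so the context does not grow indefinitely) are not addressed by this sketch.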


Amir Khalesi@RetroMl·Jul 11

@jamshidpalang Dear Ali, if you remember, we spent several summers together at the Elites Association / Hamgam camps. From what I came to know of you, you were not and are not someone who reached your current position merely by using a quota. Pay no attention to this talk; the people who matter know you well.


Amir Khalesi@RetroMl·Jul 2

@ibehnias @mamadou_gamedev @iSegar0 I was reading an English thread about Carmack's video and thought I'd tag you, then remembered you had tagged me in something. I came to check and it was this very thread :))


Behnia Soleymani@ibehnias·Jul 2

@mamadou_gamedev @iSegar0 @RetroMl

QAM


Amir Khalesi@RetroMl·Apr 8

@jxmnop @ibehnias

QAM

dr. jack morris@jxmnop·Apr 7

pretty mind-blowing fact I just learned about transformer language models: the positional embeddings don't really do anything. you can just get rid of them and the model still works just as well

sounds impossible, doesn't it? turns out standard LLMs aren't actually permutation-invariant because of the causal mask. so they just learn somehow to "figure out" what position they're at by counting the number of tokens they can see at a given position p

crazy
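
A tiny illustration of the point in the post above, that a causal mask alone leaks position. This is a sketch of one possible mechanism, not a claim about what trained models actually learn. It assumes content-independent attention scores (so attention is uniform over visible tokens) and a marker value carried only by the first token, for example a BOS token; both are assumptions made for the demo, and `attention` is a hypothetical helper.

```python
import math


def attention(scores, values, causal):
    """Single-head softmax attention over scalar values, no positional input."""
    n = len(values)
    out = []
    for p in range(n):
        # With a causal mask, position p only sees tokens 0..p.
        visible = range(p + 1) if causal else range(n)
        m = max(scores[p][j] for j in visible)
        weights = [math.exp(scores[p][j] - m) for j in visible]
        z = sum(weights)
        out.append(sum(w * values[j] for w, j in zip(weights, visible)) / z)
    return out


n = 5
scores = [[0.0] * n for _ in range(n)]  # uniform attention (assumption)
values = [1.0] + [0.0] * (n - 1)        # only the first token carries a marker

causal_out = attention(scores, values, causal=True)   # 1/(p+1) at position p
full_out = attention(scores, values, causal=False)    # 1/n at every position

# The causal output can be inverted to recover the position index.
recovered = [round(1 / x) - 1 for x in causal_out]
```

With the mask, the output at position p is 1/(p+1), so the number of visible tokens can be read off directly. Without the mask, every position produces the same value and no positional signal is available.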


Amir Khalesi@RetroMl·Mar 17

@Mortal__98 Man, when I read the news, the first person I thought of was you :(


Amir Khalesi retweeted

Eldar Kurtić@_EldarKurtic·Feb 27

Quantization in the era of reasoning models: How does quantization impact the reasoning capabilities of DeepSeek-R1 models across distilled Llama and Qwen families? 👇 Check the thread for two surprising findings in evaluations of these models!


Amir Khalesi@RetroMl·Feb 17

@Nima_PhD_ The team is currently implementing it at the bank level. If it's possible, I'd be glad to collaborate.


Amir Khalesi@RetroMl·Feb 17

@ibehnias @corefpark Nice, we should read it


Behnia Soleymani@ibehnias·Feb 17

@corefpark @RetroMl

QAM


Core Francisco Park @ NeurIPS2025@corefpark·Feb 16

💥New Paper! Algorithmic Phases of In-Context Learning: We show that transformers learn a superposition of different algorithmic solutions depending on the data diversity, training time and context length! 1/n