Andrea | 🇸🇪🇪🇸🇻🇪

208 posts

@aicoding_

Computer Vision Engineer currently working as a Machine Learning Engineer. https://t.co/xLVKLO30rv https://t.co/DiKrU5Eya5

Joined July 2016
126 Following · 217 Followers
Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Xiang Yue @xiangyue96
Introducing Critique Fine-Tuning (CFT): a more effective SFT method for enhancing LLMs' reasoning abilities.
📄 Paper: arxiv.org/pdf/2501.17703
CFT is simple: instead of training models to directly answer questions, we train them to critique noisy answers.
What's fascinating is that while most approaches focus on using generative critique or reward models to provide feedback for policy models, these critique models can themselves serve as policy models: directly answering questions with stronger reasoning.
Interestingly, we also found that CFT saturates quickly: overtraining on critiques can even degrade problem-solving performance.
Work led by @YuboWang726 and collaborated with @WenhuChen
[1 image]
11 replies · 67 retweets · 306 likes · 23.2K views
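
The CFT post above describes training on critiques of noisy answers rather than on direct answers. A minimal sketch of what one such training pair might look like follows. The prompt template, the `CritiqueExample` class, and the `build_cft_example` helper are all hypothetical names chosen for illustration; the paper's actual data format may differ.

```python
from dataclasses import dataclass


@dataclass
class CritiqueExample:
    prompt: str  # what the model sees: the question plus a noisy candidate answer
    target: str  # what the model is trained to produce: the critique


def build_cft_example(question: str, noisy_answer: str, critique: str) -> CritiqueExample:
    """Assemble one training pair (hypothetical template, not the paper's)."""
    prompt = (
        f"Question:\n{question}\n\n"
        f"Candidate answer:\n{noisy_answer}\n\n"
        "Critique the candidate answer."
    )
    return CritiqueExample(prompt=prompt, target=critique)


# Illustrative data only; not taken from the paper.
example = build_cft_example(
    question="What is 12 * 13?",
    noisy_answer="12 * 13 = 146",
    critique="Incorrect: 12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.",
)
print(example.prompt)
print(example.target)
```

The contrast with plain SFT is that the supervised target here is the critique, while the possibly wrong answer is part of the input.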

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
ILIAS ISM @illyism
You don't need a reasoning model like R1 or o3, just use this .cursorrules with Claude Sonnet to add a thinking step, works 100x better.
[1 image]
80 replies · 273 retweets · 4.9K likes · 557.7K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Ivan Fioravanti ᯅ @ivanfioravanti
🔥 o3-mini-high beats deepseek r1 and o1-pro in a p5.js challenge! The o3-mini result is so good that it deserves a video of its own. deepseek r1 (bad result) and o1-pro (better) in comments below. Prompt in last comment. 1/4
70 replies · 128 retweets · 1.2K likes · 463.3K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Dimitris Papailiopoulos @DimitrisPapail
Transformers can overcome easy-to-hard and length generalization challenges through recursive self-improvement. Paper on arxiv coming on Monday. Link to a talk I gave on this below 👇 Super excited about this work!
[3 images]
19 replies · 141 retweets · 1K likes · 166.7K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Sam Altman @sama
o3-mini is out! smart, fast model. available in ChatGPT and API. it can search the web, and it shows its thinking. available to free-tier users! click the "reason" button. with ChatGPT plus, you can select "o3-mini-high", which thinks harder and gives better answers.
1.6K replies · 2K retweets · 26.1K likes · 3.2M views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Seunghyun Seo @SeunghyunSEO7
what up guys, I made a one-page comparison of MHA and MLA from @deepseek_ai for those who skipped the DS-V2 paper. pls correct me if I'm wrong.
[1 image]
4 replies · 47 retweets · 363 likes · 39.3K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
LangChain @LangChain
📚🤖 Advanced RAG + Agents Cookbook
A comprehensive open-source guide delivering production-ready implementations of cutting-edge RAG techniques with AI agents. Built with LangChain and LangGraph, it features advanced implementations like Hybrid, Self, and ReAct RAG.
Learn more: github.com/athina-ai/rag-…
[1 image]
5 replies · 158 retweets · 703 likes · 61.1K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Andi Marafioti @andimarafioti
Fuck it, today we're open-sourcing the codebase used to train SmolVLM from scratch on 256 H100s🔥 Inspired by our team's effort to open-source DeepSeek's R1 training, we are releasing the training and evaluation code on top of the weights 🫡 Now you can train any of our SmolVLMs—or create your own custom VLMs!
[1 image]
34 replies · 213 retweets · 1.3K likes · 98.6K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
AK @_akhaliq
OpenAI o3-mini System Card
[1 image]
11 replies · 68 retweets · 361 likes · 46.7K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Han Xiao @hxiao
Letter-dropping physics comparison: o3-mini vs. deepseek-r1 vs. claude-3.5 in one-shot - which is the best?
Prompt: Create a JavaScript animation of falling letters with realistic physics. The letters should:
* Appear randomly at the top of the screen with varying sizes
* Fall under Earth's gravity (9.8 m/s²)
* Have collision detection based on their actual letter shapes
* Interact with other letters, ground, and screen boundaries
* Have density properties similar to water
* Dynamically adapt to screen size changes
* Display on a dark background
153 replies · 255 retweets · 2.6K likes · 603.7K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
elvis @omarsar0
AI Agents for Computer Use
This report provides a comprehensive overview of the emerging field of instruction-based computer control, examining available agents – their taxonomy, development, and resources.
[1 image]
15 replies · 141 retweets · 658 likes · 65.5K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Gabriel Massadas @G4brym
Gemini 2.0 doesn’t get nearly enough credit. I just dumped all my workers-qb source code into it, hit it with a simple, humble prompt, and boom => it one-shotted the docs. Not just good docs, way better than what I had before, packed with examples. Kinda insane.
30 replies · 60 retweets · 718 likes · 115.4K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
AK @_akhaliq
OpenAI o3-mini just one shotted this prompt: write a script for 100 bouncing yellow balls within a sphere, make sure to handle collision detection properly. make the sphere slowly rotate. make sure balls stays within the sphere. implement it in p5.js
137 replies · 405 retweets · 4.3K likes · 814.7K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
anton @abacaj
Finished an R1-style GRPO run on Qwen-2.5-0.5B (base model); it yields roughly +10 accuracy points on GSM8K. Literally just works. The base model scores 41.6% as reported in the Qwen paper vs ~51% with GRPO.
[1 image]
41 replies · 108 retweets · 1.1K likes · 107.8K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Antaripa Saha @doesdatmaksense
People learning GPU programming, and especially Triton, should check out Liger Kernel by LinkedIn. It was released last year and is built on top of Triton to provide pre-optimized, ready-to-use implementations of GPU optimization techniques specifically tailored for LLM training.
[1 image]
9 replies · 61 retweets · 621 likes · 33.9K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Caleb Peffer (Hiring!) @CalebPeffer
Excited to announce text-to-api.ai: a website that turns any website into a GET API with the @firecrawl /extract endpoint. Data on the web has never been more accessible! Thanks to @devdigest for starting this fabulous trend. Check out his GitHub repo below!
37 replies · 193 retweets · 2.1K likes · 235K views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Lex Fridman @lexfridman
OpenAI o3-mini is a good model, but DeepSeek r1 has similar performance, is still cheaper, and reveals its reasoning. Better models will come (can't wait for o3pro), but the "DeepSeek moment" is real. I think it will still be remembered 5 years from now as a pivotal event in tech history, due in part to the geopolitical implications but for many other reasons too. All this is discussed in a 5-hour technical podcast I just recorded on the state of the AI industry. Out tomorrow (hopefully).
976 replies · 1K retweets · 13.1K likes · 1.5M views

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Artificial Analysis @ArtificialAnlys
OpenAI’s o3-mini is here - a significant jump forward from o1-mini
Initial results (full benchmarking coming soon):
➤ Artificial Analysis Quality Index of 89, matching DeepSeek R1 and just below o1
➤ Cheaper - $1.1/$4.4 input/output pricing per million tokens, lower than many DeepSeek R1 APIs (higher than DeepSeek’s first party R1 API)
➤ Fast - similar speed to o1-mini at 170 tokens/s, although that means 2000 tokens of ‘thinking’ time will still take ~12 seconds
[3 images]
24 replies · 59 retweets · 401 likes · 79.8K views
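
The arithmetic behind the speed and pricing figures in the post above can be checked with a short sketch. The rates are the ones quoted in the post. The function names, the 1,000-token input size, and the assumption that 'thinking' tokens are billed at the output rate are illustrative and not taken from the post.

```python
# Figures quoted in the post; everything else here is illustrative.
TOKENS_PER_SECOND = 170      # reported output speed
INPUT_PRICE_PER_M = 1.10     # USD per million input tokens
OUTPUT_PRICE_PER_M = 4.40    # USD per million output tokens


def thinking_seconds(tokens: int, tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Time needed to generate `tokens` tokens at a fixed speed."""
    return tokens / tokens_per_second


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request; assumes thinking tokens count as output tokens."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# 2000 / 170 is about 11.8 seconds, consistent with the "~12 seconds" in the post.
print(f"{thinking_seconds(2000):.1f} s")

# Hypothetical request: 1,000 input tokens plus 2,000 thinking tokens.
print(f"${request_cost_usd(1000, 2000):.4f}")
```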

Andrea | 🇸🇪🇪🇸🇻🇪 retweeted
Carlos E. Perez @IntuitMachine
When working with o1/o3 models, I always have this feeling that I'm leaving a lot on the table with my prompting. Creating a long sequence of prompts for regular LLMs is good practice. This is because you don't want to overload what an LLM can process (or it'll lead to hallucinations). But Large Reasoning Models (LRMs) are different.
[1 image]
21 replies · 77 retweets · 524 likes · 54.4K views