Jakub Bartczuk
@lambdaofgod

399 posts

My opinions are not my own. They belong to the gnomes that live in my head

Poland · Joined October 2015
373 Following · 79 Followers

Jakub Bartczuk @lambdaofgod ·
@raw_works Do you use vanilla RLM from DSPy? I tried to use it for LongBench with Qwens, but the DSPy module has a bunch of problems that required nontrivial changes to make it work.
Replies 1 · Reposts 0 · Likes 0 · Views 320

Raymond Weitekamp @raw_works ·
crazy preliminary results from qwen 3.5 last night: Preliminary — Qwen3.5 + dspy.RLM on LongCoT-Mini: 27B lands at #2 (33%) — behind only GPT 5.2, +11pp ahead of Gemini 3 Pro. 9B lands at #4 (17%) — still beats Sonnet 4.5. RLMs unambiguously SOTA on this, more soon!
Raymond Weitekamp @raw_works

ok so the default DSPy.RLM is literally going to destroy this benchmark before the end of the day. running now for sonnet 4.5... 🏆 Scoreboard (live) RLM: 90/94 (95.7%) Vanilla: 0/94 (0.0%) anyone want to pay for the opus run? 😉

Replies 7 · Reposts 19 · Likes 207 · Views 19.8K

Jakub Bartczuk @lambdaofgod ·
@JFPuget What is funny about this comment is that it arguably makes an even bigger error: it confuses disliking Trump with anti-Americanism. By the same logic, half of the US is anti-American.
Replies 1 · Reposts 0 · Likes 4 · Views 476

shirish @shiri_shh ·
This startup lets you ORDER SUNLIGHT from space to your exact location in 30 seconds 😭
Replies 1.6K · Reposts 1.1K · Likes 14.7K · Views 4.7M

effectfully @effectfully ·
I'll never get over the fact that whenever I make a donation, I'm charged VAT in Europe.
Replies 7 · Reposts 0 · Likes 58 · Views 3.8K

Antoine Chaffin @antoine_chaffin ·
This release reminded me that I've trained a state-of-the-art reasoning ColBERT model that I never evaluated on BrowseComp-Plus... I wonder how good it is... the last thing I remember was that it was not worth using 8B models when a 130M one does just as well... 🤭
Mixedbread @mixedbreadai

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

Replies 3 · Reposts 4 · Likes 38 · Views 12.5K

Nishkarsh @contextkingceo ·
We've raised $6.5M to kill vector databases.

Every system today retrieves context the same way: vector search that stores everything as flat embeddings and returns whatever "feels" closest. Similar, sure. Relevant? Almost never. Embeddings can't tell a Q3 renewal clause from a Q1 termination notice if the language is close enough.

A friend of mine asked his AI about a contract last week, and it returned a detailed, perfectly crafted answer pulled from a completely different client's file. Once you're dealing with 10M+ documents, these mix-ups happen all the time. VectorDB accuracy goes to shit.

We built @hydra_db for exactly this. HydraDB builds an ontology-first context graph over your data, maps relationships between entities, understands the 'why' behind documents, and tracks how information evolves over time. So when you ask about 'Apple,' it knows you mean the company you're serving as a customer. Not the fruit. Even when a vector DB's similarity score says 0.94.

More below ⬇️
Replies 621 · Reposts 637 · Likes 6K · Views 3.8M

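The core complaint in that pitch is that flat embeddings score near-duplicate clauses as interchangeable. A minimal sketch of that failure mode using the sentence-transformers library; the model choice and the two clauses are illustrative, not anything from HydraDB:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative model; most general-purpose embedding models show the effect.
model = SentenceTransformer("all-MiniLM-L6-v2")

a = "The Q3 renewal clause extends the contract for twelve months."
b = "The Q1 termination notice ends the contract within twelve months."

emb_a, emb_b = model.encode([a, b])

# Cosine similarity comes out high because the surface wording overlaps,
# even though the two clauses mean nearly opposite things.
print(util.cos_sim(emb_a, emb_b).item())
```
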
Jakub Bartczuk @lambdaofgod ·
@srchvrs @lateinteraction @Dspy @tomaarsen The Python paradox. We have tons of useful libraries that do the heavy lifting, but then you encounter a problem that is almost trivial in newer languages (which actually have concurrency models) and it feels like hitting a wall.
Replies 0 · Reposts 0 · Likes 1 · Views 30

Leo Boytsov @srchvrs ·
🧵BTW, this pattern is so pervasive, but few people realize it. For example, @lateinteraction @dspy has this kind of loop for querying models in parallel, SentenceBERT (currently maintained by @tomaarsen) uses this for distributed evaluation and embedding generation. ↩️
Leo Boytsov @srchvrs

🧵For the last seven years, I kept re-implementing the same pattern: A parallel map loop that divides the work among several processes or threads. My very first attempts were built on Python’s standard tools, e.g., multiprocessing.map... ↩️

Replies 2 · Reposts 0 · Likes 25 · Views 3.3K

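The pattern in the quoted thread is small enough to sketch with just the standard library. A minimal thread-based version (swap in ProcessPoolExecutor for CPU-bound work); the worker function and inputs are placeholders, not DSPy's or SentenceBERT's actual internals:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_map(fn, items, max_workers=8):
    # Fan the work out across a thread pool; executor.map returns
    # results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))

if __name__ == "__main__":
    # Toy stand-in for "query a model once per prompt, in parallel".
    print(parallel_map(lambda x: x * x, range(10)))
```
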
Jakub Bartczuk @lambdaofgod ·
@k_ovfefe2 Because we don't want to make other people insecure with our huge bulges.
Replies 0 · Reposts 0 · Likes 0 · Views 6

Jakub Bartczuk @lambdaofgod ·
@josevalim @theo Elixir is great, but what about isolation on the BEAM? Elixir's model makes it really easy to run REPLs, but the security story is problematic. I'd be happy to change my mind, but when I tried to research this I didn't find any easy way to run an interpreter with restricted permissions.
Replies 0 · Reposts 0 · Likes 0 · Views 142

José Valim @josevalim ·
@theo The only logical conclusion is that AGI will happen any time now.
Replies 10 · Reposts 16 · Likes 454 · Views 25K

Jakub Bartczuk @lambdaofgod ·
@BlancheMinerva I'm not buying all this open model doomerism, but isn't this actually one of the arguments that make sense? It's way easier to detect these attacks in Claude than in a model that someone can run on a bunch of GPUs in their basement.
Replies 0 · Reposts 0 · Likes 0 · Views 75

Stella Biderman @BlancheMinerva ·
It's very common for people to claim that open LLMs will be used to commit cyber attacks at massive scale. What public evidence is there for this claim? The best (and one of the only) accounts I've seen of a cyber LLM attack describes an attack carried out using Claude anthropic.com/news/disruptin…
Replies 10 · Reposts 3 · Likes 38 · Views 6.6K

Jakub Bartczuk @lambdaofgod ·
@metakognita This claim is plainly false: it ignores finetuning, which has a colossal impact on how models actually behave. You run a database company, so data is everything and nothing exists beyond data ;) A strong candidate for humbug of the year.
Replies 0 · Reposts 0 · Likes 2 · Views 706

Metakognita @metakognita ·
Oracle just told every AI company on earth the same thing. Your models are worthless. Not the technology, the talent, or the billions spent training them. But the data they were trained on. Larry Ellison, the man who built Oracle into the backbone of global enterprise, just dropped a bombshell. He said ChatGPT, Gemini, Grok, and Llama are all training on exactly the same data. The entire public internet, every Wikipedia page, Reddit thread, and every news article. That means they are all converging, essentially becoming the same product with different logos. Ellison's word for it is commodities. But here's where it gets dangerous. He says the real gold isn't public data, it's private data.
StockMarket.News @_Investinq

Oracle just told every AI company on earth the same thing. Your models are worthless. Not the technology, talent, or the billions spent training them. But the data they were trained on.

Larry Ellison, the man who built Oracle into the backbone of global enterprise, just dropped a bombshell. He said ChatGPT, Gemini, Grok, and Llama, all of them, are training on the exact same data. The entire public internet, every Wikipedia page, Reddit thread, and every news article. That means they're all converging, essentially becoming the same product with different logos. Ellison's word for it is commodities.

But here's where it gets dangerous. He says the real gold isn't public data, it's private data. The medical records in hospital systems, the financial data in bank vaults, the supply chain secrets of every Fortune 500. And guess where most of that data already lives. Not Google, Amazon, or Microsoft, but inside Oracle. Oracle databases hold most of the world's high-value private enterprise data.

So Oracle just launched something called AI Database 26ai. It lets the top AI models, ChatGPT, Gemini, Grok, Llama, reason directly over a company's private data, without that data ever leaving the vault. They're using a technique called RAG, Retrieval Augmented Generation. The AI doesn't train on your data, it searches it in real time.

Think about what that means. A bank could ask AI to analyze every loan it's ever made without exposing a single customer record. A hospital could have AI diagnose patients using its full medical history without violating HIPAA. A defense contractor could let AI reason across classified operations without data leaving a secure environment.

Ellison is betting this is bigger than the training market. Bigger than the GPU boom. Bigger than the data center buildout. He called it the largest and fastest growing market in history. The numbers back the ambition: Oracle's remaining performance obligations just hit $523 billion. That's contracted revenue not yet delivered, and $300 billion of it comes from OpenAI alone. Cloud revenue hit $8 billion in a single quarter, OCI grew 66 percent, and GPU revenue surged 177 percent.

But here's the part nobody's talking about. If private data becomes the real AI moat, then whoever controls the database controls the future of AI. And that's a level of power that should make everyone uncomfortable.

Replies 41 · Reposts 43 · Likes 383 · Views 88.7K

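The RAG loop the quoted thread describes fits in a few lines: retrieve private documents at query time and hand them to the model, with no weight updates. A minimal sketch with hypothetical retriever and llm objects standing in for whatever Oracle actually ships; none of this is Oracle's API:

```python
def answer_with_rag(question, retriever, llm, top_k=5):
    """Answer a question over private data without training on it."""
    # Retrieve the most relevant documents at query time; the model's
    # weights never see the private data.
    docs = retriever.search(question, top_k=top_k)
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm.generate(prompt)

# Hypothetical usage, assuming duck-typed retriever/llm objects:
# print(answer_with_rag("Which loans breached their covenants?", retriever, llm))
```
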
developing valhalla - h/acc @valhalla_dev ·
war is created and excused by the rich and fought and died for by the poor.
Replies 45 · Reposts 119 · Likes 1.1K · Views 22.3K

JFPuget 🇺🇦🇨🇦🇬🇱 @JFPuget ·
How dumb is LinkedIn's recsys? I don't often go to the LinkedIn website, but I did today. I saw that Carrefour hired people near my place, with relevant roles. I was intrigued, because Carrefour is Europe's Walmart, and I wasn't sure which roles would be relevant for me. First role: a butcher at Massy, 800 km from where I live. I closed the LinkedIn page. I have nothing against butchers, it's just that I am not a butcher. That it gets the profession wrong is one thing. But it gets the location wrong too. My location (approximate) is visible in my profile: near Nice, France. Massy - Nice is about 900 km. So much for the job being near me.
Replies 5 · Reposts 0 · Likes 31 · Views 3.3K

Sebastien Bubeck @SebastienBubeck ·
@ylecun So when Wiles worked in secret on Fermat's last theorem for 7 years he wasn't doing research? 🤣
Replies 25 · Reposts 3 · Likes 175 · Views 16.6K

Sebastien Bubeck @SebastienBubeck ·
I've been in lots of places in my career. OAI is simply the best research environment I have ever seen. It's a combination of the field itself being a research gold mine + having access to the right mining tools + (most importantly) the freedom to explore. It's special.
Mark Chen @markchen90

How does OpenAI balance long-term research bets with product-forward research fundamentals? I've been getting this question a lot lately, usually framed as a suggestion that Jakub (@merettm) and I are pushing an increasingly product-focused agenda. That characterization is simply wrong.

Foundational research has been core to OpenAI from the start, and today we run a research program with hundreds of exploratory projects - much like the ones that led to our reasoning-model breakthrough. The majority of our compute is allocated to foundational research and exploration - and not product milestones. Anyone who has spent time with me or Jakub knows we are the last people in the world who would push for the advancement of products over the advancement of research.

We're in the business of creating an automated scientist, and capabilities that were considered grand challenges just a few years ago (like IMO-level mathematical reasoning) now emerge as normal parts of the research process. We're also seeing our models accelerate researchers worldwide, helping advance work across biology, mathematics, physics, and even our own research.

Jakub and I put a lot of effort into ensuring that research stays focused on uncovering algorithms that will scale to the compute we'll have a year from now. We protect mindshare and amplify discourse on exploratory work. We do this while recognizing that we're also a deployment company - and that deployment gives us access to even larger-scale compute, richer feedback, and more room for exploration. Our researchers are passionate about having their work out in the world, and a special slice of our org is dedicated to making sure our deployments are delightful for end users.

Our goal isn't to turn research into a quarterly race. It's to build a durable research engine - one that compounds learning over time and consistently turns long-horizon exploration into real, measurable advances, while ensuring those advances become valuable in the real world. That's the roadmap we're executing on. And while there have been ups and downs over the last decade (as you expect with any research program), I think most of our researchers would share my strong optimism today.

Replies 34 · Reposts 21 · Likes 566 · Views 157.4K

Jakub Bartczuk @lambdaofgod ·
@jobergum If it spends so much time, does it produce documentation along the way? Is it any good?
Replies 0 · Reposts 0 · Likes 0 · Views 22

Jo Kristian Bergum @jobergum ·
1h and 25 minutes and counting. It indeed goes deep
Replies 3 · Reposts 0 · Likes 1 · Views 697

Jo Kristian Bergum @jobergum ·
Trying amp deep mode this am and it really digs deep before making any changes
Replies 3 · Reposts 0 · Likes 3 · Views 993

Jakub Bartczuk @lambdaofgod ·
@yoavgo Originally it was, but I saw it referenced in a couple of courses that weren't CS101. Anyway, I think most of the topics fit more advanced courses better.
Replies 0 · Reposts 0 · Likes 1 · Views 239

(((ل()(ل() 'yoav))))👾 @yoavgo ·
what are some non-ML algorithms that you think people who work on/with AI should be familiar with or benefit from knowing? (beyond the basic algo-and-data-structures 101)
Replies 23 · Reposts 0 · Likes 55 · Views 14.7K

Jakub Bartczuk @lambdaofgod ·
@yoavgo I think it served a similar role to your course, introducing abstractions beyond CS101.
Replies 1 · Reposts 0 · Likes 1 · Views 279