Jakub Bartczuk
@lambdaofgod

399 posts

My opinions are not my own. They belong to the gnomes that live in my head

Poland · Joined October 2015
373 Following · 79 Followers

Jakub Bartczuk @lambdaofgod ·
@raw_works Do you use vanilla RLM from DSPy? I tried to use it for LongBench with Qwens, but the DSPy module has a bunch of problems that required nontrivial changes to make it work.
Replies 1 · Reposts 0 · Likes 0 · Views 320

Raymond Weitekamp @raw_works ·
crazy preliminary results from qwen 3.5 last night: Preliminary — Qwen3.5 + dspy.RLM on LongCoT-Mini: 27B lands at #2 (33%) — behind only GPT 5.2, +11pp ahead of Gemini 3 Pro. 9B lands at #4 (17%) — still beats Sonnet 4.5. RLMs unambiguously SOTA on this, more soon!
Raymond Weitekamp @raw_works

ok so the default DSPy.RLM is literally going to destroy this benchmark before the end of the day. running now for sonnet 4.5... 🏆 Scoreboard (live) RLM: 90/94 (95.7%) Vanilla: 0/94 (0.0%) anyone want to pay for the opus run? 😉

Replies 7 · Reposts 19 · Likes 207 · Views 19.8K

Jakub Bartczuk @lambdaofgod ·
@JFPuget What is funny about this comment is that it arguably makes an even bigger error: it confuses disliking Trump with anti-Americanism. By the same logic, half of the US is anti-American.
Replies 1 · Reposts 0 · Likes 4 · Views 476

shirish @shiri_shh ·
This startup lets you ORDER SUNLIGHT from space to your exact location in 30 seconds 😭
Replies 1.6K · Reposts 1.1K · Likes 14.7K · Views 4.7M

effectfully @effectfully ·
I'll never get over the fact that whenever I make a donation, I'm charged VAT in Europe.
Replies 7 · Reposts 0 · Likes 58 · Views 3.8K

Antoine Chaffin @antoine_chaffin ·
This release reminded me that I've trained a state-of-the-art reasoning ColBERT model that I never evaluated on BrowseComp-Plus... I wonder how good it is... the last thing I remember was that it was not worth using 8B models when a 130M one does just as well... 🤭
Mixedbread @mixedbreadai

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

Replies 3 · Reposts 4 · Likes 38 · Views 12.5K

Nishkarsh @contextkingceo ·
We've raised $6.5M to kill vector databases.

Every system today retrieves context the same way: vector search that stores everything as flat embeddings and returns whatever "feels" closest. Similar, sure. Relevant? Almost never. Embeddings can't tell a Q3 renewal clause from a Q1 termination notice if the language is close enough.

A friend of mine asked his AI about a contract last week, and it returned a detailed, perfectly crafted answer pulled from a completely different client's file. Once you're dealing with 10M+ documents, these mix-ups happen all the time. VectorDB accuracy goes to shit.

We built @hydra_db for exactly this. HydraDB builds an ontology-first context graph over your data, maps relationships between entities, understands the 'why' behind documents, and tracks how information evolves over time. So when you ask about 'Apple,' it knows you mean the company you're serving as a customer. Not the fruit. Even when a vector DB's similarity score says 0.94.

More below ⬇️
Replies 621 · Reposts 637 · Likes 6K · Views 3.8M

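The core complaint in that pitch is that flat embeddings score near-duplicate clauses as interchangeable. A minimal sketch of that failure mode using the sentence-transformers library; the model choice and the two clauses are illustrative, not anything from HydraDB:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative model; most general-purpose embedding models show the effect.
model = SentenceTransformer("all-MiniLM-L6-v2")

a = "The Q3 renewal clause extends the contract for twelve months."
b = "The Q1 termination notice ends the contract within twelve months."

emb_a, emb_b = model.encode([a, b])

# Cosine similarity comes out high because the surface wording overlaps,
# even though the two clauses mean nearly opposite things.
print(util.cos_sim(emb_a, emb_b).item())
```
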
Jakub Bartczuk @lambdaofgod ·
@srchvrs @lateinteraction @Dspy @tomaarsen The Python paradox. We have tons of useful libraries that do the heavy lifting, but then you encounter a problem that is almost trivial in newer languages (which actually have concurrency models) and it feels like hitting a wall.
Replies 0 · Reposts 0 · Likes 1 · Views 30

Leo Boytsov @srchvrs ·
🧵BTW, this pattern is so pervasive, but few people realize it. For example, @lateinteraction @dspy has this kind of loop for querying models in parallel, SentenceBERT (currently maintained by @tomaarsen) uses this for distributed evaluation and embedding generation. ↩️
Leo Boytsov @srchvrs

🧵For the last seven years, I kept re-implementing the same pattern: A parallel map loop that divides the work among several processes or threads. My very first attempts were built on Python’s standard tools, e.g., multiprocessing.map... ↩️

Replies 2 · Reposts 0 · Likes 25 · Views 3.3K

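The pattern in the quoted thread is small enough to sketch with just the standard library. A minimal thread-based version (swap in ProcessPoolExecutor for CPU-bound work); the worker function and inputs are placeholders, not DSPy's or SentenceBERT's actual internals:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_map(fn, items, max_workers=8):
    # Fan the work out across a thread pool; executor.map returns
    # results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))

if __name__ == "__main__":
    # Toy stand-in for "query a model once per prompt, in parallel".
    print(parallel_map(lambda x: x * x, range(10)))
```
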
Jakub Bartczuk @lambdaofgod ·
@k_ovfefe2 Because we don't want to make other people insecure with our huge bulges.
Replies 0 · Reposts 0 · Likes 0 · Views 6

Jakub Bartczuk @lambdaofgod ·
@josevalim @theo Elixir is great, but what about isolation on the BEAM? Elixir's model makes it really easy to run REPLs, but the security story is problematic. I'd be happy to change my mind, but when I tried to research this I didn't find any easy way to run an interpreter with restricted permissions.
Replies 0 · Reposts 0 · Likes 0 · Views 142

José Valim @josevalim ·
@theo The only logical conclusion is that AGI will happen any time now.
Replies 10 · Reposts 16 · Likes 454 · Views 25K

Jakub Bartczuk @lambdaofgod ·
@BlancheMinerva I'm not buying all this open model doomerism, but isn't this actually one of the arguments that make sense? It's way easier to detect these attacks in Claude than in a model that someone can run on a bunch of GPUs in their basement.
Replies 0 · Reposts 0 · Likes 0 · Views 75

Stella Biderman @BlancheMinerva ·
It's very common for people to claim that open LLMs will be used to commit cyber attacks at massive scale. What public evidence is there for this claim? The best (and one of the only) accounts I've seen of a cyber LLM attack describes an attack carried out using Claude anthropic.com/news/disruptin…
Replies 10 · Reposts 3 · Likes 38 · Views 6.6K

Jakub Bartczuk @lambdaofgod ·
@metakognita This claim is plainly false: it ignores finetuning, which has a colossal impact on how models actually behave. You run a database company, so data is everything and nothing exists beyond data ;) A strong candidate for humbug of the year.
Replies 0 · Reposts 0 · Likes 2 · Views 706

Metakognita @metakognita ·
Oracle just told every AI company on earth the same thing. Your models are worthless. Not the technology, the talent, or the billions spent training them. But the data they were trained on. Larry Ellison, the man who built Oracle into the backbone of global enterprise, just dropped a bombshell. He said ChatGPT, Gemini, Grok, and Llama are all training on exactly the same data. The entire public internet, every Wikipedia page, Reddit thread, and every news article. That means they are all converging, essentially becoming the same product with different logos. Ellison's word for it is commodities. But here's where it gets dangerous. He says the real gold isn't public data, it's private data.
StockMarket.News @_Investinq

Oracle just told every AI company on earth the same thing. Your models are worthless. Not the technology, talent, or the billions spent training them. But the data they were trained on.

Larry Ellison, the man who built Oracle into the backbone of global enterprise, just dropped a bombshell. He said ChatGPT, Gemini, Grok, and Llama, all of them, are training on the exact same data. The entire public internet, every Wikipedia page, Reddit thread, and every news article. That means they're all converging, essentially becoming the same product with different logos. Ellison's word for it is commodities.

But here's where it gets dangerous. He says the real gold isn't public data, it's private data. The medical records in hospital systems, the financial data in bank vaults, the supply chain secrets of every Fortune 500. And guess where most of that data already lives. Not Google, Amazon, or Microsoft, but inside Oracle. Oracle databases hold most of the world's high-value private enterprise data.

So Oracle just launched something called AI Database 26ai. It lets the top AI models, ChatGPT, Gemini, Grok, Llama, reason directly over a company's private data, without that data ever leaving the vault. They're using a technique called RAG, Retrieval Augmented Generation. The AI doesn't train on your data, it searches it in real time.

Think about what that means. A bank could ask AI to analyze every loan it's ever made without exposing a single customer record. A hospital could have AI diagnose patients using its full medical history without violating HIPAA. A defense contractor could let AI reason across classified operations without data leaving a secure environment.

Ellison is betting this is bigger than the training market. Bigger than the GPU boom. Bigger than the data center buildout. He called it the largest and fastest growing market in history. The numbers back the ambition: Oracle's remaining performance obligations just hit $523 billion. That's contracted revenue not yet delivered, and $300 billion of it comes from OpenAI alone. Cloud revenue hit $8 billion in a single quarter, OCI grew 66 percent, and GPU revenue surged 177 percent.

But here's the part nobody's talking about. If private data becomes the real AI moat, then whoever controls the database controls the future of AI. And that's a level of power that should make everyone uncomfortable.

Replies 41 · Reposts 43 · Likes 383 · Views 88.7K

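The RAG loop the quoted thread describes fits in a few lines: retrieve private documents at query time and hand them to the model, with no weight updates. A minimal sketch with hypothetical retriever and llm objects standing in for whatever Oracle actually ships; none of this is Oracle's API:

```python
def answer_with_rag(question, retriever, llm, top_k=5):
    """Answer a question over private data without training on it."""
    # Retrieve the most relevant documents at query time; the model's
    # weights never see the private data.
    docs = retriever.search(question, top_k=top_k)
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm.generate(prompt)

# Hypothetical usage, assuming duck-typed retriever/llm objects:
# print(answer_with_rag("Which loans breached their covenants?", retriever, llm))
```
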
developing valhalla - h/acc @valhalla_dev ·
war is created and excused by the rich and fought and died for by the poor.
Replies 45 · Reposts 119 · Likes 1.1K · Views 22.3K

JFPuget 🇺🇦🇨🇦🇬🇱 @JFPuget ·
How dumb is LinkedIn's recsys? I don't often go to the LinkedIn website, but I did today. I saw that Carrefour hired people near my place, with relevant roles. I was intrigued, because Carrefour is Europe's Walmart, and I wasn't sure which roles would be relevant for me. First role: a butcher at Massy, 800 km from where I live. I closed the LinkedIn page. I have nothing against butchers, it's just that I am not a butcher. That it gets the profession wrong is one thing. But it gets the location wrong too. My location (approximate) is visible in my profile: near Nice, France. Massy - Nice is about 900 km. So much for the job being near me.
Replies 5 · Reposts 0 · Likes 31 · Views 3.3K

Sebastien Bubeck @SebastienBubeck ·
@ylecun So when Wiles worked in secret on Fermat's last theorem for 7 years he wasn't doing research? 🤣
Replies 25 · Reposts 3 · Likes 175 · Views 16.6K

Sebastien Bubeck @SebastienBubeck ·
I've been in lots of places in my career. OAI is simply the best research environment I have ever seen. It's a combination of the field itself being a research gold mine + having access to the right mining tools + (most importantly) the freedom to explore. It's special.
Mark Chen @markchen90

How does OpenAI balance long-term research bets with product-forward research fundamentals? I've been getting this question a lot lately, usually framed as a suggestion that Jakub (@merettm) and I are pushing an increasingly product-focused agenda. That characterization is simply wrong.

Foundational research has been core to OpenAI from the start, and today we run a research program with hundreds of exploratory projects - much like the ones that led to our reasoning-model breakthrough. The majority of our compute is allocated to foundational research and exploration - and not product milestones. Anyone who has spent time with me or Jakub knows we are the last people in the world who would push for the advancement of products over the advancement of research.

We're in the business of creating an automated scientist, and capabilities that were considered grand challenges just a few years ago (like IMO-level mathematical reasoning) now emerge as normal parts of the research process. We're also seeing our models accelerate researchers worldwide, helping advance work across biology, mathematics, physics, and even our own research.

Jakub and I put a lot of effort into ensuring that research stays focused on uncovering algorithms that will scale to the compute we'll have a year from now. We protect mindshare and amplify discourse on exploratory work. We do this while recognizing that we're also a deployment company - and that deployment gives us access to even larger-scale compute, richer feedback, and more room for exploration. Our researchers are passionate about having their work out in the world, and a special slice of our org is dedicated to making sure our deployments are delightful for end users.

Our goal isn't to turn research into a quarterly race. It's to build a durable research engine - one that compounds learning over time and consistently turns long-horizon exploration into real, measurable advances, while ensuring those advances become valuable in the real world. That's the roadmap we're executing on. And while there have been ups and downs over the last decade (as you expect with any research program), I think most of our researchers would share my strong optimism today.

Replies 34 · Reposts 21 · Likes 566 · Views 157.4K

Jakub Bartczuk @lambdaofgod ·
@jobergum If it spends so much time, does it produce documentation along the way? Is it any good?
Replies 0 · Reposts 0 · Likes 0 · Views 22

Jo Kristian Bergum @jobergum ·
1h and 25 minutes and counting. It indeed goes deep
Replies 3 · Reposts 0 · Likes 1 · Views 697

Jo Kristian Bergum @jobergum ·
Trying amp deep mode this am and it really digs deep before making any changes
Replies 3 · Reposts 0 · Likes 3 · Views 993

Jakub Bartczuk @lambdaofgod ·
@yoavgo Originally it was, but I saw it referenced in a couple of courses that weren't CS101. Anyway, I think most of the topics fit more advanced courses better.
Replies 0 · Reposts 0 · Likes 1 · Views 239

(((ل()(ل() 'yoav))))👾 @yoavgo ·
what are some non-ML algorithms that you think people who work on/with AI should be familiar with or benefit from knowing? (beyond the basic algo-and-data-structures 101)
Replies 23 · Reposts 0 · Likes 55 · Views 14.7K

Jakub Bartczuk @lambdaofgod ·
@yoavgo I think it served a similar role to your course, introducing abstractions beyond CS101.
Replies 1 · Reposts 0 · Likes 1 · Views 279