Now you can hold the beating heart of an AI in your hands, on your laptop.
You can teach the sand to think.
You can watch it as it learns.
x.com/mattmireles/st…
🚨 SHOCKING: Apple just proved that AI models cannot do math. Not advanced math. Grade-school math. The kind a 10-year-old solves.
And the way they proved it is devastating.
Apple researchers took the most popular math benchmark in AI — GSM8K, a set of grade-school math problems — and made one change. They swapped the numbers. Same problem. Same logic. Same steps. Different numbers.
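Here's the idea in toy form. A minimal sketch with my own template, not one of the paper's actual ones:

```python
import random

# Toy version of the GSM-Symbolic idea: keep the logic and the steps,
# swap only the numbers. (My own template, not one from the paper.)
TEMPLATE = ("{name} picks {x} apples on Monday and {y} apples on Tuesday. "
            "How many apples does {name} have?")

def make_variant(seed: int):
    rng = random.Random(seed)
    x, y = rng.randint(10, 99), rng.randint(10, 99)
    question = TEMPLATE.format(name="Sofia", x=x, y=y)
    answer = x + y  # same problem, same logic, different numbers
    return question, answer

for seed in range(3):
    q, a = make_variant(seed)
    print(q, "->", a)
```

A model that actually reasons should score the same on every variant. They didn't.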
Every model's performance dropped. Every single one. 25 state-of-the-art models tested.
But that wasn't the real experiment.
The real experiment broke everything.
They added one sentence to a math problem. One sentence that is completely irrelevant to the answer. It has nothing to do with the math. A human would read it and ignore it instantly.
Here's the actual example from the paper:
"Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?"
The correct answer is 190 (44 + 58 + 2 × 44). The size of the kiwis has nothing to do with the count.
A 10-year-old would ignore "five of them were a bit smaller" because it's obviously irrelevant. It doesn't change how many kiwis there are.
But o1-mini, OpenAI's reasoning model, subtracted 5. It got 185.
Llama did the same thing. Subtracted 5. Got 185.
They didn't reason through the problem. They saw the number 5, saw a sentence that sounded like it mattered, and blindly turned it into a subtraction.
The models do not understand what subtraction means. They see a pattern that looks like subtraction and apply it. That is all.
Apple tested this across all models. They call the dataset "GSM-NoOp" — as in, the added clause is a no-operation. It does nothing. It changes nothing.
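In code, the injection is almost nothing. A toy re-creation (my own wording of the step; the paper generates these variants at scale):

```python
# GSM-NoOp in miniature: append a clause that changes nothing,
# then confirm the correct answer doesn't move.
base = ("Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. "
        "On Sunday, he picks double the number of kiwis he did on Friday")
noop = ", but five of them were a bit smaller than average"
question = base + noop + ". How many kiwis does Oliver have?"

answer = 44 + 58 + 2 * 44  # the no-op clause never enters the math
assert answer == 190
```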
The results are catastrophic.
Phi-3-mini dropped over 65%. Nearly two-thirds of its "math ability" vanished because of one irrelevant sentence.
GPT-4o dropped from 94.9% to 63.1%.
o1-mini dropped from 94.5% to 66.0%.
o1-preview, OpenAI's most advanced reasoning model at the time, dropped from 92.7% to 77.4%.
Even giving the models 8 examples of the exact same question beforehand, with the correct solution shown each time, barely helped. The models still fell for the irrelevant clause.
This means it's not a prompting problem. It's not a context problem. It's structural.
The Apple researchers also found that models convert words into math operations without understanding what those words mean. They see the word "discount" and multiply. They see a number near the word "smaller" and subtract. Regardless of whether it makes any sense.
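You can reproduce the exact 185 mistake with a deliberately dumb keyword "solver". To be clear: this is a caricature of the failure mode, not a claim about how the models work inside:

```python
import re

WORD_NUMS = {"five": 5}  # just enough vocabulary for this one problem

def keyword_solver(problem: str) -> int:
    # Caricature of surface pattern-matching: trigger words pick the
    # operation, nearby numbers get plugged in blindly.
    total, first = 0, None
    for clause in re.split(r"[.,]", problem.lower()):
        nums = [int(n) for n in re.findall(r"\d+", clause)]
        nums += [WORD_NUMS[w] for w in clause.split() if w in WORD_NUMS]
        if "double" in clause:
            total += 2 * (nums[0] if nums else first)  # "double ... Friday" -> +88
        elif "smaller" in clause and nums:
            total -= nums[0]                           # irrelevant clause -> blind -5
        elif nums:
            total += sum(nums)                         # "picks 44", "picks 58" -> +102
        if nums and first is None:
            first = nums[0]
    return total

problem = ("Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. "
           "On Sunday, he picks double the number of kiwis he did on Friday, "
           "but five of them were a bit smaller than average. "
           "How many kiwis does Oliver have?")
print(keyword_solver(problem))  # 185, the same wrong answer as o1-mini and Llama
```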
The paper's exact words: "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data."
And: "LLMs likely perform a form of probabilistic pattern-matching and searching to find closest seen data during training without proper understanding of concepts."
They also tested what happens when you increase the number of steps in a problem. Performance didn't just decrease. The rate of decrease accelerated. Adding two extra clauses to a problem dropped Gemma2-9b from 84.4% to 41.8%. Phi-3.5-mini from 87.6% to 44.8%. The more thinking required, the more the models collapse.
A real reasoner would slow down and work through it. These models don't slow down. They pattern-match. And when the pattern becomes complex enough, they crash.
This paper was published at ICLR 2025, one of the most prestigious AI conferences in the world.
You are using AI to help you make financial decisions. To check legal documents. To solve problems at work. To help your children with homework. And Apple just proved that the AI is not thinking about any of it. It is pattern matching. And the moment something unexpected shows up in your question, it breaks. It does not tell you it broke. It just quietly gives you the wrong answer with full confidence.
Introducing...
Gemma 4 Multimodal Fine-Tuner for Apple Silicon
- LoRA fine-tuning toolkit for Gemma LLMs
- runs locally on macOS via PyTorch and Metal
- streams data from Google Cloud to your machine
- fine-tune on audio, images, and text
- easy-to-use CLI wizard
If you want to fine-tune the new Gemma 4 on text, images, or audio without renting an H100 or copying a terabyte of data to your laptop, this is the only toolkit that does it all on Apple Silicon.
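Under the hood it's the standard LoRA recipe running on PyTorch's MPS (Metal) backend. A minimal sketch of that core trick, assuming Hugging Face transformers + peft; the checkpoint id and target modules here are illustrative, not the toolkit's actual defaults:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Train on the Mac GPU via Metal when available.
device = "mps" if torch.backends.mps.is_available() else "cpu"

model_id = "google/gemma-2-2b"  # stand-in; swap in whichever Gemma checkpoint you use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections only (illustrative)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora).to(device)
model.print_trainable_parameters()  # only the small adapter trains; the base stays frozen
```

Because only the adapter weights get gradients, the whole thing fits in laptop memory instead of needing a rented H100.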
I built a simple internal chatbot that answers questions about your team's SOPs and FAQs using your existing Google Drive docs
in the walkthrough, I show:
– how it finds the right doc for any question (quick sketch below)
– how your team can “@” specific files (like in Slack)
– how to set it up and share it with your team in minutes
reply “chatbot” and I’ll send the full video + code (must be following)
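For the curious: the "find the right doc" step boils down to embedding search. A toy stand-in using sentence-transformers (not the exact code from the video):

```python
from sentence_transformers import SentenceTransformer, util

# Embed every doc once, embed the question, return the closest doc.
docs = {
    "refund-sop.md": "How to process a customer refund, step by step...",
    "onboarding-faq.md": "Common questions new hires ask in week one...",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(list(docs.values()), convert_to_tensor=True)

def find_doc(question: str) -> str:
    q_vec = model.encode(question, convert_to_tensor=True)
    best = util.cos_sim(q_vec, doc_vecs).argmax().item()
    return list(docs)[best]

print(find_doc("how do I give a customer their money back?"))  # refund-sop.md
```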
SVB does not deserve a bailout.
A deep look at their financial statements reveals just how horrific their risk management was.
And in my opinion, incompetence explains only part of it.
Moral hazard must have been at play.
A thread.
1/
Masters of Doom - a super exciting book for anyone interested in the history of id Software, @ID_AA_Carmack, and some of the greatest games of all time: Commander Keen, Doom, and Quake
We raised $2.7 million for an #opensource #mlops framework for production-ready ML pipelines! Check out @zenml_io with our new README and docs! Give us a star and run your first pipeline today! So excited! prnewswire.com/news-releases/…
What happened to all the MEAN/MERN stack devs that used to scream about MongoDB being a superior db to Postgres for every use case because “Postgres isn’t web scale”?
@British_Airways is it possible to get refunds for cancelled flights, please? Using COVID as an excuse for not offering support via phone or chat is not acceptable anymore - it started in February last year!
@HelloFreshUK I got a text message saying I'll get another delivery even though I cancelled via chat. Your support bot is also... not able to handle this
Loved discussing some of the latest topics from ICDAR 2021 in our Shipamax research group. If you want to be at the forefront of cutting-edge machine learning, you can never stop learning :) I'm always humbled by the brainpower of our team.