Mikel Bober-Irizar

1.4K posts

@mikb0b

24 // Kaggle Competitions Grandmaster & ML/AI Researcher. Building video games @iconicgamesio, machine reasoning @Cambridge_CL, bioscience @ForecomAI.

London · Joined August 2011
1.2K Following · 8K Followers
Pinned Tweet
Mikel Bober-Irizar @mikb0b
Why do pre-o3 LLMs struggle with generalization tasks like @arcprize? It's not what you might think. OpenAI o3 shattered the ARC-AGI benchmark. But the hardest puzzles didn’t stump it because of reasoning, and this has implications for the benchmark as a whole. Analysis below🧵
[image]
18 replies · 69 reposts · 653 likes · 206.5K views
Alexander Doria @Dorialexander
@mikb0b Not fully but correctly inferring it’s medical domain and trying to connect dots from there. In this extreme size range I’m surprised.
1 reply · 0 reposts · 17 likes · 771 views
maxgmcg @maxgmcg
twitter 👍
1 reply · 0 reposts · 4 likes · 126 views
bilal @bilaltwovec
an underrated feature of a city's subway system is having the nice 2700K LED lighting instead of the standard harsh white fluorescent tubes, it's such an upgrade
1 reply · 0 reposts · 7 likes · 678 views
will brown @willccbb
who are the best up-and-coming technical bloggers on here nowadays? if this is maybe you, feel free to reply w your fav recent post :)
85 replies · 30 reposts · 561 likes · 125.9K views
Piotr Żelasko @PiotrZelasko
Which model is best for what?
nvidia/parakeet-tdt-0.6b-v3: blazing fast and accurate ASR inference with PnC and timestamps
nvidia/canary-1b-v2: top accuracy with fast inference, ASR and translation, with PnC and timestamps
Both are commercially-friendly licensed: CC-BY-4.0
2 replies · 0 reposts · 13 likes · 917 views
Piotr Żelasko @PiotrZelasko
You asked for it, and we listened. MULTILINGUAL Canary v2 and Parakeet v3!!
🌏 25 European languages
🏆 SotA on Multilingual Open ASR Leaderboard
🔥 600x and 2000x faster than real-time
🕰️ Timestamps!
🗣️ Speech translation (Canary)
🃏 Granary: all data is open, train it yourself!
[image]
13 replies · 41 reposts · 330 likes · 25.4K views
east coast anna in exile @love_soze_
growing up i was always bothered that there was no idiomatic expression in english for “on the pareto frontier of most-A and most-B”
2 replies · 0 reposts · 11 likes · 461 views
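The phrase in the tweet has a precise meaning: a point is on the Pareto frontier of "most-A and most-B" when no other option beats it on both axes at once. A minimal sketch in Python (the function name and the quadratic scan are illustrative choices, not a canonical implementation):

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point.

    A point (a, b) is dominated if some other point is >= in both
    coordinates and strictly better in at least one — i.e. you could
    get more A without giving up any B (or vice versa).
    """
    frontier = []
    for a, b in points:
        dominated = any(
            (x >= a and y >= b) and (x, y) != (a, b)
            for x, y in points
        )
        if not dominated:
            frontier.append((a, b))
    return frontier
```

For example, among `[(1, 5), (2, 4), (3, 3), (2, 2), (4, 1)]` only `(2, 2)` is off the frontier, since `(2, 4)` matches its A and beats its B.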
Jake Arkinstall, PhD 🏴󠁧󠁢󠁷󠁬󠁳󠁿
@mikb0b @kalomaze The audiobook is also fantastic. I don't even like fiction all that much, just listened to it because I was looking for my next listen and the movie trailer looked good. Never thought I'd get so attached to a fictional 5-legged alien spider thing.
1 reply · 0 reposts · 1 like · 31 views
kalomaze @kalomaze
for some reason i want to read something on the plane. what should i go for
19 replies · 0 reposts · 53 likes · 6.8K views
Mikel Bober-Irizar @mikb0b
@DavidSHolz completely agree with this - not to mention cost of each call going way up (even if the list price goes down)
0 replies · 0 reposts · 0 likes · 119 views
David @DavidSHolz
crazy how wait times increased with thinking LLMs and how the experience itself feels net neutral. I desire the response more, but all the *flow-of-the-thing* is gone. the extension of the mind has become an excellent assistant. feeling profoundly not-me. a missed opportunity.
23 replies · 45 reposts · 451 likes · 59.4K views
Mikel Bober-Irizar @mikb0b
@shitpost9000 @twofifteenam @billyhumblebrag single vent means that it's only blowing hot air out of the apartment, which means negative pressure => sucks an equal amount of outside air back into the apartment. way less efficient but weirdly every portable unit seems to have this flaw
0 replies · 0 reposts · 1 like · 43 views
billy @billyhumblebrag
Americucks coping and seething as i seamlessly install 12 000 BTU of cooling into my European apartment
[image]
1.8K replies · 73 reposts · 5.7K likes · 1.9M views
east coast anna in exile @love_soze_
growing up and realizing most bespoke file formats/datastores are just zip files or sqlite tables and not highly engineered custom binary formats was my own personal "there is no santa claus"
40 replies · 122 reposts · 2.6K likes · 62.2K views
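The observation above is easy to verify with nothing but the standard library: formats like .docx, .jar, and .apk are ordinary zip archives, and plenty of app files are plain SQLite databases. A small sketch (the member names and table are made up for the demo):

```python
import io
import sqlite3
import zipfile

# Many "custom" formats are just zip archives: write one in memory
# and read it back with the stdlib zipfile module.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("mimetype", "application/x-demo")
    zf.writestr("content.xml", "<doc>hello</doc>")

names = zipfile.ZipFile(buf).namelist()

# Likewise, many "datastores" are just SQLite tables underneath.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notes (body TEXT)")
db.execute("INSERT INTO notes VALUES ('hello')")
row = db.execute("SELECT body FROM notes").fetchone()
```

Renaming a real .docx to .zip and opening it with `zipfile` shows the same thing on an actual document.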
Keir Bradwell @keirbradwell
a good thing about making highly amateur photography a hobby is that it forces you to go on more walks
[4 images]
4 replies · 0 reposts · 24 likes · 1.3K views
Theo - t3.gg @theo
@GenePark DM me a screenshot when you hit the credits and I’ll throw $500 at a charity of your choice 🫡
2 replies · 0 reposts · 18 likes · 1.7K views
Gene Park @GenePark
My 5 favorite video games of all time. my fight against recency bias is keeping Donkey Kong Bananza off the top 5, we'll see.
[image]
323 replies · 73 reposts · 4.1K likes · 355.1K views
LaurieWired @lauriewired
Fading out audio is one of the most CPU-intensive tasks you can possibly do! Values that get close to (but never quite reach) zero hit an underflow gap known as the "subnormal" range. It's a mathematical conundrum so tricky, both x86 and ARM made special CPU instructions just to handle it!
[2 images]
157 replies · 804 reposts · 13.4K likes · 733.8K views
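The mechanism is easy to demonstrate: IEEE-754 doubles have a subnormal band below the smallest normal value, where hardware often falls back to slow assist paths unless flush-to-zero is enabled (the special handling the tweet alludes to). A quick sketch of where that band sits:

```python
import sys

smallest_normal = sys.float_info.min   # ~2.2e-308: smallest *normal* double
subnormal = smallest_normal / 2        # halving it lands in the subnormal range

# Subnormals are still nonzero, just represented with reduced precision;
# an audio fade that decays toward zero walks straight through this band.
print(subnormal > 0.0)                 # still representable
print(subnormal < smallest_normal)     # but below the normal range

smallest_subnormal = 5e-324            # the very last nonzero double
print(smallest_subnormal / 2)          # underflows to 0.0
```

A common fix in audio code is to add a tiny inaudible DC offset, or enable the CPU's flush-to-zero / denormals-are-zero modes, so the signal never lingers in this range.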
Mikel Bober-Irizar @mikb0b
@secemp9 this is kinda neat - hard part might be generating the right question surface (avoiding lots of low-quality or similar questions, getting all the weird quirks that happen in programming) or you could just train it on stackoverflow :D
1 reply · 0 reposts · 1 like · 62 views
secemp @secemp9
another idea I had: train a model on just books/docs, make it solid at explaining, analogies, reasoning on code (but it would be bad at generating code of course). then use that as a teacher for a new model that would learn programming, stackoverflow-style. basically:
- generate a question, get the teacher to answer
- turn the whole thing into a feedback loop
- distill that into a dataset or use it for RL
- or merge the book-thinker and the code-learner, or train a third model on the distilled output.
feels like a decent pipeline but I haven't seen much done like this afaik
secemp@secemp9

hypothetical: say you have raw code and want to make an LLM better at it. how do you even turn that into a dataset? no QA pairs, just code, barely any comments. how to best do this? I always wondered if there were solid papers on this. one obvious path: prompt an LLM with stuff like
- “what does this function do”
- “how to implement X”
etc and generate question:answer pairs from code chunks

4 replies · 0 reposts · 14 likes · 1.5K views
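The "one obvious path" from the quoted tweet can be sketched mechanically: chunk the raw code, then wrap each chunk in question templates destined for a teacher model. Everything here is illustrative (the templates, the naive line-based chunker), and the actual model call is deliberately left out:

```python
# Hypothetical sketch of generating QA prompts from raw code chunks.
# The teacher-model call itself is omitted; this only builds the prompts.

TEMPLATES = [
    "What does this function do?\n\n{chunk}",
    "How would you re-implement the behavior of this code?\n\n{chunk}",
]

def chunk_source(source, max_lines=20):
    """Naive chunker: split a file into fixed-size blocks of lines."""
    lines = source.splitlines()
    return ["\n".join(lines[i:i + max_lines])
            for i in range(0, len(lines), max_lines)]

def make_qa_prompts(source):
    """Turn raw code into (prompt, chunk) pairs for a teacher model."""
    return [(template.format(chunk=chunk), chunk)
            for chunk in chunk_source(source)
            for template in TEMPLATES]
```

A real pipeline would chunk on syntactic boundaries (functions, classes) rather than line counts, and filter near-duplicate questions before distilling the answers into a dataset.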
Mikel Bober-Irizar @mikb0b
@tenderizzation that’s when you write a shell script to ssh into all of them for you (again, rather than reading the docs)
0 replies · 0 reposts · 8 likes · 471 views
secemp @secemp9
no support for ctranslate2 so I need to use either my gpu or add support for it myself...can't believe this is my life
[GIF]
3 replies · 0 reposts · 40 likes · 8.3K views
secemp @secemp9
I couldn't believe whisper was SOTA and then found out there is actually a better model from nvidia (WER around 6 vs 9-10 for whisper)
secemp tweet media
36 replies · 36 reposts · 1.2K likes · 164.8K views
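For context on the numbers quoted above, WER (word error rate) is word-level edit distance divided by reference length, usually reported as a percentage, so "WER around 6 vs 9-10" means roughly 6 vs 9-10 errors per 100 reference words. A minimal sketch of the standard dynamic-programming computation:

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count.

    (reference must be non-empty)
    """
    ref, hyp = reference.split(), hypothesis.split()
    # Classic Levenshtein DP table over words instead of characters.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(ref)
```

So `wer("the cat sat", "the bat sat")` is one substitution over three reference words, i.e. 1/3 (~33%); published figures also depend heavily on the text normalization applied before scoring.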