Noema

773 posts

@noemaclips

Nostalgic for the future. A lantern into the latent space ✧

The Long Now · Joined December 2015
1.8K Following · 189 Followers
Bayesian @Bayesian0_0
Saw this one coming last month!
Bayesian tweet media
English
4
0
31
628
John Yang @jyangballin
How much of SQLite, FFmpeg, PHP compiler can LMs code from scratch? Given just an executable and no starter code or internet access. Introducing ProgramBench: 200 rigorous, whole-repo generation tasks where models design, build, and ship a working program end to end. 🧵
John Yang tweet media
English
97
242
1.5K
671.2K
Anton Korinek @akorinek
1/🆕 New NBER paper: When Does Automating AI Research Produce Explosive Growth? Under empirically grounded calibrations, a singularity could arrive within just a few years of automating AI research. 🧵 📄 nber.org/papers/w35155
Anton Korinek tweet media
English
10
76
345
80.4K
Noema @noemaclips
@Delachica_ @runwayml Awesome!! Did you use AI for the audio too? If so, which one?
Spanish
1
0
17
25.6K
Contanimation @Cont_animation
Sincitium is finally here. We are pleased to present our latest piece: a concept trailer created specifically for the @runwayml Big Pitch Contest. For this project, we wanted to explore a completely different aesthetic from our usual studio style, and this film is the result of that experimentation. We hope you enjoy it as much as we enjoyed the creative process.
Produced by: Contanimation
Directed by: Javier De La Chica and Guillermo Miranda
Art Direction: Javier De La Chica
Editing: Guillermo Miranda
Voices: Juan Rabadán
#runwaybigpitchcontest
English
407
490
4.8K
3M
Noema @noemaclips
@j_dekoninck Amazing. Are you planning to develop your own aggregated index? I'd love something similar to Epoch's ECI or the Artificial Analysis Index but for MathArena. Keep up the good work!! 🫶
English
1
0
1
58
Jasper Dekoninck @j_dekoninck
Introducing a new paper!
Beyond Benchmarks: MathArena as an Evaluation Platform for Mathematics with LLMs
Static benchmarks are no longer enough. Models improve too quickly and numbers become stale quickly. Instead, we argue for continuously maintained evaluation platforms.
Jasper Dekoninck tweet media
English
4
12
51
2.6K
Miles Brundage @Miles_Brundage
Common Codex (app) bug - you close the window but the app doesn't close, and then when you try to open up a new window, it doesn't work, and you have to reset the app
English
4
0
17
2.5K
Ahmad Al-Dahle @Ahmad_Al_Dahle
The most interesting thing about DeepSeek-V4 isn't the benchmarks, it's the bet: efficient ultra-long context is the precondition for test-time scaling and long-horizon agents. 27% of V3's FLOPs at 1M tokens! The rest flows from this ...
DeepSeek @deepseek_ai

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: huggingface.co/deepseek-ai/De…
🤗 Open Weights: huggingface.co/collections/de…
1/n

English
3
1
41
5.4K
Carlos Santana @DotCSV
Have you already spotted what the "yellow tint" of the new version of ChatGPT images is? I'm sure of it 🤗 Here's a visual test for you, but be careful: once you see it, you won't stop noticing it.
Carlos Santana tweet media (4 images)
Spanish
49
25
617
194.2K
Noema @noemaclips
@DotCSV Looks like it's a bug! If you run it through the API it never happens (as far as I've noticed)
Spanish
0
0
7
7.1K
Angel 🌼 @Angaisb_
I also made this game using GPT Image 2 and Codex (GPT-5.4). We can now build so much stuff thanks to GPT Image 2
English
21
13
342
28.4K
Noema @noemaclips
@AcerFur 2 models though, maybe the nerfed one is the instant one? Hyped to see how well it scores on IRGB!
English
0
0
2
252
Acer @AcerFur
GPT-image-2 reasons during image generation. Now you know why I made IRGB ;)
English
6
2
118
5.5K
Noema @noemaclips
@fofrAI Fofr dubstep room, smoke guns triggering on a big drop, crushed monstera leaves strong aroma fills the air
English
1
0
1
130
Noema @noemaclips
@spicey_lemonade That really looks like images v1.5, are you sure it's the new one?
English
1
0
10
651
spicylemonade @spicey_lemonade
GPT Image 2 failure point: I tested whether the model could reconstruct a figure from a physics paper using only the text description, and it generated a completely unrelated image. It seems like it can’t handle long context. This reinforces the idea that it’s not part of an Omni 5.5 model, but rather a separate image generation model that the system routes to
spicylemonade tweet media (2 images)
English
8
3
38
7.4K
Noema @noemaclips
@AcerFur On top of Prism or a totally different tool?
English
0
0
0
238
Acer @AcerFur
I can’t wait to work on some things in the summer. Hopefully we can bring a *very* good tool for mathematicians
English
6
6
142
4.8K
Noema @noemaclips
@AcerFur Had to try <3 Excited to see you pushing the limits on your stay at OAI bro! 🫶
English
0
0
1
165
Acer @AcerFur
@noemaclips Hah I’m not gonna answer that one
English
1
0
4
209
Acer @AcerFur
It is funny to see how eager AI bros are to see what mathematicians say about models nowadays lol. I don’t think society has ever cared so much about what mathematicians have to say
English
14
4
181
8.6K
Noema @noemaclips
@AcerFur For a .1 jump or for a new pre-training?
English
1
0
1
203
Acer @AcerFur
@noemaclips The models have done pretty much exactly as well as I expected
English
1
0
6
521