Roberto Dailey

46 posts

Roberto Dailey
@RobertoDailey1

Research Scientist at @CognizantAILab. Prev: PhD in Semiconductor Data Analytics at UT Operations Research. Not checking this site during Lent.

Joined August 2013
1.5K Following · 328 Followers
Roberto Dailey@RobertoDailey1·
Really cool work from my colleague @_GPaolo on open-ended multi-agent environments. He creates a resource-constrained grid world for AI agents where they can interact, search for resources, and leave persistent text artifacts for each other. Without direction you can see the emergence of rules, division of labor, and even attempted governance! The code is up for you to try out here: github.com/cognizant-ai-l…
Giuseppe Paolo@_GPaolo

What happens when AI agents are left to live (and die) together in a shared world? We’ve been exploring this at the @cognizant AI Lab — and they started forming something that looks like a society.

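For intuition, an environment like the one described above can be sketched as a toy grid world. This is purely illustrative (all names and numbers here are made up, not the cognizant-ai-lab code): agents pay energy to move, harvest resources, and leave persistent text notes on cells for other agents to find.

```python
import random

random.seed(1)
SIZE = 8  # illustrative grid size

class World:
    def __init__(self, n_resources=10):
        # Scatter harvestable resources; notes are persistent text artifacts.
        self.resources = {(random.randrange(SIZE), random.randrange(SIZE))
                          for _ in range(n_resources)}
        self.notes = {}  # (x, y) -> text left behind by an agent

class Agent:
    def __init__(self, name):
        self.name, self.pos, self.energy = name, (0, 0), 5

    def step(self, world):
        x, y = self.pos
        dx, dy = random.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        self.pos = ((x + dx) % SIZE, (y + dy) % SIZE)
        self.energy -= 1                      # moving costs energy
        if self.pos in world.resources:       # harvest if a resource is here
            world.resources.discard(self.pos)
            self.energy += 3
            world.notes[self.pos] = f"{self.name}: resource taken here"
        return world.notes.get(self.pos)      # read any note on this cell

world = World()
agents = [Agent("a1"), Agent("a2")]
for _ in range(50):
    for agent in agents:
        if agent.energy > 0:                  # agents with no energy "die"
            agent.step(world)
```

Even this stripped-down version shows the ingredients the tweet mentions: shared scarce resources, agent "death", and text artifacts that outlive the agent that wrote them.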
Roberto Dailey@RobertoDailey1·
Cognizant AI Lab @cognizantailab is out with new work on gradient-free fine-tuning with Evolution Strategies (ES)! We expand our initial paper with larger models (7B) and math reasoning to demonstrate that ES works out of the box and is competitive with RL across broad domains, without the engineering overhead of gradient-based RL methods. arxiv.org/abs/2509.24372 alphaxiv.org/abs/2509.24372…

Inspired by the success of ES, we have also pushed ES research in three new directions.

First, we put ES to use on a task standard gradient-based RL can't reach: successfully fine-tuning LLMs directly in quantized space with Quantized Evolution Strategies (QES). arxiv.org/abs/2602.03120 alphaxiv.org/abs/2602.03120

Next, we developed a theoretical intuition for why we can succeed at fine-tuning multi-billion-parameter models with population sizes as low as 30, in "Blessing of Dimensionality in LLM Fine-tuning". arxiv.org/abs/2602.00170 alphaxiv.org/abs/2602.00170

Lastly, we use ES to help teach models to know what they know, fine-tuning them on a metacognitive task. arxiv.org/abs/2602.02605 alphaxiv.org/abs/2602.02605

We've just released a blog describing the overall effort here: cgnz.at/6005QZNMb
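The gradient-free idea behind ES fine-tuning can be sketched on a toy objective. This is a generic antithetic-sampling ES loop, not the paper's implementation; the hyperparameters are illustrative (though the population of ~30 matches the size the thread says suffices), and the quadratic "fitness" stands in for an LLM reward.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Stand-in reward: negative squared distance to a hidden optimum.
    target = np.linspace(-1.0, 1.0, theta.size)
    return -np.sum((theta - target) ** 2)

dim, pop, sigma, lr = 64, 30, 0.05, 0.05  # pop ~30, as in the thread
theta = np.zeros(dim)

for step in range(500):
    # Antithetic pairs: evaluate each perturbation and its negation.
    eps = rng.standard_normal((pop // 2, dim))
    eps = np.concatenate([eps, -eps])
    rewards = np.array([fitness(theta + sigma * e) for e in eps])
    # Rank-normalize rewards to [-0.5, 0.5] for robustness to scale.
    ranks = rewards.argsort().argsort() / (pop - 1) - 0.5
    # ES update: weighted sum of perturbations, no backprop anywhere.
    theta += lr / (pop * sigma) * (eps.T @ ranks)

print(f"final fitness: {fitness(theta):.3f}")
```

The only signal the optimizer ever sees is scalar fitness per population member, which is why the same loop applies when "theta" is quantized weights (QES) or the reward is a metacognitive alignment score (ESMA).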
Roberto Dailey@RobertoDailey1·
Our lab was able to run 20-disk Towers of Hanoi (~1 million steps) on GPT-4.1 mini by simply observing per-step error rates and adding appropriate error checking. I think people should no longer be citing the Illusion of Thinking paper as a fundamental limitation of LLMs. x.com/RobertoDailey1…
Guri Singh@heygurisingh

Apple has just published a paper with a devastating title: *The Illusion of Thinking*. And it's not a metaphor. What it demonstrates is that the AI models we use every day - yes, ones like ChatGPT - don't think. Not one bit. They just imitate doing so. Let me explain: 🧵👇

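The per-step error-checking idea can be sketched like this: generate the Hanoi move sequence and validate every single move against the puzzle rules, so one bad step is caught immediately instead of silently corrupting a ~million-step run. The function names are illustrative, not the lab's actual harness, and a plain recursive generator stands in for the model's per-step outputs.

```python
def hanoi_moves(n, src=0, aux=1, dst=2):
    # Standard recursive solution, yielded one (from_peg, to_peg) at a time.
    if n == 1:
        yield (src, dst)
    else:
        yield from hanoi_moves(n - 1, src, dst, aux)
        yield (src, dst)
        yield from hanoi_moves(n - 1, aux, src, dst)

def run_with_checks(n):
    pegs = [list(range(n, 0, -1)), [], []]  # peg 0 holds disks n..1, top last
    steps = 0
    for src, dst in hanoi_moves(n):
        # Per-step sanity checks: never move from an empty peg,
        # never place a larger disk on a smaller one.
        assert pegs[src], f"step {steps}: move from empty peg {src}"
        disk = pegs[src][-1]
        assert not pegs[dst] or pegs[dst][-1] > disk, \
            f"step {steps}: cannot place disk {disk} on {pegs[dst][-1]}"
        pegs[dst].append(pegs[src].pop())
        steps += 1
    assert pegs[2] == list(range(n, 0, -1)), "final state is not solved"
    return steps

print(run_with_checks(20))  # 2**20 - 1 = 1048575 moves, i.e. ~1 million steps
```

With checks at every step, a per-step error rate translates into an immediate, localizable failure rather than an unverifiable wrong final answer, which is the distinction the tweet is drawing against the Illusion of Thinking framing.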
Roberto Dailey reposted
Xin Qiu@realVsonicV·
We recently released a new version of our Evolution Strategies (ES) fine-tuning paper, with more benchmarks, baselines, and discussion, strengthening the foundation for using ES as a backpropagation-free post-training paradigm. (arXiv: arxiv.org/abs/2509.24372, alphaXiv: alphaxiv.org/abs/2509.24372…)

We also released three intriguing follow-up works in this new direction:

(1) Quantized Evolution Strategies (QES) extends ES to post-training of quantized LLMs. With frugal memory usage at low-precision inference level, QES achieves a high-precision optimization trajectory in quantized parameter space. (arXiv: arxiv.org/abs/2602.03120, alphaXiv: alphaxiv.org/abs/2602.03120)

(2) The "Blessing of Dimensionality" paper tries to explain why ES needs a population size of only ~30 to fine-tune billions of parameters. It discovers that larger models may have lower intrinsic dimensionality, which makes parameter-space search in ES easier. (arXiv: arxiv.org/abs/2602.00170, alphaXiv: alphaxiv.org/abs/2602.00170)

(3) "Evolution Strategy for Metacognitive Alignment (ESMA)" uses ES to fine-tune LLMs to know what they know. That is, it uses the alignment between "whether the LLM answers a question correctly" and "whether the LLM knows it can answer that question correctly" as the fine-tuning objective, strengthening the metacognitive alignment of LLMs. (arXiv: arxiv.org/abs/2602.02605, alphaXiv: alphaxiv.org/abs/2602.02605)

Looking forward to adding more to this ES ecosystem!
Jamieson Warner@Jobamey·
This is dissipation of a blob as it mixes and splits chaotically. This is the Lorenz system, and in time all initial conditions converge to its butterfly-esque non-equilibrium steady state (NESS).
nate parrott@nateparrott·
made a version of my website that writes itself from scratch live whenever you visit it
Roberto Dailey@RobertoDailey1·
@sotoumisorato This is awesome. Are the background clouds an image or did you have a system for generating them?
そとうみ@sotoumisorato·
Hand-drawn anime-style rendering. Viewport / Final #blender #b3d
Roberto Dailey@RobertoDailey1·
@samuel_krug I'm tired of Blender's Cycles speedups, need more whimsical features.
Roberto Dailey@RobertoDailey1·
@KhoaVuUmn AI agents are coming for entomology like a freight train. Insects are not prepared.
Roberto Dailey@RobertoDailey1·
@delong @WilliamHogeland Haha I think that's a bit of a philosophical question, though I would note that at least in some cases using standard error-correction paradigms (voting/sanity checking) can let an LLM execute patterns it would normally struggle with: arxiv.org/abs/2511.09030
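The voting half of that error-correction idea is easy to sketch: sample an unreliable per-step procedure several times and take the majority answer, so a step that is right most of the time becomes far more reliable. The `noisy_add` function below is a stand-in for an LLM call, not a real one; names and error rates are made up for illustration.

```python
import random
from collections import Counter

random.seed(0)

def noisy_add(a, b, p_err=0.2):
    # Correct 80% of the time; off by one otherwise (stand-in for a model).
    return a + b if random.random() > p_err else a + b + random.choice([-1, 1])

def vote(fn, *args, k=9):
    # Majority vote over k independent samples of the same step.
    return Counter(fn(*args) for _ in range(k)).most_common(1)[0][0]

print(vote(noisy_add, 123, 456))
```

With k samples at 20% error, the majority is wrong only when most samples independently fail the same way, so per-step reliability compounds far better over long executions than a single sample would.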
Roberto Dailey@RobertoDailey1·
@delong @WilliamHogeland Though I would note that often here I think the LLMs don't struggle with learning an underlying pattern or rule, they struggle with executing said pattern in a consistent/reliable way.
Roberto Dailey@RobertoDailey1·
Yeah here I would agree that humans can very easily scale problem size with time and a pattern/ruleset in a way LLMs continue to struggle at. Given time, humans can scale much farther on addition/multiplication problems, and they don't need to train on a million examples: often just being provided the ruleset is enough.
Roberto Dailey@RobertoDailey1·
No use of tool calls in my multiplication example; I was referring to experiments done by Yuntian Deng last January (image attached). And on "systematic knowledge that is **not** next-token predictions": much of recent model improvement comes from reinforcement learning, which teaches specific tasks, though often when teaching one task, models start to learn a set of adjacent tasks at the same time. As for literary text interpretation, I would agree that's not as prioritized as math/coding at the moment, though I imagine they'll try to improve those abilities over time as well.
[image attached]
DeLong🖖@delong·
@RobertoDailey1 @WilliamHogeland ...what they are doing that does not lend me confidence that there is not a large "let's see how much we can grift naïve VCs for to buy us toys" component to all this. 3/END
Roberto Dailey@RobertoDailey1·
@delong @WilliamHogeland I think this framing is slightly off. Early this year the best models could handle ~9-digit multiplication with high accuracy. There are enough 9-digit multiplication problems that most aren't covered in the training data - at least sometimes models learn underlying patterns.
DeLong🖖@delong·
@WilliamHogeland ...conversation... 4. And then it repeats... So every single word it outputs to you is the result of some human somewhere thinking that is the next word to say in a conversation "close" to the current one. But as the conversation evolves the set of close conversations... 2/
Roberto Dailey@RobertoDailey1·
@_martinsh If you are willing to bake in the lighting, Gaussian splats work really well: bake the lighting from Cycles, then render in real time with EEVEE.
Mārtiņš Upītis 🇱🇻 🇺🇦
Another mission is doing that in EEVEE. I need a 1:1 data set to validate EEVEE custom volumetrics against Cycles, so I baked the cloud data to a 512x512x256 grid. More experiments coming soon.
Mārtiņš Upītis 🇱🇻 🇺🇦
I've been learning about light scattering in clouds, especially how appearance changes with the medium density. You can see how some parts of clouds get seemingly darker and fuzzy while others are bright white. These are Blender Cycles renders, 4 volumetric samples. Quick renders.