Roberto Dailey

46 posts

Roberto Dailey
@RobertoDailey1

Research Scientist at @CognizantAILab. Prev: PhD in Semiconductor Data Analytics at UT Operations Research. Not checking this site during Lent.

Joined August 2013
1.5K Following · 328 Followers
Roberto Dailey@RobertoDailey1·
Really cool work from my colleague @_GPaolo on open-ended multi-agent environments. He creates a resource-constrained grid world for AI agents where they can interact, search for resources, and leave persistent text artifacts for each other. Without direction you can see the emergence of rules, division of labor, and even attempted governance! The code is up for you to try out here: github.com/cognizant-ai-l…
Giuseppe Paolo@_GPaolo

What happens when AI agents are left to live (and die) together in a shared world? We’ve been exploring this at the @cognizant AI Lab — and they started forming something that looks like a society.

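For intuition, an environment like the one described above can be sketched as a toy grid world. This is purely illustrative (all names and numbers here are made up, not the cognizant-ai-lab code): agents pay energy to move, harvest resources, and leave persistent text notes on cells for other agents to find.

```python
import random

random.seed(1)
SIZE = 8  # illustrative grid size

class World:
    def __init__(self, n_resources=10):
        # Scatter harvestable resources; notes are persistent text artifacts.
        self.resources = {(random.randrange(SIZE), random.randrange(SIZE))
                          for _ in range(n_resources)}
        self.notes = {}  # (x, y) -> text left behind by an agent

class Agent:
    def __init__(self, name):
        self.name, self.pos, self.energy = name, (0, 0), 5

    def step(self, world):
        x, y = self.pos
        dx, dy = random.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        self.pos = ((x + dx) % SIZE, (y + dy) % SIZE)
        self.energy -= 1                      # moving costs energy
        if self.pos in world.resources:       # harvest if a resource is here
            world.resources.discard(self.pos)
            self.energy += 3
            world.notes[self.pos] = f"{self.name}: resource taken here"
        return world.notes.get(self.pos)      # read any note on this cell

world = World()
agents = [Agent("a1"), Agent("a2")]
for _ in range(50):
    for agent in agents:
        if agent.energy > 0:                  # agents with no energy "die"
            agent.step(world)
```

Even this stripped-down version shows the ingredients the tweet mentions: shared scarce resources, agent "death", and text artifacts that outlive the agent that wrote them.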
Roberto Dailey@RobertoDailey1·
Cognizant AI Lab @cognizantailab is out with new work on gradient-free fine-tuning with Evolution Strategies (ES)! We expand our initial paper with larger models (7B) and math reasoning to demonstrate that ES works out of the box and is competitive with RL across broad domains, without the engineering overhead of gradient-based RL methods. arxiv.org/abs/2509.24372 alphaxiv.org/abs/2509.24372…

Inspired by the success of ES, we have also pushed ES research in three new directions.

First, we put ES to use on a task standard gradient-based RL can't reach: successfully fine-tuning LLMs directly in quantized space with Quantized Evolution Strategies (QES). arxiv.org/abs/2602.03120 alphaxiv.org/abs/2602.03120

Next, we developed a theoretical intuition for why we can succeed at fine-tuning multi-billion-parameter models with population sizes as low as 30, in "Blessing of Dimensionality in LLM Fine-tuning". arxiv.org/abs/2602.00170 alphaxiv.org/abs/2602.00170

Lastly, we use ES to help teach models to know what they know, fine-tuning them on a metacognitive task. arxiv.org/abs/2602.02605 alphaxiv.org/abs/2602.02605

We've just released a blog describing the overall effort here: cgnz.at/6005QZNMb
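The gradient-free idea behind ES fine-tuning can be sketched on a toy objective. This is a generic antithetic-sampling ES loop, not the paper's implementation; the hyperparameters are illustrative (though the population of ~30 matches the size the thread says suffices), and the quadratic "fitness" stands in for an LLM reward.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Stand-in reward: negative squared distance to a hidden optimum.
    target = np.linspace(-1.0, 1.0, theta.size)
    return -np.sum((theta - target) ** 2)

dim, pop, sigma, lr = 64, 30, 0.05, 0.05  # pop ~30, as in the thread
theta = np.zeros(dim)

for step in range(500):
    # Antithetic pairs: evaluate each perturbation and its negation.
    eps = rng.standard_normal((pop // 2, dim))
    eps = np.concatenate([eps, -eps])
    rewards = np.array([fitness(theta + sigma * e) for e in eps])
    # Rank-normalize rewards to [-0.5, 0.5] for robustness to scale.
    ranks = rewards.argsort().argsort() / (pop - 1) - 0.5
    # ES update: weighted sum of perturbations, no backprop anywhere.
    theta += lr / (pop * sigma) * (eps.T @ ranks)

print(f"final fitness: {fitness(theta):.3f}")
```

The only signal the optimizer ever sees is scalar fitness per population member, which is why the same loop applies when "theta" is quantized weights (QES) or the reward is a metacognitive alignment score (ESMA).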
Roberto Dailey@RobertoDailey1·
Our lab was able to run 20-disk Towers of Hanoi (~1 million steps) on GPT-4.1 mini by simply observing per-step error rates and adding appropriate error checking. I think people should no longer be citing the Illusion of Thinking paper as a fundamental limitation of LLMs. x.com/RobertoDailey1…
Guri Singh@heygurisingh

Apple has just published a paper with a devastating title: *The Illusion of Thinking*. And it's not a metaphor. What it demonstrates is that the AI models we use every day - yes, ones like ChatGPT - don't think. Not one bit. They just imitate doing so. Let me explain: 🧵👇

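The per-step error-checking idea can be sketched like this: generate the Hanoi move sequence and validate every single move against the puzzle rules, so one bad step is caught immediately instead of silently corrupting a ~million-step run. The function names are illustrative, not the lab's actual harness, and a plain recursive generator stands in for the model's per-step outputs.

```python
def hanoi_moves(n, src=0, aux=1, dst=2):
    # Standard recursive solution, yielded one (from_peg, to_peg) at a time.
    if n == 1:
        yield (src, dst)
    else:
        yield from hanoi_moves(n - 1, src, dst, aux)
        yield (src, dst)
        yield from hanoi_moves(n - 1, aux, src, dst)

def run_with_checks(n):
    pegs = [list(range(n, 0, -1)), [], []]  # peg 0 holds disks n..1, top last
    steps = 0
    for src, dst in hanoi_moves(n):
        # Per-step sanity checks: never move from an empty peg,
        # never place a larger disk on a smaller one.
        assert pegs[src], f"step {steps}: move from empty peg {src}"
        disk = pegs[src][-1]
        assert not pegs[dst] or pegs[dst][-1] > disk, \
            f"step {steps}: cannot place disk {disk} on {pegs[dst][-1]}"
        pegs[dst].append(pegs[src].pop())
        steps += 1
    assert pegs[2] == list(range(n, 0, -1)), "final state is not solved"
    return steps

print(run_with_checks(20))  # 2**20 - 1 = 1048575 moves, i.e. ~1 million steps
```

With checks at every step, a per-step error rate translates into an immediate, localizable failure rather than an unverifiable wrong final answer, which is the distinction the tweet is drawing against the Illusion of Thinking framing.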
Roberto Dailey reposted
Xin Qiu@realVsonicV·
We recently released a new version of our Evolution Strategies (ES) fine-tuning paper, with more benchmarks, baselines, and discussion, strengthening the foundation for using ES as a backpropagation-free post-training paradigm. (arXiv: arxiv.org/abs/2509.24372, alphaXiv: alphaxiv.org/abs/2509.24372…)

We also released three intriguing follow-up works in this new direction:

(1) Quantized Evolution Strategies (QES) extends ES to post-training of quantized LLMs. With frugal memory usage at low-precision inference level, QES achieves a high-precision optimization trajectory in quantized parameter space. (arXiv: arxiv.org/abs/2602.03120, alphaXiv: alphaxiv.org/abs/2602.03120)

(2) The "Blessing of Dimensionality" paper tries to explain why ES needs a population size of only ~30 to fine-tune billions of parameters. It discovers that larger models may have lower intrinsic dimensionality, which makes parameter-space search in ES easier. (arXiv: arxiv.org/abs/2602.00170, alphaXiv: alphaxiv.org/abs/2602.00170)

(3) "Evolution Strategy for Metacognitive Alignment (ESMA)" uses ES to fine-tune LLMs to know what they know. That is, it uses the alignment between "whether the LLM answers a question correctly" and "whether the LLM knows it can answer that question correctly" as the fine-tuning objective, strengthening the metacognitive alignment of LLMs. (arXiv: arxiv.org/abs/2602.02605, alphaXiv: alphaxiv.org/abs/2602.02605)

Looking forward to adding more to this ES ecosystem!
Jamieson Warner@Jobamey·
This is dissipation of a blob as it mixes and splits chaotically. This is the Lorenz system, and in time all initial conditions converge to its butterfly-esque non-equilibrium steady state (NESS).
nate parrott@nateparrott·
made a version of my website that writes itself from scratch live whenever you visit it
Roberto Dailey@RobertoDailey1·
@sotoumisorato This is awesome. Are the background clouds an image or did you have a system for generating them?
そとうみ@sotoumisorato·
Hand-drawn anime-style rendering. Viewport / Final #blender #b3d
Roberto Dailey@RobertoDailey1·
@samuel_krug I'm tired of Blender's Cycles speedups, need more whimsical features.
Roberto Dailey@RobertoDailey1·
@KhoaVuUmn AI agents are coming for entomology like a freight train. Insects are not prepared.
Roberto Dailey@RobertoDailey1·
@delong @WilliamHogeland Haha I think that's a bit of a philosophical question, though I would note that at least in some cases using standard error-correction paradigms (voting/sanity checking) can let an LLM execute patterns it would normally struggle with: arxiv.org/abs/2511.09030
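The voting half of that error-correction idea is easy to sketch: sample an unreliable per-step procedure several times and take the majority answer, so a step that is right most of the time becomes far more reliable. The `noisy_add` function below is a stand-in for an LLM call, not a real one; names and error rates are made up for illustration.

```python
import random
from collections import Counter

random.seed(0)

def noisy_add(a, b, p_err=0.2):
    # Correct 80% of the time; off by one otherwise (stand-in for a model).
    return a + b if random.random() > p_err else a + b + random.choice([-1, 1])

def vote(fn, *args, k=9):
    # Majority vote over k independent samples of the same step.
    return Counter(fn(*args) for _ in range(k)).most_common(1)[0][0]

print(vote(noisy_add, 123, 456))
```

With k samples at 20% error, the majority is wrong only when most samples independently fail the same way, so per-step reliability compounds far better over long executions than a single sample would.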
Roberto Dailey@RobertoDailey1·
@delong @WilliamHogeland Though I would note that often here I think the LLMs don't struggle with learning an underlying pattern or rule, they struggle with executing said pattern in a consistent/reliable way.
Roberto Dailey@RobertoDailey1·
Yeah here I would agree that humans can very easily scale problem size with time and a pattern/ruleset in a way LLMs continue to struggle at. Given time, humans can scale much farther on addition/multiplication problems, and they don't need to train on a million examples: often just being provided the ruleset is enough.
Roberto Dailey@RobertoDailey1·
No use of tool calls in my multiplication example; I was referring to experiments done by Yuntian Deng last January (image attached). And on "systematic knowledge that is **not** next-token predictions": much of recent model improvement comes from reinforcement learning, which teaches specific tasks, though often when teaching one task, models start to learn a set of adjacent tasks at the same time. As for literary text interpretation, I would agree that's not as prioritized as math/coding at the moment, though I imagine they'll try to improve those abilities over time as well.
[image attached]
DeLong🖖@delong·
@RobertoDailey1 @WilliamHogeland ...what they are doing that does not lend me confidence that there is not a large "let's see how much we can grift naïve VCs for to buy us toys" component to all this. 3/END
Roberto Dailey@RobertoDailey1·
@delong @WilliamHogeland I think this framing is slightly off. Early this year the best models could handle ~9-digit multiplication with high accuracy. There are enough 9-digit multiplication problems that most aren't covered in the training data - at least sometimes models learn underlying patterns.
DeLong🖖@delong·
@WilliamHogeland ...conversation... 4. And then it repeats... So every single word it outputs to you is the result of some human somewhere thinking that is the next word to say in a conversation "close" to the current one. But as the conversation evolves the set of close conversations... 2/
Roberto Dailey@RobertoDailey1·
@_martinsh If you are willing to bake in the lighting, Gaussian splats work really well: bake the lighting from Cycles, then render in real time with EEVEE.
Mārtiņš Upītis 🇱🇻 🇺🇦
Another mission is doing that in EEVEE. I need a 1:1 data set to validate EEVEE custom volumetrics against Cycles, so I baked the cloud data to a 512x512x256 grid. More experiments coming soon.
Mārtiņš Upītis 🇱🇻 🇺🇦
I've been learning about light scattering in clouds, especially how appearance changes with the medium density. You can see how some parts of clouds get seemingly darker and fuzzy while others are bright white. These are Blender Cycles renders, 4 volumetric samples. Quick renders.