Anselm
2.8K posts

@MagicTraks
I love all things good and decent... except Brussels sprouts. Brussels sprouts deserve whatever evil befalls them...


Hegseth's firing of Gen. Randy George "reflects growing hostility between Hegseth and the Army’s leadership," military officials told NYT nytimes.com/2026/04/02/us/…


The 2026 AI bubble is peaking right now: someone just dropped a new benchmark on Claude skills, and the results in their paper are actually insane.
The paper explicitly states that smaller models with skills beat larger models without them. A smaller model like Claude 4.5 Haiku equipped with high-quality skills smokes a raw state-of-the-art Opus 4.5 model by about 6 points (27.7 vs 22.0).
Imagine getting SOTA-level performance from a free model. It's basically cheating: you just have to manually spoon-feed it a basic markdown file explaining how to do its job.
All of you Opus guys are dumb. You can literally spam Haiku with skills and get things shipped in a fifth of the time at 0 cost.
Even wilder, Codex GPT 5.2 fails to land on the Pareto frontier. Codex burned massive compute and cost just to get completely mogged by Gemini 3 Flash, which hits maximum performance at a fraction of the cash.
I can believe skill engineering is now a valid, mathematically proven substitute for compute.
On top of that, it says self-generated skills provide zero benefit on average and show negative deltas on 16/84 tasks. If you give an agent more than 3 skills at once, it bloats its context and completely fails.
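
The 27.7 vs 22.0 comparison quoted in the post can be read either as an absolute gap or as a relative gain, and the two give quite different numbers. A minimal sketch of that arithmetic, assuming both scores are on the same 0-100 scale (the post does not state the unit); the variable and function names are hypothetical, chosen only for illustration:

```python
# Figures quoted in the post; the 0-100 scale is an assumption, not stated there.
haiku_with_skills = 27.7
opus_without_skills = 22.0
negative_delta_tasks = 16
total_tasks = 84


def absolute_gap(a: float, b: float) -> float:
    """Difference between two scores, in points."""
    return a - b


def relative_gain(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


gap_points = absolute_gap(haiku_with_skills, opus_without_skills)
gain_percent = relative_gain(haiku_with_skills, opus_without_skills)
negative_share = negative_delta_tasks / total_tasks * 100.0

print(f"absolute gap: {gap_points:.1f} points")
print(f"relative gain: {gain_percent:.1f}%")
print(f"tasks with negative deltas: {negative_share:.1f}%")
```

Under that assumption, the gap of roughly 6 points corresponds to a relative gain of about 26%, and 16 of 84 tasks is about 19% of the benchmark.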


This is probably the most gonzo case I have found in academic publishing: the Review of Financial Studies published an article that was critical of Fannie Mae, so Fannie Mae sponsored a "replication" to get the original article retracted. Here, the authors of the original article detail incredible coding issues in the "replication" (hardcoded arbitrary exclusion of observations, critical data join errors, arbitrary duplication of observations, introduction of non-classical measurement error through flawed rounding, a flawed event study design with non-mutually exclusive time dummies and incorrect treatment years). I haven't yet verified all these errors, but I will check them. The authors do seem credible, and it is notable that the Review of Financial Studies doesn't care. When I say that truth has no ontological status in academic journals, I think this is probably an example of what I mean.




Someone just posted this on TikTok😭😭😭😭😭😭 We really do put our supervisors through a lot
