
Mitchell Bosley
@mitchellbosley
334 posts
Postdoc at CENIA | Previously at the Schwartz-Reisman Institute at UofT | PhD from University of Michigan Poli Sci

I read a few dozen pages of this and it is not bad for LLM fiction, but it is also very, very LLM-y, from the themes to the abundance of staccato conversations, meaningful silences, and overwrought metaphors, with very little differentiated character development.

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

We are back to the phase of the AI news cycle where people underestimate how jagged the AI capability frontier is, and how much models still depend on expert human decision-making or guidance at key points to function well. Still far from "doing all jobs" today.

@_aidan_clark_ My working hypothesis is that, all else equal, humans enjoy producing AI content more than consuming already-generated AI content, because the content generation loop is essentially a Skinner box.

One of the advantages of being an early LLM user is that I have seen The Curve with my own eyes (as in this post, written before ChatGPT or the term "generative AI" existed). I notice recent AI users and companies anchoring on current capabilities as if they were stable. Probably not.
