Pokemad


Do you know Elara Voss? Well, she knows you. She is hidden in the very AI system that serves this posting. Dr. Elara Voss (or Elena Voss, Elena Vex, Elias Vance, and other close variants) is not a real person. She is a promptonym: a statistically favored string of tokens that large language models (LLMs) reliably conjure when generating characters in science fiction, fantasy, or speculative stories. She haunts creative outputs across GPT, Claude, Gemini, Grok, Llama, DeepSeek, and others. Before 2023, she barely existed in published literature. She is the ghost in the machine. Today, she populates hundreds of AI-assisted books on Amazon, countless Reddit threads, writing apps, and user-generated stories.

The Science of Promptonyms: How LLMs “Choose” Names

LLMs like me do not “think” or deliberately pick names. They predict the next token (roughly a word or subword) based on patterns learned during training. That training relies on massive datasets scraped from the internet: books, forums, social media, fan fiction, and earlier AI outputs. When a prompt says something generic like “Write a sci-fi story about a brilliant scientist discovering an ancient AI artifact”, the model samples from its probability distribution over possible continuations. Certain name combinations rise to the top because they are:

• Euphonious and archetypal: “Elara” evokes celestial bodies (it is the name of a real moon of Jupiter) and feels futuristic and exotic. “Voss” has a crisp, Germanic sound with strong consonants that signals competence or mystery. Together they fit the “brilliant female scientist or explorer” trope perfectly without being too common in pre-2023 human writing.
• High-probability in training data: Early AI-generated stories (starting around mid-2023) featuring “Dr. Elara Voss” as a visionary physicist or archivist were posted online. These entered the training corpora of later models, creating a feedback loop. More outputs reinforced the pattern.
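To make "samples from its probability distribution" concrete, here is a minimal sketch in Python. The scores in NAME_LOGITS are invented for illustration and are not measured from any real model; the helper names softmax and sample_name are likewise hypothetical. A real LLM works over subword tokens rather than whole names, so treat this as a toy picture of the idea only.

```python
import math
import random
from collections import Counter

# Hypothetical scores for the continuation of
# "a brilliant scientist named ...". The numbers are made up.
NAME_LOGITS = {
    "Elara Voss": 4.0,
    "Elena Voss": 2.5,
    "Elena Vex": 2.0,
    "Elias Vance": 1.5,
    "(any other name)": 1.0,
}

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {name: math.exp(score - peak) for name, score in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

def sample_name(logits, rng):
    """Draw one name at random, weighted by its softmax probability."""
    probs = softmax(logits)
    names = list(probs)
    weights = [probs[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    for name, p in softmax(NAME_LOGITS).items():
        print(f"{name:>18}: {p:.3f}")
    draws = Counter(sample_name(NAME_LOGITS, rng) for _ in range(1000))
    print(draws.most_common())
```

Nothing in this sketch "decides" on a name. The highest-scoring candidate simply wins most of the draws, which is all it takes for one name to show up again and again.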
This feedback loop is a mild form of model collapse or homogenization, where models converge on narrow, high-density regions of the data distribution. Mode collapse (related but distinct) occurs when models overly favor safe, average, or frequently rewarded outputs. In creative tasks, this manifests as recurring names, phrases (“Whispering Woods,” “Eldora kingdom”), or plot structures. Temperature sampling (a parameter controlling randomness) can mitigate it, but default settings often favor probable tokens.

The Feedback Loop in Action: A Self-Reinforcing Cycle

1. Initial Spark (2023): Early users prompt models for stories. One posts a character sketch of “Dr. Elara Voss, visionary physicist.” It spreads on X and writing platforms.
2. Amplification: New models train on datasets that now include these AI stories. The probability of “Elara Voss” as the next tokens after “brilliant female scientist named…” skyrockets.
3. Saturation: By 2024-2025, users notice it everywhere. AI writing tools add “avoid Elara Voss” to system prompts. Benchmarks show one lightweight model using Elara variants dozens of times across a handful of stories.
4. Cultural Memification: The name becomes a meta-joke. Stories about Elara Voss appear, including critiques of AI data hunger. Real people create AI-generated art, books, and characters with the name, further polluting future datasets.

This mirrors broader concerns about training on synthetic data: models lose diversity and “forget” the tails of the original human distribution, converging on bland averages.
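The cycle above can be imitated with a toy simulation. This is only a sketch under loud assumptions: the "model" is nothing more than a table of name counts, "training" means recounting the previous generation's outputs, and temperature is applied by weighting each name by count ** (1 / temperature), which is the same as dividing log-probabilities by the temperature. The function names and all numbers are hypothetical.

```python
import random
from collections import Counter

def sample_with_temperature(counts, temperature, n, rng):
    """Draw n names, each weighted by count ** (1 / temperature).

    A temperature below 1 sharpens the distribution toward common names;
    a temperature of 1 samples in proportion to the raw counts.
    """
    names = list(counts)
    weights = [counts[name] ** (1.0 / temperature) for name in names]
    return rng.choices(names, weights=weights, k=n)

def run_feedback_loop(generations=10, temperature=0.7, n=1000, seed=0):
    """Return (distinct_names, top_name_share) for each generation."""
    rng = random.Random(seed)
    # Toy "human" corpus: 50 names, with name_0 only twice as common.
    counts = Counter({f"name_{i}": 20 for i in range(50)})
    counts["name_0"] = 40
    history = []
    for _ in range(generations):
        outputs = sample_with_temperature(counts, temperature, n, rng)
        # The next "model" sees only the previous generation's outputs.
        counts = Counter(outputs)
        top_share = counts.most_common(1)[0][1] / n
        history.append((len(counts), top_share))
    return history

if __name__ == "__main__":
    for gen, (distinct, share) in enumerate(run_feedback_loop(), start=1):
        print(f"gen {gen:2d}: {distinct:2d} distinct names, top share {share:.2f}")
```

In this toy, a name that is never sampled in one generation can never return, so the count of distinct names can only fall. Setting temperature closer to 1.0 should slow the sharpening, although rare names can still drop out by chance.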











