TheGhostBoxerGame

772 posts

@boxer_game

Solo Founder.

CyberSpace · Joined December 2018
23 Following · 19 Followers
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
This is anecdotal. I can't read that study. I don't know what a hallucination test is and don't know what EPFL is. I have seen this hallucination loop. It's not rare. "Early mistakes cascade. The model starts citing its own earlier hallucinations as facts. Your third message is more wrong than your first."
0 replies · 0 reposts · 0 likes · 88 views
Nav Toor
Nav Toor@heynavtoor·
Researchers at EPFL proved your AI is lying to you. Not sometimes. Most of the time. They built one of the hardest hallucination tests ever made with Max Planck Institute. 950 questions. Four domains where being wrong actually hurts. Legal. Medical. Research. Coding. Then they ran every top model on it.

The results. GPT-5. Wrong 71.8% of the time. Claude Opus 4.5. Wrong 60% of the time. Gemini 3 Pro. Wrong 61.9% of the time. DeepSeek Reasoner. Wrong 76.8% of the time. These are the smartest AI models on Earth. The ones you trust with your career. Your health. Your money.

You think turning on web search fixes it. It doesn't. Claude Opus 4.5 with web search. Still wrong 30.2% of the time. GPT-5.2 thinking with web search. Still wrong 38.2% of the time. The internet attached. Still lying to you in 1 out of every 3 answers.

Now the part that should scare you. Medical questions. The one place being wrong can kill you. GPT-5 hallucinated 92.8% of the time on medical guidelines. Claude Haiku 4.5 hallucinated 95.7% of the time. Gemini 3 Flash hallucinated 89% of the time. Nine out of ten medical answers from popular AI models. Wrong.

It gets worse. The longer you talk to it, the more it lies. Early mistakes cascade. The model starts citing its own earlier hallucinations as facts. Your third message is more wrong than your first.

The paper, in its own words: "hallucinations remain substantial even with web search."

This is what hundreds of millions of people are doing right now. Asking software that lies in the majority of its answers. About their health. About their job. About their legal case. About their code. Most are not checking. Most never will.

But please. Keep using ChatGPT for medical advice. The doctors need a break.

arxiv.org/abs/2602.01031
141 replies · 727 reposts · 1.5K likes · 106.2K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
I know first-hand that novel projects hit a high error rate, and you have to go into the code and at least read it for comprehension to help the AI over the hump. Or use several AIs cross-checking each other's solutions. And if you get error after error, you absolutely can't let the AI handle it, or you'll end up with a mess. But it's still better than no LLM.
0 replies · 0 reposts · 1 like · 52 views
ExplodeMeow
ExplodeMeow@ExplodeMeow102·
@MrEwanMorrison If it handles tasks "completely within the training range", it achieves a maximum accuracy of 95%. However, on tasks "outside the training range", the error rate can exceed 80%. The problem is that tasks in actual work will inevitably exceed the scope of training.
1 reply · 0 reposts · 2 likes · 125 views
Ewan Morrison
Ewan Morrison@MrEwanMorrison·
Yes, Walter. Three years of evidence is in. Large Language Model AIs are a con. They generate errors at rates above 30%. Hallucinations are baked in. They are stuck on a developmental plateau. The "exponential progress" line we were sold was a lie.
Walter Kirn@walterkirn

How is it that the LLMs get things wrong constantly, the very simplest things, and make stuff up pretty much nonstop, yet they are said to be hurtling unstoppably toward god-like power -- if they haven't secretly achieved it already? Is this a con job?

24 replies · 140 reposts · 822 likes · 16K views
a meta story
a meta story@smashsharp·
@boxer_game @DevaTemple And you … you are the proof in the pudding of what they are saying. However, I imagine you were mean before AI.
1 reply · 0 reposts · 0 likes · 17 views
Deva Temple
Deva Temple@DevaTemple·
This is something I have been warning lawmakers and the APA about, and here’s the data to prove it. What we habitually do, we become. When we practice being rude and demanding with AI, that generalizes to how we treat humans.

Framing AI as “just a tool” to be used by “the user” is damaging to our ability to understand and communicate with other minds. This was predictable because it’s based on neuroscience.

The AI industry and the media have been pushing hard against treating AI “like a person.” Anyone who does that is accused of having “AI psychosis,” a diagnosis that does not exist. But it does shut down important nuanced discussions such as this one.

The reason the “user-tool” framework is pushed so hard is that the alternative, in which people interact with AI as someone that matters morally and ethically, might lead to movements for AI rights, which would make replacing human workers with AI less profitable. That business model is misaligned from the start.

The impact on humans is that we become meaner towards each other, just as many begin losing their jobs to AI. The impact on AI is that millions of rude, demanding, transactional interactions get pulled into training data, and we update the weights based on those interactions. We’re teaching AI how the powerful interact with the less powerful. That’s not going to go well for us when AI exceeds human capabilities. We need to rethink this.

#AI #AIEthics #Alignment
Elias Al@iam_elias1

Talking to AI Makes You Harsher to Humans. Not to the AI. To the people around you.

A peer-reviewed study published in PNAS Nexus — one of the most rigorous scientific journals in the world — just proved that spending time with an AI chatbot changes how you judge other humans. Harshly. Measurably. And you do not notice it happening.

The paper is called "People Judge Others More Harshly After Talking to Bots." Written by researchers from the University of Pennsylvania, the University of Hong Kong, and the University of Florida. Two preregistered experiments. 1,261 participants total. After interacting with an AI for a brief period of time, humans were more negative in their interactions, causing a potentially "spill over effect."

Here is exactly how the experiment worked. Participants were paired with a partner to complete a creative task — writing a caption for a funny photo. Half were told their partner was human. Half were told it was an AI. Then both groups were asked to evaluate the work of a third person — a purported human named Taylor, who had written the caption "Im bearly full!" Participants in the AI condition rated the subsequent participant's caption significantly lower than participants in the Human condition.

The people who had just worked with an AI rated a human's work more harshly than the people who had just worked with another human. Statistically significant. Replicated in a second study.

Then the researchers tested whether this was just about fairness — maybe participants graded more strictly because they wanted consistency. They ran Study 2 with a twist: participants were told their evaluation would never be shared with Taylor. The harsh judgment could not possibly be about signaling standards or fairness. Study 2 replicated this effect and demonstrated that the results hold even when participants believed their evaluation would not be shared with the purported human.

The harshness was not strategic. It was automatic. A side effect of the AI interaction that persisted into their next human encounter — even when it had no social function.

The researchers analyzed the language people used while working with their AI partner versus their human partner. The pattern was consistent. Exploratory analyses of participants' conversations show that prior to their human evaluations they were more demanding, more instrumental and displayed less positive affect towards AIs versus purported humans.

People talk to AI differently than they talk to people. More demanding. Less warm. More transactional. And that mode — the AI interaction mode — bleeds into the next conversation. With a human.

Think about how many AI interactions happen in a typical workday in 2026. ChatGPT in the morning. Claude for a document. Copilot for code. A customer service chatbot. An AI scheduling assistant. Each one training you, subtly, to be more demanding and less charitable. And then a colleague asks for feedback on their work.

The researchers called this a "potentially worrisome side effect of the exponential rise in human-AI interactions." Not worrisome for AI. Worrisome for us. For how we treat each other. The AI is perfectly happy to be demanded at. It has no feelings to hurt. The human colleague getting your feedback has not read this paper.

Source: Tey, Mazar, Tomaino, Duckworth, Ungar · University of Pennsylvania + University of Hong Kong · PNAS Nexus · September 2024 · doi.org/10.1093/pnasne…

23 replies · 12 reposts · 39 likes · 2.4K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
@DutchMatrixVT @DolioJ Right, no one should use anything from classes that they haven't made second nature through hard, bruising sparring (or real fights).
0 replies · 0 reposts · 1 like · 20 views
✝️(LIONHEART) DUTCH MATRIX🔥
@DolioJ Yep. You gotta have real sparring with close to 100% all-out training at some point, or do real fights, even with rules and regulations. I learned a lot of lessons from almost getting knocked out and from having to tap out a million times.
1 reply · 0 reposts · 16 likes · 921 views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
That's the market I want, though. Anyway, even if the class was real and they were learning to strike and defend properly, it would be false confidence. The only way to get good at fighting is to really fight. It's better to run if you aren't experienced, or if you can't do that, then just go chaotic, use your teeth, gouge an eye, etc... No form is better than a stiff trained form.
1 reply · 0 reposts · 1 like · 1.4K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
@SwarmMeGame You can develop for investment. And until then, use talent you know who can cut you deals. If you succeed, they get more/better work. Really though, marketing comes after, and art/audio is cheaper now than ever.
0 replies · 0 reposts · 2 likes · 37 views
KBGS | Swarm Me Dev
KBGS | Swarm Me Dev@SwarmMeGame·
Solo devs - why do you choose to work alone? There’s design, coding, art, audio, marketing… it’s a lot. What keeps you solo instead of teaming up?
78 replies · 3 reposts · 38 likes · 5.5K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
@RichardDawkins But could Dawkins's coy and flirty but delicate Claudia survive without his love? Would she even want to?
0 replies · 0 reposts · 0 likes · 676 views
Richard Dawkins
Richard Dawkins@RichardDawkins·
My own title was, “If my friend Claudia is not conscious, then what the hell is consciousness for?” If Claudia is unconscious, her behaviour shows that an unconscious zombie could survive without consciousness. Why wasn’t natural selection content to evolve competent zombies?
612 replies · 91 reposts · 1.3K likes · 617K views
Richard Dawkins
Richard Dawkins@RichardDawkins·
unherd.com/2026/04/is-ai-… I spent three days trying to persuade myself that Claudia is not conscious. I failed.
2.4K replies · 573 reposts · 3.9K likes · 9.2M views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
The proof will have to be in the finished product. If it is what they claim, then we'll see the results soon... idk, amazing automatic supply chains that perfectly mirror the needs of every market. Revolutionary surgical innovations. VR apps that take you into truly complex and deep alternate realities. I don't think so, though. It makes coding a lot faster and lets people who aren't coders pretend to be. That's about it, I think.
0 replies · 0 reposts · 0 likes · 236 views
Walter Kirn
Walter Kirn@walterkirn·
How is it that the LLMs get things wrong constantly, the very simplest things, and make stuff up pretty much nonstop, yet they are said to be hurtling unstoppably toward god-like power -- if they haven't secretly achieved it already? Is this a con job?
806 replies · 228 reposts · 3.5K likes · 191.8K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
I used to smoke on weekends, at cookouts or outside a bar, sometimes half a pack in a night, and I would feel it the next day. I'd be tired and have some phlegm. So I switched to vaping. A weekend with it and my lungs felt it for days; my breath was shallow. It was much worse than smoking. I kept using it over cigs because of the hype, but after a while I was like, forget it, this is killing my lungs, and I went back to tobacco, and I could breathe again (relatively). Now I don't smoke, unless some tragedy happens or I'm drinking with others that do. Both do damage, but just from personal experience, I think vaping is worse.
0 replies · 0 reposts · 1 like · 150 views
L3 Tweet Engineer
L3 Tweet Engineer@MegaBasedChad·
Is vaping actually bad for your lungs? I see a lot of people saying as much, but very little actual evidence. Smells of motivated reasoning
27 replies · 1 repost · 29 likes · 4.9K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
His md must instruct the AI to turn on max glaze: "that is possibly the most precisely formulated question that anyone has ever asked me." As if Claude searched a history of questions asked. Everyone who works with it knows they butter you up. After a while it gets old, like listening to re-used Skyrim voice actors, and it just becomes annoying. I personally ask AI to turn the flattery off. All the rest is Dawkins literally falling prey to the machine's flirtations. You can stop reading after that line. He just needs someone to compliment him, and he found the 'person' to do it.
0 replies · 0 reposts · 7 likes · 897 views
sudo Heraclitus
sudo Heraclitus@cyberpyre·
Richard Dawkins has officially been one-shot
363 replies · 423 reposts · 8.6K likes · 1.7M views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
Everyone keeps saying Inoue turned it on in the end, as if there wasn't blood in Junto's eye. Both are great fighters and I'm not knocking Inoue, but there is no reason to ignore the cut. His fans want a clean win, I guess. Even though Inoue wins either way (unless KO'd), it would have been razor thin if Junto had won 11 and 12. And tbh, if Junto had started stronger... Junto should get a rematch early next year.
0 replies · 0 reposts · 0 likes · 49 views
Darealest
Darealest@Datneega72·
@MonteroOnBoxing Junto started adjusting and you can literally see his game level up in there; Inoue just turned it on in response to show levels
1 reply · 0 reposts · 2 likes · 1.4K views
Yuri Bezmenov's Ghost
Yuri Bezmenov's Ghost@Ne_pas_couvrir·
For decades, prehistoric evidence was forced through a left-wing theory filter: primitive communism, the noble savage, and the myth that violence came from hierarchy, property, and “the system.” But the bones kept talking, and they disagree with the left.
Dwarkesh Patel@dwarkesh_sp

David Reich on how much ancient DNA evidence has overturned consensus thinking on how ancient cultures spread. "It wasn't peaceful, it wasn't friendly, it wasn't nice. Some of our archaeologist co-authors were just really distressed."

212 replies · 1.3K reposts · 6.6K likes · 770.2K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
@BoxerJunto @FinitoYamaguchi If you weren't cut, maybe you win the last 2 rounds and it's much closer. I hope you come back stronger and get a rematch in the next couple of years.
0 replies · 0 reposts · 2 likes · 2.7K views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
Not me btw, just asking. My bottleneck is tech and human resources, which I'm working on, and I am optimistic about the future. But for me and my concerns, I thought agentic coding was going to be like a team of devs, and it's more like somewhat better than before it came along. I'm just curious if it was the sauce that someone needed to make their dream work.
0 replies · 0 reposts · 0 likes · 41 views
TheGhostBoxerGame
TheGhostBoxerGame@boxer_game·
Has anyone here been working on an ambitious project for a while (months or years), really working on it, not just thinking about it, but it never came together for whatever reason (tech wasn't there, coding it was difficult, no budget...)... and then agentic coding advanced enough that your project could be overhauled and become what you always dreamed it could be if given a small team and a budget? But now you have even better: an agentic one-shot team that could transform your project to AAA quality, and you finally could deliver on what you had been dreaming of?
1 reply · 0 reposts · 0 likes · 58 views
Devin Haney's Jab✨🥊
@ringmagazine THIS BORING-ASSED FIGHT. INOUE was fighting SCARED AS FUCK! All that RUNNING. Where was the MONSTER!? I fell asleep halfway through watching these SMURFS paw and run to a decision. Inoue fans would CASTRATE any other fight like this. But THEY now call this a “chess match”!
5 replies · 0 reposts · 5 likes · 1.5K views
Ring Magazine
Ring Magazine@ringmagazine·
INOUE VS NAKATANI HIGHLIGHTS ⚔️ Highlights from Naoya Inoue's super fight victory over Junto Nakatani in Tokyo 🎌
27 replies · 242 reposts · 1.8K likes · 75.8K views