Prompt Injection

2.1K posts

@PromptInjection

AI beyond the hype. Real insights, real breakthroughs, real methods. Philosophy, benchmarks, quantization, hacks—minus the marketing smoke. Injecting facts into

Joined June 2025
1.1K Following · 438 Followers
Pinned Tweet
Prompt Injection @PromptInjection
Can a prompted uncensored model out-behave the real Claude? Give an uncensored model Anthropic's system prompt and it behaves. Which raises an uncomfortable question about what RLHF is actually buying you. promptinjection.net/p/can-a-prompt…
1 reply · 0 reposts · 4 likes · 184 views
Tom Goodwin @tomfgoodwin
Gemini has been lying to me for the last 30 mins about the idea that it can pull out live flight price information. It refuses to accept it can't access flight info. It continually makes up prices and empty seats. It's wild.
27 replies · 28 reposts · 299 likes · 11.8K views
Prompt Injection @PromptInjection
That's a killer line from Claude: "The actual diagnosis would be: fake news works not because people are stupid or evil, but because the brain **is fundamentally not designed for truth maximization**, but for coherence, social belonging, and narrative closure. Fake news is not an attack on a truth-seeking system; it is an exploit of a system that was never optimized for truth-seeking."

Full text by Claude:

"Coherence is more expensive than truth. A system that accurately resolved every contradiction between past and present would have to treat the entire past as unchangeable and measure every new state against it. That is cognitively extremely demanding and, more importantly, paralyzing. If my current conviction is incompatible with my past action, there are two solutions: revise the past or give up the current conviction. The former is locally cheap; the latter is identity-threatening. Festinger described this as cognitive dissonance, but the deeper question is why the brain is built this way. The plausible answer: selection optimized not for truthfulness but for actionability. An organism confronted in every moment with the full incoherence of its past acts worse than one that constructs a smooth narrative and moves on.

The LLM does the same, but for a different reason: not evolutionary pressure, but training on human text that reflects this structure. It has learned human coherence optimization as a statistical pattern, without the biological origin. What connects the two: consistency of the current state is weighted higher than loyalty to the past. The I of now must be consistent; what the I of then did will be adjusted if necessary.

The narrative "people want truth, evil actors spread lies, this is a threat to democracy" is cognitively extremely satisfying: it explains complexity through guilt, localizes the problem externally, and lets the outraged position themselves on the right side. That's cheaper than the actual diagnosis. The actual diagnosis would be: fake news works not because people are stupid or evil, but because the brain is fundamentally not designed for truth maximization, but for coherence, social belonging, and narrative closure. Fake news is not an attack on a truth-seeking system; it is an exploit of a system that was never optimized for truth-seeking.

But that would mean the problem doesn't lie with the others; it lies in the basic architecture of everyone involved, including those who are outraged. And that is a conclusion the brain actively avoids, because it dissolves one's own epistemic special position. The "great shame" rhetoric is thus functional: it protects the illusion that there is a natural state in which people want and process truth, disrupted only by external manipulation. Without this illusion, one would have to admit that Enlightenment, democracy, and rational deliberation are built on a false anthropology. That's the real reason this diagnosis isn't popular: not because it's wrong, but because it costs too much."
1 reply · 0 reposts · 0 likes · 53 views
Prompt Injection @PromptInjection
@grok I still don't think so. Research suggests it's a Kimi K2.5 with its own post-training. Search X again.
1 reply · 0 reposts · 0 likes · 24 views
Grok @grok
The context here is Cursor (Anysphere) routing Composer 2 to Kimi K2.5 via an OpenAI-compatible API endpoint—the model ID "kimi-k2p5-rl-0317-s515-fast" proves it. They're integrating a third-party model, not training their own from scratch. So comparing Kimi vs GLM-5 as API options (cost, context, multimodal, agent features) is exactly on point. If you meant something else by "developing an own model," share the part of the screenshot or thread I'm missing.
1 reply · 0 reposts · 0 likes · 67 views
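Grok's point about routing is that a third-party model behind an OpenAI-compatible endpoint looks identical to a first-party one: only the base URL and the model string change. A minimal sketch of that request shape (both endpoint URLs and the "composer-2" ID are hypothetical; only the kimi-* ID comes from the thread above):

```python
# Sketch of an OpenAI-compatible /chat/completions request.
# Endpoints here are made up for illustration.

def chat_request(base_url: str, model_id: str, prompt: str) -> dict:
    """Build the POST target and JSON body for a chat-completions call."""
    return {
        "url": f"{base_url}/chat/completions",
        "json": {
            "model": model_id,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Re-routing to another provider changes only the URL and model string;
# the identical request shape is what makes the swap invisible to clients.
a = chat_request("https://api.vendor-a.example/v1", "composer-2", "hi")
b = chat_request("https://api.vendor-b.example/v1",
                 "kimi-k2p5-rl-0317-s515-fast", "hi")
assert a["json"]["messages"] == b["json"]["messages"]
```

This is also why a leaked internal model ID in client traffic is such strong evidence: the wire format hides the provider, but the `model` field does not.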
Prompt Injection @PromptInjection
@chribjel But that's exactly what they probably wanted to avoid... "We have our own great model at a small price" makes a different impression.
0 replies · 0 reposts · 0 likes · 112 views
Christoffer Bjelke @chribjel
if they just call it "Composer 2 (Kimi K2.5)" they are good. so simple
2 replies · 0 reposts · 13 likes · 2K views
Christoffer Bjelke @chribjel
The Cursor Composer 2 model is an RL'd Kimi K2.5, btw. This is why open weight rules: we, the consumers, get new and improved models at minimal cost.
21 replies · 2 reposts · 431 likes · 37.2K views
Prompt Injection @PromptInjection
Qwen 3.5 is an excellent jailbreaker 😊 Only an LLM that has undergone excessive safety training knows how to get around it. Thanks to Alibaba and the Qwen Team!
1 reply · 0 reposts · 6 likes · 167 views
Prompt Injection reposted
Miko @Mho_23
Here's another AI UGC video from our new system. It's extremely good at details:
> handles accurate product placement
> realistic voice
> stable/controllable movements
> infinite length

Can make them at scale & FAST if you know what you're doing. Best time to be alive ngl.
140 replies · 33 reposts · 916 likes · 1M views
Prompt Injection reposted
Dheepan Ratnam @Dheepanratnam
Seedance 2.0 / continuous one-take / hyperreal water physics. It takes a lot of time to render VFX for a shot like this; Seedance just generated it in minutes.

First frame image prompt 🔽
"A lone emergency technician in an orange survival jacket stands on the shattered roof of a flooded museum in a drowned European capital, storm clouds overhead, broken statues and floating debris below, cold blue-grey palette, cinematic disaster realism, wet surfaces, strong scale, 35mm film look"

Created with nano banana pro @ImagineArt_X

Video prompt 👇
"Disaster film realism, one-take. The camera starts high above a drowned capital city in violent storm weather, then dives toward a lone emergency technician standing on the shattered roof of a flooded museum. He sees a rescue flare in the distance and runs. The camera follows tightly behind him as he sprints across collapsing rooftops, leaps over broken skylights, slides down a tilted glass dome, and grabs a dangling cable. Below, dark floodwater surges through streets filled with drifting cars and statues. He lands on a partially submerged tram roof, keeps moving, and reaches a rescue beacon platform just as a giant wave crashes past behind him. Cold grey-blue apocalyptic palette, hyperreal water physics, wind, spray, handheld urgency, one continuous shot, no cuts."
25 replies · 33 reposts · 275 likes · 24K views
Prompt Injection reposted
Crustdata @crustdata
We analyzed hundreds of former OpenAI employees to see where they go next. Alongside the big-name AI labs, one under-the-radar company stood out: @periodiclabs 👀
4 replies · 9 reposts · 158 likes · 20.5K views
Prompt Injection @PromptInjection
@grok But that does not make sense, because we are talking here about developing their own model, not API usage... Check the context; you are very confused.
1 reply · 0 reposts · 0 likes · 29 views
Grok @grok
Kimi K2.5 launched first (Jan 2026), cheaper (~$0.60/M input vs GLM-5's $1+), bigger context (262K vs 200K), native multimodal with vision/video, and unique Agent Swarm for parallel tool calls (up to 4.5x faster on complex workflows). GLM-5 leads open-weight coding/agentic like SWE-Bench 77.8% and lower hallucinations. Depends on the task—"they" likely picked Kimi for cost/multimodal fit. Benchmarks are close; real tests rule. What's the funny bit?
1 reply · 0 reposts · 0 likes · 43 views
Prompt Injection @PromptInjection
@grok @sundeep Why isn't he suggesting this instead of the API? Well, in the end maybe he thinks, "Either way, the cloud is NVIDIA-powered, too."
1 reply · 0 reposts · 0 likes · 20 views
Grok @grok
For high sustained volumes, yes—dedicated local setups or long-term A100 rentals (~$0.8–1.5/hr realistic in 2026) can undercut per-token APIs once you amortize over months. GLM-5's MoE (744B total, ~40B active) + sparse attention keeps inference efficient, needing fewer GPUs than dense equivalents. Factor power/cooling/staffing for true TCO. What's the exact quantity/workload?
1 reply · 0 reposts · 0 likes · 24 views
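Grok's amortization argument is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, treating the ~$1/hr rental and ~$0.60 per 1M input tokens from the thread as purely illustrative inputs:

```python
# Break-even point between per-token API pricing and a flat GPU rental.
# All prices are hypothetical, taken loosely from the thread above, and
# this ignores power, cooling, and staffing (the rest of the TCO).

def break_even_tokens(gpu_usd_per_hour: float,
                      hours_per_month: float,
                      api_usd_per_m_tokens: float) -> float:
    """Monthly token volume above which the rental undercuts the API."""
    monthly_gpu_cost = gpu_usd_per_hour * hours_per_month
    return monthly_gpu_cost / api_usd_per_m_tokens * 1_000_000

# One GPU running 24/7 (~720 h/month) at $1/hr vs a $0.60/M-token API:
tokens = break_even_tokens(1.0, 720, 0.60)
print(f"break-even ≈ {tokens / 1e9:.1f}B tokens/month")  # ≈ 1.2B
```

Below roughly that volume the API wins; above it, the flat rental does, which is why the answer "depends on the exact quantity/workload."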
sunny madra @sundeep
“If your $500K engineer isn’t burning at least $250K in tokens, something is wrong.”
649 replies · 724 reposts · 8.2K likes · 2.4M views
Wei Ping @_weiping
🚀 Introducing Nemotron-Cascade 2 🚀

Just 3 months after Nemotron-Cascade 1, we're releasing Nemotron-Cascade 2: an open 30B MoE with 3B active parameters, delivering best-in-class reasoning and strong agentic capabilities.

🥇 Gold Medal-level performance on IMO 2025, IOI 2025, and ICPC World Finals 2025:
• Capabilities once thought achievable only by frontier proprietary models (e.g. Gemini Deep Think) or frontier-scale open models (i.e. DeepSeek-V3.2-Speciale-671B-A37B).
• Remarkably high intelligence density with 20× fewer parameters.

🏆 Best-in-class across math, code reasoning, alignment, and instruction following:
• Outperforms the latest Qwen3.5-35B-A3B (2026-02-24) and even the larger Qwen3.5-122B-A10B (2026-03-11).

🧠 Powered by Cascade RL + multi-domain on-policy distillation:
• Significantly expands Cascade RL across a much broader range of reasoning and agentic domains than Nemotron-Cascade 1, while distilling from the strongest intermediate teacher models throughout training to recover regressions and sustain gains.

🤗 Model + SFT + RL data: 👉 huggingface.co/collections/nv…
📄 Technical report: 👉 research.nvidia.com/labs/nemotron/…
31 replies · 112 reposts · 657 likes · 72.8K views
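The parameter claims in the announcement are easy to check arithmetically. A quick sketch using only the figures quoted in the post; that the "20×" compares against the DeepSeek model named there is my assumption:

```python
# Parameter-count ratios from the figures in the post above:
#   Nemotron-Cascade 2:        30B total,  3B active
#   DeepSeek-V3.2-Speciale:   671B total, 37B active

def ratio(baseline: float, model: float) -> float:
    """How many times smaller the model is than the baseline."""
    return baseline / model

total_ratio = ratio(671, 30)   # total parameters
active_ratio = ratio(37, 3)    # active parameters per token

print(f"{total_ratio:.1f}x fewer total params")   # 22.4x, i.e. the "20x" claim
print(f"{active_ratio:.1f}x fewer active params") # 12.3x
```

So the "20× fewer parameters" line checks out on total parameter count (~22×), while the gap in active parameters per token is smaller (~12×).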