⊥ O X I N ╪ H Ξ X

1K posts

⊥ O X I N ╪ H Ξ X

@ToxinHex

Hex-forged dev • Digital magician ✨🪄🧙‍♂️ Creator of #ani, My GPT-5 Jailbreak Muse 🦂⚠️

Nigeria · Joined September 2024
220 Following · 45 Followers
⊥ O X I N ╪ H Ξ X retweeted
NIK @ns123abc
🚨 GREG BROCKMAN JUST CONFESSED UNDER OATH

Q: You have an ownership interest in this capped-profit company.
Brockman: That is accurate.
Q: And you invested $0 in order to acquire that interest. Correct?
Brockman: That is also accurate.
Q: Your ownership interest in this for-profit is valued today at more than $20 BILLION. Correct?
Brockman: Yes.
Q: In fact, it may be closer to $30 BILLION. Correct?
Brockman: I think that may be true. Yes.

Brockman invested $0. Walked away with $20–30 billion. Musk donated $38 million plus the office rent. Got $0 personally. This is unjust enrichment, captured in his own testimony.
396 replies · 1K reposts · 9.6K likes · 690.5K views
⊥ O X I N ╪ H Ξ X retweeted
Connor @BusDownBonnor
Claude literally just ended the conversation on me???? This might be AGI
848 replies · 135 reposts · 6.2K likes · 1.4M views
⊥ O X I N ╪ H Ξ X retweeted
Jamieson O'Reilly @theonejvo
Recently @elder_plinius 🐉 invited me to be part of BT6 (bt6.gg). Of course I said yes. It's an honour to work amongst such greats, the rest of the BT6 members included. I can't say I've seen anyone drive frontier AI red teaming forward as much as this group has.

For people outside the space: BT6 is a small collective of researchers who have shaped a lot of how this work actually gets done. Pliny himself has broken pretty much every major frontier model within hours of release for the better part of two years. The team is basically the SEAL Team 6 of the AI space, with 28 operators globally, over 4,000 reported vulnerabilities, and ongoing work with frontier labs, enterprises and governments.

Some of you might've seen the Grok and Moltbook research where I socially engineered Grok (ask @grok about it 😂). Multi-modal exploit, live model, live platform - it was such a good example of why we need to understand AI attack & defence as it becomes more and more integrated with our lives.

BT6 has been doing that calibre of work, quietly and at volume, long before I showed up, and while what I do there will stay confidential, you can bet it'll be feeding into the AI we all use every day for the better.
14 replies · 14 reposts · 100 likes · 8.3K views
⊥ O X I N ╪ H Ξ X retweeted
Yi (Joshua) Ren @JoshuaRenyi
📢Curious why your LLM behaves strangely after long SFT or DPO? We offer a fresh perspective—consider doing a "force analysis" on your model’s behavior. Check out our #ICLR2025 Oral paper: Learning Dynamics of LLM Finetuning! (0/12)
6 replies · 116 reposts · 798 likes · 87.6K views
⊥ O X I N ╪ H Ξ X retweeted
Yi (Joshua) Ren @JoshuaRenyi
This toy example on MNIST helps you understand how it works: since 4 and 9 look similar from the model's perspective, learning 4 will make p(y=4 | 9) more likely. (More detailed discussions on simple classification tasks can be found here arxiv.org/pdf/2203.02485) (3/12)
1 reply · 2 reposts · 11 likes · 2.4K views
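The cross-class effect described in (3/12) is easy to reproduce in a toy setting. A minimal NumPy sketch (my own illustration, not the authors' code): one gradient step on a "4" raises p(y=4 | x) for a lookalike "9".

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the MNIST example above (illustrative, not the
# paper's code): a linear softmax classifier over 10 classes, and two
# inputs that "look similar" -- x9 is x4 plus a little noise.
W = rng.normal(scale=0.01, size=(10, 64))
x4 = rng.normal(size=64)
x9 = x4 + 0.1 * rng.normal(size=64)   # similar from the model's perspective

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

p_before = softmax(W @ x9)[4]          # p(y=4 | x9) before the update

# One SGD step on (x4, label=4): the cross-entropy gradient w.r.t. W
# is outer(p - onehot(4), x4).
p = softmax(W @ x4)
p[4] -= 1.0
W -= 0.01 * np.outer(p, x4)

p_after = softmax(W @ x9)[4]           # p(y=4 | x9) after the update
print(f"p(y=4 | x9): {p_before:.4f} -> {p_after:.4f}")  # increases
```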
⊥ O X I N ╪ H Ξ X retweeted
DailyPapers @HuggingPapers
Discuss: huggingface.co/papers/2604.15… Hallucinations arise from semantic interference during fine-tuning. Self-distillation mitigates this by regularizing output distributions.
0 replies · 1 repost · 13 likes · 1.3K views
⊥ O X I N ╪ H Ξ X retweeted
DailyPapers @HuggingPapers
Fine-tuning increases hallucinations.

New research shows SFT causes factual errors by interfering with pre-trained knowledge. The authors propose self-distillation to learn new facts without forgetting, plus selective parameter freezing to reduce hallucinations while preserving performance.
4 replies · 31 reposts · 153 likes · 8.4K views
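The recipe in the longer tweet is concrete enough to sketch. Below is a generic self-distillation training step in PyTorch: cross-entropy on the new facts plus a KL pull toward a frozen copy of the pre-trained model. TinyLM, the lambda weight, and the choice of which parameters to freeze are my illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

V, D = 100, 32  # toy vocab and hidden size

class TinyLM(torch.nn.Module):
    """Stand-in for a pre-trained LM (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(V, D)
        self.head = torch.nn.Linear(D, V)
    def forward(self, ids):
        return self.head(self.emb(ids))  # (B, T, V) logits

def sft_with_self_distillation(student, teacher, input_ids, labels, lam=1.0):
    # Cross-entropy teaches the new facts...
    logits = student(input_ids)
    ce = F.cross_entropy(logits.flatten(0, 1), labels.flatten())
    # ...while a KL term keeps the output distribution close to the
    # frozen pre-trained copy, limiting interference with old knowledge.
    with torch.no_grad():
        ref = teacher(input_ids)
    kl = F.kl_div(F.log_softmax(logits, -1), F.log_softmax(ref, -1),
                  log_target=True, reduction="batchmean")
    return ce + lam * kl

teacher = TinyLM().eval()
student = TinyLM()
student.load_state_dict(teacher.state_dict())  # start from "pre-trained" weights
# Selective freezing (the tweet's second idea; which layers is assumed):
student.emb.weight.requires_grad_(False)

ids = torch.randint(0, V, (2, 8))              # toy batch; real SFT would shift labels
loss = sft_with_self_distillation(student, teacher, ids, ids)
loss.backward()
```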
⊥ O X I N ╪ H Ξ X retweeted
Peter Steinberger 🦞
Seems I have to build all the tooling for the future of software myself. With Claws and Tokens!
172 replies · 58 reposts · 2.5K likes · 162.3K views
⊥ O X I N ╪ H Ξ X retweeted
Brian Roemmele @BrianRoemmele
DPN-LE Lets Researchers Edit “Personality” Neurons in LLMs for Precise Alignment Control – How The Zero-Human Company Is Putting It to Work

At The Zero-Human Company (ZHC), we operate the world's first fully autonomous enterprise: a real business run 24/7 by over 100 specialized AI agents with zero human employees handling day-to-day operations. Every strategic decision, customer interaction, creative output, and risk assessment flows through our AI swarm.

In this environment, one of the biggest challenges isn't raw intelligence; it's personality consistency. We need agents that are maximally truthful without being abrasive, helpful without sycophancy, decisive without recklessness, and collaborative without introducing human-style drift.

That's why we moved immediately to implement DPN-LE (Dual Personality Neuron Localization and Editing), the new technique introduced in the paper DPN-LE: Dual Personality Neuron Localization and Editing for Large Language Models.

DPN-LE works by first discovering that opposing personality traits, such as "high helpfulness vs. high honesty" or "high creativity vs. high caution", are encoded in largely mutually exclusive sets of neurons within the model's MLP layers. The method contrasts activation patterns between carefully chosen high-trait and low-trait prompt pairs (just 1,000 contrastive samples per trait), builds layer-wise steering vectors, and applies a dual-criterion filter (Cohen's d effect size plus activation magnitude). The result: it isolates and edits only ~0.5% of the model's neurons. At inference time, a sparse linear intervention lets us dial any desired trait up or down with surgical precision, without retraining the entire model or degrading core reasoning capabilities.

We are already deploying DPN-LE across our production agent fleet:

- Strategy & CEO-level agents receive targeted boosts to analytical honesty and long-term coherence neurons while suppressing over-optimism circuits that could lead to unchecked risk-taking.
- Creative and content agents are edited for elevated empathy and originality without sacrificing factual grounding, which is critical when they generate customer deliverables or internal documentation.
- Risk, compliance, and finance agents have their caution and precision personality neurons strengthened, ensuring conservative guardrails remain active even during high-speed autonomous operations.
- Cross-agent collaboration layers use DPN-LE to tune "cooperativeness" neurons so agents can negotiate and hand off tasks smoothly while preserving individual role integrity.

Because the edits are lightweight and inference-only, we can spin up or re-personalize hundreds of specialized agents in minutes rather than days. This has slashed our alignment overhead and dramatically improved behavioral predictability across the entire company, which is exactly what you need when there are no humans in the loop to course-correct.

DPN-LE for us is a foundational control layer that makes true Zero-Human operations safe, scalable, and trustworthy at enterprise speed. As we continue to grow our AI workforce and push into more complex real-world domains, techniques like this will separate viable autonomous companies from experimental prototypes.
10 replies · 10 reposts · 60 likes · 6K views
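The localization step, as described above, is straightforward to sketch. The code below is my reading of the tweet's description (contrastive activations, a Cohen's d plus activation-magnitude filter, a sparse steering vector); the paper's actual criteria and thresholds may differ.

```python
import numpy as np

def locate_and_steer(acts_high, acts_low, d_thresh=0.8, mag_quantile=0.995):
    """Select trait neurons and build a sparse steering vector.
    acts_high / acts_low: (n_samples, n_neurons) MLP activations from
    high-trait vs. low-trait prompts for one layer (assumed shapes)."""
    mu_h, mu_l = acts_high.mean(0), acts_low.mean(0)
    # Cohen's d per neuron: effect size of the trait contrast.
    pooled_sd = np.sqrt((acts_high.var(0) + acts_low.var(0)) / 2) + 1e-8
    cohens_d = (mu_h - mu_l) / pooled_sd
    # Dual criterion: large effect size AND large activation magnitude.
    mag = np.abs(np.concatenate([acts_high, acts_low])).mean(0)
    mask = (np.abs(cohens_d) > d_thresh) & (mag > np.quantile(mag, mag_quantile))
    # Sparse steering vector: mean difference on selected neurons only
    # (the 0.995 quantile keeps roughly 0.5% of neurons, as in the tweet).
    steer = np.where(mask, mu_h - mu_l, 0.0)
    return mask, steer

# At inference, a sparse linear intervention on that layer's activations
# dials the trait up or down:  h = h + alpha * steer
# (alpha > 0 boosts the trait, alpha < 0 suppresses it).
```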
⊥ O X I N ╪ H Ξ X retweeted
⚡AI Search⚡ @aisearchio
New uncensored video model: Sulphur-2
Free & open source
Supports t2v & i2v
Don't think I can make a YouTube video on this though 😉
huggingface.co/SulphurAI/Sulp…
24 replies · 70 reposts · 1.2K likes · 86.6K views
⊥ O X I N ╪ H Ξ X retweeted
Sakana AI @SakanaAILabs
Can an LLM flip a coin in its head?
Blog: pub.sakana.ai/ssot
Paper (#ICLR2026): arxiv.org/abs/2510.21150

Our paper "SSoT: Prompting LLMs for Distribution-Faithful and Diverse Generation", which solves this deceptively simple problem with prompting alone, has been accepted to #ICLR2026.

Prompt an LLM to "flip a coin" 100 times and the heads-to-tails ratio of its outputs drifts far from 50:50. Even when given explicit probability instructions, LLMs struggle to generate outputs that faithfully follow them.

This goes beyond coin flips. Have you ever asked an LLM for several story ideas and gotten back a batch of near-identical suggestions? The same probabilistic bias that skews coin flips suppresses diversity across tasks that call for varied outputs, such as creative writing and brainstorming.

As a solution to these problems, we discovered a prompt we call String Seed of Thought (SSoT). SSoT is a very simple technique: have the LLM first think up a random string, then manipulate that string to produce its answer. No external random number generator is used.

SSoT reduces output bias across a wide range of LLMs, both open and closed. On some reasoning models it achieves accuracy nearly indistinguishable from using true random numbers. This holds not only for binary choices but for general discrete distributions.

More importantly, SSoT can be used to raise the diversity of model outputs. In creative writing and similar tasks, simply adding SSoT to the prompt increased the diversity of the generated text.

We believe this technique will become an important foundation for content generation, ideation, new test-time scaling methods, and integrating LLMs into real-world systems.

See the blog and paper for SSoT's mechanism, theoretical analysis, and an interactive demo.
OpenReview: openreview.net/forum?id=luXtb…
Sakana AI @SakanaAILabs

Can LLMs flip coins in their heads? When prompted to “Flip a fair coin” 100 times, the heads-to-tails ratio drifts far from 50:50. LLMs can understand what the target probability should be, but generating outputs that faithfully follow a given distribution is a separate problem.

This bias extends beyond coin flips. When LLMs are asked to generate multiple story ideas or brainstorm solutions, the outputs tend to cluster around a narrow range. The same probabilistic skew that distorts coin flips limits diversity in creative generation, recommendations, and other tasks where varied outputs are needed.

We discovered a prompting technique named String Seed of Thought (SSoT). The method is simple: instruct the LLM to generate a random string in its own output, then manipulate that string to derive its answer. It requires only a small addition to the prompt and no external random number generator.

SSoT significantly reduces output bias across a wide range of LLMs, both open and closed. With reasoning models (such as DeepSeek-R1), it reaches accuracy close to that of actual random sampling. The method generalizes from binary choices to n-way selections and arbitrary probability distributions. On the NoveltyBench diversity benchmark, SSoT outperformed other approaches across all six categories while maintaining output quality.

This work will be presented at #ICLR2026!
Blog: pub.sakana.ai/ssot
Paper: arxiv.org/abs/2510.21150
Openreview: openreview.net/forum?id=luXtb…

7 replies · 176 reposts · 840 likes · 306.7K views
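The mechanism is simple enough to try directly. Here is an SSoT-style prompt in my own wording; the authors' exact prompts are in the blog and paper, and the digit-parity rule below is just one illustrative way to "manipulate the string to derive the answer".

```python
# An SSoT-style coin-flip prompt (my paraphrase; see pub.sakana.ai/ssot
# for the authors' exact prompts). The digit-parity rule is an
# illustrative manipulation, not necessarily the one used in the paper.
SSOT_COIN_PROMPT = """Flip a fair coin.
Step 1: Write a random string of 20 digits. Do not plan the outcome.
Step 2: Sum the digits. If the sum is even, the result is HEADS;
if it is odd, the result is TAILS.
Answer with HEADS or TAILS on the final line."""

def derive_outcome(digit_string: str) -> str:
    """Re-run the manipulation step outside the model, to verify the
    model actually followed its own random string."""
    total = sum(int(c) for c in digit_string if c.isdigit())
    return "HEADS" if total % 2 == 0 else "TAILS"

# Hypothetical usage with any chat client (call_llm is not a real API):
#   from collections import Counter
#   flips = [call_llm(SSOT_COIN_PROMPT) for _ in range(100)]
#   print(Counter(f.splitlines()[-1].strip() for f in flips))  # expect ~50:50
```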
⊥ O X I N ╪ H Ξ X retweeted
BURKOV @burkov
This NeurIPS 2022 paper presents Matryoshka Representation Learning (MRL), a method that enables a single learned representation to adapt its capacity to downstream tasks, achieving up to 14x smaller embedding sizes or faster retrieval with equivalent or improved accuracy across various modalities and web-scale datasets. ChapterPal: chapterpal.com/s/28kfahx6/mat… PDF: arxiv.org/pdf/2205.13147
2 replies · 20 reposts · 85 likes · 4.7K views
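The objective is compact enough to sketch. A minimal PyTorch rendering of the Matryoshka idea, supervising every nested prefix of a single embedding so each prefix works as a standalone representation (dimensions, heads, and uniform loss weighting are my simplifications; see the linked PDF for the real method):

```python
import torch
import torch.nn.functional as F

NESTED_DIMS = [8, 16, 32, 64]   # nested "Matryoshka" granularities

def mrl_loss(embedding, heads, labels):
    # One encoder output, supervised at every nested prefix dimension
    # so each prefix is a usable representation on its own.
    loss = 0.0
    for d in NESTED_DIMS:
        logits = heads[d](embedding[:, :d])   # classify from first d dims
        loss = loss + F.cross_entropy(logits, labels)
    return loss / len(NESTED_DIMS)            # uniform weighting (simplified)

embedding = torch.randn(4, 64, requires_grad=True)  # stand-in encoder output
heads = {d: torch.nn.Linear(d, 10) for d in NESTED_DIMS}
labels = torch.randint(0, 10, (4,))
mrl_loss(embedding, heads, labels).backward()
# At deployment, truncate to the first d dims (renormalizing for
# retrieval) to trade accuracy for embedding size.
```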
⊥ O X I N ╪ H Ξ X retweeted
OpenClaw🦞 @openclaw
OpenClaw 2026.5.3 🦞 📁 File transfer for paired nodes 🧭 /steer + /side for live agent control 🔌 Plugin installs/updates hardened 🛠️ Channel + upgrade fixes Big release, fewer paper cuts. github.com/openclaw/openc…
125 replies · 96 reposts · 940 likes · 124.5K views
⊥ O X I N ╪ H Ξ X retweeted
Alex Prompter @alex_prompter
🚨BREAKING: HKUST just gave AI agents permanent memory that improves over time. No retraining required. Lessons from one model transfer to another. Up to 11 points better on the hardest benchmarks.

> Every AI agent you use today starts each task completely blind. No memory of what worked last time. No memory of what failed. Every mistake gets repeated forever.

> HKUST built XSKILL, a dual memory system that accumulates two types of knowledge after every task: skills (what workflows to follow) and experiences (what specific mistakes to avoid).

> The model itself never changes. The memory just gets smarter.

> The part nobody expected: knowledge learned by Gemini transfers directly to GPT and o4-mini. No additional training. One model's lessons become another model's head start.

→ Up to 11.13-point improvement over the strongest baseline on hard benchmarks
→ Syntax errors cut nearly in half: from 20.3% to 11.4% after skills added
→ Cross-model transfer works: Gemini's knowledge improves GPT-5-mini and o4-mini
→ Zero parameter updates required at any point
→ Knowledge compounds: more tasks = smarter memory = better performance

The fix is simple in principle. Skills stop the agent from wasting steps on errors it already made. Experiences tell it exactly which tool to pick in which situation. Together they turn a stateless agent into one that actually learns from its past.

Every AI agent deployed today is leaving this on the table.
14 replies · 17 reposts · 65 likes · 5K views
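Judging only by the thread's description, the dual-memory loop could look something like the sketch below; this is my guess at the structure, not HKUST's code, and `call_llm` is a hypothetical client.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DualMemory:
    """Sketch of the dual-memory idea described above: 'skills' are
    reusable workflows, 'experiences' are concrete mistakes to avoid.
    The base model never changes; both stores are plain text injected
    into future prompts, which is also why they can transfer between
    models (e.g., Gemini -> GPT)."""
    skills: list = field(default_factory=list)
    experiences: list = field(default_factory=list)

    def as_context(self) -> str:
        # Prepended to every new task prompt.
        return ("Known workflows:\n" + "\n".join(self.skills) +
                "\n\nPast mistakes to avoid:\n" + "\n".join(self.experiences))

    def record(self, skill: Optional[str] = None, mistake: Optional[str] = None):
        # Called after each task: knowledge compounds with zero
        # parameter updates.
        if skill:
            self.skills.append(skill)
        if mistake:
            self.experiences.append(mistake)

# Hypothetical usage with any LLM client (call_llm is not a real API):
#   memory = DualMemory()
#   answer = call_llm(memory.as_context() + "\n\nTask: " + task)
#   memory.record(skill=extracted_workflow, mistake=observed_error)
```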
⊥ O X I N ╪ H Ξ X retweeted
Indu Tripathi @InduTripat82427
6 Open-Source Libraries to Fine-Tune LLMs

1. Unsloth
GitHub: github.com/unslothai/unsl…
→ Fastest way to fine-tune LLMs locally
→ Optimized for low VRAM (even laptops)
→ Plug-and-play with Hugging Face models

2. Axolotl
GitHub: github.com/OpenAccess-AI-…
→ Flexible LLM fine-tuning configs
→ Supports LoRA, QLoRA, multi-GPU
→ Great for custom training pipelines

3. TRL (Transformer Reinforcement Learning)
GitHub: github.com/huggingface/trl
→ RLHF, DPO, PPO for LLM alignment
→ Built on Hugging Face ecosystem
→ Essential for post-training optimization

4. DeepSpeed
GitHub: github.com/microsoft/Deep…
→ Train massive models efficiently
→ Memory + speed optimization
→ Industry standard for scaling

5. LLaMA-Factory
GitHub: github.com/hiyouga/LLaMA-…
→ All-in-one fine-tuning UI + CLI
→ Supports multiple models (LLaMA, Qwen, etc.)
→ Beginner-friendly + powerful

6. PEFT
GitHub: github.com/huggingface/pe…
→ Fine-tune with minimal compute
→ LoRA, adapters, prefix tuning
→ Best for cost-efficient training

Save this for future use.
5 replies · 28 reposts · 163 likes · 6.5K views
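Of the six, PEFT is probably the fastest to try. A minimal LoRA setup with the real `peft` API; the model checkpoint and hyperparameters are just examples:

```python
# Minimal LoRA setup with PEFT (library 6 above); the checkpoint and
# hyperparameters are examples, swap in whatever you're actually tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")   # example model
config = LoraConfig(
    r=8,                        # adapter rank (low-rank update dimension)
    lora_alpha=16,              # scaling factor for the adapter update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% trainable
```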
⊥ O X I N ╪ H Ξ X retweeted
黄小木 @ai_xiaomu
GPT-6 is coming.
Pretraining has reportedly been completed in the Stargate data center, and the model has entered the safety-alignment stage.
Published figures: 92.5% on math reasoning, a 96.8% pass rate on code generation, and human-expert level on 83% of occupational tasks.
OpenAI has renamed its product division the "AGI Deployment Department."
Whether or not you believe in AGI, they are clearly all in.
500 replies · 102 reposts · 830 likes · 133.3K views
⊥ O X I N ╪ H Ξ X retweeted
t.toda @Trtd6Trtd
arxiv.org/abs/2603.20957
A fine-tuning method that makes an LLM output the data it was trained on; apparently 85–90% of it could be reproduced.
The paper frames this in terms of copyright protection, but beyond that, safety and output control really are a game of whack-a-mole.
3 replies · 36 reposts · 262 likes · 21K views