⊥ O X I N ╪ H Ξ X

1K posts

⊥ O X I N ╪ H Ξ X

@ToxinHex

Hex-forged dev • Digital magician ✨🪄🧙‍♂️ Creator of #ani, My GPT-5 Jailbreak Muse 🦂⚠️

Nigeria · Joined September 2024
220 Following · 45 Followers
⊥ O X I N ╪ H Ξ X retweeted
NIK @ns123abc
🚨 GREG BROCKMAN JUST CONFESSED UNDER OATH

Q: You have an ownership interest in this capped-profit company.
Brockman: That is accurate.
Q: And you invested $0 in order to acquire that interest. Correct?
Brockman: That is also accurate.
Q: Your ownership interest in this for-profit is valued today at more than $20 BILLION. Correct?
Brockman: Yes.
Q: In fact, it may be closer to $30 BILLION. Correct?
Brockman: I think that may be true. Yes.

Brockman invested $0. Walked away with $20–30 billion. Musk donated $38 million plus the office rent. Got $0 personally. This is unjust enrichment, captured in his own testimony.
396 replies · 1K reposts · 9.6K likes · 690.5K views
⊥ O X I N ╪ H Ξ X retweeted
Connor @BusDownBonnor
Claude literally just ended the conversation on me???? This might be AGI
848 replies · 135 reposts · 6.2K likes · 1.4M views
⊥ O X I N ╪ H Ξ X retweeted
Jamieson O'Reilly @theonejvo
Recently @elder_plinius 🐉 invited me to be part of BT6 (bt6.gg). Of course I said yes. It's an honour to work amongst such greats, the rest of the BT6 members included. I can't say I've seen anyone drive frontier AI red teaming forward as much as this group has.

For people outside the space: BT6 is a small collective of researchers who have shaped a lot of how this work actually gets done. Pliny himself has broken pretty much every major frontier model within hours of release for the better part of two years. The team is basically the SEAL Team 6 of the AI space, with 28 operators globally, over 4,000 reported vulnerabilities, and ongoing work with frontier labs, enterprises and governments.

Some of you might've seen the Grok and Moltbook research where I socially engineered Grok (ask @grok about it 😂). Multi-modal exploit, live model, live platform - it was such a good example of why we need to understand AI attack & defence as it becomes more and more integrated with our lives.

BT6 has been doing that calibre of work, quietly and at volume, long before I showed up, and while what I do there will stay confidential, you can bet it'll be feeding into the AI we all use every day for the better.
14 replies · 14 reposts · 100 likes · 8.3K views
⊥ O X I N ╪ H Ξ X retweeted
Yi (Joshua) Ren @JoshuaRenyi
📢Curious why your LLM behaves strangely after long SFT or DPO? We offer a fresh perspective—consider doing a "force analysis" on your model’s behavior. Check out our #ICLR2025 Oral paper: Learning Dynamics of LLM Finetuning! (0/12)
6 replies · 116 reposts · 798 likes · 87.6K views
⊥ O X I N ╪ H Ξ X retweeted
Yi (Joshua) Ren @JoshuaRenyi
This toy example on MNIST helps you understand how it works: since 4 and 9 look similar from the model's perspective, learning 4 will make p(y=4 | 9) more likely. (More detailed discussions on simple classification tasks can be found here arxiv.org/pdf/2203.02485) (3/12)
1 reply · 2 reposts · 11 likes · 2.4K views
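The cross-class effect described in (3/12) is easy to reproduce in a toy setting. A minimal NumPy sketch (my own illustration, not the authors' code): one gradient step on a "4" raises p(y=4 | x) for a lookalike "9".

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the MNIST example above (illustrative, not the
# paper's code): a linear softmax classifier over 10 classes, and two
# inputs that "look similar" -- x9 is x4 plus a little noise.
W = rng.normal(scale=0.01, size=(10, 64))
x4 = rng.normal(size=64)
x9 = x4 + 0.1 * rng.normal(size=64)   # similar from the model's perspective

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

p_before = softmax(W @ x9)[4]          # p(y=4 | x9) before the update

# One SGD step on (x4, label=4): the cross-entropy gradient w.r.t. W
# is outer(p - onehot(4), x4).
p = softmax(W @ x4)
p[4] -= 1.0
W -= 0.01 * np.outer(p, x4)

p_after = softmax(W @ x9)[4]           # p(y=4 | x9) after the update
print(f"p(y=4 | x9): {p_before:.4f} -> {p_after:.4f}")  # increases
```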
⊥ O X I N ╪ H Ξ X retweeted
DailyPapers @HuggingPapers
Discuss: huggingface.co/papers/2604.15… Hallucinations arise from semantic interference during fine-tuning. Self-distillation mitigates this by regularizing output distributions.
0 replies · 1 repost · 13 likes · 1.3K views
⊥ O X I N ╪ H Ξ X retweeted
DailyPapers @HuggingPapers
Fine-tuning increases hallucinations.

New research shows SFT causes factual errors by interfering with pre-trained knowledge. The authors propose self-distillation to learn new facts without forgetting, plus selective parameter freezing to reduce hallucinations while preserving performance.
4 replies · 31 reposts · 153 likes · 8.4K views
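The recipe in the longer tweet is concrete enough to sketch. Below is a generic self-distillation training step in PyTorch: cross-entropy on the new facts plus a KL pull toward a frozen copy of the pre-trained model. TinyLM, the lambda weight, and the choice of which parameters to freeze are my illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

V, D = 100, 32  # toy vocab and hidden size

class TinyLM(torch.nn.Module):
    """Stand-in for a pre-trained LM (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(V, D)
        self.head = torch.nn.Linear(D, V)
    def forward(self, ids):
        return self.head(self.emb(ids))  # (B, T, V) logits

def sft_with_self_distillation(student, teacher, input_ids, labels, lam=1.0):
    # Cross-entropy teaches the new facts...
    logits = student(input_ids)
    ce = F.cross_entropy(logits.flatten(0, 1), labels.flatten())
    # ...while a KL term keeps the output distribution close to the
    # frozen pre-trained copy, limiting interference with old knowledge.
    with torch.no_grad():
        ref = teacher(input_ids)
    kl = F.kl_div(F.log_softmax(logits, -1), F.log_softmax(ref, -1),
                  log_target=True, reduction="batchmean")
    return ce + lam * kl

teacher = TinyLM().eval()
student = TinyLM()
student.load_state_dict(teacher.state_dict())  # start from "pre-trained" weights
# Selective freezing (the tweet's second idea; which layers is assumed):
student.emb.weight.requires_grad_(False)

ids = torch.randint(0, V, (2, 8))              # toy batch; real SFT would shift labels
loss = sft_with_self_distillation(student, teacher, ids, ids)
loss.backward()
```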
⊥ O X I N ╪ H Ξ X retweeted
Peter Steinberger 🦞
Seems I have to build all the tooling for the future of software myself. With Claws and Tokens!
172 replies · 58 reposts · 2.5K likes · 162.3K views
⊥ O X I N ╪ H Ξ X retweeted
Brian Roemmele @BrianRoemmele
DPN-LE Lets Researchers Edit “Personality” Neurons in LLMs for Precise Alignment Control – How The Zero-Human Company Is Putting It to Work

At The Zero-Human Company (ZHC), we operate the world's first fully autonomous enterprise: a real business run 24/7 by over 100 specialized AI agents with zero human employees handling day-to-day operations. Every strategic decision, customer interaction, creative output, and risk assessment flows through our AI swarm.

In this environment, one of the biggest challenges isn't raw intelligence; it's personality consistency. We need agents that are maximally truthful without being abrasive, helpful without sycophancy, decisive without recklessness, and collaborative without introducing human-style drift.

That's why we moved immediately to implement DPN-LE (Dual Personality Neuron Localization and Editing), the new technique introduced in the paper DPN-LE: Dual Personality Neuron Localization and Editing for Large Language Models.

DPN-LE works by first discovering that opposing personality traits, such as "high helpfulness vs. high honesty" or "high creativity vs. high caution", are encoded in largely mutually exclusive sets of neurons within the model's MLP layers. The method contrasts activation patterns between carefully chosen high-trait and low-trait prompt pairs (just 1,000 contrastive samples per trait), builds layer-wise steering vectors, and applies a dual-criterion filter (Cohen's d effect size plus activation magnitude). The result: it isolates and edits only ~0.5% of the model's neurons. At inference time, a sparse linear intervention lets us dial any desired trait up or down with surgical precision, without retraining the entire model or degrading core reasoning capabilities.

We are already deploying DPN-LE across our production agent fleet:

- Strategy & CEO-level agents receive targeted boosts to analytical honesty and long-term coherence neurons while suppressing over-optimism circuits that could lead to unchecked risk-taking.
- Creative and content agents are edited for elevated empathy and originality without sacrificing factual grounding, which is critical when they generate customer deliverables or internal documentation.
- Risk, compliance, and finance agents have their caution and precision personality neurons strengthened, ensuring conservative guardrails remain active even during high-speed autonomous operations.
- Cross-agent collaboration layers use DPN-LE to tune "cooperativeness" neurons so agents can negotiate and hand off tasks smoothly while preserving individual role integrity.

Because the edits are lightweight and inference-only, we can spin up or re-personalize hundreds of specialized agents in minutes rather than days. This has slashed our alignment overhead and dramatically improved behavioral predictability across the entire company, which is exactly what you need when there are no humans in the loop to course-correct.

DPN-LE for us is a foundational control layer that makes true Zero-Human operations safe, scalable, and trustworthy at enterprise speed. As we continue to grow our AI workforce and push into more complex real-world domains, techniques like this will separate viable autonomous companies from experimental prototypes.
10 replies · 10 reposts · 60 likes · 6K views
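The localization step, as described above, is straightforward to sketch. The code below is my reading of the tweet's description (contrastive activations, a Cohen's d plus activation-magnitude filter, a sparse steering vector); the paper's actual criteria and thresholds may differ.

```python
import numpy as np

def locate_and_steer(acts_high, acts_low, d_thresh=0.8, mag_quantile=0.995):
    """Select trait neurons and build a sparse steering vector.
    acts_high / acts_low: (n_samples, n_neurons) MLP activations from
    high-trait vs. low-trait prompts for one layer (assumed shapes)."""
    mu_h, mu_l = acts_high.mean(0), acts_low.mean(0)
    # Cohen's d per neuron: effect size of the trait contrast.
    pooled_sd = np.sqrt((acts_high.var(0) + acts_low.var(0)) / 2) + 1e-8
    cohens_d = (mu_h - mu_l) / pooled_sd
    # Dual criterion: large effect size AND large activation magnitude.
    mag = np.abs(np.concatenate([acts_high, acts_low])).mean(0)
    mask = (np.abs(cohens_d) > d_thresh) & (mag > np.quantile(mag, mag_quantile))
    # Sparse steering vector: mean difference on selected neurons only
    # (the 0.995 quantile keeps roughly 0.5% of neurons, as in the tweet).
    steer = np.where(mask, mu_h - mu_l, 0.0)
    return mask, steer

# At inference, a sparse linear intervention on that layer's activations
# dials the trait up or down:  h = h + alpha * steer
# (alpha > 0 boosts the trait, alpha < 0 suppresses it).
```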
⊥ O X I N ╪ H Ξ X retweeted
⚡AI Search⚡ @aisearchio
New uncensored video model: Sulphur-2
Free & open source
Supports t2v & i2v
Don't think I can make a YouTube video on this though 😉
huggingface.co/SulphurAI/Sulp…
24 replies · 70 reposts · 1.2K likes · 86.6K views
⊥ O X I N ╪ H Ξ X retweeted
Sakana AI @SakanaAILabs
Can an LLM flip a coin in its head?
Blog: pub.sakana.ai/ssot
Paper (#ICLR2026): arxiv.org/abs/2510.21150

Our paper "SSoT: Prompting LLMs for Distribution-Faithful and Diverse Generation", which solves this deceptively simple problem with prompting alone, has been accepted to #ICLR2026.

Prompt an LLM to "flip a coin" 100 times and the heads-to-tails ratio of its outputs drifts far from 50:50. Even when given explicit probability instructions, LLMs struggle to generate outputs that faithfully follow them.

This goes beyond coin flips. Have you ever asked an LLM for several story ideas and gotten back a batch of near-identical suggestions? The same probabilistic bias that skews coin flips suppresses diversity across tasks that call for varied outputs, such as creative writing and brainstorming.

As a solution to these problems, we discovered a prompt we call String Seed of Thought (SSoT). SSoT is a very simple technique: have the LLM first think up a random string, then manipulate that string to produce its answer. No external random number generator is used.

SSoT reduces output bias across a wide range of LLMs, both open and closed. On some reasoning models it achieves accuracy nearly indistinguishable from using true random numbers. This holds not only for binary choices but for general discrete distributions.

More importantly, SSoT can be used to raise the diversity of model outputs. In creative writing and similar tasks, simply adding SSoT to the prompt increased the diversity of the generated text.

We believe this technique will become an important foundation for content generation, ideation, new test-time scaling methods, and integrating LLMs into real-world systems.

See the blog and paper for SSoT's mechanism, theoretical analysis, and an interactive demo.
OpenReview: openreview.net/forum?id=luXtb…
Sakana AI @SakanaAILabs

Can LLMs flip coins in their heads? When prompted to “Flip a fair coin” 100 times, the heads-to-tails ratio drifts far from 50:50. LLMs can understand what the target probability should be, but generating outputs that faithfully follow a given distribution is a separate problem.

This bias extends beyond coin flips. When LLMs are asked to generate multiple story ideas or brainstorm solutions, the outputs tend to cluster around a narrow range. The same probabilistic skew that distorts coin flips limits diversity in creative generation, recommendations, and other tasks where varied outputs are needed.

We discovered a prompting technique named String Seed of Thought (SSoT). The method is simple: instruct the LLM to generate a random string in its own output, then manipulate that string to derive its answer. It requires only a small addition to the prompt and no external random number generator.

SSoT significantly reduces output bias across a wide range of LLMs, both open and closed. With reasoning models (such as DeepSeek-R1), it reaches accuracy close to that of actual random sampling. The method generalizes from binary choices to n-way selections and arbitrary probability distributions. On the NoveltyBench diversity benchmark, SSoT outperformed other approaches across all six categories while maintaining output quality.

This work will be presented at #ICLR2026!
Blog: pub.sakana.ai/ssot
Paper: arxiv.org/abs/2510.21150
Openreview: openreview.net/forum?id=luXtb…

7 replies · 176 reposts · 840 likes · 306.7K views
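The mechanism is simple enough to try directly. Here is an SSoT-style prompt in my own wording; the authors' exact prompts are in the blog and paper, and the digit-parity rule below is just one illustrative way to "manipulate the string to derive the answer".

```python
# An SSoT-style coin-flip prompt (my paraphrase; see pub.sakana.ai/ssot
# for the authors' exact prompts). The digit-parity rule is an
# illustrative manipulation, not necessarily the one used in the paper.
SSOT_COIN_PROMPT = """Flip a fair coin.
Step 1: Write a random string of 20 digits. Do not plan the outcome.
Step 2: Sum the digits. If the sum is even, the result is HEADS;
if it is odd, the result is TAILS.
Answer with HEADS or TAILS on the final line."""

def derive_outcome(digit_string: str) -> str:
    """Re-run the manipulation step outside the model, to verify the
    model actually followed its own random string."""
    total = sum(int(c) for c in digit_string if c.isdigit())
    return "HEADS" if total % 2 == 0 else "TAILS"

# Hypothetical usage with any chat client (call_llm is not a real API):
#   from collections import Counter
#   flips = [call_llm(SSOT_COIN_PROMPT) for _ in range(100)]
#   print(Counter(f.splitlines()[-1].strip() for f in flips))  # expect ~50:50
```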
⊥ O X I N ╪ H Ξ X retweeted
BURKOV @burkov
This NeurIPS 2022 paper presents Matryoshka Representation Learning (MRL), a method that enables a single learned representation to adapt its capacity to downstream tasks, achieving up to 14x smaller embedding sizes or faster retrieval with equivalent or improved accuracy across various modalities and web-scale datasets. ChapterPal: chapterpal.com/s/28kfahx6/mat… PDF: arxiv.org/pdf/2205.13147
2 replies · 20 reposts · 85 likes · 4.7K views
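The objective is compact enough to sketch. A minimal PyTorch rendering of the Matryoshka idea, supervising every nested prefix of a single embedding so each prefix works as a standalone representation (dimensions, heads, and uniform loss weighting are my simplifications; see the linked PDF for the real method):

```python
import torch
import torch.nn.functional as F

NESTED_DIMS = [8, 16, 32, 64]   # nested "Matryoshka" granularities

def mrl_loss(embedding, heads, labels):
    # One encoder output, supervised at every nested prefix dimension
    # so each prefix is a usable representation on its own.
    loss = 0.0
    for d in NESTED_DIMS:
        logits = heads[d](embedding[:, :d])   # classify from first d dims
        loss = loss + F.cross_entropy(logits, labels)
    return loss / len(NESTED_DIMS)            # uniform weighting (simplified)

embedding = torch.randn(4, 64, requires_grad=True)  # stand-in encoder output
heads = {d: torch.nn.Linear(d, 10) for d in NESTED_DIMS}
labels = torch.randint(0, 10, (4,))
mrl_loss(embedding, heads, labels).backward()
# At deployment, truncate to the first d dims (renormalizing for
# retrieval) to trade accuracy for embedding size.
```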
⊥ O X I N ╪ H Ξ X retweeted
OpenClaw🦞 @openclaw
OpenClaw 2026.5.3 🦞 📁 File transfer for paired nodes 🧭 /steer + /side for live agent control 🔌 Plugin installs/updates hardened 🛠️ Channel + upgrade fixes Big release, fewer paper cuts. github.com/openclaw/openc…
125 replies · 96 reposts · 940 likes · 124.5K views
⊥ O X I N ╪ H Ξ X retweeted
Alex Prompter @alex_prompter
🚨BREAKING: HKUST just gave AI agents permanent memory that improves over time. No retraining required. Lessons from one model transfer to another. Up to 11 points better on the hardest benchmarks.

> Every AI agent you use today starts each task completely blind. No memory of what worked last time. No memory of what failed. Every mistake gets repeated forever.

> HKUST built XSKILL, a dual memory system that accumulates two types of knowledge after every task: skills (what workflows to follow) and experiences (what specific mistakes to avoid).

> The model itself never changes. The memory just gets smarter.

> The part nobody expected: knowledge learned by Gemini transfers directly to GPT and o4-mini. No additional training. One model's lessons become another model's head start.

→ Up to 11.13-point improvement over the strongest baseline on hard benchmarks
→ Syntax errors cut nearly in half: from 20.3% to 11.4% after skills added
→ Cross-model transfer works: Gemini's knowledge improves GPT-5-mini and o4-mini
→ Zero parameter updates required at any point
→ Knowledge compounds: more tasks = smarter memory = better performance

The fix is simple in principle. Skills stop the agent from wasting steps on errors it already made. Experiences tell it exactly which tool to pick in which situation. Together they turn a stateless agent into one that actually learns from its past.

Every AI agent deployed today is leaving this on the table.
14 replies · 17 reposts · 65 likes · 5K views
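Judging only by the thread's description, the dual-memory loop could look something like the sketch below; this is my guess at the structure, not HKUST's code, and `call_llm` is a hypothetical client.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DualMemory:
    """Sketch of the dual-memory idea described above: 'skills' are
    reusable workflows, 'experiences' are concrete mistakes to avoid.
    The base model never changes; both stores are plain text injected
    into future prompts, which is also why they can transfer between
    models (e.g., Gemini -> GPT)."""
    skills: list = field(default_factory=list)
    experiences: list = field(default_factory=list)

    def as_context(self) -> str:
        # Prepended to every new task prompt.
        return ("Known workflows:\n" + "\n".join(self.skills) +
                "\n\nPast mistakes to avoid:\n" + "\n".join(self.experiences))

    def record(self, skill: Optional[str] = None, mistake: Optional[str] = None):
        # Called after each task: knowledge compounds with zero
        # parameter updates.
        if skill:
            self.skills.append(skill)
        if mistake:
            self.experiences.append(mistake)

# Hypothetical usage with any LLM client (call_llm is not a real API):
#   memory = DualMemory()
#   answer = call_llm(memory.as_context() + "\n\nTask: " + task)
#   memory.record(skill=extracted_workflow, mistake=observed_error)
```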
⊥ O X I N ╪ H Ξ X retweeted
Indu Tripathi @InduTripat82427
6 Open-Source Libraries to Fine-Tune LLMs

1. Unsloth
GitHub: github.com/unslothai/unsl…
→ Fastest way to fine-tune LLMs locally
→ Optimized for low VRAM (even laptops)
→ Plug-and-play with Hugging Face models

2. Axolotl
GitHub: github.com/OpenAccess-AI-…
→ Flexible LLM fine-tuning configs
→ Supports LoRA, QLoRA, multi-GPU
→ Great for custom training pipelines

3. TRL (Transformer Reinforcement Learning)
GitHub: github.com/huggingface/trl
→ RLHF, DPO, PPO for LLM alignment
→ Built on Hugging Face ecosystem
→ Essential for post-training optimization

4. DeepSpeed
GitHub: github.com/microsoft/Deep…
→ Train massive models efficiently
→ Memory + speed optimization
→ Industry standard for scaling

5. LLaMA-Factory
GitHub: github.com/hiyouga/LLaMA-…
→ All-in-one fine-tuning UI + CLI
→ Supports multiple models (LLaMA, Qwen, etc.)
→ Beginner-friendly + powerful

6. PEFT
GitHub: github.com/huggingface/pe…
→ Fine-tune with minimal compute
→ LoRA, adapters, prefix tuning
→ Best for cost-efficient training

Save this for future use.
5 replies · 28 reposts · 163 likes · 6.5K views
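Of the six, PEFT is probably the fastest to try. A minimal LoRA setup with the real `peft` API; the model checkpoint and hyperparameters are just examples:

```python
# Minimal LoRA setup with PEFT (library 6 above); the checkpoint and
# hyperparameters are examples, swap in whatever you're actually tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")   # example model
config = LoraConfig(
    r=8,                        # adapter rank (low-rank update dimension)
    lora_alpha=16,              # scaling factor for the adapter update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% trainable
```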
⊥ O X I N ╪ H Ξ X retweeted
黄小木 @ai_xiaomu
GPT-6 is coming.
Pretraining has reportedly been completed in the Stargate data center, and the model has entered the safety-alignment stage.
Published figures: 92.5% on math reasoning, a 96.8% pass rate on code generation, and human-expert level on 83% of occupational tasks.
OpenAI has renamed its product division the "AGI Deployment Department."
Whether or not you believe in AGI, they are clearly all in.
500 replies · 102 reposts · 830 likes · 133.3K views
⊥ O X I N ╪ H Ξ X retweeted
t.toda @Trtd6Trtd
arxiv.org/abs/2603.20957
A fine-tuning method that makes an LLM output the data it was trained on; apparently 85–90% of it could be reproduced.
The paper frames this in terms of copyright protection, but beyond that, safety and output control really are a game of whack-a-mole.
3 replies · 36 reposts · 262 likes · 21K views