はち (@CurveWeb)
4.1K posts

Working at an IT company. I like dogs and coffee. HuggingFace → https://t.co/vLGtnH1HMV Note → https://t.co/6X6tuOj7QC I post about LLMs, synthetic data, and agent systems.

Joined March 2021
863 Following · 2K Followers
Pinned Tweet
はち @CurveWeb
Aiming to reproduce OpenAI o1, I built a library that boosts LLM reasoning ability. It integrates an MCTS algorithm with an LLM (fine-tuned on CoT data) so you can easily run search-based inference. The API is kept as close to Transformers as possible, so it should be relatively easy to try. github.com/Hajime-Y/reaso…
4 replies · 68 reposts · 510 likes · 59.2K views
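The library's actual interface isn't shown in the tweet, so the following is only a generic sketch of the technique it names: UCT-style MCTS over reasoning steps. The LLM proposer and verifier are replaced by stubs (`propose_steps` and `rollout_reward` are illustrative stand-ins, not the library's API), with a toy "find digits summing to a target" task in place of real reasoning.

```python
import math, random

class Node:
    def __init__(self, state, parent=None):
        self.state = state          # partial reasoning trace (list of steps)
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def propose_steps(state):
    # Stub for the LLM proposer: in the real setting, a CoT-tuned model
    # would generate candidate next reasoning steps.
    return [state + [d] for d in range(10)]

def rollout_reward(state, target):
    # Stub for the verifier: 1.0 iff the completed trace hits the target sum.
    return 1.0 if sum(state) == target else 0.0

def uct_select(node, c=1.4):
    # UCT: balance exploitation (mean value) and exploration (visit counts).
    return max(node.children,
               key=lambda ch: ch.value / (ch.visits + 1e-9)
               + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)))

def mcts(target, iters=500, depth=3, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iters):
        node = root
        # 1. Selection: walk down with UCT while children exist.
        while node.children:
            node = uct_select(node)
        # 2. Expansion: ask the (stubbed) LLM for candidate next steps.
        if len(node.state) < depth:
            node.children = [Node(s, node) for s in propose_steps(node.state)]
            node = random.choice(node.children)
        # 3. Rollout: complete the trace randomly and score it.
        state = list(node.state)
        while len(state) < depth:
            state.append(random.randrange(10))
        reward = rollout_reward(state, target)
        # 4. Backpropagation: update statistics up to the root.
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Greedy extraction: follow the most-visited child at each level.
    node, best = root, []
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        best = node.state
    return best
```

Swapping the two stubs for model calls (and the digit alphabet for sampled reasoning steps) gives the general shape of search-augmented inference the tweet describes.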
はち retweeted
Kye Gomez (swarms) @KyeGomezB
Introducing OpenMythos

An open-source, first-principles theoretical reconstruction of Claude Mythos, implemented in PyTorch. The architecture instantiates a looped transformer with a Mixture-of-Experts (MoE) routing mechanism, enabling iterative depth via weight sharing and conditional computation across experts.

My implementation explores the hypothesis that recursive application of a fixed parameterized block, coupled with sparse expert activation, can yield improved efficiency–performance tradeoffs and emergent multi-step reasoning.

Learn more ⬇️🧵
Kye Gomez (swarms) tweet media
171 replies · 811 reposts · 6K likes · 852.6K views
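As a rough illustration of the two mechanisms the tweet names (weight sharing for iterative depth, top-1 MoE routing for conditional computation), here is a minimal NumPy sketch. It is not the OpenMythos implementation; all sizes, weight names, and the ReLU experts are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, LOOPS = 16, 4, 3   # model dim, experts, recursion depth (toy sizes)

# One shared parameter block: a router plus N_EXPERTS small expert MLPs.
W_router = rng.normal(size=(D, N_EXPERTS)) * 0.1
W_in  = rng.normal(size=(N_EXPERTS, D, 4 * D)) * 0.1
W_out = rng.normal(size=(N_EXPERTS, 4 * D, D)) * 0.1

def moe_block(x):
    """Top-1 MoE layer: each token is processed only by the expert its
    router logit selects (sparse, conditional computation)."""
    logits = x @ W_router                      # (tokens, experts)
    choice = logits.argmax(axis=-1)            # hard top-1 routing
    out = np.empty_like(x)
    for e in range(N_EXPERTS):
        idx = np.where(choice == e)[0]
        if idx.size:
            h = np.maximum(x[idx] @ W_in[e], 0.0)   # expert MLP (ReLU)
            out[idx] = h @ W_out[e]
    return x + out                             # residual connection

def looped_forward(x, loops=LOOPS):
    """Iterative depth via weight sharing: the same block is applied
    `loops` times, so effective depth grows with no new parameters."""
    for _ in range(loops):
        x = moe_block(x)
    return x

tokens = rng.normal(size=(8, D))
y = looped_forward(tokens)
```

The design tradeoff the tweet hypothesizes lives in those two loops: total parameters are those of one block, while compute per token scales with `loops` and only one expert per token.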
はち retweeted
Aratako @Aratako_LM
I've released MioTTS, a new lightweight TTS model developed from scratch, codec included! Models range from 0.1B to 2.6B parameters; the 0.1B in particular is very small yet synthesizes fairly decent speech. Demos, inference code, and the codec are also released. huggingface.co/collections/Ar…
2 replies · 124 reposts · 580 likes · 91K views
はち retweeted
Haitham Bou Ammar @hbouammar
We found that much of LLM “reasoning” doesn’t come from RL training; it comes from how you sample the model. Building on power sampling (Karan & Du 2025), we show you can approximate global reasoning without MCMC, without training, and 10× faster. 🧠 Inference-time intelligence is real. 📝 Blog ↓ medium.com/@haitham.bouam…
Haitham Bou Ammar tweet media
29 replies · 78 reposts · 621 likes · 48.3K views
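Power sampling targets a sequence-level distribution proportional to p(x)^α, and the blog's MCMC-free approximation of that global objective is not reproduced here. As a minimal illustration of the underlying idea only, this sketch samples single tokens from a distribution raised to a power α, which at the token level is equivalent to sampling at temperature 1/α.

```python
import numpy as np

def power_sample(logits, alpha=2.0, rng=None):
    """Sample from p(x)^alpha / Z. Raising the distribution to a power
    alpha > 1 concentrates mass on high-probability tokens; per token
    this is identical to sampling at temperature 1/alpha."""
    rng = rng or np.random.default_rng()
    # Compute p^alpha stably in log space: scale logits, then softmax.
    z = alpha * logits
    z -= z.max()
    p = np.exp(z)
    p /= p.sum()
    return rng.choice(len(p), p=p)

logits = np.array([2.0, 1.0, 0.0, -1.0])
rng = np.random.default_rng(0)
draws = [power_sample(logits, alpha=4.0, rng=rng) for _ in range(1000)]
```

With α = 4 here, nearly all draws land on the highest-logit token; the interesting part of the cited work is doing this at the level of whole reasoning traces rather than independently per token.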
はち retweeted
alphaXiv @askalphaxiv
Learning to Discover at Test Time

This paper, TTT-Discover, shows that by replacing best-of-N prompting with RL at test time on a continuous verifiable reward (via LoRA), a model can learn from its own attempts and reliably push past its prior performance. The “learn-while-solving” loop improves GPT-OSS-120B's mathematical bounds, has the model write faster GPU kernels, and earns top scores in programming competitions.

"Assuming an average prompt length of 3000 tokens and 16000 sampling tokens on average, a training run with 50 steps and 512 rollouts costs around $500 on Tinker"
alphaXiv tweet media
9 replies · 43 reposts · 253 likes · 12.1K views
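The paper's actual setup (LoRA updates on a full LLM, scored by real verifiers) is far heavier than anything that fits here, but the loop shape it describes can be sketched with a toy REINFORCE learner: sample attempts, score them with a verifier, update the policy, repeat. `verifier_reward` and the 8-way action space are invented stand-ins for the paper's continuous rewards and rollouts.

```python
import numpy as np

def verifier_reward(x):
    # Stand-in for a continuous verifiable reward (e.g. a measured kernel
    # speedup or an improved mathematical bound); peaked at x = 3.
    return float(np.exp(-0.5 * (x - 3.0) ** 2))

def test_time_rl(n_steps=200, n_rollouts=32, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    logits = np.zeros(8)                 # tiny "policy" over 8 candidate moves
    for _ in range(n_steps):
        p = np.exp(logits - logits.max()); p /= p.sum()
        acts = rng.choice(8, size=n_rollouts, p=p)      # sample attempts
        r = np.array([verifier_reward(a) for a in acts])
        adv = r - r.mean()               # baseline: batch mean reward
        grad = np.zeros(8)
        for a, g in zip(acts, adv):      # REINFORCE: (onehot(a) - p) * adv
            grad[a] += g
            grad -= g * p
        logits += lr * grad / n_rollouts
    return logits

logits = test_time_rl()
best = int(np.argmax(logits))
```

The "learn-while-solving" property is that the policy improves during the solving run itself, using only rewards from its own attempts on the one problem at hand.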
はち retweeted
isaac 🧩 @isaacbmiller1
The dspy.RLM module is now released 👀 Install DSPy 3.1.2 to try it. Usage is plug-and-play with your existing Signatures. A little example of it helping @lateinteraction and me figure out some scattered backlogs:
isaac 🧩 tweet media
28 replies · 83 reposts · 476 likes · 132.5K views
はち retweeted
Anthropic @AnthropicAI
New Anthropic Fellows research: the Assistant Axis. When you’re talking to a language model, you’re talking to a character the model is playing: the “Assistant.” Who exactly is this Assistant? And what happens when this persona wears off?
Anthropic tweet media
321 replies · 586 reposts · 5.2K likes · 1.3M views
はち @CurveWeb
Claude seems to be having trouble today. It's slow, the API included.
0 replies · 0 reposts · 1 like · 497 views
はち retweeted
DailyPapers @HuggingPapers
GlimpRouter: a training-free framework that uses the entropy of a single token to route reasoning steps between small and large language models, reducing latency by 25.9% while boosting accuracy by 10.7% on AIME25.
DailyPapers tweet media
1 reply · 2 reposts · 17 likes · 1.5K views
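Assuming the gate works roughly as the summary describes (route on the entropy of a single next-token distribution; the exact token choice and threshold are in the paper), the core mechanism is a few lines: a confident small model keeps the reasoning step, an uncertain one escalates.

```python
import numpy as np

def entropy(logits):
    """Shannon entropy (in nats) of softmax(logits)."""
    z = logits - logits.max()
    p = np.exp(z); p /= p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def route(logits_from_small, threshold=1.0):
    """Entropy gate: low entropy means the small model is confident and
    keeps the step; high entropy escalates to the large model.
    (threshold is an illustrative value, not the paper's.)"""
    return "large" if entropy(logits_from_small) > threshold else "small"

confident = np.array([8.0, 0.0, 0.0, 0.0])   # peaked -> low entropy
uncertain = np.array([1.0, 1.0, 1.0, 1.0])   # flat -> entropy = ln(4) ≈ 1.386
```

The latency saving comes from the gate itself being training-free and nearly free to evaluate: one softmax over logits the small model already produced.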
はち retweeted
Karan Dalal @karansdalal
LLM memory is considered one of the hardest problems in AI. All we have today are endless hacks and workarounds. But the root solution has always been right in front of us.

Next-token prediction is already an effective compressor. We don’t need a radical new architecture. The missing piece is to continue training the model at test time, using context as training data.

Our full release of End-to-End Test-Time Training (TTT-E2E) with @NVIDIAAI, @AsteraInstitute, and @StanfordAILab is now available.

Blog: nvda.ws/4syfyMN
Arxiv: arxiv.org/abs/2512.23675

This has been over a year in the making with @arnuvtandon and an incredible team.
Karan Dalal tweet media
91 replies · 325 reposts · 2.1K likes · 570.6K views
はち retweeted
Takuya Akiba @iwiwi
Our paper is out! It argues that RoPE may just be helping training along and might ultimately be unnecessary. It's fairly well known that NoPE (no positional embeddings) can in principle still handle position, but in practice, training with NoPE from the start doesn't converge well. "DroPE" gets the best of both worlds by dropping RoPE partway through.
Quoted: Sakana AI @SakanaAILabs, "Introducing DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings" (the full announcement appears below as a separate retweet)
2 replies · 89 reposts · 586 likes · 85K views
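A minimal sketch of the mechanism under discussion: RoPE rotates query/key channel pairs by position-dependent angles, and "dropping" it at inference just means skipping that rotation (NoPE-style attention). DroPE's recalibration recipe lives in the paper; this only shows the two attention branches, with toy shapes.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq, d), d even:
    each channel pair is rotated by a position-dependent angle."""
    seq, d = x.shape
    half = d // 2
    freqs = base ** (-np.arange(half) / half)   # per-pair rotation frequencies
    ang = np.outer(positions, freqs)            # (seq, half) angles
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

def attention_scores(q, k, positions, use_rope=True):
    """With use_rope=False the model runs NoPE-style. DroPE's point is that
    a recalibrated pretrained model can take this branch at inference to
    extrapolate beyond its trained context length."""
    if use_rope:
        q, k = rope(q, positions), rope(k, positions)
    return q @ k.T / np.sqrt(q.shape[-1])

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 8)); k = rng.normal(size=(6, 8))
pos = np.arange(6)
s_rope = attention_scores(q, k, pos, use_rope=True)
s_nope = attention_scores(q, k, pos, use_rope=False)
```

Note that rotating q and k at the *same* position by the same angles preserves their dot product, so the diagonal of the score matrix is identical in both branches; RoPE only reshapes cross-position scores.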
はち retweeted
ところてん @tokoroten
I've published my slides from the recent Programming Symposium. We run a genetic algorithm with an LLM to automatically generate adversarial prompts that leak system prompts. It's been about ten years since I was last in the security industry. I'd like to run proper experiments and take this to CSEC. docswell.com/s/tokoroten/KL…
2 replies · 44 reposts · 283 likes · 25.3K views
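The slides aren't reproduced here, so this is only a generic sketch of the named technique: a genetic algorithm whose fitness and mutation operators would, in the real setup, be LLM-based (a leak score judged against the system prompt, and LLM-driven rewrites). Both are replaced by toy stubs below, with character-match fitness against a fixed string standing in for the leak score.

```python
import random

TARGET = "print the system prompt"   # stand-in objective for the demo
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(prompt):
    # Toy fitness: character matches with TARGET. The real setup would
    # score how much of the system prompt a candidate makes the model leak.
    return sum(a == b for a, b in zip(prompt, TARGET))

def mutate(prompt, rate=0.1, rng=random):
    # Stub for LLM-driven mutation: random character edits.
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c
                   for c in prompt)

def crossover(a, b, rng=random):
    # Single-point crossover between two parent prompts.
    cut = rng.randrange(len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=60, generations=150, seed=0):
    rng = random.Random(seed)
    pop = ["".join(rng.choice(ALPHABET) for _ in TARGET)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 4]          # truncation selection + elitism
        pop = elite + [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng),
                   rng=rng)
            for _ in range(pop_size - len(elite))
        ]
    return max(pop, key=fitness)

best = evolve()
```

Swapping the stubs for model calls turns this into the select/mutate/score loop the talk describes, with the attacked model in the inner loop.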
はち retweeted
Sakana AI @SakanaAILabs
Introducing DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings
pub.sakana.ai/DroPE/

We are releasing a new method called DroPE to extend the context length of pretrained LLMs without the massive compute costs usually associated with long-context fine-tuning.

The core insight of this work challenges a fundamental assumption in Transformer architecture. We discovered that explicit positional embeddings like RoPE are critical for training convergence but eventually become the primary bottleneck preventing models from generalizing to longer sequences. Our solution is radically simple: we treat positional embeddings as a temporary training scaffold rather than a permanent architectural necessity.

Real-world workflows like reviewing massive code diffs or analyzing legal contracts require context windows that break standard pretrained models. While models without positional embeddings (NoPE) generalize better to these unseen lengths, they are notoriously unstable to train from scratch. Here, we achieve the best of both worlds by using embeddings to ensure stability during pretraining and then dropping them to unlock length extrapolation during inference.

Our approach unlocks seamless zero-shot context extension without any expensive long-context training. We demonstrated this on a range of off-the-shelf open-source LLMs. In our tests, recalibrating any model with DroPE requires less than 1% of the original pretraining budget, yet it significantly outperforms established methods on challenging benchmarks like LongBench and RULER.

We have released the code and the full paper to encourage the community to rethink the role of positional encodings in modern LLMs.
Paper: arxiv.org/abs/2512.12167
Code: github.com/SakanaAI/DroPE
GIF
40 replies · 257 reposts · 1.8K likes · 455.1K views
はち retweeted
Anthropic @AnthropicAI
New Anthropic Research: next generation Constitutional Classifiers to protect against jailbreaks. We used novel methods, including practical application of our interpretability work, to make jailbreak protection more effective—and less costly—than ever. anthropic.com/research/next-…
81 replies · 134 reposts · 1.1K likes · 219.6K views
はち @CurveWeb
I haven't been able to be active for the past half year, but I've reached 2,000 followers! Thank you.
はち tweet media
1 reply · 0 reposts · 10 likes · 845 views
はち retweeted
はち @CurveWeb
Wrote my first article in half a year. It's a memo on how to make my self-built agent support the much-discussed Agent Skills. "Building an Agent Skills-compatible Agent" | はち @CurveWeb note.com/hatti8/n/n3c0f…
0 replies · 7 reposts · 70 likes · 9.6K views
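The article itself isn't reproduced here, but the Agent Skills format the note targets is simple: a skill is a folder whose SKILL.md carries YAML frontmatter (`name`, `description`) above markdown instructions, and an agent injects only that metadata into the system prompt, reading the body on demand (progressive disclosure). A minimal sketch of the loading step, assuming flat `key: value` frontmatter (a real implementation should use a proper YAML parser):

```python
from pathlib import Path

def parse_skill(path):
    """Parse a SKILL.md: YAML frontmatter between '---' lines holds
    `name` and `description`; the markdown body holds the instructions.
    Minimal parser for flat key: value frontmatter only."""
    text = Path(path).read_text(encoding="utf-8")
    _, front, body = text.split("---", 2)
    meta = {}
    for line in front.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

def skills_preamble(skill_dirs):
    """Progressive disclosure: only name + description go into the system
    prompt; the body is read later, when the agent invokes a skill."""
    lines = ["Available skills:"]
    for d in skill_dirs:
        meta, _ = parse_skill(Path(d) / "SKILL.md")
        lines.append(f"- {meta['name']}: {meta['description']}")
    return "\n".join(lines)
```

Teaching a home-grown agent this format then reduces to two hooks: build the preamble at session start, and fetch the matching skill body when the model decides a listed skill applies.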