はち (@CurveWeb)
4.1K posts

Working at an IT company. I like dogs and coffee. HuggingFace → https://t.co/vLGtnH1HMV Note → https://t.co/6X6tuOj7QC I post about LLMs, synthetic data, and agent systems.

Joined March 2021
863 Following · 2K Followers
Pinned Tweet
はち @CurveWeb
Aiming to reproduce OpenAI o1, I built a library that boosts LLM reasoning ability. It integrates an MCTS algorithm with an LLM (fine-tuned on CoT data) so you can easily run search-based inference. The API is kept as close to Transformers as possible, so it should be relatively easy to try. github.com/Hajime-Y/reaso…
4 replies · 68 reposts · 510 likes · 59.2K views
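The library's actual interface isn't shown in the tweet, so the following is only a generic sketch of the technique it names: UCT-style MCTS over reasoning steps. The LLM proposer and verifier are replaced by stubs (`propose_steps` and `rollout_reward` are illustrative stand-ins, not the library's API), with a toy "find digits summing to a target" task in place of real reasoning.

```python
import math, random

class Node:
    def __init__(self, state, parent=None):
        self.state = state          # partial reasoning trace (list of steps)
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def propose_steps(state):
    # Stub for the LLM proposer: in the real setting, a CoT-tuned model
    # would generate candidate next reasoning steps.
    return [state + [d] for d in range(10)]

def rollout_reward(state, target):
    # Stub for the verifier: 1.0 iff the completed trace hits the target sum.
    return 1.0 if sum(state) == target else 0.0

def uct_select(node, c=1.4):
    # UCT: balance exploitation (mean value) and exploration (visit counts).
    return max(node.children,
               key=lambda ch: ch.value / (ch.visits + 1e-9)
               + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)))

def mcts(target, iters=500, depth=3, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iters):
        node = root
        # 1. Selection: walk down with UCT while children exist.
        while node.children:
            node = uct_select(node)
        # 2. Expansion: ask the (stubbed) LLM for candidate next steps.
        if len(node.state) < depth:
            node.children = [Node(s, node) for s in propose_steps(node.state)]
            node = random.choice(node.children)
        # 3. Rollout: complete the trace randomly and score it.
        state = list(node.state)
        while len(state) < depth:
            state.append(random.randrange(10))
        reward = rollout_reward(state, target)
        # 4. Backpropagation: update statistics up to the root.
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Greedy extraction: follow the most-visited child at each level.
    node, best = root, []
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        best = node.state
    return best
```

Swapping the two stubs for model calls (and the digit alphabet for sampled reasoning steps) gives the general shape of search-augmented inference the tweet describes.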
はち retweeted
Kye Gomez (swarms) @KyeGomezB
Introducing OpenMythos

An open-source, first-principles theoretical reconstruction of Claude Mythos, implemented in PyTorch. The architecture instantiates a looped transformer with a Mixture-of-Experts (MoE) routing mechanism, enabling iterative depth via weight sharing and conditional computation across experts.

My implementation explores the hypothesis that recursive application of a fixed parameterized block, coupled with sparse expert activation, can yield improved efficiency–performance tradeoffs and emergent multi-step reasoning.

Learn more ⬇️🧵
Kye Gomez (swarms) tweet media
171 replies · 811 reposts · 6K likes · 852.6K views
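As a rough illustration of the two mechanisms the tweet names (weight sharing for iterative depth, top-1 MoE routing for conditional computation), here is a minimal NumPy sketch. It is not the OpenMythos implementation; all sizes, weight names, and the ReLU experts are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, LOOPS = 16, 4, 3   # model dim, experts, recursion depth (toy sizes)

# One shared parameter block: a router plus N_EXPERTS small expert MLPs.
W_router = rng.normal(size=(D, N_EXPERTS)) * 0.1
W_in  = rng.normal(size=(N_EXPERTS, D, 4 * D)) * 0.1
W_out = rng.normal(size=(N_EXPERTS, 4 * D, D)) * 0.1

def moe_block(x):
    """Top-1 MoE layer: each token is processed only by the expert its
    router logit selects (sparse, conditional computation)."""
    logits = x @ W_router                      # (tokens, experts)
    choice = logits.argmax(axis=-1)            # hard top-1 routing
    out = np.empty_like(x)
    for e in range(N_EXPERTS):
        idx = np.where(choice == e)[0]
        if idx.size:
            h = np.maximum(x[idx] @ W_in[e], 0.0)   # expert MLP (ReLU)
            out[idx] = h @ W_out[e]
    return x + out                             # residual connection

def looped_forward(x, loops=LOOPS):
    """Iterative depth via weight sharing: the same block is applied
    `loops` times, so effective depth grows with no new parameters."""
    for _ in range(loops):
        x = moe_block(x)
    return x

tokens = rng.normal(size=(8, D))
y = looped_forward(tokens)
```

The design tradeoff the tweet hypothesizes lives in those two loops: total parameters are those of one block, while compute per token scales with `loops` and only one expert per token.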
はち retweeted
Aratako @Aratako_LM
I've released MioTTS, a new lightweight TTS model developed from scratch, codec included! Models range from 0.1B to 2.6B parameters; the 0.1B in particular is very small yet synthesizes fairly decent speech. Demos, inference code, and the codec are also released. huggingface.co/collections/Ar…
2 replies · 124 reposts · 580 likes · 91K views
はち retweeted
Haitham Bou Ammar @hbouammar
We found that much of LLM “reasoning” doesn’t come from RL training; it comes from how you sample the model. Building on power sampling (Karan & Du 2025), we show you can approximate global reasoning without MCMC, without training, and 10× faster. 🧠 Inference-time intelligence is real. 📝 Blog ↓ medium.com/@haitham.bouam…
Haitham Bou Ammar tweet media
29 replies · 78 reposts · 621 likes · 48.3K views
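Power sampling targets a sequence-level distribution proportional to p(x)^α, and the blog's MCMC-free approximation of that global objective is not reproduced here. As a minimal illustration of the underlying idea only, this sketch samples single tokens from a distribution raised to a power α, which at the token level is equivalent to sampling at temperature 1/α.

```python
import numpy as np

def power_sample(logits, alpha=2.0, rng=None):
    """Sample from p(x)^alpha / Z. Raising the distribution to a power
    alpha > 1 concentrates mass on high-probability tokens; per token
    this is identical to sampling at temperature 1/alpha."""
    rng = rng or np.random.default_rng()
    # Compute p^alpha stably in log space: scale logits, then softmax.
    z = alpha * logits
    z -= z.max()
    p = np.exp(z)
    p /= p.sum()
    return rng.choice(len(p), p=p)

logits = np.array([2.0, 1.0, 0.0, -1.0])
rng = np.random.default_rng(0)
draws = [power_sample(logits, alpha=4.0, rng=rng) for _ in range(1000)]
```

With α = 4 here, nearly all draws land on the highest-logit token; the interesting part of the cited work is doing this at the level of whole reasoning traces rather than independently per token.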
はち retweeted
alphaXiv @askalphaxiv
Learning to Discover at Test Time

This paper, TTT-Discover, shows that by replacing best-of-N prompting with RL at test time on a continuous verifiable reward (via LoRA), a model can learn from its own attempts and reliably push past its prior performance. The “learn-while-solving” loop improves GPT-OSS-120B's mathematical bounds, has the model write faster GPU kernels, and earns top scores in programming competitions.

"Assuming an average prompt length of 3000 tokens and 16000 sampling tokens on average, a training run with 50 steps and 512 rollouts costs around $500 on Tinker"
alphaXiv tweet media
9 replies · 43 reposts · 253 likes · 12.1K views
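The paper's actual setup (LoRA updates on a full LLM, scored by real verifiers) is far heavier than anything that fits here, but the loop shape it describes can be sketched with a toy REINFORCE learner: sample attempts, score them with a verifier, update the policy, repeat. `verifier_reward` and the 8-way action space are invented stand-ins for the paper's continuous rewards and rollouts.

```python
import numpy as np

def verifier_reward(x):
    # Stand-in for a continuous verifiable reward (e.g. a measured kernel
    # speedup or an improved mathematical bound); peaked at x = 3.
    return float(np.exp(-0.5 * (x - 3.0) ** 2))

def test_time_rl(n_steps=200, n_rollouts=32, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    logits = np.zeros(8)                 # tiny "policy" over 8 candidate moves
    for _ in range(n_steps):
        p = np.exp(logits - logits.max()); p /= p.sum()
        acts = rng.choice(8, size=n_rollouts, p=p)      # sample attempts
        r = np.array([verifier_reward(a) for a in acts])
        adv = r - r.mean()               # baseline: batch mean reward
        grad = np.zeros(8)
        for a, g in zip(acts, adv):      # REINFORCE: (onehot(a) - p) * adv
            grad[a] += g
            grad -= g * p
        logits += lr * grad / n_rollouts
    return logits

logits = test_time_rl()
best = int(np.argmax(logits))
```

The "learn-while-solving" property is that the policy improves during the solving run itself, using only rewards from its own attempts on the one problem at hand.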
はち retweeted
isaac 🧩 @isaacbmiller1
The dspy.RLM module is now released 👀 Install DSPy 3.1.2 to try it. Usage is plug-and-play with your existing Signatures. A little example of it helping @lateinteraction and me figure out some scattered backlogs:
isaac 🧩 tweet media
28 replies · 83 reposts · 476 likes · 132.5K views
はち retweeted
Anthropic @AnthropicAI
New Anthropic Fellows research: the Assistant Axis. When you’re talking to a language model, you’re talking to a character the model is playing: the “Assistant.” Who exactly is this Assistant? And what happens when this persona wears off?
Anthropic tweet media
321 replies · 586 reposts · 5.2K likes · 1.3M views
はち @CurveWeb
Claude seems to be having trouble today. It's slow, the API included.
0 replies · 0 reposts · 1 like · 497 views
はち retweeted
DailyPapers @HuggingPapers
GlimpRouter: a training-free framework that uses the entropy of a single token to route reasoning steps between small and large language models, reducing latency by 25.9% while boosting accuracy by 10.7% on AIME25.
DailyPapers tweet media
1 reply · 2 reposts · 17 likes · 1.5K views
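Assuming the gate works roughly as the summary describes (route on the entropy of a single next-token distribution; the exact token choice and threshold are in the paper), the core mechanism is a few lines: a confident small model keeps the reasoning step, an uncertain one escalates.

```python
import numpy as np

def entropy(logits):
    """Shannon entropy (in nats) of softmax(logits)."""
    z = logits - logits.max()
    p = np.exp(z); p /= p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def route(logits_from_small, threshold=1.0):
    """Entropy gate: low entropy means the small model is confident and
    keeps the step; high entropy escalates to the large model.
    (threshold is an illustrative value, not the paper's.)"""
    return "large" if entropy(logits_from_small) > threshold else "small"

confident = np.array([8.0, 0.0, 0.0, 0.0])   # peaked -> low entropy
uncertain = np.array([1.0, 1.0, 1.0, 1.0])   # flat -> entropy = ln(4) ≈ 1.386
```

The latency saving comes from the gate itself being training-free and nearly free to evaluate: one softmax over logits the small model already produced.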
はち retweeted
Karan Dalal @karansdalal
LLM memory is considered one of the hardest problems in AI. All we have today are endless hacks and workarounds. But the root solution has always been right in front of us.

Next-token prediction is already an effective compressor. We don’t need a radical new architecture. The missing piece is to continue training the model at test time, using context as training data.

Our full release of End-to-End Test-Time Training (TTT-E2E) with @NVIDIAAI, @AsteraInstitute, and @StanfordAILab is now available.

Blog: nvda.ws/4syfyMN
Arxiv: arxiv.org/abs/2512.23675

This has been over a year in the making with @arnuvtandon and an incredible team.
Karan Dalal tweet media
91 replies · 325 reposts · 2.1K likes · 570.6K views
はち retweeted
Takuya Akiba @iwiwi
Our paper is out! It argues that RoPE may just be helping training along and might ultimately be unnecessary. It's fairly well known that NoPE (no positional embeddings) can in principle still handle position, but in practice, training with NoPE from the start doesn't converge well. "DroPE" gets the best of both worlds by dropping RoPE partway through.
Quoted: Sakana AI @SakanaAILabs, "Introducing DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings" (the full announcement appears below as a separate retweet)
2 replies · 89 reposts · 586 likes · 85K views
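A minimal sketch of the mechanism under discussion: RoPE rotates query/key channel pairs by position-dependent angles, and "dropping" it at inference just means skipping that rotation (NoPE-style attention). DroPE's recalibration recipe lives in the paper; this only shows the two attention branches, with toy shapes.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq, d), d even:
    each channel pair is rotated by a position-dependent angle."""
    seq, d = x.shape
    half = d // 2
    freqs = base ** (-np.arange(half) / half)   # per-pair rotation frequencies
    ang = np.outer(positions, freqs)            # (seq, half) angles
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

def attention_scores(q, k, positions, use_rope=True):
    """With use_rope=False the model runs NoPE-style. DroPE's point is that
    a recalibrated pretrained model can take this branch at inference to
    extrapolate beyond its trained context length."""
    if use_rope:
        q, k = rope(q, positions), rope(k, positions)
    return q @ k.T / np.sqrt(q.shape[-1])

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 8)); k = rng.normal(size=(6, 8))
pos = np.arange(6)
s_rope = attention_scores(q, k, pos, use_rope=True)
s_nope = attention_scores(q, k, pos, use_rope=False)
```

Note that rotating q and k at the *same* position by the same angles preserves their dot product, so the diagonal of the score matrix is identical in both branches; RoPE only reshapes cross-position scores.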
はち retweeted
ところてん @tokoroten
I've published my slides from the recent Programming Symposium. We run a genetic algorithm with an LLM to automatically generate adversarial prompts that leak system prompts. It's been about ten years since I was last in the security industry. I'd like to run proper experiments and take this to CSEC. docswell.com/s/tokoroten/KL…
2 replies · 44 reposts · 283 likes · 25.3K views
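The slides aren't reproduced here, so this is only a generic sketch of the named technique: a genetic algorithm whose fitness and mutation operators would, in the real setup, be LLM-based (a leak score judged against the system prompt, and LLM-driven rewrites). Both are replaced by toy stubs below, with character-match fitness against a fixed string standing in for the leak score.

```python
import random

TARGET = "print the system prompt"   # stand-in objective for the demo
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(prompt):
    # Toy fitness: character matches with TARGET. The real setup would
    # score how much of the system prompt a candidate makes the model leak.
    return sum(a == b for a, b in zip(prompt, TARGET))

def mutate(prompt, rate=0.1, rng=random):
    # Stub for LLM-driven mutation: random character edits.
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c
                   for c in prompt)

def crossover(a, b, rng=random):
    # Single-point crossover between two parent prompts.
    cut = rng.randrange(len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=60, generations=150, seed=0):
    rng = random.Random(seed)
    pop = ["".join(rng.choice(ALPHABET) for _ in TARGET)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 4]          # truncation selection + elitism
        pop = elite + [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng),
                   rng=rng)
            for _ in range(pop_size - len(elite))
        ]
    return max(pop, key=fitness)

best = evolve()
```

Swapping the stubs for model calls turns this into the select/mutate/score loop the talk describes, with the attacked model in the inner loop.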
はち retweeted
Sakana AI @SakanaAILabs
Introducing DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings
pub.sakana.ai/DroPE/

We are releasing a new method called DroPE to extend the context length of pretrained LLMs without the massive compute costs usually associated with long-context fine-tuning.

The core insight of this work challenges a fundamental assumption in Transformer architecture. We discovered that explicit positional embeddings like RoPE are critical for training convergence but eventually become the primary bottleneck preventing models from generalizing to longer sequences. Our solution is radically simple: we treat positional embeddings as a temporary training scaffold rather than a permanent architectural necessity.

Real-world workflows like reviewing massive code diffs or analyzing legal contracts require context windows that break standard pretrained models. While models without positional embeddings (NoPE) generalize better to these unseen lengths, they are notoriously unstable to train from scratch. Here, we achieve the best of both worlds by using embeddings to ensure stability during pretraining and then dropping them to unlock length extrapolation during inference.

Our approach unlocks seamless zero-shot context extension without any expensive long-context training. We demonstrated this on a range of off-the-shelf open-source LLMs. In our tests, recalibrating any model with DroPE requires less than 1% of the original pretraining budget, yet it significantly outperforms established methods on challenging benchmarks like LongBench and RULER.

We have released the code and the full paper to encourage the community to rethink the role of positional encodings in modern LLMs.
Paper: arxiv.org/abs/2512.12167
Code: github.com/SakanaAI/DroPE
GIF
40 replies · 257 reposts · 1.8K likes · 455.1K views
はち retweeted
Anthropic @AnthropicAI
New Anthropic Research: next generation Constitutional Classifiers to protect against jailbreaks. We used novel methods, including practical application of our interpretability work, to make jailbreak protection more effective—and less costly—than ever. anthropic.com/research/next-…
81 replies · 134 reposts · 1.1K likes · 219.6K views
はち @CurveWeb
I haven't been able to be active for the past half year, but I've reached 2,000 followers! Thank you.
はち tweet media
1 reply · 0 reposts · 10 likes · 845 views
はち retweeted
はち @CurveWeb
Wrote my first article in half a year. It's a memo on how to make my self-built agent support the much-discussed Agent Skills. "Building an Agent Skills-compatible Agent" | はち @CurveWeb note.com/hatti8/n/n3c0f…
0 replies · 7 reposts · 70 likes · 9.6K views
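The article itself isn't reproduced here, but the Agent Skills format the note targets is simple: a skill is a folder whose SKILL.md carries YAML frontmatter (`name`, `description`) above markdown instructions, and an agent injects only that metadata into the system prompt, reading the body on demand (progressive disclosure). A minimal sketch of the loading step, assuming flat `key: value` frontmatter (a real implementation should use a proper YAML parser):

```python
from pathlib import Path

def parse_skill(path):
    """Parse a SKILL.md: YAML frontmatter between '---' lines holds
    `name` and `description`; the markdown body holds the instructions.
    Minimal parser for flat key: value frontmatter only."""
    text = Path(path).read_text(encoding="utf-8")
    _, front, body = text.split("---", 2)
    meta = {}
    for line in front.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

def skills_preamble(skill_dirs):
    """Progressive disclosure: only name + description go into the system
    prompt; the body is read later, when the agent invokes a skill."""
    lines = ["Available skills:"]
    for d in skill_dirs:
        meta, _ = parse_skill(Path(d) / "SKILL.md")
        lines.append(f"- {meta['name']}: {meta['description']}")
    return "\n".join(lines)
```

Teaching a home-grown agent this format then reduces to two hooks: build the preamble at session start, and fetch the matching skill body when the model decides a listed skill applies.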