woner
8.3K posts

@mywoner
AI | LLM Security | Privacy Computing | Representation Learning | Nyaa~ | Trilingual: Chinese, English, Japanese (zh, en, ja) | The refined and the popular in harmony
China, Shenzhen & GZ / Hong Kong · Joined August 2013
1.9K Following · 1.3K Followers

Codex usage limits have now been reset across all paid plans. Enjoy the weekend!
Tibo @thsottiaux
We found and fixed two issues that could explain the degradation in GPT-5.5's capability in Codex over the last ~48 hours. We are monitoring over the coming hours to fully confirm, and I will reset usage limits this evening. Apologies, and now is the time for /fast maxxing.

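@ReitsukiSion Give it a bit more time... The algorithms won't be bad; inference compute will have to wait for the 950 in the second half of the year. Pre-/post-training data is what really widens the gap, and composer 2 is one example.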
woner retweeted

BREAKING: President Trump gives a toast to President Xi and invites him to the White House for an official visit in September:
"Thank you again, President Xi, for this beautiful welcome... It is my honor to extend an invitation to you and Madam Peng to visit us at the White House, September 24th, and we look forward to it."
"I now like to raise a glass and propose a toast to the rich and enduring ties between the American and Chinese people. It's a very special relationship, and I want to thank you again. This has been an amazing period of time. Thank you, President Xi."
woner retweeted

Introducing Aurora, a new optimizer for training frontier-scale models.
We train Aurora-1.1B, which achieves 100x data efficiency on open-source internet data. Despite having 25% fewer parameters, 2 orders of magnitude fewer training tokens, and using fully open-source internet-only data, Aurora matches Qwen3-1.7B on several benchmarks.
Aurora was developed after identifying a major failure mode that can occur under Muon, an increasingly popular optimizer that has shown strong gains over Adam(W). We find that Muon can cause a huge percentage of neurons to effectively die early in training, reducing effective network capacity so that many parameters no longer meaningfully contribute to network outputs.
By redistributing update energy more uniformly across neurons while preserving Muon’s stability properties, Aurora prevents neuron death and recovers substantial model capacity.
What makes this work especially exciting is that it points toward a broader direction for ML research: better optimizers may not come purely from elegant mathematical abstractions, but from understanding and addressing the concrete dynamics and pathologies that emerge inside real training systems.
Tilde @tilderesearch
woner retweeted





