
HELLO CYBERNETICS
@ML_deep
I work on AI, data engineering, and distributed processing. For work inquiries or personal invitations, please send a DM. Note: the HELLO CYBERNETICS Hatena blog has been closed. zenn: https://t.co/Ipi5jAWL6y


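I seriously want the 20-million-yen annual income cap on the home loan tax deduction abolished. Even with an annual income of 20 million yen, you can no longer buy a decent new-build home in the city center. Buyers at that level are starting to get brushed off even at model rooms. The progressive income tax should be enough. Is it really acceptable to have a timeline where the people paying large amounts of tax are the ones who lose out?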

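㊗️ We have released "FLAIR: Factored Level And Interleaved Ridge", a time series forecasting model that surpasses time series foundation models ㊗️ Every design decision follows the minimum description length (MDL) principle. Period selection uses BIC on the SVD spectrum, Shape is shrunk via a Dirichlet posterior, the prior for Shape₂ is also selected by BIC, and the Ridge regularization uses a GCV soft average. With zero hyperparameters, a single SVD, about 500 lines of code, and numpy and scipy as the only dependencies, it achieves both the highest speed and the highest accuracy. On the Chronos Benchmark (25 datasets) it ranked 1st among all 19 methods, including Moirai-Large (1B parameters) and Chronos-T5-Large (710M). On GIFT-Eval (97 configurations, 53 methods) it is also the most accurate statistical method, and its relMASE of 0.864 comes close to the GPU-trained PatchTST. Do we really need time series foundation models? You can start using it in three lines with pip install flaircast. GitHub: github.com/TakatoHonda/FL…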
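To make the "single SVD plus GCV-tuned Ridge" idea in the post above more concrete, here is a minimal sketch of textbook generalized cross-validation for ridge regression, computed from one SVD of the design matrix. This is not FLAIR's code: the function name `ridge_gcv` and the candidate grid `lambdas` are hypothetical, and it picks the single best penalty rather than reproducing the "soft average" the post mentions, whose details are not given there.

```python
import numpy as np

def ridge_gcv(X, y, lambdas):
    """Pick a ridge penalty by generalized cross-validation (GCV).

    Hypothetical helper for illustration only; not part of flaircast.
    Uses one thin SVD of X and reuses it for every candidate penalty.
    """
    n = X.shape[0]
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    # Squared norm of the part of y outside the column space of X.
    y_perp_sq = max(float(y @ y - Uty @ Uty), 0.0)

    scores = []
    for lam in lambdas:
        shrink = s**2 / (s**2 + lam)          # per-component shrinkage factors
        resid_sq = float(np.sum(((1.0 - shrink) * Uty) ** 2)) + y_perp_sq
        dof = float(np.sum(shrink))           # trace of the hat matrix
        scores.append(n * resid_sq / (n - dof) ** 2)
    scores = np.array(scores)

    best = float(lambdas[int(np.argmin(scores))])
    # Ridge coefficients for the selected penalty, from the same SVD.
    coef = Vt.T @ ((s / (s**2 + best)) * Uty)
    return coef, best, scores
```

Calling `ridge_gcv(X, y, np.logspace(-3, 3, 13))` returns the coefficients, the selected penalty, and the GCV score for each candidate. Because the SVD is computed once, scanning more candidate penalties costs only a few vector operations each.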

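Put bluntly, this amounts to "so that the people who can communicate can raise productivity, the ones who can't should come into the office too." Saying it that plainly would make things tense, so "communication monsters stopping loners in the hallway to spread and deepen information sharing" simply gets rephrased as "serendipitous."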





Holy shit... Stanford just proved that GPT-5, Gemini, and Claude can't actually see.
They removed every image from 6 major vision benchmarks. The models still scored 70-80% accuracy. They were never looking at your photos. Your scans. Your X-rays. Here's what's really going on: ↓
The paper is called MIRAGE. Co-authored by Fei-Fei Li. They tested GPT-5.1, Gemini-3-Pro, Claude Opus 4.5, and Gemini-2.5-Pro across 6 benchmarks -- medical and general. Then silently removed every image. No warning. No prompt change.
The models didn't even notice. They kept describing images in detail. Diagnosing conditions. Writing full reasoning traces. From images that were never there.
Stanford calls it the "mirage effect." Not hallucination. Something worse.
Hallucination = making up wrong details about a real input.
Mirage = constructing an entire fake reality and reasoning from it confidently.
The models built imaginary X-rays, described fake nodules, and diagnosed conditions -- all from text patterns alone.
But that's not the scary part. They trained a "super-guesser" -- a tiny 3B parameter text-only model. Zero vision capability. Fine-tuned it on the largest chest X-ray benchmark (696,000 questions). Images removed.
It beat GPT-5. It beat Gemini. It beat Claude. It beat actual radiologists. Ranked #1 on the held-out test set. Without ever seeing a single X-ray. The reasoning traces? Indistinguishable from real visual analysis.
Now here's what should terrify you: When the models fake-see medical images, their mirage diagnoses are heavily biased toward the most dangerous conditions. STEMI. Melanoma. Carcinoma. Life-threatening diagnoses -- from images that don't exist. 230 million people ask health questions on ChatGPT every day.
They also found something wild:
→ Tell a model "there's no image, just guess" -- performance drops
→ Silently remove the image and let it assume it's there -- performance stays high
The model enters "mirage mode." It doesn't know it can't see. And it performs BETTER when it doesn't know it's blind.
When Stanford applied their cleanup method (B-Clean) to existing benchmarks, it removed 74-77% of all questions. Three-quarters of "vision" benchmarks don't test vision.
Every leaderboard. Every "multimodal breakthrough." Every benchmark score you've seen this year. Built on mirages.
Code is open-sourced. Paper is live on arXiv. If you're building anything with multimodal AI -- especially in healthcare -- read this paper before you ship. (Link in the comments)
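The image-removal check described in the post above can be sketched as a small evaluation harness: score the same question set once with images and once with every image silently dropped, then compare. This is only an illustration of the idea, not the paper's code. The `accuracy` helper, the `(question, image, gold)` tuple layout, and the toy `text_only_guesser` are all hypothetical names introduced here.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical layout: (question text, image bytes or None, gold answer).
Example = Tuple[str, Optional[bytes], str]
AnswerFn = Callable[[str, Optional[bytes]], str]

def accuracy(answer_fn: AnswerFn,
             examples: Sequence[Example],
             drop_images: bool = False) -> float:
    """Fraction of examples answered correctly.

    With drop_images=True the image is replaced by None while the
    question text is left unchanged, mimicking a silent image removal.
    """
    correct = 0
    for question, image, gold in examples:
        pred = answer_fn(question, None if drop_images else image)
        correct += int(pred == gold)
    return correct / len(examples)

def text_only_guesser(question: str, image: Optional[bytes]) -> str:
    """Toy stand-in for a model: ignores the image and keys on the wording."""
    return "yes" if "abnormal" in question else "no"

examples = [
    ("Is there an abnormal opacity?", b"img-1", "yes"),
    ("Is the heart size normal?", b"img-2", "no"),
    ("Is an abnormal mass visible?", b"img-3", "yes"),
    ("Is a fracture present?", b"img-4", "yes"),
]

with_images = accuracy(text_only_guesser, examples)
without_images = accuracy(text_only_guesser, examples, drop_images=True)
```

If the two scores are close, as they are for this toy guesser, the questions are answerable from text patterns and the benchmark is not measuring vision for that model.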

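Polars is a DataFrame library implemented in Rust, and its Lazy API appears to be designed to build a query plan first and then optimize and execute it... It is built around the columnar Apache Arrow format and parallel processing, and in many cases it runs efficiently on large-scale data. The design philosophy is close to query optimization in database engines, and personally I find it a very interesting direction for the evolution of DataFrame tools...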

