fatih c. akyon

1.7K posts

@fcakyon

making ai useful at @ultralytics and @viddexa phd cand. at @metu_odtu

Joined February 2020

109 Following · 1.4K Followers

Pinned Tweet

fatih c. akyon@fcakyon·Dec 24

I have just open-sourced a literature summary on ML-based content moderation and multimodal content rating! It includes various sources for audio, text, video modalities, links to datasets and related Python tools. Github: github.com/fcakyon/conten… Arxiv: arxiv.org/abs/2212.04533

fatih c. akyon@fcakyon·5d

See you in the @ultralytics booth, in Denver, this June 💯

fatih c. akyon@fcakyon·5d

🔥 Presenting SenBen at CVPR Workshops 2026: senben.kim

A 241M model just beat every frontier VLM we tested except Gemini, on grounded scene graph metrics for content moderation.

Most moderation systems return a binary safe/unsafe label without telling you what was detected, who is in the frame, or where it occurs.

SenBen turns moderation into a structured prediction task: 13,999 movie frames annotated as Visual Genome-style scene graphs, 16 sensitivity tags, and affective attributes like pain, fear, aggression, and distress. The student is distilled from Gemini 3 Pro using a Vocabulary-Aware Recall loss and a decoupled tag head.

Full open release under MIT: dataset, training code, the VAR loss, SenBen-Score, all 18 baselines.

Almost every safety pipeline at this scale is closed, which makes it impossible to compare or build on. We wanted to do the opposite.

#ExplainableAI #AISafety #CVPR2026 #VLM x.com/fcakyon/status…

fatih c. akyon@fcakyon·5d

x.com/i/article/2049…

fatih c. akyon retweeted

Sakana AI@SakanaAILabs·Apr 26

What if instead of building one giant AI, we evolved a coordinator to orchestrate a diverse team of specialized AIs? 🐟

Excited to share our new paper: “TRINITY: An Evolved LLM Coordinator”, published as a conference paper at #ICLR2026!

Paper: arxiv.org/abs/2512.04695

In nature, complex problems are rarely solved by a single monolithic entity, but rather by the coordinated efforts of specialized individuals working together. Yet, modern AI development is heavily focused on endlessly scaling up single, massive monolithic models, yielding diminishing returns. While model merging offers a way to combine different skills, it is often impractical due to mismatched neural architectures and the closed-source nature of top-performing models.

To address this, we took a macro-level approach: test-time model composition. We introduce TRINITY, a system that fuses the complementary strengths of diverse, state-of-the-art models without needing to modify their underlying weights.

TRINITY processes queries over multiple turns. At each step, a lightweight coordinator assigns one of three distinct roles to an LLM from its available pool:

1/ Thinker: Devises high-level strategies and analyzes the current state.
2/ Worker: Executes concrete problem-solving steps.
3/ Verifier: Evaluates if the current solution is complete and correct.

By dynamically assigning these roles, the coordinator effectively offloads complex reasoning and skill execution onto the external models.

What makes TRINITY unique is its extreme efficiency. The coordinator relies on the hidden states of a compact language model and a small routing head. In total, it has fewer than 20K learnable parameters.

Training this system presented a massive challenge. Traditional Reinforcement Learning (REINFORCE) failed because the gradients had a low signal-to-noise ratio due to binary rewards and weak parameter coupling. Imitation learning (Supervised Fine-Tuning) was ruled out because generating multi-turn labels is prohibitively expensive.

Our solution? We turned to nature-inspired algorithms. We optimized the coordinator using a derivative-free evolutionary algorithm. We found that evolution is uniquely suited to optimize this tight, high-dimensional coordination problem where traditional gradient-based methods fail.

The results are very promising. In our experiments, TRINITY consistently outperforms existing multi-agent methods and individual models across various benchmarks. At the time of publication, it set a new state-of-the-art record on LiveCodeBench, achieving an 86.2% pass@1 score.

More importantly, it demonstrated incredible generalization. Without any retraining, TRINITY transferred zero-shot to four unseen tasks (AIME, BigCodeBench, MT-Bench, and GPQA). On average, the evolved coordinator surpassed every individual constituent model in its pool, including GPT-5, Gemini 2.5-Pro, and Claude-4-Sonnet (the top frontier models available at the time of our #ICLR2026 submission last year).

This work is central to Sakana AI's vision. We believe the future of AI isn't just about scaling monolithic models, but engineering collaborative, diverse AI ecosystems that can adapt and combine their strengths.

We invite the community to read the paper and explore these ideas!

Paper: arxiv.org/abs/2512.04695
OpenReview: openreview.net/forum?id=5HaRj…

This foundational research is part of the core engine powering our multi-agent product: Sakana Fugu 🐡👇

Sakana AI@SakanaAILabs

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system! Blog: sakana.ai/fugu-beta Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes. 🐟 Fugu Mini: High-speed orchestration optimized for latency 🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning Apply for the beta test here: forms.gle/BtKkhc2CfLKk1d…

د. عبدالرحمن ذياب@dr2alshehri·Apr 20

PhD skills for using Claude Code: @fcakyon (more than 1,300 scientific citations and 7 patents) shares his hands-on, tested experience with PhD skills for using Claude Code. GitHub repository link in the reply below:

fatih c. akyon@fcakyon·Apr 20

@dr2alshehri Thanks for sharing!

fatih c. akyon@fcakyon·Apr 20

@JeremyNguyenPhD Thanks for sharing 🙏🏻

Jeremy Nguyen ✍🏼 🚢@JeremyNguyenPhD·Apr 20

2/ github.com/fcakyon/phd-sk…

Jeremy Nguyen ✍🏼 🚢@JeremyNguyenPhD·Apr 20

PhD Skills for Claude Code: @fcakyon (1300+ citations, 7 patents) shares his battle-tested PhD Skills for Claude Code. github repo link in reply below:

fatih c. akyon@fcakyon·Apr 20

More updates on the way, stay tuned 😎

Jeremy Nguyen ✍🏼 🚢@JeremyNguyenPhD

PhD Skills for Claude Code: @fcakyon (1300+ citations, 7 patents) shares his battle-tested PhD Skills for Claude Code. github repo link in reply below:

fatih c. akyon@fcakyon·Apr 19

Uses YOLO-based object detection 👌🏻

Dudes Posting Their W’s@DudespostingWs

Japanese engineers developed a “Sword Tip Visualization System” for the Fencing World Championships, and it makes fencing look absolutely incredible to watch.

fatih c. akyon@fcakyon·Apr 17

my reaction when Claude tries to drop her own personal suggestions during a conversation

fatih c. akyon@fcakyon

started reading @GoogleDeepMind /tips2 paper (cvpr 2026) arxiv.org/abs/2604.12012 code also released (ofc no training code, give me a week to reverse engineer it 😎): github.com/google-deepmin… it's raining distillation papers these days 🔥 @nvidia /cradiov4, @Meta /eupe and now this, who's next? @ultralytics?

fatih c. akyon retweeted

Maxime Labonne@maximelabonne·Apr 10

I'm releasing the 34 slides on how we design and train best-in-class edge models at @liquidai. I presented these slides yesterday at @aiDotEngineer. They cover model architecture, pre-training, scaling laws, post-training, and even a solution to fix doom loops. Special thanks to @swyx for the invitation!

fatih c. akyon@fcakyon·Apr 10

Publishing charts with manipulated double axes should be illegal 🥲

Claude@claudeai

In evals, Sonnet with an Opus advisor scored 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, while costing 11.9% less per task.

fatih c. akyon@fcakyon·Apr 9

@skirano @robdel12 API costs tell nothing about the model sizes, they tell you about the GTM strategy, profitability and cash runway

Pietro Schirano@skirano·Apr 8

@robdel12 Just look at API cost

Pietro Schirano@skirano·Apr 8

People seem to forget that while GPT-5.4 rivals and even surpasses Opus 4.6 in certain tasks, it's still just the size of Sonnet.

fatih c. akyon retweeted

Z.ai@Zai_org·Apr 7

SOTA on SWE-Bench Pro (58.4): GLM-5.1 delivers significant leaps in coding and agentic performance.

fatih c. akyon retweeted

merve@mervenoyann·Apr 6

tip: Gemma 4 exposes a thought channel for reasoning. If you fine-tune it on reasoning and answers, you can inject the thinking channel like below ⤵️

fatih c. akyon@fcakyon·Apr 5

It's a good model but definitely not glm5 minimax 2.7 level

Beff (e/acc)@beffjezos

Google's Gemma 4 on a 128 GB Macbook Pro is near AGI on the go, no internet needed

fatih c. akyon retweeted

Ian Landsman@IanLandsman·Apr 5

Feel like pretty soon this will be irrelevant anyway, have a Gemma 4 type model just run on machine to do normal BS, save frontier models for real stuff.

fatih c. akyon retweeted

Ahmad@TheAhmadOsman·Apr 2

MASSIVE

Gemma 4 (31B, Dense), a model that performs on parity w/ Kimi K2.5 (1.1T, MoE)

> 35x SMALLER than Kimi K2.5

Would run on any hardware at home
- RTX 3090 / 4090 / 5090 *
- DGX Spark / Mac Studios
- MacBook Pro (24GB+)

New local SoTA

* Best perf. is when run on GPUs

fatih c. akyon retweeted

Logan Kilpatrick@OfficialLoganK·Apr 2

Introducing Gemma 4, our series of open weight (Apache 2.0 licensed) models, which are byte for byte the most capable open models in the world! Gemma 4 is built to run on your hardware: phones, laptops, and desktops. Frontier intelligence with a 26B MOE and a 31B Dense model!
