fatih c. akyon

1.7K posts

@fcakyon

making ai useful at @ultralytics and @viddexa
phd cand. at @metu_odtu

Joined February 2020
109 Following · 1.4K Followers
Pinned Tweet
fatih c. akyon
fatih c. akyon@fcakyon·
I have just open-sourced a literature summary on ML-based content moderation and multimodal content rating! It includes various sources for audio, text, video modalities, links to datasets and related Python tools. Github: github.com/fcakyon/conten… Arxiv: arxiv.org/abs/2212.04533
English
5
62
378
95.8K
fatih c. akyon
fatih c. akyon@fcakyon·
🔥 Presenting SenBen at CVPR Workshops 2026: senben.kim

A 241M model just beat every frontier VLM we tested except Gemini, on grounded scene graph metrics for content moderation.

Most moderation systems return a binary safe/unsafe label without telling you what was detected, who is in the frame, or where it occurs. SenBen turns moderation into a structured prediction task: 13,999 movie frames annotated as Visual Genome-style scene graphs, 16 sensitivity tags, and affective attributes like pain, fear, aggression, and distress.

The student is distilled from Gemini 3 Pro using a Vocabulary-Aware Recall loss and a decoupled tag head.

Full open release under MIT: dataset, training code, the VAR loss, SenBen-Score, all 18 baselines. Almost every safety pipeline at this scale is closed, which makes it impossible to compare or build on. We wanted to do the opposite.

#ExplainableAI #AISafety #CVPR2026 #VLM x.com/fcakyon/status…
English
1
1
3
184
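The SenBen post above describes moderation as structured prediction over Visual Genome-style scene graphs. A minimal sketch of how such an annotation could be represented follows. The class names, field names, box convention, and the example tag are assumptions made for illustration; they are not SenBen's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    object_id: int
    label: str
    # Assumed convention: (x, y, width, height) in pixels.
    box: tuple[float, float, float, float]
    # Affective attributes such as "fear" or "pain" (hypothetical placement).
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class FrameAnnotation:
    objects: list[SceneObject]
    relationships: list[Relationship]
    tags: list[str]  # sensitivity tags; the tag vocabulary here is hypothetical

    def explain(self) -> list[str]:
        """Render each relationship as a readable 'subject predicate object' triple."""
        by_id = {obj.object_id: obj for obj in self.objects}
        return [
            f"{by_id[r.subject_id].label} {r.predicate} {by_id[r.object_id].label}"
            for r in self.relationships
        ]

    def is_flagged(self) -> bool:
        """A frame is flagged when it carries at least one sensitivity tag."""
        return bool(self.tags)


# Hypothetical example frame.
frame = FrameAnnotation(
    objects=[
        SceneObject(0, "person", (10, 20, 50, 120), ["fear"]),
        SceneObject(1, "knife", (40, 60, 15, 5)),
    ],
    relationships=[Relationship(0, "holds", 1)],
    tags=["weapon"],
)
print(frame.is_flagged(), frame.explain())
```

Unlike a binary safe/unsafe label, a structure like this keeps what was detected, who is involved, and where it occurs available for downstream review.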
fatih c. akyon retweeted
Sakana AI
Sakana AI@SakanaAILabs·
What if instead of building one giant AI, we evolved a coordinator to orchestrate a diverse team of specialized AIs? 🐟

Excited to share our new paper: “TRINITY: An Evolved LLM Coordinator”, published as a conference paper at #ICLR2026!

Paper: arxiv.org/abs/2512.04695

In nature, complex problems are rarely solved by a single monolithic entity, but rather by the coordinated efforts of specialized individuals working together. Yet, modern AI development is heavily focused on endlessly scaling up single, massive monolithic models, yielding diminishing returns. While model merging offers a way to combine different skills, it is often impractical due to mismatched neural architectures and the closed-source nature of top-performing models.

To address this, we took a macro-level approach: test-time model composition. We introduce TRINITY, a system that fuses the complementary strengths of diverse, state-of-the-art models without needing to modify their underlying weights.

TRINITY processes queries over multiple turns. At each step, a lightweight coordinator assigns one of three distinct roles to an LLM from its available pool:
1/ Thinker: Devises high-level strategies and analyzes the current state.
2/ Worker: Executes concrete problem-solving steps.
3/ Verifier: Evaluates if the current solution is complete and correct.

By dynamically assigning these roles, the coordinator effectively offloads complex reasoning and skill execution onto the external models.

What makes TRINITY unique is its extreme efficiency. The coordinator relies on the hidden states of a compact language model and a small routing head. In total, it has fewer than 20K learnable parameters.

Training this system presented a massive challenge. Traditional Reinforcement Learning (REINFORCE) failed because the gradients had a low signal-to-noise ratio due to binary rewards and weak parameter coupling. Imitation learning (Supervised Fine-Tuning) was ruled out because generating multi-turn labels is prohibitively expensive.

Our solution? We turned to nature-inspired algorithms. We optimized the coordinator using a derivative-free evolutionary algorithm. We found that evolution is uniquely suited to optimize this tight, high-dimensional coordination problem where traditional gradient-based methods fail.

The results are very promising. In our experiments, TRINITY consistently outperforms existing multi-agent methods and individual models across various benchmarks. At the time of publication, it set a new state-of-the-art record on LiveCodeBench, achieving an 86.2% pass@1 score.

More importantly, it demonstrated incredible generalization. Without any retraining, TRINITY transferred zero-shot to four unseen tasks (AIME, BigCodeBench, MT-Bench, and GPQA). On average, the evolved coordinator surpassed every individual constituent model in its pool, including GPT-5, Gemini 2.5-Pro, and Claude-4-Sonnet (the top frontier models available at the time of our #ICLR2026 submission last year).

This work is central to Sakana AI's vision. We believe the future of AI isn't just about scaling monolithic models, but engineering collaborative, diverse AI ecosystems that can adapt and combine their strengths.

We invite the community to read the paper and explore these ideas!

Paper: arxiv.org/abs/2512.04695
OpenReview: openreview.net/forum?id=5HaRj…

This foundational research is part of the core engine powering our multi-agent product: Sakana Fugu 🐡👇
Sakana AI@SakanaAILabs

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system!

Blog: sakana.ai/fugu-beta

Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task.

Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes.

🐟 Fugu Mini: High-speed orchestration optimized for latency
🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning

Apply for the beta test here: forms.gle/BtKkhc2CfLKk1d…

English
14
67
407
95.9K
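The TRINITY post above describes a small routing head that picks a (model, role) pair and is trained with a derivative-free evolutionary algorithm on binary rewards. The toy sketch below illustrates only that general idea. It is single-turn, uses a plain (1+λ)-style mutation loop rather than whatever algorithm the paper uses, and the model names, feature size, and toy reward are all hypothetical.

```python
import random

ROLES = ("thinker", "worker", "verifier")      # roles named in the post
MODELS = ("model_a", "model_b", "model_c")     # hypothetical model pool
HIDDEN_DIM = 8                                 # toy stand-in for a hidden state
N_OUT = len(MODELS) * len(ROLES)
N_PARAMS = N_OUT * HIDDEN_DIM                  # 72 parameters in this toy head


def route(weights, hidden):
    """Linear routing head: return the (model, role) pair with the highest score."""
    scores = [
        sum(weights[o * HIDDEN_DIM + i] * hidden[i] for i in range(HIDDEN_DIM))
        for o in range(N_OUT)
    ]
    best = max(range(N_OUT), key=scores.__getitem__)
    return MODELS[best // len(ROLES)], ROLES[best % len(ROLES)]


def make_toy_tasks(rng, n=30):
    """Toy data: the 'ideal' pair depends on the signs of two features (made up)."""
    tasks = []
    for _ in range(n):
        hidden = [rng.gauss(0, 1) for _ in range(HIDDEN_DIM)]
        target = (MODELS[0 if hidden[0] > 0 else 1], ROLES[0 if hidden[1] > 0 else 1])
        tasks.append((hidden, target))
    return tasks


def fitness(weights, tasks):
    """Mean binary reward: 1 when the routed pair matches the toy target, else 0."""
    return sum(route(weights, h) == t for h, t in tasks) / len(tasks)


def evolve(tasks, generations=50, offspring=16, sigma=0.3, seed=0):
    """Gradient-free search: mutate the parent, keep any child that is no worse."""
    rng = random.Random(seed)
    parent = [0.0] * N_PARAMS
    best = fitness(parent, tasks)
    history = [best]
    for _ in range(generations):
        for _ in range(offspring):
            child = [w + rng.gauss(0, sigma) for w in parent]
            score = fitness(child, tasks)
            if score >= best:
                parent, best = child, score
        history.append(best)
    return parent, best, history
```

Because the search only compares fitness values, it never needs a gradient of the binary reward, which is the property the post highlights. A real coordinator would run over multiple turns and call actual LLMs; none of that is modeled here.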
د. عبدالرحمن ذياب
PhD Skills for Claude Code: @fcakyon (more than 1,300 scientific citations and 7 patents) shares his battle-tested practical experience on PhD skills for using Claude Code. GitHub repository link in the reply below:
Arabic
2
16
142
9.3K
Jeremy Nguyen ✍🏼 🚢
Jeremy Nguyen ✍🏼 🚢@JeremyNguyenPhD·
PhD Skills for Claude Code: @fcakyon (1300+ citations, 7 patents) shares his battle-tested PhD Skills for Claude Code. github repo link in reply below:
English
7
83
457
33.3K
fatih c. akyon retweeted
Maxime Labonne
Maxime Labonne@maximelabonne·
I'm releasing the 34 slides on how we design and train best-in-class edge models at @liquidai

I presented these slides yesterday at @aiDotEngineer

They cover model architecture, pre-training, scaling laws, post-training, and even a solution to fix doom loops

Special thanks to @swyx for the invitation!
English
13
68
548
27.9K
fatih c. akyon
fatih c. akyon@fcakyon·
@skirano @robdel12 API costs tell you nothing about model sizes; they tell you about the GTM strategy, profitability, and cash runway
English
0
0
1
86
Pietro Schirano
Pietro Schirano@skirano·
People seem to forget that while GPT-5.4 rivals and even surpasses Opus 4.6 in certain tasks, it's still just the size of Sonnet.
English
45
25
1.8K
171.2K
fatih c. akyon retweeted
Z.ai
Z.ai@Zai_org·
SOTA on SWE-Bench Pro (58.4): GLM-5.1 delivers significant leaps in coding and agentic performance.
English
17
55
976
198.4K
fatih c. akyon retweeted
merve
merve@mervenoyann·
tip: Gemma 4 exposes a thought channel for reasoning. If you fine-tune it on reasoning and answers, you can inject the thinking channel like below ⤵️
merve tweet media
English
4
13
158
9.2K
fatih c. akyon retweeted
Ian Landsman
Ian Landsman@IanLandsman·
Feel like pretty soon this will be irrelevant anyway: have a Gemma 4 type model just run on machine to do normal BS, save frontier models for real stuff.
English
7
3
236
12.3K
fatih c. akyon retweeted
Ahmad
Ahmad@TheAhmadOsman·
MASSIVE

Gemma 4 (31B, Dense), a model that performs on parity w/ Kimi K2.5 (1.1T, MoE)

> 35x SMALLER than Kimi K2.5

Would run on any hardware at home
- RTX 3090 / 4090 / 5090 *
- DGX Spark / Mac Studios
- MacBook Pro (24GB+)

New local SoTA

* Best perf. is when run on GPUs
English
54
43
633
47.7K
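The size comparison in the post above can be checked with simple arithmetic. The sketch below computes the parameter ratio from the two figures quoted (31B and 1.1T) and a rough weight-only memory estimate at a few precisions. The choice of precisions is an assumption, and the estimate ignores KV cache, activations, and runtime overhead, so real memory use will be higher.

```python
DENSE_PARAMS = 31e9    # Gemma 4 dense size quoted in the post
MOE_PARAMS = 1.1e12    # Kimi K2.5 total size quoted in the post


def weight_gb(n_params: float, bits_per_param: int) -> float:
    """Weight-only footprint in GB (1 GB = 1e9 bytes); excludes KV cache and overhead."""
    return n_params * bits_per_param / 8 / 1e9


ratio = MOE_PARAMS / DENSE_PARAMS
print(f"parameter ratio: {ratio:.1f}x")

# Assumed precisions for illustration: 16-bit, 8-bit, and 4-bit weights.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_gb(DENSE_PARAMS, bits):.1f} GB")
```

Under these assumptions, only the 4-bit estimate fits within the 24GB devices listed in the post, which suggests the claim relies on quantized weights.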
fatih c. akyon retweeted
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Introducing Gemma 4, our series of open weight (Apache 2.0 licensed) models, which are byte for byte the most capable open models in the world! Gemma 4 is built to run on your hardware: phones, laptops, and desktops. Frontier intelligence with a 26B MoE and a 31B Dense model!
English
287
598
6.2K
522.8K