Salih Joshua Kilicli

2.2K posts

Salih Joshua Kilicli

@math3manticus

Houston, TX Entrou em Ağustos 2020

623 Seguindo373 Seguidores

Tweet fixado

Salih Joshua Kilicli@math3manticus·9 Mar

I created a list of resources I utilized during my Data Science learning process. I mainly included free ebooks and GitHub repos for the books. This list is biased towards my personal experience and doesn't include any affiliate links: docs.google.com/document/d/1Gs… #datascience

English

152

Salih Joshua Kilicli@math3manticus·1d

This is why we love Google! 🤗

Google DeepMind@GoogleDeepMind

Meet Gemma 4: our new family of open models you can run on your own hardware. Built for advanced reasoning and agentic workflows, we’re releasing them under an Apache 2.0 license. Here’s what’s new 🧵

English

Salih Joshua Kilicli retweetou

FOX Soccer@FOXSoccer·3d

🇹🇷 TÜRKIYE vs USA 🇺🇸 The @USMNT will face off against Türkiye in its final group stage game at the 2026 FIFA World Cup this summer on FOX!

English

479

662

7.5K

Salih Joshua Kilicli retweetou

Ahmad@TheAhmadOsman·28 Mar

BREAKING Elon Musk endorsed my Top 26 Essential Papers for Mastering LLMs and Transformers Implement those and you’ve captured ~90% of the alpha behind modern LLMs. Everything else is garnish. This list bridges the Transformer foundations with the reasoning, MoE, and agentic shift Recommended Reading Order 1. Attention Is All You Need (Vaswani et al., 2017) > The original Transformer paper. Covers self-attention, > multi-head attention, and the encoder-decoder structure > (even though most modern LLMs are decoder-only.) 2. The Illustrated Transformer (Jay Alammar, 2018) > Great intuition builder for understanding > attention and tensor flow before diving into implementations 3. BERT: Pre-training of Deep Bidirectional Transformers (Devlin et al., 2018) > Encoder-side fundamentals, masked language modeling, > and representation learning that still shape modern architectures 4. Language Models are Few-Shot Learners (GPT-3) (Brown et al., 2020) > Established in-context learning as a real > capability and shifted how prompting is understood 5. Scaling Laws for Neural Language Models (Kaplan et al., 2020) > First clean empirical scaling framework for parameters, data, and compute > Read alongside Chinchilla to understand why most models were undertrained 6. Training Compute-Optimal Large Language Models (Chinchilla) (Hoffmann et al., 2022) > Demonstrated that token count matters more than > parameter count for a fixed compute budget 7. LLaMA: Open and Efficient Foundation Language Models (Touvron et al., 2023) > The paper that triggered the open-weight era > Introduced architectural defaults like RMSNorm, SwiGLU > and RoPE as standard practice 8. RoFormer: Rotary Position Embedding (Su et al., 2021) > Positional encoding that became the modern default for long-context LLMs 9. FlashAttention (Dao et al., 2022) > Memory-efficient attention that enabled long context windows > and high-throughput inference by optimizing GPU memory access. 10. Retrieval-Augmented Generation (RAG) (Lewis et al., 2020) > Combines parametric models with external knowledge sources > Foundational for grounded and enterprise systems 11. Training Language Models to Follow Instructions with Human Feedback (InstructGPT) (Ouyang et al., 2022) > The modern post-training and alignment blueprint > that instruction-tuned models follow 12. Direct Preference Optimization (DPO) (Rafailov et al., 2023) > A simpler and more stable alternative to PPO-based RLHF > Preference alignment via the loss function 13. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022) > Demonstrated that reasoning can be elicited through prompting > alone and laid the groundwork for later reasoning-focused training 14. ReAct: Reasoning and Acting (Yao et al., 2022 / ICLR 2023) > The foundation of agentic systems > Combines reasoning traces with tool use and environment interaction 15. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (Guo et al., 2025) > The R1 paper. Proved that large-scale reinforcement learning without > supervised data can induce self-verification and structured reasoning behavior 16. Qwen3 Technical Report (Yang et al., 2025) > A modern architecture lightweight overview > Introduced unified MoE with Thinking Mode and Non-Thinking > Mode to dynamically trade off cost and reasoning depth 17. Outrageously Large Neural Networks: Sparsely-Gated Mixture of Experts (Shazeer et al., 2017) > The modern MoE ignition point > Conditional computation at scale 18. Switch Transformers (Fedus et al., 2021) > Simplified MoE routing using single-expert activation > Key to stabilizing trillion-parameter training 19. Mixtral of Experts (Mistral AI, 2024) > Open-weight MoE that proved sparse models can match dense quality > while running at small-model inference cost 20. Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints (Komatsuzaki et al., 2022 / ICLR 2023) > Practical technique for converting dense checkpoints into MoE models > Critical for compute reuse and iterative scaling 21. The Platonic Representation Hypothesis (Huh et al., 2024) > Evidence that scaled models converge toward shared > internal representations across modalities 22. Textbooks Are All You Need (Gunasekar et al., 2023) > Demonstrated that high-quality synthetic data allows > small models to outperform much larger ones 23. Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (Templeton et al., 2024) > The biggest leap in mechanistic interpretability > Decomposes neural networks into millions of interpretable features 24. PaLM: Scaling Language Modeling with Pathways (Chowdhery et al., 2022) > A masterclass in large-scale training > orchestration across thousands of accelerators 25. GLaM: Generalist Language Model (Du et al., 2022) > Validated MoE scaling economics with massive > total parameters but small active parameter counts 26. The Smol Training Playbook (Hugging Face, 2025) > Practical end-to-end handbook for efficiently training language models Bonus Material > T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Raffel et al., 2019) > Toolformer (Schick et al., 2023) > GShard (Lepikhin et al., 2020) > Adaptive Mixtures of Local Experts (Jacobs et al., 1991) > Hierarchical Mixtures of Experts (Jordan and Jacobs, 1994) If you deeply understand these fundamentals; Transformer core, scaling laws, FlashAttention, instruction tuning, R1-style reasoning, and MoE upcycling, you already understand LLMs better than most Time to lock-in, good luck!

English

146

1.3K

57.6K

Salih Joshua Kilicli retweetou

Thin Signal@thin_signal·26 Mar

Not all KV caches are equal. We implemented TurboQuant (rotation-based vector quantization from Google's recent paper) for KV cache compression on Apple Silicon. The key idea: randomly rotate vectors before quantization to preserve the inner products that attention needs.

English

30.8K

Salih Joshua Kilicli retweetou

ngrok@ngrokHQ·25 Mar

Quantization can make an LLM 4x smaller and 2x faster, with barely any quality loss. But what *is* it? @samwhoo crafted a beautiful interactive essay explaining it from first principles, aimed at coders, not mathematicians. ngrok.com/blog/quantizat…

English

193

1.6K

648.3K

Salih Joshua Kilicli retweetou

Emre Can Kartal@eckartal·21 Mar

> be Asimov > buys a Unitree G1 > starts testing policies > knee breaks > waits 2 months for one replacement part > realizes closed humanoids kill development speed > builds its own humanoid robot > makes it open-source > puts everything on GitHub, from policies to parts > launches a humanoid DIY kit with a $499 deposit > does almost $1M in sales in 30 days > starts setting up manufacturing > begins shipments in the next few months Asimov gives every AI and robotics builder a fully customizable humanoid robot. Open.

Asimov@asimovinc

You can build your own humanoid at home. Asimov – Here be Dragons is now available for presale. $499 deposit, $15,000 target price. asimov.inc/diy-kit

English

226

3.5K

444.5K

Salih Joshua Kilicli@math3manticus·9 Mar

@WellnessWisdomm @brianwut Yeah, that’s not a real ice cream.

English

Wellness Wisdom@WellnessWisdomm·9 Mar

@brianwut No, ice cream is not a healthy food. Mono- and diglycerides are not good for health.

English

4.2K

Brian 🔰@brianwut·8 Mar

harvard can't figure out why ice cream eaters are healthier. it's because ice cream is the only food nobody eats out of obligation or guilt. the food diary is an accidental personality test and "eats ice cream on purpose, reports it honestly" is just measuring internal locus of control

English

185

472

15K

7.3M

Salih Joshua Kilicli@math3manticus·9 Mar

@atakankurt2 Baris’in pozisyonu haric hepsi faul. Senin bunu paylasman komik olmus.

Türkçe

259

Atakan Kurt@atakankurt2·9 Mar

Beşiktaş sezon başında çok iyi bir kamp dönemi geçirip, fizik olarak harika bir seviyeye çıkmalı. Galatasaray maçında fizik olarak ciddi anlamda yetersiz kalındı.

Club Regista@ClubRegista

Fiziken de ezilmişler!

Türkçe

134

31.6K

Salih Joshua Kilicli@math3manticus·8 Mar

@ASametSarikaya Ne kadar beyinsiz toplulugu varsa yorumlara dusmus. Direkt engelliyorum bu amipleri.

Türkçe

Salih Joshua Kilicli@math3manticus·8 Mar

@ASametSarikaya Ben eski arkadaslarimla minimal gorusuyorum. GS’li birisiyle tanisirsam bir daha denk gelmemeye calisiyorum. Ahlaksiza ahlak anlatamazsin, ustune bir de ahlak dersi dinlersin. Bunlari kendi basina birakmak lazim ki birbirlerini yesinler cikarlari catisinca.

Türkçe

1.8K

A.Samet Sarıkaya🇹🇷@ASametSarikaya·8 Mar

Bu galatasaraylıların en müslümanına, (tövbe haşa) Allah 1 değil de 2 dersen size kupa vercez deyin, ne 2 si en az 3 falan diye cevap verirler. O yüzden hiçbir gs li ile münakaşaya girmeyin futbol sohbeti etmeyin, hatta normal sohbet falan da etmeyin mk

Türkçe

360

474

3.9K

106.7K

Salih Joshua Kilicli@math3manticus·8 Mar

@firatgunayer Bunlar tek kullanimlik hakemler. Kullanip atarlar, bir daha da hakemlik yapamaz. Ama bein vs de yorumcu hakem uzmani olarak yerlestirirler. Yillardir ayni senaryo.

Türkçe

Fırat Günayer@firatgunayer·7 Mar

Neredeyse her kararı tartışmalı, berbat bir hakem yönetimi. İşin kötüsü bu atama yapıldığında herkes, bu hakemin bu maçı iyi yönetemeyeceğini biliyordu. Mhk'nin göz göre göre bu ismi tercih ettiği yerde, ne marka değeri konuşulur, ne futbol konuşulur.

Türkçe

380

638

8.9K

146.8K

Salih Joshua Kilicli@math3manticus·7 Mar

@alperbah Osimhen ilk dk’larda Ersin’in baldirima basti cok net sekilde. Cok hizli bir sekilde gecistirdiler ozette, asil acidan vermediler bilerek. Hakem gordu ki sari verdi, bize olsa direk kirmizi verirlerdi. Sane 2 kirmizilik asist yaptirip oyle attilar. Sallai’nin kac tane sarisi var.

Türkçe

179

Aʟᴘᴇʀ Bᴀʜᴀᴅıʀ@alperbah·7 Mar

Osimhen düdükten donra top vurduğu an hızla kafayı cevirdi Atmamak için şu hale düşüyorlar ya sözün bittiği yer

Türkçe

119

112

1.7K

60.1K

Salih Joshua Kilicli@math3manticus·7 Mar

@Theatiba10 @BVekili03 Rakibin 4 tane orta saha cikarip 4 tane defans oyuncusu aldi. Zaten kapali takimlara gol atamiyoruz, ama herkes kötüydü bugün. Bu kadar pas hatasi yaptigimiz bir mac hatirlamiyorum.

Türkçe

The ATİBA ⚫⚪@Theatiba10·7 Mar

@BVekili03 Hakem faktörü eyvallah ama neredeyse 30 dakika 10 kişi oynamış rakibe gol atamadin .. Bunun bahanesi sadece hakem faktörü olamaz .. Ayrıca Bu sezon ligde 4 derbi maçta 2 mağlubiyet aldın Ts'yi bile yenemedin Ee nasıl şampiyon olacaksın bu karne ile ?

Türkçe

Salih Joshua Kilicli@math3manticus·7 Mar

@Yusuf_Remmy @TrollFootball No, this was at the 30th min. He made an assist after this one. Anything for GS in this corrupted league.

English

Captain Remi 🇳🇬@Yusuf_Remmy·7 Mar

@TrollFootball He got a red card for that

English

17.7K

Troll Football@TrollFootball·7 Mar

Galatasaray aren't even hiding it anymore...

English

422

3.7K

31.4K

1.1M

Salih Joshua Kilicli@math3manticus·6 Mar

@jianxliao I had an interview where they asked the structure / pseudo-code instead of forcing me to remember every function, syntax etc. This is the way for future of coding interviews. I prefer not to waste my finite memory for remembering syntax. I can look it up or ask an LLM instead.

English

140

jian@jianxliao·5 Mar

Yesterday I had my first coding interview, at one of the big AI labs, after 4 years of being a founder. It was a disaster. The task? agents algorithm. something I work with literally every single day. I forgot basic js syntax. blanked on how to delete an array element. panicked on recursion. The solution was crystal clear in my head. I could see exactly how to write it. but my hands just... couldn't. The knowledge is there. the muscle memory is gone. 3 years of vibe coding did this to me. I haven't written code manually since. I just read it, design systems, think in architectures. Somewhere along the startup journey, I stopped being a coder. I became someone who just ships. Am I alone in this? Sitting there, embarrassed, I think that's actually the right direction. We used to write code. now we read it. soon with agentic engineering, we won't even need to read it. we'll just architect.

English

272

114

3.1K

383.4K

Salih Joshua Kilicli@math3manticus·5 Mar

@FenerSol1907 Bu öyle bir istatistik ki gercegi çok daha kötüsüdür. Malum takima her dokunmaya faul, diger takimlara her 3-4 faule bir faul veriliyor. Yani aradaki fark gercekte cok daha büyüktür.

Türkçe

1.2K

FenerSol@FenerSol1907·5 Mar

Taylan Antalyalı'nın Anadolu takımlarında oynarken diğer takımlara karşı yaptığı faul ortalamaları. Fenerbahçe 20 faul %41,7 Beşiktaş 13 faul %27,1 Trabzonspor 11 faul %22,9 GS (ayrıldıktan sonra) 4 faul %8,3 x.com/Bastardis_Fb/s…

Türkçe

563

3.3K

197K

Salih Joshua Kilicli@math3manticus·4 Mar

@jeankaddour Base model vs quantized version with learning rate schedulers?

English

119

Jean Kaddour@jeankaddour·3 Mar

ML interview question: What is happening here?

English

156

564

145.3K

Salih Joshua Kilicli@math3manticus·1 Mar

@buyruc Emeklilik diyorsun yani :)

Türkçe

1.2K

emrec@buyruc·1 Mar

sakın unutmayın, bir insanın kariyerinde gelebileceği en ileri nokta linkedin’ini silmektir.

Türkçe

129

592

16.3K

1.1M

Salih Joshua Kilicli@math3manticus·1 Mar

@elonmusk Best thing happened to X so far! I wish there was a filter for AI Flop posts as well :)

English

Elon Musk@elonmusk·28 Şub

If you’d like to see less politics or crypto, just select all except those ones. This is an “early access” feature, so expect it to improve rapidly.

Dan@KettlebellDan

an immediate improvement 🔥

English

7.9K

66.6K

23.1M

Salih Joshua Kilicli retweetou

Charly Wargnier@DataChaz·27 Şub

Anthropic has quietly dropped a massive curriculum of free courses covering the entire AI ecosystem The syllabus is STACKED 🔥 → Claude Code: CLI automation for your workflow → MCP Mastery: building custom tools and resources in Python → API: a complete guide to the Anthropic backend → AI Fluency: frameworks for safe and efficient collaboration → Claude 101: core features for everyday work I added the link to the free @AnthropicAI Academy in the 🧵↓

English

763

4.3K

797.8K

Descobrir

@USMNT @samwhoo @WellnessWisdomm @brianwut @atakankurt2 @ASametSarikaya @firatgunayer @alperbah