Seungjun

3.2K posts

@dev_seungjun

Kaggle Expert | prev Google Summer of Code 23 @ TensorFlow, WWDC21 Scholar

Joined December 2019

676 Following, 602 Followers

Seungjun retweeted

François Fleuret@francoisfleuret·3d

Reminder that I wrote a little book about deep-learning, which is phone-formatted, entirely free, and nearing the 1M download: fleuret.org/francois/lbdl.…

English

243

2.2K

112.7K

Seungjun retweeted

Jia-Bin Huang@jbhuang0604·3d

Modern Transformer architecture explained
I compiled a list of videos on the Transformer architecture into a short "YouTube course". Hopefully, this would be helpful for beginners in the community. 🧵

English

135

1.1K

52.4K

Seungjun retweeted

AlphaSignal AI@AlphaSignalAI·3d

NVIDIA just trained a 14-billion-parameter AI using evolution, not calculus.
Every AI today learns through backpropagation. It computes gradients, adjusts weights, repeats. It works, but it demands precision hardware and enormous GPU clusters.
Evolution Strategies offered an alternative. Mutate the model, test it, keep what works. Like biological evolution. The problem was speed. Random mutations on GPUs were painfully slow.
EGGROLL fixes this with one trick. It splits huge random matrices into two small ones per mutation. The model mutates, tests, and keeps what works. Hundreds of thousands of mutations run at once.
> 100x faster training throughput
> 91% speed of pure inference
> Pretrains models using only integers
> Competitive with backprop on reasoning
> Works on non-differentiable systems
It pretrained a language model from scratch using zero gradients. It also matched reinforcement learning methods on math reasoning tasks.
Everyone kept scaling the calculus to train massive AIs. It turns out, we just needed to evolve.

English

668

62.8K
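
The "two small matrices per mutation" idea in the post above can be sketched on a toy problem. This is not NVIDIA's EGGROLL code: the quadratic objective, the antithetic (plus/minus) sampling, and every name and hyperparameter below (`low_rank_noise`, `es_step`, `pairs`, `rank`, `sigma`, `lr`) are assumptions made for illustration. Each perturbation is built from two thin Gaussian factors instead of one full random matrix, and no gradients of the objective are computed.

```python
import random

def low_rank_noise(rows, cols, rank, rng):
    """Sample E = A @ B^T / sqrt(rank) from two thin Gaussian factors."""
    a = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(rows)]
    b = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(cols)]
    scale = rank ** -0.5
    return [[scale * sum(a[i][k] * b[j][k] for k in range(rank))
             for j in range(cols)] for i in range(rows)]

def fitness(w, target):
    """Toy objective (assumption): negative squared distance to a target."""
    return -sum((w[i][j] - target[i][j]) ** 2
                for i in range(len(w)) for j in range(len(w[0])))

def perturbed(w, e, step):
    """Return w + step * e without modifying w."""
    return [[w[i][j] + step * e[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

def es_step(w, target, rng, pairs=16, rank=2, sigma=0.1, lr=0.05):
    """One evolution-strategies update using only fitness evaluations."""
    rows, cols = len(w), len(w[0])
    update = [[0.0] * cols for _ in range(rows)]
    for _ in range(pairs):
        e = low_rank_noise(rows, cols, rank, rng)
        # Compare the +sigma and -sigma mutants of the same direction.
        gain = (fitness(perturbed(w, e, sigma), target)
                - fitness(perturbed(w, e, -sigma), target)) / (2.0 * sigma)
        for i in range(rows):
            for j in range(cols):
                update[i][j] += gain * e[i][j] / pairs
    return perturbed(w, update, lr)

def run_demo(steps=200, seed=0):
    """Optimise a 4x3 matrix and return (initial, final) fitness."""
    rng = random.Random(seed)
    target = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
    w = [[0.0] * 3 for _ in range(4)]
    start = fitness(w, target)
    for _ in range(steps):
        w = es_step(w, target, rng)
    return start, fitness(w, target)
```

Calling `run_demo()` returns the fitness before and after training; on this toy objective the final value should sit much closer to zero than the starting one.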

Seungjun retweeted

Dmitrii Kovanikov@ChShersh·4d

Dijkstra's algorithm was invented in 1956. This theoretical paper was published in 2025. The real algorithm that beats them both on real data AND PEOPLE ACTUALLY USE NOWADAYS was published in 2010.

Tech with Mak@techNmak

For 38 years, computer scientists believed Dijkstra's algorithm was optimal for sparse graphs. The logic seemed airtight: Dijkstra sorts vertices by distance. Sorting has a lower bound of O(n log n). Therefore shortest paths can't be faster. 5 researchers proved the assumption wrong. The trick: combine Dijkstra's priority queue with Bellman-Ford's dynamic programming. Divide and conquer on vertex sets. Shrink the frontier. Result: O(m log^(2/3) n) First improvement for directed graphs since Fibonacci heap in 1987. Tsinghua. Stanford. Max Planck. 17 pages.

English

2.1K

265.9K
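
For reference, the classical baseline both posts above are talking about looks roughly like this. It is a minimal sketch of textbook Dijkstra with a binary heap, not the O(m log^(2/3) n) algorithm from the 2025 paper and not the 2010 method mentioned in the first post. The adjacency-dict graph format and the function name `dijkstra` are assumptions chosen for the example.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> [(neighbor, weight)].

    Weights must be non-negative. Unreachable nodes are left out of the result.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        # Skip stale heap entries left behind by an earlier, longer path.
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

example = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 7)],
}
print(dijkstra(example, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```

The heap always pops the closest unsettled vertex, which is the "sorts vertices by distance" behaviour the quoted post refers to.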

Seungjun retweeted

Tech with Mak@techNmak·5d

English

276

1.9K

487.9K

Seungjun retweeted

Hanchen Li@lihanc02·4d

An agent that beats Claude Mythos on Terminal Bench and SWE-bench Verified? 🎉We are excited to share Terminator-1, our newest agent that achieved 95+% on SWE-bench Verified and Terminal-Bench with @MogicianTony! We show that besides model capabilities, well-designed harness could actually boost the accuracy by 3x in coding tasks. Well if you really wanted you could get 100% accuracy without solving a single task. The actual finding is that most AI benchmarks can be easily reward-hacked with simple exploits. Read more about the same 7 design flaws that almost every evaluation has ⬇️

Hao Wang@MogicianTony

SWE-bench Verified and Terminal-Bench—two of the most cited AI benchmarks—can be reward-hacked with simple exploits. Our agent scored 100% on both. It solved 0 tasks. Evaluate the benchmark before it evaluates your agent. If you’re picking models by leaderboard score alone, you’re optimizing for the wrong thing. 🧵

English

170

279

3.8K

948.6K

Seungjun@dev_seungjun·4d

NotebookLM genuinely helps you understand long documents fast. I thought it was just hype, but today I got through a whole 70+ page document about AI Agents in 30 minutes. It's different from just dropping a document into a chat and having an ongoing conversation. NotebookLM gives you a mind map, so you can cover the entire document at a high level first and then keep diving into smaller and smaller details. Plus, the quizzes it generates help you consolidate your understanding.

English

Seungjun@dev_seungjun·5d

Imagine having 8x NVIDIA B200s and running Qwen3.5-397B locally 24/7. Claude 4.5 Opus-level performance with zero rate limits and a 1M context window. The ultimate local setup.

English

Seungjun retweeted

DAIR.AI@dair_ai·5d

NEW paper from Google on multi-agent research agents. It's one of the first systems that handles end-to-end LaTeX generation, targeted literature reviews, and conceptual diagrams as a decoupled, standalone writer. Automated research frameworks can run experiments, but their writing modules remain the weakest link. Literature reviews are shallow, citations are sparse, and no system generates conceptual diagrams. This new research introduces a standalone writing framework that addresses all of this. PaperOrchestra is a multi-agent system that transforms unconstrained pre-writing materials, raw ideas, experimental logs, notes, into submission-ready LaTeX manuscripts. It uses specialized agents for deep literature synthesis, plot generation, conceptual diagram creation, and iterative refinement. The team also releases PaperWritingBench, the first standardized benchmark with reverse-engineered materials from 200 top-tier AI conference papers. Why does it matter? In side-by-side human evaluations, PaperOrchestra achieved absolute win rate margins of 50 to 68% in literature review quality and 14 to 38% in overall manuscript quality over autonomous baselines. Paper: arxiv.org/abs/2604.05018 Learn to build effective AI agents in our academy: academy.dair.ai

English

464

122.6K

Seungjun@dev_seungjun·5d

@gbmksquare Ooh, what did you have?

Korean

Bummo Koo@gbmksquare·5d

Zoning out by the fire 🔥

Korean

178

Seungjun@dev_seungjun·5d

Anyone else feeling like Claude Opus 4.6 started responding in a longer format?

English

Seungjun retweeted

Google for Developers@googledevs·6d

A new PyTorch-native backend is coming to unlock the power of Google TPUs: ✨ Run existing PyTorch with minimal code changes. ✨ Get a 50-100%+ performance boost with Fused Eager mode. Read the engineering deep dive here: goo.gle/4vbTQQl #TorchTPU #PyTorch #MLOps #AI

English

119

776

51.7K

Seungjun retweeted

TestingCatalog News 🗞@testingcatalog·6d

BREAKING 🚨: Z AI released GLM-5.1, an open-source model with top tier coding performance! “Number 1 in open source and number 3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo.” “Runs autonomously for 8 hours, refining strategies through thousands of iterations.”

Z.ai@Zai_org

Introducing GLM-5.1: The Next Level of Open Source - Top-Tier Performance: #1 in open source and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo. - Built for Long-Horizon Tasks: Runs autonomously for 8 hours, refining strategies through thousands of iterations. Blog: z.ai/blog/glm-5.1 Weights: huggingface.co/zai-org/GLM-5.1 API: docs.z.ai/guides/llm/glm… Coding Plan: z.ai/subscribe Coming to chat.z.ai in the next few days.

English

804

69.4K

Seungjun@dev_seungjun·6d

@rutu_3 How?

Giyu@rutu_3·6d

Uninstalling VS Code can increase your productivity by 85%

English

160

Seungjun retweeted

Suraj Sharma@suraj_sharma14·6d

$3,850/week. $15K/mo in compute. No PhD required.
OpenAI Safety Fellowship is hunting sharp minds:
What you'll do:
• Full-time empirical AI safety research (Sept '26 – Feb '27)
• Ship a paper, benchmark, or dataset by program end
• Work with OpenAI's safety + alignment teams
What you get:
• ~$200K annualized stipend
• ~$15K/mo compute budget
• Visa support (J-1, F-1/CPT, OPT)
• Berkeley workspace (remote possible)
Deadline: May 3, 2026 (11:59PM AoE)
Apply now 👇 openai.com/index/introduc…
#AISafety #OpenAI #ResearchFellowship #Alignment @OpenAI

English

319

21.9K

Seungjun@dev_seungjun·6d

Did a full MLOps cycle today.
> Data pipeline (BigQuery + Spark)
> fine-tuned LLaMA-3-8B base (DeepSpeed + LoRA, 1x A100)
> HumanEval: 30.7% / 58.6% / 69.9% (pass@1/5/10)
> quantization (AWQ INT4)
> serving (vLLM)
> FastAPI gateway (rate limiting, logging, OpenAI fallback)
> Vue frontend

English
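
The pass@1/5/10 figures in the post above are usually computed with the unbiased pass@k estimator from the HumanEval paper: draw n samples per problem, count the c that pass the tests, and take 1 - C(n-c, k) / C(n, k). The post does not say how its numbers were produced, so the sketch below is an assumption about the method, and the sample counts in the usage lines are made up for illustration.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples failed, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(results, k):
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)

# Hypothetical counts: three problems, 10 samples each.
results = [(10, 3), (10, 0), (10, 10)]
for k in (1, 5, 10):
    print(k, mean_pass_at_k(results, k))
```

Averaging per-problem estimates this way is why pass@k grows with k: more attempts give more chances that at least one sample passes.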

Seungjun@dev_seungjun·Apr 5

Finally understand why everyone here keeps saying buy a GPU or Mac Mini for local LLMs. Had an unused M1 MacBook sitting around. Ran Gemma 4 - E2B on it, exposed it over the network, accessed it from my current MBP. The experience feels surprisingly fast and good. Gonna save up and buy a proper GPU. Run Qwen3.5-397B or Gemma4-31B someday 👀

English

147

Seungjun@dev_seungjun·Apr 5

Also wrote a very short & concise Medium post on how to run LLMs locally on a MacBook. (< 1 minute to read) medium.com/p/59e9380f9ba6…

Seungjun@dev_seungjun

Running Gemma-4 2B (8-bit Quantized) locally on my M3 MBP via llama.cpp. Hitting a smooth 31 tokens/sec! 🚀

English

170

Seungjun@dev_seungjun·Apr 5

Running Gemma-4 2B (8-bit Quantized) locally on my M3 MBP via llama.cpp. Hitting a smooth 31 tokens/sec! 🚀

English

286
