

Tae Hwan Jung
@graykoder
Former AI Engineer | Web3 Developer & Degen: https://t.co/MkJcrSYUac




Day 131/365 of GPU Programming

I've been spending time today working on inference for Qwen3.5 (24 GatedDeltaNet layers and 8 GatedAttention layers in a 3:1 pattern), with the goal of reducing latency on my local Nvidia machine without too much of a hit on benchmark quality. Some notes to self from optimizing inference for a hybrid mamba+attention model:

- I'm learning that K and V head counts can differ inside the linear-attention block. For example, this model has 16 K heads but 32 V heads (GQA2 inside GDN). From what I can tell, a lot of kernels out there assume k_heads == v_heads, so they need modifications before they can be used in this setting (rough sketch below).
- Also noticed that moving AWQ from g32 to g128 can change quality benchmarks by quite a few percentage points. The g128 recipe is less aggressive but recoverable with the right calibration data.
- Learning that the calibration data itself is a decision point. Switching from raw web text to an instruction-blended corpus seems to preserve instruction-following accuracy better at the same bit width (idk, maybe that's obvious to others).

A great resource on the Qwen3.5 model family is @rasbt's amazing Qwen3.5 0.8B From Scratch. Really recommend going through the Jupyter notebook to get a better feel for the model architecture.
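
Minimal sketch of the head-mismatch point above. Illustrative only (the helper name and shapes are my assumptions, not the actual Qwen3.5 / GDN kernel code): repeat each K head across its group so a kernel written for k_heads == v_heads still applies.

```python
import torch

def expand_k_heads(k: torch.Tensor, num_v_heads: int) -> torch.Tensor:
    """Repeat K heads so a kernel that assumes k_heads == v_heads still works.

    k: (batch, num_k_heads, seq_len, head_dim), e.g. num_k_heads = 16
    Returns: (batch, num_v_heads, seq_len, head_dim), e.g. num_v_heads = 32

    With GQA2 inside the linear-attention block, each K head is shared by
    group_size = num_v_heads // num_k_heads = 2 consecutive V heads.
    """
    _, num_k_heads, _, _ = k.shape
    assert num_v_heads % num_k_heads == 0, "V heads must be a multiple of K heads"
    group_size = num_v_heads // num_k_heads
    # repeat_interleave keeps each K head adjacent to the V heads that share it
    return k.repeat_interleave(group_size, dim=1)


if __name__ == "__main__":
    # Illustrative shapes only; the real model's head_dim and seq_len differ.
    k = torch.randn(1, 16, 128, 64)   # 16 K heads
    v = torch.randn(1, 32, 128, 64)   # 32 V heads
    k_expanded = expand_k_heads(k, num_v_heads=v.shape[1])
    assert k_expanded.shape == v.shape  # a k_heads == v_heads kernel now applies
```

The memory-friendlier alternative is to fix the kernel's indexing so each V head maps back to its K group (v_head_idx // group_size) instead of materializing the repeated tensor; which one wins depends on the kernel.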

DS4 by @antirez is a great project! It would be great if @Apple sent him an M5 Max 128GB MacBook so he can tune the Metal 4 kernels and make prefill faster on new hardware. 🙏

Tested out @antirez's ds4.c this morning. So impressive, and it delivers. On an M3 Max, 128GB, stock ds4 settings:

- 14–15 t/s at 62K pre-filled tokens of an actual coding conversation
- memory usage was flat during generation, ~85GB resident
- disk cache is ~8GB for a full 100K context window
- thermals were normal, light fan activity
- inference server is rock solid so far

Biggest constraint: anytime there's a compact, we pay the wait-time price of a fresh prefill (~1 min per 10K of context) before we're back in action (rough arithmetic below). Performance with sequential inference + multiple agents in parallel is unclear, will report back. I'm so amped.
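
Back-of-the-envelope for that compact tax, using only the figures quoted above (the per-10K prefill cost and decode speed are this run's observations, not fixed constants):

```python
# Rough estimate of how long a fresh prefill costs after a context compact,
# based on the numbers reported in the post above.
PREFILL_MIN_PER_10K_TOKENS = 1.0   # ~1 minute per 10K tokens of context
DECODE_TOKENS_PER_SEC = 14.5       # 14-15 t/s observed at 62K context

def compact_wait_minutes(context_tokens: int) -> float:
    """Minutes spent re-prefilling the whole context after a compact."""
    return context_tokens / 10_000 * PREFILL_MIN_PER_10K_TOKENS

if __name__ == "__main__":
    for ctx in (10_000, 62_000, 100_000):
        wait = compact_wait_minutes(ctx)
        # e.g. a 62K coding conversation costs roughly 6 minutes of prefill
        print(f"{ctx:>7} ctx tokens -> ~{wait:.1f} min prefill before decoding "
              f"resumes at ~{DECODE_TOKENS_PER_SEC:.0f} t/s")
```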

A PyPI package with 1.1 million monthly downloads was hijacked to distribute info-stealing malware. Attackers published a malicious version of the popular elementary-data package on the Python Package Index (PyPI) that steals developers' sensitive data and cryptocurrency wallets; the compromised release is version 0.23.3.
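
If you want to check a local environment against that release, a minimal sketch (assuming the affected version is exactly 0.23.3 as the post states, and that the distribution is installed under the name elementary-data):

```python
from importlib.metadata import PackageNotFoundError, version

AFFECTED = "0.23.3"  # the compromised release named in the post

try:
    installed = version("elementary-data")
except PackageNotFoundError:
    print("elementary-data is not installed in this environment")
else:
    if installed == AFFECTED:
        print(f"WARNING: elementary-data {installed} matches the compromised release")
    else:
        print(f"elementary-data {installed} installed; compromised release is {AFFECTED}")
```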


🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.

The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise, and it is textbook supply-chain installer malware. axios has 100M+ weekly downloads; every npm install pulling the latest version is potentially compromised right now.

Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
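
For the "audit your lockfiles" step, a minimal sketch (assuming an npm v2/v3 package-lock.json layout; the package names are the ones called out above) that reports every pinned axios version and flags any occurrence of plain-crypto-js:

```python
import json
from pathlib import Path

LOCKFILE = Path("package-lock.json")
SUSPECT = "plain-crypto-js"   # malicious package named in the post

lock = json.loads(LOCKFILE.read_text())

# npm v2/v3 lockfiles list every resolved package under the "packages" key,
# keyed by its node_modules path ("" is the project root itself).
packages = lock.get("packages", {})

axios_versions = sorted(
    {meta.get("version", "?") for path, meta in packages.items()
     if path.endswith("node_modules/axios")}
)
suspect_hits = [path for path in packages if path.endswith(f"node_modules/{SUSPECT}")]

print(f"axios versions pinned in lockfile: {axios_versions or 'none found'}")
if suspect_hits:
    print(f"WARNING: {SUSPECT} is present at: {suspect_hits}")
else:
    print(f"{SUSPECT} not found in the lockfile")
```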

We’ve identified a security incident that involved unauthorized access to certain internal Vercel systems, impacting a limited subset of customers. Please see our security bulletin: vercel.com/kb/bulletin/ve…