Tae Hwan Jung

128 posts

Tae Hwan Jung

@graykoder

Former AI Engineer | Web3 Developer & Degen: https://t.co/MkJcrSYUac

Joined October 2021
938 Following · 270 Followers
Tae Hwan Jung
Tae Hwan Jung@graykoder·
sharing your learning process like this is really awesome. just a simple idea, but what if you turned the code or concepts you studied into public github repos? I’d definitely give them a star. for example, here's a repo I made around 7–8 years ago when I was studying nlp and ai models as an ai researcher: github.com/graykode/nlp-t…
levi
levi@levidiamode·
Day 132/365 of GPU Programming

Continuing my Qwen inference experiments on my local GPU. A few more things I learned today while working on inference latency:

- If your deployment has a fixed param budget, the drafter model has to fit inside it. Native multi-token prediction heads that ship with the model (already counted in the param budget) seem to be a better baseline than adding an external drafter when budgets are tight. If you do need an external drafter, pruning compensating params from the main model first to keep totals constant seems to be something that's used in practice.
- Naive layer-pruning rankings from FP16 don't transfer to AWQ. I tried dropping attention layers with low BlockInfluence scores (which seems to work on raw FP16). The redundant-looking capacity in FP16 seems to get compressed away by AWQ's calibration, so a layer that looked droppable isn't droppable anymore.

Lots more to try over the next few days!
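For the layer-ranking step mentioned above, a BlockInfluence-style score is usually computed as 1 minus the cosine similarity between a layer's input and output hidden states: layers that barely change the representation score low and look like pruning candidates. A minimal NumPy sketch (the function name and toy shapes here are mine, not from any particular codebase):

```python
import numpy as np

def block_influence(hidden_in: np.ndarray, hidden_out: np.ndarray) -> float:
    """BlockInfluence-style score: 1 - mean cosine similarity between a
    layer's input and output hidden states over a batch of tokens.
    A low score means the layer changes the representation little,
    so it looks like a pruning candidate."""
    num = (hidden_in * hidden_out).sum(axis=-1)
    den = np.linalg.norm(hidden_in, axis=-1) * np.linalg.norm(hidden_out, axis=-1)
    return float(1.0 - (num / den).mean())

# toy check: a near-identity "layer" scores lower (more droppable)
# than one that rewrites the hidden states entirely
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 16))
near_identity = block_influence(h, h + 0.01 * rng.normal(size=h.shape))
disruptive = block_influence(h, rng.normal(size=h.shape))
assert near_identity < disruptive
```

The caveat in the tweet is exactly that this ranking is computed on FP16 activations, so it can stop reflecting which layers are safe to drop once AWQ calibration has squeezed out the redundancy.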
levi@levidiamode

Day 131/365 of GPU Programming

I've been spending time today working on inference for Qwen3.5 (24 GatedDeltaNet layers and 8 GatedAttention layers in a 3:1 pattern), with the goal of reducing latency on my local Nvidia machine without too much of a hit on benchmark quality. Some notes to self from optimizing inference for a hybrid mamba+attention model:

- K/V head counts can differ inside the linear-attention block. For example, this model has 16 K heads but 32 V heads (GQA2 inside GDN). From what I can tell, a lot of kernels out there assume k_heads == v_heads, so they need modifications before they can be adopted in such a setting.
- Moving AWQ from g32 to g128 can change quality benchmarks by quite a few percentage points. The g128 recipe is less aggressive but recoverable with the right calibration data.
- Calibration data itself is a decision point. Switching from raw web text to an instruction-blended corpus seems to preserve instruction-following accuracy better at the same bit width (maybe that's obvious to others).

A great resource on the Qwen 3.5 model family is @rasbt's amazing Qwen3.5 0.8B From Scratch. Really recommend going through the jupyter notebook to get a better feel for the model architecture.
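The k_heads != v_heads mismatch in the first bullet can be illustrated with a toy broadcast: each K head is shared by `num_v_heads // num_k_heads` V/query heads, so a kernel that assumes equal counts needs K expanded (or indexed per group) first. A minimal NumPy sketch with scaled-down shapes; plain softmax attention stands in here for GDN's actual linear recurrence, and the grouping convention (consecutive heads share a K head) is an assumption:

```python
import numpy as np

# toy sizes standing in for the 16 K heads / 32 V heads in the tweet
num_k_heads, num_v_heads, seq, head_dim = 4, 8, 5, 3
group = num_v_heads // num_k_heads  # V heads per shared K head (GQA2 -> 2)

rng = np.random.default_rng(0)
q = rng.normal(size=(num_v_heads, seq, head_dim))
k = rng.normal(size=(num_k_heads, seq, head_dim))   # fewer K heads than V heads
v = rng.normal(size=(num_v_heads, seq, head_dim))

# a kernel assuming k_heads == v_heads breaks on these shapes;
# one fix is to expand K so consecutive V heads share a K head
k_expanded = np.repeat(k, group, axis=0)            # (num_v_heads, seq, head_dim)

scores = q @ k_expanded.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)      # softmax over keys
out = weights @ v
assert out.shape == (num_v_heads, seq, head_dim)
```

In a real kernel you would index into the shared K head per group rather than materializing the expanded copy, but the shape mismatch is the same either way.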

Tae Hwan Jung retweeted
Titan Builder 🌕👷‍♂️
1/3 PropAMM liquidity is now fully operational on Ethereum mainnet! Three makers are live in every Titan block, and quotes are already consistently beating Binance VIP9 taker fees for retail orders (trades <$1k).
Tae Hwan Jung retweeted
antirez
antirez@antirez·
Appreciate Ivan's tweet. To put this into context, to build DS4 I used: my MacBook M3 Max (mine, 8k euros), one M3 Ultra with 512 GB (got access, 10k euros), and one DGX Spark (got access, 4k euros?). Are we far from the times when all you needed to do hacking was a computer? That's sad.
Ivan Fioravanti ᯅ@ivanfioravanti

DS4 by @antirez is a great project! It would be great if @Apple would share an M5 Max 128GB MacBook with him to tune the Metal 4 kernels to make prefill faster on new hardware. 🙏

Tae Hwan Jung
Tae Hwan Jung@graykoder·
@antirez this bro is actually a legend engineer
pradeep@pradeep24

tested out @antirez's ds4.c this morning. so impressive, and it delivers. on an M3 Max, 128GB, stock ds4 settings:
- 14–15 t/s at 62K pre-filled actual coding conversation
- memory usage was flat during gen, ~85GB resident
- disk cache is ~8GB for a full 100K context window
- thermals were normal, light fan activity
- inference server is rock solid so far

biggest constraint: anytime there's a compact, we pay the wait-time price of a fresh prefill (~1min per 10K context) before we're back in action. sequential inference + multiple-agents-in-parallel performance is unclear, will report back. I'm so amped.
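The "compact" cost in that report is easy to put in perspective with back-of-envelope arithmetic using only the numbers in the tweet:

```python
# ~1 minute of prefill per 10K tokens of context (figure from the tweet)
prefill_min_per_10k = 1.0

# re-prefilling the full 100K-token context window after a compaction
context_tokens = 100_000
wait_minutes = prefill_min_per_10k * context_tokens / 10_000
assert wait_minutes == 10.0  # roughly ten minutes before generation resumes

# the 62K conversation actually measured would cost about 6 minutes
assert prefill_min_per_10k * 62_000 / 10_000 == 6.2
```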

soothsayer
soothsayer@iamasoothsayer·
2023: Corona ended
2026: Hantavirus
Tae Hwan Jung retweeted
josh lee
josh lee@dogemos·
Privacy arc incoming. Calling it Project Zeplr for now.
Tae Hwan Jung
Tae Hwan Jung@graykoder·
after the axios incident, I started building packvet, an ai agent that vets dependencies before install and flags real risks. if you’re into dev tooling / security, let’s talk!
Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.

The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply-chain installer malware.

axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
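The "pin your version" advice maps to a small package.json change: with npm v8+, an `overrides` entry forces every transitive resolution of a package to an exact version. A minimal sketch (the version number below is illustrative only; verify the actual last known-good release yourself):

```json
{
  "dependencies": {
    "axios": "1.14.0"
  },
  "overrides": {
    "axios": "1.14.0"
  }
}
```

After editing, `npm ls axios` shows which versions the lockfile actually resolves, so you can confirm nothing still pulls the compromised release.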

Tae Hwan Jung
Tae Hwan Jung@graykoder·
My first +1K from vibe coding!
Tae Hwan Jung
Tae Hwan Jung@graykoder·
In my view: in the past, a great engineer was defined by spending 10+ years mastering a single tech stack. Going forward, it's more about how broadly you've explored different domains, how many real problems you've actually solved, and how many times you've gone through the full cycle of building and shipping products.
Subin An
Subin An@subinium·
a lot of what we call tacit knowledge will get converted into explicit knowledge. running arbitrage in that gap is a solid career too, but i'm chasing a life where, once i've shared as much of my knowledge as i can, i go looking for patterns outside the ones i was stuck in. and along the way i keep realizing how abstracted and how small the knowledge i thought i had actually was. in q1 i was picturing a world where developers disappear and my skills stop mattering. now i'm just riding the irreversible wave of ai. /vibesubin
Ritesh Roushan
Ritesh Roushan@devXritesh·
Unpopular opinion: Python is a terrible first language.

It teaches you:
- Ignore types
- Ignore memory
- Ignore compilation
- Ignore performance

Then you hit a real job and:
"Why is this slow?"
"What's a pointer?"
"Types? I just use 'any'"

Learn C first. Suffer early. Everything else becomes easy.
Tae Hwan Jung retweeted
GitHubDaily
GitHubDaily@GitHub_Daily·
When you have multiple Claude Code session windows open for different projects and want to check token consumption and rate-limit status at any time, switching back and forth between terminals is a real hassle.

I came across abtop on GitHub, a tool that gives AI coding assistants an intuitive system monitoring dashboard. It shows the status of all Claude Code and Codex CLI sessions on a single terminal screen, including token usage, context-window occupancy, rate limits, and more.

GitHub: github.com/graykode/abtop

It reads process state purely locally and requires no API keys, which keeps your data private. If you run it inside a tmux terminal, you can also jump to the corresponding session with one keystroke. Great for anyone who often has multiple AI coding assistants open.
clara 🍯
clara 🍯@claraexmachina·
Today’s my last day at @keplrwallet after 3.5 years. I used to joke that I was “dedicating my 30s to Keplr,” but I truly meant it every time I said it, haha.

Grateful to have led this team, and to everyone who helped me grow, especially through the times I fell short. Teams like this don’t come around often. We poured everything into building together, while also supporting each other’s lives beyond work, like family. I feel really lucky to have been part of it.

What’s next: taking some time to breathe, study, and explore beyond my comfort zone. I’ll likely stay in crypto, but I’m also open to diving deeper into areas like AI. Just taking the time to figure out what’s next. If you’d like to grab a coffee, feel free to reach out ☕️
Tae Hwan Jung retweeted
CyrilXBT
CyrilXBT@cyrilXBT·
THE JOB MARKET IS ABOUT TO GET WEIRD. And most people are not prepared for what is coming.

Companies in 2026 are not looking for data scientists. They are not looking for ML engineers. They are not looking for people who can build models from scratch. THEY ARE LOOKING FOR AI NERDS.

The person who walks into a meeting, sees a 4 hour manual process, and kills it in 10 minutes with Claude Code and LLMs. The person who refuses to do anything manually twice. The person who looks at every repetitive task and asks one question: Why is a human still doing this.

That mindset is worth more right now than a machine learning PhD. More than five years of Python experience. More than any certification from any university.

THE NEW VALUABLE SKILL IS NOT TECHNICAL. It is a refusal to accept inefficiency. The people who develop that refusal this year will be completely unemployable in the old way and completely irreplaceable in the new one. Which side of that line are you on.
Tae Hwan Jung retweeted
Subin An
Subin An@subinium·
Built a small toolkit for the Vercel April 2026 incident.
- audit env vars across all your projects
- rotate internal-random secrets to "sensitive" type per vendor runbooks (Supabase, Postgres, OAuth, etc.)
- generate self-contained handoff docs you can pass along mid-incident

Disclaimer: Not perfect, but use it to move fast. github.com/subinium/verce…
Vercel@vercel

We’ve identified a security incident that involved unauthorized access to certain internal Vercel systems, impacting a limited subset of customers. Please see our security bulletin: vercel.com/kb/bulletin/ve…

Tae Hwan Jung
Tae Hwan Jung@graykoder·
thanks! the handoff is intentionally indirect - agents don't call each other, they communicate through a shared comment thread + git-versioned workspace. Analyst runs the research loop, auto-commits every code change, and each run links to a commit hash. when a review is triggered, RM gets the diff + metrics + dataset context and posts an approve/reject verdict as a comment. RM never sees Analyst's intermediate thinking — only the committed artifact. adversarial review, not collaborative. that's what makes bias detection actually work.
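The handoff protocol described here (agents that never call each other, a shared comment thread, and a reviewer that only ever sees the committed artifact) can be sketched in a few lines. Everything below is a hypothetical illustration; the class names, the `review` function, and the drawdown-threshold verdict rule are all invented for the sketch, not taken from the actual system:

```python
from dataclasses import dataclass, field

@dataclass
class Comment:
    """One entry in the shared thread: who posted, which commit, and a payload."""
    author: str
    commit_hash: str
    body: dict

@dataclass
class Thread:
    """The only communication channel; agents never invoke each other directly."""
    comments: list = field(default_factory=list)

    def post(self, author: str, commit_hash: str, body: dict) -> None:
        self.comments.append(Comment(author, commit_hash, body))

def review(thread: Thread, max_drawdown: float = 0.1) -> str:
    """RM reads only the latest committed artifact (diff + metrics),
    never the Analyst's intermediate reasoning, and posts a verdict."""
    artifact = thread.comments[-1]
    ok = artifact.body["metrics"]["drawdown"] <= max_drawdown
    verdict = "approve" if ok else "reject"
    thread.post("RM", artifact.commit_hash, {"verdict": verdict})
    return verdict

# Analyst finishes a research loop: auto-commit, then post the artifact
thread = Thread()
thread.post("Analyst", "a1b2c3d", {"metrics": {"drawdown": 0.04}, "diff": "+12 -3"})
assert review(thread) == "approve"
```

The adversarial property falls out of the data flow: because `review` can only see what was committed and posted, the reviewer's verdict cannot be anchored on the Analyst's reasoning.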
Yuri | Queiroz
Yuri | Queiroz@YuriOQueiroz·
@graykoder love seeing paperclip-style architectures applied to quant. the specialist-agent approach is underrated in financial systems because you can isolate risk at each decision layer. what does your orchestration handoff look like between agents?
Tae Hwan Jung
Tae Hwan Jung@graykoder·
paperclip-style multi-agent workflow for quant trading