chriswireless

1.6K posts

@chriswireless1

Kaiserslautern, Germany · Joined June 2012
179 Following · 42 Followers
chriswireless (@chriswireless1)
@pierceboggan Subscription: I can’t subscribe. What is the use of the app if we can’t subscribe?
English
0
0
0
33
Pierce Boggan (@pierceboggan)
What should we improve on the new GitHub Copilot app? gh.io/app
English
75
12
70
10.6K
GitHub (@github)
No more waitlist. The GitHub Copilot app's technical preview is now available to everyone currently on Copilot Pro, Pro+, Max, Business, and Enterprise plans. This agent-native desktop experience lets you decide what agents work on, how they work, and what ships. Go from issue to merge all in one place. ✨
English
65
102
712
143.5K
chriswireless retweeted
Patrick Jiang (@patpcj)
Introducing Harness-1, a 20B search agent trained with a state-externalizing harness.
> frontier-level long-horizon search, rivaling Opus-4.6 and outperforming GPT-5.4
> Context-1-level cost and latency
> externalizes candidates, evidence, verification, and search history
> open-source
English
90
273
3K
264K
chriswireless retweeted
Andrew Ng (@AndrewYNg)
New course on serving LLMs efficiently -- how do you serve models to many concurrent users at low latency and reasonable cost? This short course is built with @RedHat and taught by @cedricclyburn.

Efficient LLM serving requires efficient memory management. A 70B-parameter model takes ~140 GB just to load the weights. On top of that, every active request needs its own chunk of GPU memory, the KV cache, to store the token context it has built up so far.

In this course, you'll learn to reduce a model's memory footprint with quantization and serve it using vLLM, which handles many concurrent requests efficiently through smart memory management.

Skills you'll gain:
- Quantize a model and measure the accuracy tradeoff
- Serve a model with vLLM and watch it handle concurrent requests efficiently
- Benchmark your deployment and make informed tradeoffs between speed, cost, and accuracy

Join and learn to serve LLMs efficiently: deeplearning.ai/courses/fast-a…
English
93
141
1.1K
104.4K
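The memory figures in the post above can be checked with a little arithmetic. The following is a minimal sketch, not taken from the course: it assumes 16-bit weights for the ~140 GB figure, and the layer count, KV-head count, and head dimension are hypothetical values chosen only to illustrate how a per-request KV cache grows with context length.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Memory needed just to hold the weights, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, bytes_per_value: int = 2) -> int:
    """Bytes of KV cache one token occupies for a single request.

    The factor 2 accounts for storing one key vector and one value vector
    per layer and per KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


if __name__ == "__main__":
    params = 70_000_000_000  # the 70B-parameter model from the post

    # 16-bit weights (assumption): 2 bytes per parameter.
    print("16-bit weights (GB):", weight_memory_gb(params, 16))
    # 4-bit quantized weights (assumption): half a byte per parameter.
    print("4-bit weights (GB):", weight_memory_gb(params, 4))

    # Hypothetical architecture, used only for illustration.
    per_token = kv_cache_bytes_per_token(num_layers=80, num_kv_heads=8, head_dim=128)
    context_tokens = 4096
    print("KV cache per request (GB):", per_token * context_tokens / 1e9)
```

Under these assumptions, quantizing to 4 bits cuts the weight footprint to a quarter, while the KV cache scales with the number of concurrent requests and their context lengths, which is why serving many users is a memory-management problem.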
chriswireless retweeted
Rohit Ghumare (@ghumare64)
AI Engineering from Scratch comes with 30+ Capstone Projects! Why they're special: you learn to build your own GPT, LLM, OPENCLAW, AGENT, HARNESS, and many more. aiengineeringfromscratch.com
English
1
52
285
15.3K
chriswireless retweeted
Alexey Grigorev (@Al_Grigor)
If you found this post helpful, follow me for more content like this. I publish a weekly newsletter where I share practical insights on data and AI. It focuses on projects I'm working on + interesting tools and resources I've recently tried: alexeyondata.substack.com
English
0
1
2
525
chriswireless retweeted
Deep-ML (@real_deep_ml)
We just launched a new project that teaches you how to build Flash Attention with CUDA, step by step. By the end, you’ll have a working Flash Attention kernel built from the ground up.

The project covers:
- CUDA primitives warm-up
- Matrix operations
- Naive attention baseline
- Online softmax math
- Tiled attention building blocks
- Fused Flash Attention kernel
- Causal Flash Attention

It will be open to everyone for the first 2 weeks, then it will become part of our premium projects.
English
18
108
1.2K
47.1K
chriswireless retweeted
Roy van Rijn (@royvanrijn)
For curious developers 🧠 I built "The Anatomy of an LLM", an interactive explainer showing how text becomes tokens, vectors, attention, transformer blocks, and finally generated text. royvanrijn.com/anatomy-of-an-…
English
12
154
942
71.9K
chriswireless retweeted
Deep-ML (@real_deep_ml)
Build your own GPT model from scratch using only NumPy. We broke the process down into small, approachable problems that each take around 2–20 minutes to solve. The goal: make it possible for anyone to not just build a GPT, but truly understand how it works under the hood. By the end, you’ll walk away with code that can train a GPT model from scratch.
English
4
17
174
6.6K
chriswireless retweeted
Aleksa Gordić (水平问题) (@gordic_aleksa)
new in-depth blog post time: Inside the Transformer: The Life of a Token

a deep dive into a modern dense transformer, i cover YaRN (why does pairwise coordinate rotation induce positional information?), hybrid attention (getting to 160k context length), soft capping, QK normalization, etc. as the token flows through the transformer

bonus transformer math: FLOPs/token formula (and when is 6N formula broken), cluster sizing (how big of a cluster do you need given the model/data size and experiment throughput of interest), and more
English
22
143
1K
49K
chriswireless retweeted
Ahmad (@TheAhmadOsman)
DROP EVERYTHING

The ultimate step-by-step projects roadmap for BECOMING an AI Researcher is now available online to read FOR FREE

Covers building
- Tokenizers / embeddings
- Positional methods
- Attention / multi-head attention
- Transformer blocks
- Training loops / objectives
- Sampling dashboards
- Speculative decoding
- KV cache / MQA / GQA / MLA
- Long context
- FlashAttention / hardware budgets
- MoE routers
- State-space / diffusion LMs
- Data pipelines / synthetic data
- Scaling laws
- SFT / DPO / RLHF / GRPO / RLVR
- Quantization
- Serving systems
- Evaluation harnesses
- RAG / tools / agents
- Multimodal adapters
- Interpretability / safety
- Full capstone model system

The loop for every project
- Build it
- Plot it
- Break it
- Explain it
- Ship the artifact

You should read this, and if you cannot now then you most definitely wanna bookmark it for later

DM me when you're working at a frontier lab
Ahmad@TheAhmadOsman

x.com/i/article/2058…

English
16
67
475
45.4K
chriswireless retweeted
Andrew Ng (@AndrewYNg)
New course: Build agents that respond to users with not only plaintext, but custom UIs like charts, forms, and whiteboards, generated on demand and displayed right in the chat. This short course is built in partnership with @CopilotKit and taught by @ataiiam, co-founder of CopilotKit.

You'll learn three approaches: Your agent can pick from custom components you build, like charts and forms. It can compose new layouts from a set of building blocks you provide, like rows, cards, and text. Or it can incorporate existing third-party apps, like a whiteboard or a calendar, right inside the conversation.

Skills you’ll gain:
- Build agents that render custom components like charts and forms on demand
- Build an app where the agent and user collaborate on shared data, beyond just the chat window
- Place third-party apps like maps, calendars, and whiteboards right in your interface

Join and build agents that give users something to see and act on! deeplearning.ai/short-courses/…
English
103
233
1.4K
205.3K
chriswireless retweeted
Amit Shekhar (@amitiitbhu)
- Math behind Attention - Q, K, and V
- Math behind √dₖ Scaling Factor in Attention
- Math Behind Backpropagation
- Math Behind Gradient Descent
- Math Behind Cross-Entropy Loss
- Math Behind RoPE (Rotary Position Embedding)
- RMSNorm (Root Mean Square Layer Normalization)
Amit Shekhar@amitiitbhu

x.com/i/article/2058…

English
0
32
234
18.8K
chriswireless retweeted
Ahmad (@TheAhmadOsman)
INCREDIBLE

The MOST COMPLETE GUIDE for understanding LLMs from first principles is now available online to read for free

Covers the model mechanics
- Tokens / tokenizers
- Transformers
- Attention
- KV cache
- Prefill vs decode
- Decoding controls
- Model packages
- Chat templates
- Long context
- RAG
- Agents / tools
- Fine-tuning
- Multimodal models

Then connects that to running models locally
- What "local" really means
- Open-weight vs opensource
- Quantization
- VRAM math
- Hardware tiers
- File formats / load safety
- Runtimes / serving modes
- Model selection
- Privacy
- Failure modes
- Benchmarks
- Practical setup paths

You should read this, and if you cannot now then you most definitely wanna bookmark it for later

Opensource AI FTW
Ahmad@TheAhmadOsman

x.com/i/article/2057…

English
28
177
1.2K
186.7K
chriswireless retweeted
Pau Labarta Bajo (@paulabartabajo_)
60-minute deep dive on how to build a 100% local voice home assistant that maps your voice directly to actions (plus slides!). Enjoy ↓
English
7
35
137
5.8K
chriswireless retweeted
Shubham Sharma (@HappyyPablo)
open sourcing Marlin-2B 🐟 a tiny VLM to extract structured information from videos Marlin is finetuned for two questions devs want to ask in their videos: what is happening, and when? Best open model in its weight class, competitive with Gemini-2.5-flash at only 2B params 🧵
English
135
522
4.6K
306.3K
chriswireless retweeted
Ajit kumar (@ajitcodes)
These 9 lectures from Stanford University are the BEST for anyone wanting to learn and understand LLMs in depth

Lecture 1 - Transformer: lnkd.in/dGnQW39t
Lecture 2 - Transformer-Based Models & Tricks: lnkd.in/dT_VEpVH
Lecture 3 - Transformers & Large Language Models: lnkd.in/dwjjpjaP
Lecture 4 - LLM Training: lnkd.in/dSi_xCEN
Lecture 5 - LLM tuning: lnkd.in/dUK5djpB
Lecture 6 - LLM Reasoning: lnkd.in/dAGQTNAM
Lecture 7 - Agentic LLMs: lnkd.in/dWD4j7vm
Lecture 8 - LLM Evaluation: lnkd.in/ddxE5zvb
Lecture 9 - Recap & Current Trends: lnkd.in/dGsTd8jN

Start understanding LLMs in depth from the experts. Go through each step-by-step video
English
17
379
1.3K
57.8K