chriswireless

1.6K posts

@chriswireless1

Kaiserslautern, Germany · Joined June 2012

179 Following · 42 Followers

chriswireless@chriswireless1·2d

@pierceboggan Subscription: I can’t subscribe. What is the use of it all if we can’t subscribe?

Pierce Boggan@pierceboggan·2d

What should we improve on the new GitHub Copilot app? gh.io/app

10.6K

chriswireless@chriswireless1·3d

@github If only I could subscribe

361

GitHub@github·3d

No more waitlist. The GitHub Copilot app's technical preview is now available to everyone currently on Copilot Pro, Pro+, Max, Business, and Enterprise plans. This agent-native desktop experience lets you decide what agents work on, how they work, and what ships. Go from issue to merge all in one place. ✨

102

712

143.5K

chriswireless retweeted

Patrick Jiang@patpcj·6 Jun

Introducing Harness-1, a 20B search agent trained with a state-externalizing harness.
> frontier-level long-horizon search, rivaling Opus-4.6 and outperforming GPT-5.4
> Context-1-level cost and latency
> externalizes candidates, evidence, verification, and search history
> open-source

273

264K

chriswireless retweeted

Andrew Ng@AndrewYNg·4 Jun

New course on serving LLMs efficiently -- how do you serve models to many concurrent users at low latency and reasonable cost? This short course is built with @RedHat and taught by @cedricclyburn.
Efficient LLM serving requires efficient memory management. A 70B-parameter model takes ~140 GB just to load the weights. On top of that, every active request needs its own chunk of GPU memory, the KV cache, to store the token context it has built up so far.
In this course, you'll learn to reduce a model's memory footprint with quantization and serve it using vLLM, which handles many concurrent requests efficiently through smart memory management.
Skills you'll gain:
- Quantize a model and measure the accuracy tradeoff
- Serve a model with vLLM and watch it handle concurrent requests efficiently
- Benchmark your deployment and make informed tradeoffs between speed, cost, and accuracy
Join and learn to serve LLMs efficiently: deeplearning.ai/courses/fast-a…
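The ~140 GB figure in the post above is consistent with storing each of the 70B parameters in 2 bytes (FP16/BF16), although the post does not state the precision. The following is a rough sketch of that arithmetic, plus a per-request KV cache estimate. The function names and the model shape used for the KV cache (80 layers, 8 KV heads, head dimension 128, 8192-token context) are illustrative assumptions, not details from the post or the course.

```python
# Back-of-the-envelope memory estimate for serving an LLM.
# Assumptions (not from the post): 2 bytes per parameter for 16-bit weights,
# and a hypothetical 70B-class shape of 80 layers, 8 KV heads, head dim 128.

BYTES_PER_GB = 1e9  # decimal gigabytes


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB."""
    return num_params * bytes_per_param / BYTES_PER_GB


def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, bytes_per_value: int = 2) -> int:
    """KV cache bytes stored per token of context.

    The leading factor of 2 accounts for one key vector and one value
    vector per layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


fp16_gb = weight_memory_gb(70e9, 2)      # 16-bit weights
int4_gb = weight_memory_gb(70e9, 0.5)    # 4-bit quantized weights
per_token = kv_cache_bytes_per_token(80, 8, 128)
per_request_gb = per_token * 8192 / BYTES_PER_GB  # one 8192-token request

print(f"FP16 weights: {fp16_gb:.0f} GB")
print(f"INT4 weights: {int4_gb:.0f} GB")
print(f"KV cache per token: {per_token} bytes")
print(f"KV cache per 8192-token request: {per_request_gb:.2f} GB")
```

Under these assumptions, 4-bit quantization cuts the weight footprint to a quarter of the 16-bit figure, while each concurrent request still adds its own KV cache on top, which is why the post frames serving as a memory management problem.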

141

1.1K

104.4K

chriswireless@chriswireless1·2 Jun

@claudeai @ClaudeDevs what is happening I am confused.

chriswireless retweeted

Rohit Ghumare@ghumare64·30 May

AI Engineering from Scratch comes with 30+ Capstone Projects! Why they're special: you learn to build your own GPT, LLM, OPENCLAW, AGENT, HARNESS, and many more. aiengineeringfromscratch.com

285

15.3K

chriswireless retweeted

Mike Piccolo@mfpiccolo·28 May

x.com/i/article/2060…

258

2.1K

459.8K

chriswireless retweeted

Alexey Grigorev@Al_Grigor·30 May

If you found this post helpful, follow me for more content like this. I publish a weekly newsletter where I share practical insights on data and AI. It focuses on projects I'm working on + interesting tools and resources I've recently tried: alexeyondata.substack.com

525

chriswireless retweeted

Deep-ML@real_deep_ml·29 May

We just launched a new project that teaches you how to build Flash Attention with CUDA, step by step. By the end, you’ll have a working Flash Attention kernel built from the ground up.
The project covers:
- CUDA primitives warm-up
- Matrix operations
- Naive attention baseline
- Online softmax math
- Tiled attention building blocks
- Fused Flash Attention kernel
- Causal Flash Attention
It will be open to everyone for the first 2 weeks, then it will become part of our premium projects.

108

1.2K

47.1K

chriswireless retweeted

Roy van Rijn@royvanrijn·28 May

For curious developers 🧠 I built "The Anatomy of an LLM", an interactive explainer showing how text becomes tokens, vectors, attention, transformer blocks, and finally generated text. royvanrijn.com/anatomy-of-an-…

154

942

71.9K

chriswireless retweeted

Deep-ML@real_deep_ml·28 May

Build your own GPT model from scratch using only NumPy. We broke the process down into small, approachable problems that each take around 2–20 minutes to solve. The goal: make it possible for anyone to not just build a GPT, but truly understand how it works under the hood. By the end, you’ll walk away with code that can train a GPT model from scratch.

174

6.6K

chriswireless retweeted

Aleksa Gordić (水平问题)@gordic_aleksa·26 May

new in-depth blog post time: Inside the Transformer: The Life of a Token
a deep dive into a modern dense transformer, i cover YaRN (why does pairwise coordinate rotation induce positional information?), hybrid attention (getting to 160k context length), soft capping, QK normalization, etc. as the token flows through the transformer
bonus transformer math: FLOPs/token formula (and when is 6N formula broken), cluster sizing (how big of a cluster do you need given the model/data size and experiment throughput of interest), and more

143

49K

chriswireless retweeted

Ahmad@TheAhmadOsman·25 May

DROP EVERYTHING
The ultimate step-by-step projects roadmap for BECOMING an AI Researcher is now available online to read FOR FREE
Covers building
- Tokenizers / embeddings
- Positional methods
- Attention / multi-head attention
- Transformer blocks
- Training loops / objectives
- Sampling dashboards
- Speculative decoding
- KV cache / MQA / GQA / MLA
- Long context
- FlashAttention / hardware budgets
- MoE routers
- State-space / diffusion LMs
- Data pipelines / synthetic data
- Scaling laws
- SFT / DPO / RLHF / GRPO / RLVR
- Quantization
- Serving systems
- Evaluation harnesses
- RAG / tools / agents
- Multimodal adapters
- Interpretability / safety
- Full capstone model system
The loop for every project
- Build it
- Plot it
- Break it
- Explain it
- Ship the artifact
You should read this, and if you cannot now then you most definitely wanna bookmark it for later
DM me when you're working at a frontier lab

Ahmad@TheAhmadOsman

x.com/i/article/2058…

475

45.4K

chriswireless retweeted

Amit Shekhar@amitiitbhu·24 May

New Article: LLM Evaluation Read here: outcomeschool.com/blog/llm-evalu…

2.2K

chriswireless retweeted

Andrew Ng@AndrewYNg·7 May

New course: Build agents that respond to users with not only plaintext, but custom UIs like charts, forms, and whiteboards, generated on demand and displayed right in the chat. This short course is built in partnership with @CopilotKit and taught by @ataiiam, co-founder of CopilotKit.
You'll learn three approaches: Your agent can pick from custom components you build, like charts and forms. It can compose new layouts from a set of building blocks you provide, like rows, cards, and text. Or it can incorporate existing third-party apps, like a whiteboard or a calendar, right inside the conversation.
Skills you’ll gain:
- Build agents that render custom components like charts and forms on demand
- Build an app where the agent and user collaborate on shared data, beyond just the chat window
- Place third-party apps like maps, calendars, and whiteboards right in your interface
Join and build agents that give users something to see and act on! deeplearning.ai/short-courses/…

103

233

1.4K

205.3K

chriswireless retweeted

Amit Shekhar@amitiitbhu·24 May

- Math behind Attention - Q, K, and V
- Math behind √dₖ Scaling Factor in Attention
- Math Behind Backpropagation
- Math Behind Gradient Descent
- Math Behind Cross-Entropy Loss
- Math Behind RoPE (Rotary Position Embedding)
- RMSNorm (Root Mean Square Layer Normalization)

Amit Shekhar@amitiitbhu

x.com/i/article/2058…

234

18.8K

chriswireless retweeted

Ahmad@TheAhmadOsman·22 May

INCREDIBLE
The MOST COMPLETE GUIDE for understanding LLMs from first principles is now available online to read for free
Covers the model mechanics
- Tokens / tokenizers
- Transformers
- Attention
- KV cache
- Prefill vs decode
- Decoding controls
- Model packages
- Chat templates
- Long context
- RAG
- Agents / tools
- Fine-tuning
- Multimodal models
Then connects that to running models locally
- What "local" really means
- Open-weight vs opensource
- Quantization
- VRAM math
- Hardware tiers
- File formats / load safety
- Runtimes / serving modes
- Model selection
- Privacy
- Failure modes
- Benchmarks
- Practical setup paths
You should read this, and if you cannot now then you most definitely wanna bookmark it for later
Opensource AI FTW

Ahmad@TheAhmadOsman

x.com/i/article/2057…

177

1.2K

186.7K

chriswireless retweeted

Pau Labarta Bajo@paulabartabajo_·20 May

60-minute deep dive on how to build a 𝟭𝟬𝟬% 𝗹𝗼𝗰𝗮𝗹 𝘃𝗼𝗶𝗰𝗲 𝗵𝗼𝗺𝗲 𝗮𝘀𝘀𝗶𝘀𝘁𝗮𝗻𝘁 that maps your voice directly to actions (plus slides!). Enjoy ↓

137

5.8K

chriswireless retweeted

Shubham Sharma@HappyyPablo·19 May

open sourcing Marlin-2B 🐟 a tiny VLM to extract structured information from videos Marlin is finetuned for two questions devs want to ask in their videos: what is happening, and when? Best open model in its weight class, competitive with Gemini-2.5-flash at only 2B params 🧵

135

522

4.6K

306.3K

chriswireless retweeted

Ajit kumar@ajitcodes·20 May

These 9 lectures from Stanford University are the BEST for anyone wanting to learn and understand LLMs in depth
Lecture 1 - Transformer: lnkd.in/dGnQW39t
Lecture 2 - Transformer-Based Models & Tricks: lnkd.in/dT_VEpVH
Lecture 3 - Transformers & Large Language Models: lnkd.in/dwjjpjaP
Lecture 4 - LLM Training: lnkd.in/dSi_xCEN
Lecture 5 - LLM tuning: lnkd.in/dUK5djpB
Lecture 6 - LLM Reasoning: lnkd.in/dAGQTNAM
Lecture 7 - Agentic LLMs: lnkd.in/dWD4j7vm
Lecture 8 - LLM Evaluation: lnkd.in/ddxE5zvb
Lecture 9 - Recap & Current Trends: lnkd.in/dGsTd8jN
Start understanding LLMs in depth from the experts. Go through each step-by-step video

379

1.3K

57.8K
