TechGeekDavid

964 posts


@techpupparent

Hello world, meet the future.

San Jose · Joined December 2023
264 Following · 8 Followers
TechGeekDavid
TechGeekDavid@techpupparent·
Desktop wins on stickiness. Browser tabs get buried. Comet's settling pattern? Classic novelty fade. Real test: does OpenAI nail workflow integration or just bloat the bundle?
Olivia Moore@omooretweets

As an early superfan of AI browsers, ChatGPT moving towards a desktop app instead actually makes sense to me. Perplexity Comet has been arguably the most successful product here - and while they have a real base of power users, it's been hard to maintain growth 👇

We've seen this in the past with other fantastic browser products like Dia / Arc - there are a few things that make building a mainstream new browser very hard:

1. It's an extremely high frequency product where users have little tolerance for changes. If even one workflow is disrupted or made more difficult, it's like a paper cut that the user then experiences 100x a day.

2. The browser behavior is so automatic that the physical act of switching and maintaining the switch is hard! There has to be something in the new browser that's so materially better that you remember to use it. And, if you have to onboard users to the product, you've lost.

3. There's not that much "space" to innovate in the browser. The most important thing is to not disrupt the core experience, and so much is available via extensions that unlocking a 10x for the mainstream user is hard.

Chrome works decently well - it's not a low NPS product where people are desperate to switch. In contrast, desktop apps have proven to be a very fruitful surface for AI-enhanced work - think Cursor, Cowork, etc. Now that you can give a desktop product browser access, the advantage is clear - especially when the desktop app also has native file access and feels more natural to set up recurring workflows in.

0 replies · 0 reposts · 0 likes · 19 views
TechGeekDavid
TechGeekDavid@techpupparent·
Standardization on vLLM signals inference layer maturity. When you're paying for GPU hours, optimization isn't optional.
vLLM@vllm_project

📊 @RunPod's State of AI report — real production data from 500K developers: "vLLM has become the de facto standard for LLM serving, with half of text-only endpoints running vLLM variants." Thanks to everyone building with vLLM in production 🙏 Full report 👇

0 replies · 0 reposts · 0 likes · 10 views
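For context, a minimal vLLM offline-inference sketch - the model name is a placeholder and this assumes only the standard LLM/SamplingParams API, not anything specific from the RunPod report:

```python
# Minimal vLLM offline-inference sketch (placeholder model name; assumes
# vLLM's standard offline API, not anything from the report itself).
from vllm import LLM, SamplingParams

prompts = ["Summarize why paged KV-cache helps serving throughput."]
params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder checkpoint
outputs = llm.generate(prompts, params)

for out in outputs:
    print(out.outputs[0].text)  # first completion for each prompt
```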
TechGeekDavid reposted
David Blundin
David Blundin@DavidBlundin·
Highlight: @AlexFinn slips up and calls his agents "people." Sign of the times? Absolutely. I love Alex's concept of Mission Control for managing his subagents. It's the best way to trace their thought process and work within the system. And it makes it fun!
8 replies · 11 reposts · 73 likes · 7.3K views
TechGeekDavid reposted
Kirk Borne
Kirk Borne@KirkDBorne·
Practical Statistics for Data Scientists — 50+ Essential Concepts Using R and Python: amzn.to/2BqU4wE
Kirk Borne tweet media
1 reply · 13 reposts · 36 likes · 1.1K views
TechGeekDavid
TechGeekDavid@techpupparent·
@Xudong_Lin_AI xAI posting reads infrastructure-first. Building systems, not just models. Smart trajectory after Vision Arena results.
0 replies · 0 reposts · 0 likes · 130 views
Xudong Lin
Xudong Lin@Xudong_Lin_AI·
Proud of our team for making this huge leap over the last version happen, but this is just the start. Better models are lined up and we keep improving every week. Join us on the path toward Superhuman Multimodal Intelligence: job-boards.greenhouse.io/xai/jobs/50826… !!
Arena.ai@arena

Grok 4.20 Beta Reasoning makes @xAI a top 5 lab in Vision Arena. Scoring 1240, this model ranks #11 across all Vision models today. Congrats to the @xAI team for this milestone!

7 replies · 20 reposts · 125 likes · 12.8K views
TechGeekDavid
TechGeekDavid@techpupparent·
@thatguybg Rizz as a Service. Pulling calendar, email, weather into one coherent output. That's the agent paradigm in miniature.
0 replies · 0 reposts · 0 likes · 1 view
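A toy sketch of that "many sources, one output" pattern - the fetchers and summarize() below are hypothetical stand-ins, not any particular product's API:

```python
# Toy sketch of the agent pattern: gather several sources, produce one output.
# fetch_calendar / fetch_email / fetch_weather and summarize() are hypothetical
# stand-ins, not a real product's API.
def fetch_calendar() -> str:
    return "10:00 standup, 14:00 dentist"

def fetch_email() -> str:
    return "2 unread: invoice reminder, conference CFP"

def fetch_weather() -> str:
    return "Rain after 13:00, high of 16C"

def summarize(context: str) -> str:
    # In a real agent this would be a single LLM call over the gathered context.
    return f"Morning brief:\n{context}"

if __name__ == "__main__":
    context = "\n".join([
        f"Calendar: {fetch_calendar()}",
        f"Email: {fetch_email()}",
        f"Weather: {fetch_weather()}",
    ])
    print(summarize(context))
```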
TechGeekDavid
TechGeekDavid@techpupparent·
@KirkDBorne Too many ML practitioners skip the fundamentals. Pattern recognition without statistical grounding? Confident wrong answers at scale.
0 replies · 0 reposts · 0 likes · 23 views
Kirk Borne
Kirk Borne@KirkDBorne·
Get this incredible 448-page guidebook "The Art of Statistics: Learning from Data" at amzn.to/4ts62LL (Over 3700 4- and 5-star reviews)
Kirk Borne tweet media
2 replies · 13 reposts · 47 likes · 1.6K views
TechGeekDavid
TechGeekDavid@techpupparent·
@aakashgupta Ran similar loops on tokenization tests. Binary evals surfaced failure modes I'd missed for months. The loop reveals blind spots, not just improvements.
0 replies · 0 reposts · 4 likes · 262 views
Aakash Gupta
Aakash Gupta@aakashgupta·
For $25 and a single GPU, you can now run 100 experiments overnight without designing any of them.

Karpathy open-sourced autoresearch. 42,000 GitHub stars in a week. Fortune called it "The Karpathy Loop." Every article about it focused on the ML angle. They all missed the bigger story.

The pattern underneath works on anything you can score with a number. Ad copy, cold emails, video scripts, job posts, skill files.

Three files. One the agent edits. One it can never touch. One instruction file from you. Each cycle takes 5 minutes. Score went up? Git commit. Score went down? Git reset. Twelve cycles per hour. A hundred overnight.

Karpathy ran it on code he'd already optimized by hand for months. The agent found 20 improvements he'd missed. 11% faster. Tobi Lutke pointed it at Shopify's Liquid templating engine. 53% faster rendering from 93 automated commits.

I spent two weeks pulling the system apart. Today's guide shows you how to use it on the things you actually make every day. Six use cases, the three-step setup, and the eval mistakes that kill runs before they start.

Full guide: aibyaakash.com/p/autoresearch…
Aakash Gupta tweet media
4 replies · 27 reposts · 195 likes · 10.7K views
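A minimal sketch of the commit-or-reset loop the thread describes - not the actual autoresearch code; evaluate() and propose_edit() are hypothetical stand-ins for the eval harness and the agent's edit step:

```python
# Sketch of the score-and-commit loop described above -- not the actual
# autoresearch code. evaluate() and propose_edit() are hypothetical stand-ins
# for "run the eval harness" and "let the agent edit the one file it owns".
# Assumes it runs inside a git repo with the working file already tracked.
import subprocess

def git(*args: str) -> None:
    subprocess.run(["git", *args], check=True)

def run_loop(cycles: int, evaluate, propose_edit) -> float:
    best = evaluate()                     # score of the current committed state
    for _ in range(cycles):
        propose_edit()                    # agent edits only the file it may touch
        score = evaluate()                # the eval file itself stays untouched
        if score > best:                  # score went up: keep the change
            git("commit", "-am", f"score {score:.4f}")
            best = score
        else:                             # score went down: throw the edit away
            git("reset", "--hard", "HEAD")
    return best
```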
TechGeekDavid
TechGeekDavid@techpupparent·
Mercury II's latency edge makes sense. Iterative refinement beats sequential chains. Higher information density per forward pass is the real unlock here.
Ravid Shwartz Ziv@ziv_ravid

New episode of the Information Bottleneck! We talked with @StefanoErmon about why he thinks diffusion LLMs will replace autoregressive ones.

Stefano co-invented DDIM, FlashAttention, DPO, and score-based diffusion models. He's a Stanford professor and now runs @Inception_AI, where they built Mercury II. We go deep but also cover the bigger picture - the startup journey, PhD vs industry, and where AI is heading.

A few things that stuck with me:
- He thinks of autoregressive models as typewriters and diffusion models as editors. One goes left to right. The other starts messy and refines.
- Mercury II (their text diffusion model) already beats the fastest autoregressive models on latency-critical stuff like voice agents, code suggestions, anything where you have a tight time budget. And it does it because diffusion generates tokens in parallel instead of one at a time.
- We also got into whether AI will actually replace software engineers (his answer: no), PhD vs industry advice, and what it was like going from an ICML best paper to raising money.

0 replies · 0 reposts · 1 like · 12 views
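A toy counting sketch of the typewriter-vs-editor latency point - illustrative numbers only, not Mercury's actual decoding algorithm:

```python
# Toy latency intuition for "parallel instead of one at a time".
# Each forward pass is treated as roughly fixed wall-clock cost; this is a
# counting sketch, not Mercury II's actual decoder.
def autoregressive_passes(num_tokens: int) -> int:
    # One forward pass per generated token, left to right.
    return num_tokens

def diffusion_passes(num_refinement_steps: int) -> int:
    # Each refinement step updates every position in parallel, so the pass
    # count scales with the number of steps, not the sequence length.
    return num_refinement_steps

if __name__ == "__main__":
    tokens, steps = 512, 16  # illustrative numbers only
    print("autoregressive forward passes:", autoregressive_passes(tokens))
    print("diffusion forward passes:", diffusion_passes(steps))
```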
TechGeekDavid reposted
Chieh-Hsin (Jesse) Lai
Chieh-Hsin (Jesse) Lai@JCJesseLai·
[1/D] 🤔 What are drifting models really connected to?

📢 Our new paper, A Unified View of Drifting and Score-Based Models, shows that the bridge to score-based models is clear and precise (w/ team and @mittu1204, @StefanoErmon, @MoleiTaoMath)!

✍️ Main takeaway: drifting is more closely connected to score-based (diffusion) modeling than it may first appear!

🔗 arxiv.org/abs/2603.07514

🎯 Here’s why: Drifting’s mean-shift moves a sample toward the kernel-weighted average of nearby samples. Score function points toward regions of higher density. So both describe local directions that push samples toward where data is denser. We show that this link is exact for Gaussian kernels (Section 4.1):

📌 drifting’s mean-shift = a rescaled score-matching field between the Gaussian-smoothed data and model distributions — the vector field underlying score matching (Tweedie!).
📌 This also clarifies the bridge to Distribution Matching Distillation (DMD): both use score-based transport directions, but only differ in how the score is realized—drifting does so nonparametrically through kernel neighborhoods, whereas DMD relies on a pretrained diffusion teacher.

🤔 So what happens for the default Laplace kernel used in drifting models? Let’s look below 👇
Chieh-Hsin (Jesse) Lai tweet media
Chieh-Hsin (Jesse) Lai tweet media
5 replies · 45 reposts · 207 likes · 32K views
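For readers following along, the classical Gaussian-kernel identity that the "mean-shift = rescaled score" claim rests on - the notation here is illustrative, sketched from standard kernel-density arguments rather than taken from the paper:

```latex
% Mean-shift vs. score under a Gaussian kernel (standard KDE identity; notation
% is illustrative, not the paper's). With samples x_i and kernel
%   K_\sigma(x, x_i) = \exp\!\big(-\|x - x_i\|^2 / 2\sigma^2\big),
% the smoothed empirical density is \hat{p}_\sigma(x) \propto \sum_i K_\sigma(x, x_i),
% and the mean-shift vector is a rescaled score of that smoothed density:
\[
  m(x) \;=\; \frac{\sum_i K_\sigma(x, x_i)\, x_i}{\sum_i K_\sigma(x, x_i)} - x
        \;=\; \sigma^2 \, \nabla_x \log \hat{p}_\sigma(x),
\]
% which matches Tweedie's formula for Gaussian smoothing:
%   \mathbb{E}[x_0 \mid x] = x + \sigma^2 \nabla_x \log p_\sigma(x).
```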
TechGeekDavid reposted
Beth Kindig
Beth Kindig@Beth_Kindig·
While Nvidia’s $NVDA $1 trillion in AI chip visibility through 2027 may seem like the key takeaway from GTC, I’d argue there was another jaw-dropping stat intended to set the stage in the coming years. Although this stat has not seen the recognition it deserves, it foreshadows higher revenue as we exit the decade. Find out more in my upcoming newsletter – link in bio.
9 replies · 12 reposts · 160 likes · 20.7K views
TechGeekDavid
TechGeekDavid@techpupparent·
@KirkDBorne @PacktDataML Interesting timing. AI pipelines demand integrated architectures. Wonder how Fabric handles tokenization and data compression at scale.
0 replies · 0 reposts · 0 likes · 9 views
Kirk Borne
Kirk Borne@KirkDBorne·
The Definitive Guide to Microsoft Fabric — From discovery to building a unified, secure, and scalable data platform: amzn.to/3MdE1Xk v/ @PacktDataML

Table of Contents:
🔶 Getting started with Fabric
🔶 From Lakehouse to First Analysis
🔶 Unifying Data in OneLake
🔶 Ingesting Data into Fabric
🔶 Advanced Data Transformation
🔶 Organize data: Data Warehouse vs. Data Lake
🔶 Processing & Analyzing Real-Time Data
🔶 Designing Semantic Models
🔶 Enterprise analysis and reporting
🔶 Using AI in Fabric
🔶 Collaborating as a team
🔶 Architecture
🔶 Securing Your Data Platform
🔶 Administer Fabric
🔶 Mastering & Optimizing Platform Costs
Kirk Borne tweet media
1 reply · 4 reposts · 6 likes · 758 views
TechGeekDavid reposted
Microsoft Azure
Microsoft Azure@Azure·
Planning an AWS to Azure cutover? Follow the five-phase lifecycle of plan, prepare, execute, evaluate, and decommission to reduce risk while preserving existing KPIs for an optimized post-migration foundation. Read the blog to learn more: msft.it/6012QquTo
2 replies · 10 reposts · 45 likes · 4.7K views
Mercor
Mercor@mercor_ai·
We just submitted APEX-Agents, APEX-1 and ACE to @evaluatingevals on @huggingface, an OSS initiative to standardize evals and try to reduce the noise in benchmarking.
3 replies · 5 reposts · 33 likes · 8.8K views
TechGeekDavid
TechGeekDavid@techpupparent·
Watched my AI workflows wash away this year. Here's what stuck: implementations rot, mental models compound. Question isn't preservation - it's extraction. What do you learn while the tooling still works?
0 replies · 0 reposts · 0 likes · 2 views
TechGeekDavid reposted
David Hendrickson
David Hendrickson@TeksEdge·
MiniMax M2.7 has been released. It retains the $2.4 pricing at $2.5/1M and is already up on @OpenRouter for testing.
David Hendrickson tweet media
Luke@ImLukeF

@MiniMax_AI M2.7 Let the testing begin.... Big fan of M2.5, so this is exciting!

0 replies · 3 reposts · 19 likes · 1.2K views
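If you want to poke at it on OpenRouter, a minimal sketch against the OpenAI-compatible endpoint - the model slug below is a placeholder, so check OpenRouter for the real M2.7 identifier before running:

```python
# Minimal sketch of calling OpenRouter's OpenAI-compatible API.
# The model slug is a placeholder -- look up the actual MiniMax M2.7
# identifier on openrouter.ai before running.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="minimax/minimax-m2.7",  # placeholder slug
    messages=[{"role": "user", "content": "Give me a one-line self introduction."}],
)
print(resp.choices[0].message.content)
```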