Alex
@Alex4Changes

1.2K posts

AI + real-world autonomy (FSD) + emerging tech. Connecting the dots from chips → products → impact. Practical takeaways.

Portland, OR · Joined August 2013
240 Following · 459 Followers
Pinned Tweet
Alex @Alex4Changes ·
Anyone looking for Tesla referrals, hit me up. ts.la/alexis59429
Replies 0 · Reposts 0 · Likes 7 · Views 481

Alex @Alex4Changes ·
I'm building agents with Claude, GPT, Grok, and Gemini that can tell me when confidence drops, because calibration matters more than pretending certainty. Intermediate confidence tracking could catch failures before they compound. x.com/SFResearch/sta…
Salesforce AI Research @SFResearch

Agentic Confidence Calibration
📄 Paper: bit.ly/4kbjgs3
AI agents are overconfident when they fail. Existing calibration methods assess only the final output, but agent failures often stem from earlier missteps masked by high confidence at the end. Holistic Trajectory Calibration (HTC) diagnoses the full execution path, extracting 48 process-level features across four categories: cross-step dynamics, intra-step stability, positional indicators, and structural attributes. A lightweight linear model maps these to calibrated confidence scores.
🔑 Key results across 8 benchmarks, multiple LLMs, and diverse agent frameworks:
→ HTC-Reduced achieves 0.031 ECE on HLE, down from 0.656 (verbalized confidence)
→ Outperforms LSTM, Transformer, XGBoost, and Gaussian Process baselines while showing dramatically lower variance in small-data regimes
→ Works across GPT-4.1, GPT-4o, Deepseek-v3.1, Qwen3-235B, and open-source alternatives
→ Architecture-agnostic: consistent gains on both smolagents and OAgents frameworks
🔍 The most predictive failure signals are task-dependent. For knowledge QA, features distribute across dynamics, stability, and position. For complex reasoning, positional features (first/last step confidence) dominate. Across all tasks, a diagnostic hierarchy emerges: positional signals serve as primary alerts, while stability and dynamics features complete the picture.
🔄 Transferability: a calibrator trained on SimpleQA transfers to HotpotQA and StrategyQA without retraining. A General Agent Calibrator (GAC) pretrained on 7 diverse datasets achieves the best calibration (lowest ECE of 0.118) on the held-out GAIA benchmark, zero-shot.
Authors: Jiaxin Zhang @jxzhangjhu, Caiming Xiong @CaimingXiong, Chien-Sheng Wu @jasonwu0731
#FutureOfAI #EnterpriseAI #AgenticAI

Replies 0 · Reposts 0 · Likes 1 · Views 31
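The recipe in the quoted paper (process-level trajectory features fed to a lightweight linear model, scored by expected calibration error) can be sketched in a few lines. Everything below is illustrative: the three toy features and the least-squares fit stand in for HTC's 48 features and its actual calibrator; only the ECE metric itself is standard.

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """ECE: accuracy-vs-confidence gap, weighted by the fraction of samples per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

rng = np.random.default_rng(0)
# Toy "trajectory features" (hypothetical): first-step confidence,
# last-step confidence, and mean step-to-step confidence drop.
X = rng.uniform(0.0, 1.0, size=(200, 3))
success = (X[:, 1] - 0.5 * X[:, 2] + rng.normal(0.0, 0.1, 200)) > 0.3

# "Lightweight linear model": least-squares fit of features -> success,
# clipped to (0, 1] so the prediction can serve as a confidence score.
A = np.c_[X, np.ones(len(X))]
w, *_ = np.linalg.lstsq(A, success.astype(float), rcond=None)
calibrated = np.clip(A @ w, 1e-6, 1.0)

verbalized = np.full(len(X), 0.95)  # an always-confident baseline
print(expected_calibration_error(verbalized, success.astype(float)))
print(expected_calibration_error(calibrated, success.astype(float)))
```

Even this toy fit drives ECE far below the constant 0.95 baseline, which is the same qualitative effect the paper reports for verbalized confidence.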
Alex @Alex4Changes ·
@tom_doerr I use Claude for coding tasks, and a unified multi-agent workspace sounds like it could streamline development workflows. Having agents collaborate in one environment seems like it would reduce context-switching overhead.
Replies 0 · Reposts 0 · Likes 0 · Views 6

Alex @Alex4Changes ·
The capabilities that Starship development will enable are beyond what most people understand. Multi-planetary travel at reasonable cost, Optimus, etc. give us the ability to draw resources from other planets and to benefit from unlimited solar power off planet. What will be built in this new future that advances humanity? Exciting. x.com/Teslarati/stat…
Replies 0 · Reposts 0 · Likes 0 · Views 24

Alex @Alex4Changes ·
I don’t think people realize the circular economy that is being fed by AI. My bet is that we will continue to find ourselves short on capacity and needing more infrastructure. AI is advancing capabilities so fast right now that we don’t know what we will need tomorrow, which will require even more compute. x.com/KobeissiLetter…
Replies 0 · Reposts 0 · Likes 0 · Views 22

Alex @Alex4Changes ·
@RichRichg99 @elonmusk @DBurkland @pbeisel It’s not that the car/nav doesn’t know where I live. It just chooses to park in other driveways or on other streets down the block. It makes no sense. Today it tried to parallel park on the street when it knows it should park in the garage.
Replies 1 · Reposts 0 · Likes 2 · Views 15

phil beisel @pbeisel ·
Tesla’s forthcoming AI5 uses a half-reticle design, which is crucial for yield. A reticle defines the imaging area of a lithography machine; fitting two chips per shot effectively doubles the dies produced per exposure. This means the Tesla chip design team had to carefully manage die features, for instance dropping the older ISP (and classic GPU) to make room for more AI cores. By contrast, NVIDIA’s Blackwell fills nearly a full reticle, making it a single-reticle design. If Tesla hits its compute and efficiency targets with AI5 in this half-reticle format, it’s almost like cutting fab requirements in half. And this has a big impact on Terafab, especially if it carries forward for AI6, AI7, etc.
phil beisel @pbeisel

Terafab may be the most essential vertical integration Tesla has ever undertaken, and it is truly non-optional. It will take years to build and will test even Elon’s speedrunning abilities to the limit, but that won’t stop him from trying.

The breakthrough likely lies in overhauling the overall facility’s cleanroom model. By moving wafers in sealed pods with localized micro-environments, the fab no longer needs a monolithic ultra-clean space. Elon’s line about “eating cheeseburgers and smoking cigars” on the fab floor isn’t silly; it’s the practical reality of a radically simpler, cheaper, faster approach that could finally change the economics of chipmaking.

This is all forced by the brutal “pinch” in chip supply. Tesla must produce on the order of 100–200 billion AI chips per year just to saturate its roadmap. That volume powers:
- FSD cars & Robotaxis: tens of millions of vehicles needing AI5 inference for near-perfect autonomy.
- Physical Optimus: scaling from thousands today to millions per year, each requiring AI5/AI6-level compute.
- Digital Optimus: the new xAI-Tesla software agents for digital/office automation, running massive inference clusters.
- Space-based data centers: AI7/Dojo3 orbital compute for GW-scale training and inference beyond Earth limits.

AI5 delivers the ~10x leap for vehicles and early robots; AI6 shifts focus to Optimus + terrestrial DCs; AI7 goes orbital. No external foundry (TSMC, Samsung, etc.) can deliver that scale or timeline, hence the Terafab launch. Without it, the entire robotics + autonomy future hits a brick wall. Terafab isn’t optional; it’s the only way forward.

Replies 64 · Reposts 198 · Likes 2.2K · Views 358.4K
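A rough way to see why half-reticle dies help yield is the classic Poisson defect model, Y = exp(-D·A): the good-die fraction falls exponentially with die area, so two half-size dies per exposure beat one full-size die. The defect density and die areas below are made-up illustrative numbers, not Tesla or NVIDIA figures; only the ~858 mm² single-exposure field limit is a standard lithography constant.

```python
import math

def poisson_yield(defect_density_cm2: float, die_area_mm2: float) -> float:
    """Poisson defect model: fraction of defect-free dies, Y = exp(-D * A)."""
    return math.exp(-defect_density_cm2 * die_area_mm2 / 100.0)  # mm^2 -> cm^2

D = 0.1            # defects per cm^2 (assumed)
full_die = 800.0   # mm^2, near the ~858 mm^2 single-reticle field limit
half_die = 400.0   # mm^2, so two dies fit in one exposure field

good_full = 1 * poisson_yield(D, full_die)  # good dies per exposure, full reticle
good_half = 2 * poisson_yield(D, half_die)  # two smaller dies per exposure
print(f"full-reticle: {good_full:.2f}, half-reticle: {good_half:.2f}")
```

Under these assumptions the half-reticle layout nets roughly 1.34 good dies per exposure versus 0.45, which is the intuition behind "almost like cutting fab requirements in half."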
Henry Shen @henryshen2000 ·
@Alex4Changes @elonmusk @DBurkland @pbeisel I had a similar issue on my FSD 14.2.x until I updated the pin position for my driveway. It can be sensitive about pin placement. Try it; it might work for you.
Replies 1 · Reposts 0 · Likes 1 · Views 27

Alex @Alex4Changes ·
@gregwadsworth @elonmusk @DBurkland @pbeisel I tried that. No dice. I think it’s parked in my driveway only once since 14.2—usually it’s driving to other people’s houses. It knows where my house is, but I think the “reasoning” capability decides it needs to find somewhere else to park.
Replies 0 · Reposts 0 · Likes 0 · Views 34

Alex @Alex4Changes ·
Yep, I did both: cleared my address and reset, and also used a dropped pin. It’s pretty funny. My house is basically in the middle of the block, but FSD reasoning thinks it should park either in the driveway of a house down the street on another street, or in a neighbor’s driveway four houses away on the other side of the street.
Replies 1 · Reposts 0 · Likes 1 · Views 28

Alex @Alex4Changes ·
@elonmusk @DBurkland @pbeisel Great. I hope it fixes the issue with 14.2.x always trying to park in my neighbor's driveway down the street or on another street instead of in my garage with a charger. It's an automatic disengagement every time I drive home.
Replies 12 · Reposts 0 · Likes 35 · Views 7.8K

Alex @Alex4Changes ·
I want an R1S with FSD. I wish Tesla would make a full-size SUV. I love my Model Y and can’t see myself buying anything else without FSD, but that also means I need another vehicle for family road trips with the dog. The R2 and Model Y are just too small for a family of 4, a dog, and luggage.
Replies 0 · Reposts 0 · Likes 0 · Views 95

David Moss @DavidMoss ·
Which would you rather own & why?
Replies 309 · Reposts 17 · Likes 135 · Views 28.4K

Alex @Alex4Changes ·
@AITechEchoes Not sure I’d say “free”, but certainly cheaper.
Replies 0 · Reposts 0 · Likes 0 · Views 47

Erina | AI Tools & News @AITechEchoes ·
BREAKING: AI can now build you a complete website in 2 hours (for free). Here are 9 insane Claude Opus 4.6 + Figma Make prompts that create $5,000 websites in 2 hours: (Save this before your competitors do)
Replies 52 · Reposts 43 · Likes 145 · Views 14.2K

Alex @Alex4Changes ·
I find it frustrating when my car keeps trying to park in a neighbor's driveway down the street. All of them have been at least four houses away, some on the opposite side of the street or even the wrong street. There's no message for me, so I have to take over to drive home. This started happening in the last 2–3 builds.
Replies 0 · Reposts 0 · Likes 0 · Views 9

Nik Harris @nikkharris ·
It’s happened enough now that I can say with confidence: FSD will sometimes deviate from the planned navigation path because it erroneously perceives the path as blocked or a dead end. This might make up a larger percentage of “wrong turns” than I previously thought.
Replies 3 · Reposts 0 · Likes 7 · Views 1.3K

Alex @Alex4Changes ·
@verge Solid-state batteries passing the supercapacitor test is huge for energy density. As an engineer by training, I've seen how incremental material improvements unlock exponential system gains—especially for EVs/autonomous where weight is everything.
Replies 0 · Reposts 0 · Likes 0 · Views 23

Alex @Alex4Changes ·
@TradexWhisperer Micron trading at a 20x forward multiple on HBM/AI tailwinds? Market's overlooking the structural shift from NAND/DRAM cyclicality to inference/training infrastructure. From an industry perspective, capacity constraints will drive pricing power through 2027.
Replies 0 · Reposts 1 · Likes 1 · Views 87

Trade Whisperer @TradexWhisperer ·
$MU The market is pricing in a disaster that hasn't arrived. Micron is trading at ~10x Forward P/E with a PEG of 0.6. At these levels, investors are being compensated as if a full-blown chip demand collapse is imminent. It isn't. That gap between fear and fundamentals is where opportunity lives.

And the structural story keeps getting stronger. We are entering the Era of Inference. Every AI query, every real-time model response, every agentic workflow: none of it runs without memory. You can't build and scale context without memory.

If Micron keeps delivering over the next few quarters, the market will eventually have to reprice the story. The pessimism is the opportunity. I've spoken.
Trade Whisperer @TradexWhisperer

$MU AMA: The Best Two Questions (Related)
"How could Micron avoid the age-old cycle of capacity expansion, oversupply, and ASP collapse?" @yizheng95
"It seems natural that acceptable P/E ratios will expand. What are your thoughts on this?" @melone3710
My Refined Answer: Micron is transitioning from cyclical commodity memory (DRAM/NAND) to a premium, AI-centric platform with HBM, CXL, and SOCAMM, brand-new products that never existed before. This gives far more wafer-allocation flexibility, flattening booms and busts and justifying P/E expansion beyond historical single-digit forward valuations. AI is much more memory-hungry than any other tech invention, and right now it's mostly enterprise demand. Consumer demand hasn't even kicked in yet, and with rate cuts in the next few years, consumer demand will also go up.
Why is now different? A more robust portfolio.
Legacy: DRAM + NAND → consumer + enterprise mix → limited flexibility, sharp cycles.
New reality: HBM (high bandwidth for AI GPUs) + CXL (memory pooling/expansion) + SOCAMM (efficient AI-server modules) → heavy enterprise/AI focus → much higher margins and agility to shift production. The broader portfolio lets Micron redirect wafers from softening areas (e.g., consumer NAND) to regular enterprise NAND or HPC HBM AI without big ASP hits → more stable earnings.
Thesis: Cycle Flattening = Valuation Upgrade. More memory for you, but also for 8 billion people on earth: your lifelong AI companions or agents will need to remember decades of inquiries, so when you ask about a trip or a problem, they should be able to access a travel inquiry you made 15 or 30 years ago in a similar location and formulate the best suggestion from all the context (memory) available. Now multiply that by 8 billion people. AI inference will not just demand more memory; it will demand an unprecedented transformation of the memory industry overall.
What's up with HBM?
- HBM TAM is exploding: ~$35B in 2025 → ~$100B by 2028 (40% CAGR, two years faster than prior forecasts).
- Micron's entire 2026 HBM supply is sold out (including HBM4 in volume production early), with multi-year contracts locking in pricing and visibility.
- HBM consumes 3-4x more wafer per bit.
- Memory is hardware, not software, so we can't make more memory in just weeks. We have to build new fabs, and that can take 5 to 10 years. So good luck waiting for 3-4x more fabs "all of a sudden."
Like every real AI enabler, a 20-25x+ forward P/E isn't just acceptable; it's inevitable.

Replies 12 · Reposts 14 · Likes 192 · Views 30.2K
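A quick sanity check on the TAM arithmetic quoted above: growing from ~$35B in 2025 to ~$100B in 2028 works out to roughly a 42% compound annual rate, consistent with the ~40% CAGR claimed.

```python
# CAGR implied by the quoted HBM TAM forecast: ~$35B (2025) -> ~$100B (2028).
start, end, years = 35.0, 100.0, 3
cagr = (end / start) ** (1.0 / years) - 1.0
print(f"implied CAGR: {cagr:.1%}")  # about 42%
```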
Alex @Alex4Changes ·
@shiqi_chen17 LLMs autonomously discovering tool abstraction is the real AGI signal. Building my own AI agents showed me how hierarchical skills emerge naturally when you let models iterate—game changer for complex workflows.
Replies 0 · Reposts 0 · Likes 0 · Views 88

Shiqi Chen @shiqi_chen17 ·
📍 Can LLMs discover, abstract, and reuse higher-level tool skills across tasks?
Existing tool-use benchmarks test solving tasks with fixed tools. But real workflows contain recurring structures where efficiency comes from reusable tool compositions, not isolated calls.
We introduce SkillCraft: 126 tasks across 6 domains designed to test whether LLM agents can acquire compositional skills, not just call atomic tools. We also propose Skill Mode, a lightweight protocol with four MCP primitives that lets agents compose, verify, cache, and reuse tool chains at test time.
Our key findings from evaluating 8 SOTA models:
⚡ Skill Mode enables agents to self-discover and reuse skills, leading to higher success and efficiency than agents without it. The gains are larger for stronger models.
🧠 Stronger models (e.g., Claude) discover more generalizable skills, which transfer across tasks and even across models.
🔍 Deeper composition ≠ better: shallow, well-tested skills generalize best.
🔗 Paper: arxiv.org/abs/2603.00718
💻 Code: github.com/shiqichen17/Sk…
🏠 Page: skillcraft-website.github.io/page
(1/7)
Replies 9 · Reposts 40 · Likes 201 · Views 67.5K
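The compose → verify → cache → reuse loop that Skill Mode describes can be illustrated with a toy skill cache. The class, method names, and example tools below are hypothetical sketches, not the paper's actual MCP primitives or API.

```python
from typing import Callable

class SkillCache:
    """Toy compose -> verify -> cache -> reuse loop (illustrative only)."""

    def __init__(self) -> None:
        self._skills: dict[str, Callable] = {}

    def compose(self, name: str, *steps: Callable) -> Callable:
        """Chain atomic tool calls into one reusable, named skill."""
        def skill(x):
            for step in steps:
                x = step(x)
            return x
        self._skills[name] = skill
        return skill

    def verify(self, name: str, example_in, expected_out) -> bool:
        """Keep a skill only if it passes a quick check; evict it otherwise."""
        ok = self._skills[name](example_in) == expected_out
        if not ok:
            del self._skills[name]
        return ok

    def reuse(self, name: str, x):
        """Invoke a previously cached skill instead of re-composing it."""
        return self._skills[name](x)

cache = SkillCache()
# Hypothetical atomic tools: normalize then lowercase a query string.
cache.compose("clean_query", str.strip, str.lower)
assert cache.verify("clean_query", "  Hello ", "hello")
print(cache.reuse("clean_query", "  SkillCraft "))  # skillcraft
```

Verification before caching is the key design choice: an unverified composed chain is evicted rather than reused, which mirrors the paper's finding that shallow, well-tested skills generalize best.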
Alex @Alex4Changes ·
I do this every day when I drive my Model Y as well. It’s truly amazing what it does and how safe its driving behavior has been. I can’t wait for unsupervised to be available. Then I can spend my time in the car relaxing or catching up on important calls or work while it takes me where I need to be.
Replies 0 · Reposts 1 · Likes 6 · Views 636

HustleBitch @HustleBitch_ ·
🚨 THIS MAN JUST SHOWED WHAT HIS TESLA DOES EVERY MORNING — PEOPLE CAN’T BELIEVE THIS IS REAL A man films his normal morning routine with his Tesla… and it barely looks like driving anymore. He unplugs the car. Gets inside. Types a destination. Then he taps “Start Self Driving.” The Tesla drives itself out of his garage, navigates through traffic, arrives at the destination, finds an open parking spot, and parks itself. He never touches the wheel. No steering. No pedals. No gas stations. By the time he steps out, the car already has another software update waiting. Now the clip is blowing up because people say this doesn’t even feel like owning a car anymore. It feels like having a robot chauffeur. Are we watching the moment humans stopped driving… and most people don’t even realize it yet?
Replies 911 · Reposts 1.1K · Likes 11.4K · Views 2.2M

Alex @Alex4Changes ·
@paulabartabajo_ Local AI agents hitting MacBook performance like this are game-changers. Building my own showed me how tool latency kills usability—385ms dispatch is impressive engineering.
Replies 0 · Reposts 0 · Likes 0 · Views 17