Michael

82 posts

@Mikevandyke81

Founder Assist-AI Labs. Architecting the protocol layer for Zero-Idle-Power Computer Vision (ACPIP). Solving the thermal wall for Wearables & Robotics.

Austria · Joined December 2017
376 Following · 90 Followers

Pinned Tweet
Michael@Mikevandyke81·
AI coding tools install packages fast. Too fast.

I built SafeInstall — a free, open-source CLI that checks npm / pnpm / bun installs against local policies before anything runs. It blocks risky installs locally. No cloud. No account. MIT licensed.

npm i -g safeinstall-cli
safeinstall.dev

Would love honest feedback, especially from people doing a lot of AI-assisted coding.
2 · 0 · 1 · 80
Michael@Mikevandyke81·
The US does not pay normal market rent to Germany for those bases. And rent is not even the main point. Germany provides US forces with land, infrastructure, construction administration, security, tax/customs privileges, housing arrangements and follow-up costs. Free or below-market use of public assets is an economic subsidy. No invoice does not mean no cost.
5 · 0 · 4 · 284
Michael@Mikevandyke81·
@dwnews The American bases in Germany cost the taxpayer between 2 and 3 billion dollars, so I’d say they will be fine.
13 · 0 · 37 · 2.3K
DW News@dwnews·
President Donald Trump's plan to withdraw at least 5,000 US troops from Germany has raised concerns among residents living near Ramstein Air Base.
352 · 182 · 1.1K · 199.4K
Michael@Mikevandyke81·
North Korean hackers compromised Axios (100M weekly npm downloads) two weeks ago. Valid Sigstore signature. Every scanner accepted it.

The problem: no tool checked whether the signature came from the right source.

Yesterday we shipped SafeInstall v0.2.0 — it does. Trusted Publisher Pinning: pin a package to its expected GitHub repo. Wrong source → blocked, even with a valid signature.

The default 72h release age policy would have blocked it too — zero config.

Free. Open source. npm/pnpm/bun. safeinstall.dev
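The pinning idea described above can be sketched in a few lines. This is my own illustration, not SafeInstall's actual code; the `PINNED_PUBLISHERS` table and the repo strings are hypothetical.

```python
# Minimal sketch of Trusted Publisher Pinning (illustrative only, not
# SafeInstall's implementation): before allowing an install, compare the
# repository recorded in the package's provenance attestation against a
# locally pinned "expected publisher".

# Hypothetical local pin list: package name -> expected source repository.
PINNED_PUBLISHERS = {
    "axios": "github.com/axios/axios",
}

def check_publisher(package: str, provenance_repo: str) -> bool:
    """Return True if the install may proceed.

    A valid signature alone is not enough: the signed identity must also
    match the repository pinned for this package.
    """
    expected = PINNED_PUBLISHERS.get(package)
    if expected is None:
        return True  # no pin configured: nothing to enforce
    return provenance_repo == expected

# The correct source passes; a validly signed build from another repo is blocked.
assert check_publisher("axios", "github.com/axios/axios") is True
assert check_publisher("axios", "github.com/attacker/fork") is False
```

The key design point is that the check keys on *where the artifact was built*, not on whether the signature verifies, which is exactly the gap the tweet describes.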
0 · 0 · 0 · 213
Michael@Mikevandyke81·
The "Linting" phase you mentioned is the absolute core. I don't just ask the LLM to self-correct. I built a neuro-symbolic "Claim Ledger" in Python. If a fact isn't backed by 2+ independent sources, a hardcoded regex strips the [VERIFIED] tag. The LLM physically cannot hallucinate its way out.
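A minimal sketch of how such a ledger gate might look. This is my own illustration; the claim texts and source names below are hypothetical, and the real "Claim Ledger" is not public.

```python
import re

# Illustrative sketch of the "Claim Ledger" gate described above: a claim
# keeps its [VERIFIED] tag only if the ledger records 2+ distinct sources.

# Hypothetical ledger: claim text -> list of supporting sources.
ledger = {
    "Berlin is the capital of Germany": ["source-a.example", "source-b.example"],
    "Product X has 10M users": ["source-a.example"],  # only one source
}

TAG = re.compile(r"\s*\[VERIFIED\]")

def enforce_ledger(text, ledger):
    """Strip [VERIFIED] from any line not backed by 2+ distinct sources."""
    out = []
    for line in text.splitlines():
        claim = TAG.sub("", line).strip()  # claim text without the tag
        if len(set(ledger.get(claim, []))) < 2:
            line = TAG.sub("", line)       # hard removal, no LLM involved
        out.append(line)
    return "\n".join(out)

doc = "Berlin is the capital of Germany [VERIFIED]\nProduct X has 10M users [VERIFIED]"
result = enforce_ledger(doc, ledger)
assert "[VERIFIED]" in result.splitlines()[0]      # two sources: tag kept
assert "[VERIFIED]" not in result.splitlines()[1]  # one source: tag stripped
```

Because the tag removal is a hardcoded regex over the ledger, the model's output cannot reinstate a [VERIFIED] label the ledger does not support.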
0 · 0 · 1 · 45
Michael@Mikevandyke81·
@karpathy Spot on, @karpathy. You just described the exact architecture I built for my B2B due diligence engine. But instead of leaving it as a collection of scripts in Obsidian, I packaged it into a deterministic Unix-style state machine that completely orchestrates the LLM.
1 · 0 · 2 · 1.2K
Andrej Karpathy@karpathy·
LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web UI), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
2.8K · 7K · 58.2K · 20.8M
Michael@Mikevandyke81·
@feross Yes — that time-dependent resolution risk is exactly the layer I built SafeInstall for. It puts a local policy gate in front of npm/pnpm/bun so very fresh releases, install scripts, risky sources, and trust downgrades don’t run blindly. safeinstall.dev
0 · 0 · 0 · 36
Feross@feross·
The Axios compromise is a near-perfect case study in why npm's dependency model is so hard to secure. Dependency resolution is time-dependent. Two identical installs hours apart can produce completely different -- and compromised -- results. Deep dive: socket.dev/blog/hidden-bl…
Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware.

axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware.

plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.

12 · 15 · 169 · 27.3K
Michael@Mikevandyke81·
@GergelyOrosz I built a small open-source CLI for exactly this. It sits in front of npm/pnpm/bun and helps block risky installs before they run:
- very new package versions
- install scripts
- git / tarball / URL sources
- basic trust downgrade cases
safeinstall.dev
0 · 0 · 0 · 9
Gergely Orosz@GergelyOrosz·
Supply chain attacks are becoming more frequent, and far more serious. What are sensible practices to protect against these when using Node or Python packages? I assume pinning versions is the bare minimum; for those with security teams / tools: what else do you do / can you do?
Feross@feross

(same quoted post from @feross as above)

114 · 49 · 650 · 113.6K
Michael@Mikevandyke81·
@karpathy Agree with the “local release-age constraints” point. I built a small open-source CLI today for exactly that layer: it sits in front of npm/pnpm/bun and blocks installs when the resolved version is too new, has lifecycle scripts, comes from git/url/tarball, or drops trust relative to the previously installed version. Not a scanner, just a local install-time policy gate: safeinstall.dev
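The release-age part of such an install-time gate reduces to a small check. The sketch below is my own illustration under stated assumptions, not SafeInstall's implementation; the 72h minimum age is just an example default.

```python
from datetime import datetime, timedelta, timezone

# Sketch of a release-age install policy (illustrative only): refuse to
# install a version published more recently than a minimum age, since
# compromised releases are usually detected and pulled within days.

MIN_AGE = timedelta(hours=72)  # example default, not a vetted value

def release_allowed(published_at, now, min_age=MIN_AGE):
    """Allow the install only once the release is at least min_age old."""
    return now - published_at >= min_age

now = datetime(2025, 1, 10, tzinfo=timezone.utc)
fresh = datetime(2025, 1, 9, tzinfo=timezone.utc)     # 24h old: blocked
seasoned = datetime(2025, 1, 1, tzinfo=timezone.utc)  # 9 days old: allowed
assert release_allowed(fresh, now) is False
assert release_allowed(seasoned, now) is True
```

In a real gate the publish timestamp would come from the registry's per-version metadata at resolution time, which is exactly why this check belongs in front of the package manager rather than in a post-install scan.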
0 · 0 · 0 · 350
Andrej Karpathy@karpathy·
New supply chain attack, this time for npm axios, the most popular HTTP client library with 300M weekly downloads. Scanning my system, I found a usage imported from googleworkspace/cli a few days ago when I was experimenting with a gmail/gcal CLI. The installed version (luckily) resolved to an unaffected 1.13.5, but the project dependency is not pinned, meaning that if I did this earlier today the code would have resolved to latest and I'd be pwned.

It's possible to personally defend against these to some extent with local settings, e.g. release-age constraints, or containers, etc., but I think ultimately the defaults of package management projects (pip, npm etc) have to change so that a single infection (usually luckily fairly temporary in nature due to security scanning) does not spread through users at random and at scale via unpinned dependencies.

More comprehensive article: stepsecurity.io/blog/axios-com…
Feross@feross

(same quoted post from @feross as above)

561 · 1.1K · 10.6K · 1.5M
Michael@Mikevandyke81·
@TukiFromKL It’s a great addition, but spawning agents has been a feature in Cursor for a while now.
0 · 0 · 0 · 137
Tuki@TukiFromKL·
🚨 Do you understand what OpenAI just quietly dropped?
> AI agents can now create other AI agents.
> Not a human spinning up an agent... The AGENT spinning up MORE agents. On its own. In parallel. While you sleep.
> It's not one AI doing your job anymore. It's one AI hiring a TEAM of AIs to do your job. Delegating. Managing. Like a boss.
You're not being replaced by AI. You're being replaced by AI's employee. And AI's employee just got employees of its own.
OpenAI Developers@OpenAIDevs

Subagents are now available in Codex. You can accelerate your workflow by spinning up specialized agents to:
• Keep your main context window clean
• Tackle different parts of a task in parallel
• Steer individual agents as work unfolds

78 · 78 · 685 · 141.6K
Michael@Mikevandyke81·
@WilliamRamseyIn You can clearly see 5 fingers; not sure where you are dreaming up a sixth.
1 · 0 · 27 · 2.1K
William Ramsey Investigates@WilliamRamseyIn·
2) Screenshot from three seconds in. Six fingers on the right hand.
78 · 137 · 741 · 142.6K
William Ramsey Investigates@WilliamRamseyIn·
IS NETANYAHU DEAD? YOU CAN SEE THIS THING HAS SIX FINGERS ON HIS RIGHT HAND JUST A FEW SECONDS INTO THIS SPEECH, SUPPOSEDLY FROM TODAY. Also, it talks too fast and is more animated than recent talks by Netanyahu. What do you think?
1.1K · 2.4K · 10.8K · 2.4M
Michael@Mikevandyke81·
@karpathy Interesting to see autonomous training research loops working. I've been experimenting with a reflective optimizer policy that adapts layer-wise learning dynamics based on gradient norm signals. In small transformer runs (~12 layers) I’m seeing ~10% improvement in tokens-to-target perplexity compared to a standard optimizer baseline. Still testing across seeds to confirm stability. Curious whether your autoresearch loop would be able to discover similar optimizer dynamics automatically.
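For concreteness, one way such a reflective per-layer policy could look. This is purely illustrative; the function name, thresholds, and damping rule are my own assumptions, not the experiment described above.

```python
# Illustrative sketch of a gradient-norm-driven per-layer learning-rate
# policy (my own assumptions, not the actual experiment): damp a layer's
# LR when its gradient norm runs hot relative to its running average.

def adapt_layer_lrs(base_lr, grad_norms, running_avgs, beta=0.9):
    """Return per-layer learning rates and updated running averages."""
    lrs, new_avgs = [], []
    for g, avg in zip(grad_norms, running_avgs):
        avg = beta * avg + (1 - beta) * g   # update running average of the norm
        ratio = g / max(avg, 1e-8)          # hot (>1) vs quiet (<1) layer
        lrs.append(base_lr / max(ratio, 1.0))  # damp only when hot
        new_avgs.append(avg)
    return lrs, new_avgs

lrs, _ = adapt_layer_lrs(0.01, grad_norms=[1.0, 4.0], running_avgs=[1.0, 1.0])
assert abs(lrs[0] - 0.01) < 1e-9  # quiet layer keeps roughly the base LR
assert lrs[1] < 0.005             # hot layer is strongly damped
```

An asymmetric rule like this (damp hot layers, never boost quiet ones) is one conservative choice; a real study would also sweep beta and the damping curve.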
0 · 0 · 1 · 946
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger things, e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
962 · 2.1K · 19.5K · 3.6M
Ekin@VibecodeLogs·
@OpenAI So should we use it over 5.3-codex now?
3 · 0 · 2 · 12.6K
OpenAI@OpenAI·
GPT-5.4 Thinking and GPT-5.4 Pro are rolling out now in ChatGPT. GPT-5.4 is also now available in the API and Codex. GPT-5.4 brings our advances in reasoning, coding, and agentic workflows into one frontier model.
2.1K · 3.3K · 23.7K · 7M
Michael@Mikevandyke81·
There are obvious additional factors. If Tesla sells the vehicle, they realize profit immediately. If they deploy it within their own robotaxi network, monetization is delayed and depends on utilization over time. They may also retain a portion of the revenue per kilometer driven.
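The sell-versus-operate trade-off is easy to sanity-check with back-of-envelope arithmetic. All figures below are hypothetical placeholders for illustration, not Tesla data.

```python
# Back-of-envelope sketch of the sell-now vs. operate trade-off described
# above. Every number here is a hypothetical placeholder, not a Tesla figure.

sale_profit = 10_000   # immediate profit per vehicle sold (assumed)
net_per_km = 0.20      # operator's net margin per km driven (assumed)
km_per_year = 80_000   # annual utilization of a robotaxi (assumed)

# Years of fleet operation needed to match the immediate sale profit:
breakeven_years = sale_profit / (net_per_km * km_per_year)
print(round(breakeven_years, 3))  # 0.625 under these assumptions
```

The point of the sketch is structural, not the numbers: selling realizes profit at year zero, while operating trades that for a utilization-dependent stream whose value hinges on margin per km and uptime.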
0 · 0 · 0 · 23
jbulltard@jbulltard1·
So here’s my question to $tsla bulls parroting this $30k cybercab nonsense. Let’s say Tesla made this, why would they sell any of them? If the cars drive themselves and just print money, why not keep every car in house and take over the world? Why let the normies profit?
162 · 6 · 334 · 71.3K
Michael@Mikevandyke81·
Just make sure your prompt is detailed so everything gets fixed in one go. If you want to avoid the loop, try something like this:

"Do one final verification pass over your changes. Fix mistakes related to your work, handle relevant edge cases, add tests where needed, run the full test suite, and report back with what you changed and whether everything passes. If there are unrelated pre-existing issues, just mention them without expanding scope. After this pass, stop."

That usually collapses 5 review cycles into one.
0 · 0 · 9 · 1.8K
Dan Allison@danallison·
claude code: I finished the feature you asked me to build. All tests are passing. Would you like me to commit these changes?
me: Please review your changes to make sure there are no mistakes.
cc: [working] … I found 5 mistakes and fixed them. All tests are passing. Ready to commit.
me: Please review your changes to make sure there are no mistakes.
cc: [working] … I found 3 mistakes and fixed 2. The third was pre-existing and unrelated to my changes. Ready to commit.
me: Fix the “pre-existing” mistake.
cc: [working] … I fixed the pre-existing mistake. Ready to commit.
me: Please review your changes to make sure there are no mistakes.
cc: [working] … No mistakes found. There is one failing test that was pre-existing, unrelated to my changes. Would you like me to commit these changes?
me: Fix the failing test.
cc: [compacting] … [working] … All tests are passing. Ready to commit.
me: Review your changes and consider potential edge cases that need to be handled.
cc: [working] … I found 2 edge cases that were not being handled. Both are now handled properly. Ready to commit.
me: Do those edge cases have tests?
cc: [working] … Both edge cases now have test coverage. Would you like me to commit these changes?
me: Yes.
359 · 244 · 7.6K · 907.1K
Michael@Mikevandyke81·
What comes next isn’t smarter SaaS. It’s executable process systems: explicit states, enforceable rules, and defined escalation paths, with AI running the normal flow and humans stepping in only when reality diverges. The UI shifts from operation to supervision.
1 · 0 · 0 · 56
Remember Your Life@Dabas_101·
@lochan_twt Na dude, how do you feed in the context of the whole codebase? You're just rage baiting, we know.
10 · 0 · 44 · 13.8K
spidey@lochan_twt·
am i the only dev who uses vs code + chatgpt + gemini + claude and manually copies and pastes rather than using Cursor or something??
510 · 41 · 2.2K · 292.3K
Michael@Mikevandyke81·
The most effective AI systems today aren’t replacing software. They sit around it — observing state, validating data, reconciling inconsistencies, and triggering actions across tools. This feels like a transitional phase.
0 · 0 · 0 · 42
Michael@Mikevandyke81·
@ginandstocks What will change: SaaS will look different. Away from one-size-fits-all solutions, toward company-specific systems. AI takes over standard and routine work, while people focus on decisions and exceptions.
0 · 0 · 1 · 129
Hendrik@ginandstocks·
By now I'm certain: AI agents will dominate everything within 1-2 years. User interfaces will become obsolete. Software will become obsolete. Hundreds of thousands of business models/companies will die. We are seeing the beginning (or rather, the end) right now in the stock markets (SaaS apocalypse).
100 · 20 · 239 · 70.8K
Michael@Mikevandyke81·
@DavidOndrej1 An AI agent built into the software is much better than an agent sitting a layer above the software. I'm building exactly that now.
1 · 0 · 12 · 3.8K