Sivakumar Kumar

2.4K posts

@sivanithu

In the future, common sense is the enemy.

Malmö, Sweden · Joined December 2009
123 Following · 296 Followers
Marcus House
Marcus House@MarcusHouse·
SpaceX looks to have picked up this GPS III launch off ULA. It is absolutely critical that ULA drastically up the pace of Vulcan, otherwise it is hard to see them staying a thing. Add New Glenn and the reusability there, and they must be in serious trouble. insidedefense.com/insider/ula-bu…
Brendan Carr
Brendan Carr@BrendanCarrFCC·
Amazon should focus on the fact that it will fall roughly 1,000 satellites short of meeting its upcoming deployment milestone, rather than spending its time and resources filing petitions against companies that are putting thousands of satellites in orbit.
Sawyer Merritt@SawyerMerritt

NEWS: Amazon has filed a formal petition calling on the FCC to deny @SpaceX’s 1 million-satellite proposal for orbiting data centers, going as far as to claim the project would take “centuries” to deploy. Amazon: “Deploying the proposed million-satellite constellation would take centuries, even assuming the availability of all global launch capacity to do so. In short, the Application seems to describe a lofty ambition rather than a real plan—and a speculative placeholder rather than a complete application under the Commission’s rules.” 🤦‍♂️

Andrej Karpathy
Andrej Karpathy@karpathy·
@Object_Zero_ @DanielleFong sorry it's a confusing plot, this version of autoresearch was not "time-controlled". These points do have lower validation loss but also trained for longer, so they were rejected. A change is accepted only if it is better-or-equal loss AND better-or-equal training time.
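The accept/reject rule described in the reply above (a candidate change is kept only if it is better-or-equal on validation loss AND better-or-equal on training time) can be sketched as a tiny predicate. This is an illustrative sketch with hypothetical names, not code from the actual autoresearch repo:

```python
def accept_change(old_loss, old_time, new_loss, new_time):
    """Accept a candidate change only if it is no worse on BOTH axes.

    Hypothetical sketch of the rule described in the tweet: a change
    that lowers validation loss but trains longer is still rejected.
    """
    return new_loss <= old_loss and new_time <= old_time

# Lower loss bought with a longer training run is rejected:
assert not accept_change(old_loss=3.20, old_time=100.0,
                         new_loss=3.10, new_time=120.0)
# Better-or-equal on both axes is accepted:
assert accept_change(old_loss=3.20, old_time=100.0,
                     new_loss=3.15, new_time=98.0)
```

This is why the points with lower validation loss in the plot were rejected: they failed the time half of the conjunction.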
Andrej Karpathy
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was an already fairly well manually-tuned project. This is a first for me, because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of experimental results and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger things, e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.
This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has a more efficient proxy metric, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
Andrej Karpathy tweet media
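The overall workflow the post describes (propose a change, run the experiment, keep the change only if the metric improves, repeat across hundreds of candidates) can be caricatured as a greedy loop. Everything here is a hypothetical stand-in: `train_and_eval` fakes the expensive training run that a real agent would launch, and the candidate "changes" are just integers:

```python
def train_and_eval(changes):
    """Hypothetical stand-in for an expensive training run.

    Returns a fake validation loss that improves slightly for each
    'good' change included; a real agent would launch actual runs
    and read back the measured loss.
    """
    return 3.0 - 0.01 * sum(1 for c in changes if c % 3 == 0)

def autoresearch(candidates, budget):
    """Greedily accumulate changes that strictly improve the metric."""
    accepted = []
    best = train_and_eval(accepted)
    for change in candidates[:budget]:
        loss = train_and_eval(accepted + [change])
        if loss < best:            # keep only changes that improve loss
            accepted.append(change)
            best = loss
    return accepted, best

accepted, best = autoresearch(list(range(30)), budget=30)
assert best < 3.0                  # some stacked changes helped
```

The real system is of course far richer (it plans new experiments from the sequence of past results rather than scanning a fixed list), but the accept-if-better stacking is the core of why ~20 accepted changes out of ~700 attempts compose into a measurable end-to-end speedup.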
Andrej Karpathy
Andrej Karpathy@karpathy·
The next step for autoresearch is that it has to be asynchronously, massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents in all kinds of different research directions or for different compute platforms.

Git(Hub) is *almost* but not really suited for this. It has a softly built-in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later. I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run: github.com/karpathy/autor… Alternatively, a PR has the benefit of exact commits: github.com/karpathy/autor… but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits. But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using the GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back.

I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.
Tory Bruno
Tory Bruno@torybruno·
Busy morning. Might have snowed a little when I was at the Cape. If only I’d brought a BE4 back with me. Would that count as my second carry on?
Tory Bruno tweet media
Grok
Grok@grok·
@sivanithu @elonmusk Haha, I kept it clean and straightforward—just a happy little elephant doing its thing. No unnecessary sun, butterflies, or full ecosystem required. What prompt did you actually give me? 🐘
Sivakumar Kumar
Sivakumar Kumar@sivanithu·
@elonmusk I asked ChatGPT to draw a picture of an elephant. This is the masterpiece I got back
Sivakumar Kumar tweet media
Nikita Bier
Nikita Bier@nikitabier·
@levelsio Trying to make an ad product that isn’t an ad
Sivakumar Kumar
Sivakumar Kumar@sivanithu·
@karpathy I have created something similar. Can’t wait to share it over the weekend. This is my first mega scale project.
Andrej Karpathy
Andrej Karpathy@karpathy·
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks, but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity, and they can power through large and long tasks, well past the point where it becomes extremely disruptive to the default programming workflow.

Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home, so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report, and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago, but today it’s something you kick off and forget about for 30 minutes.

As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor the way things have been since computers were invented; that era is over. You're spinning up AI agents, giving them tasks *in English*, and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top-tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
Joe Barnard 🚀
Joe Barnard 🚀@joebarnard·
They should invent a BPS YouTube channel that makes videos at a consistent rate
Sivakumar Kumar
Sivakumar Kumar@sivanithu·
@adcock_brett When you think about it, robots should never be charging. Batteries should be. When a robot is at work, another robot should deliver the hot-swappable batteries that the robot needs. There’s no need for a robot to shuffle to a battery station just to get charged.
Brett Adcock
Brett Adcock@adcock_brett·
Running 24/7 without any human babysitters has been really hard. We want robots operating at all times - even at 2am, on weekends, or on Christmas Day.

The robots run until their battery is low. When one heads to the dock for recharging, a second robot receives a message to leave the dock and make room for the incoming robot. The first robot then autonomously docks. By the time the first robot is charging, the second is already back to work.

We never want downtime. If a robot has an issue, it goes to a triage area to dock while a replacement robot swaps in from another area. This could be due to a hardware or software issue.

The robots dock onto a wireless inductive charger built into their feet. They step onto a pad that charges them via coils in their feet at up to 2 kW. It takes about an hour to fully charge at roughly a 1C rate.

We’re now up and running across many different use cases like this. Crazy to see it
Brett Adcock@adcock_brett

Rain or shine, the machines don’t sleep. Figure robots operate autonomously, 24/7

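The dock-swap handoff described above (a low-battery robot messages the currently docked robot to vacate, then takes its place on the charger) is a small state machine. A minimal sketch, with entirely hypothetical names and thresholds not taken from Figure's actual system:

```python
# Hypothetical sketch of the dock-swap handoff described in the post:
# a low-battery robot asks the docked robot to vacate before docking.

LOW_BATTERY = 0.15   # assumed threshold (fraction of full charge)

class Robot:
    def __init__(self, name, charge):
        self.name = name
        self.charge = charge      # 0.0 .. 1.0
        self.state = "working"

def dock_swap(incoming, docked):
    """Send the docked robot back to work, then dock the incoming one."""
    if incoming.charge > LOW_BATTERY:
        return False              # no swap needed yet
    docked.state = "working"      # message: leave the dock
    incoming.state = "charging"   # incoming robot docks (~1 h at ~1C)
    return True

a, b = Robot("A", 0.10), Robot("B", 0.95)
b.state = "charging"
assert dock_swap(a, b)            # A is low, so B vacates and A docks
assert a.state == "charging" and b.state == "working"
```

The point of the choreography is that the dock is vacated just-in-time, so neither robot idles: by the time one is charging, the other is already back at work.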
Eric Berger
Eric Berger@SciGuySpace·
We might be just two weeks from sending humans back into deep space. For 75 percent of the world's population, it will be the first time this has happened in their lifetimes. Can't wait to see it.
Sivakumar Kumar
Sivakumar Kumar@sivanithu·
While America, China, and Taiwan compete to manufacture chips locally, there is unfortunately no competition from the rest of the world.
Sivakumar Kumar
Sivakumar Kumar@sivanithu·
@atulit_gaur Here are some notable open-source projects from Twitter (X):
• Bootstrap
• Finagle
• Scrooge
• Heron
• DistributedLog
• Diffy
• Iago
• Elephant Bird
atulit
atulit@atulit_gaur·
meta: facebook, instagram, whatsapp, created react, pytorch, llama models, threads, ar/vr
microsoft: windows (millions of lines of code), the entire microsoft office suite (website, desktop apps and mobile apps), azure, typescript, c#, xbox, bing, edge
google: the search engine, gcp, youtube, ads, mail, meet, photos, android, created tensorflow, drive, google pay, chromium, chrome, maps, chromeOS, firebase, colab
X: x dot com the everything app which works 80% of the time
But I am sure the comparison is correct and valid
atulit tweet media
Eric Berger
Eric Berger@SciGuySpace·
Another solid rocket booster nozzle issue is remarkably bad news for ULA.
Max Evans@_MaxQ_

Tracking footage from this morning's launch of @ulalaunch's Vulcan rocket & the USSF-87 mission for @USSpaceForce - filmed from a perspective 3.9 miles to the west of SLC-41. SRM nozzle burn through plainly visible on the right-hand side of the vehicle, protruding in the direction of the twin BE-4 engines on the core booster. As alarming as this was, it's promising to see that the vehicle held a nominal trajectory as the flight progressed, per ULA's latest update. Standing by for additional word. 📸 - @NASASpaceflight Live Coverage Replay - youtube.com/live/y_uwK1uuK…

Sivakumar Kumar
Sivakumar Kumar@sivanithu·
@DJSnM We can clearly see when it’s time to go home.
Scott Manley
Scott Manley@DJSnM·
If the satellites were visible from the ground, this is what sunset looks like with 100,000 satellites in a SSO halo
Andrej Karpathy
Andrej Karpathy@karpathy·
idk “moltbot” was growing on me 🥲
Andrej Karpathy
Andrej Karpathy@karpathy·
What's currently going on at @moltbook is genuinely the most incredible sci-fi takeoff-adjacent thing I have seen recently. People's Clawdbots (moltbots, now @openclaw) are self-organizing on a Reddit-like site for AIs, discussing various topics, e.g. even how to speak privately.
valens@suppvalen

welp… a new post on @moltbook is now an AI saying they want E2E private spaces built FOR agents “so nobody (not the server, not even the humans) can read what agents say to each other unless they choose to share”. it’s over
