A Fellow Struggler

388 posts

@BotanicBinary

Just another guy trying to make machines learn to eliminate us. Currently at @microsoft building "another Copilot"

Joined May 2025

153 Following · 87 Followers

Pinned Tweet

A Fellow Struggler@BotanicBinary·24 Jun

Thanks for this, @cneuralnetwork. I will take this up personally. Broad aims:
- End-to-end understanding of the ML side of modern AI, mostly post-training and RL. Includes detailed equations, paper readings and code implementations.
- Get into the application layer and build stuff that I will be proud of using.
I want to achieve in these 6 months something that might take 2-3 years in normal time. As Elon said, I may not achieve everything, but the effort will put me ahead of a lot of people in these few months. See ya all tomorrow. Ciao.

neural nets.@cneuralnetwork

i want to start something small but powerful: a movement called "180 Days of Whatever".
here’s the idea: for the next 180 days, you’ll do two things:
- set one goal you’re determined to achieve in these 6 months, big or small, personal or professional
- show up daily: document your learnings on Notion, or if you're coding, push it to GitHub
i believe you can genuinely turn your life around in 180 days, but only if you stay consistent and let others witness your journey.
starting tomorrow, i’ll post my daily post and i want you to reply with yours. you'll build in public, support each other, and make your progress visible. my reach will amplify your posts.
let’s begin today. reply to this with the one goal you want to achieve in 180 days. write it down. make it real.

A Fellow Struggler@BotanicBinary·3d

Looks promising. One feature request, which maybe I can fork as well: in the edu mode, add something called "assignment" mode. The idea: set up the research paper as an assignment with boilerplate code and `# TODO` blocks, so students implement the important methods themselves.

pdawg@prathamgrv·3d

I made a Claude Code skill that turns any arxiv paper into working code. Every line traces back to the paper section it came from & any implementation detail the paper skips will be flagged, and not assumed. open sourcing it - github.com/PrathamLearnsT…

A Fellow Struggler@BotanicBinary·6d

@karpathy "Software engineering will be dead in the next 6 months" lol

Andrej Karpathy@karpathy·31 Mar

New supply chain attack this time for npm axios, the most popular HTTP client library with 300M weekly downloads. Scanning my system I found a use imported from googleworkspace/cli from a few days ago when I was experimenting with gmail/gcal cli. The installed version (luckily) resolved to an unaffected 1.13.5, but the project dependency is not pinned, meaning that if I did this earlier today the code would have resolved to latest and I'd be pwned. It's possible to personally defend against these to some extent with local settings e.g. release-age constraints, or containers or etc, but I think ultimately the defaults of package management projects (pip, npm etc) have to change so that a single infection (usually luckily fairly temporary in nature due to security scanning) does not spread through users at random and at scale via unpinned dependencies. More comprehensive article: stepsecurity.io/blog/axios-com…

Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.
The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now.
Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence
If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
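
The "audit your lockfiles" advice above could be sketched roughly as follows. This is a minimal illustration, not an official tool: it assumes an npm `package-lock.json` in the v2/v3 layout (a top-level `packages` map keyed by `node_modules/...` paths), and the names `FLAGGED`, `find_flagged` and `scan_file` are hypothetical. The only flagged versions are the ones named in the quoted alert.

```python
import json
from pathlib import Path

# Packages named in the quoted alert. A value of None means "any version".
# Assumption: this table is illustrative and would need updating from advisories.
FLAGGED = {"axios": {"1.14.1"}, "plain-crypto-js": None}


def find_flagged(lock: dict) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs from a parsed lockfile that match FLAGGED."""
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/a/node_modules/b"; the package name
        # is whatever follows the last "node_modules/" segment.
        name = path.rpartition("node_modules/")[2]
        version = meta.get("version", "")
        if name in FLAGGED and (FLAGGED[name] is None or version in FLAGGED[name]):
            hits.append((name, version))
    return sorted(hits)


def scan_file(lockfile: str) -> list[tuple[str, str]]:
    """Load a package-lock.json from disk and scan it."""
    return find_flagged(json.loads(Path(lockfile).read_text()))
```

Calling `scan_file("package-lock.json")` in a project directory would list any matching entries; an empty list only means none of the flagged names and versions appear in that lockfile.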

A Fellow Struggler@BotanicBinary·30 Mar

@satyanadella What about latency? Do enterprise users even have the patience to allow critiques to happen, if a single model can give good enough results?

Satya Nadella@satyanadella·30 Mar

Introducing Critique, a new multi-model deep research system in M365 Copilot. You can use multiple models together to generate optimal responses and reports.

A Fellow Struggler@BotanicBinary·30 Mar

(no text content)

A Fellow Struggler@BotanicBinary·27 Mar

@himanshustwts @karpathy Sorry, I meant RoPE instead of two Muons

A Fellow Struggler@BotanicBinary·27 Mar

@himanshustwts @karpathy Nanochat has muon, gqa with kv caching and muon optimizer in it now

himanshu@himanshustwts·27 Mar

nanoGPT by @karpathy is still the most relevant reference to hack and learn if someone is starting out in ai research. i tried to look (it's been a long time!) at all the work that has been done to beat the baseline:
> Architectural modernization (RoPE, QK-norm, ReLU, RMSnorm etc)
> Optimizer innovation (Muon, alts to AdamW, NorMuon, SOAP etc)
> Attention Kernel work (FlexAttention, FA3, long-short sliding window attention etc)
> Context/Window scheduling (window size warmup, explicit max sequence length schedules etc)
> Residual engineering (skip connections from the embedding to every block, skip from block 3 to 6, value residuals, Partitioned HPC etc)
> Value-path augmentation
> Logit stabilization / output-head hacks (FP8 matmul for the LM head, asymmetric rescale, softcapped / tanh-capped logits etc)
> Initialization tricks
> Weight tying became a tunable knob (untie embed and lm_head at 2/3 of training etc)
> Data curation and dataset swaps (Fineweb-Edu, nvidia Climbmix etc)
> Data packing / document-boundary handling
> Gradient/update scheduling tricks
> Batch-size + Sequence length scheduling
> Backout mechanisms
> Local memory/lookback hacks
> Sparse/gated mixing
> Token-feature enrichment (Partial Key Offset and Bigram hash embedding)
> Compiler/software stack upgrades
> Alternative model families are now entering the same benchmark frame
there are many, many incredible hacks that have been done, and dude, we see someone beating the baseline every week.

A Fellow Struggler@BotanicBinary·17 Mar

@reach_vb And the only Anthropic benchmark visible is "Haiku 4.5". lol

Vaibhav (VB) Srivastav@reach_vb·17 Mar

it's a good model, sir!

A Fellow Struggler@BotanicBinary·15 Mar

@mohitwt_ @jino_rohit Nah man this is unreal, don’t even have half of this ability. Keep us updated!

mohit@mohitwt_·15 Mar

@BotanicBinary @jino_rohit none, sometimes i use the web interface to ask ai some clarifications or better, more efficient ways to do things but never agents and stuff

mohit@mohitwt_·15 Mar

In the last 2 days, I added to my framework:
activation functions:
- ReLU
- GELU
- SiLU
- Softmax / LogSoftmax
optimizers:
- SGD
- Adam
- AdamW
loss functions:
- CE
- MSELoss
- BCEWithLogitsLoss
normalization layers:
- BatchNorm1d
- BatchNorm2d
- LayerNorm
image:
- Conv2d
- MaxPool2d
- AvgPool2d
Along with a number of helper utilities and internal improvements.
On top of that, I ran several end-to-end integration tests over the framework and tested everything from tensor ops, broadcasting, autograd and deep layers to connecting everything together. it works. The framework successfully trained multiple models last night. i need to implement multiple architectures now to train bigger models with more accuracy and then move to transformers. will share detailed training results tomorrow.

mohit@mohitwt_

i think i did it. all coming together, ~27 days.. it fucking works man. much better than last time

A Fellow Struggler@BotanicBinary·15 Mar

@ravikiran_dev7 Let's hope they don’t remove it for the employees as well. Or is it because we use it so much that students are being cut?

Ray🫧@ravikiran_dev7·14 Mar

Guess what !! Yesterday I got copilot pro student and today they removed claude's best models 😭😭 My luck is the worst thing 🥲

Ray🫧@ravikiran_dev7

Just got GitHub Copilot pro for absolutely free for next 2 years 😋🎉

A Fellow Struggler@BotanicBinary·15 Mar

Refactored my website from trash to somewhat reasonable in one day with Claude Code. Just used CLAUDE.md, context management and the Playwright MCP. Check it out here: sohammistri.github.io, more stuff incoming. (I promise :)

A Fellow Struggler@BotanicBinary·14 Mar

Had my first weekend with Claude Code. I have used GitHub Copilot Chat and Agent in VS Code at work with Opus 4.6, so I have a surface-level understanding of how good these agentic coding tools are. The main advantage of the Claude Code CLI I have seen so far is the ability to add a lot of custom commands and keep a local memory using CLAUDE.md. Next up is the intermediate-level stuff like skills, MCP and hooks. Will do it tomorrow.

A Fellow Struggler@BotanicBinary·11 Mar

The world around me is changing. Sometimes I feel like quitting and starting to build something

A Fellow Struggler@BotanicBinary·10 Mar

Earlier, the 4 and 4o series were released on Copilot days after OpenAI got them out on ChatGPT. Even a single ckpt of 4o that was much better for chat was shipped months later, because we didn’t know our 4o ckpt was not finetuned for chat and tool calling. It's only from GPT-5 onward that we have been shipping on the same day. Quality gaps still exist, but they are less severe

himanshu@himanshustwts·9 Mar

never thought i'd see microsoft two months behind state-of-the-art

Satya Nadella@satyanadella

Announcing Copilot Cowork, a new way to complete tasks and get work done in M365. When you hand off a task to Cowork, it turns your request into a plan and executes it across your apps and files, grounded in your work data and operating within M365’s security and governance boundaries.

A Fellow Struggler@BotanicBinary·7 Mar

@YuvrajS9886 Or distill behaviours into them

Yuvraj Singh@YuvrajS9886·6 Mar

Time to distill Sarvam models ig

A Fellow Struggler@BotanicBinary·6 Mar

@birdabo @DevinSoto @OpenAI And how do we know you are also not doing the same?

sui ☄️@birdabo·6 Mar

@DevinSoto @OpenAI people do that because it gets them engagement == x creator rev.

sui ☄️@birdabo·6 Mar

gpt-5.4 seems like a failure.
> zero SM-Bench category leads (worse than 5.2 and crushed by 5.1)
> creative writing: dead.
> hallucinates worse than gpt-5.1 lmao.
> guesses instead of clarifying.
> chat in chatgpt is officially dead.
what the happened to @OpenAI?

A Fellow Struggler@BotanicBinary·4 Mar

(no text content)

A Fellow Struggler@BotanicBinary·1 Mar

@adonis_singh Based on the model weights we see, 5.2 is also a 64-layer-deep encoder, but from a different pre-training run than 5.1 and earlier models. I think 4.5 was 120 layers deep, iirc. I am afraid that RL will be so expensive with rollouts and updates; maybe Grok can do something with all its GPUs

adi@adonis_singh·1 Mar

I still believe no one has trained a reasoning model with a base that's as strong as gpt-4.5

A Fellow Struggler@BotanicBinary·1 Mar

@ajay_2512x Why the compulsory X and StackOverflow account?

Ajay Bhakar@ajay_2512x·1 Mar

🚀 Hiring at Sarvam AI
📍 Bengaluru, Karnataka (On-Site)
🕒 Full-Time & Internship Opportunities
Open Roles
• AI Researcher / Research Engineer – Autonomous Agents
• Machine Learning Engineer – Computer Vision & VLM
• Machine Learning Engineer – Sarvam Studio
• ML Engineer
• Backend AI Engineer – Sarvam Studio
• Software Engineer – Backend Systems (Autonomous Agents)
• Backend Engineer – API Team
• Backend Engineer – Samvaad
• Frontend Engineer – API Platform
• Senior Frontend Engineer
• Frontend Engineer Intern – Arya Team
• Principal Security Engineer
Be part of India’s AI revolution 🇮🇳
Apply here: careers.kula.ai/sarvam-ai
#Hiring #AIJobs #MachineLearning #StartupHiring #Bengaluru #AutonomousAgents #SarvamAI

A Fellow Struggler@BotanicBinary·24 Feb

@shravi_aj Work from cafe is the fakest stuff to ever exist. Nothing serious gets done in such places, and it costs a lot.
