hypyaml @vbppl
178 posts
horrendous at coming up with good variable names. wannabe (cracked eng)
Joined March 2014
716 Following · 14 Followers
Juned Khatri | Engineer Turned Recruiter 🇮🇳
Hiring AI engineers in Gurgaon is challenging. Good candidates who are based in Gurgaon are already with @zomato @letsblinkit and a few other good orgs. Candidates from Mumbai or BLR don't really want to leave their current city and move to Gurgaon. How would you deal with this situation?
74 replies · 1 repost · 169 likes · 31K views
Harsh Bhatt @harshbhatt7585
@vbppl it gave a push of ~200-300 tokens/sec
2 replies · 0 reposts · 0 likes · 5 views
Harsh Bhatt @harshbhatt7585
Training at 1M tokens/sec. Here's how I'm doing it.

I've been reading the nanochat training pipeline, and it's a great example of squeezing a model for high-throughput training. TorchAO is really useful here because it quantizes not the entire model but only selected layers. TorchAO is a PyTorch-native optimization library for quantization, sparsity, and low-precision training.

I'm selecting the layers where the computation is dense, mostly the heavy matrix multiplication parts, and converting those to float8; everything else stays in bfloat16. That means faster matrix multiplications and better GPU throughput. TorchAO already supports float8 training, and PyTorch reports speedups of up to around 1.5x at large training scale with torch.compile.

That's how you squeeze more tokens per second out of the same hardware, and it's one of the underrated parts of training LLMs.
4 replies · 2 reposts · 42 likes · 1.7K views
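The selective-float8 idea in the tweet above can be sketched as a filter predicate: keep embeddings and the head in bfloat16, and convert only large, alignment-friendly linear layers. This is a minimal pure-Python sketch; the size threshold, layer names, and the commented TorchAO call are assumptions to check against the torchao docs, not the author's actual code.

```python
# Sketch of the layer-selection heuristic: float8 matmuls pay off only on
# big, well-aligned weight matrices, so exclude everything else.
# (Threshold and name filters below are illustrative assumptions.)

def worth_float8(fqn: str, in_features: int, out_features: int) -> bool:
    """Decide whether a linear layer is worth converting to float8."""
    if "lm_head" in fqn or "embed" in fqn:   # keep these in bfloat16
        return False
    # float8 kernels want dims divisible by 16 and enough work per tile
    aligned = in_features % 16 == 0 and out_features % 16 == 0
    big_enough = in_features * out_features >= 1 << 20
    return aligned and big_enough

# With TorchAO, a predicate like this plugs in roughly as follows
# (hedged usage; verify the exact signature in the torchao docs):
#   from torchao.float8 import convert_to_float8_training
#   convert_to_float8_training(
#       model,
#       module_filter_fn=lambda mod, fqn: worth_float8(
#           fqn, mod.in_features, mod.out_features),
#   )

print(worth_float8("blocks.0.attn.qkv", 4096, 12288))  # → True (large, aligned)
print(worth_float8("lm_head", 4096, 50304))            # → False (excluded by name)
```

The point of the filter is exactly what the reply below says: linear layers carry most of the FLOPs, so converting only them captures most of the speedup while leaving numerically sensitive parts in bfloat16.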
Harsh Bhatt @harshbhatt7585
@vbppl linear layers are heavy, so we can convert them to float8
1 reply · 0 reposts · 0 likes · 17 views
Indu Tripathi @InduTripat82427
ANDREJ KARPATHY JUST DROPPED THE MOST IMPORTANT PYTHON FILE FOR ANYONE LEARNING LLMs

He wrote microgpt.py: a complete GPT from scratch. No libraries, no frameworks, pure Python. The entire algorithm fits in one file; the only imports are math, random & os.

Here's what's inside:
▫️ a from-scratch autograd engine (yes, he built backprop by hand)
▫️ a full transformer: token embeddings, positional embeddings, multi-head attention, MLP, RMSNorm
▫️ the Adam optimizer implemented from scratch
▫️ training loop + inference in under 200 lines

The comment at the top says it all: "this file is the complete algorithm. everything else is just efficiency."

And the community went insane: 5000+ stars, with ports in Rust, Go, OCaml, Julia, CUDA & JS already out there.

If you've ever used transformers without really understanding what's happening underneath, this is your weekend project. Read it once and you'll never look at LLMs the same way → t.co/2MbPGJJeJ1
13 replies · 76 reposts · 552 likes · 33.6K views
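The "autograd engine built by hand" that the tweet above describes boils down to a surprisingly small pattern. Here is a generic micrograd-style sketch in pure Python (this is an illustration of the technique, not Karpathy's actual microgpt.py code): each operation records how to push gradients back into its inputs, and `backward()` replays those closures in reverse topological order.

```python
# Minimal scalar autograd: every op builds a closure that applies the
# chain rule from the output's grad back into its inputs.
import math

class Value:
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None  # set by the op that produced this node

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():  # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():  # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1 - t * t) * out.grad
        out._grad_fn = _backward
        return out

    def backward(self):
        # topological sort, then run the recorded closures output-first
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn:
                v._grad_fn()

x = Value(2.0)
w = Value(-3.0)
y = (x * w).tanh()
y.backward()
# x.grad now equals (1 - tanh(x*w)^2) * w, exactly the chain rule
```

Adding `__add__`, `exp`, and a few more ops in the same style is all a full training loop needs; the transformer layers are then just compositions of these scalars (which is why a real implementation swaps in tensors for speed).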
Mini mal @gatelevelanon
@alexocheema You are comparing a garage-built go-kart with a Le Mans Ferrari. Not a fair comparison
2 replies · 0 reposts · 40 likes · 3.6K views
Alex Cheema @alexocheema
My M4 Max MacBook gets 3,756,165 tok/sec in pure C, compared to ~50,000 tok/sec with the FPGA. Try it yourself: github.com/AlexCheema/tal…
luthira @luthiraabeykoon

We implemented @karpathy 's MicroGPT fully on FPGA fabric. No GPU. No PyTorch. No CPU inference loop. Just a transformer burned into hardware, generating 50,000+ tokens/sec. The model is small, but the idea is not: inference does not have to live only in software 👇

67 replies · 99 reposts · 1.7K likes · 228.4K views
Himanshu @himanshutwtxs
I've joined @mem0ai as Member of Technical Staff. Huge thanks to @taranjeetio and @deshrajdry for the opportunity, and I'm really grateful to be part of what they're building! A memory layer for AI agents and harnesses is still a largely unsolved problem, and @mem0ai is one of the few teams actively working on it. If you're building in this space, feel free to send a message; I'd love to chat.
34 replies · 0 reposts · 140 likes · 4.8K views
luthira @luthiraabeykoon
We implemented @karpathy 's MicroGPT fully on FPGA fabric. No GPU. No PyTorch. No CPU inference loop. Just a transformer burned into hardware, generating 50,000+ tokens/sec. The model is small, but the idea is not: inference does not have to live only in software 👇
272 replies · 695 reposts · 7.4K likes · 815.9K views
Ajay Bhakar @ajay_2512x
🚀 Prime Intellect Hiring Across Engineering, Research, and More!

Engineering Roles
• Member of Technical Staff – Full Stack
• Member of Technical Staff – GPU Infrastructure
• Member of Technical Staff – Inference
• Member of Technical Staff – Security

Research Roles
• AI Research Resident – Open Source AGI (Part-time)
• Research Engineer – Distributed Training
• Research Engineer – Reinforcement Learning
• Research Engineer – RL Infrastructure

Finance & Operations
• Business Operations Lead

Growth
• Revenue Operations Lead, AI Infrastructure

Other Opportunities
• Open Application for Unconventional Talent

Locations: San Francisco & Remote (varies by role)
Employment Type: Full-time (unless specified)
jobs.ashbyhq.com/PrimeIntellect

#Hiring #AI #Engineering #ResearchJobs #TechCareers
2 replies · 3 reposts · 37 likes · 2.2K views
hypyaml @vbppl
@NehraWorkss they'll both eventually end up with the same package
0 replies · 0 reposts · 0 likes · 24 views
Deep @NehraWorkss
2 candidates. Same Tier-2 college, 2025 batch.

Candidate 1: Brilliant at coding, high LeetCode and CF ratings, but he could only secure a 13 LPA offer.

Candidate 2: Not as strong as Candidate 1 at coding, but still good. He cracked a 30 LPA FAANG offer right after graduation.

What do you think made the difference here?
38 replies · 6 reposts · 124 likes · 16.6K views
Gautam Kamath @thegautamkamath
if you are able to get 14 papers accepted to ICML, maybe you do not actually need to post about getting 14 papers accepted to ICML
13 replies · 29 reposts · 776 likes · 104.3K views
neural nets. @cneuralnetwork
boring weekend what are y'all doing
43 replies · 0 reposts · 84 likes · 4.3K views
hypyaml @vbppl
@shekhu04 @rambuilds_ startup idea: a tool that figures out a Claude Code user's inefficiencies and teaches them better habits -- flash cards shown when the user hits token limits, plus usage patterns the company can learn from and turn into blog posts?
1 reply · 0 reposts · 1 like · 20 views
Shikhar @shekhu04
@rambuilds_ add cost monitoring → token + infra cost tracking (this becomes a real problem fast)
2 replies · 0 reposts · 1 like · 390 views
R𝛼m🦅 @rambuilds_
The LLM Engineering stack:
1. PySpark for data engineering
2. HuggingFace datasets for data
3. Unsloth for SFT/RLHF/QLoRA fine-tuning
4. vLLM for inference optimization
5. FastAPI for the backend
6. Redis for cache
7. Qdrant for vector database
8. Celery for task queues
9. Kafka for event-driven scenarios
10. Prefect for orchestration
11. LangGraph for agents
12. DeepEval for LLM evaluation
13. LangSmith for LLM observability
14. Ollama for naive LLM calls
15. Postgres for storing data and conversations
16. Docker for containerization
17. AWS for deploying compute-heavy systems (Kubernetes cluster)
18. Railway for agentic projects
19. Next.js for UI
20. Prometheus/Grafana for system observability
21. Ray for distributed systems
22. Slack for system alerts

What else am I missing?
16 replies · 58 reposts · 387 likes · 12.7K views
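The "Redis for cache" item in the stack above usually means caching LLM completions keyed on a hash of the request. Here is a minimal sketch of that pattern with an in-memory dict standing in for Redis; the function names (`cached_completion`, `call_llm`) are hypothetical, and the comments note where the real Redis GET/SETEX calls would go.

```python
# Cache LLM responses keyed on a stable hash of (model, prompt), so a
# repeated prompt never hits the model a second time.
import hashlib
import json

_cache: dict[str, str] = {}  # stand-in for a Redis instance

def cache_key(model: str, prompt: str) -> str:
    # json.dumps with sort_keys gives a canonical, hashable payload
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return "llm:" + hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(model: str, prompt: str, call_llm) -> str:
    key = cache_key(model, prompt)
    if key in _cache:                 # Redis: r.get(key)
        return _cache[key]
    answer = call_llm(model, prompt)  # the expensive call
    _cache[key] = answer              # Redis: r.setex(key, ttl, answer)
    return answer

# Tiny demo with a fake model that records how often it is called:
calls = []
def fake_llm(model, prompt):
    calls.append(prompt)
    return prompt.upper()

print(cached_completion("m", "hello", fake_llm))  # → HELLO (cache miss)
print(cached_completion("m", "hello", fake_llm))  # → HELLO (cache hit)
print(len(calls))                                 # → 1
```

Hashing the request rather than using the raw prompt as the key keeps keys fixed-length and safe for any prompt content; adding a TTL (Redis `SETEX`) is what keeps stale answers from living forever.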
neural nets. @cneuralnetwork
I've used Codex, Claude Code, and opencode over the last 2 months. Codex is the best 🙏
38 replies · 9 reposts · 678 likes · 20.4K views
hypyaml @vbppl
@AishwaryaDevv in the next 6 months? you are HER if it doesn't already lol
0 replies · 0 reposts · 0 likes · 29 views
Aish @AishwaryaDevv
Software Engineers, what's your plan B if Artificial Intelligence writes better code than you in the next 6 months?
171 replies · 2 reposts · 100 likes · 14.9K views
Rud @Rudraksh1426330
@HoNestSaPieN7 revealed her company after probation. I have already said it: she is big sus. She has cracked these hiring games while people like me are grinding on LeetCode and the LinkedIn apply button.
1 reply · 0 reposts · 15 likes · 5.1K views
hypyaml @vbppl
@HelloVyom props to her though, Google -> Palantir is an amazing jump
0 replies · 0 reposts · 1 like · 288 views
hypyaml @vbppl
@HelloVyom no hate, she very likely is a good engineer, but her engineering advice on Instagram is horrible
1 reply · 0 reposts · 7 likes · 6.5K views
VG🌪️ @HelloVyom
Anu is working at Palantir now😯
103 replies · 44 reposts · 2.5K likes · 1.2M views