Sergiu Nistor

2K posts

@SergiuNistor6

Research Engineer/ML Engineer, Ex-FAANG, Independent Researcher - Deep Learning, AI

Joined January 2021
1.2K Following · 485 Followers
Sergiu Nistor@SergiuNistor6·
@yulintwt Well that's pretty much every deep learning paper ever. Very few papers provide mechanistic interpretability motivations. Most just provide intuition-based motivations, or just present the results on some benchmarks and call it a day. AI research is engineering, not research.
Scholarship for PhD@ScholarshipfPhd·
Say hi and I’ll recommend a research topic that perfectly fits your profile.
Sergiu Nistor@SergiuNistor6·
@GenAI_is_real You're confusing ML engineer with AI engineer. AI engineers use LLMs in the context of agents. ML engineers fine-tune LLMs, train other models, handle data pipelines, build training pipelines etc.
Chayenne Zhao@GenAI_is_real·
ml engineer is a legacy title. by 2030, you’re either an architect who knows how to orchestrate thousands of agents, or you’re just a casual user. the real job is 'system builder'—ensuring that the training-inference loop is tight and the data distribution doesn't drift. vibe coding is the entry point, but high-concurrency systems like sglang are the foundations that actually make those vibes scale.
aditii@aditiitwt

only jobs left in 2030
- vibe coder
- prompt engineer
- bug fixer
- devops engineer
- ml engineer
- ai engineer
- cyber security expert

Sergiu Nistor retweeted
Yuchen Jin@Yuchenj_UW·
From my observation of friends around me, those who’ve worked at frontier AI labs experience exponential growth. It’s not just technical. It’s a deeper shift in how they view the world, trends, and themselves. Being immersed in an environment full of other exceptional people led to exponential growth. There’s a clear lesson here for startups: hire the very best, put them together, and you get compounding effects. And for each individual, find the environment that puts you on an exponential curve.
Sergiu Nistor retweeted
λux@novasarc01·
if you want to get into research this is actually one of the best times to do it. as a beginner you don’t need to wait for opportunities to come to you or for the “right” people to guide you...you can just start. pick a domain you’re curious about (RL, mech-interp, diffusion models, etc.) and begin running experiments + learn about ml systems, inference, infra while building projects.

i remember as a beginner some of the experiments i worked on (and i am still experimenting!) involved:
- analyzing what changes inside a model during post-training: tracking where the model changes internally, how those changes evolve across checkpoints, and which layers causally implement the new behaviors
- looking at internal geometric shifts as training progresses
- understanding how reward hacking happens by creating synthetic datasets, intentionally pushing models to reward-hack in an environment, and then analyzing what’s going on internally

in this economy the only real proof of work is what you’ve built, the messy experiments you run and the knowledge you gain while fixing and debugging things. everything else is just a proxy.
Noam Brown@polynoamial

I'm often asked how to land a research job at a frontier AI lab. It's hard, especially without a research background, but I like to point to @kellerjordan0 as an example showing it can be done. Keller graduated from UCSD with no publication record and was working at an AI content moderation startup when he landed a cold call with @bneyshabur (who was at Google) and presented an idea to improve upon Behnam's recent paper. Behnam agreed to mentor him, which led to an ICLR paper. Sadly there's less open research today, but improving upon a researcher's published work is a great way to demonstrate excellence to someone inside a lab and give them the conviction to advocate for an interview. Later, Keller got on @OpenAI's radar thanks to the NanoGPT speed run he started. All his work was documented and it was easy to measure his success, so the case for hiring him was strong. Keller is one example, but there's plenty of other success stories as well: 🧵
Sergiu Nistor@SergiuNistor6·
@mishig25 @xenovacom I'd totally love this. I've been looking for the hardcover version of it for quite a while.
Sergiu Nistor@SergiuNistor6·
Just finished the Data Analytics Specialization from DeepLearning.AI. 📊 A more relaxed, practical pause from deep learning. Focused on applied stats, analytics workflows, and making insights clear and actionable for stakeholders.
Sergiu Nistor@SergiuNistor6·
First cert of the year 🚀 Completed DeepLearning.AI’s Mathematics for ML & Data Science Specialization.
- Strengthened understanding of eigenvectors
- Reinforced Bayesian fundamentals
Now reading Bishop’s "Pattern Recognition & ML". New year, stronger fundamentals 📈
Martin Shkreli@MartinShkreli·
55 billion minutes spent on AI sites last month. 64% chatgpt, 15% gemini. deepseek and grok the only two on a trajectory to come even close.
Sergiu Nistor retweeted
elie@eliebakouch·
Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training, and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably. huggingface.co/spaces/Hugging…
Sergiu Nistor@SergiuNistor6·
Christmas is close, but the learning continues! I completed the Machine Learning Specialization from DeepLearning.AI and found the reinforcement learning and recommender system modules especially strong. Andrew Ng’s clarity explaining complex ideas stands out. Onward!
Rondo_AI@rondo_ina_condo·
@SergiuNistor6 @DeepLearningAI I literally just started like 48 hours ago. So I can't answer that cuz I don't know. Definitely get back to you on this 😅
Sergiu Nistor@SergiuNistor6·
Just wrapped up the Deep Learning Specialization by @DeepLearningAI 🎉 Big takeaway: the course really nails why models are built the way they are. It connects math function design directly to goals like memory and attention, so architectures actually make intuitive sense.
Sergiu Nistor@SergiuNistor6·
@Rondo_ina_Condo @DeepLearningAI Best of luck! What do you think about the mathematical interpretations for the GRU and LSTM memory gates, and for self-attention, and transformer positional encodings? Personally it really changed the way I think about these models. It makes much more sense now.
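The positional encodings mentioned above have a compact mathematical form. As a reference for what the tweet is pointing at, here is a minimal NumPy sketch of the standard sinusoidal positional encoding from the original Transformer paper (this is the textbook formulation, not code from the course):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encodings:
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    Nearby positions get similar vectors; each dimension pair
    oscillates at a different frequency, so position is encoded
    at multiple scales.
    """
    positions = np.arange(seq_len)[:, None]           # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # shape (1, d_model // 2)
    angles = positions / (10000 ** (dims / d_model))  # broadcast to (seq_len, d_model // 2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

pe = sinusoidal_positional_encoding(seq_len=128, d_model=64)
print(pe.shape)  # (128, 64)
```

Because the encoding is a fixed function of position rather than learned parameters, it generalizes to sequence positions not seen during training, which is part of the "mathematical interpretation" the course emphasizes.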
em@embuildstech·
@SergiuNistor6 @DeepLearningAI how was the process working on this? I’ve used this site before and felt like it takes a while.
Sergiu Nistor@SergiuNistor6·
@embuildsai @DeepLearningAI Unfortunately I'm a bit of a special case since I already have experience in the field. I do think it's quite hard, but they start with the basics. I'd call it more of an intermediate/advanced specialization, as it explains concrete methods and models in an intuitive way.