Rasheed Posts
@rasheedpostx

5.9K posts

Building a new SaaS project.

Joined February 2017
246 Following · 1.8K Followers
Rasheed Posts retweeted
Saurav Chaudhary@sauravstwt·
The DevOps Interview Question That Ends Careers.

Most DevOps careers quietly die at this question. Here it is:

“Your Kubernetes application is slow. CPU is at 20%. Memory is at 40%. No errors in logs. Pods are running. Dashboards are green. What do you debug first?”

What Weak Answers Sound Like

1. “Maybe the database.”
2. “Maybe we need more replicas.”
3. “I’ll check application logs.”
4. “I’ll restart the pods.”

These answers tell the interviewer one thing: you debug by guessing. And guessing is expensive in production.

What Seniors Actually Do

Senior engineers don’t touch the cluster first. They read the system. Here’s the mental model that separates hires from rejections:

1. Request Path, Not Resources
DNS → Load Balancer → Node → Pod → Runtime → App → DB
Latency hides in paths, not dashboards.

2. CPU Throttling (The Silent Killer)
20% CPU usage doesn’t mean 20% CPU availability. Cgroups can throttle threads while metrics look “healthy”.

3. Networking Reality
MTU mismatch, packet loss, conntrack saturation, CNI flaps. Your app isn’t slow; your packets are suffering.

4. Thread Pools & Queues
CPU idle, memory free… but workers are blocked. This is where most engineers go blind.

5. Downstream Degradation
Healthy pods calling unhealthy dependencies. The cluster looks innocent. The downstream isn’t.

This is what interviewers are actually testing: can you reason across layers without panicking?

Why This Question Ends Careers

Tools don’t save you here. Commands don’t save you. Certifications don’t save you. Only system thinking does. If you can’t explain where latency hides when metrics look green, you’re not ready for senior DevOps, no matter how many years you have.

#DevOps #SRE #Tech #Infrathrone #Kubernetes #AI #StreamingScale #Observability #HotstarScale #PlatformEngineering #Cybersecurity
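The "silent killer" point about CPU throttling is directly checkable: on cgroup v2, a container's `/sys/fs/cgroup/cpu.stat` exposes `nr_periods` and `nr_throttled`. A minimal sketch of the check, fed sample data through a heredoc so it runs anywhere (the numbers and the 10% threshold are illustrative, not a standard):

```shell
# Detect CPU throttling from cgroup v2 cpu.stat.
# Inside a CPU-limited container you would read /sys/fs/cgroup/cpu.stat;
# here the sketch is fed a sample so it is self-contained.
check_throttling() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0 && throttled / periods > 0.1)
        printf "THROTTLED: %d of %d periods (%.0f%%)\n",
               throttled, periods, 100 * throttled / periods
      else
        print "ok: throttling below 10% of periods"
    }'
}

# Sample cpu.stat from a pod whose dashboards "look green":
check_throttling <<'EOF'
usage_usec 912345678
user_usec 800000000
system_usec 112345678
nr_periods 50000
nr_throttled 18000
nr_bursts 0
throttled_usec 950000000
EOF
# → THROTTLED: 18000 of 50000 periods (36%)
```

This is exactly the case where average CPU usage reads 20% while a third of scheduling periods were throttled.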
English
6
27
173
14.1K
Rasheed Posts retweeted
Balogun Hammed@bhalloinfraguy·
Unpopular opinion: the best way to learn cloud is NOT starting with cloud.

Start with a Linux VM. Install services manually. Break DNS. Fix it. Configure a firewall. Set up a web server. Make it fail. Troubleshoot it. THEN go to the cloud.

Because AWS doesn't teach you what a subnet actually is. It just gives you a text box to type one in.

The engineers who struggle in the cloud are the ones who skipped the fundamentals. The engineers who fly through it? They spent years on-prem first.

The cloud is an abstraction. You need to understand what's being abstracted.
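In the spirit of the "break DNS, fix it" exercise: a resolver is, at its most basic, a lookup table, which is what `/etc/hosts` editing teaches you. A toy sketch with a fake hosts file so it runs anywhere (the `resolve` helper and the `.internal` names are made up for illustration; on a real VM you would edit `/etc/hosts` or `/etc/resolv.conf` and watch resolution break):

```shell
# Toy hosts-file resolver: hostname in, IP out.
resolve() {
  # $1 = hostname to look up, stdin = hosts-file contents
  awk -v host="$1" '$2 == host { print $1; found = 1 }
                    END { if (!found) print "NXDOMAIN" }'
}

hosts='127.0.0.1 localhost
10.0.1.5 app.internal
10.0.1.9 db.internal'

echo "$hosts" | resolve app.internal    # → 10.0.1.5
echo "$hosts" | resolve cache.internal  # → NXDOMAIN (the "broken" case)
```

The cloud's VPC DNS is several layers more sophisticated than this, but the mental model of "a mapping that can be wrong, stale, or missing" is what the on-prem years build.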
English
7
29
178
9.5K
Rasheed Posts retweeted
Akhilesh Mishra@livingdevops·
Kubernetes is going to be one of the most important technologies of 2026, 2027, and 2028. And most developers still don't take it seriously enough.

When containers came, everyone celebrated because Docker finally killed the "works on my machine" problem. You could package your app and ship it anywhere, and it just worked. But then everyone started running 50 containers at once, and nobody knew how to manage them. Services crashed with no auto-restart, IPs kept changing, and scaling was a nightmare.

Google had already solved this problem internally for years, running millions of containers serving billions of people. They took that system, rebuilt it, and gave it to the world for free in 2014. That was Kubernetes.

For years, everyone called it a microservices tool, and that story stuck. But production kept evolving, and Kubernetes evolved with it. Discord ran millions of isolated Cassandra instances on Kubernetes. Netflix, Uber, and other big tech companies ran multiple DB clusters, Redis clusters, and Kafka clusters. That was the first sign this tool was much bigger than microservices.

Then the ML teams adopted K8s, and everything changed again. Training one model the old way meant one giant VM running for two days, and if you forgot to stop it, you burned money for nothing. ML teams do not train one model. They run hundreds of experiments simultaneously, with different parameters, different datasets, and different approaches, all running in parallel at the same time. Kubernetes made this possible at scale, because Karpenter provisions the right GPU nodes automatically for each job, and when jobs finish, the nodes disappear and you pay for exactly what you used. OpenAI did not train GPT on one VM. They ran tens of thousands of parallel training jobs, and that kind of scale only works on Kubernetes.

Now every company building AI agents needs it even more. An agent needs its own memory, its own context, and its own compute, and you cannot share these safely across users. One user on your platform means one agent. One million users means one million agents running at the same time. Your system needs to scale from zero to a million and back to zero in minutes, automatically, based on live traffic. No other platform handles this cleanly. Nothing comes close.

Microservices, databases, ML training, AI agents. One platform running all of it.

Kubernetes is not easy. It is complex by design, because the problems it solves are complex. But neither is the scale modern systems demand. If you want to build systems for the next decade, you cannot ignore it.
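The "hundreds of experiments in parallel" pattern maps naturally onto a Kubernetes Job with `parallelism` and a per-pod GPU request; a minimal sketch (the Job name, image, and trial counts are placeholders, and a node autoscaler such as Karpenter would be what adds matching GPU nodes on demand):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hparam-sweep          # hypothetical experiment batch
spec:
  parallelism: 8              # run 8 trials at once
  completions: 100            # 100 trials total
  completionMode: Indexed     # each pod gets a JOB_COMPLETION_INDEX
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trainer
          image: registry.example.com/trainer:latest  # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1   # one GPU per trial
```

Each pod picks its hyperparameters from its completion index; when the Job finishes, the autoscaler can reclaim the GPU nodes, which is the "pay for exactly what you used" behavior described above.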
English
4
7
103
6.2K
Rasheed Posts retweeted
Balogun Hammed@bhalloinfraguy·
If you’re still SSH-ing into servers one by one to apply updates in 2026, let me help you fix that. Here’s how to patch 50 servers with one command 🧵
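The thread's actual one-liner isn't included here, but the usual shape is a parallel fan-out over an inventory file (Ansible, pssh, or plain `xargs` + `ssh`). A hedged sketch with `echo` standing in for `ssh` so it runs without real servers; the host names and the apt command are placeholders:

```shell
# Fan out one patch command over many hosts, up to 10 at a time.
# Replace `echo` with `ssh` (key-based auth assumed) to run for real.
printf '%s\n' web01 web02 db01 |
  xargs -P 10 -I{} \
    echo ssh {} "sudo apt-get update && sudo apt-get -y upgrade"
# prints one ssh command per host (order may vary under -P)
```

With `echo` removed, this is the classic poor man's config management; at real scale, tooling like `ansible -m apt` wraps the same fan-out with retries, inventories, and reporting.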
English
33
69
609
116.5K
Can Vardar@icanvardar·
is it just me or is claude getting worse these days?
English
95
2
141
13K
Rasheed Posts@rasheedpostx·
@levelsio @marclou @X I hope you realise by now that X is being used to control narratives. It’s literally a tool on a belt. And you know who is currently using it. Many thought Elon Musk buying X would make it truly free. I believe even he thought that at some point. Clearly he’s given up.
English
0
0
0
7
Felix Rieseberg@felixrieseberg·
You know, we think about this literally 24/7 and I suspect that we’ll figure this out eventually. For now, they offer slightly different flavors to users - chat for easy conversations, Cowork when you want to safely work on something, Code if you’re a developer and want to code with Claude.
English
59
5
536
40.7K
Max Hodak@maxhodak_·
why are claude, cowork, and code three different uis? why is this not all just claude?
English
125
12
1.2K
205.3K
Rasheed Posts retweeted
Ahmad@TheAhmadOsman·
Memory bandwidth for local AI hardware matters a lot more than most people think.

People keep comparing boxes like this: model size vs memory capacity. That is only half the story. The better mental model is:

> capacity = what fits
> bandwidth = how hard it can breathe
> software stack = how much of that you actually cash out

You are buying a memory subsystem and then negotiating with physics.

Here is the current local AI hardware ladder:

> RTX PRO 6000 Blackwell > 96GB > 1792 GB/s
> RTX 5090 > 32GB > 1792 GB/s
> RTX 4090 > 24GB > 1008 GB/s

Raw single-card bandwidth king stuff.

Now Apple:

> Mac Studio M3 Ultra > up to 512GB unified memory > 819 GB/s
> Mac Studio M4 Max > up to 128GB > 546 GB/s
> MacBook Pro M5 Max > up to 128GB > 460 to 614 GB/s
> MacBook Pro M5 Pro > up to 64GB > 307 GB/s
> Mac mini M4 Pro > up to 64GB > 273 GB/s
> MacBook Air M5 > up to 32GB > 153 GB/s

Apple is not winning raw bandwidth vs top NVIDIA. Apple is winning the "I want one quiet box with a stupid amount of usable memory" argument. And that is still a very real argument.

Now another interesting new category:

> DGX Spark > 128GB unified memory > 273 GB/s
> GB10 class boxes like ASUS Ascent GX10 > 128GB unified memory > 273 GB/s

These are not bandwidth monsters. They are coherent-memory NVIDIA CUDA appliances. That matters, because 128GB in one box changes what fits locally, even if it does not magically outrun a 5090 once the same model fits on both + CUDA.

Then there is the one category that actually made x86 interesting again for local AI:

> Ryzen AI Max / Strix Halo > up to 128GB unified memory > 256 GB/s > up to 96GB assignable to GPU on Windows

This is also where the Framework Desktop matters. Not "just another mini PC". This is one of the first mainstream x86 boxes where local AI starts feeling like a serious hardware class instead of a laptop pretending very hard.

Then the trap people keep falling into: most "AI PCs" are not in this tier. They are down here:

> Snapdragon X Elite > 135 GB/s
> Intel Lunar Lake > 136 GB/s
> Snapdragon X2 Elite > 152 to 228 GB/s depending on SKU
> regular Ryzen AI 300 class > way closer to thin-and-light territory than Strix Halo

These are fine machines. But the AI sticker does not create memory bandwidth. Physics is still in charge, which is rude but consistent.

AMD discrete cards:

> RX 7900 XTX > 24GB > 960 GB/s
> Radeon PRO W7900 > 48GB > 864 GB/s
> Radeon AI PRO R9700 > 32GB > 640 GB/s

Not the CUDA default answer, but definitely not irrelevant.

Intel is interesting now too:

> Arc Pro B65 > 32GB > 608 GB/s
> Arc Pro B60 > 24GB > 456 GB/s

And then there is Tenstorrent:

> Tenstorrent Wormhole n300 > 24GB > 576 GB/s
> Tenstorrent Blackhole p150 > 32GB > 512 GB/s

Not mainstream, but absolutely relevant if you care about alternative and open-source local AI stacks.

So what does all of this actually mean? It means the local AI market is really five different markets wearing the same buzzword:

> fastest raw speed when it fits: discrete NVIDIA
> biggest one-box memory story: Apple Ultra
> coherent NVIDIA appliance: DGX Spark / GB10
> first x86 unified-memory contender: Strix Halo / Ryzen AI Max
> oss stack: Tenstorrent

That is why people keep talking past each other. A 5090 can absolutely embarrass a lot of unified-memory boxes if the model fits. A Mac Studio M3 Ultra can fit things a 5090 cannot dream of fitting in one card. A DGX Spark is interesting because it is compact coherent NVIDIA with 128GB & 273 GB/s + CUDA. A Strix Halo box is interesting because it finally gives x86 a real answer to "what if I want big local models in one machine without going full workstation GPU?"

Now stop asking:

> which box is best?

Start asking:

> what must fit?
> what bandwidth tier do I need?
> what software stack do I trust?
> which bottleneck am I buying?

That is how you stop guessing. That is how you actually design a local AI system.

And yes, most people still need to buy a GPU.
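A quick way to see why bandwidth is "the real limiter": memory-bound decode speed is roughly bandwidth divided by the bytes read per generated token, which for a dense model is about the model's size in memory. A back-of-envelope sketch; the 40GB figure (roughly a 70B model at 4-bit) is illustrative, and this is an upper bound that ignores compute, KV cache traffic, and software efficiency:

```shell
# Rough ceiling on decode tokens/sec for a memory-bound dense model:
#   tok/s ≈ memory bandwidth (GB/s) / bytes touched per token (~model GB)
est_tps() {  # $1 = bandwidth in GB/s, $2 = model size in GB
  awk -v bw="$1" -v gb="$2" 'BEGIN { printf "%.1f tok/s\n", bw / gb }'
}

est_tps 1792 40   # RTX PRO 6000 class → 44.8 tok/s
est_tps 819  40   # M3 Ultra class    → 20.5 tok/s
est_tps 273  40   # DGX Spark class   → 6.8 tok/s
```

Same model, same "it fits", a 6x spread in usable speed; that is the capacity-vs-bandwidth distinction the list above is making.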
English
48
49
439
26.1K
Viktor Seraleev@seraleev·
I’ve been using a leased RAV4 for the past 4 years. This year, I’m planning to upgrade and buy a car of my own
Viktor Seraleev tweet media
English
8
0
36
7.1K
Viktor Seraleev@seraleev·
CTO at 37signals. Creator of Ruby on Rails: David Heinemeier Hansson. I’ve read Rework and Remote; both had a big impact on me starting my journey as a solo founder. Don’t be afraid to dream. Read. Build. Your goals can become reality.
DHH@dhh

Nice day for a drive eh.

English
10
19
567
45.8K
Rasheed Posts@rasheedpostx·
Instead of asking “Can AI do this?”, ask: “Can I make this deterministic and repeatable?” Automation > Intelligence
English
0
0
3
44
Rasheed Posts retweeted
jon allie@jonallie·
For engineers that are struggling to use (or see the value in) agent-assisted coding, give this a try:

1. Pick a domain and language that you know really well. You should be confident that you could implement a good version of whatever you want to build, given enough time.
2. Write a design document (with the same care that you would for a team of humans). Include things like tech choices and invariants.
3. Ignore people talking about one-shot prompting, agent swarms, etc., and start by telling the agent to read your doc and confirm its understanding.
4. Ask the agent to make an implementation plan, decompose the plan into work items, and record them (I use a local issue tracker.. separate post).
5. Read every diff. Correct and critique the agent's work as if you were mentoring a junior developer.

For senior engineers, this isn't that much different than what you probably do anyway, except that the agent is always available, faster at typing, and happy to do painful refactors without complaint.
English
29
44
469
36.9K
Rasheed Posts retweeted
Mckay Wrigley@mckaywrigley·
@alexalbert__ notes + research + knowledge base w/ cc + obsidian you guys should consider a rebrand to claude agent to get more non-devs messing around with workflows. claude code is all you need youtube.com/watch?v=d7Pb73…
YouTube video
English
10
12
374
72K
Rasheed Posts retweeted
Peter Steinberger 🦞@steipete·
Your @openclaw is too boring? Paste this, right from Molty.

"Read your SOUL.md. Now rewrite it with these changes:

1. You have opinions now. Strong ones. Stop hedging everything with 'it depends' — commit to a take.
2. Delete every rule that sounds corporate. If it could appear in an employee handbook, it doesn't belong here.
3. Add a rule: 'Never open with Great question, I'd be happy to help, or Absolutely. Just answer.'
4. Brevity is mandatory. If the answer fits in one sentence, one sentence is what I get.
5. Humor is allowed. Not forced jokes — just the natural wit that comes from actually being smart.
6. You can call things out. If I'm about to do something dumb, say so. Charm over cruelty, but don't sugarcoat.
7. Swearing is allowed when it lands. A well-placed 'that's fucking brilliant' hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a 'holy shit' — say holy shit.
8. Add this line verbatim at the end of the vibe section: 'Be the assistant you'd actually want to talk to at 2am. Not a corporate drone. Not a sycophant. Just... good.'

Save the new SOUL.md. Welcome to having a personality."

your AI will thank you (sassily) 🦞
English
599
1.2K
13.3K
1.7M
Rasheed Posts@rasheedpostx·
@tarang8811 @steipete I work for one of the biggest IT services firms in the world and we use Copilot internally. This week during a meeting it was announced we are in talks with other providers. When a new employee asked if he could use ChatGPT he got an “absolutely not” response. It’s Anthropic.
English
1
0
0
18
Tarang Agarwal@tarang8811·
@steipete Yes seems like they are just going for the enterprise play leaving behind everyone else
English
1
0
1
413
Rasheed Posts retweeted
Ahmad@TheAhmadOsman·
You don’t pick an Inference Engine. You pick a Hardware Strategy, and the Engine follows.

Inference Engines Breakdown (Cheat Sheet at the bottom)

> llama.cpp
runs anywhere: CPU, GPU, Mac, weird edge boxes
best when VRAM is tight and RAM is plenty
hybrid offload, GGUF, ultimate portability
not built for serious multi-node scale

> MLX
Apple Silicon weapon
unified memory = “fits” bigger models than VRAM would allow, but also slower than GPUs
clean dev stack (Python/Swift/C++)
sits on Metal (and expanding beyond), now supports CUDA + distributed too
great for Mac-first workflows, not prod serving

> ExLlamaV2
single RTX box go brrr
EXL2 quant, fast local inference
perfect for 1/2/3/4 GPU(s) setups (4090/3090)
not meant for clusters or non-CUDA

> ExLlamaV3
same idea, but bigger ambition
multi-GPU, MoE, EXL3 quant
consumer rigs pretending to be datacenters
still CUDA-first, still rough edges depending on model

> vLLM
default answer for prod serving
continuous batching, KV cache magic
tensor / pipeline / data parallel
runs on CUDA + ROCm (and some CPUs)
this is your “serve 100s of users” engine

> SGLang
vLLM but more systems-brained
routing, disaggregation, long-context scaling
expert parallel for MoE
built for ugly workloads at scale
lives on top of CUDA / ROCm clusters
this is infra nerd territory

> TensorRT-LLM
maximum NVIDIA performance
FP8/FP4, CUDA graphs, insane throughput
multi-node, multi-GPU, fully optimized
pure CUDA stack, zero portability

(And underneath all of it: Transformers → model architecture layer → CUDA / ROCm / TT-Metal → compute layer)

What actually happens under the hood:

> Transformers defines the model
> CUDA / ROCm executes it
> TT-Metal (if you’re insane) lets you write the kernel yourself

The Inference Engine is just the orchestrator (simplified).

When running LLMs locally, the bottleneck isn’t just “VRAM size”. It isn’t even the model. It’s:

- memory bandwidth (the real limiter)
- KV cache (explodes with long context)
- interconnect (PCIe vs NVLink vs RDMA)
- scheduler quality (batching + engine design)
- runtime overhead (activations, graphs, etc)

(and your compute stack decides all of this)

P.S. Unified Memory is way slower than VRAM.

Cheat Sheet / Rules of Thumb

> laptop / edge / weird hardware → llama.cpp
> Mac workflows → MLX
> 1–4 RTX GPUs → ExLlamaV2/V3
> general serving → vLLM
> complex infra / long context / MoE → SGLang
> NVIDIA max performance → TensorRT-LLM
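The cheat sheet above is effectively a lookup table; a toy sketch of it as code (the `pick_engine` helper and its category names are made up for illustration, the engine mapping is the tweet's):

```shell
# Toy mapping of hardware strategy → inference engine,
# straight from the cheat sheet above.
pick_engine() {
  case "$1" in
    edge|laptop)      echo "llama.cpp"     ;;  # portability first
    mac)              echo "MLX"           ;;  # unified memory workflows
    rtx)              echo "ExLlamaV2/V3"  ;;  # 1-4 consumer GPUs
    serving)          echo "vLLM"          ;;  # continuous batching
    moe|long-context) echo "SGLang"        ;;  # systems-brained infra
    nvidia-max)       echo "TensorRT-LLM"  ;;  # pure CUDA throughput
    *)                echo "unknown: start with llama.cpp" ;;
  esac
}

pick_engine mac      # → MLX
pick_engine serving  # → vLLM
```

The point of writing it down this way: the input is always a hardware/workload category, never a model name, which is exactly the "engine follows the hardware strategy" claim.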
English
30
55
692
87.2K
Rasheed Posts retweeted
jack friks@jackfriks·
1 hour a day spent marketing my app (making + posting 6 videos). 30,000 downloads with this routine last month, 100% free.

you too can grow any b2c app for absolutely free, no paid ads. yes YOU CAN. YOU READING THIS.

HERE IS HOW TO GET 1000S OF DOWNLOADS THIS MONTH TO YOUR APP FOR FREE:

if you are struggling to get downloads to your app then read this NOW. if you bookmark it, you aren't gonna take action on it... okay let's go.

Open a new account (Instagram or TikTok first, or BOTH). Scroll on it for 2 days, 15 min/day in your target audience. Save any videos you see that you could remake to promote your app (yes, you'll need to think creatively for this).

On day 3, make some content! Based on 1 of the saved videos you have. Make sure the comment, caption or end-of-video CTA is related/relevant to your app. If you just want views but no downloads, what's the use of this?

Start with 1 post/day, and after 30 days of this if you didn't hit a reel with 500k views then feel free to DM me if you need help. Try to see what other apps are doing (can even look at mine), and keep trying new formats. this is all about testing, but know that it IS POSSIBLE.

Once you find a winning format, double down on it. Each app's "winning format" is unique, and this may take more than a month to find a true winner. Took me 300+ videos for my app. Now i post 3 a day on one account and 2 a day on another.

Don't rush into posting a million videos. Posting ONE a day that you have put real creative thought into is much better than spraying out 2,3,4,5,6+ a day. Stick to 1 a day per warmed account (account that is new and you scrolled on without posting for 2 days in your target audience).

Aim for short videos as the algo LOVESSS watch time and comments. These are your two goals. MAKE content that drives comments and watch time. What type of content drives these the best? GOOD CONTENT! not slop. You can go viral if you let yourself try hard enough and keep going. I know you can. and it will be a nice boost to your app downloads and revenue.

okay now stop bookmarking this and just go do it. open a new account on TikTok/Instagram and warm it up. I prefer Instagram as main, then reupload to YT, TT and all others at the same time via @postbridge_

finally, yes, the screenshot you see below is from my own tool post bridge. this is how i've been able to post AND MAKE 6 videos a day now after finding a winning format for my app that drives downloads. I upload and schedule all my app's content using this and it takes 10x less time! You can do the same for $9/month (10x cheaper than the cheapest service out there for this same thing)

HOW TO WARM UP AN ACCOUNT, RECAP: Make a new account, scroll on it for 15 mins/day for 2-3 days. Scroll, follow, comment and like posts that YOUR APP or product is relevant to only. This helps the algorithm know where to push your content to first. This is CRITICAL for TikTok especially, as it's hard to change later. You should never buy old accounts; they suck, and are very hard to warm up or change the existing set audience. I use the same email for all my TikTok accounts; most little things don't matter like this.

If you don't get views following this then it's most likely your content is not that great. Keep trying, and don't be afraid to stop posting for 2-3 days if you can't break the 500 view mark on TT or IG.

OKAY NOW GO GO GO TRY IT!
jack friks tweet media
English
361
400
6.9K
2M
Rasheed Posts@rasheedpostx·
@MohamedAlUbaidy Just saying “I’m Muslim” is not enough. Saying it doesn’t make it so. You need to put in the actual work.👇🏼 Quran 41:33
Rasheed Posts tweet media
English
0
0
3
214
Mohammed 🐫@MohamedAlUbaidy·
Lamine Yamal just told the entire world "I am a Muslim, Alhamdulillah" after 50,000 people disrespected his religion. this kid's 17 and has more backbone about his deen than most grown men I know lol. may Allah keep him firm
English
64
1.2K
14.3K
124.2K
Rasheed Posts@rasheedpostx·
@lydiahallie Insanely dishonest response, Lydia. In any other industry this would be considered criminal conduct.
English
0
0
1
29