
Puneet Patwari
@system_monarch
Principal @Atlassian | Helping engineers reach Staff/Principal | 1:1 Mentorship & Mock Interviews | 90+ System design fundamentals - https://t.co/Ots2nRhO5f

Incredible video by a randomly sacked Atlassian engineer telling all about the entire company. Love this genre, like a LinkedIn green banner with zero fcks given.

I'm a Principal with 12 years of experience. If I were coaching you to crack system design rounds for Sr-to-Staff+ AI/ML roles at companies like Meta, Google, Salesforce, Amazon, etc., I would 100% ask you to work on these fundamentals before we even started talking about interviews. Because AI system design is still system design. The only difference is that your bottlenecks are no longer just databases, caches, and queues. They are tokens, context windows, retrieval quality, inference cost, hallucinations, model latency, evals, and user trust. Here are the fundamentals I would start with:

➤ LLM Basics
↬ Tokens ↬ Context Window ↬ Prompt Design ↬ System Prompts ↬ Temperature ↬ Top-p Sampling ↬ Structured Outputs ↬ JSON Mode ↬ Function Calling ↬ Tool Calling ↬ Agents ↬ Memory ↬ Guardrails ↬ Hallucinations ↬ Model Latency ↬ Model Routing ↬ Small vs Large Models ↬ Fine-tuning vs Prompting ↬ Open-source vs Closed Models

➤ RAG & Retrieval
↬ Embeddings ↬ Vector Search ↬ Vector Databases ↬ Chunking ↬ Chunk Overlap ↬ Metadata Filtering ↬ Hybrid Search ↬ Keyword Search ↬ Semantic Search ↬ Reranking ↬ Retrieval Recall ↬ Retrieval Precision ↬ Query Rewriting ↬ Document Freshness ↬ Permission-aware Retrieval ↬ Citation Grounding ↬ Evidence Selection ↬ Context Packing ↬ Missing Information Detection

➤ AI System Architecture
↬ API Gateway ↬ Request Routing ↬ Model Gateway ↬ Prompt Service ↬ Inference Service ↬ Retrieval Service ↬ Ranking Service ↬ Feature Store ↬ Offline Pipelines ↬ Online Serving ↬ Async Processing ↬ Queueing ↬ Streaming Responses ↬ Rate Limiting ↬ Fan-out/Fan-in ↬ Batch Inference ↬ Real-time Inference ↬ Human-in-the-loop Systems ↬ Fallback Workflows

➤ Cost & Performance
↬ Token Budgeting ↬ Prompt Compression ↬ Prompt Caching ↬ Semantic Caching ↬ Response Caching ↬ Batch Requests ↬ Model Quantization ↬ Distillation ↬ Latency Budgets ↬ Cold Starts ↬ GPU Utilization ↬ Throughput ↬ Cost per Query ↬ Cost per User ↬ Model Selection ↬ Inference Scaling ↬ Backpressure ↬ Load Shedding

➤ Evaluation & Quality
↬ Offline Evals ↬ Online Evals ↬ Golden Dataset ↬ Human Review ↬ LLM-as-Judge ↬ A/B Testing ↬ Regression Testing ↬ Answer Relevance ↬ Factual Accuracy ↬ Faithfulness ↬ Groundedness ↬ Toxicity Checks ↬ Safety Checks ↬ Drift Detection ↬ Feedback Loops ↬ Confidence Scoring ↬ Escalation Criteria ↬ Quality Monitoring

➤ Reliability & Security
↬ Timeouts ↬ Retries ↬ Circuit Breakers ↬ Failover ↬ Model Fallbacks ↬ Graceful Degradation ↬ Observability ↬ Tracing ↬ Prompt Logs ↬ Token Metrics ↬ Error Budgets ↬ PII Redaction ↬ Data Privacy ↬ Access Control ↬ Prompt Injection ↬ Jailbreak Defense ↬ Audit Logs ↬ Compliance
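To make a couple of those bullets concrete, Context Packing under a Token Budget fits in a few lines. This is only a toy sketch under stated assumptions: chunks arrive already ranked by the retriever (input order = rank), and `count_tokens` is a hypothetical word-count heuristic standing in for the model's actual tokenizer.

```python
def count_tokens(text: str) -> int:
    # Hypothetical heuristic: ~4 tokens per 3 whitespace-delimited words.
    # A real system would call the serving model's tokenizer instead.
    return max(1, round(len(text.split()) * 4 / 3))


def pack_context(chunks: list[str], budget: int) -> list[str]:
    """Greedily keep the highest-ranked chunks that fit the token budget.

    Chunks that would overflow the budget are skipped, so a smaller
    lower-ranked chunk can still be packed after a large one is dropped.
    """
    packed: list[str] = []
    used = 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            continue  # over budget: drop this chunk, try the next
        packed.append(chunk)
        used += cost
    return packed
```

In an interview, the interesting follow-ups are exactly the other bullets: what to do when nothing fits (Prompt Compression), and how to detect that the dropped chunk held the answer (Missing Information Detection).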

As a Principal Backend Engineer with over 12 years of experience, I can tell you quite certainly: if you're still getting rejections in system design interviews despite good efforts, your fundamentals are likely not strong enough. Dedicate 2-3 months to mastering these design fundamentals, then practice designing a few systems (and do plenty of mock interviews).

Scaling & Architecture
↬ CDN ↬ Caching ↬ Sharding ↬ Queueing ↬ Replication ↬ Partitioning ↬ API Gateway ↬ Rate Limiting ↬ CAP Theorem ↬ Microservices ↬ Load Balancing ↬ Fault Tolerance ↬ Database Scaling ↬ Service Discovery ↬ Consistency Models ↬ Eventual Consistency ↬ Distributed Transactions ↬ Monolith vs Microservices ↬ Leader Election

Databases & Storage
↬ Leader-Follower Replication ↬ WAL (Write Ahead Log) ↬ Asynchronous Processing ↬ Transaction Isolation ↬ Read/Write Patterns ↬ Consistent Hashing ↬ Redis/Memcached ↬ Backup & Restore ↬ Hot/Cold Storage ↬ Data Partitioning ↬ Object Storage ↬ SQL vs NoSQL ↬ Data Retention ↬ Data Modeling ↬ OLAP vs OLTP ↬ ACID & BASE ↬ Bloom Filters ↬ File Systems ↬ S3 Basics ↬ B+ Trees ↬ Indexing

Communication & APIs
↬ JWT ↬ CORS ↬ OAuth ↬ Throttling ↬ Serialization ↬ API Security ↬ Long Polling ↬ WebSockets ↬ API Gateway ↬ Idempotency ↬ Service Mesh ↬ Retry Patterns ↬ REST vs gRPC ↬ API Versioning ↬ Circuit Breaker ↬ API Rate Limits ↬ Fan-out/Fan-in ↬ Protocol Buffers ↬ Message Queues ↬ Dead Letter Queue

Reliability & Observability
↬ Metrics ↬ Alerting ↬ Failover ↬ Logging ↬ Rollbacks ↬ Monitoring ↬ Heartbeats ↬ Retry Logic ↬ Autoscaling ↬ SLO/SLI/SLA ↬ Load Testing ↬ Error Budgets ↬ Health Checks ↬ Circuit Breaker ↬ Incident Response ↬ Chaos Engineering ↬ Distributed Tracing ↬ Canary Deployments ↬ Graceful Degradation ↬ Blue-Green Deployment
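As one example of the depth interviewers probe on these bullets, Consistent Hashing is small enough to sketch. A minimal ring with virtual nodes, assuming MD5 as the hash function and an arbitrary `vnodes=100`; production systems tune both and add replication.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes.

    Removing a server remaps only the keys that landed on its ring
    segments, instead of rehashing every key as `hash(key) % n` would.
    """

    def __init__(self, servers: list[str], vnodes: int = 100):
        self._ring: list[tuple[int, str]] = []
        for server in servers:
            for i in range(vnodes):
                # Each server owns `vnodes` points on the ring, which
                # smooths the key distribution across servers.
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key: str) -> str:
        # A key belongs to the first virtual node at or after its hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect(self._ring, (self._hash(key), "")) % len(self._ring)
        return self._ring[idx][1]
```

The property worth stating out loud in the interview: if server "c" is removed, every key that previously mapped to "a" or "b" still maps to the same server, because their successor virtual nodes are untouched.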

Starting June 15, paid Claude plans can claim a dedicated monthly credit for programmatic usage. The credit covers usage of:
- Claude Agent SDK
- claude -p
- Claude Code GitHub Actions
- Third-party apps built on the Agent SDK