Badrinath
20 posts

@Badrinath248
Software Engineer | AI • OpenSearch • Kafka • AWS • Go • React | Building reliable systems that scale.
India · Joined August 2020
155 Following · 46 Followers

@0xlelouch_ Spent hours troubleshooting a ghost issue caused entirely by caching.

@0xlelouch_ Repetitive tasks such as security fixes, test cases, and documentation can be handled with agent rules during development.

@system_monarch Great share! A very good example for understanding connection and thread pools.

Think of your connection pool as a hotel with 20 rooms.
Each guest (query) checks in, stays for 5ms, checks out. 20 rooms is plenty. Hundreds of guests flow through every second.
Now one guest (a bad query) checks in and stays for 30 seconds. One room gone. Not a big deal, 19 left.
But if 10 of these long-staying guests show up at once, that's 10 rooms occupied for 30 seconds each.
Now only 10 rooms are left for everyone else. A line forms in the lobby. Guests are waiting 5, 8, 12 seconds just to get a room. Your API starts timing out.
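
The hotel math can be sketched in a few lines. This is a rough back-of-the-envelope model, not a real pool implementation: it assumes every fast query holds a connection for exactly 5 ms, and the arrival rate of 3000 queries per second is a made-up number chosen only for illustration.

```python
def capacity_per_second(rooms: int, stay_ms: int) -> float:
    """Upper bound on guests (queries) served per second: rooms / stay time."""
    return rooms * 1000 / stay_ms


def wait_seconds(arrival_rate: float, capacity: float, elapsed_s: float) -> float:
    """Approximate wait for a guest arriving after `elapsed_s` seconds of overload.

    The backlog grows by (arrival_rate - capacity) per second, and draining
    it takes backlog / capacity seconds.
    """
    backlog = max(arrival_rate - capacity, 0) * elapsed_s
    return backlog / capacity


TOTAL_ROOMS = 20
FAST_STAY_MS = 5
ARRIVALS_PER_SECOND = 3000  # hypothetical traffic level, not from the post

healthy = capacity_per_second(TOTAL_ROOMS, FAST_STAY_MS)        # all 20 rooms free
degraded = capacity_per_second(TOTAL_ROOMS - 10, FAST_STAY_MS)  # 10 rooms held by slow queries

print(f"healthy capacity:  {healthy:.0f} queries/s")
print(f"degraded capacity: {degraded:.0f} queries/s")
print(f"wait after 10 s of overload: {wait_seconds(ARRIVALS_PER_SECOND, degraded, 10):.1f} s")
```

Under these assumptions the pool absorbs the traffic comfortably with 20 rooms, but with only 10 rooms the same traffic exceeds capacity and the queue keeps growing until the slow queries finish.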

💡 Bulk API is only half the story.
Indexing performance depends on how you use it.
Key factors I look at:
📦 Bulk size — too small wastes network calls, too large causes memory pressure.
⚡ Parallelism — more workers help until the cluster becomes the bottleneck.
🔄 Refresh interval — frequent refreshes can significantly reduce throughput.
💾 Replicas — great for availability, expensive for indexing.
🖥️ Node resources — CPU, JVM heap, disk IOPS, and network bandwidth often determine your ceiling.
📊 Metrics — monitor CPU, JVM pressure, thread pool rejections, indexing latency, and merge activity.
🚨 Don't stop at HTTP 200.
Always inspect the Bulk response: individual documents can fail even when the request itself returns 200.
The goal isn't the biggest Bulk request.
It's achieving the highest stable throughput without overwhelming the cluster.
#OpenSearch #BackendEngineering #Performance #DistributedSystems
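
A minimal sketch of the "don't stop at HTTP 200" check. It works on an already-parsed Bulk response (a dict with a top-level `errors` flag and an `items` list, one entry per action). The helper name `failed_items`, the choice of which statuses count as retryable, and the sample response are all assumptions made for illustration.

```python
from typing import Any

# Assumption: treat "too many requests" and "service unavailable" as retryable.
RETRYABLE_STATUSES = {429, 503}


def failed_items(bulk_response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one summary dict per failed action in a parsed Bulk response."""
    failures: list[dict[str, Any]] = []
    if not bulk_response.get("errors"):
        return failures  # every item succeeded
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item has a single key naming the action (index, create, update, delete).
        action, result = next(iter(item.items()))
        status = result.get("status", 0)
        if status >= 300:
            failures.append({
                "position": position,  # same order as the actions in the request
                "action": action,
                "id": result.get("_id"),
                "status": status,
                "reason": result.get("error", {}).get("reason"),
                "retryable": status in RETRYABLE_STATUSES,
            })
    return failures


# Made-up sample response: HTTP 200 overall, but two of three documents failed.
sample_response = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "logs-demo", "_id": "1", "status": 201}},
        {"index": {"_index": "logs-demo", "_id": "2", "status": 429,
                   "error": {"reason": "sample: request rejected under load"}}},
        {"index": {"_index": "logs-demo", "_id": "3", "status": 400,
                   "error": {"reason": "sample: field could not be parsed"}}},
    ],
}

failures = failed_items(sample_response)
for failure in failures:
    print(failure)
```

Separating retryable failures (back off and resend) from permanent ones (log or dead-letter) is one reasonable design; the right split depends on the pipeline.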

🚀 One of the simplest ways to improve OpenSearch indexing throughput is to use the Bulk API.
Instead of sending:
📄 1 document → 1 request
Send:
📦 1000+ documents → 1 request
Why it matters:
✅ Fewer network round trips
✅ Lower request processing overhead
✅ Better shard-level batching
✅ Higher indexing throughput
In high-volume ingestion pipelines, Bulk API isn't an optimization—it's often the default choice.
The biggest performance gains often come from reducing overhead rather than adding more infrastructure.
👀 In my next post, I'll share the key factors I consider when tuning Bulk API workloads, including batch size, parallelism, refresh intervals, and common bottlenecks.
#OpenSearch #BackendEngineering #DistributedSystems #Performance
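
To make "1000+ documents → 1 request" concrete, here is a small sketch that groups documents into batches and builds the newline-delimited body the Bulk API expects (an action line followed by the document source, with a trailing newline). The helper names, the index name `logs-demo`, and the batch size of 1000 are illustrative assumptions; sending the body over HTTP is left out.

```python
import json
from typing import Any, Iterable, Iterator


def chunked(docs: Iterable[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield lists of at most `size` documents."""
    batch: list[dict[str, Any]] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def build_bulk_body(index_name: str, docs: list[dict[str, Any]]) -> str:
    """Build an NDJSON Bulk body: one action line plus one source line per document."""
    lines: list[str] = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"  # the body must end with a newline


docs = [{"id": i, "message": f"event {i}"} for i in range(2500)]
batches = list(chunked(docs, 1000))
bodies = [build_bulk_body("logs-demo", batch) for batch in batches]

print(f"{len(docs)} documents -> {len(bodies)} bulk requests")
```

Each body would be sent as one `POST /_bulk` request with an NDJSON content type, so 2500 documents cost three round trips in this sketch instead of 2500.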

💡 One OpenSearch issue reinforced the importance of platform governance.
An index was automatically created with dynamic mappings.
Everything looked fine at first...
Until the mappings didn't match application expectations.
The result:
❌ Inconsistent schema
❌ Downstream failures
❌ Operational overhead
The solution:
🚫 Disable auto index creation
📋 Define mappings through index templates
🚀 Create bootstrap indices upfront
🔗 Use aliases instead of direct index references
📦 Configure rollup jobs and lifecycle policies
The lesson:
Reliability isn't just about search performance.
It's about controlling how data enters and evolves within the platform.
#OpenSearch #SoftwareEngineering #SearchEngineering
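
A rough sketch of the first four steps, expressed as the request bodies one might send. The index pattern, alias, and field names are hypothetical, and the strict `dynamic` mapping is one possible way to reject unexpected fields, not necessarily what was used in the incident above. Lifecycle policies are left out.

```python
from typing import Any


def disable_auto_create_body() -> dict[str, Any]:
    """Body for PUT _cluster/settings: stop indices from being created implicitly."""
    return {"persistent": {"action.auto_create_index": "false"}}


def index_template_body(pattern: str) -> dict[str, Any]:
    """Body for PUT _index_template/<name>: explicit mappings for matching indices."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {"number_of_shards": 1, "number_of_replicas": 1},
            "mappings": {
                "dynamic": "strict",  # reject documents with unmapped fields
                "properties": {       # hypothetical fields
                    "timestamp": {"type": "date"},
                    "message": {"type": "text"},
                },
            },
        },
    }


def bootstrap_index_body(alias: str) -> dict[str, Any]:
    """Body for PUT <first-index-name>: create the first index behind a write alias."""
    return {"aliases": {alias: {"is_write_index": True}}}


# Hypothetical names used only for this example.
template = index_template_body("app-logs-*")
bootstrap = bootstrap_index_body("app-logs-write")

print(disable_auto_create_body())
print(template)
print(bootstrap)
```

With this setup, applications write through the `app-logs-write` alias rather than a concrete index name, so the backing index can change without touching application code.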

Hi, I'm Badrinath 👋
I'm a Software Engineer working on search and data platforms.
Recently, I've been spending time on:
• OpenSearch • Kafka • Distributed Systems • Go • React
I've learned that building software is rarely about writing code alone.
Understanding scale, performance, and trade-offs is where things get interesting.
I'll be sharing what I'm building, exploring, and figuring out along the way.