Saumitra Srivastav

147 posts

@_saumitra_

Bangalore, India · Joined February 2012

149 Following · 108 Followers

Wrote a tutorial on how to create a lakehouse-based AI evaluation platform using an open-source stack.
Blog: saumitra.me/2026/2026-03-0…
Code: github.com/saumitras/ai-e…
We will see how to solve typical scale problems like:
1. Fragmented tooling: each team builds its own eval tooling, schemas, and scoring logic.
2. No shared standard: model, prompt, retriever, and dataset versions are tracked inconsistently, making cross-team governance and knowledge sharing hard.
3. Weak lineage: teams can see a score change but cannot reliably answer what exact configuration caused it.
4. Poor observability: traces and metrics are often separated from run metadata, which slows root-cause analysis.
5. Replay gaps: failures found in production cannot be deterministically reproduced for safe comparisons.
6. Throughput limits: simple eval pipelines cannot keep up with enterprise-scale experiment volume.
7. BI disconnect: analytics teams cannot easily query cross-app eval data through a single pane.
8. Failure patterns stay hidden: teams see individual failed cases, but without clustering they miss recurring failure modes and cannot prioritize fixes effectively.
Technologies: AWS #S3, @ApacheIceberg, @apachepolaris, @ApacheAirflow, @deepeval, @raydistributed, @apachekafka, @ApacheSpark, @PostgreSQL, @trinodb, @apachesuperset, Google Agent Development Kit, @OpenAI, #llm, #mcp

Saumitra Srivastav@_saumitra_·9 Mar

@gunnarmorling Tried it. After 3-4 months of handling prod edge cases, eventually ended up having something similar to schema registry, so dropped it in the next release and switched back to Confluent's 🙂 Not worth pursuing IMO

Gunnar Morling 🌍@gunnarmorling·9 Mar

Question for the Kafka community: has anyone ever explored a (de-)serializer which would keep JSON/Avro schemas within a Kafka topic? I.e. in this model, there'd be no registry whatsoever, provided all the schemas could be kept in memory for efficient access. Worth exploring?

Saumitra Srivastav@_saumitra_·3 Mar

@manjunath_t_m @rockthejvm I use Scala as my primary language wherever it is available, but with Java adopting ideas from Scala and weak ecosystem support for Scala 3, I am concerned about Scala's future. Especially in the ML ecosystem, it's now much easier to use Python or Java because of library support.

Manjunath T M@manjunath_t_m·3 Mar

@rockthejvm Scala is capable of more than just writing Spark jobs. I would like to see Scala used more for building rock-solid products.

Rock the JVM@rockthejvm·2 Mar

Just finished a live 2-day #Scala training session with Microsoft (!). Apparently they needed it for Spark. They had no idea what Scala is capable of. Left the training blown away.

Saumitra Srivastav@_saumitra_·8 Jan

@moxie Any non-blockchain engineer willing to get into web3, should first start with learning about currently available L1 & L2 chains, and not just focus on bitcoin and eth. Even @opensea is taking steps in right direction by adding @0xPolygon to reduce or get rid of gas fees... 3/n

Saumitra Srivastav@_saumitra_·8 Jan

@moxie Anyone new coming into the space sees hyped NFTs/metaverse projects in their current form as web3, but they are not. a16z backed opensea, when using eth-1.0, is a misleading example to explain to someone what web3 is/will be... 2/n

Moxie Marlinspike@moxie·8 Jan

Wrote some notes summarizing my first impressions of web3: moxie.org/2022/01/07/web…

Saumitra Srivastav@_saumitra_·11 Nov

@tlberglund this is brilliant @tlberglund. I would pay to watch a series "The legend of Bare Metalsson" where he leads a cult of Metalsson(s) in a (losing??) battle against the evil cloud. Or perhaps an origin story🦹‍♂️ More Bare Metalsson, please!😂

Tim Berglund@tlberglund·11 Nov

You know, there are good reasons to run Kafka on-prem. And so many bad ones. Don't be like this guy. youtube.com/watch?v=AXxr0p…

Saumitra Srivastav@_saumitra_·5 Feb

Join us on 22nd-Feb in #bangalore at @near for @apachekafka #meetup with talks from @confluentinc, @near, @goibibo, @gojektech about data and #streamprocessing platforms and an introduction to @ksqlDB meetup.com/Bangalore-Apac…

Saumitra Srivastav retweeted

Apache Flink@ApacheFlink·15 Jan

New on the Flink blog: Using a Case Study of a #FraudDetection System to apply powerful Flink patterns for building streaming applications by @alex_fedulov! Find out more: flink.apache.org/news/2020/01/1… #streamprocessing #frauddetection

Saumitra Srivastav@_saumitra_·24 Dec

@gwenshap I try to maintain a knowledge base of "what brought my X service down". I have a few items for Kafka too🙂 It would be great to have a centralized wiki of "what brought my Kafka cluster down" as a troubleshooting guide for beginners. Is there one already where I can add mine too? 2/2

Saumitra Srivastav@_saumitra_·24 Dec

@gwenshap That's because things that are obvious to experts, because they understand underlying architecture, code and config, are not known to beginners and hence they don't mind messing with those. It's not that hard to severely degrade the performance of a Kafka cluster🙂 1/2

Gwen (Chen) Shapira@gwenshap·24 Dec

Still amazed at how it sometimes takes us weeks to try and reproduce specific workloads that bring Kafka to its knees, while our least knowledgeable customers do this effortlessly.

Saumitra Srivastav@_saumitra_·19 Dec

@apachekafka @WalmartTech We meet up every other month, so if you are interested in giving a talk about @apachekafka or its ecosystem project at the next event of the group, please fill out the following form forms.gle/KmtmZZYn1TVJMN… and we will get back to you.

Saumitra Srivastav@_saumitra_·19 Dec

Looking forward to the @apachekafka and stream processing meetup at @WalmartTech on 22nd-Dec by Bangalore Apache Kafka group. Join the group if you are interested in #kafka and #streamprocessing. meetup.com/Bangalore-Apac…

Saumitra Srivastav retweeted

Timo Walther@twalthr·12 Nov

My India journey was awesome. So much culture, food, and fascinating people. Here are my slides from the #Bengalore Meetup about @ApacheFlink: slideshare.net/TimoWalther/in… A recording of the talk is available here: youtube.com/watch?v=Ych5bb… I promise to bring more stickers next time!

Saumitra Srivastav@_saumitra_·2 Nov

@twalthr @ApacheFlink @qubole @Razorpay @gojektech @glassbeam Zoom stream is live now at fox.zoom.us/j/993707920

Timo Walther@twalthr·2 Nov

Looking forward to introducing @ApacheFlink to developers in #Bangalore today, Nov 2 at 12:15 pm, alongside speakers from @qubole, @Razorpay, @gojektech & @glassbeam! The #meetup is fully packed but will be streamed via Zoom, don't miss out! meetup.com/Bangalore-Apac… #streamprocessing

Saumitra Srivastav retweeted

Apache Flink@ApacheFlink·2 Oct

In #Bangalore and interested in learning about Apache Flink and how it works with #ApacheKafka? Join the meetup on Nov 2 hosted by @hotstartweets! Talks by @VervericaData, @qubole, @Razorpay, @gojektech and @glassbeam. meetup.com/Bangalore-Apac… #streamprocessing #Meetup

Saumitra Srivastav@_saumitra_·12 Sep

Congratulations @nehanarkhede to you and the entire @confluentinc team! It's inspirational, and I feel proud to see a fellow Indian at the forefront of a company and technology that will undoubtedly power the whole world in the coming years. 🇮🇳🚀🎉

Neha Narkhede@nehanarkhede

It's @confluentinc's 5th birthday and I got a chance to inaugurate our first office in my home country 🇮🇳 with our amazing team in Bangalore. This goes pretty high up in the list of highlights on this immigrant founder journey 🌟 Happy 5th birthday, Confluent! 🧡💙

Saumitra Srivastav@_saumitra_·12 Sep

Congratulations @shalinmangar! Looking forward to trying it out.

Shalin Mangar@shalinmangar

I’m very excited and proud to present the work my team has been doing the past year. Come hear all about it today at 10:45am @Activate_Conf —Lucidworks Managed Search: A Multi-tenant Apache Solr Service on Public Cloud activate-conf.com/agenda/session… #Activate19

Saumitra Srivastav@_saumitra_·9 Sep

@kellabyte @rustlang has all the ingredients and will undoubtedly gain mass popularity in coming years

Kelly Sommers@kellabyte·9 Sep

Since it takes around 10 years for a programming language or database to gain mass popularity I wonder which ones today are going to boom tomorrow. What are your guesses?

Saumitra Srivastav@_saumitra_·8 Sep

@nehanarkhede @apachekafka @hotstartweets event is live now. join remotely through zoom fox.zoom.us/j/578251564

Saumitra Srivastav retweeted

Neha Narkhede@nehanarkhede·8 Sep

Really looking forward to my first @apachekafka meetup keynote in India. I will be speaking at one of the largest Kafka meetup groups in the world today at 2pm. Thanks for hosting us @hotstartweets and hope to meet the amazing tech community in Bangalore! meetup.com/Bangalore-Apac…
