Sunny Bains @TiDB

4.5K posts


@sunbains

swe@PingCAP - The company behind TiDB. Oracle/MySQL/InnoDB team lead in a past life

California, USA · Joined April 2012
263 Following · 5.2K Followers
Sunny Bains @TiDB (@sunbains)
Agentic Scale for databases is for real. I think the architecture where you spin up separate instances because a single instance can’t handle millions of tables has a short shelf life.
English
0
0
4
400
Sunny Bains @TiDB (@sunbains)
@redixhumayun Yes, I'm not sure every WAL write in InnoDB has to be synced. It’s very different from the Postgres implementation.
English
1
1
2
118
Zaid Humayun (@redixhumayun)
I'm surprised to learn that both Postgres and InnoDB use buffered I/O for WAL paths. Is this a common design decision?

Postgres
This isn't all that surprising after I learnt that Postgres still does buffered I/O for all paths. Direct I/O looks like it's still behind a debug flag.

InnoDB
This was the big surprise because InnoDB enables direct I/O. I found an old blog post from Mark Callaghan talking about the redo log using buffered I/O and the reason for that being allowing the OS to coalesce multiple small writes. Confusingly, InnoDB docs talk about innodb_flush_method which sets the value of whether to use buffered or direct IO for data and log files. But, digging through the code, I can see that this isn't really enabled for OS log files which is odd.
English
4
2
66
3.1K
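A minimal sketch of the buffered-I/O log pattern discussed in the thread above, assuming a plain append-only file: small records go through the OS page cache, which is free to coalesce them, and a single fsync makes the whole batch durable. The file name, the record format, and the helpers `append_records` and `demo` are hypothetical; this is not the actual InnoDB or Postgres code path.

```python
import os
import tempfile


def append_records(fd, records):
    """Append records with buffered writes, then sync once for the batch."""
    total = 0
    for rec in records:
        view = memoryview(rec)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
            total += written
    # One fsync covers every record written above (a group-commit style sync).
    os.fsync(fd)
    return total


def demo():
    """Write three small records to a temporary log file and report sizes."""
    path = os.path.join(tempfile.mkdtemp(), "wal.log")  # hypothetical file name
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        total = append_records(fd, [b"rec-1\n", b"rec-2\n", b"rec-3\n"])
    finally:
        os.close(fd)
    return total, os.path.getsize(path)
```

Syncing after every record instead would trade throughput for a smaller window of unsynced data, which is the design question the thread is circling.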
Sunny Bains @TiDB (@sunbains)
Don’t fetishize 100% correctness. Especially if you are just getting started.
English
0
1
17
718
Sunny Bains @TiDB (@sunbains)
Both. I get the model to write most tests. My general mode is: whether the review is automatic or manual, I first get the LLM to reproduce the problem. Only after it is reproduced, which sometimes requires manual tweaking, do I guide the LLM to write the fix. It’s an iterative process, just faster. The LLM has a tendency to solve issues using shortcuts. I monitor that part very carefully and guide it to make architectural changes that are generic and extensible instead of ad hoc changes to make things work.
English
0
0
1
35
Zaid Humayun (@redixhumayun)
@sunbains Do you ever let the model write its own tests? Or do you do that yourself?
English
1
0
0
28
Sunny Bains @TiDB (@sunbains)
@redixhumayun The waterfall model never works. Make it work, figure out what’s wrong and what’s working, write the tests and then improve the code, design and architecture, like a feedback loop.
English
1
0
1
22
Zaid Humayun (@redixhumayun)
@sunbains Yeah, it seems like I'm ironically not holding AI's hand enough at this point. I figured that more design work up front means I can let AI spin for more hours but perhaps we're not quite there yet.
English
1
0
0
44
ahmetb (@ahmetb)
kinda crazy how VictoriaMetrics/VictoriaLogs came out of nowhere and got popular. I have no idea where they rank as an APM. what are primary reasons people are choosing this solution? is there a better scalability story?
English
12
4
77
21.8K
Sunny Bains @TiDB (@sunbains)
This is getting better by the day 🙂. Made the insert … select parallel. The server now runs the generated query graph on a purpose built VM. This improved the ops/s significantly 🙂 I was using Tokio initially for scheduling but its context switching overhead stymied the performance. Ended up implementing a custom scheduler for the VM, one that is more focused on the CPU side of things and the query graph execution. This also allowed me to add plugins for controlling and allocating resources at a very fine level.
English
2
1
35
1.9K
Sunny Bains @TiDB (@sunbains)
@sloppyquorum The context switch overhead was very high, ~39%. This is a tighter loop, and the VM threads do the context switching based on the scheduling, the query graph shape and the resource settings. It’s not general purpose.
English
0
0
3
88
Sunny Bains @TiDB (@sunbains)
@acodechef Yes and no (mostly yes). The no part is that it has pluggable policies that track cpu, net, mem usage and IO. In these tests it’s a single DML so it ends up as an event loop because the detail is set as burstable.
English
1
0
2
85
rugwiro🚀 (@acodechef)
@sunbains Interesting stuff. Is your VM some kind of event loop?
English
1
0
0
86
Sunny Bains @TiDB (@sunbains)
Switching context frequently kills mental and software performance.
English
3
4
38
3.3K
Sunny Bains @TiDB retweeted
siddontang (@siddontang)
Your SSD is a cache. S3 is the real database. That sounds wrong until you see the math:
• S3: $0.023/GB/month
• EBS gp3: $0.08/GB/month
• 3x replication on EBS: $0.24/GB/month
At 10TB, that's $230 vs $2,400/month.
The trick is making S3 feel like local disk. That's what disaggregated storage engines do: hot path stays fast, cold path stays cheap. The future of databases isn't faster disks. It's smarter caching on infinite storage. The next challenging question is how to build a latency-sensitive OLTP database based on S3.
#CloudNative #S3 #ObjectStorage #DatabaseEngineering #DistributedSystems #TiDB #DataInfra
English
17
25
567
222.2K
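The cost comparison in the retweet above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the prices are the ones quoted in the post, 10 TB is taken as 10,000 GB (an assumption), and `monthly_cost` is a hypothetical helper.

```python
# Per-GB monthly prices as quoted in the post (not fetched from any price list).
S3_PER_GB = 0.023
EBS_GP3_PER_GB = 0.08
EBS_REPLICAS = 3  # the post assumes 3x replication on EBS


def monthly_cost(gigabytes, price_per_gb, replicas=1):
    """Monthly storage cost in dollars for the given capacity and replica count."""
    return gigabytes * price_per_gb * replicas


capacity_gb = 10_000  # assumption: 10 TB counted as 10,000 GB
s3_cost = monthly_cost(capacity_gb, S3_PER_GB)
ebs_cost = monthly_cost(capacity_gb, EBS_GP3_PER_GB, EBS_REPLICAS)

print(f"S3:  ${s3_cost:,.0f}/month")
print(f"EBS: ${ebs_cost:,.0f}/month")
print(f"EBS is about {ebs_cost / s3_cost:.1f}x the S3 cost")
```

The comparison covers storage price only; request charges and data transfer are outside what the post quotes.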
Sunny Bains @TiDB (@sunbains)
@ahmetb TiDB had to change to VictoriaMetrics for a big deployment because VictoriaMetrics solved the problem.
English
1
0
7
374
ahmetb (@ahmetb)
@sunbains i wanna hear real-world stories of how many billion time series, how many storage nodes setup etc. but I'm not sure if any uber-scale companies took it up yet. just saw somewhere openai uses it.
English
3
0
1
1K
Sunny Bains @TiDB retweeted
Murat Demirbas (Distributolog)
I wrote a Hybrid Logical Clock visualizer app using claude code. (Ok, claude did the work, I just PMed claude.) You can create a send event by drag and drop, a local event by double click, and take a snapshot at T-1, by pressing snapshot button. Pretty neat.
English
1
8
64
4.1K