Dan Sotolongo

337 posts

@sortalongo

Streaming @SnowflakeDB; before @google, @twitter. I like distributed systems, types, and databases. Also, doing stuff outside.

Seattle, WA · Joined January 2012
196 Following · 264 Followers
Dan Sotolongo@sortalongo·
@julianhyde Thanks! We're very excited about the use cases it can enable, and to share what we've learned through the process.
Julian Hyde@julianhyde·
@sortalongo By the way, congratulations on getting this work into Snowflake. It only sounds complicated until you use it (just like relational DB!). Therefore getting it into customers' hands is crucial.
Julian Hyde@julianhyde·
In DB theory, a constraint determines, statically, whether a tuple, relation or database value is valid. A "transition constraint" would determine whether a change to a database is valid (e.g. modifies an even number of rows). Is this a new concept? (Don't say triggers.)
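To make the distinction concrete, here is a minimal Python sketch, assuming a relation is modelled as a set of tuples and a change as a pair of inserted and deleted row sets. The names `Change`, `modifies_even_number_of_rows`, and `apply_checked` are hypothetical, chosen only for illustration; this is one possible reading of "transition constraint", not an established definition.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

Row = Tuple
Relation = FrozenSet[Row]


@dataclass(frozen=True)
class Change:
    """A proposed change: rows to insert and rows to delete (hypothetical model)."""
    inserted: Relation
    deleted: Relation


def modifies_even_number_of_rows(before: Relation, change: Change) -> bool:
    # A transition constraint looks at the change itself, not only at the
    # resulting state. This one ignores `before`, but it is passed in because
    # other transition constraints may need it.
    return (len(change.inserted) + len(change.deleted)) % 2 == 0


def apply_checked(
    before: Relation,
    change: Change,
    transition_ok: Callable[[Relation, Change], bool],
) -> Relation:
    """Apply the change only if the transition constraint accepts it."""
    if not transition_ok(before, change):
        raise ValueError("transition constraint violated")
    return (before - change.deleted) | change.inserted
```

A static constraint would be a predicate over the returned relation alone; the transition constraint above can reject a change even when both the old and the new state are individually valid.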
Dan Sotolongo@sortalongo·
@julianhyde I think there's certainly a strong analogy, if not a direct correspondence between the two. Dependent types let you apply logic in the types, so they can certainly be used to enforce constraints. How mutation and transactions fit into it is more complicated...
Julian Hyde@julianhyde·
@sortalongo When we spoke about Morel a couple of years ago, you brought up dependent types. I am *still* trying to decide whether constraints in DB theory are the same as dependent types in PL theory. And I don’t want to adopt anything that will mess with my beloved Hindley-Milner type inference.
Julian Hyde@julianhyde·
Am I the only one thinking it’s time to explore the connections between DB and PL? Maybe a coffee group.
Dan Sotolongo retweeted
Tyler Akidau@takidau·
Incremental processing is the foundation of efficient stream processing across the entire latency spectrum, yet SQL lacks a native mechanism for querying changes to a table over time. Today we present our #SIGMOD paper describing Snowflake Change Queries dl.acm.org/doi/10.1145/35…
Dan Sotolongo@sortalongo·
@embano1 @SnowflakeDB Haha perhaps not the best choice of name… it was intended as a deep talk for people familiar with the 101 article 😅
Dan Sotolongo retweeted
lloyd tabb@lloydtabb·
Question for Data Twitter. What are your most important SQL Window Function use cases? Rank? Sequence numbers? Rolling Averages? Row Totals? Cumulative Sums? What else?
Dan Sotolongo retweeted
Caltech Astro Outreach@CaltechAstro·
Galactic atmospheres influence how galaxies live and die, but how do they work? Join us Friday, December 2 @ 7PM PT for a public astronomy lecture on galactic atmospheres. It will be hosted in-person at Caltech and streamed on Youtube Live. See you there! youtu.be/xmxW7z7IjfY
Dan Sotolongo@sortalongo·
The #current2022 talk I gave with @takidau is online! It’s called “Streaming 101 Revisited: A Fresh Hot Take”. We talk about what’s missing in existing models and propose new ideas to make declarative stream processing easier and more powerful than ever.
Dan Sotolongo@sortalongo·
@emaxerrno @ccomeau79 @takidau .@ccomeau79 Here’s how I’d put it: we start with normal SQL (just like Mz), then add in multi-temporality, changelogs, and better watermarks. Like @emaxerrno said: lower latencies. That puts pressure on declarative models to keep up, so we need new concepts.
🕹️ Alexander Gallego ⚡️
@ccomeau79 @takidau @sortalongo Tyler covers Materialize there. I think it is more conceptual, for those of us building streaming systems. Effectively a different approach from timely (mz). The important part is the trend towards lower-latency stateful stream processing.
Mim@mim_djo·
TIL how Delta Lake tables support deleting rows: find the Parquet file with those rows, create a new Parquet file without those rows, and tag the previous Parquet file as obsolete!!! That seems very efficient 🤡
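The three steps in that tweet can be sketched as a toy copy-on-write delete in plain Python. This is an illustrative model only, not Delta Lake's actual implementation or API: files are an in-memory dict from a file name to its rows, the set of "live" files stands in for the table's transaction log, and the function name `cow_delete` and the `.rewrite` suffix are made up for the example.

```python
from typing import Callable, Dict, List, Set


def cow_delete(
    files: Dict[str, List[dict]],
    live: Set[str],
    predicate: Callable[[dict], bool],
) -> Set[str]:
    """Return the new set of live files after deleting rows matching `predicate`.

    Files are immutable: an affected file is never edited in place. Its
    surviving rows are copied into a new file, and the old file is dropped
    from the live set (that is, tagged obsolete) but left in `files`.
    """
    new_live = set(live)
    for name in sorted(live):
        rows = files[name]
        kept = [r for r in rows if not predicate(r)]
        if len(kept) == len(rows):
            continue  # no matching rows, file is untouched
        new_live.discard(name)  # old file becomes obsolete
        if kept:
            new_name = f"{name}.rewrite"  # hypothetical naming scheme
            files[new_name] = kept
            new_live.add(new_name)
    return new_live
```

The cost is proportional to the size of every file that contains a matching row, which is why deleting a single row can rewrite a whole file; the upside is that readers holding the old live set still see a consistent snapshot.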
Dan Sotolongo@sortalongo·
@bothra90 They seem to be a “used by few, needed by many” kind of thing. Some who use them swear by them, but not enough DBs support them for them to be mainstream. Arguably, anyone building SCD2 tables would benefit from them.
Abhay@bothra90·
@sortalongo I realized that what I was asking is being addressed by updating the value of valid_to at the time of deletion from infinity to the "current time". I presume it's not common for people to use bi-temporal tables though. What do you see in your experience?
Dan Sotolongo@sortalongo·
@bothra90 E.g. if you have a row with valid_to=null, and the table is final where valid_to < 2022-10-19T00:00, that means the row cannot be deleted until after the 18th. So any query that needs to join with this row before that time can do so, but joins later may have to wait for progress.
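The rule in that example can be written as a small predicate. This is a sketch under stated assumptions: the function name `can_join_now` and the parameter `final_before` (the bound below which `valid_to` values are final) are hypothetical, and a row whose `valid_to` has already been set is assumed never to be reopened.

```python
from datetime import datetime
from typing import Optional


def can_join_now(
    valid_to: Optional[datetime],
    join_time: datetime,
    final_before: datetime,
) -> bool:
    """True if the row's validity at `join_time` is already settled."""
    if valid_to is not None:
        # Row is closed; its validity interval is fully known (assumption:
        # closed rows are never reopened).
        return True
    # Open row: it cannot receive a valid_to earlier than `final_before`,
    # so it is known to be valid at any time before that bound.
    return join_time < final_before


frontier = datetime(2022, 10, 19, 0, 0)
# A join on the 18th can proceed against an open row.
early = can_join_now(None, datetime(2022, 10, 18, 12, 0), frontier)
# A join on the 20th has to wait for the table to make progress.
late = can_join_now(None, datetime(2022, 10, 20, 0, 0), frontier)
```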
Abhay@bothra90·
@sortalongo Would the NULL case cover the case where the value might change at an unknown time in the future? For example, the city of a user can change but we don’t know if/when that’s going to happen. Also, how can the system make progress in this case?
Dan Sotolongo@sortalongo·
@bothra90 Yes, of course! This schema is intended to be a [bitemporal table](en.wikipedia.org/wiki/Temporal_…). That’s supported by some databases natively, but one can implement it manually in others. Typically, valid_to is set to 9999-12-31 or NULL until the row is deleted.
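A minimal sketch of the manual approach, using the inventory schema asked about in this thread, (id, count, valid_from, valid_to, last_updated), and NULL (Python `None`) as the "not yet deleted" marker. The class and function names are hypothetical, and rows are held in a plain list rather than a real database.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class InventoryRow:
    id: int
    count: int
    valid_from: datetime           # start of validity
    valid_to: Optional[datetime]   # None until the row is deleted
    last_updated: datetime         # when the system recorded this version


def logical_delete(
    rows: List[InventoryRow], row_id: int, at: datetime, now: datetime
) -> List[InventoryRow]:
    """Close the open version of `row_id` instead of physically removing it."""
    return [
        replace(r, valid_to=at, last_updated=now)
        if r.id == row_id and r.valid_to is None
        else r
        for r in rows
    ]


def as_of(rows: List[InventoryRow], t: datetime) -> List[InventoryRow]:
    """Rows valid at time `t`, treating [valid_from, valid_to) as half-open."""
    return [
        r for r in rows
        if r.valid_from <= t and (r.valid_to is None or t < r.valid_to)
    ]
```

Because deletion only sets `valid_to`, a query as of an earlier time still sees the row, which is the property SCD2-style tables rely on.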
Abhay@bothra90·
@sortalongo @sortalongo - thx for sharing the slides! Can you clarify what it means for the inventory table to have the schema (id, count, valid_from, valid_to, last_updated)? Who's responsible for providing the last 3 columns? What if the value of "valid_to" cannot be determined?