Igor Canadi

512 posts

@igorcanadi

Building big databases at @OpenAI. Previously Rockset and Facebook.

San Francisco, joined April 2009
1.3K following, 935 followers

Igor Canadi retweeted
Venkat Venkataramani (@iamveeve)
@RocksetCloud is on a mission to eliminate all the cost and complexity associated with real-time analytics. Today's Rollup launch is a very important milestone in this journey. Read more👇

Igor Canadi (@igorcanadi)
Very excited about our Series B today! A great milestone on our mission to build some of the finest database tech, optimized for developers. P.S. We are hiring - DMs are open.

Igor Canadi (@igorcanadi)
Dream team!

Igor Canadi (@igorcanadi)
@narayanarjun @MarkCallaghanDB This is how we use it, exactly - we download all files locally. Still investigating if we can keep a portion of the data in the cloud without a local cache. Likely possible for scan queries - S3's throughput is impressive.

Arjun Narayan (@narayanarjun)
@MarkCallaghanDB But shouldn't basically every read be served from your local cache? This way you don't need to replicate your RocksDB - the S3 takes care of that - and your local cache can be on non-persistent SSDs.

Igor Canadi (@igorcanadi)
@MarkCallaghanDB @iamveeve Surprisingly accurate. ;) We allow updates to existing documents, so we keep a PK -> rowid index. We use RocksDB's merge operator, but the efficiency of reading chunks that have merge updates is still an open question for us.

Igor Canadi (@igorcanadi)
@vfonic Strongly disagree. GraphQL exposes business logic, which is rarely fully expressible in SQL. Opening up a SQL interface to the world is also very risky: you need strong protection against expensive queries.

Igor Canadi (@igorcanadi)
@janicduplessis Hey @janicduplessis I work at Rockset, would love to have you try it out and share your thoughts. First 2GB are free, happy to help if you get stuck with anything.

Janic (@janicduplessis)
Has anyone tried Rockset? Looks interesting for analytics + search on top of nosql data stores rockset.com

Igor Canadi (@igorcanadi)
@tlipcon @markcallaghan @MongoDBEng @RocksetCloud That’s correct. We have specialized IValue types for arrays of same-type scalars, on which we can do vectorized execution. For mixed-type arrays (columns) you are out of luck, but when you have an implicit schema the speed should be similar to what you get with an explicit schema.

Todd Lipcon (@tlipcon)
@igorcanadi @markcallaghan @MongoDBEng @RocksetCloud Looks like the 128-bit IValue (and dynamic typing in general) would really inhibit your ability to do vectorized query execution unless you specialize operators for runs of same-typed data and avoid materializing IValues in those cases?

Igor Canadi (@igorcanadi)
@markcallaghan @MongoDBEng @RocksetCloud Still early, but we have a project to make columnar better by combining many values into a single key-value in RocksDB through merge operators. This is a user-space approach though; haven't explored improvements to the LSM itself.
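
The idea of packing many column values into one key-value pair through merge operands can be sketched with a toy in-memory store. This is only an illustration of merge-operator semantics, not RocksDB's API and not Rockset's actual layout: `ToyMergeStore`, `chunk_key`, and `CHUNK_ROWS` are hypothetical names, and the chunking scheme (a fixed number of row ids per key) is an assumption.

```python
from collections import defaultdict

CHUNK_ROWS = 4  # hypothetical: number of row ids packed under one key


def chunk_key(column, rowid):
    """Map a (column, rowid) pair to the key of the chunk that holds it."""
    return (column, rowid // CHUNK_ROWS)


class ToyMergeStore:
    """Toy store that mimics merge-operator semantics (not RocksDB's API)."""

    def __init__(self):
        self._base = {}                      # key -> {rowid: value}
        self._operands = defaultdict(list)   # key -> pending (rowid, value) updates

    def merge(self, key, operand):
        # A write only appends an operand; there is no read-modify-write.
        self._operands[key].append(operand)

    def get(self, key):
        # A read pays the cost of folding pending operands into the base chunk.
        chunk = dict(self._base.get(key, {}))
        for rowid, value in self._operands.get(key, []):
            chunk[rowid] = value
        return chunk

    def compact(self, key):
        # Stand-in for compaction: fold operands once so later reads are cheap.
        self._base[key] = self.get(key)
        self._operands.pop(key, None)


store = ToyMergeStore()
store.merge(chunk_key("age", 0), (0, 31))
store.merge(chunk_key("age", 1), (1, 42))
store.merge(chunk_key("age", 0), (0, 32))  # update to an existing row
```

In this sketch, reading a chunk with many pending operands has to replay all of them, which is one way to picture why the read efficiency of chunks with merge updates is a concern.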

Todd Lipcon (@tlipcon)
@markcallaghan @MongoDBEng @RocksetCloud Yea, still fixed schema. Funny enough, the original design looked more like Rockset, with schema inference, but we started with a fixed schema and never went back and did the schemaless part.

Igor Canadi (@igorcanadi)
@FranckPachot @markcallaghan @MongoDBEng @RocksetCloud We build search indexes on columns, which allow us to quickly evaluate a conjunction of predicates (by intersecting posting lists). There are some queries that would be faster with multi-column indexes, I agree. Future work. :)
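
Evaluating a conjunction by intersecting posting lists can be sketched as follows. This is a minimal illustration, assuming each single-column index maps a `(column, value)` pair to a sorted list of row ids; the function names and the index shape are hypothetical, not Rockset's implementation.

```python
def intersect_postings(a, b):
    """Intersect two sorted lists of row ids with a two-pointer merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def evaluate_conjunction(index, predicates):
    """Return row ids matching every (column, value) equality predicate.

    index: dict mapping (column, value) -> sorted list of row ids.
    """
    # Start from the shortest list so intermediate results stay small.
    lists = sorted((index.get(p, []) for p in predicates), key=len)
    if not lists:
        return []
    result = lists[0]
    for postings in lists[1:]:
        result = intersect_postings(result, postings)
        if not result:
            break  # no row can match once the intersection is empty
    return result


index = {
    ("city", "SF"): [1, 3, 5, 7],
    ("lang", "en"): [2, 3, 7, 9],
}
matches = evaluate_conjunction(index, [("city", "SF"), ("lang", "en")])
```

A multi-column index would let a query like this read one posting list for the combined key instead of intersecting two, which is the trade-off the tweet alludes to.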