Sergei Vassilvitskii

143 posts

@vsergei

Mostly in SF · Joined November 2009
251 Following · 442 Followers
Sergei Vassilvitskii retweeted
Jalaj Upadhyay @jalajupadhyay
The third iteration of NYC Privacy Day is going to happen this October. Register and learn about the cutting-edge work done in security and privacy in the last few months :) rsvp.withgoogle.com/events/nyc-pri…
Aryeh Kontorovich @aryehazan
In fact, as a challenge, name an empirically successful algorithm that arose from theory. All I got is boosting.
Aryeh Kontorovich @aryehazan

@rogtron Theory is almost always playing catch-up to empirically successful algorithms. This was certainly the case with regression, Nearest Neighbor, SVM, random forest.

Sergei Vassilvitskii @vsergei
@aryehazan The history of the original k-means is interesting, but won't fit in the margins of this tweet. However, k-means++, now the standard way to initialize k-means, did come from theory.
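For readers unfamiliar with the initialization mentioned in this reply, here is a minimal sketch of k-means++ seeding (often called D² sampling): the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The function name `kmeanspp_seed`, the tuple-of-floats point format, and the uniform fallback for degenerate inputs are illustrative assumptions, not taken from the thread.

```python
import random


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (k-means++ seeding).

    `points` is a list of equal-length tuples of numbers (an assumed format).
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # First center: chosen uniformly at random.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Update nearest-center distances with the new center.
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return centers


# Usage: seed two centers from three points, two of which coincide.
seeds = kmeanspp_seed([(0.0, 0.0), (0.0, 0.0), (10.0, 10.0)], k=2,
                      rng=random.Random(0))
```

The seeds would then be handed to an ordinary k-means (Lloyd) iteration; only the seeding step is sketched here.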
Aryeh Kontorovich @aryehazan
@vsergei Are you telling me that k-means arose from theory and nobody was running it on data before some sort of analysis became available?
Sergei Vassilvitskii retweeted
Michael Dinitz @mdinitz
Another new paper on the arxiv to talk about: arxiv.org/abs/2308.10316 . This paper is my first foray into differential privacy, which was fun and forced me to learn a lot.
Sergei Vassilvitskii retweeted
Michael Dinitz @mdinitz
New paper just hit the arxiv, and it was one of the most fun and interesting research projects that I've ever worked on: arxiv.org/abs/2308.05067 . Long story short, we found some super interesting and surprising behavior in one of the most well-studied online problems: ski rental!
Sergei Vassilvitskii retweeted
Alexey Kurakin @alexey2004
Training ML models with differential privacy could be challenging. To aid practitioners, we wrote a detailed survey with known best practices of DP-training of ML models: arxiv.org/abs/2303.00654
Sergei Vassilvitskii retweeted
Michael Dinitz @mdinitz
Super excited about a new preprint, "Faster Matchings via Learned Duals", with Sungjin Im, Thomas Lavastida, Ben Moseley, and @vsergei . Long story short: we can use ML to massively speed up min-cost perfect matching computations! arxiv.org/abs/2107.09770
Sergei Vassilvitskii @vsergei
@Aaroth A nice simple exercise. Suppose you are estimating the mean of a Gaussian distribution from iid samples. Compare the DP error to the finite-sample error. TL;DR: with DP you need O(\sqrt{\log n}) more samples to get parity.
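One possible way to set up the exercise in this reply is sketched below. It is not the author's derivation: it assumes the samples are clipped to [-R, R] with R = sigma * sqrt(2 ln n), that the true mean is known to lie near 0, and that the clipped mean is released with the Laplace mechanism. Clipping bias is ignored. The function name `error_scales` and its parameters are hypothetical.

```python
import math


def error_scales(n, sigma=1.0, epsilon=1.0):
    """Return (sampling_error, privacy_error) for a clipped, Laplace-noised mean.

    Assumptions (not from the thread): clipping radius R = sigma*sqrt(2 ln n),
    true mean near 0, pure epsilon-DP via the Laplace mechanism, n >= 2.
    """
    # Standard error of the mean of n iid N(mu, sigma^2) samples.
    sampling_error = sigma / math.sqrt(n)

    # Assumed clipping radius; grows like sqrt(log n).
    clip_radius = sigma * math.sqrt(2.0 * math.log(n))

    # Changing one clipped sample moves the mean by at most 2R / n.
    sensitivity = 2.0 * clip_radius / n

    # Laplace noise with scale b = sensitivity / epsilon has std b * sqrt(2).
    privacy_error = math.sqrt(2.0) * sensitivity / epsilon

    return sampling_error, privacy_error


# Usage: compare the two error terms at a few sample sizes.
for n in (100, 10_000, 1_000_000):
    sampling, privacy = error_scales(n)
    print(f"n={n}: sampling={sampling:.6f}, privacy={privacy:.6f}")
```

Under these assumptions the privacy term picks up a sqrt(log n) factor from the clipping radius. At n = 10,000 with sigma = 1 and epsilon = 1, the sampling term is 0.01 and the privacy term is roughly 0.0012.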
Aaron Roth @Aaroth
This is good news for anyone who is worried that differential privacy will render Census data unusable: its effect on statistics seems to be comparable to taking a very large random sample of the data, which is better than what statisticians usually get to work with.
Aaron Roth @Aaroth
"if 𝜖 = 1.0 ... TopDown will be like the uncertainty introduced by working with a 50% sample of the full dataset; if 𝜖 = 2.0, it will be like working with a 75% sample; and if 𝜖 = 6.0, it will have accuracy matching a 95% sample, which is pretty close to having the full data"
Brendan Dolan-Gavitt @moyix
Does anyone have an example of a pre-registered replication of a paper in computer science?
Sergei Vassilvitskii @vsergei
Coming out of twitter hibernation to say that part 1 of the clustering book with @geomblog is available at clustering.cc ! As we say in the intro: Clustering is more than just a collection of tools... it is a systematic way to think about how data should be organized.