Sasha Aickin

19.2K posts

@xander76

Sasha Aickin. Ex-CTO @ Redfin. (Former?) documentary filmmaker. Avid cook/book club hoster. He/him. @[email protected]

Joined May 2007
946 Following 2.5K Followers
Pinned Tweet
Sasha Aickin
Sasha Aickin@xander76·
Sasha Aickin: "Speedy, confident" - New York Times
0
0
11
0
Sasha Aickin
Sasha Aickin@xander76·
Hey @ConEdison, one of my two power mains was cut off in February and you made me hire an electrician to prove it was your issue. We filed a claim to get that money back, but we've never heard back, and support can't help us. Emailed outageclaims@coned as well, no response.
1
0
1
544
Sasha Aickin
Sasha Aickin@xander76·
We are now SOC2 Type 2 compliant. Yay! But also: if you're a startup going for compliance, it's a messy process that's hard to figure out, and I wrote up all the good, bad, and ugly stuff I wish I'd known before I started.
Libretto@getlibretto

We're officially SOC2 Type 2 compliant at Libretto! 🎉 But forget the usual corporate speak—here's an honest look at the weird, messy reality of SOC2 compliance at a startup. Check out what we learned the hard way: #StartupLife #SOC2 #RealTalk libretto.ai/blog/what-i-wi…

0
0
1
213
Sasha Aickin retweeted
Libretto
Libretto@getlibretto·
We're officially SOC2 Type 2 compliant at Libretto! 🎉 But forget the usual corporate speak—here's an honest look at the weird, messy reality of SOC2 compliance at a startup. Check out what we learned the hard way: #StartupLife #SOC2 #RealTalk libretto.ai/blog/what-i-wi…
0
1
0
372
Sasha Aickin retweeted
Libretto
Libretto@getlibretto·
GPT-4.5 looks really interesting, but this pricing is... whoa.
Libretto tweet media
0
1
2
166
Sasha Aickin
Sasha Aickin@xander76·
@jxnlco To be fair, this is basically the worst it's been in the last 3 years.
0
0
0
29
jason liu
jason liu@jxnlco·
Holy fuck New York is so cold. What the fuck
12
0
68
7.3K
Sasha Aickin retweeted
David H. Montgomery
David H. Montgomery@dhmontgomery·
Remember, absolutely none of the news today matters until results start coming in. None of it. Don’t read tea leaves from turnout. Absolutely ignore leaked exit polls. We are in the eye of the storm for political news, so go touch grass. There’s plenty of news to come tonight.
5
25
146
13.8K
Jeremiah Baumann
Jeremiah Baumann@jdbau·
My dad, working the polls in Billings MT, says biggest surge of same-day registration since 2008
4
0
34
3.8K
Sasha Aickin
Sasha Aickin@xander76·
Reminder that early exit polls are nearly useless in terms of predictive value. They are neither consistently right nor wrong. Don't freak out if they seem bad; don't celebrate if they seem good.
0
1
0
565
Sasha Aickin
Sasha Aickin@xander76·
Like, generating Terraform files to solve AWS config feels like a reallllllly deterministic task. I like AI, but why are we injecting AI here?
0
0
2
89
Sasha Aickin
Sasha Aickin@xander76·
I'm in a certain popular SaaS tool that helps you with security configuration, and I'm learning it has a new AI feature that will generate Terraform files to help remediate problems. But the files seem to randomly conflict with each other; I'm not sure why AI is being used here at all.
1
0
1
147
Sasha Aickin
Sasha Aickin@xander76·
@conorsen That's also how I remember it. All of them posted vague tweets about how the final result was pretty clear, starting at some point on Wednesday (I think reasonably early on Wednesday).
0
0
2
426
Conor Sen
Conor Sen@conorsen·
Someone here might remember the specific moment better than me, but I’m pretty sure election Twitter was confident in the call by Wednesday morning. Obviously the networks need more time. But we’ll know here first.
52
8
379
60.4K
Sasha Aickin
Sasha Aickin@xander76·
Pre-registering my extremely cold Selzer take: T+5 or less is good news. T+10 or more is bad news. Anything in between is kinda mushy and shouldn't move many priors.
1
0
2
748
Sasha Aickin
Sasha Aickin@xander76·
@jdbau That also could feel pretty ominous, tbh.
1
0
1
71
Jason Wei
Jason Wei@_jasonwei·
@xander76 github.com/openai/simple-…#L104C19-L104C97 Updated here!
1
0
4
1K
Jason Wei
Jason Wei@_jasonwei·
Excited to open-source a new hallucinations eval called SimpleQA! For a while it felt like there was no great benchmark for factuality, and so we created an eval that was simple, reliable, and easy-to-use for researchers.
Main features of SimpleQA:
1. Very simple setup: there are 4k diverse fact-seeking questions written by humans where there can only be a single, indisputable answer. Model completions are graded by an autograder as either correct, incorrect, or not attempted.
2. We created it so that it would be challenging for the current class of frontier models; both o1-preview and Claude Sonnet 3.5 are below 50% accuracy.
3. Reference answers have high correctness. Questions are written to be non-ambiguous and reference answers were verified by two independent annotators. Questions are also written to be timeless, so SimpleQA can be a useful benchmark even 5 or 10 years from now.
The way that I think about evals is that they are an incentive for the AI community. New benchmarks in AI get saturated very quickly, and what they incentivize gets encoded into the next generation of language models. With a good hallucinations eval, hopefully the next wave of language models will be more trustworthy and reliable!
Jason Wei tweet media
28
122
861
106.4K
Sasha Aickin
Sasha Aickin@xander76·
@_jasonwei Read through the paper. Nice work! One thing that was maybe a little concerning, though, was that one of your example questions seems to have more than one answer. It seems that Akiko Kumahira was known as Akiko Kumahira Comrie or Akiko Comrie after her marriage.
0
0
2
37
Sasha Aickin
Sasha Aickin@xander76·
@_jasonwei This is really neat! I'm curious, are you open sourcing the actual question set, or just the eval code? I tried to find the questions and it looks like it's downloading them from a private URL. (But maybe I'm just misunderstanding!)
2
0
3
1.3K