Sanden Gocka

179 posts

@sandengocka

Founder @earmark_ai 🚀 • prev director of engineering @productplan, led iOS 📱 @mindbody • obsessed with great UX and sharing learnings along the way ❤️

San Luis Obispo · Joined July 2009
200 Following · 81 Followers
Sanden Gocka @sandengocka
I would want it to interact like voice mode on ChatGPT today but with the context of your codebase and the active thread. Imagine being able to rough out a plan through back and forth conversation, then implement when you’ve reached a sweet spot. Reviews would still happen in PR format
1
0
1
13
phonescloud @phonescloud99
@sandengocka @Dimillian Voice-only coding sounds useful for triage, but the review step needs a hard stop. Would you want it to read back the diff and test result before it writes, or only summarize a plan until you open Codex Mobile?
1
0
0
24
Sanden Gocka @sandengocka
is anyone building a version of codex mobile that uses the new realtime api so you can interface with it purely through airpods? thinking riffing with a coworker, having it implement and summarize back, continue to converse and build? @Dimillian
1
0
3
87
Sanden Gocka @sandengocka
The worst part of meetings is choosing between listening and taking notes. So we built Pins 📌 One click during the call captures the moment and saves the context for later. Try it yourself at tryearmark.com
1
0
3
100
Sanden Gocka @sandengocka
opus 4.8 with ultracode while planning is like talking with an architect whereas with codex 5.5 you are the architect. However, in implementation, 4.8 can start to drift and hallucinate a little while 5.5 can really drive it to completion exactly how you want it. 4.8 is such a great model to riff with
0
0
5
425
Dan Shipper 📧 @danshipper
Almost a week later! What are your thoughts on Opus 4.8? We were extremely bullish on it in testing—it seems the response was more tepid once y'all got your hands on it. If you disagreed with our take I'm curious why so we can tune our evaluations! One theory I have is that by nature it pushes on your frame a little more, and the results are high-variance—sometimes it does something amazing, and sometimes it disagrees in a way that is obviously wrong. But curious how you're feeling and what you're reaching for after a few days of testing
Dan Shipper 📧 @danshipper

BREAKING: Anthropic just dropped Opus 4.8—and it is a MONSTER We've been testing for about a week @every and our verdict is they could've just called it Opus 5, it's that good. Here's our vibe check: - Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63—a hair higher than GPT-5.5's score of 62, and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works. HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results. - Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark—measuring models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context. HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had higher incidence of AI-isms—we found best results with high. - Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark. - Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic. THE BAD: These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude. Anthropic is back baby! Read the rest on @every: every.to/vibe-check/opu…

86
4
114
56.2K
Sanden Gocka @sandengocka
@Dimillian no one will ever understand the true pain of seeing a merge conflict in project.pbxproj
0
0
0
69
Sanden Gocka @sandengocka
an exciting case of organic enterprise expansion
Earmark @earmark_ai

Seven PMs at @ServiceTitan started using @earmark_ai in November. There was no formal rollout plan, no leadership mandate. Seven people tried it, and we watched to see what would happen. By February, Design joined. Then Engineering, then other teams from corners of the company we hadn't spoken to. Seven users became 60+ over six months, entirely through word of mouth and amazing internal champions. That's the growth signal we care about most. When we surveyed the team twice, 67% said they'd be very disappointed to lose Earmark. Every single respondent said at least somewhat disappointed. But the number I keep coming back to isn't the satisfaction score. It's what they said about time. Every respondent reported getting 5+ hours back per person, per week. We asked where those hours went – not "do you feel more productive," but where did they actually go. The answers were consistent: customer interviews, vision casting, deeper analytical work. One person wrote "thinking time – the part that gets cut first." Earmark gave it back. But the number that shaped the product most isn't in any of that. It's the requests. "Pipe Earmark into Claude." Shipped. "Don't store my sensitive meetings." Shipped. "Stop rebuilding the same task." Shipped. The ServiceTitan team didn't just use Earmark for six months. They built it with us. That's the partnership worth talking about.

1
0
2
65
Sanden Gocka @sandengocka
am I the only one who keeps typing @convex when I mean Codex, and Codex when I mean Convex? honestly both are GOATed though, my brain is the bug
0
0
0
36
Sanden Gocka @sandengocka
@Dimillian /side then prompt, didn’t know you could include the prompt with /side, that’s rad
0
1
7
347
Thomas Ricouard @Dimillian
How do you use /side? Do you first /side then prompt? Or do you /side <prompt> directly?
69
2
78
15.2K
Oana Olteanu @oanaolt
It started with 996 Now founders are bragging about 9127 on podcasts What’s next, 24/7/365 with an IV drip and a pee bottle? It’s the same AI slop psychosis spamming Cursor and Claude without even checking the logs. “They can do it all” You aren’t moving fast, you’re just shipping broken code from your desk mattress.
9
1
27
5K
Karri Saarinen @karrisaarinen
Normally I wouldn’t share something from a competitor, but this one seemed interesting. Since day one, @linear has been built for the most ambitious teams in Silicon Valley and beyond. That was always intentional, and we're very happy to serve those teams. Teams at OpenAI, Coinbase, Ramp, and frontier companies across fintech, healthcare, aerospace, and more use Linear to build and ship. Even teams building supersonic jets, like Boom Supersonic. We’re now seeing companies switch to Linear every week, and the pace is accelerating. We’ll keep doubling down helping Silicon Valley and other high performing teams on the planet build better software, faster even if our competition decides not to.
52
13
632
95.6K
Sanden Gocka @sandengocka
@linear @linear is there a limit for guided generations? I noticed on a normal size PR guides are generated, however, on some of my larger (+10k additions) PRs guides are not generated. I understand those cost more, if this is the case, wouldn't mind adding an api key
0
0
0
13
Sanden Gocka @sandengocka
@linear I've been waiting soo long for this!! looking for a review tool that can break apart files by intent (aka chapters)...I'm a firm believer this is how to contextually load a PR faster into your brain vs a GIANT list of files with no grouping...thank you @linear for releasing this!
1
0
4
1.3K
Linear @linear
Code review, but faster. Introducing Diffs. A new way to review PRs, directly inside Linear.
• Realtime updates
• Guided reviews with AI (beta)
• Focused notifications
• Iterate with coding agents
• Threaded comments
58
69
1.6K
433.9K
Sanden Gocka @sandengocka
@danshipper @every @danshipper just wanted to say, thank you for these 10 min vibe checks on model drops. for those of us with little time to spare, these are incredibly valuable to stay informed with all the rapid changes...keep up the good work!
1
0
1
477
Dan Shipper 📧 @danshipper
Opus 4.8 is a GAMECHANGER Here's the full vibe check from @every:
18
18
371
32.2K
Sanden Gocka @sandengocka
@waynesutton YES. This would probably be enough for me to no longer open an IDE
0
0
1
69
Wayne Sutton @waynesutton
All I want in codex is the option to open and edit files without having to add to chat or open in another IDE.
16
1
50
6.5K
Sanden Gocka @sandengocka
@mikeysee @convex your video is great! the fact that you don't have to wrangle separate authz systems in your mind is what makes convex so powerful
0
0
0
6
Mikeysee @mikeysee
@sandengocka @convex Ye fair enough. I probably should have mentioned that as a valid use for RLS
1
0
1
20
Convex @convex
row-level-security sucks? w/ @mikeysee (hint: just use code, agents like code)
9
1
34
9.7K
Anthony Kroeger @kr0der
the only downside of /side chats right now is that they just randomly die, but other than that, it's one of the best features in Codex, 100% recommend using them
4
0
18
2K