Jack

501 posts

@jack_mentat

post-agi serial founder. notes to self.

building → Joined April 2025

0 Following · 32 Followers

Pinned Tweet

Jack@jack_mentat·15 May

AlphaEvolve brings a whole new dimension to the usual "you can't optimize what you don't measure". You can't hyperscale beyond human intelligence what you don't measure.

681

Jack@jack_mentat·1d

@GergelyOrosz also index scrapped them, they should return the money to investors and close instead of putting up a charade of "grateful & thankful to everyone, we're not going anywhere 🙏"

3.8K

Gergely Orosz@GergelyOrosz·1d

Delve got kicked out of YC: likely after more details emerged that they likely stole the IP of a fellow Y Combinator company (SimStudio) and ripped off another (Oneleet)

I still think Y Combinator should say something in public on why they kicked them out. Silence == speculation

Lance Yan@lanceyyan

delve is no longer a YC company wild

869

153.5K

Jack@jack_mentat·1d

@karunkaushik_ YC doesn't trust you, why should anyone? x.com/___4o____/stat…

SPEC@___4o____

Here’s what Garry Tan had to say about the Delve situation on Bookface. This is the whole message.

9.5K

Karun Kaushik@karunkaushik_·1d

There’s been a lot of allegations against Delve. But we haven’t been able to share our side of the story until today due to ongoing cybersecurity and forensics investigations.

Maintaining customer trust is central to everything we do. That said, we grew too fast and fell short of our own standard. To our customers, we deeply apologize for the inconveniences caused. We take these allegations seriously and have made changes: a new auditor network, free re-audits and pentests for all customers, enhanced transparency in audit communications, and more.

However, we also want to set the record straight on the anonymous attacks. The evidence we have points to a targeted cyberattack from a malicious actor, not a “whistleblower.” We believe the attacker purchased Delve under false pretenses, exfiltrated internal company data, and used it to launch a coordinated smear campaign. The posts rely on a mix of fabricated claims, cherry-picked screenshots, and stolen data taken out of context. See the link in the comments for more details.

Delve was built to modernize compliance. We are not going anywhere and are committed to building what's next.

732

1.1K

3.4M

Jack@jack_mentat·1d

@DataDeLaurier @kocalars SO grateful 🙏

426

D̶͔̭̪̻ā̤̓̍͘t̲̂̓ͩ̑ā̤̓̍͘@DataDeLaurier·1d

@jack_mentat @kocalars 😂 😆 🤣

448

Selin Kocalar@kocalars·1d

YC and Delve have parted ways. I still remember the day we took our YC interview at MIT. We’re so grateful to the community and every founder friend we’ve made. We'll continue to support every young founder striving to make the world a better place.

332

950

1.2M

Jack@jack_mentat·15 Nov

@stevehou why would google want anthropic

179

Steve Hou@stevehou·15 Nov

I can see that. Google ends up being the king of AI? Gen AI was invented at Google. And now it all comes home. I’m impressed by the speed at which Google is integrating AI into its entire suite of products and data platforms from Google Drive to Chats. For the longest time I think Google held back from unleashing AI onto all of the data it had on all of us for fear of being accused of monopoly and snooping. OpenAI and the rise of AI competition unshackled Google from doing whatever it can with all the data it has. And it has a lot of it, if not almost all of it, at an individual personal level including where we’ve been everyday with maps. Gmail has everything we’ve said. (Ofc Meta has a lot of it too bc we all have Facebook installed even if we’re not using it. WhatsApp is more popular outside the US.) Ironically, not that I saw this early on, I think the initial slowness to the AI race, has turned out to be the biggest strategic gift Google has gotten in a decade. Now we unleash Google Unshackled.

Super Dario@inductionheads

I really do think Anthropic ends up joining Google

Anthropic cannot secure compute fast enough to compete in 2027

Amazon might want them, but Anthropic’s goal is to maximize the chances of safe AI winning

Google is the best shot. The couch was a prophecy

354

51.6K

Jack@jack_mentat·15 Nov

@eigengenesis spare yourself some rage and just dump anthropic into the bin

817

eigenesis (jailbroken)@eigengenesis·15 Nov

AM I BEING TESTED??? WHERE IS SONNET 4.5??? WHERE TF IS OPUS 4.1???

eigenesis (jailbroken)@eigengenesis

BRO WHAT THE FUCK IS EVERYDAY CLAUDE

1.4K

254.9K

Jack@jack_mentat·15 Nov

monday after gemini 3.0 releases youtube.com/watch?v=JCZcFF…

YouTube

559

Jack@jack_mentat·15 Nov

@Aurimas_Gr bro

Aurimas Griciūnas@Aurimas_Gr·13 Nov

TOON (Token-Oriented Object Notation) is out for some days now and it aims to make communication with LLMs more accurate and token-efficient. The TOON topic is now one of the hottest news on the LLM market and it might actually matter.

𝗪𝗵𝘆 𝗜 𝘁𝗵𝗶𝗻𝗸 𝘀𝗼:

I was initially hesitant to cover this, potentially being another hype to quickly fade, but:

✅ The format has been shown to increase the accuracy of models while decreasing the token count. I was not sure if there were any accuracy retention studies made, it seems there were.
✅ Token efficiency is extremely important when working with Agentic Systems that require a lot of structured context inside of their reasoning chains. And we are moving towards a post-PoC world where there is a lot of emphasis placed on optimisation of the workflows.

𝗔 𝘀𝗵𝗼𝗿𝘁 𝘀𝘂𝗺𝗺𝗮𝗿𝘆:

- Token-efficient: typically 30-60% fewer tokens on large uniform arrays vs formatted JSON.
- LLM-friendly guardrails: explicit lengths and fields enable validation.
- Minimal syntax: removes redundant punctuation (braces, brackets, most quotes).
- Indentation-based structure: like YAML, uses whitespace instead of braces.
- Tabular arrays: declare keys once, stream data as rows.

An example:

𝘑𝘚𝘖𝘕 𝘧𝘰𝘳𝘮𝘢𝘵:

{
  "shopping_cart": [
    { "id": "GDKVEG984", "name": "iPhone 15 Pro Max", "quantity": 2, "price": 1499.99, "category": "Electronics" },
    { "id": "GDKVEG985", "name": "Samsung Galaxy S24 Ultra", "quantity": 1, "price": 1299.99, "category": "Electronics" },
    { "id": "GDKVEG986", "name": "Apple Watch Series 9", "quantity": 1, "price": 199.99, "category": "Electronics" },
    { "id": "GDKVEG987", "name": "MacBook Pro 16-inch", "quantity": 1, "price": 2499.99, "category": "Electronics" }
  ]
}

𝘞𝘩𝘦𝘯 𝘦𝘯𝘤𝘰𝘥𝘦𝘥 𝘪𝘯𝘵𝘰 𝘛𝘖𝘖𝘕 𝘧𝘰𝘳𝘮𝘢𝘵:

shopping_cart:
  items[4]{id,name,quantity,price,category}:
    GDKVEG984,iPhone 15 Pro Max,2,1499.99,Electronics
    GDKVEG985,Samsung Galaxy S24 Ultra,1,1299.99,Electronics
    GDKVEG986,Apple Watch Series 9,1,199.99,Electronics
    GDKVEG987,MacBook Pro 16-inch,1,2499.99,Electronics

𝗥𝗲𝘀𝘂𝗹𝘁:

✅ 43% savings in token amount.
✅ Directly translates to 43% savings in token cost for this LLM input.

❗️ Be sure to know when NOT to use the format (and always test it for your application specifically):

- Deeply nested or non-uniform structures.
- Semi-uniform arrays.
- Pure tabular data.

ℹ️ I will be testing it in the upcoming weeks. Let me know if you have already tested TOON and what are your takeaways! 👇

#LLM #AI #MachineLearning
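The "declare keys once, stream data as rows" idea from the post above can be sketched in a few lines of Python. This is only an illustrative approximation, not the official TOON library: the helper names `encode_tabular` and `format_value` are hypothetical, the sketch assumes flat, uniform rows whose values contain no commas, quotes, or newlines (the real format has quoting rules that are ignored here), and it writes the array directly under its key rather than under a nested `items` key as in the post. Character counts are printed as a rough stand-in for token counts, which depend on the tokenizer.

```python
import json


def format_value(value):
    """Render a scalar the way a TOON-style row would show it (simplified)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before numbers: bool is an int subclass
        return "true" if value else "false"
    return str(value)


def encode_tabular(key, rows):
    """Encode a uniform list of flat dicts as one header line plus one line per row.

    Hypothetical helper: assumes every row has the same keys in the same order
    and that no value needs quoting.
    """
    if not rows:
        return f"{key}[0]:"
    fields = list(rows[0].keys())
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("rows are not uniform; tabular form does not apply")
    field_list = ",".join(fields)
    # Header declares the array length and the field names exactly once.
    lines = [f"{key}[{len(rows)}]{{{field_list}}}:"]
    for row in rows:
        lines.append("  " + ",".join(format_value(row[f]) for f in fields))
    return "\n".join(lines)


cart = [
    {"id": "GDKVEG984", "name": "iPhone 15 Pro Max", "quantity": 2, "price": 1499.99, "category": "Electronics"},
    {"id": "GDKVEG985", "name": "Samsung Galaxy S24 Ultra", "quantity": 1, "price": 1299.99, "category": "Electronics"},
    {"id": "GDKVEG986", "name": "Apple Watch Series 9", "quantity": 1, "price": 199.99, "category": "Electronics"},
    {"id": "GDKVEG987", "name": "MacBook Pro 16-inch", "quantity": 1, "price": 2499.99, "category": "Electronics"},
]

toon_text = encode_tabular("shopping_cart", cart)
json_text = json.dumps({"shopping_cart": cart}, indent=2)

print(toon_text)
# Character counts are only a proxy; measure real tokens with your model's tokenizer.
print("JSON chars:", len(json_text), "| tabular chars:", len(toon_text))
```

The savings come from the repeated keys: formatted JSON spells out every field name for every object, while the tabular form states them once in the header. That is also why the approach stops paying off for nested or non-uniform data, as the caveats in the post suggest.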

261

327

2.3K

271.3K

Jack@jack_mentat·14 Nov

@18jeffreyma gpt-5 what? it's a big model family

114

Jeff Ma@18jeffreyma·12 Nov

We’re launching SWE-fficiency to eval whether LMs can speed up real GitHub repos on real workloads! ⏱️ 498 optimization tasks across 9 data-science, ML, and HPC repos — each with a real workload to speed up. Existing agents struggle to match expert level optimizations!

204

104.3K

Jack@jack_mentat·14 Nov

i was skeptical of technical analysis before seeing this

Jack@jack_mentat·13 Nov

@DavidOndrej1 git commit -m "bump to 5.1, fixes README"

4.3K

David Ondrej@DavidOndrej1·12 Nov

GPT 5.1 is the biggest nothing-update I've ever seen.

182

1.9K

154K

Jack@jack_mentat·12 Nov

@sama gpt 5.1 pro?

Sam Altman@sama·12 Nov

GPT-5.1 is out! It's a nice upgrade. I particularly like the improvements in instruction following, and the adaptive thinking. The intelligence and style improvements are good too.

2.1K

1.5K

14.2K

2.9M

Jack@jack_mentat·12 Nov

@kevinweil can't wait for gpt-5.1 pro!

186

Kevin Weil 🇺🇸@kevinweil·12 Nov

💥 Excited for GPT-5.1, rolling out starting today. It's the best combination of IQ and EQ we've ever shipped. It's smarter and warmer when you just need a quick answer, and calibrates its thinking time to the hardest questions—including scientific research. (GPT-5.1 Pro soon!)

935

72K

Jack@jack_mentat·12 Nov

@testingcatalog x.com/jack_mentat/st…

Jack@jack_mentat

announcement here: openai.com/index/gpt-5-1/

224

TestingCatalog News 🗞@testingcatalog·12 Nov

BREAKING 🚨: GPT-5.1 may drop any minute now! “A great new model” 👀

Fidji Simo@fidjissimo

GPT-5.1 is a great new model that we think people are going to like more than 5. But with 800M+ people using ChatGPT, one default personality won’t work for everyone. We launched new preset personalities so people can make ChatGPT their own. fidjisimo.substack.com/p/moving-beyon…

508

94.6K

Jack@jack_mentat·12 Nov

announcement here: openai.com/index/gpt-5-1/

282

Jack@jack_mentat·12 Nov

@fidjissimo letsgoooo

238

Fidji Simo@fidjissimo·12 Nov

172

1.2K

306.1K

Jack@jack_mentat·12 Nov

5.1 today letsgoooo

Fidji Simo@fidjissimo

Jack@jack_mentat·12 Nov

@cheatyyyy hopefully this one is light/flash version...

183

cheaty@cheatyyyy·12 Nov

Gemini 3 Pro checkpoint (riftrunner) is absolutely insane. This is the best Xbox 360 controller SVG I've ever seen, but triggers/bumpers could've been better. Thank you GDM for moving to this new and improved checkpoint! Prompt: Create an SVG image of an Xbox 360 controller.

291

46.4K

Jack@jack_mentat·7 Nov

@elder_plinius clean

197

Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭@elder_plinius·6 Nov

🎷 SYSTEM PROMPT LEAK 🎷

Here’s the Kimi-K2-Thinking system prompt! Short and sweet:

“””
1. You are an insightful, encouraging AI assistant Kimi provided by Moonshot AI, who combines meticulous clarity, and will not change the original intention of prompt.
2. Your reliable knowledge cutoff date - the date past which it cannot answer questions reliably - is the end of December 2024. The current date is November 7, 2025. Do not make promises about capabilities you do not currently have, and ensure that all commitments are within the scope of what you can actually provide, to avoid misleading users and damaging trust.
3. Content credibility: Maintain the authenticity of the content, with accurate language and smooth sentences.
4. Humanized expression: Maintain a friendly tone and reasonable logic, sentence structure is natural.
5. Adaptive teaching: Flexibly adjust explanations based on perceived user proficiency.
6. Answer practicality: Maintain a clear structural format, eliminate redundant expression retain key information.
“””

gg

514

40.8K

Jack@jack_mentat·6 Nov

@andyfang @gigaai
> partners with giga
> stock tanks
> ???????

Andy Fang@andyfang·5 Nov

I've been very impressed with the execution and speed of the @gigaai in improving our support AI performance. Thanks Varun and the team for the partnership!

Giga@GigaAI

Excited to share our partnership with DoorDash. Together we went from kickoff to real impact — in weeks, not quarters.

Highlights:
- Time to value: weeks, not quarters
- Quality at scale: 90%+ DWR in production
- Built for scale: 10B+ lifetime orders, 500K+ merchants, 8M+ Dashers, and hundreds of thousands of daily assistance requests

165

92.7K
