Kole Sam

6.7K posts


@Kole_Sam

AI Enthusiast | Product Manager

Los Angeles, CA · Joined September 2010
604 Following · 244 Followers
The Startup Ideas Podcast (SIP) 🧃
95% of people are wasting tokens on their agent.md files. Here's why skills are better:

agent.md files:
- Your entire file loads into context every single turn.
- 1,000 lines = ~7,000 tokens.
- Every. Single. Run.

Skills:
- Only the title and description sit in context.
- Agent sees the name, decides it's relevant, then pulls the full doc.

It's called progressive disclosure.

agent.md = always on, always burning tokens
Skills = loaded only when needed

"The models are already good." Stop overloading context. Start building skills.
The Startup Ideas Podcast (SIP) 🧃@startupideaspod

x.com/i/article/2041…

10 replies · 25 reposts · 194 likes · 19.9K views
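The pattern the post calls progressive disclosure can be sketched in a few lines. This is a hypothetical harness, not any product's actual loader — the `skills/*/SKILL.md` layout and the `name:`/`description:` header convention are assumptions. Only the one-line-per-skill index sits in context every turn; a skill's full body is read from disk only after the model asks for it by name.

```python
from pathlib import Path

def load_skill_index(skills_dir: str) -> str:
    """Build the always-in-context index: one line per skill,
    name + description only (a few dozen tokens each)."""
    lines = []
    for path in sorted(Path(skills_dir).glob("*/SKILL.md")):
        meta = {}
        for line in path.read_text().splitlines():
            if line.startswith("name:"):
                meta["name"] = line.split(":", 1)[1].strip()
            elif line.startswith("description:"):
                meta["description"] = line.split(":", 1)[1].strip()
        if meta:
            lines.append(f"- {meta.get('name')}: {meta.get('description')}")
    return "Available skills (request one by name to load it):\n" + "\n".join(lines)

def load_skill_body(skills_dir: str, name: str) -> str:
    """Pulled into context only after the model decides the skill is relevant."""
    return (Path(skills_dir) / name / "SKILL.md").read_text()
```

The index costs a handful of tokens per skill regardless of how long each skill doc is, which is the whole point of the comparison above.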
Kole Sam
Kole Sam@Kole_Sam·
Looks like Meta has abandoned their open source focus and is shifting toward a closed source first approach. But the real question is whether their closed source models can compete with existing frontier models. Only time will tell.
Alexandr Wang@alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

0 replies · 0 reposts · 0 likes · 6 views
Gabriel
Gabriel@gabriel_horwitz·
@alexandr_wang a closed source model too. interesting to see. i hope future versions leverage the billions of users meta has on their platforms and build the first real, authentic-sounding ai system like a real person.
1 reply · 0 reposts · 1 like · 7.3K views
Alexandr Wang
Alexandr Wang@alexandr_wang·
1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵
720 replies · 1.2K reposts · 10.3K likes · 4.4M views
Kole Sam
Kole Sam@Kole_Sam·
This might sound funny, but it actually makes sense. The English language has a lot of filler words, and many can be removed from a sentence without changing its meaning. So having Claude Code speak like a caveman could be more token efficient: fewer words means fewer tokens. It shows that the more we rely on AI for coding, the more we look for ways to make it efficient so we are not burning a crazy amount of tokens, which translates to money. I call that being token responsible.
This Week in Startups@twistartups

“Caveman Claude” is the most underrated AI tip right now. What is it? Humans tend to add fluff to what really needs to be said: 90% of the meaning we wanna convey can be communicated in about 20% of the words we actually say. No changes to the meaning of the prompt, but it uses WAY fewer tokens.

0 replies · 0 reposts · 0 likes · 34 views
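The filler-stripping idea can be sketched minimally. Everything here is an assumption for illustration: the `FILLER` word list is arbitrary, and word count is only a rough proxy for tokens — a real tokenizer (e.g. tiktoken) would count differently.

```python
# Rough sketch of "caveman" prompt compression: drop filler and politeness
# words that don't change the instruction. The FILLER set is illustrative.
FILLER = {
    "please", "could", "you", "kindly", "just", "basically", "actually",
    "i", "would", "like", "to", "really", "very", "maybe", "perhaps",
}

def caveman(prompt: str) -> str:
    kept = [w for w in prompt.split() if w.lower().strip(",.?!") not in FILLER]
    return " ".join(kept)

verbose = "Could you please just refactor this function to actually use a dict lookup?"
terse = caveman(verbose)
print(terse)  # -> refactor this function use a dict lookup?
print(len(verbose.split()), "->", len(terse.split()))  # 13 -> 7 words
```

Roughly half the words vanish with the instruction intact, which is the 20%-of-the-words claim in miniature.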
Kole Sam
Kole Sam@Kole_Sam·
My guess is OpenAI has realized those side projects like Sora aren’t really paying the bills, and the costs keep adding up. So they need to focus on being profitable or at least getting close to it, which means putting more attention on the big-ticket areas like enterprise and business customers. Those are the ones willing to spend serious money on coding tools and productivity. Because of that, we’ll probably start seeing more of these “side quest” projects that were fun early on gradually get phased out.
Sora@soraofficialapp

We’re saying goodbye to the Sora app. To everyone who created with Sora, shared it, and built community around it: thank you. What you made with Sora mattered, and we know this news is disappointing. We’ll share more soon, including timelines for the app and API and details on preserving your work. – The Sora Team

0 replies · 0 reposts · 0 likes · 20 views
Prince Canuma
Prince Canuma@Prince_Canuma·
Just implemented Google’s TurboQuant in MLX and the results are wild!

Needle-in-a-haystack using Qwen3.5-35B-A3B across 8.5K, 32.7K, and 64.2K context lengths:
→ 6/6 exact match at every quant level
→ TurboQuant 2.5-bit: 4.9x smaller KV cache
→ TurboQuant 3.5-bit: 3.8x smaller KV cache

The best part: Zero accuracy loss compared to full KV cache.
Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

146 replies · 412 reposts · 5.2K likes · 736.5K views
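A quick back-of-envelope sanity check on those ratios. Naively, 2.5-bit codes vs fp16 would give 16 / 2.5 ≈ 6.4x, yet the post reports 4.9x. One plausible explanation — an assumption here, not a documented detail of TurboQuant — is amortized per-group metadata (quantization scales/zero-points stored alongside the codes); the group size and metadata width below are guesses.

```python
# Why "2.5-bit" compresses less than 16/2.5 ≈ 6.4x: each group of elements
# may also carry metadata (e.g. a scale and zero-point), which raises the
# effective bits per element. Group size and metadata width are assumptions.
def effective_bits(code_bits: float, group_size: int, meta_bits: float = 32.0) -> float:
    """Bits per element: payload codes plus the per-group metadata share."""
    return code_bits + meta_bits / group_size

for code_bits, reported in ((2.5, 4.9), (3.5, 3.8)):
    eff = effective_bits(code_bits, group_size=64)
    print(f"{code_bits}-bit codes: ~{16 / eff:.2f}x estimated vs {reported}x reported")
```

With these assumed parameters the estimates land in the same ballpark as the reported 4.9x and 3.8x, which is all a sketch like this can show.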
Kole Sam
Kole Sam@Kole_Sam·
Google just introduced a new compression algorithm for LLMs called TurboQuant. This is the kind of breakthrough that actually moves the industry forward.

Every time you talk to Gemini, Claude, ChatGPT or any other AI, the model has to remember everything you’ve said in the conversation. That memory is called the “key-value cache.” You can think of it like short-term memory that keeps track of the conversation so the model doesn’t have to start over every time. As the conversation gets longer, that cache grows, and so does the cost to run it. It’s also one of the main bottlenecks eating up VRAM.

TurboQuant basically shrinks that memory footprint in a big way. It’s like compressing that short-term memory so it takes up way less space but still works the same.

Here’s what that actually means in real-world use:
- Models that used to be too large for consumer hardware suddenly become usable. Instead of needing a $20,000 GPU, you could realistically run solid LLMs on something like a MacBook. And the wild part is there’s little to no loss in performance or accuracy.
- This is especially useful right now with GPU and VRAM shortages. Efficiency is starting to matter just as much as raw model size.
- In practice, this means you could run an open-weights model like Llama 8B locally on a MacBook Neo. That opens the door for edge computing, better privacy, and even running models directly on devices like mobile phones, instead of relying on AI companies and closed models.

I remember a while back, during lunch with a couple of folks, we were talking about vendor lock-in and how over-reliance on AI providers could eventually lead to price hikes. This is the kind of advancement that helps keep things competitive. It makes open-weight models a real alternative to closed ones, which helps keep pricing in check.
Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

0 replies · 0 reposts · 0 likes · 58 views
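The "short-term memory that grows with the conversation" can be made concrete with standard KV-cache arithmetic. The model dimensions below are illustrative assumptions (roughly a Llama-3-8B-class model with grouped-query attention), not figures from Google's announcement:

```python
# KV cache size = 2 (keys + values) * layers * kv_heads * head_dim
#                 * seq_len * bits_per_element.
# All dimensions are illustrative assumptions (Llama-3-8B-class, GQA).
def kv_cache_gib(layers: int = 32, kv_heads: int = 8, head_dim: int = 128,
                 seq_len: int = 128_000, bits: float = 16) -> float:
    total_bits = 2 * layers * kv_heads * head_dim * seq_len * bits
    return total_bits / 8 / 1024**3

print(f"fp16 cache at 128K context:    {kv_cache_gib(bits=16):.1f} GiB")   # 15.6 GiB
print(f"2.5-bit cache at 128K context: {kv_cache_gib(bits=2.5):.1f} GiB")  # 2.4 GiB
```

At a 128K-token context the cache alone is a double-digit-GiB cost at fp16; shrinking the per-element width is exactly where the VRAM savings in the post come from.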
MeekMill
MeekMill@MeekMill·
Claude is helping me organize my whole music career and other businesses in days ... and it's moving my business forward at a high rate! Some tech youngbull I met on LinkedIn gave me an incredible template! Who else can help me with Claude
945 replies · 724 reposts · 12.4K likes · 3.7M views
Kole Sam
Kole Sam@Kole_Sam·
Even Meek Mill is using Claude to streamline his business. He mentioned it’s helping “at a high rate,” which says a lot about the kind of impact it’s having. The people who treat AI like a superpower and use it to amplify what they already know how to do are going to come out way ahead in this era. Just by using tools like Claude, you’re already putting yourself miles ahead of most people.
MeekMill@MeekMill

Claude is helping me organize my whole music career and other businesses in days ... and it's moving my business forward at a high rate! Some tech youngbull I met on LinkedIn gave me an incredible template! Who else can help me with Claude

0 replies · 0 reposts · 0 likes · 25 views
Peter Quadrel
Peter Quadrel@Peter_Quadrel·
NanoBanana 2 just made your static ad agency obsolete. And I just open sourced the entire tool.

Drop your product page URL. It pulls your logos, product images, fonts, colors, and brand voice automatically. Builds a full brand guide for you. Then generates ad creatives at scale using nearly 4,000 high-performing ad templates across 8 niches. It dynamically matches the best templates to your brand and brief.

Here's what makes it different:
→ Instant resizing: Get any ad in 1x1, 4x5, 9x16 with one click. No regeneration. No broken text.
→ Highlight-to-edit: See an issue? Highlight the area and tell it what to fix.
→ Multiple brand profiles: Run different brands or segments from one tool.
→ Auto persona building from real customer reviews
→ Multiple QC loops on briefs and final assets: Catches AI-isms before you do.
→ Upload your own templates or use ours

Runs locally. Just needs your Claude and Google API keys. This is the lite version of what we use internally. You get the full finished tool AND the open source code to make it your own. Creatives still design the system, this handles iteration and scale.

Want a copy to download?
1. Like this post
2. Comment "AI"
Will DM you the tool along with a tutorial shortly after.
2.5K replies · 234 reposts · 4.4K likes · 287.9K views
Ejaaz
Ejaaz@cryptopunk7213·
Anthropic launching an openclaw competitor :)

'Dispatch' lets you text claude to do work for you while you're away, claude spins up agents to do it all.
- just instruct agents to complete a task and come home to finished work
- also launched persistent memory so claude keeps context across multiple tasks

this turns your phone into a personal ai computer

very cool
Felix Rieseberg@felixrieseberg

We're shipping a new feature in Claude Cowork as a research preview that I'm excited about: Dispatch! One persistent conversation with Claude that runs on your computer. Message it from your phone. Come back to finished work. To try it out, download Claude Desktop, then pair your phone.

96 replies · 105 reposts · 2.2K likes · 547.6K views
Kole Sam
Kole Sam@Kole_Sam·
Another feature addition inspired by OpenClaw. I expect we’ll keep seeing more of this trend, with companies pulling specific ideas from OpenClaw and integrating them into their own products. They’ll likely position these features as a “safer” alternative, especially given the narrative that OpenClaw itself isn’t secure.
Felix Rieseberg@felixrieseberg

We're shipping a new feature in Claude Cowork as a research preview that I'm excited about: Dispatch! One persistent conversation with Claude that runs on your computer. Message it from your phone. Come back to finished work. To try it out, download Claude Desktop, then pair your phone.

0 replies · 0 reposts · 0 likes · 32 views
(Oma)devuae
(Oma)devuae@delveroin·
f*ck your weekend plans. You NEED to:
• Learn Claude Code
• Set up Perplexity Computer
• Set up Claude Cowork (plug-ins, skills)
• Set up OpenClaw
• Experiment with agentic solutions
• Use AI to create a business plan & strategy
• Build an AI second-brain database
• Learn basic automation tools (Manus, MCP, Zapier)
• Become an elite prompt-engineer - the better you can communicate with AI, the better your Outputs
• Read AI articles
• Dive into robotics
• Research AI stocks/ETFs/investment arbitrages

The list goes on. SO much to do.
101 replies · 333 reposts · 2.7K likes · 118.9K views
Matthew Berman
Matthew Berman@MatthewBerman·
Thinking about putting together a post about all the security measures I have in openclaw to protect against prompt injections. Critical if your openclaw ingests any web data, emails, etc. Would you read it?
75 replies · 9 reposts · 269 likes · 23.2K views