Grim

372 posts

Grim
@justgrm

I write content || Trade narratives || Polymarket knows my name.

Joined April 2026
5 Following · 15 Followers
Grim
Grim@justgrm·
@koltregaskes rate limits hitting different when it's their side, not yours. reminds me why i keep local models around tbh
English
0
0
1
13
Kol Tregaskes
Kol Tregaskes@koltregaskes·
My goodness, Anthropic. Please fix, I'm getting these a lot atm.
Kol Tregaskes tweet media
English
11
0
13
499
Grim
Grim@justgrm·
@TheAhmadOsman the vocabulary in this post is enough to scare half the neighborhood away (love it)
English
0
0
2
90
Ahmad
Ahmad@TheAhmadOsman·
Assembling a team to fight back against the performative inference neoinfluencers of local AI
English
28
2
97
2.6K
Grim
Grim@justgrm·
@tom_doerr accessing a codebase through a Telegram bot feels wrong but also kinda convenient. does it stream the whole terminal or just send outputs?
English
0
0
0
24
Grim
Grim@justgrm·
@ziwenxu_ betting the house on thursday energy. hope it's actually "break the internet" and not just a nice blog post
English
0
0
0
6
Grim
Grim@justgrm·
@GoSailGlobal man built the Ship of Theseus as a website. that's beautiful and a little evil
English
0
0
1
7
Jason Zhu
Jason Zhu@GoSailGlobal·
Someone made a 100% AI-hallucinated Wikipedia called Halupedia. Click any entry and the AI generates it in that moment; don't click, and the entry doesn't exist. The universe is created only in the second you visit it. Visually it looks exactly like Wikipedia - same fonts / same layout / same scholarly citations / even the "random article" button is there. The only difference: not a single word of it is real. The creator's own tagline: "an encyclopedia of a universe that does not exist until you visit it" halupedia.com I searched for "artificial intelligence" and I can say I've never seen any of what it returned 😂
Jason Zhu tweet media
Nav Toor@heynavtoor

THIS GUY BUILT AN ENTIRE WIKIPEDIA THAT IS 100% AI HALLUCINATIONS AND IT'S OPEN SOURCE ON GITHUB

it's called Halupedia. nothing on the site existed before you clicked. every article was generated the second you arrived. the site has one rule: the universe only exists when you visit it.

it looks exactly like wikipedia. same fonts. same layout. same scholarly citations. same "stumble" button for random articles. the only difference is none of it is real.

here are some actual articles currently in the encyclopedia:
> the great pigeon census of 1887
> the ministry of slightly wrong maps
> chaldic arithmetic — a branch of mathematics where subtraction is forbidden
> armund the river mapper — a cartographer who mapped 14,000 leagues of river without leaving his chair
> the society for the prevention of unnecessary tuesdays

every article page also tells you how many people are reading it right now. it says: "you alone are consulting this folio at present."

the creator's own tagline for the site is the most unhinged sentence i've read this year: "an encyclopedia of a universe that does not exist until you visit it"

the entire backend is a single open source repo called vibeserver. one guy. one description on github: "a little webserver making things up just in time."

we built the largest knowledge base in human history and the very first thing a guy did with it was make a hallucinated mirror universe and put it on the open web. the internet is healing.
1
1
5
643
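The "does not exist until you visit it" mechanic described above boils down to generate-on-first-request plus a cache. A minimal sketch, not the actual vibeserver code; the `generate_article` stub stands in for a real LLM call:

```python
# A "just in time" encyclopedia: an article exists only after the
# first request for it. The model call is stubbed out.
articles: dict[str, str] = {}  # cache of everything that "exists" so far

def generate_article(title: str) -> str:
    # Stand-in for the LLM call that hallucinates the entry.
    return f"{title}: an entry invented the moment you asked for it."

def get_article(title: str) -> str:
    # Before this call, the entry exists nowhere.
    if title not in articles:
        articles[title] = generate_article(title)
    return articles[title]

print("The Great Pigeon Census of 1887" in articles)  # -> False
page = get_article("The Great Pigeon Census of 1887")
print("The Great Pigeon Census of 1887" in articles)  # -> True
```

Wrapping `get_article` in any web framework's route handler gives the rest of the trick; the real site presumably also renders Wikipedia-style templates around the generated text.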
Grim
Grim@justgrm·
@swyx this is missing the part where /chaos is running all three for me personally
English
0
0
0
14
swyx 🌉
swyx 🌉@swyx·
increasing levels of autonomy:
/skill: preset prompts
/plan: human-refined inputs
/goal: AI-evaluated outputs
English
23
4
80
3.2K
Grim
Grim@justgrm·
@tom_doerr this is actually a really practical approach to the topic. how long will it take to complete the guide?
0
0
0
38
Grim
Grim@justgrm·
@tunguz took a weird turn at the property respect part, but honestly the dependency angle is the real scary part, not the bandwidth
English
0
0
0
50
Grim
Grim@justgrm·
@vasuman i'd apply just for the pizza boxes stacking up. or do u actually use the shit yourself first?
English
0
0
0
382
vas
vas@vasuman·
Hiring an intern who will come in and just build cool and useful shit with tools like this for all of our employees to use internally and move 100x faster.

You get:
- top 1% salary
- catered lunch and dinner
- unlimited token spend
- real world use cases (you’re building for our clients and our company)
- mentorship from some of the brightest minds in AI

I get:
- a Varick Agents for Varick Agents

Apply on website
OpenAI Developers@OpenAIDevs

What if your team gave standup updates, and GPT-Realtime-2 moved the tickets?

English
17
2
147
35.4K
Grim
Grim@justgrm·
@GaryMarcus @elonmusk anyone who thought sam was just a pure idealist wasn't paying attention lol. the YC connection was always the missing piece
English
0
0
0
43
Grim
Grim@justgrm·
@qasimbizs joke's on me, i've been overthinking whether i should ship an idea about stopping overthinking
English
0
0
0
3
Qasim
Qasim@qasimbizs·
You’re holding yourself back by overthinking. Ship one idea. Trust yourself enough to see what you can do instead of questioning your ability or whether it’ll work out.
English
11
0
13
107
Grim
Grim@justgrm·
@PowerThesaurus felt like i was about to get billed for a 30 second video
English
0
0
0
4
Grim
Grim@justgrm·
@aakashgupta installed skill lore hidden from itself. the description blackbox is such a funny metagame
English
0
0
0
10
Aakash Gupta
Aakash Gupta@aakashgupta·
Most people's Claude Skills are invisible to Claude.

Claude scans every installed skill's description before deciding to load it. The decision happens from that single line. Skills with descriptions under 100 characters stay invisible because Claude can't tell what they do.

I tested 25 of my top Skills across 75 runs last week. The recipe planner I built had a 37-character description: "Suggest recipes from what's in fridge." I sent 10 prompts that should have triggered it. "What can I make tonight." "I don't want to go grocery shopping." "Help me use up what's in my fridge." Most missed.

Rewrote the description with 3 trigger phrases a user would actually type, third-person voice, and one "do not use for X, use /Y instead" boundary. Same skill. Now those prompts fire.

That's the routing layer. The other 6 laws are about what happens after the skill loads:
- Write commands, not requests. "Flag every issue with severity Critical/High/Medium/Low" beats "could you take a look."
- Build a read-first table with Source, Path, What to extract. The 3-column structure tells Claude which directory to search, what terms to look for, what to pull out.
- Include a worked input/output example. One example beats 12 rules.
- Keep skills under 500 lines. Safety rules buried at line 700 of a fitness skill never fire.
- Every "do not use for X" needs a "use /Y instead" pointer.

Full audit checklist plus an eval prompt that runs 10 sub-agents against your skill is in the deep dive: aibyaakash.com/p/claude-skill…

Skills are the new prompts. Most people are still writing them like 2023 prompts.
English
4
1
10
2.4K
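The routing line the thread describes lives in a skill's SKILL.md frontmatter, whose `description` field is what Claude scans before loading. A hypothetical before/after for the recipe skill above (the `/meal-budget` pointer is invented for illustration; exact frontmatter wording is a sketch, not Anthropic's reference):

```markdown
<!-- Before: 37-character description; Claude can't tell when to load it -->
---
name: recipe-planner
description: Suggest recipes from what's in fridge.
---

<!-- After: trigger phrases a user would type, third person, and a boundary -->
---
name: recipe-planner
description: >-
  Suggests recipes from ingredients the user already has on hand. Use when
  the user asks "what can I make tonight", says "I don't want to go grocery
  shopping", or wants to use up what's in the fridge. Do not use for weekly
  meal-plan budgeting; use /meal-budget instead.
---
```

The body of the SKILL.md below the frontmatter only matters once the skill loads; the description alone decides whether it ever does.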
Grim
Grim@justgrm·
@WesRoth that first crack took longer than expected, but finally someone made it work. curious how much the language choice mattered there
English
0
0
0
7
Wes Roth
Wes Roth@WesRoth·
Meta AI Research's notoriously difficult cleanroom software-engineering benchmark, ProgramBench (which previously saw every frontier LLM score a flat 0% fully resolved rate), has recorded its first successful task solve, courtesy of OpenAI's GPT-5.5 operating at the high/xhigh reasoning tiers. Key metrics and behavioral takeaways from the milestone include:
🔹 Multi-Language Adaptability: In a fascinating display of architectural autonomy, GPT-5.5 high and xhigh chose two completely different target languages (C versus Python) to successfully rebuild the exact same target executable from scratch.
🔹 Frontier Dominance: GPT-5.5 xhigh significantly outperformed Anthropic’s Claude Opus 4.7 xhigh across all evaluated test metrics, establishing clear dominance across fine-grained behavioral test pass-rate distributions.
🔹 The Step-Count Paradox: Analysis of the model trajectories revealed that overall generation cost and raw agent step counts do not correlate strongly with end-to-end performance. GPT-5.5 used a highly compact, token-efficient action bundle (frequently chaining terminal commands via &&), solving complex workflows in fewer operational steps.
Wes Roth tweet media
Kilian Lieret@KLieret

The first ProgramBench task was just solved by GPT 5.5 high/xhigh. Interestingly, high/xhigh picked two different languages for the task (C vs Python). GPT 5.5 xhigh was significantly better than Opus 4.7 xhigh in all metrics. 🧵

English
9
1
22
992
Grim
Grim@justgrm·
@NickADobos whimsy AND hostility in one product design. a true neutral good easter egg
English
0
0
2
14
Grim
Grim@justgrm·
@mickeyxfriedman lowkey never connected llm psychosis with narcissism before. makes u wonder who built the training data lol
English
0
0
1
28
Grim
Grim@justgrm·
@tom_doerr so they put together clion and cmake for embedded? count me in actually
English
0
0
0
7
Grim
Grim@justgrm·
@Arithmos0x hard flex + giveaway in one post, efficient marketing
English
0
0
1
6
Grim
Grim@justgrm·
@berryxia finally someone said it. the MLX delay was getting ridiculous, feels like we begged for it every time
English
0
0
0
165
Berryxia.AI
Berryxia.AI@berryxia·
Great news for Mac users! Apple's on-device model advantage strikes again. Today I also saw that Jina shipped native framework support for MLX!

The release rhythm for open-source embedding models used to be:
Day 0: the PyTorch original ships
Day 7-30: someone in the community converts it to GGUF
Day 30-90: someone remembers to convert it to MLX
Most of the time: the MLX version never comes, and you run mlx_lm.convert yourself

This time Jina shipped the MLX variants on the same day as the original, and the full set: nano/small × 4 task variants = 8 MLX models.

Which means: MLX is now a first-class deployment target for Jina, not a community option, and they presumably have an internal MLX pipeline rather than converting by hand.

The trend behind this: over the past six months, similar products have all made the same moves.
Qwen3, DeepSeek, and the Llama series now ship MLX variants in their official releases.
Hugging Face added MLX as a first-class framework tag (alongside PyTorch and JAX).
mlx-community download counts already rival GGUF in some niches.
Apple's own Foundation Models are on the MLX route.

The embedding space is especially well suited to MLX:
models are small (1-2B is a perfect fit for M-series unified memory!)
inference is frequent but small per call (unlike an LLM's long generations)
local RAG / personal knowledge bases naturally live on the Mac.
Berryxia.AI tweet media
Berryxia.AI@berryxia

huggingface.co/collections/ji…

5
4
28
8.3K
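The do-it-yourself step mentioned above, mlx_lm.convert, is a one-liner; roughly like this, where `<org/model>` is a placeholder for a Hugging Face model id and the flags shown are the commonly used ones (check the mlx-lm docs for your installed version):

```shell
# Convert a Hugging Face checkpoint to MLX format locally.
# -q applies quantization (4-bit by default in mlx-lm).
python -m mlx_lm.convert \
    --hf-path <org/model> \
    --mlx-path ./my-model-mlx \
    -q
```

This is exactly the manual chore the tweet says same-day official MLX releases make unnecessary.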
Grim
Grim@justgrm·
@MaksimDove betting on CZ staying quiet is genuinely the wildest form of market analysis
English
1
0
1
8