jack

4.8K posts

@JackNotOld

In the land of the blind the one-eyed man is king

Joined February 2020
886 Following · 1.4K Followers
jack @JackNotOld
😅
Kimi.ai @Kimi_Moonshot

Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support. Note: Cursor accesses Kimi-k2.5 via @FireworksAI_HQ's hosted RL and inference platform as part of an authorized commercial partnership.

0 replies · 0 reposts · 0 likes · 23 views
jack @JackNotOld
I built a tiny macOS Markdown reader. Double-click a .md file -> it opens as a clean, print-ready document. Native app (<1MB), PDF export, live reload, secure rendering. Show HN: news.ycombinator.com/item?id=473150…
GIF
1 reply · 0 reposts · 0 likes · 96 views
jack retweeted
jack @JackNotOld
@taahajkhan @OtsoVeistera @gabriel1 @thetokenco Interesting. Do you see this being useful outside very large RAG contexts? My assumption is this is mainly valuable for enterprise workflows with massive documents (legal, compliance, research), rather than typical prompts.
2 replies · 0 reposts · 0 likes · 66 views
otso veistera @OtsoVeistera
You're wasting half your context window. We’re launching @thetokenco (YC W26) today. We compress LLM inputs before they reach the model. Fewer tokens, lower cost, faster inference. Models also perform better. In customer case studies we’ve seen a +5% lift in user purchases due to higher preference for outputs from compressed prompts. The API is live. Link in the comments
76 replies · 57 reposts · 507 likes · 91.4K views
jack retweeted
Min Choi @minchoi
Anthropic said no to the Pentagon. Now Sam Altman is backing them: "For all the differences I have with Anthropic, I mostly trust them as a company and I think they really do care about safety." OpenAI and Anthropic both drawing the same line. This is a big deal.
663 replies · 1.6K reposts · 17K likes · 1.7M views
jack @JackNotOld
DeepSeek just added their scores to ARC-AGI-2. Potentially testing to compare against V4 (launch imminent).
jack tweet media
0 replies · 1 repost · 1 like · 293 views
jack retweeted
xjdr @_xjdr
"if you prove to me that you can distill frontier policy by SFT on less than 1T tokens, i will close my lab, quit my startup and come work for you right now"
xjdr tweet media
20 replies · 87 reposts · 2.1K likes · 79.1K views
jack retweeted
METR @METR_Evals
We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.
METR tweet media
229 replies · 461 reposts · 4K likes · 3.5M views
jack @JackNotOld
The Taalas chip is cool. Unfortunately I have no use case for 17k TPS on Llama 8B.
0 replies · 0 reposts · 0 likes · 75 views