Nico

492 posts

@itmos__

Joined January 2025
2.7K Following · 116 Followers
Nico retweeted
Nico @itmos__
@JasonBotterill @Ken67547214 Ken is right, it's Lynchian weird, not anime weird. It's by Satoshi Kon, director of Perfect Blue, Millennium Actress, and Paprika. Famously ripped off by Aronofsky and Nolan. He was a top tier auteur, one of the goats
English
0
0
0
24
JB @JasonBotterill
@Ken67547214 I don’t watch anime should I watch this it’s not weird right
English
2
0
2
32
Ken 無 (non-official taco bell affiliate)
I keep going back and forth on cameras in my house. I was pretty much fine with the tradeoffs of letting my own software watch me through a webcam at my desk, but I do not like the idea of having internet connected cameras distributed anywhere else in my home. However, I think I could come up with an affordable system that used a (mostly) airgapped mini-pc to extract coordinate, identity, and state data through a minimal data connection.
English
5
1
19
792
Nico @itmos__
@N8Programs
- most interesting or surprising paper you've read in the last month?
- favorite film?
- the @voooooogel Claude chrome extension might be your magnum opus. How are you planning to top that?
English
1
0
2
135
N8 Programs @N8Programs
Thank you so much to everyone for 10K followers! It's been my absolute honor to post my work here for the last few years - from Three.JS to Local LLMs to MLX to the research I do now. As is customary (and cringe), I'm doing an AMA - post any questions you have in the comments!
English
5
0
25
4.7K
Nico retweeted
Niels Rogge @NielsRogge
Introducing a revival of PapersWithCode! As @ilyasut said, we're back to the "age of research". Hence, it's important to share research and build on each other's work.
> find SOTA per domain, not just LLMs
> leaderboards
> methods
> all parsed at scale using AI agents.
English
33
86
593
64.3K
Nico retweeted
jason @jxnlco
jason from the codex team here, here's a draft on codex maxxing and the primitives i use on a daily basis jxnl.github.io/blog/writing/2… would love any feedback
English
151
217
3.5K
378.1K
Nico @itmos__
@techno_popgirl Happy birthday! 🥳 I'm listening to your wonderful music right now.
English
1
0
1
26
大正九年 @techno_popgirl
Thank you for today's birthday wishes!
Japanese
6
1
34
2.7K
Nico retweeted
Sophie Wang @SophieLWang
"The Truth Lies Somewhere in the Middle (of the Generated Tokens)"
In autoregressive language models, mean pooling hidden states across generation yields better representations than any token alone.
project page: sophielwang.com/tokens
w/ @phillip_isola and @thisismyhat
English
9
68
465
48.7K
Nico @itmos__
@corsaren it's too taboo so most actual artists aren't touching it yet. I really like Jia Zhangke's short, mostly because I respect him as a director. It's metacommentary and doesn't stand on its own, but it's good and pretty funny:
[Two attached images]
English
1
0
20
682
Nico retweeted
Tomasz Limisiewicz @TomLimi
We present Compute Optimal Tokenization! 🔡 Most LLM scaling works stick to one tokenizer, sweeping data/model size. But what happens when we control the tokenizer’s compression rate (bytes/token)? Here we sweep tokenizers, params, and data across compute budgets: [1/N]
English
21
95
621
100.8K
Nico retweeted
Ricardo Olmedo @rdolmedo_
Claude 3 Opus scored 4% on SWE-bench at release. Shockingly, a Pythia-scale model trained **only on pre-1931 data**, with a bit of fine-tuning, outperforms the April 2024 SOTA. Clearly, Opus is the better model. Why should we care about benchmarks, then? 👇🧵
[Attached image]
Ricardo Olmedo @rdolmedo_

We fine-tuned Alec Radford’s 1930 vintage LLM to solve SWE-bench issues. After just ‼️250‼️ training examples, the model solves its first issue, a simple patch to the xarray library. 🧵👇

English
16
22
411
114.4K
Nico @itmos__
@postimortem @_lopopolo by "tools and context" he's referring to integrations to your org's tools, plugins, access to your internal KB, etc. instead of your own harness with custom tool defs, session management, compaction algorithm, etc.
English
2
0
2
63
tim ganiev @postimortem
@_lopopolo but tools and context are the major part of harness...
English
1
0
2
131
Unemployed Capital Allocator @atelicinvest
the actual monkey paw - you can make anything you ever wanted, but whatever you wanted to make isn't actually what you need at all and you just build up and up and up because doing it feels better than actually doing the hard work of figuring out what is useful instead of what it is you want to do it's all just a larp video game otherwise and performance productivity porn and 2 years later you look around and wonder what you did with all those billions of tokens you produced but you're too afraid to actually answer that question so you just keep on building just another billion tokens. i swear this time it'll be good.
goodalexander @goodalexander

AI monkey paw: you can make anything you ever wanted but nobody will buy it bc they can also make anything they ever wanted

English
4
4
56
5.5K
Nico retweeted
Max Spero @max_spero_
Over the last year, I've watched a rise in AI content on basically every internet platform. Seeing a viral AI-generated post used to be a rare find. Now it's a daily occurrence.
Four months ago, we launched the @pangramlabs bot to help people check long posts and articles for AI slop without leaving the platform. And it blew up. We went from a niche tool used by academics to a core piece of cognitive security infrastructure.
Today, we're taking it one step further. We're launching a Chrome extension that proactively scans all social content as you scroll, flagging AI content in real time so you can save your attention for what really matters: content authored by humans.
At launch, the Pangram Chrome extension will proactively scan posts on X, LinkedIn, Reddit, Substack, and Medium. And we'll give you a feed health summary, so you can see exactly which accounts are putting AI slop on your feed.
I'm so excited to share this with you all, and I hope you find it as useful as I do.
Pangram Labs @pangramlabs

Today we're releasing the Pangram Chrome Extension, which automatically flags AI-generated content as you scroll your feed.
We're sick of having to constantly be on guard for AI slop on social media. For most of human history, if a piece of writing was grammatical, coherent, and well-structured, you were safe in assuming that somebody put some thought into producing it. That assumption no longer holds true: AI has severed the relationship between form and content, destroying the credibility signal we once relied on.
The Pangram Chrome extension restores that signal. It scans your feed as you scroll, flagging AI-generated and AI-assisted content in real time and showing you how much of your feed is machine-written.
Works on X, LinkedIn, Reddit, Substack, and Medium. New users get 2 weeks free.
Install it here: pangram.com/solutions/chro…

English
24
54
443
84.5K
Nico @itmos__
@Plinz @mattparlmer
> You are asking to exclude millions of people who cannot afford renting human drivers from being able to get around
Waymos generally cost 1.5-2x as much as Uber.
> How about we build infrastructure again?
you mean public transportation? yeah I agree
English
0
0
2
84
Joscha Bach @Plinz
@mattparlmer You are asking to exclude millions of people who cannot afford renting human drivers from being able to get around. That is an incredible social cost. Let's find other sources of income. How about we build infrastructure again?
English
8
1
84
2.8K
mattparlmer 🪐 🌷 @mattparlmer
The problem isn’t with the Waymo safety record, it’s that driverless taxis break a load bearing part of the post-2008 social settlement
The ability to earn an income from driving and delivery apps has kept a lot of people afloat who would otherwise be entirely destitute
Timothy B. Lee @binarybits

Over the last ~100 million miles of driving, these are the five most serious crashes that could be plausibly blamed on Waymo, as judged by @chi_t_williams and me. If you looked at 10,000 miles of driving from 10,000 random human drivers you'd see much worse behavior.

English
107
86
1.7K
229.1K
Nico retweeted
wh @nrehiew_
Frontier LLMs are doing too much when it comes to editing code. I'm excited to share this work on the Over-Editing problem which refers to models modifying code beyond what is asked of them. The main findings are:
- Many frontier models Over-Edit with GPT 5.4 being the biggest culprit
- Reasoning models have a higher natural tendency to Over-Edit compared to their non-reasoning counterparts
- RL is the best approach to train models to perform minimal code editing while preventing catastrophic forgetting compared to SFT, DPO and Rejection Sampling.
Blog and details below!
[Attached image]
English
15
42
437
64.2K
Nico retweeted
ClaudeDevs @ClaudeDevs
Some of you ran into Opus 4.7 refusing normal code edits with "this might be malware" warnings. That was a bug on our side, not the model being cautious. Older builds applied a stale safety prompt that Opus 4.7 doesn't need. Run claude update or relaunch the app.
English
170
170
4.6K
317.8K
Taelin @VictorTaelin
I don't think we're all hallucinating, there's something seriously wrong about 4.7. Just tried it on the same two prompts (what's the best GC approach for Bend). 4.7 simply lies a lot, ignores information right in its context, makes bad proposals. This is really weird?
English
115
35
1.3K
69.8K