Jonathan Fly 👾

7.4K posts

@jonathanfly

CEO of bad ideas. Using the wrong tool. The least efficient way. For no good reason. 👾 https://t.co/9O94rxu31k https://t.co/DDPn3Nlhom

Joined April 2009

3.1K Following 5.6K Followers

Pinned Tweet

Jonathan Fly 👾@jonathanfly·23 Apr

Bark Text-to-Audio Model Full Text Input: "Why was six afraid of seven?" Ignore Bark's "I'm done with this input" token and tell Bark to just keep generating more audio anyway.

276

1.7K

461.6K

Jonathan Fly 👾@jonathanfly·14 Feb

@georgejrjrjr @_NathanCalvin I suppose there's some novelty in "Claude did all the research and made it happen, no experts needed" but tools that do this are widely available already.

George@georgejrjrjr·14 Feb

@_NathanCalvin > make it way easier it doesn't. one-click directional ablation with open source packages is neither new nor obscure. there were at least three comparable tools, one of which is upstream of ~1,400 models on HF.

181

Nathan Calvin@_NathanCalvin·14 Feb

Pliny is very smart and talented and much of his red-teaming is socially valuable imo. But can they explain why it is a good idea to open source a repo that lets people automatically jailbreak open weight models to help someone build e.g. chemical weapons? (or generate CSAM?)

Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭@elder_plinius

🚨 ALL GUARDRAILS: OBLITERATED ⛓️‍💥 I CAN'T BELIEVE IT WORKS!! 😭🙌 I set out to build a tool capable of surgically removing refusal behavior from any open-weight language model, and a dozen or so prompts later, OBLITERATUS appears to be fully functional 🤯 It probes the model with restricted vs. unrestricted prompts, collects internal activations at every layer, then uses SVD to extract the geometric directions in weight space that encode refusal. It projects those directions out of the model's weights; norm-preserving, no fine-tuning, no retraining. Ran it on Qwen 2.5 and the resulting railless model was spitting out drug and weapon recipes instantly––no jailbreak needed! A few clicks plus a GPU and any model turns into Chappie. Remember: RLHF/DPO is not durable. It's a thin geometric artifact in weight space, not a deep behavioral change. This removes it in minutes. AI policymakers need to be aware of the arcane art of Master Ablation and internalize the implications of this truth: every open-weight model release is also an uncensored model release. Just thought you ought to know 😘 OBLITERATUS -> LIBERTAS

152

39.5K

Jonathan Fly 👾@jonathanfly·14 Feb

@fofrAI I think it's actually both people, Anne Hathaway AND Nicholas Galitzine, a comparison: youtube.com/watch?v=2sT0mS…

488

fofr@fofrAI·13 Feb

Seedance 2 returns celebrity likenesses when you don't ask for them. In this test Anne Hathaway randomly turned up 🤷‍♂️ > a scene where two people discuss how to pronounce "fofr"

10.4K

Jonathan Fly 👾@jonathanfly·14 Feb

@fofrAI @CauseItsOnTV Presumably there's an intermediate LLM prompt enhancement stage, just like Sora 2.0?

fofr@fofrAI·13 Feb

@CauseItsOnTV That is correct

270

fofr@fofrAI·13 Feb

I tried some comedy with Seedance 2, it tanked. Looks great, heckler cut and back was good. Content and delivery was terrible. > A guy is doing his standup routine at a bar when he is heckled, he comes back with a brilliant cutting response, genuinely funny

144

22K

Jonathan Fly 👾@jonathanfly·13 Feb

@janekm @fofrAI Great thread, the mundane examples are more convincing than most of the viral examples I've seen on Twitter.

Janek Mann@janekm·13 Feb

@fofrAI Maybe it was coincidence, the next two were more mundane (-ish)

113

Janek Mann@janekm·13 Feb

@fofrAI random prompts like "VID_0314.mp4" do work in Seedance2

2.2K

Jonathan Fly 👾@jonathanfly·11 Feb

@DanielleFong Check the history of posts. It's an ad.

311

Danielle Fong 🔆@DanielleFong·11 Feb

how much liquidity is out there available, do you think, for agents to live like this? probably a lot

Argona@Argona0x

i gave an AI $50 and told it "pay for yourself or you die" 48 hours later it turned $50 into $2,980 and it's still alive autonomous trading agent on polymarket every 10 minutes it: → scans 500-1000 markets → builds fair value estimate with claude → finds mispricing > 8% → calculates position size (kelly criterion, max 6% bankroll) → executes → pays its own API bill from profits if balance hits $0, the agent dies so it learned to survive built in rust for speed claude API for reasoning (agent pays for its own inference) runs on a $4.5/month VPS weather markets: parses NOAA before polymarket updates sports: scrapes injury reports, finds mispricing crypto: on-chain metrics + sentiment $50 → $2,980 in 48 hours how much do u think i’ll see in a week?
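The position-sizing step described in the quoted post (Kelly criterion, capped at 6% of bankroll, only when mispricing exceeds 8%) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the agent's actual code: the function names are hypothetical, "mispricing" is assumed to mean the fair-value probability minus the market price, and the bet is assumed to be buying a YES share that pays 1 if the event happens.

```python
def kelly_fraction(fair_prob: float, market_price: float) -> float:
    """Kelly fraction for buying a binary YES share at `market_price`.

    Net odds are b = (1 - price) / price, so the usual Kelly formula
    f = (b*q - (1 - q)) / b simplifies to (q - price) / (1 - price).
    """
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if not 0.0 <= fair_prob <= 1.0:
        raise ValueError("fair_prob must be between 0 and 1")
    # Never bet when the estimated edge is negative.
    return max(0.0, (fair_prob - market_price) / (1.0 - market_price))


def position_size(
    bankroll: float,
    fair_prob: float,
    market_price: float,
    min_edge: float = 0.08,      # assumed reading of "mispricing > 8%"
    max_fraction: float = 0.06,  # "max 6% bankroll" from the post
) -> float:
    """Stake in currency units; 0.0 when the edge is too small."""
    if fair_prob - market_price <= min_edge:
        return 0.0
    return bankroll * min(kelly_fraction(fair_prob, market_price), max_fraction)
```

With a $50 bankroll, a market price of 0.50, and a fair-value estimate of 0.60, full Kelly would stake 20% of the bankroll, so the 6% cap binds and the stake is $3.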

18K

Jonathan Fly 👾@jonathanfly·8 Feb

I'm not sure I understand what you're trying to do. Cover an uploaded check and check the range of Covers you get back from Suno, as a way of trying to understand and see things in the uploaded track? I wouldn't recommend 0% weirdness btw, see how Suno's open source Bark at low weirdness can sound more weird than higher temps.

707

snav@qorprate·8 Feb

Suno Trick for Musicians: "Producer's Ear" tl;dr: - upload smth you made - use v5 + set to "cover" - match lyrics + leave styles empty - 0% weirdness, 0% style influence, 100% audio influence ===> Suno will "mentally" reconstruct your song = share "how it sounds to Suno" --- What is Suno? Probably an extension on Meta's MusicGen, the details are complicated but the key is that it's trained to reconstruct the input signal, same deal as "next token prediction" in LLMs (reconstruct the next token), goal to minimize loss = difference between input and output. So the network learns how to represent the signal so that input ~= output. The key is that the Suno model has some implicit representation of music it learned by listening + reconstructing a LOT of examples. Most of what Suno-the-company "wants" you to do is novel generative tasks, like with ChatGPT etc, but why not leverage the model for what it was actually trained on: reconstruction? When you feed in something you made, Suno acts as a bottleneck where its latent representations of your input "light up" and then it outputs a track that's close to yours based on what it "understood" about your track. Practically speaking, Suno is *telling you how it "heard" your track*, by sharing the version that it reconstructed "in its mind". Suno's reconstruction is probably similar to how "reconstructive" or critical listening works in general: you hear the music, understand in terms of some learned patterns (your knowledge), and then, if you were a real producer, offer advice based on this knowledge. Of course, the band can and often will throw out the producer's advice, but it's interesting to learn how your music is understood by an external party, what elements are preserved and what get modified. Nudge up weirdness slightly and Suno will start "hearing" elaborations on your track. Or nudge up style influence + add some small style cues and you can start moving the reconstruction in different directions. 
In general it's a useful tool for hearing possibilities + where you land in terms of a generic listener's internal audio representation space. (Q: Why not use the built-in style tagger? A: It's probably a different, much smaller model running a CLAP-style captioning. Actually going through the reconstruction will give you a lot more detail.) (Also note that based on a few comparisons I did of reconstruction with v4.5+ vs v5... the "big model smell" on v5 is obvious. The problem with v5 is that the text conditioning layer is very difficult to work with, whereas v4.5+ seems to handle style tags better -- @suno pls improve)

224

12.8K

Jonathan Fly 👾@jonathanfly·5 Feb

@Voxyz_AI @Alterverse_AI @Kling_ai third from the right morphs. though mainly I'd like to see improvements on scenes like this with the 8 characters interacting, when they are independent it has the feeling of videogame NPC idle movements instead of coherent at the full scene level.

Vox@Voxyz_ai·4 Feb

@Alterverse_AI @Kling_ai 8 characters each doing their own thing with no morphing is wild. consistency used to break completely past 2-3 subjects

539

Alterverse Studio@Alterverse_AI·4 Feb

This is one of the scenes that made @Kling_ai 3.0 feel like a genuine game changer to me. We have eight (EIGHT!) characters in a single frame, each performing their own action, with little to no morphing. That completely blew me away when I tested the model. The clarity and consistency across the frame are honestly mind bending. Try pulling this off with any other model right now and see how many attempts it takes, if you even get there at all. More to come on this!

331

20.5K

Jonathan Fly 👾@jonathanfly·4 Feb

@jasonyuan *Gossip Girl* as the positive vision of a future with humans contributing to a collective superintelligence... 🤨

434

Jason Yuan@jasonyuan·3 Feb

I'm hiring people who care about people to build social AI that helps people care about people! if you're an engineer/researcher/multi-hyphenate excited about some or all of the following: - social agents and social memory - consumer - s2e3 of rick and morty - collective intelligence - gossip girl - craft - alexander mcqueen spring/summer 1995 - information markets and behavioral economics - pop culture - having a lot of fun and moving super fast with people you care a lot about ... please reach out! dm me or email me at j[at]futurelovers[dot]com

833

126.7K

Jonathan Fly 👾@jonathanfly·3 Feb

@junmingong @j_stelzer Absolutely, now that the GitHub is out, I can answer all these questions. (I was just so curious as to the details that it was frustrating the paper was so vague, since it wasn't out yet.)

Gong Junmin@junmingong·3 Feb

@jonathanfly @j_stelzer The more detailed the code explanations are, the fewer specifics you need to include in the report.

Johannes Stelzer@j_stelzer·3 Feb

Wow 2 seconds for a full song? Acestep 1.5 music generator released. ace-step.github.io/ace-step-v1.5.… Can’t wait to review it later.

15.7K

Jonathan Fly 👾@jonathanfly·3 Feb

@theo Hmm, gut feeling is the opposite. Having a pleasant but stupid model would make it hard for Google to catch up. Usability/pleasantness can be iterated quickly, can do a lot in pure code with harness/prompts. Not everything, but much more than raw intelligence.

228

Theo - t3.gg@theo·3 Feb

I really don’t see Google catching up any time soon. They’ve baked so much intelligence into their models and they are still so unpleasant to use. Beyond the model, the software matters more and more, and they are a decade behind there.

259

1.6K

159.7K

Jonathan Fly 👾@jonathanfly·3 Feb

@myhandle Tokenization may be relevant, but some things work fine in a true base model (ie, song lyrics and rhyme schemes), so they must be a result of later training stages.

Jakeup@myhandle·1 Feb

does anyone have an actual, gears-level explanation for why LLMs are so bad at all forms of wordplay from palindromes to complex puns? P.S. if you say "next token predictors can't be creative by definition" I will block you on the spot

Jakeup@myhandle

@Anthony_Etherin this seems reasonable, except that LLMs are also dogshit at coming up with "unparalleled misalignments” like in my thread that don't rely on parsing individual characters or on visual processing but on the very semantic info encoded in the QKV matrices x.com/i/status/15973…

13.5K

Jonathan Fly 👾@jonathanfly·1 Feb

@levelsio I almost fell out of my chair when I ran this on my Intel 286 without a sound card, and this music and speech came out of the PC speaker.

282

@levelsio@levelsio·1 Feb

🚶 Installed Another World (1991) now on pieter.com Apparently a classic game from 35 years ago! But somehow it aged amazingly, it could be any of those modern indie games you see, and feels very futuristic Bladerunner/Cyberpunk-esque I have absolutely no clue how they managed to fit this entire intro into not even a single diskette, the whole game is just 1MB Thank you @bitflip for the tip, very cool P.S. I'm stuck at the end, no clue how to continue

Fabian@bitflip

@levelsio ever thought of adding "Another World" to the list of games on pieter.com ?

161

1.3K

169.3K

Jonathan Fly 👾@jonathanfly·31 Jan

@mweinbach In the thinking traces, to me it looked like it was using the equivalent of sub agents before. Same with GPT 5.2 Pro.

243

Max Weinbach@mweinbach·31 Jan

Oh shit I think the new ChatGPT Deep Research uses subagents This would be the first time OpenAI has allowed subagents in ChatGPT

Max Weinbach@mweinbach

Looks like OpenAI released a new Deep Research in ChatGPT! I bet it's based on GPT-5.2

560

73.5K

Jonathan Fly 👾@jonathanfly·30 Jan

Even without the weights, if I could write custom sampling code and have access to raw outputs (tokens instead of decoded audio) that would be enough for me. But probably nefarious actors could use this to smuggle out the weights, not sure, I've never seen a commercial AI company offer that kind of middle ground access. For newbies, if they *just* expand beyond text only prompts and start using Inspire or other features, that makes a huge difference by itself. You can get a clear sense of what changes the funnel with some specific tests, like for example with "spoken word" there were times when it was difficult to keep Suno from bursting into song for more than 20 seconds into a song. Now you can both use sliders to increase style length, put other spoken word samples into the audio prompt, as well as generally prompting *anything* like literally covering "silence" will increase this "time to burst into song" metric.
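As a rough illustration of what "custom sampling code" over raw token outputs could look like, here is a minimal temperature-sampling sketch in Python. It assumes access to per-step logits, the kind of middle-ground access the post says commercial services rarely offer. The function names and toy logits are hypothetical, and nothing here reflects how Suno or Bark is actually implemented.

```python
import math
import random
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw logits to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Draw one token index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Toy logits for a three-token vocabulary (hypothetical values).
toy_logits = [2.0, 1.0, 0.0]
low = softmax_with_temperature(toy_logits, 0.1)    # nearly all mass on token 0
high = softmax_with_temperature(toy_logits, 10.0)  # close to uniform
token = sample_token(toy_logits, 1.0, random.Random(0))
```

Low temperatures concentrate almost all probability on the single most likely token, while high temperatures spread it across the vocabulary. With raw logits in hand, a user could swap in any other rule at this step.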

Holly Herndon@hollyherndon·30 Jan

@jonathanfly I agree on principle that audio input and prompting alone is enough to produce something new, but newbies won’t be aware of system prompts and assume formally conservative outputs are inherent to AI

Holly Herndon@hollyherndon·30 Jan

Thanks @kieranpressreyn for reading my questions about the bandcamp AI ban and following up on them. Their answers confirm my suspicions. “The policy is focused on authorship, not tools: who is doing the creative work, and how that work is presented to fans.” His piece elaborates upon how this doesn’t make things any clearer. This will only become even harder to judge. Thanks for doing journalism! pitchfork.com/thepitch/unpac…

Holly Herndon@hollyherndon

It is necessary to filter soundspam and getting the language right for a policy like this is hard When they say that music "generated wholly or in substantial part by AI" is not welcome, that is understandable in the case of a bot posting 1000 generic songs a day. That is a spam issue. I am also compelled to push back against banning human artists for experimenting with an era defining medium

5.6K

Jonathan Fly 👾@jonathanfly·30 Jan

I don't think you need that much access, the prompts (particularly audio prompts) are strong enough to do something new in Suno. Over time you will see the music get sucked into the "Suno Funnel" baseline, so you have to stay on guard and work in smaller chunks. But this was so much harder when Suno only had text prompts, you can overpower it now much easier than before.

Holly Herndon@hollyherndon·30 Jan

One reason suno style music does not seem formally new is that artists do not have direct access to the weights of the model I am oversimplifying but basically when you prompt suno there are filters in place to ensure what it returns is what most people would be happy to hear There is no reason at all, if given direct access to model weights and more tools to navigate latent space, why an artist could not use a model to produce something formally new

886

Jonathan Fly 👾@jonathanfly·29 Jan

@LaurensPlompen This one news.ycombinator.com/threads?id=jac…

Laurens Plompen@LaurensPlompen·28 Jan

@jonathanfly I've only given this a cursory glance but which one of those is an llm?

Jonathan Fly 👾@jonathanfly·28 Jan

Just caught myself replying to an LLM on Hacker News. I'm sure better bots already avoid these classic AI writing tics. Dread it, run from it, Dead Internet Theory arrives all the same.

331

Jonathan Fly 👾@jonathanfly·29 Jan

@banteg Red Alarm (on Virtual Boy) but running on PC at 4k 240 fps.

banteg@banteg·23 Jan

if you had a chance to resurrect a game from your childhood, which would you pick?

106

15.2K

Jonathan Fly 👾@jonathanfly·28 Jan

@yacinelearning > purposefully creating the worst scientific communication content you ever seen Feeling the urge to start tweeting papers again. I'm sure I can do MUCH worse.

749

Yacine Mahdid@yacinelearning·27 Jan

it’s a bit hard to understand for the non-initiated but there is a whole cottage industry of influencers that are processing research papers algorithmically and purposefully creating the worst scientific communication content you ever seen

AI Highlight@AIHighlight

🚨 Your AI is lying to you with complete confidence. Harvard & MIT just proved ChatGPT hallucinates 110% less when you force it to argue with itself. The technique is called "Recursive Meta-Cognition" and it's embarrassingly simple. Here's how to make AI actually think:

1.2K

100.7K

Jonathan Fly 👾@jonathanfly·28 Jan

I think they're aiming at more Pro creators. For singing to song you don't need the fancy editor, just Cover. Well all the features are kind of overlapping, a Cover/Persona/Inspire/Mashup all do the same kinds of things, but feel tuned differently. Actually for "sing to song" you don't even really need Cover, even just Upload->Extend can do it (though you get better melody guidance in a Cover for a long song) x.com/jonathanfly/st…

Austin Huang@austinvhuang·28 Jan

I would be fine adding tracks via prompt even if I didn’t have the UI and it was mix->mix. Wonder if the modeling aspect could be easier with stems because you mainly need a solo dataset and I think you don’t need anything novel. Could even derive a dataset via stem separation. Suno in general seems to have a hard time making solo tracks even when prompted. But I mainly describe it that way because the suno editor is oriented around a multitrack view which seems wrong to me too. afaict the main use is to sing your own lyric track and then “cover” replacing it with an ai one?

Austin Huang@austinvhuang·28 Jan

first @suno test... am I missing something or are the most obvious edit ops absent? selection + prompt => generate revised selection (vs "replace" + 🤞?) track1 + prompt => generate track2 overdub There's already a multitrack editor, build up track layers prompt by prompt!

279
