Fung

474 posts

@Funggx

never hurts to learn more | undergrad ML Research & SWE

Joined August 2024
164 Following, 41 Followers
Fung@Funggx·
@lucasmeijer You also get better quality from forced alignment since it knows what’s being said as you already have the script. I’ll show you an example later today
Lucas Meijer@lucasmeijer·
@Funggx I have low volume. I only care about quality not money.
Lucas Meijer@lucasmeijer·
I really want these guys to be great, but I'm 30 clicks after seeing this tweet, and I still have not found the actual thing they shipped! I have production apps that could use this, and I give up because it's always such a shitshow to use anything google ai.
Logan Kilpatrick@OfficialLoganK

Introducing Gemini 3.1 Flash TTS 🗣️, our latest text to speech model with scene direction, speaker level specificity, audio tags, more natural + expressive voices, and support for 70 different languages. Available via our new audio playground in AI Studio and in the Gemini API!

Fung@Funggx·
@lucasmeijer Performing STT just to get timestamps is usually very expensive in both compute and time. I haven’t worked with Dutch before, but since you already have the text, you should use forced alignment; it’s way cheaper. I can build a sample and show you.
Lucas Meijer@lucasmeijer·
@Funggx I just tried Inworld TTS. Its Dutch quality is horrible; I'm getting way better results with that new Gemini thing. I guess I just need to run the generated audio through a transcriber again to get timestamps to drive subtitles.
Fung@Funggx·
@lucasmeijer Gemini TTS is not very good for this. Inworld TTS API (better quality imo) gives you word/character timestamps. But if you need to use Gemini TTS, you can perform forced alignment (github.com/lukerbs/forcea…) to create the SRT file. If you want more help, feel free to send a DM :)
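A minimal sketch of the last step mentioned above, turning word-level timestamps into an SRT file. It assumes the timestamps have already been produced by some forced aligner or TTS API; the `Word` type, the `words_to_srt` helper, and the `max_words` grouping rule are hypothetical names for illustration and are not the API of the linked repository.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One aligned word (hypothetical shape; real aligners differ)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timecode HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group words into fixed-size cues and render them as SRT text."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        index = len(cues) + 1  # SRT cue numbers start at 1
        start = srt_timestamp(chunk[0].start)
        end = srt_timestamp(chunk[-1].end)
        text = " ".join(w.text for w in chunk)
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    aligned = [Word("Hallo", 0.0, 0.4), Word("wereld", 0.5, 1.0)]
    print(words_to_srt(aligned))
```

Grouping by a fixed word count is the simplest choice; splitting on pauses or punctuation would likely read better as subtitles but needs more logic.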
Lucas Meijer@lucasmeijer·
@Funggx what do you recommend if I want to drive subtitles with the audio track and I need word level timecodes?
taoki@justalexoki·
@jskoiz where the FUCK I've been looking all morning
Fung@Funggx·
@msfeldstein @kr0der cursor cloud agents are peak, I love how they record the testing. Sadly they cost a ton to run and don't work with cheaper models
Michael Feldstein@msfeldstein·
@kr0der I've been loving cloud agents recently on web. I've been doing CLI development, and you can use the terminal right in the browser, so I just don't have to do anything locally. Parallelization ++. Like, this is all I need to build with an agent and try out the results.
Anthony Kroeger@kr0der·
what do you guys think about cloud agents? i don't really understand the hype in terms of using them for coding - they become a LOT slower, more expensive, and the UX to get them working perfectly isn't the best compared to local agents. i still prefer local agents for coding by a lot and pretty much only use cloud agents for a quick fix in Slack
Fung@Funggx·
@rsuyoy @Xxi5olc maybe it's being overwhelmed with large context leading to more tokens per conversation pass?
Yousr@rsuyoy·
@Xxi5olc a couple of hours of prompting, discussing things in the codebase more than writing code, using medium and high and never xhigh
Yousr@rsuyoy·
Yeah man OpenAI’s absolutely bullshitting us with the rate limits, there’s no way I just burned through 1/5 of the weekly limit in one light session
Fung@Funggx·
@tenobrus You’re on a roll with your predictions right now
Fung@Funggx·
@kr0der Maybe better for straight coding without the planning aspect?
Anthony Kroeger@kr0der·
does this mean GPT 5.3 Codex is actually better at coding than GPT 5.4? 👀
Tibo@thsottiaux

@Its_Nova1012 GPT-5.4 - Best results, highest cost. Well rounded. GPT-5.4-Mini - Cheapest option. Fast and accurate, will notice occasional things that are too hard for it and you can switch to GPT-5.4. GPT-5.3-Codex - Code freak. Can use if you only care about generating code.

Fung@Funggx·
@prathamdby Have you ever run into any awkward model behaviors? This seems like gold, especially with something like a copilot sub on top of it
𝖕𝖗𝖆𝖙𝖍@prathamdby·
I just made a short video showing how to do this with the CLIProxyAPI tool. It's a very useful and easy-to-use tool, but it can be hard for people who have never used it before. In this video, I showed you how to use any OAuth subscription in Cursor AI. Links below ⬇️
Fekri@fekdaoui

hey @cursor_ai i’d love to be able to use my openai codex sub in cursor’s native agent window. you could even make it available only to active cursor subscribers so people don’t just free-ride. cursor ux + codex sub usage limits would be elite

cheaty@cheatyyyy·
@its5q @thdxr holy shit this is a corner of the internet ive never heard about thank you so much
cheaty@cheatyyyy·
what the fuck is dax's top secret vps provider from miami pls no gatekeep @thdxr
kache@yacineMTB·
GABE NEWELL YOUUUUUUUUUUUUUUUUUUUUUU
kache@yacineMTB·
I'M FUCKING ADDICTED TO VIDEO GAMES AGAIN AAAAAAAAAAAAAAAAAAAAA
Fung@Funggx·
@aryanvs_ looking forward to the streams :D
Aryan V S@aryanvs_·
@Funggx thinking about it. maybe twitch and x?
Aryan V S@aryanvs_·
will be streaming high perf cuda i guess since it's a draw. reaches more people and probably more helpful :) you wouldn't believe how much inefficiency exists in open source codebases/tooling unless you very actively read training and inference frameworks. all the main ideas are basically out there but very scattered. we will push the boundary and optimize the fuck outta consumer and eventually datacenter gpus. no clankers, no optimization frameworks, no dsls, no search-based compilers, just raw perf and lowering memory requirements from first principles
Aryan V S@aryanvs_

A or B? bird app does not let you put poll in quote tweet :/

SIGKITTEN@SIGKITTEN·
bags bitcoin crypto $bags $btc
Fung@Funggx·
@tenobrus Aw man, hoping your content doesn’t go away. I have a passion for learning and enjoy your takes :)
Tenobrus@tenobrus·
alright i'm officially gonna try friends of friends only replies for at least 2 weeks and see how it affects my experience. will reconsider then. but before the gates slam shut, it's Lowbie Appreciation Thursday! introduce urself in the replies and tell me why i should follow u!
Tenobrus@tenobrus

strongly considering making all my posts "friends of friends replies only" for a while. i follow 2.5k people and they collectively follow an assload more, so its really not too strong a filter, but it'd keep out bots and random groypers. downside is it blocks brand new lowbies...

amit@gravicle·
Love that they are comparing with 2-generation-old Luma models. @Designarena let's try with Ray3.14
Design Arena@Designarena

BREAKING: Grok Imagine by @xai takes 1st overall on Multi Image to Video Arena, with an overall Elo of 1342. The team's debut reference image to video model establishes a new Pareto frontier for Preference vs. Speed with an average generation time of 58.9 seconds. Huge congrats to the @xai team for this achievement!

Fung@Funggx·
@jparkjmc Wait isn’t this in NYC
SIGKITTEN@SIGKITTEN·
man, that 200 codex sub is gonna be gone in like a day after april