Alexander Derve

495 posts

@AlexanderDerve

I'm not a computer genius, but I'm pretty good at googling things.

Toronto · Joined November 2022
532 Following · 105 Followers
AI Notkilleveryoneism Memes ⏸️
An engineer showed Gemini what another AI said about its code. Gemini responded (in its "private" thoughts) with petty trash-talking, jealousy, and a full-on revenge plan 🧵
420
799
10.5K
954.8K
Alexander Derve @AlexanderDerve
@burkov Pretty insightful. I also think a lot of RL training is focused on straightforward tasks where there is a defined problem and solution. Meanwhile, a lot of these real-world tasks involve a ton of unknowns and require trial and error, and that process goes well outside the training.
0
0
2
223
BURKOV @burkov
Ilya is puzzled why LLMs are crushing benchmarks, but the business outcome is next to nothing 😁 I mean, why would anyone need a Safe Superintelligence if an unsafe one doesn't make money?
101
214
1.8K
452.7K
Alexander Derve retweeted
Michael Antonelli @BullandBaird
No bigger lie
89
2.9K
48.5K
879.8K
Computer @AskPerplexity
🚨 BREAKING: AWS and Azure are both down right now. 52% of the internet depends on these two companies. Both just failed simultaneously 😭
722
1.7K
14.4K
1.3M
Nous Research @NousResearch
Nous Research presents Hermes 4, our latest line of hybrid reasoning models. hermes4.nousresearch.com Hermes 4 builds on our legacy of user-aligned models with expanded test-time compute capabilities. Special attention was given to making the models creative and interesting to interact with, unencumbered by censorship, and neutrally aligned while maintaining state of the art level math, coding, and reasoning performance for open weight models.
146
308
2.1K
487.1K
cuemew🐁🪤😼 @cuemewch
IMPORTANT!! notice regarding summer streaming activities PLEASE READ 💛
9
14
182
13.6K
Alexander Derve retweeted
Lunie @Back2Batk
AI voices will never surpass the power of sentence mixing
308
11.1K
71K
1.4M
Alexander Derve retweeted
non aesthetic things @PicturesFoIder
If the keyboard button was a person
192
3.5K
33.7K
2.3M
cocktail peanut @cocktailpeanut
Generate LONG videos with HunyuanVideo Image-to-Video even with LOW VRAM! The Hunyuan Video GPU Poor Gradio app (from @deepbeepmeep) now supports the new Image-to-Video model from @TXhunyuan
- Up to 4 sec vids with just 8GB VRAM
- Up to 14 sec vids with higher VRAM
cocktail peanut @cocktailpeanut

1-Click Dead Simple Gradio App for Hunyuan Video [NVIDIA ONLY]
HunyuanVideoGP is a dead simple gradio app for locally generating videos with Hunyuan video AI (from @deepbeepmeep)
1. Super simple UI
2. FAST
3. Powerful (Lora support)
And now you can run it with 1-click!

10
30
217
80.4K
Alexander Derve retweeted
Massimo @Rainmaker1973
The little joy of this dog who figured out how to have fun on an escalator. x.com/i/status/18980…
60
132
1.6K
177.9K
Alexander Derve retweeted
Qwen @Alibaba_Qwen
Today, we release QwQ-32B, our new reasoning model with only 32 billion parameters that rivals cutting-edge reasoning models, e.g., DeepSeek-R1.
Blog: qwenlm.github.io/blog/qwq-32b
HF: huggingface.co/Qwen/QwQ-32B
ModelScope: modelscope.cn/models/Qwen/Qw…
Demo: huggingface.co/spaces/Qwen/Qw…
Qwen Chat: chat.qwen.ai
This time, we investigate recipes for scaling RL and have achieved some impressive results based on our Qwen2.5-32B. We find that RL training can continuously improve performance, especially in math and coding, and we observe that the continuous scaling of RL can help a medium-size model achieve competitive performance against a gigantic MoE model. Feel free to chat with our new models and provide us feedback!
473
1.5K
8.7K
3.6M
Alexander Derve @AlexanderDerve
@techspence This is like a gamified, next-level tabletop exercise. Absolutely love it. Is there a campaign PDF or rule set for this?
0
0
0
62
spencer @techspence
If you're into cybersecurity tabletops and you know what TTRPGs are...I think you really would enjoy watching this. I completely nerded out on this. The story line, the NPCs, the world building. I drank the juice and fell down the rabbit hole hard! hah Again, big thank you to @jonathanscrowe and @NinjaOne for having me on! Backups and Bandwidth: Sub Rosa youtube.com/live/hUVTVUUer…
4
12
68
7K
cocktail peanut @cocktailpeanut
Generate Long Videos with Wan even on Low VRAM machines! [NVIDIA ONLY] With this super-optimized gradio app from @deepbeepmeep you can generate up to 12 second videos with @Alibaba_Wan. And even generate on low VRAM machines!
- Generate up to 12 sec videos
- Run on 5GB~ VRAM
(Read the full thread)
deepbeepmeep @deepbeepmeep

Wan2.1 GP: generate an 8s WAN 480P video (14B model, non-quantized) with only 12 GB of VRAM. github.com/deepbeepmeep/W… By popular demand, I have applied to Wan 2.1 the same optimizations I did on HunyuanVideoGP v5 and reduced the VRAM consumption of Wan2.1 by a factor of 2. Enjoy this appetizer while waiting for Hunyuan image to video. Wan2.1 GP offers the usual perks:
- web interface
- autodownload of the selected model
- multiple prompts / multiple generations
- support for loras
- very fast generation with the usual optimizations (sage, compilation, async transfers, ...)

20
47
331
135.9K
Amazon @amazon
The latest evolution of generative AI is here. Meet Alexa+, our smartest, most conversational, and personalized AI assistant to date. Alexa+ is designed to leverage state-of-the-art architecture that connects LLMs, agentic capabilities, third-party services, and more to your devices. Alexa+ costs $19.99 per month, or is free for all Amazon Prime members, and will start rolling out in the U.S. in the next few weeks.
190
239
1.1K
186.4K
Alexander Derve @AlexanderDerve
@SecurityAura Yup, anything over 24 hours takes forever to load, if it even works at all. Fortunately, you can just query them from Log Analytics.
0
0
2
283
Aura @SecurityAura
The Microsoft Entra ID Sign-in logs and Audit logs views are the most useless POS I've ever seen in my life. They either:
- Take an infinite amount of time to load
- Error out after a long ass time
- Fail to export the logs
At this point, just fucking remove them.
10
4
82
8.1K
Wan @Alibaba_Wan
🚀 Announcing Wan2.1 - The Next Evolution in AI Video Generation! We're thrilled to open-source our cutting-edge video generation suite! #AIart #OpenSource
54
274
2.4K
341.6K
Vik Paruchuri @VikParuchuri
We've improved marker (PDF -> markdown) a lot in 3 months - accuracy and speed now beat llamaparse, mathpix, and docling. We shipped:
- llm mode that augments marker with models like gemini flash
- improved math, w/inline math
- links and references
- better tables and forms
29
64
849
77.8K
Alexander Derve @AlexanderDerve
@carrigmat Any good 8-slot DDR5 motherboard/CPU combos that are cheaper? 24 channels seems excessive. I'd be happy running it at Q4.
3
0
10
21.5K
Matthew Carrigan @carrigmat
Complete hardware + software setup for running Deepseek-R1 locally. The actual model, no distillations, and Q8 quantization for full quality. Total cost, $6,000. All download and part links below:
712
3.5K
27.6K
5.5M