Arjun Reddy

324 posts

@junafinity

Tech Entrepreneur

Madurai · Joined September 2021
1.8K Following · 527 Followers
Arjun Reddy retweeted
Arjun Reddy
Arjun Reddy@junafinity·
@jun_song At osmAPI.com we are working on exactly this and guess what, we have so many people contributing from Apple Studios and Clusters
English
0
0
0
13
Jun Song
Jun Song@jun_song·
Selling personal computing will be the new trend of 2027. Mark this tweet.
English
4
2
47
1.9K
Jun Song
Jun Song@jun_song·
Some guy really said Macs are bad for local LLMs. Zero heat compared to GPUs, no PC building hassle, and unified memory lets you run massive models. Plus, the power bill is cut in half—that’s hundreds of dollars saved a month. macOS is basically perfect for agent workflows, and MLX is clearly the winning format right now. The ONLY downside is prefill speed. Name a better $4k setup than the upcoming M5 Max Mac Studio (128GB RAM, 600GB/s bandwidth). You can’t. Recommending a DGX Spark instead is wild. Good luck running agents properly on 273GB/s. (And yeah, I don't recommend the Mac Mini either, bandwidth is too low). If you're going to criticize, at least do your research first.
English
59
12
271
22.7K
Arjun Reddy
Arjun Reddy@junafinity·
@jun_song @dealignai Thank you for your sincere efforts Jun! Fitting either MiniMax M3 or K2.6 on a 128GB MBP (I’ve two of these so I’m extra happy!) would really change the way people see local AI vs $200 per month max plans
English
1
0
4
998
Jun Song
Jun Song@jun_song·
In a few weeks, everyone with a 128GB Mac will have uncensored Opus-4.6 locally. It will be Minimax-M3.0-JANGTQ-CRACK by @dealignai. The open-source community is working hard on fitting them into 24GB VRAM. The future of Local LLM is so bright.
English
78
125
2.2K
85.7K
Arjun Reddy
Arjun Reddy@junafinity·
@Alibaba_Qwen We are @osmAPI_off, the OpenRouter of India, building a strong user base for Qwen models here. We’d love to be a Qwen Ambassador.
English
0
0
0
244
Qwen
Qwen@Alibaba_Qwen·
📣We're calling for ambassadors! Whether you're a developer with great technical taste or a local community leader who loves bringing people together, we'd love to have you join us. Visit the website below for more details and to apply. In return, ambassadors will receive early access to Qwen models, API credits, annual merchandise, and more. Come and check it out!👇 qwen.ai/ambassador
English
127
192
1.8K
163.7K
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
PSA, we are hiring for our DevX team to kick start our India presence 🇮🇳! If you want to help builders get the most out of Gemini in India, please ping me via DM or email. India is our largest market from an AI Studio user pov, very excited to visit later this year as well!!
English
210
96
1.7K
207.6K
Kasif
Kasif@md_kasif_uddin·
Be honest, which is the best open source AI Model?
English
364
73
1.9K
295.9K
snow
snow@lstmfpga·
@HowToAI_ Thanks, but I hope that you can post the arxiv link instead of pdf, because it is too large to dl and open on mobile.
English
2
0
0
550
How To AI
How To AI@HowToAI_·
Tencent has killed fine-tuning and RL with an $18 budget.

Right now, if you want an AI agent to become an expert at a specific, complex real-world task, you have to use Reinforcement Learning. You let it try, fail, and update its internal parameters over and over again. This is the exact optimization technique (GRPO) that DeepSeek used to build their massive reasoning models.

But there is a massive problem. Updating model weights is insanely expensive. It requires massive GPU clusters. And worst of all, when you train a model to be highly specialized at one thing, it often "overfits" and forgets how to be good at everything else.

Tencent killed this bottleneck forever by building Training-Free GRPO. Instead of spending thousands of dollars to permanently alter the AI's brain, they asked a simple question: What if we just distill the experience of learning, and inject it as a memory?

Here is how it works. They run the AI through the exact same trial-and-error process. But instead of updating the weights, they extract the "semantic advantage," the actual logic of why one answer was better than another. They compress this winning logic into a "token prior", a tiny package of high-quality experiential knowledge. Then, they just attach that knowledge directly into the API call.

The results are staggering. Tested on DeepSeek-V3, this method required only a few dozen training samples to turn the AI into a specialized expert in complex math and web searching. It didn't just compete with models that were actually fine-tuned. It outperformed them.

Zero parameter updates. Zero expensive training runs. Zero base-model amnesia.
English
73
136
938
63.4K
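The loop described in the post above (sample several attempts, compare the best against the worst, keep the lesson as text, prepend it to later prompts) can be loosely sketched in Python. This illustrates the idea as the post tells it and is not Tencent's implementation: `llm`, `score`, `build_experience`, and `prompt_with_experience` are hypothetical names, and the prompt wording is invented for the example.

```python
from typing import Callable, List


def build_experience(
    task: str,
    llm: Callable[[str], str],      # hypothetical: any text-in, text-out model call
    score: Callable[[str], float],  # hypothetical: a reward function for one attempt
    group_size: int = 4,
) -> str:
    """Sample a group of attempts, then ask the model why the best beat the worst.

    No weights are updated; the only thing produced is a sentence of text.
    """
    attempts = [llm(f"Task: {task}\nAttempt {i + 1}:") for i in range(group_size)]
    ranked = sorted(attempts, key=score)
    worst, best = ranked[0], ranked[-1]
    return llm(
        f"Task: {task}\n"
        f"Better answer: {best}\n"
        f"Worse answer: {worst}\n"
        "In one sentence, state the lesson that explains why the better answer won."
    )


def prompt_with_experience(task: str, lessons: List[str]) -> str:
    """Prepend the stored lessons to a new prompt instead of fine-tuning."""
    notes = "\n".join(f"- {lesson}" for lesson in lessons)
    return f"Lessons from earlier attempts:\n{notes}\n\nTask: {task}"
```

In this sketch the stored lessons play the role the post calls a "token prior": they travel with each request as plain text, so the base model stays unchanged.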
Arjun Reddy
Arjun Reddy@junafinity·
Is it true that @deepseek_ai is dropping prices like it’s hot because they wanna drive up their traction, and ipso facto their valuation, before their merger with @Kimi_Moonshot?
English
0
0
0
52
AVB
AVB@neural_avb·
@RonxldWilson Very nice. Is it publicly available to interact/browse?
English
1
0
2
30
Arjun Reddy
Arjun Reddy@junafinity·
@bindureddy Once a 1M-token context length without context rot is perfected in Kimi K3, there’ll be an exodus from OpenAI and Anthropic. Enterprise AI will work towards making it LTS.
English
0
0
1
258
Bindu Reddy
Bindu Reddy@bindureddy·
Kimi 2.6 beats DeepSeek and remains the king of open source!! It’s a GPT 5.5 / Opus 4.7 low-effort model which is about 5x cheaper in practice. The only drawback is speed! Open source will have won once it catches up on that dimension. Weeks away!
English
54
22
444
26.8K
Arjun Reddy
Arjun Reddy@junafinity·
@AlicanKiraz0 Salt-of-the-earth believers in the open-source future, helping us fight closed AGI overlords! Thank you, brother.
English
1
1
42
4.3K
Alican Kiraz
Alican Kiraz@AlicanKiraz0·
I've started fine-tuning my new cybersecurity model; I'm fine-tuning Qwen3.6-35B-A3B on my 1.3-billion-token cybersecurity dataset 🔥 I'll share it as open source this week.
Turkish
84
152
2.5K
120.2K
Arjun Reddy
Arjun Reddy@junafinity·
@jun_song @Kimi_Moonshot and @Alibaba_Qwen, as they are the only ones that make agentic-ready models with vision. While we are happy that there are amazing GLM, MiniMax and DeepSeek open-source releases, it would be awesome to have more vision-ready models.
English
0
0
9
2.3K
Jun Song
Jun Song@jun_song·
What is your favorite open-source AI company?
English
295
97
1.6K
93.6K
Arjun Reddy
Arjun Reddy@junafinity·
@runsonai Maybe this is a stupid question, but can it also handle some Claude Opus distill fine-tunes?
English
0
0
0
68
Thanh Pham
Thanh Pham@runsonai·
I open-sourced DDTree-MLX: tree-based speculative decoding for Apple Silicon. Now you can run Qwen 3.5 27B on your Apple machines 1.5x faster than normal. Expect even faster on smaller models. It runs Qwen 3.5 27B locally with MLX, extends DFlash with draft trees, and gets ~10-15% faster than DFlash alone on code + structured prompts while keeping output lossless. Built on the work of @bstnxbt @liranringel @yaniv_romano github.com/humanrouter/dd…
English
25
55
517
35.9K
Arjun Reddy
Arjun Reddy@junafinity·
Hi @SarvamAI @SarvamForDevs, we used TurboQuant to shrink Sarvam30B by ~3.5x while losing only 2.5% performance compared to BF16.
English
3
3
5
472
Arjun Reddy
Arjun Reddy@junafinity·
Compress the model. Keep the intelligence.

Sarvam built a remarkable 32-billion parameter model for a billion Indian language speakers. Our job was making it run anywhere. 64 gigabytes is a lot of weight to carry.

We applied TurboQuant. The core insight is simple. Rotate the weights so every coordinate carries equal information. Now each number looks like every other number. No outliers. No wasted bits. Quantize each to one of eight optimal values. Three bits instead of sixteen. Add two tiny scale factors per block to recover the fine detail.

The experts, 90% of the model, got the aggressive treatment. Attention weights got slightly more room. Router weights stayed untouched. Routing is a discrete decision. You don't approximate discrete decisions.

64 gigabytes became 18.6. One GPU instead of four. Cosine similarity above 0.99. The weights changed. The intelligence didn't.

Specific knowledge plus leverage. Know which weights can compress. Apply the math. Ship it.
English
0
0
0
126
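The rotate-then-quantize recipe in the last post can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions, not the TurboQuant algorithm or the authors' code: a randomized Hadamard transform stands in for the rotation, a uniform 8-level grid stands in for the "eight optimal values", and the "two scale factors per block" are assumed to be an offset and a step size. All function names are hypothetical.

```python
import math
import random


def fwht(vec):
    """Orthonormal fast Walsh-Hadamard transform; len(vec) must be a power of two."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(n)
    return [x / norm for x in out]


def rotate(weights, signs):
    """Random sign flips followed by a Hadamard transform (an orthogonal rotation)."""
    return fwht([w * s for w, s in zip(weights, signs)])


def unrotate(rotated, signs):
    """Invert rotate(): the orthonormal transform is its own inverse, then undo the signs."""
    return [x * s for x, s in zip(fwht(rotated), signs)]


def quantize_block(block, levels=8):
    """Map each value to one of `levels` evenly spaced codes (3 bits when levels=8).

    Returns the integer codes plus two per-block factors: offset `lo` and `step`.
    """
    lo, hi = min(block), max(block)
    step = (hi - lo) / (levels - 1) or 1.0  # avoid a zero step for constant blocks
    codes = [round((x - lo) / step) for x in block]
    return codes, lo, step


def dequantize_block(codes, lo, step):
    """Rebuild approximate values from the codes and the two per-block factors."""
    return [lo + c * step for c in codes]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def roundtrip(weights, block_size=32, seed=0):
    """Rotate, quantize block by block, dequantize, and rotate back."""
    rng = random.Random(seed)
    signs = [rng.choice((-1.0, 1.0)) for _ in weights]
    rotated = rotate(weights, signs)
    restored = []
    for i in range(0, len(rotated), block_size):
        codes, lo, step = quantize_block(rotated[i:i + block_size])
        restored.extend(dequantize_block(codes, lo, step))
    return unrotate(restored, signs)
```

Calling `cosine(w, roundtrip(w))` on a weight vector `w` whose length is a power of two gives a rough analogue of the similarity check the post mentions. By the post's own figures, 18.6 GB against 64 GB at 16 bits works out to about 4.65 bits per weight on average, which fits a 3-bit core plus per-block factors and the less aggressively compressed attention and router weights.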