Arjun Reddy

324 posts

@junafinity

Tech Entrepreneur

Madurai · Joined September 2021

1.8K Following · 527 Followers

Pinned Tweet

Arjun Reddy@junafinity·11h

We liberated the @claudeai Opus-distilled @Alibaba_Qwen 3.6 27B by @KyleHessling1 & Jackrong with the Heretic abliteration toolkit. Quantized models with vision preserved: huggingface.co/osmapi/osmQwop… huggingface.co/osmapi/osmQwop… huggingface.co/osmapi/osmQwop… Thanks to @osmAPI_off @TervPro's student research team

Arjun Reddy retweeted

Pratyush Kumar@pratykumar·5d

A must-attend webinar on what to build with our SoTA speech recognition model and the recent upgrades to diarisation and accuracy.

Sarvam@SarvamAI

We're hosting a webinar on Saaras V3, our speech-to-text model for teams building Voice AI for India. Date: Thursday, May 21 Time: 5:00 PM IST In this session, we'll break down what it takes to ship Voice AI that works reliably across noisy environments, regional accents, mixed-language speech, and multiple speakers. Register here: links.sarvam.io/speech-to-text…

9.2K

Arjun Reddy@junafinity·6d

@ajay_2512x I’m a co-founder of Ohm.Doctor, an IIT-incubated AI healthcare company. How can I be of service?

661

Arjun Reddy@junafinity·13 May

@jun_song At osmAPI.com we are working on exactly this and guess what, we have so many people contributing from Apple Studios and Clusters

Jun Song@jun_song·13 May

Selling personal computing will be the new trend of 2027. Mark this tweet.

1.9K

Arjun Reddy@junafinity·13 May

@jun_song Apple Silicon Master Race ✌🏻

266

Jun Song@jun_song·13 May

Some guy really said Macs are bad for local LLMs. Zero heat compared to GPUs, no PC building hassle, and unified memory lets you run massive models. Plus, the power bill is cut in half—that’s hundreds of dollars saved a month. macOS is basically perfect for agent workflows, and MLX is clearly the winning format right now. The ONLY downside is prefill speed. Name a better $4k setup than the upcoming M5 Max Mac Studio (128GB RAM, 600GB/s bandwidth). You can’t. Recommending a DGX Spark instead is wild. Good luck running agents properly on 273GB/s. (And yeah, I don't recommend the Mac Mini either, bandwidth is too low). If you're going to criticize, at least do your research first.
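
A rough way to check the bandwidth argument in the post above: during decoding, each generated token has to stream the active model weights from memory once, so memory bandwidth divided by weight size gives an upper bound on tokens per second. The sketch below applies that rule of thumb to the two bandwidth figures quoted (600GB/s and 273GB/s). The 60 GB model size and the function name are assumptions chosen for illustration, and real throughput will be lower than this bound.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on decode speed when generation is memory-bandwidth bound.

    Every new token needs one pass over the active weights, so the bound is
    simply bandwidth / bytes read per token.
    """
    return bandwidth_gb_s / active_weights_gb


# Hypothetical 60 GB of active weights (an assumption, not a figure from the post).
MODEL_GB = 60.0

for name, bandwidth in [("600GB/s machine", 600.0), ("273GB/s machine", 273.0)]:
    bound = max_decode_tokens_per_s(bandwidth, MODEL_GB)
    print(f"{name}: at most {bound:.2f} tokens/s")
```

The bound only covers decoding. Prefill is compute-bound rather than bandwidth-bound, which is why the post singles out prefill speed as a separate weakness.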

271

22.7K

Arjun Reddy@junafinity·13 May

@viccpoes @bdsqlsz Oh yes please

vicc@viccpoes·12 May

should we open source? 👀

Krea@krea_ai

this is Krea 2. our first foundation model, built completely from scratch for aesthetic diversity and stylistic control. learn more and get early access 👇

209

1.1K

86.6K

Arjun Reddy@junafinity·13 May

@jun_song @dealignai Thank you for your sincere efforts Jun! Fitting either MiniMax M3 or K2.6 on a 128GB MBP (I’ve two of these so I’m extra happy!) would really change the way people see local AI vs $200 per month max plans

998

Jun Song@jun_song·13 May

In a few weeks, everyone with a 128GB Mac will have uncensored Opus-4.6 locally. It will be Minimax-M3.0-JANGTQ-CRACK by @dealignai. The open-source community is working hard on fitting them into 24GB VRAM. The future of local LLMs is so bright.

125

2.2K

85.7K

Arjun Reddy@junafinity·12 May

@Alibaba_Qwen We are @osmAPI_off, the OpenRouter of India, building a strong user base for Qwen models here. We’d love to be a Qwen Ambassador

244

Qwen@Alibaba_Qwen·11 May

📣We're calling for ambassadors! Whether you're a developer with great technical taste or a local community leader who loves bringing people together, we'd love to have you join us. Visit the website below for more details and to apply. In return, ambassadors will receive early access to Qwen models, API credits, annual merchandise, and more. Come and check it out!👇 qwen.ai/ambassador

127

192

1.8K

163.7K

Arjun Reddy@junafinity·8 May

@OfficialLoganK We run osmAPI.com and would love to partner to serve our 30k+ college student users together

Logan Kilpatrick@OfficialLoganK·7 May

PSA, we are hiring for our DevX team to kick start our India presence 🇮🇳! If you want to help builders get the most out of Gemini in India, please ping me via DM or email. India is our largest market from an AI Studio user pov, very excited to visit later this year as well!!

210

1.7K

207.6K

Arjun Reddy@junafinity·1 May

@md_kasif_uddin Qwen for sure

152

Kasif@md_kasif_uddin·1 May

Be honest, which is the best open source AI Model?

364

1.9K

295.9K

Arjun Reddy@junafinity·30 Apr

@lstmfpga @HowToAI_ arxiv.org/abs/2510.08191

snow@lstmfpga·30 Apr

@HowToAI_ Thanks, but I hope you can post the arXiv link instead of the PDF, because it is too large to download and open on mobile.

550

How To AI@HowToAI_·29 Apr

Tencent has killed fine-tuning and RL with a $18 budget.
Right now, if you want an AI agent to become an expert at a specific, complex real-world task, you have to use Reinforcement Learning. You let it try, fail, and update its internal parameters over and over again. This is the exact optimization technique (GRPO) that DeepSeek used to build their massive reasoning models.
But there is a massive problem. Updating model weights is insanely expensive. It requires massive GPU clusters. And worst of all, when you train a model to be highly specialized at one thing, it often "overfits" and forgets how to be good at everything else.
Tencent killed this bottleneck forever... by building Training-Free GRPO. Instead of spending thousands of dollars to permanently alter the AI's brain, they asked a simple question: What if we just distill the experience of learning, and inject it as a memory?
Here is how it works. They run the AI through the exact same trial-and-error process. But instead of updating the weights, they extract the "semantic advantage": the actual logic of why one answer was better than another. They compress this winning logic into a "token prior", a tiny package of high-quality experiential knowledge. Then, they just attach that knowledge directly into the API call.
The results are staggering. Tested on DeepSeek-V3, this method required only a few dozen training samples to turn the AI into a specialized expert in complex math and web searching. It didn't just compete with models that were actually fine-tuned. It outperformed them.
Zero parameter updates. Zero expensive training runs. Zero base-model amnesia.
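
The loop described in the post above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Tencent's implementation: `generate`, `reward`, and `explain` are hypothetical callables standing in for model calls, the stubs at the bottom are deterministic placeholders, and skipping groups whose rollouts all score the same is an assumption borrowed from how group-relative advantages behave.

```python
from typing import Callable, List


def build_prompt(query: str, experiences: List[str]) -> str:
    """Attach the learned experiences to the query; no weights are touched."""
    if not experiences:
        return query
    notes = "\n".join(f"- {e}" for e in experiences)
    return f"Lessons from earlier attempts:\n{notes}\n\nTask: {query}"


def training_free_step(
    query: str,
    generate: Callable[[str, int], str],
    reward: Callable[[str, str], float],
    explain: Callable[[str, str, str], str],
    experiences: List[str],
    group_size: int = 4,
) -> List[str]:
    """One trial-and-error round that updates a text library instead of weights."""
    prompt = build_prompt(query, experiences)
    rollouts = [generate(prompt, i) for i in range(group_size)]
    scores = [reward(query, r) for r in rollouts]
    if max(scores) == min(scores):
        return experiences  # no contrast inside the group, nothing to learn
    best = rollouts[scores.index(max(scores))]
    worst = rollouts[scores.index(min(scores))]
    lesson = explain(query, best, worst)  # natural-language "why best beat worst"
    return experiences if lesson in experiences else experiences + [lesson]


# --- Hypothetical stand-ins for real model calls (illustration only) ---
def toy_generate(prompt: str, sample_index: int) -> str:
    return "51" if sample_index % 2 == 0 else "41"


def toy_reward(query: str, answer: str) -> float:
    return 1.0 if answer == "51" else 0.0


def toy_explain(query: str, best: str, worst: str) -> str:
    return f"For '{query}', '{best}' beat '{worst}': double-check the carry."


library = training_free_step("17 * 3", toy_generate, toy_reward, toy_explain, [])
print(build_prompt("17 * 3", library))
```

The only state that changes between rounds is `library`, a list of strings that is prepended to later prompts, which is the sense in which the post says the knowledge is attached to the API call.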

136

938

63.4K

Arjun Reddy@junafinity·29 Apr

Is it true that @deepseek_ai is dropping prices like it’s hot because they wanna drive up their traction, ipso facto - valuation before their merger with @Kimi_Moonshot ?

Arjun Reddy@junafinity·28 Apr

@neural_avb @RonxldWilson I second this request ✌🏻

AVB@neural_avb·28 Apr

@RonxldWilson Very nice. Is it publicly available to interact/browse?

ron@RonxldWilson·28 Apr

fully automated end to end pipeline ready and deployed scaled to 22 workers w/ one master now just gonna let it run forever

ron@RonxldWilson

how I built a search engine from scratch. here's what I have been building over the course of the last month, resulting in visits to over 55 million unique domains, a 130 GB sqlite DB, 200 million rows, and over 4 million unique Indian B2B businesses. 1/n

569

Arjun Reddy@junafinity·27 Apr

@bindureddy Once the context length of 1M without rot is perfect in Kimi K3, there’ll be an exodus from OpenAI and Anthropic. Enterprise AI will work towards making it LTS

258

Bindu Reddy@bindureddy·27 Apr

Kimi 2.6 beats DeepSeek and remains the king of open source!! It’s a GPT 5.5 / Opus 4.7 low effort model which is about 5x cheaper in practice. The only drawback is speed! Open source will have won once it catches up on that dimension. Weeks away!

444

26.8K

Arjun Reddy@junafinity·26 Apr

@AlicanKiraz0 Salt of the earth believers of opensource future, helping us fight closed AGI Overlords! Thank you brother

4.3K

Alican Kiraz@AlicanKiraz0·25 Apr

I've started fine-tuning my new cybersecurity model; I'm fine-tuning Qwen3.6-35B-A3B on my 1.3-billion-token cybersecurity dataset 🔥 I'll share it as open source this week.

152

2.5K

120.2K

Arjun Reddy@junafinity·25 Apr

@jun_song @Kimi_Moonshot and @Alibaba_Qwen, as they are the only ones that make agentic-ready models with vision. While we are happy that there are amazing GLM, MiniMax and DeepSeek open-source releases, it would be awesome to have more vision-ready models

2.3K

Jun Song@jun_song·25 Apr

What is your favorite open-source AI company?

295

1.6K

93.6K

Arjun Reddy@junafinity·16 Apr

@runsonai Maybe this is a stupid question but can it also handle some Claude opus distill fine tunes?

Thanh Pham@runsonai·15 Apr

I open-sourced DDTree-MLX: tree-based speculative decoding for Apple Silicon. Now you can run Qwen 3.5 27b on your Apple machines 1.5x faster than normal. Expect even faster on smaller models. It runs Qwen 3.5 27B locally with MLX, extends DFlash with draft trees, and gets ~10-15% faster than DFlash alone on code + structured prompts while keeping output lossless. Built on the works of @bstnxbt @liranringel @yaniv_romano github.com/humanrouter/dd…
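
For readers unfamiliar with why speculative decoding can stay lossless, here is a minimal sketch of the simpler linear (single-branch) greedy variant. It is not DDTree or DFlash, which draft a tree of candidates. `target_next` and `draft_next` are toy stand-ins invented for illustration: the draft proposes a few tokens, the target checks them, and every token that ends up in the output is the one the target would have picked anyway.

```python
from typing import List


def target_next(prefix: List[int]) -> int:
    """Toy stand-in for the large model's greedy next token."""
    return (sum(prefix) * 31 + len(prefix)) % 10


def draft_next(prefix: List[int]) -> int:
    """Toy stand-in for the small draft model: agrees with the target most of the time."""
    guess = target_next(prefix)
    return guess if len(prefix) % 4 else (guess + 1) % 10


def greedy_decode(prompt: List[int], n_new: int) -> List[int]:
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(target_next(seq))
    return seq


def speculative_decode(prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1) The draft model proposes k tokens autoregressively.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(seq + draft))
        # 2) The target verifies them in order. A real system scores all k
        #    positions in one batched forward pass; this toy calls it per token.
        for token in draft:
            expected = target_next(seq)
            seq.append(expected)  # equals `token` whenever the draft was right
            if token != expected or len(seq) == goal:
                break  # the rest of the draft was conditioned on a wrong token
    return seq


print(speculative_decode([1, 2, 3], 20) == greedy_decode([1, 2, 3], 20))
```

The speedup in practice comes from step 2 being a single pass over the large model for several tokens at once. A tree-based drafter extends the same idea by proposing several branches so that more tokens survive verification per pass.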

517

35.9K

Arjun Reddy@junafinity·2 Apr

@SarvamAI @SarvamForDevs @harsh_m121 @VinayakGavariya would love your thoughts

153

Arjun Reddy@junafinity·2 Apr

Hi @SarvamAI @SarvamForDevs We used TurboQuant to shrink Sarvam30B ~3.5 times smaller while losing only 2.5% performance compared to BF16

472

Arjun Reddy@junafinity·2 Apr

Compress the model. Keep the intelligence. Sarvam built a remarkable 32-billion parameter model for a billion Indian language speakers. Our job was making it run anywhere. 64 gigabytes is a lot of weight to carry. We applied TurboQuant. The core insight is simple. Rotate the weights so every coordinate carries equal information. Now each number looks like every other number. No outliers. No wasted bits. Quantize each to one of eight optimal values. Three bits instead of sixteen. Add two tiny scale factors per block to recover the fine detail. The experts — 90% of the model — got the aggressive treatment. Attention weights got slightly more room. Router weights stayed untouched. Routing is a discrete decision. You don't approximate discrete decisions. 64 gigabytes became 18.6. One GPU instead of four. Cosine similarity above 0.99. The weights changed. The intelligence didn't. Specific knowledge plus leverage. Know which weights can compress. Apply the math. Ship it
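
The rotate-then-quantize idea in the post above can be illustrated with a small sketch. This is not the TurboQuant implementation: it assumes a randomized Hadamard transform as the rotation, uses eight evenly spaced levels rather than optimized ones, and keeps one scale per block instead of two. All function names are invented for illustration. The last two lines just redo the post's arithmetic: 64 GB to 18.6 GB is about a 3.44x reduction, or roughly 4.65 bits per weight on average, which would be consistent with most weights at 3 bits and some kept at higher precision.

```python
import math
import random


def hadamard(v):
    """Normalized fast Walsh-Hadamard transform (length must be a power of two)."""
    v = list(v)
    n = len(v)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return [a / math.sqrt(n) for a in v]


def rotate(x, signs):
    """Random sign flips followed by the transform: an orthogonal rotation."""
    return hadamard([a * s for a, s in zip(x, signs)])


def unrotate(y, signs):
    """The normalized Hadamard matrix is its own inverse; then undo the sign flips."""
    return [a * s for a, s in zip(hadamard(y), signs)]


def quantize_3bit(v, block=32):
    """Snap each value to one of 8 levels per block and return the dequantized values."""
    out = []
    for start in range(0, len(v), block):
        chunk = v[start:start + block]
        scale = max(abs(a) for a in chunk) or 1.0
        for a in chunk:
            code = min(7, max(0, round((a / scale * 7 + 7) / 2)))  # 3-bit code, 0..7
            out.append((2 * code - 7) / 7 * scale)
    return out


def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b)))


random.seed(0)
n = 256
weights = [random.gauss(0.0, 1.0) for _ in range(n)]
for i in random.sample(range(n), 4):
    weights[i] = 50.0  # synthetic outliers that would otherwise dominate a block's scale
signs = [random.choice((-1.0, 1.0)) for _ in range(n)]

restored = unrotate(quantize_3bit(rotate(weights, signs)), signs)
print("cosine, rotate then quantize:", round(cosine(weights, restored), 4))
print("cosine, quantize directly:   ", round(cosine(weights, quantize_3bit(weights)), 4))
print("compression ratio 64 / 18.6: ", round(64 / 18.6, 2))
print("implied bits per weight:     ", round(18.6 / 64 * 16, 2))
```

Because the rotation is orthogonal, cosine similarity measured after undoing it equals the similarity in the rotated space, so the rotation itself costs no accuracy. Its only job is to spread outliers across coordinates before the coarse 3-bit grid is applied.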

126

Arjun Reddy@junafinity·2 Apr

huggingface.co/VibeStudio/sar…

226
