

kukeshajanth kodeswaran
@kukeshajanth
Building ⬛⬜⬜⬜⬜
Toronto · Joined August 2020
442 Following · 48 Followers

kukeshajanth kodeswaran retweeted

📣 Today, I'm excited to walk you through Unity's NEW AI offering, Meta MCP Extensions, and agentic tools to demonstrate how we can Build a Full VR/MR Game from start to finish using these AI tools in a practical way.
🎥 Full video available at: youtu.be/bWxIF903t_I
📌 Here’s what I’m covering today:
- Unity VR project setup (with OpenXR plugins)
- Installing the Unity AI Assistant and demos
- Configuring Unity MCP + Meta MCP Extensions
- Configuring Claude Code & MCPs (a config sketch follows below)
- Building a VR/MR Basketball Game with the Unity AI Assistant, Claude Agent, and external Claude Code CLI
- A lot of iteration with Claude Code + the Meta XR Simulator
💡Also, it’s been a while since I've posted a new video, and I’m genuinely excited to be back, especially with a topic like this that I know many devs have been waiting for.
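
For the Claude Code side of that setup, here is a minimal sketch of registering an MCP server via a project-level .mcp.json, which Claude Code reads; the server name and launch command are placeholder assumptions, not the actual Unity MCP package.

```python
import json
import pathlib

# Hypothetical .mcp.json for Claude Code. The "mcpServers" key is the
# format Claude Code looks for; the server name and launch command
# below are placeholders, not the real Unity MCP entry point.
config = {
    "mcpServers": {
        "unity": {
            "command": "python",
            "args": ["-m", "unity_mcp_server"],  # assumed entry point
        }
    }
}
pathlib.Path(".mcp.json").write_text(json.dumps(config, indent=2))
print("Wrote .mcp.json; restart Claude Code to pick it up.")
```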


@BoazWith True, really impressed with the overall capabilities even outside coding. This feels like an Opus 4.5-level change. Can't get enough of this workflow. Redoing most of the existing workflows with Codex, feeling more confident. 🚀

@kukeshajanth That spec/implementation split is the useful part. Codex gets much calmer when the weird decisions are already named instead of hidden inside a chat prompt.

kukeshajanth kodeswaran retweeted

UPDATE for imagegen-frontend-web skill
more "creative" outputs
- different layouts
- better understanding of the request
skill:
github.com/Leonxlnx/taste…





@NickADobos Hadn't thought of it this way, kudos. For a given time budget, GPT 5.5 low with multiple passes might be better, I guess. Did you happen to find anything concrete in these tests?

What’s the difference between
GPT 5.5 low reasoning + /goal
vs.
GPT 5.5 xhigh reasoning + one shot
Both are essentially yeeting compute at a task.
But which one
- is more efficient?
- works better & produces better results?
- finishes the task?
Seems like the major difference is that low would spend less time thinking between each step, and would do way more tool calls because of it?
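
A back-of-envelope way to frame that trade-off, with every number below being a made-up assumption rather than a measurement:

```python
# All numbers are invented for illustration. "low + /goal" loops many
# short thinking bursts, each followed by a tool call; "xhigh one-shot"
# spends one long deliberation, then acts once.
low_steps = 40            # assumed iterations in the /goal loop
low_think = 500           # assumed reasoning tokens per step
tool_call = 300           # assumed tokens per tool call + result
xhigh_think = 30_000      # assumed reasoning tokens for one deep pass

low_total = low_steps * (low_think + tool_call)
xhigh_total = xhigh_think + tool_call  # single pass, single action

print(f"low + /goal : ~{low_total:,} tokens, {low_steps} tool calls")
print(f"xhigh 1-shot: ~{xhigh_total:,} tokens, 1 tool call")
# Under these assumptions the token totals are close; the difference is
# where compute lands: feedback from many tool calls vs. upfront thinking.
```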
kukeshajanth kodeswaran retweeted

If you are interested in state-of-the-art finetuning tips and tricks:
Shopify Engineering @ShopifyEng
We reverse-engineered training data from thousands of merchant-created automations and fine-tuned Qwen3-32B into a tool-calling agent for Shopify Flow. Results: 2.2x faster, 68% cheaper. The more interesting part: why we trained on Python instead of our own DSL, and what broke when benchmarks looked good but production didn't. ⬇️
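
To make the "trained on Python instead of our own DSL" point concrete, here is a hypothetical illustration of that target-format choice; the helper names and the DSL shape are invented, not Shopify's actual formats.

```python
# Hypothetical sketch of the training-target choice (all names invented).
# A proprietary-DSL target is a format the base model has never seen,
# e.g. {"action": "flow.tag_customer", "params": {"tag": "vip"}},
# while a Python target reuses syntax Qwen3-32B already knows well.

def tag_customer(customer: dict, tag: str) -> None:
    customer.setdefault("tags", []).append(tag)   # stub tool

def send_email(customer: dict, template: str) -> None:
    print(f"email {customer['email']} using {template}")  # stub tool

# What the fine-tuned model would be asked to emit: plain Python calls.
def automation(order: dict, customer: dict) -> None:
    if order["total_price"] > 500:
        tag_customer(customer, "vip")
        send_email(customer, template="vip_welcome")

automation({"total_price": 750}, {"email": "a@example.com"})
```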
kukeshajanth kodeswaran retweeted

Okay yeah browser-harness is AGI
github.com/browser-use/br…
kukeshajanth kodeswaran retweeted

Introducing ml-intern, the agent that just automated the post-training team @huggingface
It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.
It can pull off crazy things:
We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.
In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual, etc., then upsampled 50x for training. Beat Codex on HealthBench by 60%.
For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.
How does it work?
ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains
ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like.
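
A skeletal version of that loop, with every step stubbed out; nothing below is ml-intern's actual API, only the shape of the paper → data → train → eval cycle described above.

```python
# Skeleton of the research loop described above. Only the shape is
# real; every function here is a stub, not ml-intern's actual API.

def find_papers(topic: str) -> list[str]:
    return [f"arxiv:0000.00000 ({topic})"]   # stub: arxiv / hf.co/papers

def datasets_from_citations(papers: list[str]) -> list[str]:
    return ["cited-dataset"]                 # stub: walk citation graphs

def train(dataset: str, run: int) -> str:
    return f"model-run{run}"                 # stub: SFT/GRPO via HF Jobs

def evaluate(model: str) -> float:
    return 0.10 + 0.02 * int(model[-1])      # stub: read own eval output

def research_loop(topic: str, runs: int = 3) -> tuple[str | None, float]:
    papers = find_papers(topic)
    best, best_score = None, 0.0
    for run, dataset in enumerate(datasets_from_citations(papers) * runs):
        model = train(dataset, run)
        score = evaluate(model)
        if score > best_score:               # keep the best, iterate on the rest
            best, best_score = model, score
    return best, best_score

print(research_loop("scientific reasoning"))
```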
Releasing it today as a CLI and a web app you can use from your phone/desktop.
CLI: github.com/huggingface/ml…
Web + mobile: huggingface.co/spaces/smolage…
And the best part? We've also provisioned $1k of GPU resources and Anthropic credits for the quickest among you to use.
kukeshajanth kodeswaran retweeted

🥳We just open-sourced Cube Sandbox! An instant, concurrent, secure and lightweight sandbox runtime for AI Agents.
Built with RustVMM and KVM, it achieves the perfect balance of security and performance:
→ Sub-60ms cold start (2.5-50x faster)
→ Under 5MB memory overhead per instance (6x less memory)
→ Dedicated kernel per sandbox (hardware-level isolation)
→ Thousands of concurrent sandboxes per node
→ 100% E2B SDK compatible. Swap the endpoint, zero code changes (see the sketch below)
Full-stack capability, one-click deployment. 3 steps to spin up your own private AI sandbox 👇 🔗
github.com/TencentCloud/C…
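
Here is a sketch of what that E2B-compatibility claim would look like in practice, assuming the e2b_code_interpreter Python SDK and assuming Cube honors E2B's domain override; the endpoint below is a placeholder.

```python
import os

# Assumption: Cube accepts E2B's domain override, so pointing the SDK at
# a self-hosted endpoint is the only change. The domain is a placeholder.
os.environ["E2B_DOMAIN"] = "cube.example.internal"

from e2b_code_interpreter import Sandbox  # unchanged E2B SDK code below

sbx = Sandbox()                     # cold start would land on Cube
execution = sbx.run_code("print(6 * 7)")
print(execution.logs)
sbx.kill()
```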
kukeshajanth kodeswaran retweeted

🔥 DFlash x MLX is happening!
Shoutout to @aryagm01 for the early work on this. We're building on the momentum. Native MLX support, more models (Qwen3.5), up to 4x faster. Lossless!
👉 github.com/z-lab/dflash

kukeshajanth kodeswaran retweeted

🚀Introducing Motus, the open-source agent infrastructure that learns in production.
Existing agent infra serves static agents: the harness, model, and workflow are fixed after deployment. But static agents degrade over time. The harness goes stale, new models go unincorporated, context drifts, and latency compounds.
Motus closes this gap by learning from every trace (failures, latency, cost, and task outcomes) and using those signals to continuously optimize the agent harness, model orchestration, context memory, and end-to-end latency.
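
To make "learning from every trace" concrete, here is an illustrative sketch of the kind of per-trace record and routing update such a system could run on; none of this is Motus's actual schema or API.

```python
# Illustrative only, not Motus's schema: each finished task leaves a
# trace, and routing then favors the model with the best success-per-dollar.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Trace:
    model: str
    latency_s: float
    cost_usd: float
    succeeded: bool

def pick_model(traces: list[Trace]) -> str:
    stats = defaultdict(lambda: [0, 0.0])    # model -> [wins, spend]
    for t in traces:
        stats[t.model][0] += int(t.succeeded)
        stats[t.model][1] += t.cost_usd
    return max(stats, key=lambda m: stats[m][0] / max(stats[m][1], 1e-9))

traces = [
    Trace("frontier-a", 42.0, 0.30, True),
    Trace("frontier-a", 55.0, 0.35, False),
    Trace("small-b", 20.0, 0.05, True),
]
print(pick_model(traces))  # small-b: cheaper per success so far
```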
Early results: higher accuracy than any single frontier model at 2.3× lower cost (Terminal-Bench 2.0, SWE-bench Verified), with 52% lower latency and 45% better memory recall.
Open source under Apache 2.0. Works with any agent SDK. Deploy with one command.
github.com/lithos-ai/motus
lithosai.com


@gajesh Small models and CPU/edge inference are the combo that could take over the majority of the current loads/tasks being performed on GPUs. As long as the task is within a defined scope and repeatable, we can optimize this combo to get pretty good performance per dollar.
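
A toy perf-per-dollar comparison of the kind this reply is gesturing at; all figures are invented assumptions, not benchmarks:

```python
# All numbers are invented assumptions, not benchmarks: a small model on
# commodity CPUs vs. a larger one on a rented GPU, for a fixed,
# repeatable task where the small model is good enough.
cpu_tok_per_s, cpu_cost_per_h = 25, 0.10    # assumed small model on CPU
gpu_tok_per_s, gpu_cost_per_h = 400, 2.50   # assumed big model on GPU

cpu_tok_per_dollar = cpu_tok_per_s * 3600 / cpu_cost_per_h
gpu_tok_per_dollar = gpu_tok_per_s * 3600 / gpu_cost_per_h

print(f"CPU/edge: {cpu_tok_per_dollar:,.0f} tokens per dollar")
print(f"GPU:      {gpu_tok_per_dollar:,.0f} tokens per dollar")
# Under these assumptions the CPU combo wins on tokens/$ whenever the
# task stays inside the small model's scope.
```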
kukeshajanth kodeswaran retweeted

Wake the world's sleeping compute.
Look at the Mac nearest to you. What's it doing?
Probably nothing.
There are 100M+ Macs with Apple Silicon out there. Apple quietly made them *really* good at inference. A $3k Mac runs a 60B model at 30 watts.
Most sit idle most of the day.
Meanwhile every AI API call passes through three layers of margin before reaching the hardware. We call this the Inference Tax.
We got curious: what happens if you connect idle Macs directly to inference demand?
This is Darkbloom. Private inference network for idle Macs.
darkbloom [dot] dev -- paper + code open.
Reply for invite + free credits ↓



