Wootzapp
3.2K posts

Wootzapp
@WootzApp
Home of the w8-RL rollout infrastructure. We built a browser... so you can build environments.

We’re releasing OpenReward, a minimalist product that does one thing really well: serve RL environments at scale. Agentic RL is really painful because it adds a new axis of compute, environment compute, alongside training compute, and that axis needs to scale seamlessly on demand. OpenReward is a narrowly focused product built around this problem. We serve complex agentic environments as minimal API endpoints, which work with any training framework and scale based on use. Our vision is a home for reward on the internet, interoperable with any form of training or evaluation, and ultimately an open-ecosystem alternative to the closed RL vendor market. 🧵
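To make "environments as minimal API endpoints" concrete, here is a hedged sketch of what the training side of such a setup could look like. The endpoint paths, field names, and base URL below are assumptions for illustration, not OpenReward's actual API.

```python
import json

class RemoteEnvClient:
    """Sketch of a trainer-side client for an RL environment hosted
    behind an HTTP API. Each method builds the request a trainer would
    POST; actual route and payload shapes are hypothetical."""

    def __init__(self, base_url, env_name):
        self.base_url = base_url.rstrip("/")
        self.env_name = env_name
        self.episode_id = None  # assigned by the server on reset

    def reset_request(self):
        # One call starts an episode; the server returns the first observation.
        return {
            "url": f"{self.base_url}/envs/{self.env_name}/reset",
            "body": json.dumps({}),
        }

    def step_request(self, action):
        # Each agent action becomes one HTTP call, which is what lets the
        # server side autoscale environment compute independently of training.
        return {
            "url": f"{self.base_url}/envs/{self.env_name}/step",
            "body": json.dumps({"episode_id": self.episode_id, "action": action}),
        }
```

The point of this shape is the decoupling: because the environment lives behind a plain endpoint, any training framework that can issue HTTP calls can consume it, and rollout compute scales with request volume rather than with the trainer's cluster.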


Meta's infrastructure. India's best builders. 48 hours. This is India's biggest agentic RL hackathon. OpenEnv is the new open standard for training AI agents, used by PyTorch, AI at Meta, and Hugging Face. In April, India's best builders get to build on it and have their work reviewed by Meta and HF engineering teams. Meta PyTorch OpenEnv Hackathon × Scaler School of Technology. The best environments get evaluated for inclusion in the OpenEnv global ecosystem.
- Real contribution, not a portfolio piece.
- $30,000 prize pool.
- 48 hours.
- Bangalore.



Introducing OpenReward.
🌍 330+ RL environments through one API
⚡ Autoscaled sandbox compute
🍒 4.5M+ unique RL tasks
🚂 Works like magic with Tinker, Miles, Slime
Link and thread below.




Whichever lab first offers continuous learning / online RL per unique agent for enterprise will absolutely print money. Virtual headcount will become very real for all companies. They could easily charge $5k+ per month per continuous agent.

Inference is becoming central to both RL rollouts and production agents. We chose NVIDIA Dynamo because agentic inference at scale means handling global deployments, long-context reasoning, multi-turn trajectories, sparse MoEs, and large fleets of adapters.

Now that our 15-member LLM team is infamous, time to expand for next time!
If you have done one or more of the following, please reach out.
- pretrained a model of any size, from scratch
- posttrained any base model, end to end (data curation, SFT, RL)
- are a PyTorch wizard
- are a CUDA kernel master
- have any other relevant skills and the work to back it up
firstname