Gobi Dasu

419 posts

@gobidasu

Future of Work · Human-AI · Startups · Travel · Stanford BSCS/MSCS · Northwestern PhD CSEd · https://t.co/T4Mcg2vjOJ · https://t.co/CIzxtsse2B / aka गोविंद

San Francisco, CA · Joined June 2011

458 Following · 761 Followers

Gobi Dasu @gobidasu · Oct 21

@elonmusk @karpathy Do you have any intuition on complexity or entropy here? Wouldn't shifting from a token space to a pixel or photon space increase the input and output spaces significantly?

2.6K

Elon Musk @elonmusk · Oct 21

@karpathy Long-term, >99% of input and output for AI models will be photons. Nothing else scales. grok.com/share/bGVnYWN5…

311

443

3.2K

3.5M

Andrej Karpathy @karpathy · Oct 21

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible, at the input.

Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input, maybe you'd prefer to render it and then feed that in:
- more information compression (see paper) => shorter context windows, more efficiency
- significantly more general information stream => not just text, but e.g. bold text, colored text, arbitrary images.
- input can now be processed with bidirectional attention easily and as default, not autoregressive attention - a lot more powerful.
- delete the tokenizer (at the input)!! I already ranted about how much I dislike the tokenizer. Tokenizers are ugly, separate, not end-to-end stage. It "imports" all the ugliness of Unicode, byte encodings, it inherits a lot of historical baggage, security/jailbreak risk (e.g. continuation bytes). It makes two characters that look identical to the eye look as two completely different tokens internally in the network. A smiling emoji looks like a weird token, not an... actual smiling face, pixels and all, and all the transfer learning that brings along. The tokenizer must go.

OCR is just one of many useful vision -> text tasks. And text -> text tasks can be made to be vision -> text tasks. Not vice versa.

So maybe the User message is images, but the decoder (the Assistant response) remains text. It's a lot less obvious how to output pixels realistically... or if you'd want to.

Now I have to also fight the urge to side quest an image-input-only version of nanochat...
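The point above about characters that look identical to the eye but differ internally can be seen at the Unicode level, before any tokenizer is involved. The following is a minimal sketch using only the Python standard library; the helper name `describe` and the chosen example characters are illustrative assumptions, and it shows only code points and UTF-8 bytes, not the output of any particular tokenizer.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, UTF-8 bytes as hex) for each character in text."""
    return [(f"U+{ord(ch):04X}", ch.encode("utf-8").hex()) for ch in text]


# Two letters that render almost identically in most fonts.
latin_a = "a"           # LATIN SMALL LETTER A
cyrillic_a = "\u0430"   # CYRILLIC SMALL LETTER A

# Two spellings of the same visible glyph.
composed = "\u00e9"     # single code point: e with acute accent
decomposed = "e\u0301"  # letter e followed by COMBINING ACUTE ACCENT

print(describe(latin_a), describe(cyrillic_a))
print(describe(composed), describe(decomposed))

# Normalization (NFC) maps the decomposed form onto the composed one,
# but it does not merge the Latin and Cyrillic letters.
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFC", cyrillic_a) == latin_a)
```

A model that consumes bytes or tokens derived from these strings receives different inputs for glyphs a reader would treat as the same, whereas a rendered image of either would be nearly identical.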

vLLM@vllm_project

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.

🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×.
📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens.
🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale.

🔗 github.com/deepseek-ai/De… #vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

560

1.6K

13.3K

3.3M

Gobi Dasu @gobidasu · Oct 19

Specific insights for bootstrapped founders:
1. Bootstrapped founders trying to scale without burning out need to build human-AI systems since keeping dozens of people on payroll isn't an option.
2. The 'vibe coding' → 'scalable ops' transition is exactly what we're all wrestling with. Establishing a 'system of record' early can avert doing everything through Slack DMs and founder heroics.
3. The trust-but-verify + budget caps approach could be a game-changer for teams that can't afford expensive ops tools yet.

People say you need to hit PMF first, but isn't PMF like any extensive heuristic search? Why wouldn't having a more robust operating model help you reach PMF faster?

What's your biggest bottleneck when trying to professionalize your startup?

Gobi Dasu@gobidasu

Some founders scale humans. Others don't. Here’s the difference we gleaned from ops leaders at Alchemy, Meter, Sonder, and FAANGM.

306

Gobi Dasu @gobidasu · Oct 19

@anishackd Plug: Building Hailcube, human‑AI ops that make all of this automatic. Calendly on my profile.

Gobi Dasu @gobidasu · Oct 19

Some founders scale humans. Others don't. Here’s the difference we gleaned from ops leaders at Alchemy, Meter, Sonder, and FAANGM.

445

Gobi Dasu @gobidasu · Oct 19

@anishackd What's blocking you from scaling vibe-coded visions into scalable human-AI systems?

Gobi Dasu @gobidasu · Oct 19

@anishackd A DAO won't ship a vision; setters and accountable middle managers who update the system of record and align teams will.

Gobi Dasu @gobidasu · Oct 19

@anishackd Tribal knowledge is continuity: retain or document it in the system of record so it's no longer "tribal".

Gobi Dasu @gobidasu · Oct 19

@anishackd Hiring for attention to detail means hiring credible people who will actually OWN keeping the system of record up to date.

Gobi Dasu @gobidasu · Oct 19

@anishackd Onboarding for alignment; traceability for visibility. All orchestrated through the system of record.

Gobi Dasu @gobidasu · Oct 19

@anishackd System of record (ATS→CRM→handbooks) > Slack archaeology.

Gobi Dasu @gobidasu · Oct 19

@anishackd Allow tool freedom; standardize roles, flows, and the system of record, but not apps.

Gobi Dasu @gobidasu · Oct 19

@anishackd Build vs. buy vs. acquire = finance math. VP-level folks should do these. Decision records > vibes.

Gobi Dasu @gobidasu · Oct 19

@anishackd Publish the vision—but silently test edge‑case coverage on delivery.

Gobi Dasu @gobidasu · Oct 19

@anishackd Trust‑but‑verify means fostering a culture of bottom‑up suggestions from ICs, while going authoritative during crunch.

Gobi Dasu @gobidasu · Oct 10

Thanks John, and congrats on building Kolega across many tiers/use cases. We specifically have built a self-running AI Kanban board connected to a vetted tech talent network. The traceability aspect is with respect to the human actions in this context. In a nutshell, the system can take in a high-level vision, assign subtasks with suggested “prompts to use” to vetted human ICs, and coordinate peer reviews amongst those ICs. The ICs themselves use AI tools. Every AI-coordinated task assignment, reassignment, extension request, and peer review is logged, linked, and timestamped in the Kanban board interface, allowing the visionary to oversee the instrumented actions of many fractional human ICs of varying org affinity before even having the money to keep them on full-time payroll.
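One way to picture "logged, linked, and timestamped" is an append-only event log keyed by task. The sketch below is purely illustrative and is not the product's implementation: the names `Event`, `AuditLog`, `record`, and `history`, the event kinds, and the task and actor identifiers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One immutable entry in the log (hypothetical schema)."""
    event_id: int
    task_id: str
    kind: str                        # e.g. "assignment", "peer_review"
    actor: str                       # who or what triggered the event
    timestamp: datetime              # when it was recorded (UTC)
    parent_id: Optional[int] = None  # link to the event this one follows


class AuditLog:
    """Append-only log: events are added, never edited or removed."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def record(self, task_id: str, kind: str, actor: str,
               parent_id: Optional[int] = None) -> Event:
        event = Event(
            event_id=len(self._events) + 1,
            task_id=task_id,
            kind=kind,
            actor=actor,
            timestamp=datetime.now(timezone.utc),
            parent_id=parent_id,
        )
        self._events.append(event)
        return event

    def history(self, task_id: str) -> list[Event]:
        """Return every event for a task, in the order it was recorded."""
        return [e for e in self._events if e.task_id == task_id]


# Usage: an assignment followed by a peer review linked back to it.
log = AuditLog()
assigned = log.record("T-1", "assignment", "ai_coordinator")
review = log.record("T-1", "peer_review", "ic_bob", parent_id=assigned.event_id)
print([e.kind for e in log.history("T-1")])
```

Keeping events immutable and linking each one to its predecessor is what lets an overseer reconstruct how a task moved between people without relying on chat history.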

John Pellew @JohnWPellew · Oct 10

@gobidasu Love that you spent the week building a human-AI system to fix "vibe coding" with a traceable, reliable HITL workflow. Quick question, how are you surfacing that traceability to engineers today so they can trust and act on suggestions? John Pellew, CTO @ Kolega

Gobi Dasu @gobidasu · Oct 10

Summer wrap w/ #SFTechWeek: Malcolm Gladwell × Jay Gambetta, SPC, a16z + IBM meetups, ERA NYC, Stanford/Harker alumni events, hackathons, and a drone show. Spent most of our time heads-down on a human-AI system that fixes vibe coding with a traceable, reliable HITL workflow. DM "aipm" to join the waitlist.

259

Gobi Dasu @gobidasu · Jun 28

@ananyachdh I'd like to try this.

Ananya Chadha @ananyachdh · Mar 12

It's official — we are excited to launch Quander.ai in the world. GENERATE A 1-MINUTE MOVIE FROM A MINOR PROMPT. Our AI agent assembles the entire video for you, in your video timeline, and you can manually make changes, if you’d like. It's incredible to see our first users have made videos with their friends as main characters, product ads, faith stories, music videos, fanfictions, book trailers, and more. 🧵 Try it today @quanderAI. If you’ve made it here, we have free credits for you 👇

681

227

2.4K

300.9K

Gobi Dasu @gobidasu · Feb 2

🚀 Important Update: Dear customers and talent, due to high demand, we're actively automating our processes to serve you faster and more efficiently! Thank you for your patience—we're excited to connect with you soon. Stay tuned! 🔥

134

Gobi Dasu @gobidasu · Oct 28

Discover how AI and globalization are reshaping the American Dream, offering financial freedom and creativity without borders! 🌍💡 Dive in: ldtalentwork.com #FutureOfWork #Innovation tinyurl.com/yc4wzpds

114

Gobi Dasu @gobidasu · Oct 16

Unlock the power of global talent! 🌍 Overcome common myths about international hiring and tap into a world of skilled professionals. Discover more at ldtalentwork.com #TalentWithoutBorders tinyurl.com/5vv628wz
