pjgo
@pjgo

92 posts

California, USA · Joined March 2008
586 Following · 57 Followers
pjgo
pjgo@pjgo·
Last year, a gaming company shared a major frustration with us: they couldn't get the capacity they needed from their frontier model providers. We took that challenge to heart.

Today at #GoogleNext, I'm thrilled to announce that Gemini Pro and Gemini Flash are now available for preview on our inference platform. 🚀

This isn't just another API integration. These are sovereign deployments, offering a trifecta of benefits for enterprise scale:
🔒 Privacy: Your data stays within your dedicated environment.
⚡ Performance: Reliable, guaranteed throughput without the "noisy neighbor" effect.
📍 Proximity: Low-latency processing right next to your data sources.

Stop by and see the future of sovereign AI in action!
📍 Find us at #GoogleNext Booth 7713
cirrascale.com/google
0 replies · 0 retweets · 0 likes · 97 views
pjgo retweeted
Tyler Stalman
Tyler Stalman@stalman·
The moment I realized I was going to need to find a tougher test for the MacBook Neo 😳
362 replies · 1.1K retweets · 21.9K likes · 8.5M views
Nathan Benaich
Nathan Benaich@nathanbenaich·
Memorial Day 🌇
3 replies · 0 retweets · 10 likes · 1.6K views
pjgo retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
We're missing (at least one) major paradigm for LLM learning. Not sure what to call it; possibly it has a name: system prompt learning?

Pretraining is for knowledge. Finetuning (SL/RL) is for habitual behavior. Both of these involve a change in parameters, but a lot of human learning feels more like a change in system prompt. You encounter a problem, figure something out, then "remember" something in fairly explicit terms for the next time, e.g. "It seems when I encounter this and that kind of problem, I should try this and that kind of approach/solution." It feels more like taking notes for yourself, i.e. something like the "Memory" feature, but used not to store per-user random facts, but general/global problem-solving knowledge and strategies.

LLMs are quite literally like the guy in Memento, except we haven't given them their scratchpad yet. Note that this paradigm is also significantly more powerful and data efficient, because a knowledge-guided "review" stage is a much higher-dimensional feedback channel than a reward scalar.

I was prompted to jot down this shower of thoughts after reading through Claude's system prompt, which currently seems to be around 17,000 words, specifying not just basic behavior style/preferences (e.g. refuse various requests related to song lyrics) but also a large amount of general problem-solving strategy, e.g.:

"If Claude is asked to count words, letters, and characters, it thinks step by step before answering the person. It explicitly counts the words, letters, or characters by assigning a number to each. It only answers the person once it has performed this explicit counting step."

This is to help Claude solve the "how many r's in strawberry" class of problem. Imo this is not the kind of problem-solving knowledge that should be baked into weights via reinforcement learning, or at least not immediately/exclusively. And it certainly shouldn't come from human engineers writing system prompts by hand.

It should come from system prompt learning, which resembles RL in the setup, with the exception of the learning algorithm (edits vs. gradient descent). A large section of the LLM system prompt could be written via system prompt learning; it would look a bit like the LLM writing a book for itself on how to solve problems. If this works, it would be a new and powerful learning paradigm, with a lot of details left to figure out (how do the edits work? can/should you learn the edit system? how do you gradually move knowledge from the explicit system text into habitual weights, as humans seem to do? etc.).
715 replies · 1K retweets · 10.4K likes · 1.5M views
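The counting strategy quoted from Claude's system prompt can be written out literally as a few lines of Python (the function name is illustrative, not from the tweet):

```python
# Explicit counting, as the quoted system prompt instructs: assign a
# number to each character, and answer only after the counting step.
def count_char(text, target):
    count = 0
    for position, ch in enumerate(text, start=1):  # number each character
        if ch == target:
            count += 1
    return count

count_char("strawberry", "r")  # -> 3
```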
pjgo
pjgo@pjgo·
@geoffwolfe Wow. Is that a multi-billion dollar property?
2 replies · 0 retweets · 0 likes · 68 views
pjgo retweeted
Ashlee Vance
Ashlee Vance@ashleevance·
The world's richest human and the secretary of transportation just casually hammering out disaster relief on X. What a bizarrely sane moment in the midst of our long national nightmare that is an election year
524 replies · 2.3K retweets · 38.8K likes · 2.4M views
pjgo retweeted
Dan Nystedt
Dan Nystedt@dnystedt·
Foxconn will begin mass producing Nvidia GB200 servers in November, with shipments in December, according to a media report citing company spokesman James Wu, who said Foxconn will display a GB200 server at its HHTD (tech days event) Oct. 8-9. Foxconn will supply both NVL36 and NVL72 servers, but expects demand to center on the NVL72. Aside from the GPUs, around 80% of the other components in the servers are made by Foxconn (trade name of Hon Hai Precision Industry). Foxconn is the global leader in AI server production. $NVDA #AIserver #server #Foxconn #semiconductors news.cnyes.com/news/id/5729067
2 replies · 23 retweets · 117 likes · 18.3K views
pjgo
pjgo@pjgo·
@drorpoleg @nbaschez This: “Bring a carrier so you can wear him/her and walk around.” Baby can also sleep while in the carrier.
0 replies · 0 retweets · 2 likes · 103 views
Dror Poleg
Dror Poleg@drorpoleg·
@nbaschez Bring a car seat. Fly at night. Breastfeed/bottle at take off and landing. Bring a carrier so you can wear him/her and walk around.
1 reply · 0 retweets · 4 likes · 509 views
Nathan Baschez
Nathan Baschez@nbaschez·
Parents: any tips for handling long international flights with a 10-month-old?
80 replies · 0 retweets · 44 likes · 66.4K views
pjgo retweeted
Cerebras
Cerebras@cerebras·
🎉 Exciting news! Today we are releasing Cerebras-GPT, a family of 7 GPT models from 111M to 13B parameters trained using the Chinchilla formula. These are the most accurate models for a given compute budget, and they are available today, open-source! (1/5) Press: businesswire.com/news/home/2023…
31 replies · 317 retweets · 1.3K likes · 503.6K views
pjgo retweeted
Lior Alexander
Lior Alexander@LiorOnAI·
This might be the most eventful week AI has ever seen:

Monday:
- Stanford Alpaca 7B

Tuesday:
- GPT-4
- Anthropic releases Claude
- Google's PaLM API
- AdeptAI raises $350M
- Google adds GenAI to Workspace

Wednesday:
- PyTorch 2.0
- Midjourney V5

Thursday:
- Microsoft 365 Copilot
78 replies · 906 retweets · 4.2K likes · 898.2K views
pjgo retweeted
Tiernan Ray
Tiernan Ray@TiernanRayTech·
If you’re working on large language models on a budget, Cerebras and Cirrascale offer a cloud service starting at $2,500. Large language models are about to become a giant commercial phenomenon, says Cerebras CEO Feldman. // @cerebras // #AI #deeplearning #machinelearning
ZDNET@ZDNET

AI challenger Cerebras unveils 'pay-per-model' large-model AI cloud service with Cirrascale, Jasper zd.net/3ELDHYl by @TiernanRayTech

0 replies · 1 retweet · 2 likes · 0 views
pjgo retweeted
Sebastian Raschka
Sebastian Raschka@rasbt·
Utilizing multiple GPUs for deep learning? My workflow is usually as follows:
1. Implement: tinker and debug on the CPU
2. Test: try the first run on a single GPU
3. Run wide: many experiments on many GPUs
4. Hone in: run a few bigger, promising experiments via multi-GPU
11 replies · 26 retweets · 254 likes · 0 views
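The four steps above can be sketched as a simple driver loop. Everything here is hypothetical: `train` is a stand-in that just records which device(s) a config ran on, and the ranking key is a placeholder for a real validation metric.

```python
# Hypothetical sketch of the four-step multi-GPU workflow above.
def train(config, devices):
    # Stand-in for a real training run; "score" is a placeholder metric.
    return {"config": config, "devices": devices, "score": config["lr"]}

def workflow(configs, n_gpus=4):
    train(configs[0], devices=["cpu"])        # 1. Implement: debug on CPU
    train(configs[0], devices=["gpu:0"])      # 2. Test: first single-GPU run
    sweep = [train(c, devices=[f"gpu:{i % n_gpus}"])   # 3. Run wide: one
             for i, c in enumerate(configs)]           #    GPU per experiment
    best = sorted(sweep, key=lambda r: r["score"])[:2] # 4. Hone in: rerun the
    return [train(r["config"],                         #    most promising few
                  devices=[f"gpu:{i}" for i in range(n_gpus)])
            for r in best]                             #    across all GPUs
```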
pjgo retweeted
Exa
Exa@ExaAILabs·
metaphor.systems is now publicly available! Metaphor is a search engine based on generative AI, the same sort of techniques behind DALL-E 2 and GPT-3. 1/
73 replies · 543 retweets · 2.7K likes · 0 views