pjgo
@pjgo

92 posts

California, USA · Joined March 2008
586 Following · 57 Followers
pjgo
pjgo@pjgo·
Last year, a gaming company shared a major frustration with us: they couldn't get the capacity they needed from their frontier model providers. We took that challenge to heart.

Today at #GoogleNext, I'm thrilled to announce that Gemini Pro and Gemini Flash are now available for preview on our inference platform. 🚀

This isn't just another API integration. These are sovereign deployments, offering a trifecta of benefits for enterprise scale:
🔒 Privacy: Your data stays within your dedicated environment.
⚡ Performance: Reliable, guaranteed throughput without the "noisy neighbor" effect.
📍 Proximity: Low-latency processing right next to your data sources.

Stop by and see the future of sovereign AI in action!
📍 Find us at #GoogleNext Booth 7713
cirrascale.com/google
0 replies · 0 retweets · 0 likes · 97 views
pjgo retweeted
Tyler Stalman
Tyler Stalman@stalman·
The moment I realized I was going to need to find a tougher test for the MacBook Neo 😳
362 replies · 1.1K retweets · 21.9K likes · 8.5M views
Nathan Benaich
Nathan Benaich@nathanbenaich·
Memorial Day 🌇
3 replies · 0 retweets · 10 likes · 1.6K views
pjgo retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
We're missing (at least one) major paradigm for LLM learning. Not sure what to call it; possibly it has a name: system prompt learning?

Pretraining is for knowledge. Finetuning (SL/RL) is for habitual behavior. Both of these involve a change in parameters, but a lot of human learning feels more like a change in system prompt. You encounter a problem, figure something out, then "remember" something in fairly explicit terms for the next time, e.g. "It seems when I encounter this and that kind of problem, I should try this and that kind of approach/solution." It feels more like taking notes for yourself, i.e. something like the "Memory" feature, but used not to store per-user random facts, but general/global problem-solving knowledge and strategies.

LLMs are quite literally like the guy in Memento, except we haven't given them their scratchpad yet. Note that this paradigm is also significantly more powerful and data efficient, because a knowledge-guided "review" stage is a much higher-dimensional feedback channel than a reward scalar.

I was prompted to jot down this shower of thoughts after reading through Claude's system prompt, which currently seems to be around 17,000 words, specifying not just basic behavior style/preferences (e.g. refuse various requests related to song lyrics) but also a large amount of general problem-solving strategy, e.g.:

"If Claude is asked to count words, letters, and characters, it thinks step by step before answering the person. It explicitly counts the words, letters, or characters by assigning a number to each. It only answers the person once it has performed this explicit counting step."

This is to help Claude solve the "how many r's in strawberry" class of problem. Imo this is not the kind of problem-solving knowledge that should be baked into weights via reinforcement learning, or at least not immediately/exclusively. And it certainly shouldn't come from human engineers writing system prompts by hand.

It should come from system prompt learning, which resembles RL in the setup, with the exception of the learning algorithm (edits vs. gradient descent). A large section of the LLM system prompt could be written via system prompt learning; it would look a bit like the LLM writing a book for itself on how to solve problems. If this works, it would be a new and powerful learning paradigm, with a lot of details left to figure out (how do the edits work? can/should you learn the edit system? how do you gradually move knowledge from the explicit system text into habitual weights, as humans seem to do? etc.).
715 replies · 1K retweets · 10.4K likes · 1.5M views
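The counting strategy quoted from Claude's system prompt can be written out literally as a few lines of Python (the function name is illustrative, not from the tweet):

```python
# Explicit counting, as the quoted system prompt instructs: assign a
# number to each character, and answer only after the counting step.
def count_char(text, target):
    count = 0
    for position, ch in enumerate(text, start=1):  # number each character
        if ch == target:
            count += 1
    return count

count_char("strawberry", "r")  # -> 3
```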
pjgo
pjgo@pjgo·
@geoffwolfe Wow. Is that a multi-billion dollar property?
2 replies · 0 retweets · 0 likes · 68 views
pjgo retweeted
Ashlee Vance
Ashlee Vance@ashleevance·
The world's richest human and the secretary of transportation just casually hammering out disaster relief on X. What a bizarrely sane moment in the midst of our long national nightmare that is an election year
524 replies · 2.3K retweets · 38.8K likes · 2.4M views
pjgo retweeted
Dan Nystedt
Dan Nystedt@dnystedt·
Foxconn will begin mass producing Nvidia GB200 servers in November, with shipments in December, according to a media report citing company spokesman James Wu, who said Foxconn will display a GB200 server at its HHTD (tech days event) Oct. 8-9. Foxconn will supply both NVL36 and NVL72 servers, but expects demand to center on the NVL72. Aside from the GPUs, around 80% of the other components in the servers are made by Foxconn (trade name of Hon Hai Precision Industry). Foxconn is the global leader in AI server production. $NVDA #AIserver #server #Foxconn #semiconductors news.cnyes.com/news/id/5729067
2 replies · 23 retweets · 117 likes · 18.3K views
pjgo
pjgo@pjgo·
@drorpoleg @nbaschez This: “Bring a carrier so you can wear him/her and walk around.” Baby can also sleep while in the carrier.
0 replies · 0 retweets · 2 likes · 103 views
Dror Poleg
Dror Poleg@drorpoleg·
@nbaschez Bring a car seat. Fly at night. Breastfeed/bottle at take off and landing. Bring a carrier so you can wear him/her and walk around.
1 reply · 0 retweets · 4 likes · 509 views
Nathan Baschez
Nathan Baschez@nbaschez·
Parents: any tips for handling long international flights with a 10-month-old?
80 replies · 0 retweets · 44 likes · 66.4K views
pjgo retweeted
Cerebras
Cerebras@cerebras·
🎉 Exciting news! Today we are releasing Cerebras-GPT, a family of 7 GPT models from 111M to 13B parameters trained using the Chinchilla formula. These are the most accurate models for a given compute budget, and they are available today, open-source! (1/5) Press: businesswire.com/news/home/2023…
31 replies · 317 retweets · 1.3K likes · 503.6K views
pjgo retweeted
Lior Alexander
Lior Alexander@LiorOnAI·
This might be the most eventful week AI has ever seen:

Monday:
- Stanford Alpaca 7B

Tuesday:
- GPT-4
- Anthropic releases Claude
- Google's PaLM API
- AdeptAI raises $350M
- Google adds GenAI to Workspace

Wednesday:
- PyTorch 2.0
- Midjourney V5

Thursday:
- Microsoft 365 Copilot
78 replies · 906 retweets · 4.2K likes · 898.2K views
pjgo retweeted
Tiernan Ray
Tiernan Ray@TiernanRayTech·
If you’re working on large language models on a budget, Cerebras and Cirrascale offer a cloud service starting at $2,500. Large language models are about to become a giant commercial phenomenon, says Cerebras CEO Feldman. // @cerebras // #AI #deeplearning #machinelearning
ZDNET@ZDNET

AI challenger Cerebras unveils 'pay-per-model' large-model AI cloud service with Cirrascale, Jasper zd.net/3ELDHYl by @TiernanRayTech

0 replies · 1 retweet · 2 likes · 0 views
pjgo retweeted
Sebastian Raschka
Sebastian Raschka@rasbt·
Utilizing multiple GPUs for deep learning? My workflow is usually as follows:
1. Implement: tinker and debug on the CPU
2. Test: try the first run on a single GPU
3. Run wide: many experiments on many GPUs
4. Hone in: run a few bigger, promising experiments via multi-GPU
11 replies · 26 retweets · 254 likes · 0 views
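The four steps above can be sketched as a simple driver loop. Everything here is hypothetical: `train` is a stand-in that just records which device(s) a config ran on, and the ranking key is a placeholder for a real validation metric.

```python
# Hypothetical sketch of the four-step multi-GPU workflow above.
def train(config, devices):
    # Stand-in for a real training run; "score" is a placeholder metric.
    return {"config": config, "devices": devices, "score": config["lr"]}

def workflow(configs, n_gpus=4):
    train(configs[0], devices=["cpu"])        # 1. Implement: debug on CPU
    train(configs[0], devices=["gpu:0"])      # 2. Test: first single-GPU run
    sweep = [train(c, devices=[f"gpu:{i % n_gpus}"])   # 3. Run wide: one
             for i, c in enumerate(configs)]           #    GPU per experiment
    best = sorted(sweep, key=lambda r: r["score"])[:2] # 4. Hone in: rerun the
    return [train(r["config"],                         #    most promising few
                  devices=[f"gpu:{i}" for i in range(n_gpus)])
            for r in best]                             #    across all GPUs
```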
pjgo retweeted
Exa
Exa@ExaAILabs·
metaphor.systems is now publicly available! Metaphor is a search engine based on generative AI, the same sort of techniques behind DALL-E 2 and GPT-3. 1/
73 replies · 543 retweets · 2.7K likes · 0 views