agibsonccc

10.3K posts

@agibsonccc

Founder @KonduitAI · Maintainer of Eclipse Deeplearning4j · Building https://t.co/GDg4JPHYso · #MLOps · YC W16

Tokyo-to, Japan · Joined April 2011
2.4K Following · 3K Followers
agibsonccc retweeted
Yifei Hu @hu_yifei
This is crazy! Especially for a 580M-parameter model. The benchmark scores are not SOTA, but it's not fair to compare it with models that are 5-10x larger. This is a great exploration of a multi-purpose OCR model. GOT-OCR paper: arxiv.org/abs/2409.01704
[image]
4 replies · 20 retweets · 177 likes · 11.1K views
agibsonccc retweeted
Soumith Chintala @soumithchintala
Why do 16k GPU jobs fail? The Llama3 paper has many cool details -- notably, a huge infrastructure section that covers how we parallelize, keep things reliable, etc. We hit an overall 90% effective training time. ai.meta.com/research/publi…
[image]
32 replies · 202 retweets · 1.3K likes · 380.5K views
agibsonccc retweeted
Jeremy Howard @jeremyphoward
You should probably stop whatever you're doing and watch this right now, because it's amazing. youtube.com/watch?v=TtVJ4J…
[YouTube video]
55 replies · 282 retweets · 1.1K likes · 262.1K views
agibsonccc retweeted
Daniel Jeffries @Dan_Jeffries1
Lots of absurd takes like this on the superalignment team leaving OpenAI. The more likely reason they left is not that Ilya and Jan saw some super-advanced AI emerging that they couldn't handle, but that they didn't. As the cognitive dissonance hit, OpenAI and other practical teams building real-world AI are realizing this fantasy of superintelligent machines rising up and getting out of control is a waste of time, money, and resources. So they slowly, and correctly, starved that team of compute that could be used for more useful things, like building capabilities into their products, which is what AIs are: products.
[image]
300 replies · 375 retweets · 3.2K likes · 1M views
agibsonccc retweeted
Jeremy Howard @jeremyphoward
There's a new bill, SB-1047 "Safe and Secure Innovation for Frontier Artificial Intelligence Models Act". I think it could do a great deal of harm to startups, American innovation, open source, and safety. So I've written a response to the authors: 🧵 answer.ai/posts/2024-04-…
34 replies · 277 retweets · 1.1K likes · 304.6K views
agibsonccc retweeted
Christopher Manning @chrmanning
One of the simplest but most useful and appropriate pieces of AI regulation to adopt at the moment is to require model providers to document the training data they used. This is something that the @EU_Commission AI Act gets right … on p.62 of its 272 pages (!).
[image]
Brian Merchant @bcmerchant

So when *the CTO* of OpenAI is asked if Sora was trained on YouTube videos, she says “actually I’m not sure” and refuses to discuss all further questions about the training data. Either a rather stunning level of ignorance of her own product, or a lie—pretty damning either way!

12 replies · 59 retweets · 317 likes · 49.2K views
agibsonccc retweeted
Jeremy Howard @jeremyphoward
How to write CUDA on AMD.
[image]
16 replies · 60 retweets · 699 likes · 77.8K views
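The media in this post isn't preserved, so the exact method Jeremy shows is unknown. One common route to the same end: PyTorch's ROCm builds expose the familiar torch.cuda API and, as I understand it, automatically "hipify" CUDA C++ passed to torch.utils.cpp_extension, so a plain CUDA kernel can target AMD hardware unchanged. A minimal sketch (the add_one kernel and extension name are illustrative, not from the post):

```python
# Sketch: a CUDA kernel compiled inline with PyTorch. On a ROCm build of
# PyTorch the same source is hipified and runs on AMD GPUs; "cuda" as a
# device string maps to the AMD GPU there.
import torch
from torch.utils.cpp_extension import load_inline

cuda_src = r"""
__global__ void add_one_kernel(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

torch::Tensor add_one(torch::Tensor x) {
    int n = x.numel();
    int threads = 256;
    add_one_kernel<<<(n + threads - 1) / threads, threads>>>(
        x.data_ptr<float>(), n);
    return x;
}
"""

mod = load_inline(
    name="add_one_ext",
    cpp_sources="torch::Tensor add_one(torch::Tensor x);",
    cuda_sources=cuda_src,
    functions=["add_one"],
)

x = torch.zeros(1024, device="cuda")
print(mod.add_one(x)[:4])  # tensor([1., 1., 1., 1.], device='cuda:0')
```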
agibsonccc retweeted
Christopher Manning @chrmanning
I do not believe human-level AI (artificial superintelligence, or the commonest sense of #AGI) is close at hand. AI has made breakthroughs, but the claim of AGI by 2030 is as laughable as claims of AGI by 1980 are in retrospect. Look how similar the rhetoric was in @LIFE in 1970!
[image]
113 replies · 355 retweets · 1.6K likes · 387.2K views
agibsonccc retweeted
Delip Rao e/σ @deliprao
Not your model, not your GPTs. All those folks rushing to add stuff to the GPT store are writing free functions for another OpenAI LLM that they will brand as “AGI”. Almost free labor extraction. I will bet that, for most folks, the revenue share will be pennies. Folks who think this is an iOS App Store moment are deluded.
Patrick Blumenthal @PatrickJBlum

EpsteinGPT has been officially banned. Why?

28 replies · 44 retweets · 355 likes · 114.3K views
agibsonccc retweeted
Alex J. Champandard 🌱
🙋: Why didn't the authors collaborate w/ LAION? The approach LAION took to filtering is not standard and cannot be considered serious or professional in any way. Many actions they took in this unusual process fall under the Criminal Code. This makes collaboration... suboptimal.
1 reply · 3 retweets · 51 likes · 22K views
agibsonccc retweeted
Stella Biderman @BlancheMinerva
TIL: @GoogleAI's 1.6T parameter mixture-of-experts encoder-decoder model is available under an Apache 2.0 license! Trained on public data too.
11 replies · 60 retweets · 484 likes · 136.8K views
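Assuming this refers to the Switch Transformers release (the 1.6T-parameter checkpoint is published on the Hugging Face Hub as google/switch-c-2048, alongside much smaller siblings), a sketch of loading one with the transformers library; the base-8 variant is used here only because the full model is far too large for a single machine:

```python
# Sketch: loading a small Switch Transformers checkpoint, an Apache-2.0
# mixture-of-experts encoder-decoder in the T5 family.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

tok = AutoTokenizer.from_pretrained("google/switch-base-8")
model = SwitchTransformersForConditionalGeneration.from_pretrained("google/switch-base-8")

# T5-style span-corruption prompt; the model fills in the sentinel token.
inputs = tok("The capital of France is <extra_id_0>.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=8)
print(tok.decode(out[0], skip_special_tokens=False))
```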
agibsonccc retweeted
Delip Rao e/σ @deliprao
I sent this to a reporter in response to a query yesterday when things didn't look this crazy. Now that it is clear the future of OpenAI is uncertain, we should encourage all companies to build on resilient AI technology that only open source can offer.
[image]
5 replies · 12 retweets · 79 likes · 22.5K views
agibsonccc retweeted
anton @abacaj
“But but OpenAI models are so much better” - try fine-tuning some open-source models first, like this:
Adithyan @adithyan_ai

I burned 🔥 $2,000 on fine-tuning so you don't have to. I fine-tuned models with the @OpenAI and @anyscalecompute API endpoints with 50 million tokens. Here are the results I wish I had known before getting into fine-tuning. If you just want a quick snapshot, look at the figure; a longer explanation of my findings follows.

I am not an expert and not deep into the theory of AI models. I just want to get the BEST model performance at the CHEAPEST possible price for my USE-CASE, and to quickly deploy that to prod. I picked one specific, simple use-case: summarizing text in a very specific tone, voice, and structure. I trained both models with close to 50M tokens (~37M words).

In short:
- Anyscale costs 40x less to fine-tune.
- Anyscale costs 56x less to fine-tune.

Comparing the outputs, I get on-par performance from fine-tuned llama-13b and fine-tuned gpt-3.5. Fine-tuning smaller models is clearly the way to go for simpler use-cases! I don't understand OpenAI's offering for fine-tuning here. They need to step up their game: either reduce the price or offer enough flexibility to compete with open-source fine-tuning.

I am going to run another experiment with a far more complicated use-case. It will be interesting to see who wins there. I suspect @OpenAI Turbo will have an edge (otherwise the pricing does not make sense).

P.S.: I also know I can fine-tune models locally and directly, without an API. Like I said, I am not deep into the theory yet. I tried this on @huggingface with their auto-train framework, but it was just not as easy as plugging in via API calls. There were adapters and stuff, and I quickly got lost. I am reading up, though, and will try to start including local runs in the comparisons too. If anyone knows of other managed (or otherwise) solutions for fine-tuning, please let me know.

7 replies · 34 retweets · 285 likes · 59.9K views
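For reference, the hosted fine-tuning flow the thread compares looks roughly like this. A minimal sketch using the OpenAI Python client; the "train.jsonl" file name and its contents are placeholders, not from the thread:

```python
# Sketch: kicking off a hosted fine-tune with the OpenAI Python client.
# "train.jsonl" is a placeholder file of chat-formatted examples, one JSON
# object per line, e.g.:
# {"messages": [{"role": "user", "content": "..."},
#               {"role": "assistant", "content": "..."}]}
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training data, then create the fine-tuning job against it.
train_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=train_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)  # poll later with client.fine_tuning.jobs.retrieve(job.id)
```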
agibsonccc retweeted
Andrew Ng @AndrewYNg
Attending @geoffreyhinton's retirement celebration at Google with old friends. Thank you for everything you've done for AI! @JeffDean @quocleix
[image]
67 replies · 223 retweets · 3.8K likes · 854.3K views