

agibsonccc (@agibsonccc)
Founder @KonduitAI Maintainer Eclipse Deeplearning4j Building https://t.co/GDg4JPHYso - #MLOps YC W16



In 2012, CUDA was very important: you couldn't build anything without it. In 2024, 90% of AI developers are actually web developers – and they build off Llama, not CUDA.
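To make the "build off Llama, not CUDA" point concrete, here is a minimal sketch of what that workflow looks like in practice: a plain HTTP call to an OpenAI-compatible inference server (such as llama.cpp's `llama-server` or Ollama) hosting a Llama model. The base URL and model name are placeholder assumptions, not a real deployment.

```python
# A minimal sketch, assuming a local OpenAI-compatible Llama server
# (e.g. llama.cpp's `llama-server`, or Ollama). The URL and model
# name below are placeholders, not a real deployment.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local endpoint
    json={
        "model": "llama-3-8b-instruct",  # placeholder model name
        "messages": [
            {"role": "user", "content": "Explain CUDA in one sentence."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

No CUDA kernels, no GPU programming: just JSON over HTTP, which is exactly the workflow a web developer already knows.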


Great article by @Priyasideas on increasing opposition to SB 1047. sfstandard.com/2024/08/09/may…

So when *the CTO* of OpenAI is asked whether Sora was trained on YouTube videos, she says “actually I’m not sure” and declines to answer any further questions about the training data. Either a rather stunning level of ignorance of her own product, or a lie—pretty damning either way!

New Resource: Foundation Model Development Cheatsheet for best practices

We compiled 250+ resources & tools for:
🔭 sourcing data
🔍 documenting & audits
🌴 environmental impact
☢️ risks & harms eval
🌍 release & monitoring

With experts from @AiEleuther, @allen_ai, @huggingface, @StanfordCRFM, @PrincetonCITP, @MasakhaneNLP, @MIT++

🔗 fmcheatsheet.org

1/

Thankfully, ML/AI research is largely free of the commercial publishing stranglehold. Preprints are posted on arXiv and OpenReview, short papers appear in top conferences like ICLR, NeurIPS, ICML, and a few others, and longer papers in JMLR and TMLR. All of these venues are open access and free for both readers and authors. arXiv and OpenReview are supported by philanthropy (they need more), conferences by registration fees, and journals by essentially nothing (it costs very little to run an online journal). The exceptions are the few folks who still think that publishing in Nature Machine Intelligence, Machine Learning Journal, and other for-profit journals is a good idea (looking at you, DeepMind 😠). Not surprisingly, many Nature MI papers are about hardware or applications of AI to the sciences, topics that are not well covered by "core" ML venues.

EpsteinGPT has been officially banned. Why?

I just finished a two-day company quarterly strategy meeting. I haven’t missed anything, have I? Satya is still CEO at MSFT, right?


I burned 🔥$2,000 on fine-tuning so you don't have to.

I fine-tuned models through the @OpenAI and @anyscalecompute API endpoints with 50 million tokens. Here are the results I wish I had known before getting into fine-tuning.

If you just want a quick snapshot, look at the figure. A longer explanation of my findings follows.

I am not an expert and not deep into the theory of AI models. I just want to get the BEST model performance at the CHEAPEST possible price for my USE-CASE, and to deploy it to prod quickly.

I picked one specific, simple USE-CASE: summarizing text in a very specific tone, voice, and structure. I trained both models with close to 50M tokens (~37M words).

In short:
- Anyscale costs 40x less to fine-tune.
- Anyscale costs 56x less to fine-tune.

Comparing the outputs, I get on-par performance from fine-tuned llama-13b and fine-tuned gpt-3.5. Fine-tuning smaller models is clearly the way to go for simpler use-cases!

I don't understand OpenAI's fine-tuning offering here. They need to step up their game: either reduce the price or offer the flexibility to compete with open-source fine-tuned models.

I am going to run another experiment with a much more complicated use-case. It will be interesting to see who wins there. I suspect @OpenAI Turbo will have an edge (otherwise the pricing does not make sense).

P.S.: I also know I can fine-tune models locally and directly, without an API. Like I said, I am not deep into the theory yet. I tried this on @huggingface with their AutoTrain framework, but it was just not as easy as plugging in via API calls. There were adapters and stuff, and I quickly got lost. But I am reading up and will start including those in the comparisons too. If anyone knows of other managed (or otherwise) solutions for fine-tuning, please let me know.
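For anyone wondering what "plugging in via API calls" looks like, here is a minimal sketch of a fine-tuning job using OpenAI's official Python client (v1.x). The training file name, base model, and epoch count are illustrative assumptions, not the author's actual configuration.

```python
# Minimal sketch of fine-tuning via the OpenAI API (python client v1.x).
# The training file, model name, and epoch count are illustrative
# assumptions; they are not the thread author's actual setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1) Upload a JSONL file of chat-formatted training examples, one per line:
#    {"messages": [{"role": "user", "content": "..."},
#                  {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("summaries_train.jsonl", "rb"),
    purpose="fine-tune",
)

# 2) Launch the fine-tuning job.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
    hyperparameters={"n_epochs": 3},
)
print(job.id, job.status)

# 3) In practice you would poll until the job finishes; once it succeeds,
#    the resulting model is called like any other chat model.
done = client.fine_tuning.jobs.retrieve(job.id)
if done.status == "succeeded":
    reply = client.chat.completions.create(
        model=done.fine_tuned_model,
        messages=[{"role": "user", "content": "Summarize: ..."}],
    )
    print(reply.choices[0].message.content)
```

Anyscale Endpoints exposed a largely OpenAI-compatible API, so roughly the same client code could be pointed at either provider by swapping the base URL and model name, which is what makes side-by-side cost comparisons like this one straightforward.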

