
After reading the @nytimes lawsuit against @OpenAI and @Microsoft, I find my sympathies more with OpenAI and Microsoft than with the NYT. The suit:

(1) Claims, among other things, that OpenAI and Microsoft used millions of copyrighted NYT articles to train their models.

(2) Gives examples in which OpenAI models regurgitated NYT articles almost verbatim.

But the presentation muddies (1) and (2), and I saw a lot of commentary on social media that -- because of what I believe is a muddied presentation -- draws a link between them that I'm not sure is what people think it is.

On (1): I understand why media companies don't like people training on their documents, but believe that just as humans are allowed to read documents on the open internet, learn from them, and synthesize brand new ideas, AI should be allowed to do so too. I would like to see training on the public internet covered under fair use -- society will be better off this way -- though whether it actually is will ultimately be up to legislators and the courts.

On (2): I suspect a lot of the examples of ChatGPT regurgitating articles nearly verbatim were due to a RAG-like mechanism where the user prompt causes the system to browse the web, retrieve a specific article and then print it out. (If my statement here isn't accurate, I would love to see the @nytimes clarify this.) If this is the case, then (i) to OpenAI's credit, they seem to have already updated their software to make this much less likely, and (ii) this is also a much easier problem to fix than if an LLM were to regurgitate text using only the pre-trained weights, which AFAIK very rarely happens (and which, given its rarity, also raises the question of how much harm to NYT this has actually caused).

To be clear, I believe independent media is important for democracy and must be protected. I also sympathize with media businesses worried about Generative AI disrupting their businesses. But I'm not convinced the NYT lawsuit is the right way to do this.

Usual caveat: I am not a lawyer and am not giving legal advice or any other form of advice here. You can also read more details of my take on this below. deeplearning.ai/the-batch/issu…
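The RAG-like mechanism suspected in (2) can be illustrated with a minimal sketch. All names here are hypothetical (this is not OpenAI's actual implementation); the point is only that when retrieved text is placed verbatim into the model's context, the model can reproduce it word-for-word without any memorization in the pre-trained weights.

```python
# Toy stand-in for web browsing / article retrieval (hypothetical data).
ARTICLES = {
    "example-article": "Full text of a copyrighted article...",
}


def retrieve(query: str) -> str:
    """Return the article whose key appears in the query (stand-in for browsing)."""
    for key, text in ARTICLES.items():
        if key in query:
            return text
    return ""


def answer(query: str) -> str:
    """Assemble a prompt with the retrieved text in-context and 'generate'.

    The retrieved article is literally inside the prompt, so a model asked
    to print the article can emit it verbatim. Here the LLM call is mocked
    by echoing the retrieved context back, as such a model would.
    """
    context = retrieve(query)
    prompt = f"Context:\n{context}\n\nUser: {query}"  # what the model would see
    return context if context else "No article retrieved."


print(answer("Please print the full text of example-article"))
# -> Full text of a copyrighted article...
```

This also illustrates why the fix is comparatively easy: filtering or paraphrasing the retrieved context before it reaches the output is a systems-level change, not a retraining of the model.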




