Marco Fahmi

2.7K posts

@dataronin

Mostly lurking.

Brisbane, Australia · Joined January 2011
913 Following · 411 Followers
Marco Fahmi retweeted
LSE Impact Blog @LSEImpactBlog
✍️Dorothea Strecker, Heinz Pampel, Rouven Schabinger and Nina Leonie Weisweiler, explore how common data repository shutdowns are and suggest what can be done to ensure data preservation in the long-term. #OpenData wp.me/p4m9em-cPR
0
9
5
2.9K
Marco Fahmi retweeted
Jeni Tennison @JeniT
I am really troubled by the proposal from the Tony Blair Institute for a "National Data Trust" (NDT) for health data, particularly by the idea that the next government might actually go for it. institute.global/insights/polit…
2
17
31
5.2K
Marco Fahmi retweeted
NHMRC CRE on Achieving the Tobacco Endgame
Congratulations 🎉 to CRE researcher Dr Carmen Lim @Cwernlim who has been awarded $660,000 by the @nhmrc to develop a program to understand how pro-vaping campaigns on social media influence young peoples' attitudes towards the use of e-cigarettes.
1
3
15
875
Marco Fahmi retweeted
Kevin Gee @kevg1412
"He's not even a 10x engineer. He's like, 100x, or 1,000x engineer" Gmail creator Paul Buchheit on how Bret Taylor once rewrote Google Maps in a single weekend:
60
692
9.1K
2.4M
Marco Fahmi retweeted
Axios @axios
Behind the hype of generative AI, large companies are struggling to deploy the new technology — hitting cost and data management hurdles that are leaving many of their generative AI projects stuck in pilot phase. trib.al/zwRlIVl
0
2
11
12.1K
Marco Fahmi retweeted
Axios @axios
As adoption of generative AI grows, providers are hoping that greater transparency about how they do and don't use customers' data will increase those clients' trust in the technology. trib.al/tCr3DVd
1
1
7
12.1K
Marco Fahmi retweeted
Rachel Woods @rachel_l_woods
There's a resurgence of interest in fine tuning LLMs
I've yet to see a successful public use case where fine tuning > prompting.
But here's where I see fine tuning *mattering*:
First, fine tuning is for teaching an LLM specific tasks or behaviors
Not teaching an LLM new knowledge.
For new knowledge, use Retrieval (store your data in an outside database and strategically pull the right chunks in to give the LLM context to your question)
But even in teaching LLMs specific tasks or behaviors - here's the catch...
LLMs are remarkably good at picking up tasks and behaviors from just a good prompt
THIS is what makes LLMs mind blowing after all
So that begs the question. Where is fine tuning actually helpful?
Some use cases I could see developing are teaching LLMs tasks that are exceptionally difficult to describe, or fit into ~10 examples you can add to a prompt.
One way to think about this: if it would take someone a few weeks doing a task to 'master it' instead of being able to read training materials and get the picture...
That *may* be a use case for fine tuning
But proceed with caution
To truly teach an LLM a new behavior or task, you'll need to treat this like a machine learning project, not just throwing examples in and getting magic in return (which it still blows my mind that ChatGPT does this so well for us).
Things like:
- Dataset design
- Training and test data
- Overfitting
+ more as the tooling around fine tuning gets more sophisticated
The other obvious use case is cost.
If you can get a super small language model to do a task instead of GPT-4, there's meaningful cost savings there.
And if you're using a language model to do large scale tasks like triaging your customer support inbox, or analyzing public data for insights
The costs can add up.
But if you're wondering where the heck to invest in fine tuning...
My answer at the moment for most businesses is still:
Make sure you can't do it with prompts.
38
56
452
164.1K
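The retrieval idea in the post above (keep your data outside the model and pull only the relevant chunks into the prompt) can be sketched roughly as follows. This is a minimal illustration, not anything from the original thread: a bag-of-words cosine score stands in for a real embedding search and external database, and the names `retrieve`, `build_prompt`, and `CHUNKS`, along with the sample text, are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (assumed scoring)."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Place the retrieved chunks ahead of the question as context."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical stand-in for data stored outside the model.
CHUNKS = [
    "Refunds are issued within 14 days of a return request.",
    "Support is available on weekdays from 9am to 5pm.",
    "Shipping to Australia takes 5 to 7 business days.",
]

if __name__ == "__main__":
    print(build_prompt("How long does shipping to Australia take?", CHUNKS, k=1))
```

The resulting prompt string would then be sent to whichever LLM is in use; that call is left out here because it depends on the provider.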
Marco Fahmi retweeted
Meredith Whittaker @mer__edith
📢NEW PAPER! Where @davidthewid, @sarahbmyers & I unpack what Open Source AI even is. We find that the terms ‘open’ & ‘open source’ are often more marketing than technical descriptor, and that even the most 'open' systems don't alone democratize AI 1/ papers.ssrn.com/sol3/papers.cf…
67
618
1.8K
634.9K
Marco Fahmi retweeted
Bellingcat @bellingcat
New updates on Bellingcat's #github this week. The 'whisperbox' API receives audio or video URLs and returns the video transcripts using OpenAI's Whisper model. Designed by Bellingcat discord member github.com/fspoettel Find the tool at: github.com/bellingcat/whi…
3
18
95
67.6K
Marco Fahmi @dataronin
Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy sciencedirect.com/science/articl…
0
0
1
67
Marco Fahmi retweeted
Daniel Severo @_dsevero
Best poster award
120
3.3K
28.1K
2.6M
Marco Fahmi retweeted
Peter Griffin @petergnz
Department of Internal Affairs has put out some guidance today on use of generative AI in the public sector...
5
19
38
8.9K
Marco Fahmi retweeted
Justin Alvey @justLV
I “jailbroke” a Google Nest Mini so that you can run your own LLM’s, agents and voice models. Here’s a demo using it to manage all my messages (with help from @onbeeper) 🔊 on, and wait for surprise guest! I thought hard about how to best tackle this and why, see 🧵
366
2.5K
13.9K
1.7M