Marc Ratkovic
@MarcRatkovic
94 posts
Professor. Statistical methods in political science, particularly machine learning for causal inference. Really getting into large language models.

Joined February 2019
67 Following · 165 Followers
Pinned Tweet
Marc Ratkovic @MarcRatkovic ·
I'm hiring!!! Grad students/Pre-docs/Post-docs. LLMs for certain; causal inference is also likely. We're building a top-notch, supportive, kinetic, and downright awesome community at @GESSuniMannheim. Formal announcement coming, but email me MarcRatkovic@gmail.com with any questions.
Marc Ratkovic @MarcRatkovic ·
I've taken to talking about LLMs as "innovating" instead of "intelligent" or "conscious." "Innovating"=doing something I didn't train/tell them to, no muss no fuss. Hopefully this paper can give us the right words to talk about consciousness! arxiv.org/abs/2308.08708
Hamel Husain @HamelHusain ·
What's your favorite/easiest way of doing distributed fine-tuning when your LLM doesn't fit on 1 GPU? I'm collecting a list
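One common answer, as a hedged sketch: shard the model itself with PyTorch's FSDP, so no single GPU has to hold the full set of parameters. (DeepSpeed ZeRO and Megatron-style tensor parallelism are the usual alternatives; the toy model and dummy data below are placeholders, not a real fine-tuning setup.)

# Minimal sketch of distributed fine-tuning with PyTorch FSDP.
# Launch with: torchrun --nproc_per_node=<num_gpus> fsdp_sketch.py
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    # Toy stand-in for the LLM that doesn't fit on one GPU.
    model = nn.Sequential(
        nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)
    ).cuda()

    # FSDP shards parameters, gradients, and optimizer state across
    # ranks, gathering full weights only transiently per layer.
    model = FSDP(model)
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    for _ in range(10):  # dummy training steps with random data
        x = torch.randn(8, 1024, device="cuda")
        loss = model(x).pow(2).mean()
        loss.backward()
        optim.step()
        optim.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()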
Marc Ratkovic @MarcRatkovic ·
Writing is a process--training an LLM inspired by writing pedagogy. The idea that LLMs learn writing the same way we do is a stretch, but there _must_ be quite a bit that practical educators can add. huggingface.co/papers/2308.08…
Marc Ratkovic retweeted
Gamaleldin Elsayed @gamaleldinfe ·
Nature Comms paper: Subtle adversarial image manipulations influence both human and machine perception! We show that adversarial attacks against computer vision models also transfer (weakly) to humans, even when the attack magnitude is small. nature.com/articles/s4146…
Marc Ratkovic retweeted
fly51fly @fly51fly ·
[CL] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework Q Wu, G Bansal, J Zhang, Y Wu, S Zhang, E Zhu, B Li, L Jiang, X Zhang, C Wang [Pennsylvania State University & Microsoft & University of Washington] (2023) arxiv.org/abs/2308.08155
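For a sense of what the framework looks like in use, here is a minimal two-agent sketch following AutoGen's documented assistant/user-proxy pattern (the model choice is illustrative, and API-key handling via the environment is assumed):

# Minimal two-agent AutoGen sketch: an assistant agent that writes
# code and a user-proxy agent that executes it locally.
# Assumes: pip install pyautogen, OPENAI_API_KEY set in the environment.
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4"}]}  # illustrative model

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",  # fully automated back-and-forth
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# The proxy sends the task, the assistant replies (possibly with code),
# and the proxy runs that code and feeds the results back until done.
user_proxy.initiate_chat(
    assistant,
    message="Write and run Python that prints the first 10 primes.",
)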
Marc Ratkovic @MarcRatkovic ·
@vithursant19 Cool stuff! And I like the idea at an intuitive level--there are strong pathways that need to be learned first, then more complex ones can be followed. Is it possible to put an L1 on the neurons? Or something like a LARS algorithm?
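For concreteness, here is a sketch of what "an L1 on the neurons" could mean: an L1 penalty on hidden activations added to the task loss. This is my illustration of the question, not the method from the paper under discussion; the model and penalty strength are made up.

# Sketch: L1 penalty on hidden activations ("neurons"), encouraging
# sparse activation patterns. Illustrative only.
import torch
import torch.nn as nn

class SparseMLP(nn.Module):
    def __init__(self, d_in=64, d_hidden=256, d_out=10):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        return self.fc2(h), h  # return activations for the penalty

model = SparseMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1e-4  # sparsity penalty strength (hypothetical value)

x = torch.randn(32, 64)
y = torch.randint(0, 10, (32,))
logits, h = model(x)
# Task loss plus L1 on activations: many neurons get pushed toward zero.
loss = nn.functional.cross_entropy(logits, y) + lam * h.abs().mean()
loss.backward()
opt.step()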
Vithu Thangarasa @vithursant19 ·
Also, check out our more recent follow-up work on Variable SPDF that was presented at #ICML2023, where we show how a 75% sparse 6.7B Cerebras-GPT model can do as well as its dense counterpart! cerebras.net/blog/accelerat…
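As a rough illustration of the sparse-pretrain/dense-finetune idea (my sketch, not the Cerebras implementation): fix a mask that zeroes 75% of the weights during pretraining, then drop the mask so every weight trains in fine-tuning.

# Rough sketch: sparse pretraining followed by dense fine-tuning.
# A fixed random mask keeps 25% of weights active; fine-tuning removes it.
import torch
import torch.nn as nn

layer = nn.Linear(512, 512)
mask = (torch.rand_like(layer.weight) < 0.25).float()  # 75% sparse

def apply_mask():
    with torch.no_grad():
        layer.weight.mul_(mask)  # zero out the pruned 75%

opt = torch.optim.SGD(layer.parameters(), lr=1e-2)

# --- sparse pretraining: re-apply the mask after every update ---
apply_mask()
for _ in range(100):
    x = torch.randn(32, 512)
    loss = layer(x).pow(2).mean()  # dummy pretraining objective
    loss.backward()
    opt.step()
    opt.zero_grad()
    apply_mask()

# --- dense fine-tuning: simply stop masking; all weights are live ---
for _ in range(100):
    x = torch.randn(32, 512)
    loss = (layer(x) - 1).pow(2).mean()  # dummy downstream objective
    loss.backward()
    opt.step()
    opt.zero_grad()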
Vithu Thangarasa @vithursant19 ·
Excited and grateful to present our paper on "Sparse Pretraining and Dense Fine-tuning for LLMs" at @UncertaintyInAI! Engaging in deep discussions with brilliant researchers has been an enriching experience. Look forward to sharing insights and learning from others in the field!
Marc Ratkovic retweeted
fly51fly @fly51fly ·
[CL] Teach LLMs to Personalize -- An Approach inspired by Writing Education C Li, M Zhang, Q Mei, Y Wang, S A Hombaiah, Y Liang, M Bendersky [Google] (2023) arxiv.org/abs/2308.07968
Marc Ratkovic @MarcRatkovic ·
Cool stuff! Looking forward to benchmarking. From 32- to 16- to 8- to 4- to 2-bit quantization--will we be working with sums of booleans at some point? (And what does the quantization vs. number-of-parameters tradeoff look like?)
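To make the bit-width progression concrete, here is a small sketch of uniform k-bit quantization (my illustration; real LLM quantizers use per-channel scales, outlier handling, etc.). At 1 bit each weight really is just a sign, so "sums of booleans" is not far off.

# Sketch: uniform k-bit quantization of a weight tensor. As k drops,
# each weight is forced onto one of 2**k grid points.
import torch

def quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Round w onto a uniform grid with 2**bits levels over its range."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / levels
    q = torch.round((w - lo) / scale)  # integer codes in [0, levels]
    return q * scale + lo              # dequantized values

w = torch.randn(1024)
for bits in (16, 8, 4, 2):
    err = (quantize(w, bits) - w).abs().mean()
    print(f"{bits}-bit mean abs error: {err:.5f}")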
Marc Ratkovic retweeted
Jim Fan @DrJimFan ·
There are few who can deliver both great AI research and charismatic talks. OpenAI Chief Scientist @ilyasut is one of them. I watched Ilya's lecture at Simons Institute, where he delved into why unsupervised learning works through the lens of compression. Sharing my notes:
- Kolmogorov compressor is the theoretical shortest-length program that produces a dataset. SGD is a practical approximation of the Kolmogorov search that finds an implicit program embedded in the weights of a soft computer, i.e. big Transformers.
- Unsupervised learning is about computing the conditional Kolmogorov complexity of a target dataset given an unlabelled corpus, i.e. K(Y|X).
- Theory tells us that optimizing for K(X, Y), the joint complexity, is as good as K(Y|X). So simply throw all data into the mix, and "just compress everything".
- Joint compression is maximum likelihood over the giant concatenated dataset.
- Ilya cites iGPT, Chen et al. 2020, to illustrate the ideas. iGPT is an image compressor that learns to predict the next pixel using a 1D sequence model.
This is a phenomenal lecture, very accessible, and sometimes quite entertaining.
YouTube: youtube.com/watch?v=AKMuA_…
Lecture page: simons.berkeley.edu/talks/ilya-sut…
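The step from K(Y|X) to K(X, Y) leans on the chain rule (symmetry of information) for Kolmogorov complexity; a compact statement of the identities involved, as my summary of standard results:

% Symmetry of information (chain rule) for Kolmogorov complexity:
K(X, Y) = K(X) + K(Y \mid X) + O\bigl(\log K(X, Y)\bigr)
% For a fixed corpus X, K(X) is a constant, so a good joint compressor
% of (X, Y) is also a good conditional compressor of Y given X.
% Maximum likelihood is compression: under model p_\theta, the Shannon
% code length of the concatenated data D = (X, Y) is
\mathrm{CodeLength}(D) = -\log p_\theta(D)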
Marc Ratkovic @MarcRatkovic ·
Good to know!! Rule of thumb: train for 4 epochs, i.e. reuse training data. May depend on the specifics of the model; there's a scaling law and more thoughtful detail in the paper. arxiv.org/abs/2305.16264
Marc Ratkovic @MarcRatkovic ·
Doing more with less! A small model trained with a wide range of prompts can outperform larger models (GPT-3, but not GPT-4). For constrained tasks, smaller-with-a-wider-variety-of-high-quality-training-types can hit the same performance on a single task. arxiv.org/abs/2305.16264
Marc Ratkovic retweeted
MZES Uni Mannheim @MZESUniMannheim ·
❗️ Researchers often rely on third-party entities to field surveys. Therefore, it is important to verify the sincerity of their conduct. In a project funded by the University of #Mannheim, @fraukolos & colleagues examined ways to detect falsified and fabricated interviews. (1/5)
Marc Ratkovic retweeted
Vlad Lialin @guitaricet ·
For all PhD students in small labs: find all possible ways to collaborate with well-known open research groups like @AiEleuther @laion_ai @BigscienceW @BigCodeProject; apply to every single fellowship and look for connections. It’s not optional if you want to have a career.
Marc Ratkovic @MarcRatkovic ·
The quality of training data matters! A lot! And feeding these models well-curated data (real or synthetic) _really_ helps. Also: pre-training loss is a great predictor of accuracy. arxiv.org/pdf/2308.01825…