Sam Gijsen
@SamCJG
69 posts
ML @ Tübingen AI Center, Hertie AI
Joined August 2024
207 Following · 33 Followers
Sam Gijsen (@SamCJG):
@_ueaj Actually pretty relevant for one of my projects; by dropout do you mean good old attn+ffn dropout?
ueaj (@_ueaj):
It would be very fun to apply all the data augmentation/data efficiency techniques we've learned over the years but weren't applicable to LLMs due to the abundance of data + all the new ones. dropout, layer looping, massive overparameterization, muon, optimal hparams, etc.
David Duvenaud (@DavidDuvenaud):

Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below! with @AlecRad and @status_effects 🧵

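Neither poster spells out the dropout placement, so this is only a generic illustration of the "good old attn+ffn dropout" being discussed: in the classic Transformer recipe, dropout is applied to each sub-layer's output before the residual add. A minimal numpy sketch, with identity placeholders standing in for the real attention and feed-forward computations:

```python
import numpy as np

def dropout(x, p, rng, training=True):
    # Inverted dropout: zero each element with probability p and
    # rescale survivors by 1/(1-p), so the expected value is unchanged
    # and no rescaling is needed at inference time.
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def block(x, p, rng):
    # Classic Transformer placement: dropout on each sub-layer output,
    # applied before the residual add. The identity "sub-layers" here
    # are placeholders for real attention / feed-forward computations.
    attn_out = x
    x = x + dropout(attn_out, p, rng)
    ffn_out = x
    x = x + dropout(ffn_out, p, rng)
    return x

rng = np.random.default_rng(0)
x = np.ones((4, 8))
y = block(x, p=0.1, rng=rng)
print(y.shape)  # (4, 8)
```

With `training=False` the function is a no-op, which is why inverted dropout is the variant used in practice.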
Michael Levin (@drmichaellevin):
The other part of the heart/cancer story is control of resting potential. Cardiac cells have very strong control of their bioelectric state, and are extremely resistant to the kind of persistent depolarization necessary for the cancer phenotype. pmc.ncbi.nlm.nih.gov/articles/PMC42… thoughtforms.life/cancer-and-cel… Neural cells too, but they can be forced into mitosis etc. by persistent depolarization: sciencemag.org/cgi/reprint/19…
David Sinclair (@davidasinclair):
Wondered why we don’t hear about heart cancer? The contraction-sensing Nesprin-2 protein has been discovered to prevent heart cancer. By causing cells to pulse (or adding in Nesprin) we might be able to treat cancer in other organs 👏 @ScienceMagazine
[media attached]
vik (@vikhyatk):
@BelieveOnJesus there's a big divide between people who became programmers because they wanted to make video games, and people who joined the industry because someone told them it would make them money
vik (@vikhyatk):
"AI is going to wipe out programming jobs." Then why is every programmer I know working 20 hours a day ever since they started using AI? I thought this technology was going to free us from the toil of labor.
Sam Gijsen (@SamCJG):
@teortaxesTex There’s a whole cohort of Elon-bucks-farming accounts which aim to get this kind of quote tweet via sycophancy
Sam Gijsen (@SamCJG):
@vikhyatk They need to introduce daily plans so we can keep up with these recommendations
Sam Gijsen (@SamCJG):
Come check out our poster on training brain foundation models using self-distillation at @iclr_conf Friday 10:30AM!
[media attached]
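The poster's actual method isn't described here, so this is only a generic illustration of what "self-distillation" commonly means in the representation-learning literature (BYOL/DINO-style): the teacher is not a separately trained model but an exponential moving average (EMA) of the student's parameters. A minimal numpy sketch of that update, with hypothetical parameter names:

```python
import numpy as np

def ema_update(teacher, student, m=0.996):
    # Teacher parameters track an exponential moving average of the
    # student's parameters; the student is then trained to match the
    # teacher's outputs on augmented views of the same input.
    return {k: m * teacher[k] + (1.0 - m) * student[k] for k in teacher}

teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
teacher = ema_update(teacher, student)
print(teacher["w"])  # each entry moves 1-m of the way toward the student
```

A momentum `m` close to 1 keeps the teacher slowly moving, which stabilizes the distillation targets.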
kache (@yacineMTB):
[media attached]
kalomaze (@kalomaze):
@celestepoasts doing this at a bs approaching pretraining CLT sounds absolutely brutal at small scale. i have low faith both in the concept of "doing it on a model smaller than ~32b" and "doing it without per step batch sizes in the thousands" and ofc you have to be VERY rigorous wrt goodhart
Celeste (@celestepoasts):
wonder if pretraining scaling laws are holding at all; seems like something only a frontier lab would find out if they don't, and they obviously wouldn't publicize this
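For context, the "scaling laws" in question are power-law fits of loss against scale, extrapolated from small runs to frontier scale; whether they "hold" is exactly whether that extrapolation stays on the fitted line. A noise-free numpy sketch (the constants E, A, alpha are made up for illustration) showing how such a law is fit in log space and then extrapolated:

```python
import numpy as np

def power_law_loss(D, E, A, alpha):
    # Chinchilla-style form: irreducible loss E plus a power-law
    # term that decays with data size D.
    return E + A / D**alpha

# Synthetic "measurements" at small scales (constants are made up).
D = np.array([1e8, 1e9, 1e10, 1e11])
L = power_law_loss(D, E=1.7, A=400.0, alpha=0.34)

# With E known, log(L - E) = log A - alpha * log D is linear,
# so a degree-1 polyfit in log space recovers the exponent.
slope, intercept = np.polyfit(np.log(D), np.log(L - 1.7), 1)
alpha_hat, A_hat = -slope, np.exp(intercept)

# Extrapolate two orders of magnitude beyond the fitted range --
# the step a frontier lab can actually check against a real run.
pred = 1.7 + A_hat / 1e13**alpha_hat
print(alpha_hat, pred)
```

On real (noisy) measurements the fit is less clean, and a deviation of the frontier run from `pred` is what "scaling laws not holding" would look like.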
Sam Gijsen (@SamCJG):
@kuberdenis Go to the lakes to the south-west, go for walks in the forests there, take a ferry tour over the lakes, or visit Potsdam
Denislav Gavrilov (@kuberdenis):
what should i do in Berlin? i am stuck here for two more days
Sam Gijsen (@SamCJG):
@sedielem Never got a semanticist setup to work with neural data; wondering if there's something about natural image statistics that's particularly favourable to this kind of approach.
Sander Dieleman (@sedielem):
FlexTok/Semanticist provided an elegant recipe to learn semantically coarse-to-fine sequence representations of images. This works for video as well: preserve the temporal axis, replace the spatial axes with a semantic coarse-to-fine axis. Promising for long video generation!
Andrei Atanov (@andrew_atanov):

Are all videos worth the same number of tokens? Whether rich in motion or visually minimal, standard 3D-grid tokenizers treat them equally. We present VideoFlexTok, which represents videos using a flexible-length, coarse-to-fine sequence of tokens. Page: videoflextok.epfl.ch Demo: huggingface.co/spaces/EPFL-VI… Paper: arxiv.org/abs/2604.12887 1/n

vik (@vikhyatk):
distilling from 1B to 400M is going well
[media attached]
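vik doesn't say how the 1B-to-400M distillation is done; a common recipe (Hinton et al.'s knowledge distillation) trains the small model to match the large model's temperature-softened output distribution. A minimal numpy sketch of that loss, not vik's actual setup:

```python
import numpy as np

def log_softmax(z, T=1.0):
    # Temperature-scaled, numerically stable log-softmax.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def distill_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) over temperature-softened distributions,
    # scaled by T**2 so gradient magnitudes stay comparable to a
    # hard-label cross-entropy term it is often mixed with.
    log_p_t = log_softmax(teacher_logits, T)
    log_p_s = log_softmax(student_logits, T)
    p_t = np.exp(log_p_t)
    return T**2 * (p_t * (log_p_t - log_p_s)).sum(axis=-1).mean()

s = np.array([[2.0, 0.5, -1.0]])
t = np.array([[2.5, 0.0, -1.5]])
print(distill_loss(s, t))  # small positive number; exactly 0 when s == t
```

A higher temperature T spreads the teacher's probability mass over more classes, exposing the "dark knowledge" in its near-miss predictions.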
vik (@vikhyatk):
haven't watched youtube shorts ever since the training cluster came online
Sam Gijsen (@SamCJG):
@robustus 50 cents to the miners, 100 bucks to nvidia. Jensen wins again
clem 🤗 (@ClementDelangue):
"But here is what we found when we tested: We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis. Eight out of eight models detected Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens. A 5.1B-active open model recovered the core chain of the 27-year-old OpenBSD bug." aisle.com/blog/ai-cybers…
ueaj (@_ueaj):
Worryingly, ontologically, all the zero-days were already in the training data
Sam Gijsen (@SamCJG):
@thetrocro @LynAldenContact They only talk about exploits that have already been fixed, but currently bitcoin isn't mentioned in the 244-page model card
Lyn Alden (@LynAldenContact):
Someone should write a book about this.
Haseeb >|< (@hosseeb):

This is terrifying. @AnthropicAI 's new unreleased Mythos model is so good at hacking, it found bugs in "every major operating system and web browser." 83.1% were exploited on first attempt. This thing is like COVID but for software. Actually apocalyptic in the wrong hands.

Sam Gijsen (@SamCJG):
Reading the Claude Mythos model card and slotted in this activity for tomorrow morning
[media attached]
Mario Zechner (@badlogicgames):
@thsottiaux oh no :( but also: it's definitely not europe! we're like 20 AI years behind.