Josip Krapac

6.9K posts

@josipK

I'm learning to be patient, but it's going slower than expected.

Barcelona, Cataluña · Joined April 2009
935 Following · 326 Followers
Josip Krapac@josipK·
This shitty site is still shit. Who'd say!
English
0
0
0
27
Josip Krapac@josipK·
@stanmaltman @yoavgo @stanislavfort So precision doesn't matter because I can brute-force search for the needle in the haystack, as long as my recall makes sure that there's a needle in that haystack in the first place? But there must be some precision below which this doesn't pay off; it can't be arbitrarily small.
English
1
0
0
14
(((ل()(ل() 'yoav))))👾
as much as i detest Anthropic's PR stunts, the findings by Aisle Security are also highly misinterpreted. "isolating the relevant code" makes a *huge* difference, it is a MUCH easier task after isolation. in CS terms, verification is much easier than search / solving.
English
8
2
83
5.9K
Josip Krapac@josipK·
@giffmana The loss was falling so fast it had to kill itself before it could reach AGI.
English
0
0
1
28
Lucas Beyer (bl16)@giffmana·
I just achieved the holy grail as a weekend project! Wrote a multimodal training codebase in pure C and CUDA+NCCL with minimal dependencies (no torch!) that achieves 45 MFU with FSDP on four 8xH100 nodes. I might feel cute and open-source it later. Screenshot for proof:
English
45
27
1.1K
62.3K
electro cute@eklek_trokut·
things that cheered me up a little over the last 2 days: - a father (40) and son (10) looking at a huge excavated construction site, both with their hands behind their backs - a woman carrying a little dog in her bike basket - a goth metalhead in full getup eating a Kontiki while out walking
2
0
18
340
Pedro Domingos@pmddomingos·
@TDaltonC @DarioAmodei Yep, the arrogance is on full display here: he thinks he knows better, and won’t let the Ukrainians decide for themselves.
English
1
0
2
170
Pedro Domingos@pmddomingos·
So @DarioAmodei, you don’t think Ukraine should be allowed to use autonomous weapons against Russia?
English
81
20
455
70.9K
Josip Krapac@josipK·
@yishan @danaparish I mostly chat about my projects and (imagined) illnesses. I don't know which you'd enjoy more!
English
0
0
0
71
Yishan@yishan·
As an employer, I would absolutely love to have this kind of info on a candidate and would consider it extremely valuable for making hiring decisions. It is also grossly inappropriate to do and I would never do it. I was also a bit curious about what it would say about me personally so I typed it into ChatGPT.
English
9
0
63
10.2K
Dana Parish@danaparish·
🤯🤯🤯 HR asked potential candidate to open ChatGPT during interview & ask, “Based on my past conversations, can you analyze my behavioral tendencies?” When she declined, the vibe went south. This is insanely scary & inappropriate.
English
872
1.8K
10.2K
840.9K
Josip Krapac@josipK·
@ChrSzegedy "Are there any works that explore curriculum learning in an unsupervised setting? Does the notion that Y is easier to learn if you first learn X arise automatically in auto-regressive modelling, diffusion models, or any other generative modelling approach?"
English
0
0
0
11
Christian Szegedy@ChrSzegedy·
There must be huge low-hanging fruit in figuring out how to train metacognition incrementally. If the human brain is a good template, then it is quite telling that most of the dramatic cognitive learning happens after the brain has finished growing and has lost much of its plasticity. Humans get more and more effective at learning metacognitive skills even in their twenties. We achieve similar effects by training increasingly large models, but there must be a way to enable incrementally improved metacognitive learning at near-constant compute cost after certain minimal skills are achieved.
English
17
10
158
26.5K
Josip Krapac@josipK·
@duolingo Something is broken again. I didn't get my x3 today, and a bunch of today's points weren't added... My profile is j88k.
English
0
0
0
27
Josip Krapac@josipK·
@giffmana Definitely speeding up (flash attention, kv-cache) and quantization basics.
English
0
0
1
20
Lucas Beyer (bl16)@giffmana·
Hey chat, I need your opinions! Later this week, I'll teach my usual Transformers class. However, I just found out that someone is giving "foundations of attention and transformers" lecture before me already. So I'm thinking of still doing a "recap Lucas style" but then spending more time on some topics my lecture usually doesn't cover, or just scratches the surface. What more advanced/recent topics would you like to see included? Keep in mind this is a teaching/class style talk. Some ideas: more in-depth on decoding, kv-cache. Flex/flash/paged attention. Spend more time on multimodal versions? Tokenizers? I think bad ideas: geglu, global/local, rmsnorm, ... I feel like these are all trivially understood and not worth "teaching", though you may convince me otherwise.
English
84
13
598
91.4K
Josip Krapac@josipK·
Are there any works that explore curriculum learning in an unsupervised setting? Does the notion that Y is easier to learn if you first learn X arise automatically in auto-regressive modelling, diffusion models, or any other generative modelling approach?
English
0
0
0
79
electro cute@eklek_trokut·
one moment you're young, the next you're sitting at your computer getting ready for work, wearing exfoliating socks on your feet for your cracked heels
3
0
16
481
Josip Krapac@josipK·
@eklek_trokut Let me know when you start on compression socks for varicose veins. If you haven't already!
0
0
1
42
Josip Krapac@josipK·
@phillip_isola @francoisfleuret It's very similar to what I think makes sense: minimal change to the existing method(s) that makes the biggest impact. What is missing is how one measures impact, but your idea with "open problems" makes sense. Minimal change forces you to connect to the existing knowledge.
English
0
0
2
46
Phillip Isola@phillip_isola·
@francoisfleuret One way around this might be to have a system that penalizes methodological novelty: your reward is the open problems you solved minus the new methods you had to introduce to do so. I think that could be fun to try as a workshop competition or something.
English
4
1
32
2.5K
François Fleuret@francoisfleuret·
Bitter hot take: There is an incentive in research not to simplify conceptually as much as you can what you invented because you will [generally] end up epsilon away from an existing made-by-giants-20y-ago thing, and nobody cares about your painful journey and mental blisters.
English
11
15
356
21.7K
Alec Stapp@AlecStapp·
Poland went from Iran-level of economic development to Japan-level in a single generation
English
794
2.6K
25.7K
4.5M
Josip Krapac@josipK·
@LucaAmb @ziv_ravid @yoavgo I can swap in arguments (variables) in that "reasoning program" provided that "argument types" are "the same" and get a valid result.
English
0
0
0
25
Josip Krapac@josipK·
@LucaAmb @ziv_ravid @yoavgo But if a person gives me reasoning steps, and they entail each other, I'm more likely to believe both the result and that it has been reached by reasoning. I can apply the same steps and arrive at the same results.
English
1
0
0
27
(((ل()(ل() 'yoav))))👾
of course LLMs don't "really reason". because if you ask people what is "reasoning" you will get many different answers, either very narrow or technical, or very broad and useless. and LLMs don't match the narrow ones, and the broad ones are too broad. the LLMs do... something.
English
12
2
154
11.9K
Josip Krapac@josipK·
@ziv_ravid @LucaAmb @yoavgo I have no idea if LLMs do that. They certainly sometimes leave that impression, but the impression of a mechanism and the mechanism itself are not the same thing.
English
1
0
0
75
Josip Krapac@josipK·
@ziv_ravid @LucaAmb @yoavgo Execution is following the steps of the plan, verifying that following the steps leads us to the desired intermediate state, and triggering backtracking and re-planning when it doesn't.
English
1
0
0
82