Josip Krapac

6.9K posts

@josipK

I'm learning to be patient, but it's going slower than expected.

Barcelona, Cataluña · Joined April 2009

935 Following · 326 Followers

Josip Krapac@josipK·11 Apr

This shitty site is still shit. Who'd say!

Josip Krapac@josipK·10 Apr

@stanmaltman @yoavgo @stanislavfort So precision doesn't matter because I can brute-force search for the needle in the haystack, as long as my recall makes sure that there's a needle in that haystack in the first place? But there must be some precision below which this doesn't pay off; it can't be arbitrarily small.

(((ل()(ل() 'yoav))))👾@yoavgo·9 Apr

as much as i detest Anthropic's PR stunts, the findings by Aisle Security are also highly misinterpreted. "isolating the relevant code" makes a *huge* difference, it is a MUCH easier task after isolation. in CS terms, verification is much easier than search / solving.

Josip Krapac@josipK·29 Mar

@giffmana The loss was falling so fast it had to kill itself before it reaches AGI.

Lucas Beyer (bl16)@giffmana·28 Mar

I just achieved the holy grail as a weekend project! Wrote a multimodal training codebase in pure C and CUDA+NCCL with minimal dependencies (no torch!) that achieves 45 MFU with FSDP on four 8xH100 nodes. I might feel cute and open-source it later. Screenshot for proof:

Josip Krapac@josipK·7 Mar

@eklek_trokut Only good news

electro cute@eklek_trokut·6 Mar

things that cheered me up a bit over the last 2 days: - a father (40) and son (10) looking at a huge excavated construction site, both with their hands behind their backs - a woman carrying a little dog in her bicycle basket - a goth metalhead in full getup eating a Kontiki while walking

Josip Krapac reposted

Emtiyaz Khan@EmtiyazKhan·4 Mar

4 postdoc positions are available at TU Darmstadt and Centre for Excellence on Reasonable AI. I am part of the Active AI group. Please help me spread the word. Description: hessian.ai/projects/reaso… Application portal: career.tu-darmstadt.de/tu-darmstadt/j…

Josip Krapac@josipK·1 Mar

@pmddomingos @TDaltonC @DarioAmodei I'd say the guy is just deciding for himself. What's wrong with that?

Pedro Domingos@pmddomingos·28 Feb

@TDaltonC @DarioAmodei Yep, the arrogance is on full display here: he thinks he knows better, and won’t let the Ukrainians decide for themselves.

Pedro Domingos@pmddomingos·28 Feb

So @DarioAmodei, you don’t think Ukraine should be allowed to use autonomous weapons against Russia?

Josip Krapac@josipK·15 Feb

@yishan @danaparish I mostly chat about my projects and (imagined) illnesses. I don't know which you'd enjoy more!

Yishan@yishan·14 Feb

As an employer, I would absolutely love to have this kind of info on a candidate and would consider it extremely valuable for making hiring decisions. It is also grossly inappropriate to do and I would never do it. I was also a bit curious about what it would say about me personally so I typed it into ChatGPT.

Dana Parish@danaparish·14 Feb

🤯🤯🤯 HR asked potential candidate to open ChatGPT during interview & ask, “Based on my past conversations, can you analyze my behavioral tendencies?” When she declined, the vibe went south. This is insanely scary & inappropriate.

Josip Krapac@josipK·15 Feb

@ChrSzegedy "Are there any works that explore curriculum learning in an unsupervised setting? Does the notion that Y is easier to learn if you first learn X arise automatically in auto-regressive modelling, diffusion models, or any other generative modelling approach?"

Josip Krapac@josipK·15 Feb

@ChrSzegedy A semi-related question: bsky.app/profile/josipk…

Christian Szegedy@ChrSzegedy·15 Feb

There must be huge low-hanging fruit in figuring out how to train metacognition incrementally. If the human brain is any good template, then it is quite telling that most of the dramatic cognitive learning happens after the brain has finished growing and loses significantly in plasticity. Humans get more and more effective at learning metacognitive skills even in their twenties. We achieve similar effects by training increasingly large models, but there must be a way to enable incrementally improved metacognitive learning at near-constant compute cost after certain minimal skills are achieved.

Josip Krapac@josipK·20 Jan

@duolingo something is broken again. I didn't get my x3 today, and a bunch of today's points weren't added... My profile is j88k.

Josip Krapac@josipK·25 Sep

@giffmana Definitely speeding up (flash attention, kv-cache) and quantization basics.

Lucas Beyer (bl16)@giffmana·22 Sep

Hey chat, I need your opinions! Later this week, I'll teach my usual Transformers class. However, I just found out that someone is giving "foundations of attention and transformers" lecture before me already. So I'm thinking of still doing a "recap Lucas style" but then spending more time on some topics my lecture usually doesn't cover, or just scratches the surface. What more advanced/recent topics would you like to see included? Keep in mind this is a teaching/class style talk. Some ideas: more in-depth on decoding, kv-cache. Flex/flash/paged attention. Spend more time on multimodal versions? Tokenizers? I think bad ideas: geglu, global/local, rmsnorm, ... I feel like these are all trivially understood and not worth "teaching", though you may convince me otherwise.

Josip Krapac@josipK·6 Sep

Are there any works that explore curriculum learning in an unsupervised setting? Does the notion that Y is easier to learn if you first learn X arise automatically in auto-regressive modelling, diffusion models, or any other generative modelling approach?

Josip Krapac@josipK·5 Sep

@eklek_trokut I also recommend the foot creams from DM. A little self-love before bed!

electro cute@eklek_trokut·5 Sep

one moment you're young, the next you're sitting at your computer getting ready for work, wearing exfoliating socks on your feet for your cracked heels

Josip Krapac@josipK·5 Sep

@eklek_trokut Let me know when you start with compression socks for varicose veins. If you haven't already!

Josip Krapac@josipK·16 Jul

@phillip_isola @francoisfleuret It's very similar to what I think makes sense: minimal change to the existing method(s) that makes the biggest impact. What is missing is how one measures impact, but your idea with "open problems" makes sense. Minimal change forces you to connect to the existing knowledge.

Phillip Isola@phillip_isola·15 Jul

@francoisfleuret One way around this might be to have a system that penalizes methodological novelty: your reward is the open problems you solved minus the new methods you had to introduce to do so. I think that could be fun to try as a workshop competition or something.

François Fleuret@francoisfleuret·15 Jul

Bitter hot take: There is an incentive in research not to simplify conceptually as much as you can what you invented because you will [generally] end up epsilon away from an existing made-by-giants-20y-ago thing, and nobody cares about your painful journey and mental blisters.

Josip Krapac@josipK·15 Jun

@AntOne1998095 @LibertyF0x @AlecStapp Can you do it cumulative from when Poland joined? This is just one year, right?

Alec Stapp@AlecStapp·14 Jun

Poland went from Iran-level of economic development to Japan-level in a single generation

Josip Krapac@josipK·11 Jun

@LucaAmb @ziv_ravid @yoavgo I can swap in arguments (variables) in that "reasoning program" provided that "argument types" are "the same" and get a valid result.

Josip Krapac@josipK·11 Jun

@LucaAmb @ziv_ravid @yoavgo But if a person gives me reasoning steps, and they entail each other, I'm more likely to believe both the result and that it has been reached by reasoning. I can apply the same steps and arrive at the same results.

(((ل()(ل() 'yoav))))👾@yoavgo·8 Jun

of course LLMs don't "really reason". because if you ask people what is "reasoning" you will get many different answers, either very narrow or technical, or very broad and useless. and LLMs don't match the narrow ones, and the broad ones are too broad. the LLMs do... something.

Josip Krapac@josipK·9 Jun

@ziv_ravid @LucaAmb @yoavgo I have no idea if LLMs do that. They certainly sometimes leave that impression, but the impression of a mechanism and the mechanism itself are not the same thing.

Josip Krapac@josipK·9 Jun

@ziv_ravid @LucaAmb @yoavgo Execution is following the steps of the plan, verifying that following the steps leads us to the desired intermediate state, and triggering backtracking and re-planning.
