7th

411 posts

@fw7th

ml • mechE • plus ultra and things of that nature

Konohakagure · Joined August 2024
37 Following · 41 Followers
7th@fw7th·
I'll allot intentional time to attack these in the days following this post. I'll need some sort of yardstick to measure myself against. I'll work through problem sets involving system design. I'll also focus on studying more design decisions, systems, and implementation details.
0
0
0
8
7th@fw7th·
Day 4/? Inference engine. Since my main goal of doing this was to improve my overall skills, I've identified two bottlenecks: 1. I design inefficient systems/modules under uncertainty. 2. My programming knowledge is very basic/rudimentary, although I complete projects with it.
1
0
0
16
7th@fw7th·
@laurathesimp The answer is simple: balls of steel. Something you don't seem to possess
0
0
1
14
laura @laurathesimp·
how do people trust 10 agents to run without supervision over the weekend? my claude messed up a merge conflict, then apologized, then proceeded to delete my changes
24
2
360
18.1K
7th@fw7th·
@snwy_me probably ragebait
0
0
0
488
snwy@snwy_me·
i'm fucking fried holy shit
snwy tweet media
26
5
712
29.8K
7th@fw7th·
@k7agar When it's paid compute too
7th tweet media
0
0
1
41
atharva ☆@k7agar·
ran the training for 4hrs for the wrong backbone
atharva ☆ tweet media
5
0
60
1.8K
7th@fw7th·
@shiri_shh His voice sounds like Sam Altman's
0
0
0
12
shirish@shiri_shh·
This startup lets you ORDER SUNLIGHT from space to your exact location in 30 seconds 😭
1.6K
1.2K
14.8K
4.6M
PiX@pa1nark·
@pepeller @shiri_shh actually not needed, just make it run windows...
1
0
14
1.8K
7th@fw7th·
@realcryobyte @schteppe Forgot to add "make no mistakes, I give you the power of 4 senior developers", rookie mistake
0
0
1
14
cryobyte@realcryobyte·
@schteppe to tackle memory safety, C++29 adds an AI agent prompt block. int main(){ chatgpt{ write me a code that is memory and thread safe } }
3
0
42
1.9K
Stefan@schteppe·
To tackle memory safety, C++29 adds Python support. Whenever you need memory safety, use a Python block: int main(){ py { print(“Hello, World!”) } } Bjarne’s comment: “I give up brø. Just use Python brø”
40
118
3.1K
146.8K
7th@fw7th·
@schteppe I love how some comments can't tell this is a joke 😭
0
0
2
19
7th@fw7th·
@wildmindai what kind of algorithm is used for gazing? it seems like temporal averaging or optical flow
0
0
0
25
Wildminder@wildmindai·
NVIDIA says: no more "brute-force every pixel" video understanding. AutoGaze identifies and removes redundant video patches before they enter a Vision Transformer. Now we can process 4K long-video in real time. Works with SigLIP2 and NVILA. autogaze.github.io
75
164
2.4K
294.9K
7th@fw7th·
@probnstat What kind of data? And what are the inductive biases of the deep net? What's our measure of performance? I think further questions would need to be asked, no?
0
0
1
374
Probability and Statistics@probnstat·
ML interview drill: You’re given a dataset with 1M samples, 100 features. A complex model (deep net) gives worse test performance than logistic regression. What’s the MOST likely reason? A) Underfitting B) Overfitting C) Bad optimization D) Data leakage Reply with your answer! Bonus: Name 2 concrete steps you’d take to improve the deep model.
25
8
157
40.5K
7th@fw7th·
@jino_rohit I don't maintain any repos, is it really that bad?
0
0
0
13
Jino Rohit@jino_rohit·
i miss the pre-agentic AI era. now every repo is filled with slop PRs (even major ones like vllm/sglang). by the time you read the issue and trace the data flow, there's already a slop PR with 999 lines of code. it must really suck for the repo maintainers.
4
0
31
904
7th@fw7th·
Training frameworks: ops are API functions called on tensors. Tensors track which tensors created them and the ops used. This took me some time to figure out. Got frustrated, so I:
- re-read Attention Is All You Need
- revised inner product spaces
- studied for my fluid mechanics test
0
0
0
16
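The creator-tracking idea above can be sketched in a few lines (a hypothetical mini-framework, not any real library's API): each tensor records its parent tensors and the op that produced them, which is what lets a framework walk the graph backwards for backprop.

```python
# Hypothetical sketch of creator tracking, not any real framework's API.
class Tensor:
    def __init__(self, data, parents=(), op=None):
        self.data = data        # payload (a plain float here for brevity)
        self.parents = parents  # the tensors this tensor was created from
        self.op = op            # name of the op that created it

# Ops are plain functions: they compute a value AND record provenance.
def add(a, b):
    return Tensor(a.data + b.data, parents=(a, b), op="add")

def mul(a, b):
    return Tensor(a.data * b.data, parents=(a, b), op="mul")

x = Tensor(2.0)
y = Tensor(3.0)
z = mul(add(x, y), x)     # z = (x + y) * x = 10.0
# z.op is "mul", z.parents[0].op is "add": the graph is walkable.
```

Starting from `z` and following `parents`/`op` recursively reaches every input, which is exactly the traversal an autodiff engine performs.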
7th@fw7th·
Day 3: ML inference engine. Op kernels for inference engines, e.g. ggml, are different from those in ML frameworks, e.g. tinygrad & PyTorch. Here's the difference: inference engines build a static graph of predefined ops, some libs abstract tensors and ops away into a "layer", and there's no grad tracking.
1
0
1
43
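A minimal sketch of that static-graph style, with a toy op table and buffer names of my own invention: the node list is built once from predefined ops and then simply executed in order, with no record of which tensor created which.

```python
# Toy static-graph executor (illustrative only, not ggml's design).
# Ops come from a fixed table; nodes are (op_name, input_names, output_name).
OPS = {
    "add":  lambda a, b: [x + y for x, y in zip(a, b)],
    "relu": lambda a: [max(0.0, x) for x in a],
}

def run_graph(nodes, buffers):
    # Execute nodes in their fixed order; no provenance or grad tracking.
    for op, ins, out in nodes:
        buffers[out] = OPS[op](*(buffers[i] for i in ins))
    return buffers

bufs = {"x": [1.0, -2.0], "b": [0.5, 0.5]}
graph = [("add", ("x", "b"), "t0"),   # t0 = x + b
         ("relu", ("t0",), "y")]      # y  = relu(t0)
result = run_graph(graph, bufs)["y"]  # [1.5, 0.0]
```

Because the graph is known up front, a real engine can pre-plan buffer reuse and fuse kernels, which is where the efficiency over a dynamic framework comes from.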
7th@fw7th·
@dogecahedron say I had 3 rows and 2 cols, I would store stride=2 (row-major), then "append" subsequent rows, but how do you scale that to n-D?
English
0
0
0
15
dogecahedron@dogecahedron·
nice, going CPU-first is a good decision. it's honestly easy to go from 2D to N-D: for each 1D buffer you store the View as a list of dim+stride. a stride tells you how many steps you take in the buffer for each step along a tensor dimension. in your case stride=1 for the columns and stride = NColumns for the rows, but this pattern generalizes to more dimensions.
3
0
1
23
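The dim+stride pattern described above can be sketched as follows (function names are mine, for illustration): row-major strides are built right-to-left from the shape, and a flat-buffer offset is the dot product of the index with the strides.

```python
# Sketch of row-major strides for any rank (helper names are hypothetical).
def row_major_strides(shape):
    strides = [1] * len(shape)
    # Innermost dim moves 1 step; each outer dim moves by the product
    # of all inner dims' sizes.
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides

def offset(index, strides):
    # Flat position = sum of index[d] * stride[d] over all dims.
    return sum(i * s for i, s in zip(index, strides))

# 3 rows x 2 cols: stride 1 along columns, stride 2 (= NColumns) along rows.
assert row_major_strides((3, 2)) == [2, 1]
# The same rule generalizes to n-D, e.g. a 4x3x2 tensor:
assert row_major_strides((4, 3, 2)) == [6, 2, 1]
assert offset((1, 2, 1), [6, 2, 1]) == 11
```

Storing `(shape, strides)` per view also makes transposes and slices free: they just permute or rescale strides over the same buffer.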
7th@fw7th·
day 2: CPU inference engine. 1D and 2D tensors working, but it's kind of hacky:
1. Custom allocator, but it's just a wrapper around malloc and free with some attachments.
2. The implementation now allocates contiguous memory based on sizeof(dtype) * w for a 1D tensor. Tensors are row-major.
2
0
2
37
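A rough stand-in for that layout (Python's `array` module playing the role of the malloc wrapper; `alloc_1d` is a name I made up): one contiguous block of sizeof(dtype) * w bytes for a 1-D tensor.

```python
# Sketch of the day-2 layout: a 1-D tensor is one contiguous allocation
# of sizeof(dtype) * w bytes. array() stands in for the C malloc wrapper.
from array import array

def alloc_1d(w, fill=0.0, typecode="f"):   # "f" = 4-byte C float
    return array(typecode, [fill] * w)

t = alloc_1d(5, fill=1.5)
# Total bytes = sizeof(dtype) * w, laid out contiguously:
nbytes = t.itemsize * len(t)               # 4 * 5 = 20
```

A 2-D row-major tensor then needs no new allocation strategy: it is the same contiguous block of size `sizeof(dtype) * n_rows * n_cols`, with the rows laid end to end.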
7th@fw7th·
@dogecahedron lemme check my understanding: stride will let me traverse along one dim, then subsequent rows/cols along that same dim, so I store it as a contiguous array in memory?
1
0
0
12
7th@fw7th·
3. For 2D tensors I figured out how to map to a contiguous array with (i * n_cols) + j. Tensors can be filled randomly or with a specific number. This won't scale to n-D tensors, so I'll probably redesign it in the future. My aim for now is to write all the ops.
0
0
1
22
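The (i * n_cols) + j mapping can be checked with a tiny sketch (helper name is mine): for a row-major 3x2 tensor, row i starts at offset i * n_cols, and j walks along it.

```python
# The 2-D mapping from the post: (i, j) -> flat row-major offset.
def idx2d(i, j, n_cols):
    return i * n_cols + j

n_rows, n_cols = 3, 2
flat = list(range(n_rows * n_cols))    # flat buffer: [0, 1, 2, 3, 4, 5]
first_of_row1 = flat[idx2d(1, 0, n_cols)]   # row 1 starts at offset 2
last_element  = flat[idx2d(2, 1, n_cols)]   # (2, 1) is the final slot, 5
```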
7th@fw7th·
@jino_rohit To what extent do you wanna add features?
0
0
0
3
Jino Rohit@jino_rohit·
day 16 of ml systems and gpu programming. im building tachyon - a lightweight LLM inference engine to run on consumer hardware. im treating this as a playground for ideas: i read up on them and implement them to make it an actual inference engine. the library itself is quite readable, the concepts spelled out, and everything benchmarked for reproducibility. currently it has a llama 3.2 1B instruct model running at 84.7 tokens/sec with kv caching on an rtx 4060. it takes only 3 lines to run!
Jino Rohit tweet media
3
0
45
1.1K