Erik

1.1K posts

@HdCoder

Joined March 2017

174 Following, 15 Followers

Erik@HdCoder·57m

@DeryaTR_ considering that the bar is literal AGI, I don’t think it’s unfair at all tbh

Derya Unutmaz, MD@DeryaTR_·3h

ARC-AGI-3 is an important benchmark. However, I have a major issue with the “Human score 100%” statement. How many humans have tested all 1000 puzzles? How were people selected? This was not published for previous ARCs either. In one case, the human score was based on, I think, 2 people. This is a really unscientific approach, as it assumes all humans are the same and does not consider, for example, previous exposure to puzzles or video games. What education level and background did these humans have? I am sure humans will still score highly, but it would be very surprising if this was 100%. Without this data and scientific measurement, this appears to be a biased test that assumes solving 100% of the puzzles is purely intrinsic intelligence common to all humans.

ARC Prize@arcprize

Announcing ARC-AGI-3. The only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know; ARC-AGI-3 tests how they learn.

Erik@HdCoder·8h

@ptlemic @SebAaltonen @ericjeker The point is that the basics are very important. They constitute the difference between what people call "slop" and what most find acceptable. DLSS5 is clearly so far on the lack-of-control side of things that it very often turns into slop

Arete@ptlemic·10h

@SebAaltonen @ericjeker When has Jensen said that? Understanding 1) the basics and 2) everything about just DLSS are two different things. I don't think you understand everything about DLSS either. DLSS5 is just the next step.

Sebastian Aaltonen@SebAaltonen·14h

I don't like tech CEOs trying to cause mass panic in the job market. Jensen and Elon too. The excavator and the crane didn't mean that construction (shovel) jobs got cut by 10x. We started building massive bridges and skyscrapers. Early numbers show just that (many more commits).

CG@cgtwts

Anthropic CEO: “50% of all entry-level Lawyers, Consultants, and Finance Professionals will be completely wiped out within the next 1–5 years." grad students and junior hires are cooked.

Erik@HdCoder·8h

@SebAaltonen They just try to boost their stock by every means necessary, since delaying their monetary commitments by even a couple of months could be catastrophic. It's exhausting to listen to honestly

Erik@HdCoder·1d

@deepfates Seems plausible they will though. Just like every other technology

🎭@deepfates·1d

This is one of the least correct statements you can make about LLMs no offense

Noemi@NoemiTitarenco

I'll be here when everyone realizes every LLM is converging to be the same. The "differences" most people see are just differences in the system prompt, app setup, and tools. I've built enough tools and swapped out models to know that models don't really have personalities.

Erik@HdCoder·2d

@burny_tech Except that they are not really behind. Just not prioritizing agentic coding imo

Burny - Effective Curiosity@burny_tech·3d

Google DeepMind seems to be behind others with Gemini for too long now...

Erik@HdCoder·3d

@SonyxEth @GoogleAIStudio Right. But this displays a couple of stretched cubes at 5fps? I don't get it

sonyx.eth@SonyxEth·3d

now the problem is that you can't predict what is even possible to vibe code with @GoogleAIStudio - it's already possible to vibe code fully working CAD/BIM apps of the kind that charge thousands of dollars for subscriptions.

Erik@HdCoder·3d

@ben_j_todd Maybe because LeCun *is* right about some things?

Benjamin Todd@ben_j_todd·3d

How is it possible to write a substack with 6000+ likes where the main message is “LeCun is right about everything”?

Erik@HdCoder·3d

@hxiao Or maybe it's a bad idea to let models pick applicants, given that the reason you need humans is precisely that the models' judgement is flawed. This is a downward spiral

Han Xiao@hxiao·4d

autoresearch basically starts the era of disposable models. AI labs that can't automate their own R&D pipeline will be outrun by those that can. The moat isn't talent anymore - it's the speed of your automated experimentation loop.
- minimax 2.7 was built from an autoresearch-like pipeline - models designing models.
- half-life of a frontier model is now down to a month in 2026
- at minimax & miromind, the model now decides who to hire. Not HR. Not hiring managers. The model evaluates market talent, identifies capability gaps, and recommends candidates.
If your AI can build the next AI, it sure as hell can pick the humans it needs to assist the process.

Erik@HdCoder·3d

@effectfully Tinygrad is awesome, but they don't operate inside a company or use VC funding.

effectfully@effectfully·3d

bro

the tiny corp@__tinygrad__

Few know this, but I (George) was the only person in history to get a perfect score in CMU compilers, which is likely the best compilers course in the world. Combine that with crazy low level knowledge of hardware from 10 years of hacking. Then add a team of people who are talented enough to push back on my dumb ideas and clean up the implementations of the good ones. The team who keeps this whole operation running, software, infrastructure, and product. I love how there's no hype in deep learning compilers. It was one of the most annoying things about self driving cars, all the noobs who burned through billions on crap that was obviously dumb, and the companies who deserved to go bankrupt years ago if not for government bailouts (Tesla and China will devour them all). In this space, the competition is @jimkxa at Tenstorrent, @clattner_llvm at Modular, and @JeffDean at Google. Three of the living legends of computer science. And companies like @nvidia and @AMD, who are definitely live players, making single chips that have more power than the whole Internet two decades ago. This space is so fun to play in. If you haven't, read the tinygrad spec. It's all coming together beautifully.

Erik@HdCoder·3d

@andersonbcdefg The funny thing is Rust is not even that fast

Ben (no treats)@andersonbcdefg·4d

people are discovering the "rewrite everything in fast language with codex" life hack

Rach@rachpradhan

We replaced urllib3 inside boto3 with a Zig HTTP client. One import line. Same API. Up to 115x faster with TurboAPI. import faster_boto3 as boto3 Here's what happened..

Erik@HdCoder·5d

@Ranger80919 @TechLayoffLover You can always become a plumber if AGI turns out to kill all CS jobs. Telling people to stop educating themselves is not something we should promote. Especially not the rude, fear-mongering kind from the guy with the original post

Ranger Due@Ranger80919·5d

@HdCoder @TechLayoffLover Honestly, tech jobs are dropping in salary and becoming more scarce. I would not go into computer science if I were 18 today. I would be a plumber. They can charge whatever they want, people are always glad to see them, and glad to pay. Will never be replaced….

Tech Layoff Tracker@TechLayoffLover·9 Mar

A CS professor at a mid-tier state university just sent me their internal placement data.
Fall 2023: 89% of their graduates had offers by graduation. Average starting salary $94k.
Spring 2024: 71% placement rate. Average dropped to $78k.
Fall 2024: 43% placement rate. Those who got offers averaged $61k.
Spring 2025: 31% of graduates employed in software roles six months out.
This semester? 19% placement rate and falling.
Faculty meeting last Tuesday got heated when the department chair suggested "pivoting curriculum toward AI collaboration skills". One professor stood up and said "we're teaching students to build the systems that eliminate their own jobs".
The career fair last month had 12 companies show up. Half were MLMs and insurance sales.
Students keep asking why they're learning data structures when the job postings all say "3+ years experience with LLM integration".
Professor told me the hardest part is the parent meetings: "My daughter took out $140k in loans for this degree and she's working at Starbucks".
Meanwhile the university is still running ads promising "94% job placement rates in high-growth tech careers".
The disconnect is crushing everyone involved. Faculty knows the industry has fundamentally shifted but the marketing department is still selling the 2019 dream.
These kids mortgaged their futures for careers that evaporated while they were in class.

Erik@HdCoder·5d

@TechLayoffLover I think you should stop fear mongering on the internet

Tech Layoff Tracker@TechLayoffLover·5d

@HdCoder Do you lick your thumb after you pull it out from your asshole?

Erik@HdCoder·6d

@martin_casado All that tells me is that they overfit on a benchmark

Erik@HdCoder·6d

@tomfgoodwin This stuff is just a more sophisticated version of context engineering. Won't be needed once AI moves beyond pretraining!

Tom Goodwin@tomfgoodwin·6d

I’m surely being stupid. But if AI is rather unconstrained by expertise or capacity or, to some extent, speed, why do we need to divide tasks or departments among 9 agents (the marketing agent, the optimization agent, etc.) to each do one thing, and then have another agent manage the swarm? Can't one agent just be doing it all, you know? It seems very skeuomorphic. Will we have HR agents to make sure the agent agents are being looked after? An office canteen manager agent to feed the agents? Seems daft

Erik@HdCoder·6d

@nafonsopt The lack of good debuggers is by far the biggest issue with Linux, I agree. Apart from that, the bloat is overstated imo. I got used to the (still annoying) package managers etc very quickly

Nuno Afonso@nafonsopt·6d

For anybody saying "Just use Linux", you need to realise that Linux is worse than Windows. Windows has all the bloat, and while you can have Linux without any of that you still don't have tools like RemedyBG, RAD Debugger and Superluminal. Once you have such tools, then Linux is a suitable app development environment. But _it is still trash_ because of the whole Linux model of you needing to compile everything. The fact that you cannot run an app built using a newer version of glibc is an insane decision. I shouldn't have to upgrade my whole machine in order to run something built on a newer version. I shouldn't be worried that an upgrade will break my machine. I shouldn't be forced to compile things from scratch to work on my machine. I shouldn't be forced to install N packages, I just want self contained binaries I can just download and run. I shouldn't be forced to develop with an old distro to have "max glibc compatibility". I shouldn't have to worry about X11 / Wayland / Window Managers. I shouldn't have to worry about asking the user to select a folder, display a dialog or show notifications. Linux is such a huge waste of potential, if they got their shit together they would completely obliterate Windows. I first got into Linux in 2000, and even back then there was this "it will take over Windows any time now!". It's been _26 years_! The same way I'd pay quite a lot for Windows without any bloat, I'd be willing to pay for a distro that gives me all this.

Nuno Afonso@nafonsopt

Anybody who thinks that it is ok for telemetry to use 100% of your CPU should be fired immediately.

Erik@HdCoder·18 Mar

@wholemars Just wait for the full deployment... I think their tech is roughly the right approach for self driving, but only the final large-scale deployment is a good judge. Who knows what they are doing to smooth out the error rates, e.g. having teleoperation for all vehicles.

Whole Mars Catalog@wholemars·18 Mar

Shitting on Tesla for not having enough driverless Robotaxis at this point is like brushing off the moon landing because “they only did it once”. They figured out how to make a car operate with NO DRIVER using 5 MEGAPIXEL CAMERAS. Do you not understand what that means? It means we can make every car on the road autonomous. This is an incredible technical achievement that everyone said was impossible. And now the same morons who said it was impossible are faulting them for a safe and responsible rollout

Erik@HdCoder·18 Mar

@SebAaltonen How much speedup would you estimate Codex gave you during that refactor? Were these mistakes easy to miss or rather obvious to you?

Sebastian Aaltonen@SebAaltonen·18 Mar

Just noticed that Codex ported our shadow map cascade to be D32F instead of D16. Now restoring our old D16 cascade implementation. To improve the precision, we compressed the range to the visible main frustum's Z region. Shadow vertices in front of the ortho frustum's front plane were clamped to it. We kept a margin of a few meters to ensure that visible shadow triangles were not affected.
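A rough illustration of why compressing the range helps a 16-bit shadow map: with an orthographic projection, depth is stored linearly, so a D16 (16-bit unorm) buffer splits the covered range into 65535 equal steps. This is only a sketch; the function name and the 1000 m and 100 m ranges are made-up assumptions, not values from the tweet.

```python
def depth_step_m(range_m: float, bits: int = 16) -> float:
    """Smallest depth difference a linear unorm depth buffer can resolve.

    Assumes an orthographic projection, where stored depth maps
    linearly onto the covered Z range.
    """
    return range_m / (2**bits - 1)


# Hypothetical ranges: the whole light frustum vs. only the visible Z region.
full_range_step = depth_step_m(1000.0)
compressed_step = depth_step_m(100.0)

print(f"D16 over 1000 m: {full_range_step * 1000:.2f} mm per step")
print(f"D16 over  100 m: {compressed_step * 1000:.2f} mm per step")
```

Shrinking the covered range by 10x shrinks the quantization step by the same factor, which is the effect the range compression relies on.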

Sebastian Aaltonen@SebAaltonen

List of all pessimizations Codex did while porting our old render pipe to new code base:
- 28 byte -> 56 byte vertex format (full fp32 instead of compact packed format)
- Full fp32 shader ALU (no double rate fp16)
- No packed/fp16 varyings (BW waste on mobile)
- Each draw call has camera matrices. (bind group 0 shared data bound once per-pass before)
- 4x4 matrices instead of 4x3 affine matrices (25% fatter)
- Safe normalize everywhere
- RGBA16F IBL instead of RG11B10F (2x fatter, half rate filter + doesn't DCC on all mobile GPUs)
I instructed Codex to fix each of these issues when I found them and it did a pretty good job. But sometimes did stupid things like using fp16 for UV varyings (not enough precision). Have to review carefully.
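The size figures in the tweet above can be checked with simple arithmetic. A minimal sketch, assuming 4 bytes per fp32 component and 2 bytes per fp16 component; the variable names are illustrative and not from the thread.

```python
FP32_BYTES = 4  # one 32-bit float
FP16_BYTES = 2  # one 16-bit float

# Transform matrices stored as fp32.
mat4x4_bytes = 4 * 4 * FP32_BYTES  # 64 bytes
mat4x3_bytes = 4 * 3 * FP32_BYTES  # 48 bytes

# The same gap expressed two ways.
saving_vs_4x4 = 1 - mat4x3_bytes / mat4x4_bytes  # 4x3 is 25% smaller than 4x4
growth_vs_4x3 = mat4x4_bytes / mat4x3_bytes - 1  # 4x4 is ~33% larger than 4x3

# Texel sizes for the two IBL formats.
rgba16f_bytes = 4 * FP16_BYTES        # four fp16 channels = 8 bytes
rg11b10f_bytes = (11 + 11 + 10) // 8  # 32 bits packed = 4 bytes

# Vertex format sizes quoted in the tweet.
vertex_growth = 56 / 28

print(f"4x3 saves {saving_vs_4x4:.0%}; 4x4 costs {growth_vs_4x3:.0%} more")
print(f"RGBA16F / RG11B10F = {rgba16f_bytes / rg11b10f_bytes:.0f}x")
print(f"vertex format = {vertex_growth:.0f}x")
```

Under these assumptions, the "25%" figure matches the saving of 4x3 relative to 4x4; measured the other way, 4x4 is about a third larger.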

Erik@HdCoder·18 Mar

@LukeParkerDev I think this stuff can be managed. But tbh, I'm too inexperienced to know how bad it really is. Even the latest GPT models still produce code that would never pass review if a human had written it. And whether it can be fixed, I guess nobody knows.

Luke Parker@LukeParkerDev·18 Mar

> AI is great at writing tests for you
> the tests
I am yet to find something vibed up that has not caused problems later (even if it is months later)

Luke Parker@LukeParkerDev

just merged this PR... deslopity deslopity

Erik@HdCoder·18 Mar

@matiasgoldberg Yes, you're right. What offends most people is not Gen AI (which is awesome), but the weirdos calling for everyone to drop their standards, resign, etc.

Matías N. Goldberg@matiasgoldberg·18 Mar

If you look back hard enough, you'll find the same or similar hate/arguments were made about:
- Photography replacing painters
- CGI in late 90s replacing 2D art
*Good* gen AI takes a lot of time and dedication. Just like the difference between random photos vs taken by pros

Zach Fuller@zachtothefuller

I think @maximilian_ nailed generative AI right on the head
