baldwin

1.6K posts

@baldodev

Living at the Edge of Chaos

Joined February 2024

123 Following · 77 Followers

baldwin retweeted

H@hcompany_ai·7h

Holo3 is here 🚀. Today, we're launching Holo3: our new series of frontier computer-use models. 78.9% on OSWorld-Verified. That puts us ahead of GPT-5.4 and Opus 4.6, at one-tenth of the cost. Weights on Hugging Face. API is live. Test it now! #Holo3 #OpenSource #ComputerUse #OSWorld #AI #AgenticAI

English

812

59K

baldwin@baldodev·6h

@Rahatcodes rip my anthropic social credit score

English

179

rahat@Rahatcodes·8h

Claude Code has a regex that detects "wtf", "ffs", "piece of shit", "fuck you", "this sucks" etc. It doesn't change behavior...it just silently logs is_negative: true to analytics. Anthropic is tracking how often you rage at your AI. Do with this information what you will.

English

398

549

11K

829.7K

baldwin@baldodev·1d

@x225franc @CommieSlayer8 @Pirat_Nation was playing arc raiders with max settings in 4K with framegen x4 hitting 280 FPS and the experience felt smooth without much input delay, though i am running a 5090

English

134

Kaido@x225franc·1d

@CommieSlayer8 @Pirat_Nation man i swear going to 3x increase latency so fucking much .. why don't they just optimize games ...

English

1.3K

Pirat_Nation 🔴@Pirat_Nation·1d

Nvidia's DLSS 4.5 Dynamic Multi Frame Generation and 6X Multi Frame Generation* beta arrives tomorrow >6X Mode lets the AI generate up to five extra frames for every one frame rendered by the GPU. >Dynamic Mode automatically adjusts the frame generation multiplier in real time to target your chosen FPS cap or monitor refresh rate

English

120

1.7K

154.2K

baldwin@baldodev·1d

@SamaHoole I must have missed the patch notes cuz i went for a walk this morning in the sun and didn’t even pay a dime!

English

Sama Hoole@SamaHoole·2d

The sun was free. They sold you SPF 50 and a vitamin D deficiency. Sleep was free. They sold you an app, a pill, and a wearable that tells you your sleep was bad. Walking was free. They sold you a treadmill, a fitness tracker, and a £180 pair of trainers. Fasting was free. They sold you meal replacement shakes and the anxiety that skipping breakfast would wreck your metabolism. Cold water was free. They sold you a £3,000 plunge barrel and a podcast episode about it. Silence was free. They sold you a meditation app with a premium tier. Animal fat was cheap. They sold you seed oils, then supplements to replace what the animal fat contained. Tallow was cheap. They sold you a seventeen-step skincare routine and a clinical trial proving your face needs ceramides. Meat was cheap. They are currently selling you the idea that you shouldn't eat it. The 20th century removed access to everything the body needs to function. The 21st century is selling it back, one subscription at a time. Your great-grandmother had none of the products. She had all of the things.

English

391

6.6K

28.8K

2.4M

baldwin@baldodev·1d

i shall no longer lose braincells reading schizoposts on AGI

English

baldwin retweeted

Evolve Performance@evolveOS_·2d

We spent 6 months building a PC optimization tool that actually tells you what it does. No "boost your PC" marketing. Here's exactly what Evolve strips, tunes, and rewires - and why your $2000 setup still feels sluggish on stock Windows. Launching in 2h.

English

9.2K

baldwin retweeted

Evolve Performance@evolveOS_·3d

We built custom kernel configurations for every major CPU and GPU combination. Not a preset pack. Not a registry cleaner. A desktop app that knows your hardware. Launching tomorrow with 10 free upgrades from Evolve to our highest tier ($180 value). Follow us so you dont miss it

English

7.5K

baldwin retweeted

𝐀𝐍𝐓𝐔𝐍𝐄𝐒@Antunes1·4d

Host: Sir, do you know if Iranians are starving? Trump: Yeah I do. But you’re so sexy.

English

740

3.3K

33.5K

5.2M

baldwin@baldodev·4d

braindead

TravelGov@TravelGov

Hong Kong: On March 23, 2026, the Hong Kong government changed the implementing rules relating to the National Security Law. It is now a criminal offense to refuse to give the Hong Kong police the passwords or decryption assistance to access all personal electronic devices including cellphones and laptops. This legal change applies to everyone, including U.S. citizens, in Hong Kong, arriving or just transiting Hong Kong International Airport. In addition, the Hong Kong government also has more authority to take and keep any personal devices, as evidence, that they claim are linked to national security offenses. Read more: hk.usconsulate.gov/security-alert…

English

baldwin retweeted

Mark Gadala-Maria@markgadala·6d

While Americans argue over what is "AI slop" the Chinese are busy creating absolute cinema using Seedance 2.

English

246

510

5.9K

612.8K

baldwin@baldodev·4d

@AI_evangelist42 @arcprize a more correct representation of the results is: "raw API models, without any tool access (not representative of real world use cases), score 100x worse than top 20% of humans on this specific benchmark"

English

baldwin@baldodev·4d

@AI_evangelist42 @arcprize they did everything possible to make the AI models seem as stupid as possible, including misleading "percentage" scores. "<1%" is the squared error: for example, a 10/100 score would be 1%, so the model would look 100x worse when it's actually 10x worse

English

ARC Prize@arcprize·6d

Announcing ARC-AGI-3. The only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know, ARC-AGI-3 tests how they learn.

GIF

English

237

586

4.3K

680.8K

baldwin@baldodev·5d

@d4m1n "meanwhile 100% of human testers solved every single environment. first try." no, they only picked top #2 out of 10 human scores for each game. i guess the other 8 people can't be classified as humans lol?

English

114

Dan ⚡️@d4m1n·5d

so let me get this straight > OpenAI renamed their whole division "AGI Deployment" > Jensen said AGI is "already in the room" and then ARC-AGI-3 drops and on a scale from 0 to 100% SOTA models score: GPT-5.4: 0.26% Gemini Pro: 0.37% Claude Opus 4.6: 0.25% Grok: literally 0% meanwhile 100% of human testers solved every single environment. first try. no instructions. no training. AGI is "in the room" brother it couldn't find the room

English

143

180

2.9K

245.4K

baldwin@baldodev·5d

more AGI propaganda 😔, jensen huang and literally the inventor of the term "AGI" both said we have AGI, so what point is ARC-AGI trying to make now? no harness, unscientific study, and a very human-biased scoring methodology. Plus a misleading "percentage" labeling

ARC Prize@arcprize

English

baldwin@baldodev·5d

@AI_evangelist42 @arcprize spoiler its not 100x

English

115

Kyler@AI_evangelist42·6d

@arcprize humans beating AIs by 100x is unprecedented. sounds like a great benchmark

English

5.6K

baldwin@baldodev·5d

@kevinhoff @DeryaTR_ it is not really a percentage, it's their own human-biased efficiency score: docs.arcprize.org/methodology#sc…

English

Kevin Hoff@kevinhoff·6d

@DeryaTR_ The benchmark isn't the point. The gap it reveals is. Argue the methodology all day, the models still score under 1%.

English

Derya Unutmaz, MD@DeryaTR_·6d

ARC-AGI-3 is an important benchmark. However, I have a major issue with the “Human score 100%” statement. How many humans have tested all 1000 puzzles? How were people selected? This was not published for previous ARCs either. In one case, the human score was based on I think 2 people. This is really an unscientific way, as it assumes all humans are the same or that previous exposure to puzzles or video games, for example, is not considered. What education level and background did these humans have? I am sure humans will still score highly, but it would be very surprising if this was 100%. Without this data and scientific measurement, this appears a biased test that assumes solving 100% of the puzzles is purely intrinsic intelligence common to all humans.

ARC Prize@arcprize

English

280

39.1K

baldwin retweeted

ARC Raiders PVP@ArcRaidersPVP·18 Mar

"If you kill me you hate women"

English

180

1.2K

18.1K

1.3M

baldwin@baldodev·18 Mar

why are my DEXA scan results like 10x less accurate than my inbody results

English

baldwin@baldodev·17 Mar

@opinali @digitalfoundry finally a use case for my 2200W PSU

English

159

Osvaldo Pinali Doederlein@opinali·16 Mar

Oh my god, from @digitalfoundry: this thing needs TWO RTX 5090's to render those demos 🤣😭😭😭 Yes it will be optimized but don't hope for 3X. This tech will be a techdemo for Blackwell users, practical only for RTX 6000. So here's your carrot for the next upgrade.

English

181

3.2K

160.4K
