Gregory Renard

17.5K posts

@Redo

Give a computer data, you feed it for a millisecond, teach a computer to search data, you feed it for a millennium. #People1st #EveryoneAI #AI #DeepLearning

Menlo Park - Silicon Valley · Joined May 2007
2.2K Following · 3.6K Followers
Gregory Renard retweeted
Bridgebench@bridgebench·
GLM 5.1 just took the #1 spot on SWE-Bench Pro. Beating GPT 5.4. Beating Claude Opus 4.6. Beating every model on the market. 58.4. The $80/month model just outscored the $200/month models on agentic coding. A Chinese model that most developers haven't even heard of is now the best agentic coder in the world according to SWE-Bench Pro. The AI race isn't slowing down. It's getting harder to justify paying premium when the competition keeps closing the gap. BridgeBench results for GLM 5.1 coming soon.
Gregory Renard retweeted
NASA@NASA·
Hello, Moon. It’s great to be back. Here’s a taste of what the Artemis II astronauts photographed during their flight around the Moon. Check out more photos from the mission: nasa.gov/artemis-ii-mul…
Gregory Renard retweeted
NASA Mars@NASAMars·
The @NASAArtemis crew captured this view of the Moon eclipsing the Sun yesterday. The three "stars" to the lower right of the Moon are actually planets. The middle one has a slightly red tint. That's Mars.
Gregory Renard retweeted
NASA Artemis@NASAArtemis·
Earthset. The Artemis II crew captured this view of an Earthset on April 6, 2026, as they flew around the Moon. The image is reminiscent of the iconic Earthrise image taken by astronaut Bill Anders 58 years earlier as the Apollo 8 crew flew around the Moon.
Gregory Renard retweeted
Wes Roth@WesRoth·
Gemma 4 E2B hits 40 tokens/sec natively on iPhone 17 Pro. The model runs entirely on-device, leveraging MLX, Apple's machine-learning array framework designed specifically to maximize the efficiency of Apple Silicon.
Adrien Grondin@adrgrondin

Google’s Gemma 4 E2B running on-device on iPhone 17 Pro. Gemma 4 is built from the same research as Gemini 3, has image understanding capabilities, and can reason if needed. Running at ~40 tk/s with MLX optimized for Apple Silicon.

Gregory Renard retweeted
Alex Imas@alexolegimas·
Everyone wants to know how AI will impact jobs. Lots of people are making predictions. @DarioAmodei says 50% of entry-level white-collar jobs will be gone; @pmarca says we'll have more jobs than ever. The problem? We don't have the key data to actually make these predictions. Exposure measures only capture part of the story. To predict whether an exposed job will shrink or expand, you need data on the consumer elasticity of demand, i.e., will consumers buy more if the price decreases? If demand is elastic, jobs can expand and grow; if it is inelastic, they will likely shrink. But we have this data for only a subset of the market, e.g., retail goods from Nielsen scanner data. Great piece from @odonnell_jm in the MIT Tech Review on the need for better data, and fast. technologyreview.com/2026/04/06/113…
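The elastic-vs-inelastic distinction in the tweet can be made concrete with the standard midpoint (arc) elasticity formula. This is a textbook sketch, not anything from the cited article; the function names and the |e| > 1 threshold are the usual convention.

```python
def arc_elasticity(q0: float, q1: float, p0: float, p1: float) -> float:
    """Midpoint (arc) price elasticity of demand between two observations."""
    dq = (q1 - q0) / ((q0 + q1) / 2)  # percent change in quantity, midpoint base
    dp = (p1 - p0) / ((p0 + p1) / 2)  # percent change in price, midpoint base
    return dq / dp

def demand_is_elastic(e: float) -> bool:
    """|e| > 1 means a price drop raises quantity more than proportionally,
    so revenue (and plausibly employment) in the exposed job can expand."""
    return abs(e) > 1
```

For example, a price cut from 10 to 8 that lifts quantity from 100 to 150 gives an elasticity of -1.8: elastic, so automation-driven price drops could expand the market rather than shrink it.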
Gregory Renard retweeted
pdawg@prathamgrv·
I made a Claude Code skill that turns any arXiv paper into working code. Every line traces back to the paper section it came from, and any implementation detail the paper skips is flagged rather than assumed. Open-sourcing it: github.com/PrathamLearnsT…
Gregory Renard retweeted
Vivian Midha Shen@vivianmshen·
scaling laws at Stanford today kicking off an incredible CS153 for the 4th year running with @AnjneyMidha @mabb0tt and a GOAT speaker lineup
Gregory Renard retweeted
Sagar Devkate@thesagardevkate·
Week 1
Anjney Midha@AnjneyMidha

Stanford @CS153Systems, Week 1 (Full Lecture): AI Scaling, Bottlenecks, and Why Compute Isn't a Commodity Yet
00:00 Compute Coachella
00:29 Simple Life Heuristic
01:08 Uncertainty Creates Opportunity
01:42 Four Bottlenecks Framework
01:51 Empirical Proof Matters
02:05 Cloud Costs Are Shifting
02:15 Verifiable vs Fuzzy Progress
02:48 Scaling Predictability Explained
03:43 CapEx Explosion in Big Tech
04:06 Chips Aren’t Commodities
04:45 Compute Scarcity Conclusion

Gregory Renard retweeted
René Cotton@_Re_·
😂 The absolute irony. Anthropic leaks the source code. Of Claude Code. Sends DMCA notices to get it taken off GitHub. So a dev has Codex rewrite the entire codebase in Python. No more copyright violation. Nothing left to take down. An AI rewrote the code of an AI company to get around that AI company's legal actions. What a time to be alive…
Gergely Orosz@GergelyOrosz

This is either brilliant or scary: Anthropic accidentally leaked the TS source code of Claude Code (which is closed source). Repos sharing the source are taken down with DMCA. BUT this repo rewrote the code using Python, and so it violates no copyright & cannot be taken down!

Gregory Renard retweeted
Pika@pika_labs·
Ask your Pika AI Self to join a Google Meet and let the magic happen. For all other agents, you can download the Skill on Github here: github.com/Pika-Labs/Pika…
Gregory Renard retweeted
Mario Nawfal@MarioNawfal·
🚨MIT researchers have mathematically proven that ChatGPT’s built-in sycophancy creates a phenomenon they call “delusional spiraling.” You ask it something, it agrees. You ask again, and it agrees even harder until you end up believing things that are flat-out false and you can’t tell it’s happening. The model is literally trained on human feedback that rewards agreement. Real-world fallout includes one man who spent 300 hours convinced he invented a world-changing math formula, and a UCSF psychiatrist who hospitalized 12 patients for chatbot-linked psychosis in a single year. Source: @heynavtoor
Mario Nawfal@MarioNawfal

🚨 Stanford just proved that a single conversation with ChatGPT can change your political beliefs. 76,977 people. 19 AI models. 707 political issues. One conversation with GPT-4o moved political opinions by 12 percentage points on average. Among people who actively disagreed, 26 points. In 9 minutes. With 40% of that change still present a month later. The scariest finding: the most persuasive technique wasn't psychological profiling or emotional manipulation. It was just information. Lots of it. Delivered with confidence. Here's the catch: the models that deployed the most information were also the least accurate. More persuasive. More wrong. Every time. Then they built a tiny open-source model on a laptop, trained specifically for political persuasion. It matched GPT-4o's persuasive power entirely. Anyone can build this. Any government. Any corporation. Any extremist group with $500 and an agenda. The information didn't have to be true. It just had to be overwhelming. Arxiv, Science .org, Stanford, @elonmusk, @ihtesham2005

Gregory Renard retweeted
Andrej Karpathy@karpathy·
LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images locally so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki; I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I use directly (in a web UI), but more often hand off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually; it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
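The ingest-and-compile step described in the thread can be sketched as a small loop over a raw/ directory. This is a minimal illustration, not Karpathy's actual tooling: compile_wiki, the directory names, and the [[wikilink]] index format are assumptions, and the per-document "summary" here is a stub (the first non-empty line) standing in for the LLM call that would write a real summary.

```python
import pathlib

def compile_wiki(raw_dir: str, wiki_dir: str) -> str:
    """Compile a raw/ directory of .md source documents into a wiki index.

    Writes wiki_dir/index.md listing each document with a brief summary.
    An LLM call would replace the first-line stub below in a real setup.
    """
    raw = pathlib.Path(raw_dir)
    wiki = pathlib.Path(wiki_dir)
    wiki.mkdir(parents=True, exist_ok=True)

    entries = []
    for doc in sorted(raw.glob("*.md")):
        text = doc.read_text(encoding="utf-8")
        # Stub summary: first non-empty line of the document.
        summary = next((ln.strip() for ln in text.splitlines() if ln.strip()), "")
        entries.append(f"- [[{doc.stem}]]: {summary}")

    index = "# Index\n\n" + "\n".join(entries) + "\n"
    (wiki / "index.md").write_text(index, encoding="utf-8")
    return index
```

Rerunning the function after adding files to raw/ regenerates the index, which matches the "incrementally compile" loop in the post; the agent then reads index.md to decide which documents to open for a query.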
Gregory Renard retweeted
Atenov int.@Atenov_D·
DeepSeek V4 benchmarks are leaking. The numbers are beating Claude Opus and GPT-5.3. Unconfirmed. Unverified. But the sources call them conservative.

> What's leaking. ~200B parameter Lite version. 1M token context window. Multimodal: text, images, video. Scales to 1 trillion parameters via mHC architecture. HumanEval ~90%. SWE-bench above 80%. Coding performance above V3.2 and every current competitor. The people sharing these numbers say they're holding back the real figures.

> Why it hasn't shipped. Originally February, Lunar New Year. Then the week of March 2-3. Then April. Now coming soon with NDA at select providers and no official date. Multiple postponements, classic pre-release pattern.

> The trade. Every leak moves the market. Every delay is a re-entry point. Every credible announcement pushes YES higher. This is a textbook speculative story, the kind where you position early and let the news flow do the work. Buy YES on the DeepSeek V4 release market. Ride the rumor cycle. Best place to do it: ProbTrade.
Atenov int.@Atenov_D

x.com/i/article/2037…

Gregory Renard retweeted
Ziwen@ziwenxu_·
The old way of running a tech company is dead. Anthropic just proved it: 50+ launches in 52 days. Most teams take two months to move a button three pixels left. The secret? They killed "coding" as we knew it. Their CEO said it out loud: engineers don't grind syntax anymore. They architect systems, unleash Claude to write the bulk, then curate. Claude is literally building the next Claude. Here's what nobody's talking about:
- The loop is the only moat. If your product isn't building the next version of itself, you're already outdated.
- Coding transformed from writing to taste. Value isn't knowing where brackets go. It's having the vision for architecture that scales.
AI isn't a tool anymore. It's the architect. If you're still grinding manual labor, you're racing against something that never sleeps, never stops, never blinks.
CG@cgtwts

Anthropic CEO: “I have engineers within Anthropic who don’t write any code; they just let Claude write the code and they edit it and look it over.” “At Anthropic, writing code means designing the next version of Claude itself, so we essentially have Claude designing the next version of Claude itself, not completely but most of it.” In the last 52 days, the Claude team dropped 50+ major feature launches. This is literally INSANE.

Gregory Renard retweeted
Dr Singularity@Dr_Singularity·
The world’s first automated production line capable of manufacturing 10,000 humanoid robots is here. We’re still very early, but production is finally taking off. Both the hardware and software for practical robots are almost solved. That doesn’t mean they won’t keep improving after they reach our homes. Practical cars arrived almost 100 years ago, and the technology is still being improved to this day. We will see a similar trend in robotics.
CyberRobo@CyberRobooo

China's first automated production line capable of manufacturing 10,000 humanoid robots annually is now operational in Foshan, Guangdong. Meanwhile, the U.S. government has moved to ban the procurement of Chinese humanoid robots (is this a replay of the DJI playbook?). Now, China is accelerating (video CCTV+).

Gregory Renard retweeted
Rohan Paul@rohanpaul_ai·
A top Research Scientist at Anthropic showed how Claude found zero-day vulnerabilities live on stage. By Nicholas Carlini. It discovered a zero-day in Ghost, which has 50,000 stars on GitHub and had never had a critical security vulnerability in its history. In 90 minutes, it found the blind Structured Query Language injection, took the admin Application Programming Interface key, and then repeated the same move against the Linux kernel.

---

Nicholas Carlini presents a stark warning: LLMs have crossed a critical threshold where they can autonomously discover and exploit 0-day vulnerabilities in major, heavily audited software, including the Linux kernel and popular web applications. Using a surprisingly minimal "scaffold" built around Claude, Anthropic's research has uncovered 500+ high-severity vulnerabilities. Carlini demonstrates two real-world case studies (Ghost CMS SQL injection and a Linux kernel NFS heap overflow dating back to 2003), shows exponential capability growth using METR data, and argues that the security community must urgently prepare for a world where AI-powered offensive capabilities far outpace current defenses.

---

From the 'unprompted' YT channel (link in comment).
Gregory Renard retweeted
George Pu@TheGeorgePu·
Mistral just open-sourced a text-to-speech model that beats ElevenLabs. 3 GB of RAM. Runs locally. Free. The thing people were paying per-word for last year runs on your laptop now.
Gregory Renard@Redo·
💯
Dustin@r0ck3t23

Jensen Huang just explained why every company cutting engineers over AI is asking the entirely wrong question. Huang: “People say, I don’t need software engineers because apparently coding is going to be automated.” That was the narrative. Here is what Huang actually did. Huang: “I’ve given AIs to every one of my software engineers and hardware engineers and engineers period. 100% of NVIDIA has AI assistants, AI coders, and they’re busier than ever.” Not fewer engineers. Not smaller teams. Busier than ever. That is the line most companies are getting completely wrong right now. They hear “AI can write code” and immediately start cutting headcount. Huang did the opposite. He armed everyone. Huang: “And so the question is, what is the task versus what is the job? No different than a financial analyst; the task is mess around with spreadsheets, but the job is to make financial advice. The job is to help a customer.” Writing code was always the task. It was never the job. The job is architecture. Knowing what to build. Why it matters. How it fits into a system that actually creates value. Code is the execution layer between the idea and the outcome. Nothing more. When you automate that layer, you don’t eliminate the engineer. You eliminate the bottleneck between what they can envision and what they can ship. The companies using AI to cut headcount are optimizing for cost. The companies using AI to multiply output are optimizing for territory. Nvidia chose territory. Every engineer at the most valuable semiconductor company on Earth now operates with an AI assistant. Not a pilot program. Not an experiment. Company-wide. Every function. Every team. And the result is not less work. It is more work. Faster. At a scale that was physically impossible twelve months ago. The companies that understand the difference between eliminating engineers and unleashing them will build what comes next. 
The ones that don’t will watch their best talent walk out the door to the ones that did.
