joe
@joelo

6.9K posts
A dad, husband, fintech innovator, musician, hockey goalie, mountain biker, and photographer living in Port Moody, BC

Joined February 2008
321 Following · 254 Followers
joe @joelo ·
The MacBook neo is what I always wanted my iPad to be. Love it!!
0 replies · 0 reposts · 0 likes · 28 views
joe retweeted
Greg Brockman @gdb ·
Software development is undergoing a renaissance in front of our eyes. If you haven't used the tools recently, you are likely underestimating what you're missing. Since December, there's been a step-function improvement in what tools like Codex can do. Some great engineers at OpenAI told me yesterday that their job has fundamentally changed since December: prior to then, they could use Codex for unit tests; now it writes essentially all the code and does a great deal of their operations and debugging. Not everyone has made that leap yet, but it's usually because of factors besides the capability of the model.

Every company faces the same opportunity now, and navigating it well — just like with cloud computing or the Internet — requires careful thought. This post shares how OpenAI is currently approaching retooling our teams for agentic software development. We're still learning and iterating, but here's how we're thinking about it right now.

As a first step, by March 31st, we're aiming that: (1) for any technical task, the tool of first resort for humans is interacting with an agent rather than using an editor or terminal; and (2) the default way humans use agents is explicitly evaluated as safe, but also productive enough that most workflows do not need additional permissions.

To get there, here's what we recommended to the team a few weeks ago:

1. Take the time to try out the tools. The tools do sell themselves — many people have had amazing experiences with 5.2 in Codex after having churned from codex web a few months ago. But many people are also so busy they haven't had a chance to try Codex yet, or got stuck wondering "is there any way it could do X" rather than just trying.
- Designate an "agents captain" for your team — the primary person responsible for thinking about how agents can be brought into the team's workflow.
- Share experiences or questions in a few designated internal channels.
- Take a day for a company-wide Codex hackathon.
2. Create skills and AGENTS.md files (a minimal sketch appears after this post).
- Create and maintain an AGENTS.md for any project you work on; update it whenever the agent does something wrong or struggles with a task.
- Write skills for anything you get Codex to do, and commit them to the skills directory in a shared repository.
3. Inventory and make accessible any internal tools.
- Maintain a list of tools your team relies on, and make sure someone takes point on making each one agent-accessible (such as via a CLI or MCP server).
4. Structure codebases to be agent-first. With the models changing so fast, this is still somewhat untrodden ground and will require some exploration.
- Write tests that are quick to run, and create high-quality interfaces between components.
5. Say no to slop. Managing AI-generated code at scale is an emerging problem and will require new processes and conventions to keep code quality high.
- Ensure that some human is accountable for any code that gets merged. As a code reviewer, maintain at least the same bar as you would for human-written code, and make sure the author understands what they're submitting.
6. Work on basic infra. There's a lot of room for everyone to build basic infrastructure, guided by internal user feedback. The core tools are getting a lot better and more usable, but a lot of infrastructure currently lives around the tools, such as observability, tracking not just the committed code but the agent trajectories that led to it, and central management of the tools that agents are able to use.
Overall, adopting tools like Codex is not just a technical but also a deep cultural change, with a lot of downstream implications to figure out. We encourage every manager to drive this with their team, and to think through other action items — for example, per item 5 above, what else can prevent a lot of "functionally-correct but poorly-maintainable code" from creeping into codebases?
413 replies · 1.6K reposts · 12.3K likes · 2.1M views
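For reference, here is a minimal sketch of what an AGENTS.md along the lines of item 2 above might contain. The file name comes from the post; every section, path, and command below is illustrative, not an OpenAI template:

```markdown
# AGENTS.md (illustrative sketch, not an official template)

## Build and test
- `make test` runs the fast unit suite; run it before proposing any diff.
- `make lint` must pass; CI rejects unformatted code.

## Conventions
- New endpoints go in `api/`; shared helpers go in `lib/`.
- Prefer small, reviewable diffs over sweeping refactors.

## Known pitfalls
- Fixtures in `tests/data/` are generated by `scripts/gen_fixtures.py`;
  never edit them by hand. (Added after an agent repeatedly hand-edited them.)
```

The last entry reflects the post's advice to update the file whenever the agent does something wrong or struggles with a task.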
joe @joelo ·
@OpenAI We want this
0 replies · 0 reposts · 0 likes · 8 views
OpenAI @OpenAI ·
Introducing OpenAI Frontier—a new platform that helps enterprises build, deploy, and manage AI coworkers that can do real work. openai.com/index/introduc…
570 replies · 793 reposts · 6.2K likes · 2.1M views
joe retweeted
dominik kundel @dkundel ·
We just published a new AI-Native Engineering Team guide based on what engineering teams are asking for as they adopt Codex and the new GPT-5.1-Codex-Max model. It covers:
🧩 How coding agents fit into each phase of development, from planning and design through maintenance
🧰 Practical checklists and setup patterns you can use right away
📈 How to introduce agents into an org and scale as teams build trust
Read the guide 👉 cdn.openai.com/business-guide…
28 replies · 208 reposts · 1.7K likes · 268.3K views
joe @joelo ·
@ajassy @OpenAI Will OpenAI's services be available on Bedrock eventually?
0 replies · 0 reposts · 0 likes · 31 views
Andy Jassy @ajassy ·
New multi-year, strategic partnership with @OpenAI will provide our industry-leading infrastructure for them to run and scale ChatGPT inference, training, and agentic AI workloads. Allows OpenAI to leverage our unusual experience running large-scale AI infrastructure securely, reliably, and at scale. OpenAI will start using AWS's infrastructure immediately, and we expect to have all of the capacity deployed before the end of next year, with the ability to expand in 2027 and beyond. aboutamazon.com/news/aws/aws-o…
281 replies · 542 reposts · 4.2K likes · 1.1M views
joe @joelo ·
@leepace It’s extraordinary.
0 replies · 0 reposts · 0 likes · 23 views
joe retweeted
OpenAI Newsroom @OpenAINewsroom ·
These “OpenAI tokens” are not OpenAI equity. We did not partner with Robinhood, were not involved in this, and do not endorse it. Any transfer of OpenAI equity requires our approval—we did not approve any transfer. Please be careful.
626 replies · 814 reposts · 13K likes · 3.9M views
Aravind Srinivas @AravSrinivas ·
You can use Perplexity directly from WhatsApp now. Answers, sources, image generation. A lot more features coming soon there! +1 (833) 436-3285
219 replies · 153 reposts · 2.3K likes · 406.1K views
joe @joelo ·
@sundarpichai Incredible! Just amazing what @Google is doing for computing, AI, and really, everyone.
0 replies · 0 reposts · 0 likes · 67 views
Sundar Pichai @sundarpichai ·
Just announced new versions of Gemma 3 – the most capable model to run on just one H100 GPU – can now run on just one *desktop* GPU! Our Quantization-Aware Training (QAT) method drastically brings down memory use while maintaining high quality. Excited to make Gemma 3 even more accessible to more developers.
162 replies · 482 reposts · 4.8K likes · 547K views
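A note for readers unfamiliar with the technique Pichai mentions: QAT simulates low-precision arithmetic in the forward pass during training, so the weights adapt to the quantization grid before release. Here is a minimal sketch of that general idea in Python; it is the generic fake-quantization recipe, not Google's implementation, and the 4-bit width and symmetric per-tensor scaling are my assumptions.

```python
import numpy as np

def fake_quantize(w: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Simulate low-precision weights during training ("fake quant"):
    snap values to a num_bits integer grid, then map back to floats."""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 7 for int4
    scale = np.abs(w).max() / qmax              # symmetric per-tensor scale
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                            # dequantized, grid-aligned

w = np.random.randn(4, 4).astype(np.float32)
w_q = fake_quantize(w)
print(np.abs(w - w_q).max())  # quantization error the training loop adapts to
```

In a real QAT setup the rounded weights are used in the forward pass while full-precision master weights receive the gradients (a straight-through estimator), which is what lets the memory savings come with little quality loss.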
joe retweeted
Joanne Jang @joannejang ·
// i lead model behavior at openai, and wanted to share some thoughts & nuance that went into setting policy for 4o image generation. features capital letters (!) bc i published it as a blog post:

This week, we launched native image generation in ChatGPT through 4o. It was a special launch for many reasons — one of which our CEO Sam highlighted as "a new high-water mark for us in allowing creative freedom." I wanted to unpack that a bit, as it could be easily missed by those not deep in AI or closely following our evolving thoughts on model behavior (wh… what do you mean you haven't read the sixty-page Model Spec in your free time??).

tl;dr we're shifting from blanket refusals in sensitive areas to a more precise approach focused on preventing real-world harm. The goal is to embrace humility: recognizing how much we don't know, and positioning ourselves to adapt as we learn.

Images are visceral

There's something uniquely powerful and visceral about images; they can deliver unmatched delight and shock. Unlike text, images transcend language barriers and evoke varied emotional responses. They can clarify complex ideas instantly. Precisely because images carry so much impact, we felt even more heft — relative to other launches — in shaping policy and behavior.

Evolving perspectives on launching what feels like a new capability

When it comes to launching (what feels like) a new capability, our perspective has evolved across multiple launches:

1. Trusting user creativity over our own assumptions. AI lab employees should not be the arbiters of what people should and shouldn't be allowed to create. We're always humbled after launch, discovering use cases we never imagined — or even ones that seem so obvious in hindsight but didn't occur to us from our limited perspectives.

2. Seeing risks clearly, but not losing sight of everyday value to users. It's easy to fixate on potential harms, and broad restrictions always feel safest (and easiest!). We often catch ourselves questioning, "do we really need better meme capabilities when the same memes could be used to offend or hurt people?". But I think that framing itself is flawed. It implies that subtle, everyday benefits must justify themselves against hypothetical worst-case scenarios, which undervalues how these small moments of delight, humor, and connection genuinely improve people's lives.

3. Valuing unknown, unimaginable possibilities. Maybe due to our cognitive bias toward loss aversion, we rarely consider the negative impacts of inaction; some people refer to these as "invisible graveyards", although that's a bit too morbid and extreme. There are second-order or indirect impacts unlocked by a new capability: all the positive interactions, innovations, and ideas from people that never materialize simply because we feared the worst-case scenario.

How we thought about policy decisions for Day 1

Navigating these challenges is hard, but we aimed to maximize creative freedom while preventing real harm. Some examples from our launch decisions:

- Public figures: We know it can be tricky with public figures — especially when the lines blur between news, satire, and the interests of the person being depicted. We want our policies to apply fairly and equally to everyone, regardless of their "status". But rather than be the arbiters of who is "important enough", we decided to create an opt-out list to allow anyone who can be depicted by our models to decide for themselves.

- "Offensive" content: When it comes to "offensive" content, we pushed ourselves to reflect on whether any discomfort was stemming from our personal opinions or preferences vs. potential for real-world harm. Without clear guidelines, the model previously refused requests like "make this person's eyes look more Asian" or "make this person heavier," unintentionally implying these attributes were inherently offensive.

- Hate symbols: We recognize symbols like swastikas carry deep and painful history. At the same time, we understand they can also appear in genuinely educational or cultural contexts. Completely banning them could erase meaningful conversations and intellectual exploration. Instead, we're iterating on technical methods to better identify and refuse harmful misuse.

- Minors: Whenever a policy decision involved younger users, we decided to play it safe: choosing stronger protections and tighter guardrails for people under 18 across research and product.

Ultimately, these considerations — coupled with our progress toward more precise technical levers — led us toward more permissive policies. We recognize this might be misinterpreted as "OpenAI lowering its safety standards," but personally, I don't think that does justice to the team's extensive research, thoughtful debates, and genuine love & care for users and society.

My colleague Jason Kwon once passed this on to me: "Ships are safest in the harbor; the safest model is the one that refuses everything. But that's not what ships or models are for." The future is built with imagination and adventure.

As we continue our research and learn from society, we believe we can continue to find ways to responsibly increase user freedom. When (not if!) our policies evolve, updating them based on real-world feedback isn't failure; that's the point of iterative deployment. Please keep sharing your feedback and creations — they genuinely help us improve!
273 replies · 374 reposts · 2.5K likes · 1.2M views
joe retweeted
Satya Nadella @satyanadella ·
A couple reflections on the quantum computing breakthrough we just announced...

Most of us grew up learning there are three main types of matter that matter: solid, liquid, and gas. Today, that changed. After a nearly 20-year pursuit, we've created an entirely new state of matter, unlocked by a new class of materials, topoconductors, that enable a fundamental leap in computing. It powers Majorana 1, the first quantum processing unit built on a topological core.

We believe this breakthrough will allow us to create a truly meaningful quantum computer not in decades, as some have predicted, but in years. The qubits created with topoconductors are faster, more reliable, and smaller. They are 1/100th of a millimeter, meaning we now have a clear path to a million-qubit processor. Imagine a chip that can fit in the palm of your hand yet is capable of solving problems that even all the computers on Earth today combined could not!

Sometimes researchers have to work on things for decades to make progress possible. It takes patience and persistence to have big impact in the world. And I am glad we get the opportunity to do just that at Microsoft. This is our focus: when productivity rises, economies grow faster, benefiting every sector and every corner of the globe. It's not about hyping tech; it's about building technology that truly serves the world.
5.2K replies · 18.6K reposts · 105.8K likes · 27.1M views
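The palm-of-your-hand claim in Nadella's post checks out with back-of-envelope arithmetic, assuming the qubits tile a square grid (a layout assumption of mine, not Microsoft's published design):

```latex
% one qubit is ~1/100 mm = 10 micrometers across; a million in a square grid:
\[
\sqrt{10^{6}} \times \frac{1}{100}\,\text{mm}
  = 1000 \times 0.01\,\text{mm}
  = 10\,\text{mm} = 1\,\text{cm per side},
\]
% i.e. roughly a 1 cm x 1 cm die, ignoring wiring and control overhead.
```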
joe @joelo ·
Just an incredible interview. Wow.
Dwarkesh Patel @dwarkesh_sp

.@satyanadella on:
- why he doesn't believe in AGI but does believe in 10% economic growth
- Microsoft's new topological qubit breakthrough and gaming world models
- whether Office commoditizes LLMs or the other way around
Links below. Enjoy!

Timestamps
0:00:00 - Intro
0:05:48 - AI won't be winner-take-all
0:16:02 - World economy growing by 10%
0:22:23 - Decreasing price of intelligence
0:31:03 - Microsoft's Quantum breakthrough
0:43:35 - Microsoft's gaming world model
0:50:35 - Legal barriers to AI
0:56:30 - Getting AGI safety right
1:05:43 - 34 years at Microsoft
1:11:31 - Does Satya Nadella believe in AGI?

0 replies · 0 reposts · 0 likes · 87 views
joe @joelo ·
I don’t understand why the Canadian news outlets keep trying to make me care about Mark Carney or Chrystia Freeland’s thoughts. Mark is not in government.
0 replies · 0 reposts · 0 likes · 28 views
joe retweeted
Andrej Karpathy @karpathy ·
I don't have too too much to add on top of this earlier post on V3 and I think it applies to R1 too (which is the more recent, thinking equivalent).

I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed in AI. You may not always be utilizing it fully but I would never bet against compute as the upper bound for achievable intelligence in the long run. Not just for an individual final training run, but also for the entire innovation / experimentation engine that silently underlies all the algorithmic innovations.

Data has historically been seen as a separate category from compute, but even data is downstream of compute to a large extent - you can spend compute to create data. Tons of it. You've heard this called synthetic data generation, but less obviously, there is a very deep connection (equivalence even) between "synthetic data generation" and "reinforcement learning". In the trial-and-error learning process in RL, the "trial" is the model generating (synthetic) data, which it then learns from based on the "error" (/reward). Conversely, when you generate synthetic data and then rank or filter it in any way, your filter is straight up equivalent to a 0-1 advantage function - congrats, you're doing crappy RL.

Last thought. Not sure if this is obvious. There are two major types of learning, in both children and in deep learning. There is 1) imitation learning (watch and repeat, i.e. pretraining, supervised finetuning), and 2) trial-and-error learning (reinforcement learning). My favorite simple example is AlphaGo - 1) is learning by imitating expert players, 2) is reinforcement learning to win the game. Almost every single shocking result of deep learning, and the source of all *magic*, is always 2. 2 is significantly significantly more powerful. 2 is what surprises you. 2 is when the paddle learns to hit the ball behind the blocks in Breakout. 2 is when AlphaGo beats even Lee Sedol. And 2 is the "aha moment" when DeepSeek (or o1 etc.) discovers that it works well to re-evaluate your assumptions, backtrack, try something else, etc. It's the solving strategies you see this model use in its chain of thought. It's how it goes back and forth thinking to itself. These thoughts are *emergent* (!!!) and this is actually seriously incredible, impressive and new (as in publicly available and documented etc.).

The model could never learn this with 1 (by imitation), because the cognition of the model and the cognition of the human labeler is different. The human would never know to correctly annotate these kinds of solving strategies and what they should even look like. They have to be discovered during reinforcement learning as empirically and statistically useful towards a final outcome.

(Last last thought/reference this time for real is that RL is powerful but RLHF is not. RLHF is not RL. I have a separate rant on that in an earlier tweet x.com/karpathy/statu…)
Andrej Karpathy @karpathy

DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M). For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being brought up today are more around 100K GPUs. E.g. Llama 3 405B used 30.8M GPU-hours, while DeepSeek-V3 looks to be a stronger model at only 2.8M GPU-hours (~11X less compute). If the model also passes vibe checks (e.g. LLM arena rankings are ongoing, my few quick tests went well so far) it will be a highly impressive display of research and engineering under resource constraints. Does this mean you don't need large GPU clusters for frontier LLMs? No but you have to ensure that you're not wasteful with what you have, and this looks like a nice demonstration that there's still a lot to get through with both data and algorithms. Very nice & detailed tech report too, reading through.

364 replies · 2.1K reposts · 14.4K likes · 2.4M views
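Karpathy's filter-equals-advantage point is easy to see in code. The toy sketch below is entirely my own construction: the "model" is just a categorical distribution over answers to an arithmetic question, sampling is the "trial", and the filter acts as a 0/1 advantage deciding which samples feed the update.

```python
import random
from collections import Counter

# Toy "policy": a categorical distribution over candidate answers to 17 + 25.
weights = Counter({"42": 1.0, "32": 1.0, "52": 1.0})

def sample() -> str:
    answers, w = zip(*weights.items())
    return random.choices(answers, weights=w)[0]

def passes_filter(ans: str) -> bool:
    return ans == "42"        # the filter == a 0/1 advantage function

for _ in range(200):
    trial = sample()          # "synthetic data generation" (the trial)
    if passes_filter(trial):  # advantage 1: reinforce; advantage 0: no update
        weights[trial] += 0.1 # crude policy-gradient-style weight bump

print(weights.most_common(1)) # probability mass concentrates on "42"
```

Samples that fail the filter contribute nothing, exactly as if they carried advantage 0; keeping and training on the survivors is rejection sampling, which is why Karpathy calls filtered synthetic data generation "crappy RL".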
joe @joelo ·
@stephenrobles I find these just don't do the job. They get too hot and overheat your phone, which throttles the phone's performance. Need to look for a Qi2 one.
0 replies · 0 reposts · 0 likes · 145 views
Stephen Robles @stephenrobles ·
Yeah, Anker did it on this latest MagSafe battery
60 replies · 25 reposts · 2.5K likes · 394.8K views
LuminaProbiotic @LuminaProbiotic ·
We're shipping! Thank you so much to our investors and customers for being with us on this journey. You can still get Lumina at our preorder price for a limited time at the link in the tweet below
14 replies · 8 reposts · 149 likes · 136.7K views
🍓🍓🍓 @iruletheworldmo ·
who wants early access to o3 mini?
247 replies · 13 reposts · 595 likes · 43.8K views
joe retweeted
François Chollet @fchollet ·
Cost-efficiency will be the overarching measure guiding deployment decisions. How much are you willing to pay to solve X? The world is once again going to run out of GPUs.
19 replies · 51 reposts · 1K likes · 71.9K views
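One way to make Chollet's "how much are you willing to pay to solve X" framing concrete, as a sketch of my own rather than a formula from the post: if each attempt costs c and succeeds independently with probability p, the number of attempts is geometric, so the expected spend per solved task is c / p.

```python
def expected_cost_to_solve(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend per solved task when attempts are independent:
    the number of attempts is geometric with mean 1 / success_rate."""
    assert 0 < success_rate <= 1
    return cost_per_attempt / success_rate

# e.g. $0.50 per attempt at a 25% solve rate -> $2.00 expected per solve
print(expected_cost_to_solve(0.50, 0.25))
```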
joe @joelo ·
@elonmusk Will it come to HW3 vehicles?
0 replies · 0 reposts · 1 like · 5 views