Dominik Lukes

15.8K posts

@techczech

Exploring schemas and propositions about language models of all kinds on https://t.co/GU07uzb7Ud and on https://t.co/UfdxBd7jvK.

UK · Joined April 2009
796 Following · 2K Followers
Pinned Tweet
Dominik Lukes
Dominik Lukes@techczech·
@kepano Civilisation is built on delegating understanding. Doing your own understanding needs to be very strategic. In most situations, it becomes the equivalent of growing your own vegetables. It's enjoyable but does not meaningfully contribute to your nutrition.
English
2
1
22
4.1K
Dominik Lukes
Dominik Lukes@techczech·
@akoustov Now let's do lifetime bans for correct references that clearly support the opposite of what the author claims they do or are about something entirely irrelevant.
English
1
0
4
486
Dominik Lukes
Dominik Lukes@techczech·
@Makuh90 @UIEnthusiasts Windows just makes some system-level things harder. In many ways, it is better than macOS but having a Unix core is just better in the agentic era. Also, the hardware fragmentation does not help.
English
0
0
4
297
Dominik Lukes
Dominik Lukes@techczech·
My quick verdict on @raycast v2 beta: overall improvement, can't notice any speed loss on M4 MacBook Air (and I have loads of extensions) - perhaps a tiny ms delay on ⌘ + K but that could be an illusion.
- cloudsync can't come soon enough
- new settings are great
- new file search a huge improvement
- snippet tagging was long awaited
Well done @thomaspaulmann and team.
English
1
3
52
4K
Dominik Lukes
Dominik Lukes@techczech·
@thomaspaulmann @ps73yk @raycast I don't mind rounded corners but when I want an application window to fill the desktop space (without going full screen) I want it to fill the whole thing, not leave distracting pixels in the top corners.
English
0
0
0
14
Thomas Paul Mann
Thomas Paul Mann@thomaspaulmann·
One app, two platforms, four programming languages. The things that look the simplest are often the hardest to build. @raycast is one of them. Here's a technical deep dive on how we built v2 👉 ray.so/v2-deep-dive
English
34
27
514
72.9K
Thomas Ricouard
Thomas Ricouard@Dimillian·
It’s time. Now you’ve tried our Codex mobile, tell me the top missing feature for your workflow! And yes, we’re aware of the current bugs and shortcomings. We’re working hard on it!
English
287
14
527
35.6K
Thomas Paul Mann
Thomas Paul Mann@thomaspaulmann·
@techczech @raycast Thanks! While the first version of Tahoe had its issues, the latest ones are good. I'm sure you won't regret it, and hopefully we play a major part in helping you feel at home! Let me know how it goes...
English
2
0
3
429
Dominik Lukes
Dominik Lukes@techczech·
I still remember how magical o3 felt. Now, I wouldn't touch it: "Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities."
Logan Graham@logangraham

A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.)

Two independent evaluations this week, from XBOW and the UK AISI, confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities.

The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints: Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap.

XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work.

Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high + critical severity vulnerabilities, sometimes double what they'd normally find in a year.

I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual use capabilities. Clearly, we need them supporting defenders as widely as can be done safely, and especially the least resourced ones.

Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities.

We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands. We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes.

Also, to be clear, compute has never been a limiter in our rollout. Expect a fuller update on our Glasswing work in the coming days.

XBOW report: xbow.com/blog/mythos-of…
UK AISI report: aisi.gov.uk/blog/how-fast-…

English
0
0
0
90
Dominik Lukes
Dominik Lukes@techczech·
The new AI generated code is solving old problems that were previously not economically valuable enough to devote scarce development resources to. The productivity impact of these will be gradual and cumulative. But also what software is will change.
François Chollet@fchollet

The quantity of code that devs ship has roughly 10xed. But net developer productivity (value created by unit of time) is only up by a bit, if at all. Part of it is that the additional code is solving more incremental problems. A bigger part is that the new code is creating problems of its own.

English
1
0
0
83
Dominik Lukes
Dominik Lukes@techczech·
@CCguerilla What I meant is that it's not automatically a reasonable objection. But most importantly, it's often a political objection masquerading as an ethical one. Plus, everyone knows that this is true because we routinely do not accept all ethical objections as reasonable.
English
1
0
0
8
Kane Murdoch
Kane Murdoch@CCguerilla·
@techczech Hmmmm, disagree. In my mind it's the best example of a reasonable objection.
English
1
0
0
35
Dominik Lukes
Dominik Lukes@techczech·
Ethical objection does not mean a reasonable objection.
English
2
0
0
99
ClaudeDevs
ClaudeDevs@ClaudeDevs·
Starting June 15, paid Claude plans can claim a dedicated monthly credit for programmatic usage. The credit covers usage of:
- Claude Agent SDK
- claude -p
- Claude Code GitHub Actions
- Third-party apps built on the Agent SDK
English
1.3K
1K
12.4K
10M
Dominik Lukes
Dominik Lukes@techczech·
Nobody seems to be talking about cognitive decoupling anymore, but I suspect differences in levels of willingness to engage in it are behind some of the biggest disagreements around AI, today.
English
0
0
0
58
Dominik Lukes retweeted
Séb Krier
Séb Krier@sebkrier·
If anyone builds it, everyone thrives.

Over the past decade, a lot of important work on AI alignment has focused on avoiding harm. But freedom from harm isn't the same as freedom to flourish.

In this paper, we introduce 'Positive Alignment'. A positively aligned agent is one that helps us navigate our own value trade-offs, builds our resilience, and acts as a scaffold for human flourishing. Doing this without slipping into top-down, technocratic paternalism is the great design challenge of our time. We think a lot more research is now needed to explore this frontier: how do we align models that actively help us thrive?

Amazing work by @RubenLaukkonen, @drmichaellevin, @weballergy, @verena_rieser, @AdamCElwood, @996roma, @FranklinMatija, @shamilch, @_fernando_rosas, @scychan_brains, @matybohacek, @sudoraohacker, and others.

arxiv.org/abs/2605.10310
English
90
224
1.1K
299.9K