patrick sphinx

310 posts

@SphinxPatrick

Joined September 2015

224 Following, 5 Followers

patrick sphinx retweeted

geniusvczh@geniusvczh·11h

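When GPT writes code, it really likes to swallow exceptions, whereas our goal is to make the program crash as often as possible. Prompting alone can't achieve that; the only way is to set up test cases that detonate every single point once 🤪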
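As a minimal sketch of the point above (the function names and the port-parsing scenario are hypothetical, chosen only for illustration): one version swallows the exception and hides the bug behind a default, the other fails fast, and a test case deliberately triggers each failure point.

```python
def parse_port_swallowing(raw: str) -> int:
    """Anti-pattern: the exception is swallowed and a default hides the bug."""
    try:
        return int(raw)
    except ValueError:
        return 0  # the caller never learns the input was bad


def parse_port_strict(raw: str) -> int:
    """Fail-fast version: bad input raises instead of being papered over."""
    port = int(raw)  # ValueError propagates to the caller
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def raises(exc_type, fn, *args) -> bool:
    """Return True only if fn(*args) raises exc_type."""
    try:
        fn(*args)
    except exc_type:
        return True
    return False


# Test cases that "detonate" each failure point once.
assert raises(ValueError, parse_port_strict, "abc")    # not a number
assert raises(ValueError, parse_port_strict, "70000")  # out of range
assert parse_port_strict("8080") == 8080               # happy path still works
```

The swallowing variant would pass silently on `"abc"`, which is exactly why a test has to assert that the failure actually happens.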

1.7K

patrick sphinx retweeted

Yihao Sun@StarGazerMiao·21h

After 6 years, I am finally able to end the last chapter of the last paper of my PhD, confidently, with a title in the same style as the first paper I read on the first day of my PhD.

2.1K

patrick sphinx retweeted

Maxwell Brown@imax153·3d

We on the @EffectTS_ team often recommend cloning the Effect repo into your project so your agent can explore the source directly. I finally wrote up why it works and how to set it up: effect.website/blog/the-one-w…

398

29.2K

patrick sphinx retweeted

LemonHX@lemonhxmoe·5d

chiba-lang.org/blog/chiba5 Take a look at how I do PL work in the AI era with the most modern techniques:
- SubAgent
- Context Engineering
- Harness
Anyway, it has everything!

4.8K

patrick sphinx retweeted

Taelin@VictorTaelin·6d

To whom it may concern
NanoProof.hs: the smallest viable proof checker
I posted something similar before, but it was more of a research experiment with weird λ-encoded shit than something usable. This new repo contains a tiny, 1000-LOC, self-contained Haskell proof checker that you can actually use to prove arbitrary theorems.
The language has just 6 base types:
→ Empty (`⊥`): type with 0 elems
→ Unit (`⊤`): type with 1 elem (`()`)
→ Bool (`𝔹`): type with 2 elems (`0 | 1`)
→ Sigma (`ΣA.B`): dependent pairs (`(x,y)`)
→ Pi (`ΠA.B`): dependent functions (`λx.f`)
→ Equal (`a==b`): propositional equality (`{==}`)
That's all you need. Each of these is needed, as it introduces something fundamental.
The file includes a parser, stringifier, equality, a bidirectional type checker, and a simple CLI. It also includes first-class reduction relations, which allow us to pretty print goals just like Lean. You can place '()' in a position to inspect the current context and goal there. I also include a demo proof for the commutation of multiplication.

279

17.2K

patrick sphinx retweeted

Talia Ringer 🕊🪬@TaliaRinger·8 May

Our synthetic Euclidean Geometry proof assistant is now open for public contributions! We created this together as a class in my Build Your Own Proof Assistant course this semester. Please give it a spin and consider contributing! github.com/nicegeo/nicegeo

5.1K

patrick sphinx retweeted

Suwako — e/acc@suwakopro·7 May

The pass rates of the various LLMs on the ast-grep tests in ProgramBench are really low @hd_nvim

3.2K

patrick sphinx retweeted

igor@konnov.phd | (spec|ver)ification | security@k0nn0v·6 May

11. Last but not least, George Pîrlea's @GeorgePirlea talk on Veil: Multi-Modal Verification of Transition Systems ...and this is done with Lean @leanprover ! youtube.com/watch?v=24mMfU…

patrick sphinx retweeted

Kiran@kirancodes·6 May

5 lines of python. an economic game with complex equilibria.
Our new language Pact uses Choreographies with game theory, allows expressing economic transactions in lines. So simple an agent could write it.
Claude? make me some money. and make no mistakes
arxiv.org/abs/2605.03143

5.5K

patrick sphinx@SphinxPatrick·6 May

@wong_ssh You could take a look at what this company is working on: midspiral.com/blog/from-inte…

WongSSH@wong_ssh·5 May

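Parts of the content draw on Chapter 6 of The Calculus of Computation, "Program Correctness: Strategies". So far, the core of writing formal proofs in Dafny seems to be copying the known conditions into the loop invariant, since the basic path method it uses discards a lot of context. The good news is that, so far, LLMs are better at this than a beginner like me and can write out all of the core properties.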
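The idea of restating known conditions in a loop invariant can be sketched in Python with runtime assertions. This is not Dafny and not the author's proof; the helper `suffix_ok` and the choice of bubble sort are assumptions made for illustration. The facts established by the outer loop are repeated inside the inner loop, mirroring how a verifier that reasons path by path would otherwise lose them.

```python
def is_sorted(a, lo, hi):
    """True if a[lo:hi] is in non-decreasing order."""
    return all(a[k] <= a[k + 1] for k in range(lo, hi - 1))


def suffix_ok(a, i):
    """Outer-loop facts: the last i elements are sorted and dominate the prefix."""
    n = len(a)
    return is_sorted(a, n - i, n) and all(
        a[p] <= a[q] for p in range(n - i) for q in range(n - i, n)
    )


def bubble_sort(xs):
    a = list(xs)
    n = len(a)
    for i in range(n):
        assert suffix_ok(a, i)  # outer invariant
        for j in range(n - i - 1):
            # Inner invariant: the outer facts are restated here, plus the
            # inner-loop fact that a[j] is the maximum of a[0..j].
            assert suffix_ok(a, i)
            assert all(a[k] <= a[j] for k in range(j + 1))
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    assert is_sorted(a, 0, n)  # postcondition
    return a
```

In a verifier the assertions would be `invariant` clauses checked statically; here they are only checked at run time on the inputs actually tried.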

589

WongSSH@wong_ssh·5 May

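Recently I proved several classic array sorting algorithms by hand in Dafny, in increasing order of difficulty: bubble sort, quicksort, and merge sort. Quicksort was probably the hardest to prove. While writing the code, I wrote a blog post. Since I have little hands-on experience so far, the post is not very systematic. Besides the algorithms, the post ends with a proof of a smart contract vulnerability, and I should add another case tomorrow.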

3.9K

patrick sphinx@SphinxPatrick·6 May

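@wong_ssh Having developers hand-write languages like Dafny feels too tedious; you basically have to keep in your head, at all times, which conditions must hold at each point. I wonder whether the proofs can still keep up once the program gets large.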

patrick sphinx@SphinxPatrick·6 May

@wong_ssh Some formal methods people are now working on using LLMs to generate proofs.

patrick sphinx retweeted

Marcel Böhme👨‍🔬@mboehme_·27 Apr

🤵 Keynote #2 at #FUZZING'26 is online Where the Fuzz Are We Going? by Sergej Dechand (@CI_Fuzz). youtube.com/watch?v=yp-AKW…

8.8K

patrick sphinx retweeted

Apart Research@apartresearch·2 May

Call for mentors: SPS Fellowship (June-Oct 2026, with @safewithatlas).
Already in:
Erik Meijer (@headinthebox), Leibniz Labs (creator of LINQ + Rx)
Shriram Krishnamurthi (@ShriramKMurthi), Brown CS
Senior formal methods + AI safety researchers, apply by Tue May 5 AoE: linktr.ee/apartresearch

7.6K

patrick sphinx retweeted

Kiran@kirancodes·1 May

Did a survey of all LLM-based VeriCoding benchmarks. Seems like everyone's focusing on single-file programs. Have you ever seen a REAL verified system? A file system? An OS? The specs for every function are HUGE. It looks nothing like your fibonacci leetcode spec. We're cooked.

1.1K

patrick sphinx retweeted

Erik Meijer@headinthebox·30 Apr

namin.seas.harvard.edu made a really cool implementation of "Guardian of the Agents" cacm.acm.org/practice/guard… Check it out here: github.com/metareflection….

4.8K

patrick sphinx retweeted

Kiran@kirancodes·29 Apr

Little shoutout to my classic blog post on this very question: kirancodes.me/posts/log-sins…

[email protected]@Zardus

Can we translate all C to Rust? The susceptibility of C to memory corruption has long been a cybersecurity pain point, and coding agents can free us of it. Read on for my recent experiments in this space, and apt & docker repos that you can pull rust-converted libraries from!

27.6K

patrick sphinx retweeted

Devdatta Akhawe@frgx·25 Apr

Everyone’s talking about AI-powered attackers finding software vulnerabilities at scale. Hot take: that’s not the risk I’d prioritize first.

1.9K

patrick sphinx retweeted

Marcel Böhme👨‍🔬@mboehme_·22 Apr

From an economic perspective, once we are back to equilibrium, bugs in critical software will be just as difficult to find as they were before AI agents (and before fuzzing). More details: arxiv.org/abs/2402.01944… (Security as a function of incentive)

s1r1us (mohan)@S1r1u5_

from firefox blogpost where mythos found 270 new bugs:
> The defects are finite, and we are entering a world where we can finally find them all
it's like lord kelvin saying "there is nothing new to be discovered in physics now".
can't tell if firefox has some incentives at play or is just naivete
fascinating example here on what i mean x.com/5aelo/status/2…, saelo wrote a fuzzer with a few files and found crazy bugs. he pulled it off because he already knows the target deeply (he designed ubercage?) and knows how to shape the fuzzer toward the interesting surface.
i still think, operators like saelo + mythos set the ceiling of the bugs that can be found, even then its not all bugs, the next version after mythos would move up, but mythos in a loop on its own sits below the ceiling
you only want the software to be secure from smartest adversary in the world, its not all bugs, cuz rice theorem and stuff means you are not getting there anyway.
sure, for fixed code base like basic web app, the set might be finite and you can exhaust them all, but i cant convince myself that software like firefox has finite set of bugs and you can exhaust em all.
if mythos isn't agi and is still jagged, the narrative that mythos alone is the smartest adversary and will find all "finite" bugs is exactly what a frontier model company would sell untested.
and bro even "our team + mythos will find them all" is a crazy narrative too, it assumes your team has the smartest humans in the world, and that nso or some north korean team won't be pwning you with the same setup at the top of the ceiling
BUT ALSO, mythos alone is probably smarter than 99.9% of humans (vibes-based), and 100s of them running behind api keys is really bad, because most things you’d want to breach don’t need saelo+mythos ceiling bugs to get into.
so we cooked?

102

27.3K

patrick sphinx retweeted

Adam Mainz@MainzOnX·15 Apr

x.com/i/article/2044…

102

761

107.1K
