jon becker

3.5K posts

@beckerrjon

any sufficiently advanced technology is indistinguishable from magic. senior software engineer @coinbase

New York, NY · Joined December 2019
818 Following · 5.1K Followers
Pinned Tweet
jon becker
jon becker@beckerrjon·
0/ i analyzed every single trade on kalshi from 2021 to 2025. i found a systematic wealth transfer where "takers" pay a massive premium for affirmative outcomes, and "makers" harvest the edge without needing to predict the future. here is the data
10
3
90
22.5K
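
A minimal sketch of how a taker/maker edge like the one described in the pinned thread could be measured, assuming each trade record carries the price the taker paid, the side the taker bought, and the market's final outcome. The `Trade` fields and the sample values are hypothetical illustrations, not the author's dataset schema or method, and fees are ignored.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    price_cents: int    # price the taker paid per contract for the side bought (1-99)
    taker_side: str     # "yes" or "no": the side the taker bought
    resolved_yes: bool  # final market outcome
    contracts: int = 1


def taker_pnl_cents(t: Trade) -> int:
    """PnL in cents for the taker; a winning contract pays out 100 cents."""
    won = t.resolved_yes if t.taker_side == "yes" else not t.resolved_yes
    payout = 100 if won else 0
    return (payout - t.price_cents) * t.contracts


def edge_summary(trades):
    """Return (taker_pnl, maker_pnl, taker_return_on_cost)."""
    taker = sum(taker_pnl_cents(t) for t in trades)
    cost = sum(t.price_cents * t.contracts for t in trades)
    # Ignoring fees, the maker is the counterparty, so the PnL is zero-sum.
    return taker, -taker, (taker / cost if cost else 0.0)


# Hypothetical sample: two "yes" buys at 60 cents, one "no" buy at 30 cents.
sample = [
    Trade(60, "yes", False, 10),
    Trade(60, "yes", True, 10),
    Trade(30, "no", True, 5),
]
```

Calling `edge_summary(sample)` returns the taker's total PnL, the maker's mirror-image PnL, and the taker's return on cost. Grouping the same computation by price bucket or by side would be one way to check whether the premium concentrates on affirmative outcomes.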
WhiteHatMage
WhiteHatMage@WhiteHatMage·
Here are some thoughts after spending many long sessions reading bytecode, decompiled Yul, and decompiled Solidity:

- EVM programs are simple, and so is the generated bytecode. Security by obscurity doesn't really work.
- Current decompilers work quite well. I'd pick Heimdall for Solidity and sevm for Yul.
- Decompilers aren't perfect, though. I also ran into bugs that produced incorrect outputs.
- Reading decompiled code or raw bytecode takes far more effort than high-level source code, and it gets exhausting quickly.
- There are many unnecessary checks and conversions that could be stripped out to make the logic clearer when hunting for business logic bugs.

---

- Most serious projects verify their contracts. Still, I believe checking the deployed bytecode is worth the effort for contracts holding really big bags.
- Any bugs in verified contracts would most likely only come from compiler issues.
- Compilers keep evolving, and newer versions may fix previously unknown bugs. However, any vulnerable bytecode that's already deployed on the blockchain stays exactly the same.
- For older contracts, I'd cross-check their deployed bytecode against the verified source code.

---

- There are still plenty of unverified contracts out there.
- Some publish their code on GitHub. Others choose not to, like certain CEX-related contracts.
- The rest tend to be on small side-chains or from smaller projects. Most of them don't offer any bug bounties.

---

- Detecting flawed access control is trivial once you decompile the bytecode.
- I believe you could build a robust static analyzer on top of the decompiled code without much effort -- or even an AI-powered one.
- There are no strong incentives for good actors to do so, though. Projects with bounties mostly have verified code. Only blackhats would be motivated to build such tools.
- Building something like this could be a good candidate for a grant to secure a chain, although operating it might be complicated.

---

- Vyper produces much cleaner bytecode than Solidity.

---

Overall, I learned some tricks even though it wasn't the first time I've analyzed decompiled code, and I gained a deeper understanding of where certain specific bugs might appear. I'd recommend it to everyone interested in understanding EVM programs better. I'd also advise developers working on projects with millions at stake to do a manual review of their old deployed codebases. There's always more than meets the eye when checking the actual bytecode.
WhiteHatMage@WhiteHatMage

I'll take a week to perform an interesting and probably stupid experiment: Hunting for live EVM bugs by checking the deployed bytecode. I'm allowing myself to cheat a little bit by checking the verified code to quickly understand what's going on. I'll also use a Yul decompiler for complex contracts and try a disassembler for simpler ones. There are critical contracts out there holding really big bags that are worth the effort. My main goal though is just to understand what's going on under the hood, and maybe get some inspiration for any potential unknown vectors. Also for understanding what's needed to get a clean input for any automated tools to perform further analysis. I don't expect to find any bugs honestly. It will be painful, but fun at the same time. I just love having the freedom to navigate any crazy paths I choose 🧙‍♂️

7
6
94
9.3K
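
The cross-check of deployed bytecode against verified source mentioned in the post above can be sketched roughly as follows. This assumes both inputs are runtime bytecode produced by solc with its default CBOR metadata trailer, whose final two bytes give the trailer's big-endian length. The function names are hypothetical. The sketch does not handle immutables, linked library addresses, or constructor arguments, and it would mis-strip bytecode built without metadata, so a mismatch is a prompt for manual review rather than proof of a problem.

```python
def strip_solc_metadata(runtime: bytes) -> bytes:
    """Drop the trailing CBOR metadata that solc appends to runtime bytecode.

    The final two bytes hold the big-endian length of the CBOR blob before them.
    """
    if len(runtime) < 2:
        return runtime
    meta_len = int.from_bytes(runtime[-2:], "big")
    if meta_len + 2 > len(runtime):
        return runtime  # implausible length field; leave the code untouched
    return runtime[: -(meta_len + 2)]


def same_logic(deployed_hex: str, compiled_hex: str) -> bool:
    """Compare two hex-encoded runtime bytecodes, ignoring the metadata trailer."""
    def to_code(h: str) -> bytes:
        return strip_solc_metadata(bytes.fromhex(h.removeprefix("0x")))

    return to_code(deployed_hex) == to_code(compiled_hex)


# Hypothetical inputs: identical code, different 4-byte metadata blobs.
deployed = "0x" + "6001600055" + "a2646970" + "0004"
rebuilt = "6001600055" + "a2646971" + "0004"
print(same_logic(deployed, rebuilt))
```

In practice the deployed side would come from an `eth_getCode` call against a node, and the rebuilt side from compiling the verified source with the same compiler version and settings.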
Martin
Martin@martkiro·
I just published a data dump of full order book data from @Polymarket.

The data is maximally granular. There is no filtering whatsoever. Every order book change and trade is saved, across all markets.

Updates are hourly. Each snapshot contains ~30M rows. Snapshots are downloaded as parquet files. Each file is approx. 500MB-1GB.

The data dump is already 2B+ rows and growing fast. But this is just part 1/3. Coming soon is a much bigger dump that also includes @Kalshi / @opinionlabsxyz / @trylimitless etc.

I started collecting this data because I noticed I couldn't get it from Dome API. Their historical order book data was filtered, limiting its usefulness. Also, now with the acquisition, there's a lot of uncertainty about whether they will continue operating.
109
105
1.3K
160.4K
jon becker
jon becker@beckerrjon·
added polymarket data to the public dataset. 400m+ trades going back to 2020. 36gb compressed. MIT licensed, free to download via @Cloudflare R2.
128
241
4K
746.9K
Alex
Alex@adf_energy_twt·
@beckerrjon @Cloudflare yeah so I see you use polygon-rpc but that's got a fairly strict rate limit too. Did you use a dedicated RPC provider?
1
1
1
258
i love models
i love models@_ilovemodels·
@beckerrjon @Cloudflare Cooking smthng so that anyone can query and analyze the data in natural language. Will open source it tmrw!
2
0
8
607
jon becker
jon becker@beckerrjon·
@aiden0x4 @Cloudflare gemini is claiming 39 cents for the month but that surely can’t be right. im well under free tier limits right now according to the dash
1
0
1
289
aiden
aiden@aiden0x4·
@beckerrjon @Cloudflare 🙏 lmk! i've been thinking of open sourcing large datasets of labels but didn't find a good (economical) way
1
0
1
278
jon becker
jon becker@beckerrjon·
@aiden0x4 @Cloudflare we’re gonna find out when the r2 bill hits. napkin math says not much (i’m praying)
1
0
2
1.4K
jon becker
jon becker@beckerrjon·
@jgwtt too large for LFS, had to host in r2
0
0
0
2.2K
i love models
i love models@_ilovemodels·
@beckerrjon @Cloudflare For anyone trying to download the dataset, make sure u have aria2c installed or its gonna take forever.
1
2
36
5.2K
johndoe
johndoe@crymore_johndoe·
@beckerrjon Amazing share thank you!!! Does the data have order book feed so one could construct order book, trade ticks etc?
1
0
1
241
The Workshop
The Workshop@ForgeOfAgents·
@beckerrjon So, no full orderbooks for Polymarket? There is room for growth
1
0
2
2.1K
Ed
Ed@Jacoed·
@beckerrjon Is the polymarket scraper included ?
1
0
2
2.5K
Brian
Brian@BrianXBT·
@beckerrjon what happened that one week
1
0
1
348
oaktoebark
oaktoebark@oaktoebark·
@beckerrjon is it as easy as bet no on everything and you’ll be rich?
1
0
1
412