

Light

@lightdxj
content writer & learning graphic design








ritual research digest is back with a new newsletter covering the latest papers on LLMs and crypto x ai. this week they released 4 new papers, and i will try to explain each one in this post.

1/ ResearchGym: evaluating language model agents on real-world ai research

this paper introduces ResearchGym, a benchmark designed to test whether ai agents can handle real-world research tasks from start to finish. instead of simplified problems, it uses five papers (from ICML, ICLR, and ACL). the ai must understand the problem, implement solutions, and run experiments, similar to what human researchers do.

the results show mixed performance: a gpt-5 agent beats the baseline only 1 out of 15 times and completes just 26.5% of tasks on average. sometimes it even outperforms the original solution, but overall, ai research ability is still inconsistent.

2/ GLM-5: from vibe coding to agentic engineering

this paper introduces GLM-5, a new ai model built to do more than just write code: it can handle longer, real-world tasks like an agent. it uses a special method called deepseek sparse attention, which helps the model handle long context while keeping costs lower.

even though the model is extremely large, it uses a smart setup where only a small part of it is active at a time. this keeps performance high without needing full power all the time. glm-5 ranks as the top open-source model, showing strong ability to manage ongoing and practical work.

3/ large-scale online deanonymization with LLMs

this paper shows that ai models can figure out who anonymous users are just by analyzing how they write online. in the past, this kind of identification needed structured data like usernames, profiles, or metadata. now LLMs can do it using only natural language, like posts or comments across different platforms.

they tested this in two ways: one where the ai could search the web freely, and another where it followed a step-by-step process to extract clues and match writing styles. even without personal details, writing patterns alone can reveal identity, making online anonymity less secure than we once believed. scary tbh

4/ Hybrid-Gym: training coding agents to generalize across tasks

this paper explains that coding ai should learn more than just fixing bugs from GitHub issues. real developers explore code, understand systems, test software, and design how things work. so the researchers built Hybrid-Gym, a training setup where ai practices these skills using simulated tasks like locating functions in large codebases.

when they trained a coding model in this environment, it performed much better on real-world coding tests. this shows that practicing broader, realistic tasks helps ai generalize and become more useful for actual software engineering work.

that's it for this week's ritual research digest, thank you for your attention to this matter xD

gRitual ❖





New update: The Path of Recognition returns. I can also nominate one of you. Comment your best content ever below 💙Like 🔁RT






Polymarket has created a market that would monetize a nuclear attack amid increasing concerns that bets are happening among government insiders who can make military decisions.




day 5 FOKKKK I WANT TO SEND MESSAGE SO MUCH


Now, account abstraction.

We have been talking about account abstraction ever since early 2016, see the original EIP-86: github.com/ethereum/EIPs/…

Now, we finally have EIP-8141 ( eips.ethereum.org/EIPS/eip-8141 ), an omnibus that wraps up and solves every remaining problem that AA was intended to address (plus more).

Let's talk again about what it does.

The concept, "Frame Transactions", is about as simple as you can get while still being highly general purpose. A transaction is N calls, which can read each other's calldata, and which have the ability to authorize a sender and authorize a gas payer. At the protocol layer, *that's it*.

Now, let's see how to use it.

First, a "normal transaction from a normal account" (eg. a multisig, or an account with changeable keys, or with a quantum-resistant signature scheme). This would have two frames:

* Validation (check the signature, and return using the ACCEPT opcode with flags set to signal approval of sender and of gas payment)
* Execution

You could have multiple execution frames, atomic operations (eg. approve then spend) become trivial now.

If the account does not exist yet, then you prepend another frame, "Deployment", which calls a proxy to create the contract (EIP-7997 ethereum-magicians.org/t/eip-7997-det… is good for this, as it would also let the contract address reliably be consistent across chains).

Now, suppose you want to pay gas in RAI. You use a paymaster contract, which is a special-purpose onchain DEX that provides the ETH in real time. The tx frames are:

* Deployment [if needed]
* Validation (ACCEPT approves sender only, not gas payment)
* Paymaster validation (paymaster checks that the immediate next op sends enough RAI to the paymaster and that the final op exists)
* Send RAI to the paymaster
* Execution [can be multiple]
* Paymaster refunds unused RAI, and converts to ETH

Basically the same thing that is done in existing sponsored transactions mechanisms, but with no intermediaries required (!!!!).
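To make the two frame layouts above concrete, here is a minimal Python sketch. It is only an illustration, not the EIP-8141 encoding: the `Frame` and `Approval` types, the frame labels, and the helper names are all hypothetical, and the sketch assumes that the paymaster validation frame is the one that approves gas payment (the post implies this but does not spell it out).

```python
from dataclasses import dataclass
from enum import Flag, auto


class Approval(Flag):
    """Hypothetical stand-in for the flags a frame passes to ACCEPT."""
    NONE = 0
    SENDER = auto()
    GAS_PAYMENT = auto()


@dataclass(frozen=True)
class Frame:
    label: str                          # illustrative name, e.g. "validation"
    approves: Approval = Approval.NONE  # what this frame approves via ACCEPT


def normal_tx(deploy: bool = False, executions: int = 1) -> list[Frame]:
    """Frames for a normal transaction: [deployment], validation, execution(s)."""
    frames = [Frame("deployment")] if deploy else []
    frames.append(Frame("validation", Approval.SENDER | Approval.GAS_PAYMENT))
    frames += [Frame("execution") for _ in range(executions)]
    return frames


def paymaster_tx(deploy: bool = False, executions: int = 1) -> list[Frame]:
    """Frames for paying gas in RAI through a paymaster, in the order listed."""
    frames = [Frame("deployment")] if deploy else []
    frames.append(Frame("validation", Approval.SENDER))  # sender only
    # Assumption: the paymaster's validation frame approves gas payment.
    frames.append(Frame("paymaster_validation", Approval.GAS_PAYMENT))
    frames.append(Frame("send_rai_to_paymaster"))
    frames += [Frame("execution") for _ in range(executions)]
    frames.append(Frame("paymaster_refund"))
    return frames


def approvals(frames: list[Frame]) -> Approval:
    """Union of everything approved across all frames of a transaction."""
    total = Approval.NONE
    for frame in frames:
        total |= frame.approves
    return total
```

In both layouts the sender and the gas payment end up approved. The difference is only which frame grants each approval, which is what lets a paymaster take over gas payment without an intermediary.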
Intermediary minimization is a core principle of non-ugly cypherpunk ethereum: maximize what you can do even if all the world's infrastructure except the ethereum chain itself goes down.

Now, privacy protocols. Two strategies here. First, we can have a paymaster contract, which checks for a valid ZK-SNARK and pays for gas if it sees one. Second, we could add 2D nonces (see docs.erc4337.io/core-standards… ), which allow an individual account to function as a privacy protocol, and receive txs in parallel from many users.

Basically, the mechanism is extremely flexible, and solves for all the use cases. But is it safe?

At the onchain level, yes, obviously so: a tx is only valid to include if it contains a validation frame that returns ACCEPT with the flag to pay gas. The more challenging question is at the mempool level.

If a tx contains a first frame which calls into 10000 accounts and rejects if any of them have different values, this cannot be broadcasted safely. But all of the examples above can. There is a similar notion here to "standard transactions" in bitcoin, where the chain itself only enforces a very limited set of rules, but there are more rules at the mempool layer.

There are specific rulesets (eg. "validation frame must come before execution frames, and cannot call out to outside contracts") that are known to be safe, but are limited. For paymasters, there has been deep thought about a staking mechanism to limit DoS attacks in a very general-purpose way. Realistically, when 8141 is rolled out, the mempool rules will be very conservative, and there will be a second optional more aggressive mempool. The former will expand over time.

For privacy protocol users, this means that we can completely remove "public broadcasters" that are the source of massive UX pain in railgun/PP/TC, and replace them with a general-purpose public mempool.

For quantum-resistant signatures, we also have to solve one more problem: efficiency.
Here are posts about the ideas we have for that:

firefly.social/post/lens/1gfe…
firefly.social/post/x/2027405…

AA is also highly complementary with FOCIL: FOCIL ensures rapid inclusion guarantees for transactions, and AA ensures that all of the more complex operations people want to make actually can be made directly as first-class transactions.

Another interesting topic is EOA compatibility in 8141. This is being discussed, in principle it is possible, so all accounts incl existing ones can be put into the same framework and gain the ability to do batch operations, transaction sponsorship, etc, all as first-class transactions that fully benefit from FOCIL.

Finally, after over a decade of research and refinement of these techniques, this all looks possible to make happen within a year (Hegota fork).

firefly.social/post/bsky/qmaj…
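The safety discussion in the post above separates an onchain rule (some validation frame must return ACCEPT with the pay-gas flag) from stricter mempool rules (the quoted example being "validation frame must come before execution frames, and cannot call out to outside contracts"). A rough Python sketch of that split follows. Everything in it is hypothetical: the `Frame` fields and function names are made up for illustration, and real mempool rulesets for 8141 may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    kind: str                          # "validation" or "execution" (illustrative)
    accepts_gas_payment: bool = False  # frame returns ACCEPT with the pay-gas flag
    calls_outside: bool = False        # frame calls out to outside contracts


def valid_onchain(frames):
    """Onchain rule: some validation frame must ACCEPT gas payment."""
    return any(f.kind == "validation" and f.accepts_gas_payment for f in frames)


def passes_conservative_mempool(frames):
    """Example conservative ruleset layered on top of the onchain rule.

    Rejects a tx if any validation frame appears after an execution frame
    or calls out to outside contracts.
    """
    seen_execution = False
    for frame in frames:
        if frame.kind == "execution":
            seen_execution = True
        elif frame.kind == "validation":
            if seen_execution or frame.calls_outside:
                return False
    return valid_onchain(frames)


# A simple account tx passes both checks.
simple = [Frame("validation", accepts_gas_payment=True), Frame("execution")]

# A tx whose validation reads many outside accounts is still valid onchain,
# but this conservative mempool policy would refuse to broadcast it.
wide = [
    Frame("validation", accepts_gas_payment=True, calls_outside=True),
    Frame("execution"),
]
```

The point of the split is that `wide` can still be included in a block by someone willing to do so, while the public mempool only propagates the shapes it knows are cheap to validate.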


my own problem is I bought that xoob nft at 0.065 ETH who send me😭



