Evocati

3.6K posts

@Evocati_

web3, nfts, memes. Follow everyone back.

Joined October 2023
1.2K Following · 86 Followers
Evocati retweeted
Sergio Servantez @SergioServantez
Our paper on diagnosing legal reasoning capabilities in language models has been accepted at ACL 2026 Findings 🎉. So excited to share more about this work in San Diego. Our benchmark contains some of the most complex legal reasoning tasks available to the public. And we take on some fundamental challenges to legal evaluation:

Scaling. Traditional benchmarks rely on direct expert annotation (1 annotation → 1 solution), which limits size and diversity. OpenExempt instead encodes legal rules into a machine-computable form, allowing us to generate a large space of legal reasoning tasks and dynamically compute their solutions.

Data Leakage. Static datasets quickly lose value once models train on them. Because OpenExempt generates novel tasks on demand, it enables evaluation on entirely unseen problems, even after release.

Diagnostic Evaluation. A model's failure on a static task provides only a single, opaque signal of error. By allowing users to precisely control task complexity, structure and scope, we can isolate specific reasoning skills and diagnose exactly where models succeed or fail.

Read the paper: arxiv.org/abs/2601.13183
0
1
2
98
Evocati retweeted
MWX @mwx_ai
🌍MWX | 1st Decentralized AI Marketplace providing plug & play AI solutions for 400M+ SMEs
🔹3K+ active SMEs
🔹1,500+ AI purchases
🔹Supported by 🇮🇩 Ministry of SMEs & Industry, Google, AWS
Join MWX loyalty program with $100k worth of rewards👇 community.mwxtoken.ai
9.6K
5.9K
10.4K
383.9K
Evocati retweeted
Money Badgers @MoneyBadgersX
MoneyBadgers loading. ████████▒▒ 80%
215
471
1.9K
24.7K
Evocati retweeted
Sergio Servantez @SergioServantez
Today we are releasing OpenExempt, a new framework and benchmark for diagnostic evaluation of legal reasoning in language models.

OpenExempt rethinks evaluation by shifting control to the benchmark user. Moving beyond static datasets, the OpenExempt Framework generates complex legal reasoning tasks on demand, where each scenario is dynamically shaped by a user-defined configuration to target their evaluation goals.

OpenExempt computes gold solutions for each task using expert-crafted symbolic representations of relevant U.S. federal and state statutes, an approach inspired by legal DSLs such as Catala and Accord.

Using this framework, we construct the OpenExempt Benchmark, a diagnostic benchmark with 9,765 samples across nine evaluation suites, designed to carefully probe model capabilities through controlled task variation.

OpenExempt was developed by an interdisciplinary team with my coauthors: Sarah Lawsky (@sarahlawsky), Rajiv Jain, Daniel W. Linna Jr. (@DanLinna), and Kristian Hammond (@KJ_Hammond).

We release OpenExempt to the public under a permissive license (CC BY 4.0) to support further research and encourage collaboration between the legal and NLP communities.

• Evaluation: If you are evaluating language model reasoning internally, easily incorporate the OpenExempt Benchmark to complement these efforts.
• Customization: If the standard benchmark doesn’t fit your evaluation goals, the OpenExempt Framework lets you construct a custom benchmark that does.
• Collaboration: We are interested in research collaborations focused on substantial extensions to this work. If that’s you, reach out and let’s discuss.

Read the paper: arxiv.org/abs/2601.13183
Run the code: github.com/servantez/Open…
Access the benchmark: huggingface.co/datasets/Sergi…

This work was supported by the Center for Advancing Safety of Machine Intelligence (CASMI) at Northwestern University.
0
2
6
596
Evocati retweeted
GALLAXIA @Gallaxia
Welcome to Gallaxia The first gaming & entertainment studio built on its own blockchain. Co-owned by global icons with millions of fans. 200M+ followers built in. 30B views collectively. The biggest conglomerate of creators ever. #TurnYellow 🟡 rewards.gallaxiachain.com
4.9K
3.3K
5.5K
253.9K
Evocati retweeted
Intuition 👁️ @0xIntuition
👁️ Part II of the Expansion is now live. 🔗 medium.com/0xintuition/th… Season 1 ended on November 5 with the launch of Mainnet. Season 2 now begins. This phase focuses on participation, exploration, and strengthening the trust layer of the internet.
Intuition 👁️ @0xIntuition

👁️ The Circle Widens. Intuition enters a new phase — 33 NFT communities are being invited to join the network of trust. If you hold one of the aligned NFTs or have received a one-time referral code, the gates await you. Begin your ascension → discord.gg/0xIntuition

53
616
2.1K
28K
Evocati retweeted
NexaByte @NexaByteAI
Amid endless data, a spark of genuine insight will set the next tech wave in motion
700
488
710
25.9K
Evocati retweeted
Yay! Global @Yay_Global
The Yay! Prime Pass officially mints Nov 21st at 8AM EST / 1PM UTC / 10PM JST! Yay! has over 10M users and raised $22M, bridging Web2 and Web3 in Japan and the world. Prime Pass is your key to the expanding Yay! Ecosystem. Who’s ready?
170
300
775
57.5K
Evocati retweeted
MocaClub🪬 @MocaClub
MocaFam with @thedaks_png! The Daks is the largest molandak native NFT project on Monad, supported by the community. With the power of community support and Molandak’s goofiness, we will adapt Web2 into Web3.
• Chain: Monad
Giveaway is live in MocaClub DC for Mocaverse holders ❤️
6
14
46
765
Evocati @Evocati_
Leveled up in the Great Gas Reckoning with ETHGas! 💪 Hero Jack status: 5.3636 ETH gas spent, 3500 Beans earned—supporting the Gasless Future! Claim your Gas ID at ethgas.com/community/gas-…
0
0
0
16
Evocati retweeted
Providence @PlayProvidence
Providence Hub is LIVE! Finish quests, climb the Leaderboard, claim your loot. hub.playprovidence.io
824
1.6K
3.9K
545.9K
Evocati retweeted
Youmio @youmio_ai
MINT YOUR TESTNET BADGE NOW 🏅 Many holders have claimed theirs in the testnet already & now we have opened the doors to even more of our community 🤝 How to know if you are allowlisted? 👇
66
133
609
51.6K
Evocati retweeted
MocaClub🪬 @MocaClub
MocaFam with @JitocabalNFT We're excited to announce that @JitocabalNFT has allocated whitelist spots for MocaFam! The official cabal NFT built by the Jito community for the Jito community. Giveaway is live in MocaClub DC for Mocaverse holders ❤️
2
18
60
1.1K
Evocati retweeted
Pact Swap Labs @Pact_Swap
Our PACT is strong! Are you part of our eligible communities? Then we have something special just for you. 🔄 Submit your wallet on Discord 👀 discord.gg/pactswap
520
1.9K
4.2K
159.4K
Evocati @Evocati_
I just claimed my @CyberKongz $KONG airdrop. I am now retired.
0
0
0
39
Boost @boostdotgg
$BOOST is the economic engine behind an ecosystem that has already powered over $1.6 billion in rewards. It serves as the connective tissue linking platforms, creators, brands, campaigns, and millions of users into a single flow of rewards and loyalty. Tokenomics Breakdown below ⬇️
441
135
893
213.8K
Evocati retweeted
NEOGODS @neogodsNFT
Free mint on @base
4444 NFTs
200+ traits
Long-term vision
Join our community!⚡️
155
133
653
25K
Evocati retweeted
ABP.eth ✊ Ancient Black Protector Mega Koda
I own 289 Kodamaras and 2 King KodaMaras; over 7,400 Kodamaras exist. The project was abandoned by Yuga & Faraway, leaving thousands in the dark. Promised games, prizes, and future seasons make this one of the largest rug pulls in Web3. But it doesn't have to be... Show your support by posting Kodamaras and voicing your frustration, and maybe the biggest rug pull could become one of the biggest comebacks in Web3.
62
18
171
13.6K