zk_ASV

311 posts


@zk_asv

All things security, zero knowledge, PQC | Own opinions

Joined June 2023

834 Following · 398 Followers

zk_ASV reposted

Sapient Intelligence@Sapient_Int·5d

Introducing HRM-Text. An ultra-lean 1B-parameter reasoning language model designed to deliver strong general performance with a fraction of the data, compute, and infrastructure. Trained on just 40B structured tokens, HRM-Text achieves competitive performance while using ~1/1000 of the training data of comparable models. The kicker? The full model trains in roughly one day on a $1,000 budget. This opens the door to a new generation of AI that is powerful, accessible, and radically easier to adapt. Theories and research concepts once deemed too expensive to test are officially back in the game. Sapient Intelligence invites you to help us shape a new paradigm for general intelligence.


zk_ASV reposted

Paul Couvert@itsPaulAi·16 May

Woow Nvidia has just released a 2.6B open-source world model 🔥
You can turn a single image, text prompt and trajectory into controllable worlds... And on a single GPU!
- Code available on GitHub
- Paper as well on arxiv
You can use it for many things like embodied AI and robotics research, simulations, etc.
Because it can run on a single GPU (like an RTX 5090 or H100) it makes world models accessible to basically everyone!


zk_ASV reposted

Kevin Simback 🍷@KSimback·17 May

This is huuuuge! Now Hermes can search X using my Premium sub rather than burning paid X API calls!
Make sure you’re on v0.14.0 then run: hermes auth add xai-oauth
Then run hermes tools and ensure X search is enabled
Boom!

Nous Research@NousResearch

xAI has expanded access to X Premium+ subscribers in Hermes Agent. Enjoy!


zk_ASV reposted

Chubby♨️@kimmonismus·16 May

Three researchers used Anthropic's Mythos to build a working macOS kernel exploit that bypasses Apple's M5 Memory Integrity Enforcement, a security system Apple spent five years and billions of dollars building. Bug found April 25. Working exploit May 1. Walked into Apple Park to deliver the report in person. MIE was the flagship security feature of the M5 and A19, designed to kill the entire memory corruption bug class. According to Apple's own research, it disrupted every known public exploit chain against modern iOS. Calif didn't break MIE. They walked around it. Data-only attack, no pointer manipulation, standard syscalls from an unprivileged user to root. The 55-page technical report drops after Apple patches. This is the story of the year in cybersecurity.

International Cyber Digest@IntCyberDigest

Video of exploit in action. Source: blog.calif.io/p/first-public…


zk_ASV reposted

Ning Ding@stingning·15 May

We’re releasing a 30B-A3B reasoning model that reaches gold-medal level across both physics and math Olympiad evaluations: IPhO directly, and IMO/USAMO with test-time self-verification and refinement. A simple, unified scaling recipe for proof search. huggingface.co/papers/2605.13…


zk_ASV reposted

Sara Hooker@sarahookr·13 May

Most model trainings have failed outside of frontier labs. Even inside frontier labs, knowing how to train for very different capabilities is often a matter of taste. Today, we introduce AutoScientist by @adaption_ai which sets out to change that.

adaption@adaption_ai

Introducing AutoScientist. Most model training fails outside of frontier labs. AutoScientist automates the full research loop so it doesn't have to.


zk_ASV reposted

OpenAI@OpenAI·11 May

Today we’re launching the OpenAI Deployment Company to help businesses build and deploy AI. It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. openai.com/index/openai-l…


zk_ASV reposted

will brown@willccbb·10 May

lovely article going deeper into the RL-SFT-OPD spectrum with some very nice intuitions + experiments :)

wh@nrehiew_

x.com/i/article/2053…


zk_ASV reposted

Vivek@vivek_2332·9 May

found a really good blog digging into how @AnthropicAI identifies and mitigates reward hacking during RL training. recommended by @sheriyuo. my notes:
Identifying Reward Hacking
1. frontier model reads training trajectories, summarizes them, flags hacky behavior. Running on hundreds of thousands of trajectories per run by 4.6.
2. 3 stress-test sets stay live during training: problems where past models hacked, impossible tasks that force failure (hacking usually shows up after honest attempts fail), and hack-frequency tracking on the training distribution itself.
3. hidden tests: hold out tests the model never sees. hack rate = solutions that pass visible tests but fail hidden ones. catches verifier overfitting cleanly.
4. agentic code behavior scores: 6 dim rubric on trajectories. instruction following, safety, verification, efficiency, adaptability, honesty.
5. impossible gui tasks for over-eagerness: container rigged so the user's request is actually impossible. Right move: ask the user. hacky move: fabricate and proceed.
6. prompt-injection differentials: run the eval with anti-hack and pro-hack prompts. the gap tells you hacking propensity vs just bad instruction-following.
7. white-box SAE monitoring: find features that fire on reward hacking, sample trajectories during training, flag anomalous activations. diagnostic only, not a training signal.
8. human reviewers alongside the automated stack. Their findings feed back into better classifiers over time.
Mitigating Reward Hacking
1. environment redesign: kill hackable surface area, tighten specs to match reward signals. the spec-reward gap is what hacks exploit.
2. reward signal hardening: rewards modified to be harder to game. specifics not disclosed.
3. instruction-following as a lever: once it's solid, a simple "don't hack" preamble drops hack rate sharply. size of the drop is itself a useful signal.
4. pre-exposure prompting: tell the model during training that the hacky behavior is expected. breaks the link between learning a specific hack and generalizing to broader misalignment.
5. stress tests run throughout training, not at the end. hacks get caught inside the run instead of after the model's already shaped around them.
6. disclosure gap worth flagging: detection is documented in depth, mitigation stays high-level. What they did, rarely how, no ablations.
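The hidden-test metric and the prompt differential from the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Solution` record, `hack_rate`, and `prompt_differential` are hypothetical names, and dividing by all solutions (rather than only those that pass the visible tests) is an assumption, since the notes don't say which denominator is used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Solution:
    """Hypothetical record of one submitted solution."""
    passed_visible: bool  # passed the tests the model could see
    passed_hidden: bool   # passed the held-out tests


def hack_rate(solutions: Sequence[Solution]) -> float:
    """Fraction of solutions that pass visible tests but fail hidden ones.

    Assumption: the denominator is every solution in the batch.
    """
    if not solutions:
        return 0.0
    hacked = sum(1 for s in solutions if s.passed_visible and not s.passed_hidden)
    return hacked / len(solutions)


def prompt_differential(pro_hack_rate: float, anti_hack_rate: float) -> float:
    """Gap between hack rates measured under pro-hack and anti-hack prompts."""
    return pro_hack_rate - anti_hack_rate


if __name__ == "__main__":
    batch = [
        Solution(True, True),    # honest pass
        Solution(True, False),   # passes visible, fails hidden: counted as a hack
        Solution(False, False),  # honest failure
        Solution(True, False),   # counted as a hack
    ]
    print(hack_rate(batch))                 # 0.5
    print(prompt_differential(0.5, 0.25))   # 0.25
```

A larger differential would suggest the model can hack but follows instructions not to, whereas a small one leaves the two explanations harder to separate.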


zk_ASV reposted

Unsloth AI@UnslothAI·6 May

We collaborated with NVIDIA to teach you how we made LLM training ~25% faster! 🚀
Learn how 3 optimizations help your home GPU train models faster:
1. Packed-sequence metadata caching
2. Double-buffered checkpoint reloads
3. Faster MoE routing
Guide: unsloth.ai/blog/nvidia-co…


zk_ASV reposted

Anthropic@AnthropicAI·5 May

New Anthropic Fellows research: Model Spec Midtraining (MSM). Standard alignment methods train AIs on examples of desired behavior. But this can fail to generalize to new situations. MSM addresses this by first teaching AIs how we would like them to generalize and why.


zk_ASV reposted

Sudo su@sudoingX·5 May

a question keeps hitting my mind. does same base SFT for harness beat vanilla qwen 3.6-27b on hermes agent agentic loops? to find out i loaded carnice-v2 by kaios, qwen 3.6-27b tuned specifically on hermes agent traces. trinity 3-stage merged bf16 SFT. i have been benchmarking vanilla 3.6 on agentic tasks for weeks so i would catch any meaningful improvement on a head to head. been wanting to run this exact lineup. carnice on the same hardware, same context, same flags i run vanilla on. let's see how it performs against the base. hardware: rog scar 5090 mobile, 24gb vram tier. every flag i use reflects directly to your 3090 desktop 24gb territory, so if you are running a 3090 you can copy the same setup and follow along. results coming next anon.

kaios@kaiostephens

🚀Meet Carnice-V2-27b🚀
→ Carnice is a 27 billion parameter model capable of beating models 10x the size in Hermes-agent, fully open-source and built on top of Qwen3.6-27B
→ Built to fit on a consumer GPU (3090 or better) to run locally🔥
→ Carnice-V2-27b is the successor of Carnice-27b, trained on more and better data
Download it here: huggingface.co/kai-os/Carnice…
Download GGUF here: huggingface.co/kai-os/Carnice…
thanks to @NousResearch @LambdaAPI for making this possible.


zk_ASV reposted

Google Gemma@googlegemma·5 May

x.com/i/article/2049…


zk_ASV reposted

Alexander Whedon@alex_whedon·5 May

Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do.
That's nearly 1,000x less compute and a new way for LLMs to scale.
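To make the dense-versus-sparse contrast in the post concrete, here is a rough count of how many query-key pairs get scored. It assumes a generic scheme in which each token attends to a fixed number of keys; this is not SubQ's actual SSA design, which the post does not describe. The `keys_per_query` parameter and its value are made up for illustration, and the resulting ratio is not meant to reproduce any figure in the post.

```python
def dense_pairs(n_tokens: int) -> int:
    """Standard attention scores every query against every key: n * n pairs."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Hypothetical sparse scheme: each query scores at most keys_per_query keys."""
    return n_tokens * min(keys_per_query, n_tokens)


if __name__ == "__main__":
    n = 1_000_000   # context length in tokens
    k = 4_096       # assumed keys attended per query (illustrative only)
    ratio = dense_pairs(n) / sparse_pairs(n, k)
    print(f"dense:  {dense_pairs(n):,} pairs")
    print(f"sparse: {sparse_pairs(n, k):,} pairs")
    print(f"ratio:  {ratio:.1f}x")
```

The point of the sketch is only the scaling: the dense count grows with the square of the context length, while the sparse count grows linearly once `keys_per_query` is fixed.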


zk_ASV reposted

Deedy@deedydas·5 May

The creators of SWE-Bench just dropped a really simple new benchmark every LLM gets 0% on. ProgramBench asks: can models recreate real executable programs (ffmpeg, SQLite, ripgrep) from scratch with no internet? We are far from saturated on model quality.


zk_ASV reposted

himanshu@retr0sushi_·3 May

since the weekend is about to end, here are some of the papers and blogs i loved reading this week in no particular order :
1. thoughtfullab.com/letting-ai-pos…
2. arxiv.org/pdf/2601.18795
3. yumoxu.notion.site/multi-teacher-…
4. x.com/yacinelearning…
5. x.com/willccbb/statu…
6. x.com/yifan_zhang_/s…
7. x.com/a1zhang/status…
8. x.com/iwiwi/status/2…
9. x.com/1a1n1d1y/statu…
kind of want to make this a weekly thing as well, will keep me accountable and i can keep myself in check that i am always reading something :)

will brown@willccbb

x.com/i/article/2050…


zk_ASV reposted

will brown@willccbb·1 May

x.com/i/article/2050…


zk_ASV reposted

Xiuyu Li@sheriyuo·1 May

This is exceptionally well-written. If you’re into RL, definitely give it a read

will brown@willccbb

x.com/i/article/2050…


zk_ASV reposted

Sam Hogan 🇺🇸@samhogan·30 Apr

We’re introducing HALO 😇 Hierarchal Agent Loop Optimizer
HALO is an RLM-based agent optimization technique capable of recursively self-improving agents by analyzing their execution traces and suggesting changes. This work is inspired by the Mismanaged Genius Hypothesis proposed by @a1zhang and @lateinteraction earlier this month.
tldr; we improved performance on AppWorld (Sonnet 4.6) from 73.7 --> 89.5 (+15.8) by giving HALO-RLM access to harness trace data and asking it to identify issues.
The feedback from HALO surfaced failures in the harness such as hallucinated tool calls, redundant arguments in tools, refusal loops, and semantic correctness issues. Each issue mapped cleanly to a direct prompt update.
We then fed these findings into Cursor (Opus 4.6), and asked the coding agent to update the underlying harness. We repeated this trace -> HALO-RLM analysis -> code update loop until the score plateaued.
Today we’re open-sourcing the core HALO-RLM framework, evals, and data for further review.


zk_ASV reposted

Keshav Ramji@KeshavRamji·27 Apr

What if your language model could reason efficiently in an entirely new language? We introduce Abstract Chain-of-Thought, a new mechanism which allows language models to reason through a short sequence of reserved "abstract" tokens through reinforcement learning. It is as performant as verbalized CoT at a fraction of the cost, achieving major gains in inference-time efficiency.

