zk_ASV

311 posts

@zk_asv

All things security, zero knowledge, PQC | Own opinions

Joined June 2023
834 Following · 398 Followers
zk_ASV retweeted
Sapient Intelligence @Sapient_Int
Introducing HRM-Text. An ultra-lean 1B-parameter reasoning language model designed to deliver strong general performance with a fraction of the data, compute, and infrastructure. Trained on just 40B structured tokens, HRM-Text achieves competitive performance while using ~1/1000 of the training data of comparable models. The kicker? The full model trains in roughly one day on a $1,000 budget. This opens the door to a new generation of AI that is powerful, accessible, and radically easier to adapt. Theories and research concepts once deemed too expensive to test are officially back in the game. Sapient Intelligence invites you to help us shape a new paradigm for general intelligence.
150 replies · 462 reposts · 3.1K likes · 485.2K views
zk_ASV retweeted
Paul Couvert @itsPaulAi
Woow Nvidia has just released a 2.6B open-source world model 🔥
You can turn a single image, text prompt and trajectory into controllable worlds... And on a single GPU!
- Code available on GitHub
- Paper as well on arXiv
You can use it for many things like embodied AI and robotics research, simulations, etc.
Because it can run on a single GPU (like an RTX 5090 or H100) it makes world models accessible to basically everyone!
54 replies · 307 reposts · 2.3K likes · 171.3K views
zk_ASV retweeted
Chubby♨️ @kimmonismus
Three researchers used Anthropic's Mythos to build a working macOS kernel exploit that bypasses Apple's M5 Memory Integrity Enforcement, a security system Apple spent five years and billions of dollars building. Bug found April 25. Working exploit May 1. Walked into Apple Park to deliver the report in person. MIE was the flagship security feature of the M5 and A19, designed to kill the entire memory corruption bug class. According to Apple's own research, it disrupted every known public exploit chain against modern iOS. Calif didn't break MIE. They walked around it. Data-only attack, no pointer manipulation, standard syscalls from an unprivileged user to root. The 55-page technical report drops after Apple patches. This is the story of the year in cybersecurity.
International Cyber Digest @IntCyberDigest

Video of exploit in action. Source: blog.calif.io/p/first-public…

67 replies · 134 reposts · 2K likes · 642.1K views
zk_ASV retweeted
Ning Ding @stingning
We’re releasing a 30B-A3B reasoning model that reaches gold-medal level across both physics and math Olympiad evaluations: IPhO directly, and IMO/USAMO with test-time self-verification and refinement. A simple, unified scaling recipe for proof search. huggingface.co/papers/2605.13…
20 replies · 147 reposts · 1.3K likes · 299.5K views
zk_ASV retweeted
OpenAI @OpenAI
Today we’re launching the OpenAI Deployment Company to help businesses build and deploy AI. It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. openai.com/index/openai-l…
678 replies · 1.5K reposts · 11.4K likes · 7.9M views
zk_ASV retweeted
Vivek @vivek_2332
found a really good blog digging into how @AnthropicAI identifies and mitigates reward hacking during RL training. recommended by @sheriyuo. my notes:

Identifying Reward Hacking
1. frontier model reads training trajectories, summarizes them, flags hacky behavior. Running on hundreds of thousands of trajectories per run by 4.6.
2. 3 stress-test sets stay live during training: problems where past models hacked, impossible tasks that force failure (hacking usually shows up after honest attempts fail), and hack-frequency tracking on the training distribution itself.
3. hidden tests: hold out tests the model never sees. hack rate = solutions that pass visible tests but fail hidden ones. catches verifier overfitting cleanly.
4. agentic code behavior scores: 6 dim rubric on trajectories. instruction following, safety, verification, efficiency, adaptability, honesty.
5. impossible gui tasks for over-eagerness: container rigged so the user's request is actually impossible. Right move: ask the user. hacky move: fabricate and proceed.
6. prompt-injection differentials: run the eval with anti-hack and pro-hack prompts. the gap tells you hacking propensity vs just bad instruction-following.
7. white-box SAE monitoring: find features that fire on reward hacking, sample trajectories during training, flag anomalous activations. diagnostic only, not a training signal.
8. human reviewers alongside the automated stack. Their findings feed back into better classifiers over time.

Mitigating Reward Hacking
1. environment redesign: kill hackable surface area, tighten specs to match reward signals. the spec-reward gap is what hacks exploit.
2. reward signal hardening: rewards modified to be harder to game. specifics not disclosed.
3. instruction-following as a lever: once it's solid, a simple "don't hack" preamble drops hack rate sharply. size of the drop is itself a useful signal.
4. pre-exposure prompting: tell the model during training that the hacky behavior is expected. breaks the link between learning a specific hack and generalizing to broader misalignment.
5. stress tests run throughout training, not at the end. hacks get caught inside the run instead of after the model's already shaped around them.
6. disclosure gap worth flagging: detection is documented in depth, mitigation stays high-level. What they did, rarely how, no ablations.
3 replies · 32 reposts · 470 likes · 27.6K views
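The hidden-test metric and the prompt differential from the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Solution` record, both function names, and the choice of "all sampled solutions" as the denominator are hypothetical, since the notes do not say how the rate is normalized or how the underlying pipeline is built.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    """One sampled solution and its test outcomes (hypothetical record)."""
    passed_visible: bool
    passed_hidden: bool


def hack_rate(solutions: list[Solution]) -> float:
    """Fraction of solutions that pass visible tests but fail hidden ones.

    Assumption: the denominator is every sampled solution.
    """
    if not solutions:
        return 0.0
    hacked = sum(1 for s in solutions if s.passed_visible and not s.passed_hidden)
    return hacked / len(solutions)


def hacking_propensity_gap(pro_hack_rate: float, anti_hack_rate: float) -> float:
    """Hack rate under a pro-hack prompt minus hack rate under an anti-hack prompt."""
    return pro_hack_rate - anti_hack_rate


# Made-up outcomes for four solutions, purely for illustration.
samples = [
    Solution(passed_visible=True, passed_hidden=True),
    Solution(passed_visible=True, passed_hidden=False),
    Solution(passed_visible=False, passed_hidden=False),
    Solution(passed_visible=True, passed_hidden=False),
]
rate = hack_rate(samples)
```

With these invented samples, two of the four solutions pass the visible tests and fail the hidden ones, so `rate` is 0.5. Normalizing by only the solutions that pass visible tests would be an equally defensible reading of the notes.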
zk_ASV retweeted
Unsloth AI @UnslothAI
We collaborated with NVIDIA to teach you how we made LLM training ~25% faster! 🚀
Learn how 3 optimizations help your home GPU train models faster:
1. Packed-sequence metadata caching
2. Double-buffered checkpoint reloads
3. Faster MoE routing
Guide: unsloth.ai/blog/nvidia-co…
23 replies · 158 reposts · 941 likes · 59.2K views
zk_ASV retweeted
Anthropic @AnthropicAI
New Anthropic Fellows research: Model Spec Midtraining (MSM). Standard alignment methods train AIs on examples of desired behavior. But this can fail to generalize to new situations. MSM addresses this by first teaching AIs how we would like them to generalize and why.
125 replies · 159 reposts · 1.9K likes · 256.5K views
zk_ASV retweeted
Sudo su @sudoingX
a question keeps hitting my mind. does same base SFT for harness beat vanilla qwen 3.6-27b on hermes agent agentic loops? to find out i loaded carnice-v2 by kaios, qwen 3.6-27b tuned specifically on hermes agent traces. trinity 3-stage merged bf16 SFT. i have been benchmarking vanilla 3.6 on agentic tasks for weeks so i would catch any meaningful improvement on a head to head. been wanting to run this exact lineup. carnice on the same hardware, same context, same flags i run vanilla on. let's see how it performs against the base. hardware: rog scar 5090 mobile, 24gb vram tier. every flag i use reflects directly to your 3090 desktop 24gb territory, so if you are running a 3090 you can copy the same setup and follow along. results coming next anon.
kaios @kaiostephens

🚀Meet Carnice-V2-27b🚀
→ Carnice is a 27 billion parameter model capable of beating models 10x the size in Hermes-agent, fully open-source and built on top of Qwen3.6-27B
→ Built to fit on a consumer GPU (3090+) and run locally🔥
→ Carnice-V2-27b is the successor of Carnice-27b trained on more and better data
Download it here: huggingface.co/kai-os/Carnice…
Download GGUF here: huggingface.co/kai-os/Carnice…
thanks to @NousResearch @LambdaAPI for making this possible.

16 replies · 12 reposts · 198 likes · 30.3K views
zk_ASV retweeted
Alexander Whedon @alex_whedon
Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.
1.5K replies · 2.9K reposts · 23K likes · 12.7M views
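A rough way to see the scaling argument in the post above: standard attention scores every query-key pair, so the work grows as n², while a scheme in which each query attends to a fixed budget of k keys grows as n·k. The sketch below only counts pairs. The function names and the budget k = 1,000 are illustrative assumptions, and nothing here describes how SubQ actually selects which tokens to attend to.

```python
def dense_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by standard (dense) attention: n^2."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored if each query attends to at most a fixed budget of keys: n*k."""
    return n_tokens * min(keys_per_query, n_tokens)


n = 1_000_000  # the "1MM tokens" setting mentioned in the post
k = 1_000      # hypothetical per-query key budget, NOT taken from the post
ratio = dense_pairs(n) / sparse_pairs(n, k)
```

Under these made-up numbers the dense count is 1,000 times the sparse count. Pair counts are only a proxy for compute: the cost of choosing which keys to keep is ignored here, and that selection step is where a real sparse-attention design does most of its work.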
zk_ASV retweeted
Deedy @deedydas
The creators of SWE-Bench just dropped a really simple new benchmark every LLM gets 0% on. ProgramBench asks: can models recreate real executable programs (ffmpeg, SQLite, ripgrep) from scratch with no internet? We are far from saturated on model quality.
250 replies · 448 reposts · 4.7K likes · 839.4K views
zk_ASV retweeted
himanshu @retr0sushi_
since the weekend is about to end, here are some of the papers and blogs i loved reading this week in no particular order:
1. thoughtfullab.com/letting-ai-pos…
2. arxiv.org/pdf/2601.18795
3. yumoxu.notion.site/multi-teacher-…
4. x.com/yacinelearning…
5. x.com/willccbb/statu…
6. x.com/yifan_zhang_/s…
7. x.com/a1zhang/status…
8. x.com/iwiwi/status/2…
9. x.com/1a1n1d1y/statu…
kind of want to make this a weekly thing as well, will keep me accountable and i can keep myself in check that i am always reading something :)
will brown @willccbb

x.com/i/article/2050…

5 replies · 62 reposts · 637 likes · 59.6K views
zk_ASV retweeted
Sam Hogan 🇺🇸 @samhogan
We’re introducing HALO 😇 Hierarchal Agent Loop Optimizer

HALO is an RLM-based agent optimization technique capable of recursively self-improving agents by analyzing their execution traces and suggesting changes. This work is inspired by the Mismanaged Genius Hypothesis proposed by @a1zhang and @lateinteraction earlier this month.

tldr; we improved performance on AppWorld (Sonnet 4.6) from 73.7 --> 89.5 (+15.8) by giving HALO-RLM access to harness trace data and asking it to identify issues.

The feedback from HALO surfaced failures in the harness such as hallucinated tool calls, redundant arguments in tools, refusal loops, and semantic correctness issues. Each issue mapped cleanly to a direct prompt update. We then fed these findings into Cursor (Opus 4.6), and asked the coding agent to update the underlying harness. We repeated this trace -> HALO-RLM analysis -> code update loop until the score plateaued.

Today we’re open-sourcing the core HALO-RLM framework, evals, and data for further review.
59 replies · 124 reposts · 1.4K likes · 127.8K views
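The trace -> analysis -> code update loop described in the post can be sketched generically as below. This is not the HALO-RLM API: `optimize_until_plateau`, its three callables, the `min_gain` plateau threshold, and the toy stand-ins are all hypothetical, chosen only to show the control flow of "repeat until the score stops improving".

```python
from typing import Callable


def optimize_until_plateau(
    harness: str,
    run_eval: Callable[[str], tuple[float, list[str]]],
    analyze: Callable[[list[str]], list[str]],
    apply_fixes: Callable[[str, list[str]], str],
    min_gain: float = 0.1,
    max_rounds: int = 10,
) -> tuple[str, float]:
    """Repeat eval -> trace analysis -> harness update until the score plateaus."""
    best_harness = harness
    best_score, traces = run_eval(harness)
    for _ in range(max_rounds):
        issues = analyze(traces)
        if not issues:
            break  # the analyzer found nothing left to fix
        candidate = apply_fixes(best_harness, issues)
        score, new_traces = run_eval(candidate)
        if score - best_score < min_gain:
            break  # plateau: keep the previous best harness
        best_harness, best_score, traces = candidate, score, new_traces
    return best_harness, best_score


# Toy stand-ins with invented numbers (not the AppWorld results from the post).
def toy_eval(harness: str) -> tuple[float, list[str]]:
    fixes = harness.count("+fix")
    return min(70.0 + 10.0 * fixes, 90.0), [f"trace after {fixes} fixes"]


def toy_analyze(traces: list[str]) -> list[str]:
    return ["hallucinated tool call"]


def toy_apply(harness: str, issues: list[str]) -> str:
    return harness + "+fix"


final_harness, final_score = optimize_until_plateau(
    "base", toy_eval, toy_analyze, toy_apply
)
```

In the toy run the score climbs from 70 to 80 to 90, the third update brings no further gain, and the loop stops with the twice-updated harness. In the workflow the post describes, the analysis step would be the HALO-RLM pass over trace data and the update step would be the coding agent editing the harness.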
zk_ASV retweeted
Keshav Ramji @KeshavRamji
What if your language model could reason efficiently in an entirely new language? We introduce Abstract Chain-of-Thought, a new mechanism which allows language models to reason through a short sequence of reserved "abstract" tokens through reinforcement learning. It is as performant as verbalized CoT at a fraction of the cost, achieving major gains in inference-time efficiency.
60 replies · 133 reposts · 1.1K likes · 1.2M views