wagwan

2.4K posts

wagwan

@ukelelelelo

basement Joined March 2023

433 Following 137 Followers

Pinned Tweet

wagwan@ukelelelelo·Oct 27

machines of loving grace take my soul and give me a min risk estimate

378

wagwan@ukelelelelo·12h

claude more like c-laude

wagwan@ukelelelelo·12h

euphoria is turning out to be interesting

wagwan retweeted

Cat is life ~猫は人生~@nekokolife·1d

A photo of a platypus seen from the front

150

6.1K

45.6K

1.1M

wagwan@ukelelelelo·1d

we bare bears

insane poses@insaneposes

wagwan@ukelelelelo·2d

got so frustrated with the trivia night on wed in stackers, decided to code my own version for friends ; project update soon

wagwan retweeted

elie@eliebakouch·4d

Qwen's first release on interpretability (qwen scope) is very interesting. They use SAE features to identify what causes repetition in model outputs, then use steering to manufacture a "bad" rollout where the model repeats a lot. This gives RL a clear negative signal to learn from, since repetition barely shows up in normal rollouts, so the model never gets punished for it. They also use SAE features as a fingerprint for benchmarks: you look at which features each benchmark activates and compare the overlap. This lets you find redundancy inside a benchmark and across benchmarks without running any model. For instance, 63% of GSM8K features are in MATH, but only 10% the other way.

116

784

38.8K

sharvi@sharvi_endait·5d

cool when your research project sparks a discussion in the ai community

Rohan Paul@rohanpaul_ai

This paper teaches LLMs to avoid showing personally identifiable information (PII) in chain of thought, without hurting answers much. The problem is that chain of thought, meaning the model's step by step notes, can spill names, health details, or account data even if the final reply is cleaned. Instead of removing private bits after the model thinks, the authors try to make it reason with placeholders so sensitive details never get written out. They created a dataset of 350 prompts with privacy safe reasoning traces, then tested 2 approaches, prompt engineering, meaning stricter instructions, and supervised fine tuning, meaning training on examples. They measured leakage by counting how much input PII shows up in the reasoning text and by using another LLM as a judge, and they found prompts help strong models most while fine tuning helps weaker ones. That matters because chain of thought often gets logged inside apps, so private reasoning reduces a big hidden privacy risk. ---- Paper Link – arxiv.org/abs/2601.05076 Paper Title: "Chain-of-Sanitized-Thoughts: Plugging PII Leakage in CoT of Large Reasoning Models"

206

wagwan@ukelelelelo·5d

@sharvi_endait lfg 🥳🥳🥳

wagwan retweeted

Kxir 🪐@ikxir1·6d

Friend: Bro, how do you hide booze at home?

101

2.4K

49.5K

wagwan retweeted

Sami Gold@souljagoyteller·Apr 27

Found in New York magazine, September 20 1993

264

11.3K

662.4K

wagwan@ukelelelelo·Apr 27

your lips , my lips , uncle chips 🗣️🗣️🗣️

wagwan@ukelelelelo·Apr 27

i am in a room filled with people who are not in tech, and they are actually happy with their lives - honestly it feels like making decisions for a high-paying job always leads to dissatisfaction with life
