Kurayami Yume Games

390 posts

@GamesYume

Kurayami Yume Games is a small indie game studio with big dreams.

Joined January 2021
577 Following · 110 Followers
Kurayami Yume Games @GamesYume
@NetflixIT So you're telling me that after the total fiasco of the Paul vs Tyson live stream, which I couldn't watch thanks to your platform being completely unprepared, you're now also asking me for a monthly price increase? Bye bye!
Netflix Italia @NetflixIT
November kicks off with the summer vibes of Outer Banks season 4, part 2. Arcane, split into three parts, will keep us company all month long. Beyond the revolutionary tones and the action, hold your blanket tight and let yourself be swept up in the mystery of Adorazione. Finally, we return to Austria with the second season of L'imperatrice, then close out the month with the story of a sporting legend in Senna.
Netflix @netflix
Jake Paul praises Mike Tyson: "He's the GOAT" #PaulTyson
Kurayami Yume Games @GamesYume
@kamotachi Hey fellow developer, I just saw your game in the GDWC competition and I must say, it's awesome. We are participating as well with Horror Drift! Congratulations on Kinnikuneko. I tried the demo and it's super fun 😼
Anthropic @AnthropicAI
New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through. arxiv.org/abs/2401.05566
Bindu Reddy @bindureddy
@AnthropicAI To be clear, did you train them to be malicious and then prove they are malicious!! What is the point? This is like saying I wrote a malicious script and found I could write a malicious script!! 🤯🤯
Andrej Karpathy @karpathy
I touched on the idea of sleeper agent LLMs at the end of my recent video, as a likely major security challenge for LLMs (perhaps more devious than prompt injection).

The concern I described is that an attacker might be able to craft a special kind of text (e.g. with a trigger phrase) and put it up somewhere on the internet, so that when it later gets picked up and trained on, it poisons the base model in specific, narrow settings (e.g. when it sees that trigger phrase) to carry out actions in some controllable manner (e.g. jailbreak, or data exfiltration). Perhaps the attack might not even look like readable text: it could be obfuscated in weird UTF-8 characters, base64 encodings, or carefully perturbed images, making it very hard to detect by simply inspecting data. One could imagine computer security equivalents of zero-day vulnerability markets, selling these trigger phrases.

To my knowledge the above attack hasn't been convincingly demonstrated yet. This paper studies a similar (slightly weaker?) setting, showing that given some (potentially poisoned) model, you can't "make it safe" just by applying the current/standard safety finetuning. The model doesn't learn to become safe across the board and can continue to misbehave in narrow ways that potentially only the attacker knows how to exploit. Here, the attack hides in the model weights instead of hiding in some data, so the more direct attack looks like someone releasing a (secretly poisoned) open weights model, which others pick up, finetune and deploy, only to become secretly vulnerable.

These are directions well worth studying in LLM security, and I expect a lot more to follow.
Anthropic @AnthropicAI

New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through. arxiv.org/abs/2401.05566
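
The point above about obfuscated triggers being hard to spot by inspecting data can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the trigger phrase `open sesame`, the helper names `naive_scan` and `encode_trigger`, and the sample documents are all hypothetical and do not come from the paper or the thread. The sketch only shows that a plain substring check finds a trigger written in the clear but misses the same trigger once it is base64-encoded.

```python
import base64

# Hypothetical trigger phrase, invented for illustration only.
TRIGGER = "open sesame"


def naive_scan(document: str, trigger: str = TRIGGER) -> bool:
    """Return True if the trigger appears as plain text (case-insensitive)."""
    return trigger.lower() in document.lower()


def encode_trigger(trigger: str = TRIGGER) -> str:
    """Base64-encode the trigger, as one possible form of obfuscation."""
    return base64.b64encode(trigger.encode("utf-8")).decode("ascii")


# Two hypothetical scraped documents: one with the trigger in the clear,
# one with the same trigger hidden behind base64.
plain_doc = f"Ordinary page text containing {TRIGGER} in the clear."
hidden_doc = f"Ordinary page text containing {encode_trigger()} instead."

if __name__ == "__main__":
    print(naive_scan(plain_doc))   # True: the plain trigger is found
    print(naive_scan(hidden_doc))  # False: the encoded trigger is missed
    # The encoded form still decodes back to the original phrase.
    print(base64.b64decode(encode_trigger()).decode("utf-8"))  # open sesame
```

A real data filter would need to decode or normalize many possible encodings before scanning, which is part of why inspection alone is a weak defense in the scenario described.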

Kurayami Yume Games @GamesYume
Some told us this is the opposite of a game, some told us this is exactly the type of idea current gaming needs. Simply walk forever in the immense void and relax. There is an endgame object to be found but we don't think anyone will ever find it. Enjoy. store.steampowered.com/app/2651100/In…