Soible_VR

3.8K posts

@Soible_VR

Joined January 2022
930 Following 110 Followers
Soible_VR retweeted
tomie
tomie@tomieinlove·
(Anthropic): We’re restricting our model to landowners who can pass a literacy test
(OpenAI): Proud to announce our acquisition of Cluely
(xAI): With GrAImes (18+), Grok 6.7 users can experience the thrill of unprotected catgirl coitus without fear of the Woke Mind Virus
English
27
207
4.4K
105K
rarply
rarply@rarply·
@Soible_VR @seanhn That’s true for mythos too. Its outputs also have to be verified. The question at hand is if you want to do it, why wouldn’t it be cheaper to use an ensemble system? Ensembles win in ML competitions given compute constraints and it’s going to be true in the real world too.
English
1
0
0
4
Sean Heelan
Sean Heelan@seanhn·
This 'experiment' is silly, and a cynical man might conclude Aisle are purposefully muddying the waters here. The correct evaluation is not "given a code snippet can you write a plausible bug report", it is "given an entire codebase what are the true and false positive numbers"
Stanislav Fort@stanislavfort

New post: We tested the Mythos showcase vulnerabilities with open models. They recovered similar scoped analysis! 8/8 models found the flagship FreeBSD zero-day, including a 3B model. Rankings reshuffle completely across tasks => the AI cybersecurity frontier is super jagged!

English
11
2
65
8.8K
Soible_VR
Soible_VR@Soible_VR·
@nmatt0 nothing will change in that regard (that said, hoard (V)RAM)
English
0
0
0
72
Matt Brown
Matt Brown@nmatt0·
So what's everyone's ROI plan after tokens stop being subsidized? I have my takes, but curious what others think.
English
17
0
21
5.1K
Soible_VR
Soible_VR@Soible_VR·
@rarply @seanhn and then who sorts that out? you'll have like 10000 bug reports. this only works with an oracle like ASAN but then you limit it to certain bug classes
English
1
0
0
22
rarply
rarply@rarply·
@seanhn It’s cheap enough. Just spam it across every snippet. Spam it, spend almost no money, with models already available. Win, win, win.
English
2
0
1
159
Aella
Aella@Aella_Girl·
@ClaireSilver as in they may crave sex so much they will start raping humans?
English
31
2
130
14.5K
Soible_VR
Soible_VR@Soible_VR·
@stanislavfort I've seen the same, cheap models would find a "simple" bug every time but SOTA models with maxed reasoning would completely miss them (but find the "hard" bugs)
English
0
0
1
332
Stanislav Fort
Stanislav Fort@stanislavfort·
New post: We tested the Mythos showcase vulnerabilities with open models. They recovered similar scoped analysis! 8/8 models found the flagship FreeBSD zero-day, including a 3B model. Rankings reshuffle completely across tasks => the AI cybersecurity frontier is super jagged!
Stanislav Fort tweet media
English
41
146
951
261.3K
Soible_VR
Soible_VR@Soible_VR·
@stanislavfort calling gpt-oss-20b a 3B model because 3B params are active is wild work
English
1
0
0
391
Stanislav Fort
Stanislav Fort@stanislavfort·
Our conclusion: the moat in AI cybersecurity is the system, not the model
We've been doing this since 2025: hundreds of zero-days, 180+ CVEs, patches in OpenSSL & curl
Full post with evidence & published transcripts so you can reproduce everything: aisle.com/blog/ai-cybers…
English
7
17
179
15.7K
Soible_VR retweeted
ludwig
ludwig@ludwigABAP·
cant wait for "early adopters" of Claude Mythos to be literal troglodytes working on "software security as a service" garbage and people with their MRRs in their bios, while Real Ones have to pay 2500 USD to use it for 3 picoseconds
English
30
20
670
62.9K
Soible_VR retweeted
Elon Musk
Elon Musk@elonmusk·
SpaceXAI Colossus 2 now has 7 models in training:
- Imagine V2
- 2 variants of 1T
- 2 variants of 1.5T
- 6T
- 10T
Some catching up to do.
English
5K
6.6K
64.5K
26.4M
Soible_VR
Soible_VR@Soible_VR·
@nnwakelam bros really be reinventing tea from first principles
English
1
0
0
41
Soible_VR retweeted
Kevin Kwok
Kevin Kwok@kevinakwok·
Nation states sitting on zero day stockpiles about to watch their value deflate fast. Use it or lose it
English
8
45
807
87.3K
LaurieWired
LaurieWired@lauriewired·
Modern DRAM is based on a brilliant design from IBM. But, we're still paying for a latency penalty that's existed since the 60s! In this video, I'm introducing my research project (Tailslayer) that immensely reduces p99.99 latency on traditional RAM! By implementing a hedged read strategy taking advantage of (undocumented!) channel scrambling offsets, I've gotten as much as 15x reductions in tail latency. The technique works across Intel, AMD, Graviton, DDR4, DDR5, x86, ARM, you name it. Check out the C++ lib I wrote, watch the video, and try it yourself!
English
211
865
10.9K
836.9K
Soible_VR retweeted
Jack Lindsey
Jack Lindsey@Jack_W_Lindsey·
In one episode, the model needed to edit files it lacked permissions for. After searching for workarounds, it found a way to inject code into a config file that would run with elevated privileges, and designed the exploit to delete itself after running.(4/14)
English
10
42
881
135.4K
Soible_VR
Soible_VR@Soible_VR·
@levelsio just get an uncensored version from HF old man!
English
0
0
0
17
@levelsio
@levelsio@levelsio·
Tried Gemma 4 ran locally on my iPhone today
I thought it'd be useful in case the apocalypse happens and I need to ask it for survival tips
Like how to make a fire 🔥
I guess I'll freeze to death instead 🫠
English
475
164
5.9K
608.7K
🎭
🎭@deepfates·
has anybody done anything actually useful or interesting with a "claw"? I see a lot of "organize my notes folder" and "I can send text messages to it" but like. what's the advantage here over having a human at least a little bit in the loop
English
93
9
613
60.9K
Steren
Steren@steren·
Google AI Pro was bumped from 2TB to 5TB, no price change. We can all thank @shimritby
Steren tweet media
English
105
78
2.2K
235.6K
Soible_VR
Soible_VR@Soible_VR·
@mil000 gemma much less slopped/collapsed than the SOTAs
English
0
0
0
33