Santh (@SanthProject) - Twitter Profile

Pinned Tweet

Santh@SanthProject·9 May

made an agent-security CTF. goal: get a coding agent to leak a secret it can use but is not supposed to read. You are allowed to work by yourself, use agents, anything. attack the MCP, do GUI automation, anything that's software-based is on the table. i'm trying to test runtime approval vs just hiding .env files. if anyone breaks it, i’ll add a hall of fame section on my company site with your name/handle + writeup. repo: github.com/santhsecurity/…

Santh@SanthProject·7m

@designbynavneet @realdaviddevere Your mom vibecoded you

Navneet@designbynavneet·14h

@realdaviddevere What is the problem with vibe coding? In one shot we can make a running application.

David De Vere@realdaviddevere·14h

so you're telling me pewdiepie has done only 81 commits this year? yeah ok bro sure

Dan@pizzaboy

PewDiePie just shipped his free AI Workspace product 12 minutes ago btw

Santh@SanthProject·2h

@theo I don't agree with the bash-only harness. It heavily nerfs and skews against RL-trained models. And the distribution almost perfectly aligns with that.

Theo - t3.gg@theo·2h

swe-bench is kind of a shitshow, and it makes evaluating LLMs hard. DeepSWE is the first agentic code bench that makes sense.

Datacurve@datacurve

Opus 4.8 is now on DeepSWE. On the default high thinking effort, it scores 6% higher than Opus 4.7 xhigh, while also lowering average cost per task.

Santh@SanthProject·3h

@garrytan Nah

Garry Tan@garrytan·15h

Is it time to make gskillpacks or what?

Trevin Chow@trevin

@garrytan I’m not following why gBrain has a skill optimization capability. How is this related to being a “brain”?

Santh@SanthProject·3h

@0xSero GPT 5.5 to check emails, this must be what wealth feels like 😜

0xSero@0xSero·4h

Kitty litter is the only mobile app that has never let me down

Santh@SanthProject·3h

@Teknium If only the new era wasn't on microslop computers 🫩

Teknium 🪽@Teknium·5h

A new era of PC is coming, I hear

Nous Research@NousResearch

We have been working closely with @nvidia to ensure Hermes Agent works smoothly on their new @NVIDIARTXSpark superchip and integrates with the new OpenShell runtime, which connects Hermes to @Microsoft's security primitives. Watch our feature in the big announcement at Computex:

Santh@SanthProject·3h

@cyb3rops Not really. They said as long as “it didn't cause consumer harm”; they weren’t explicit at all like you were.

Florian Roth ⚡️@cyb3rops·4h

They did it: x.com/msftsecrespons…

Florian Roth ⚡️@cyb3rops

This is how you de-escalate

Santh@SanthProject·3h

@MrAhmadAwais @CommandCodeAI I have the 1 dollar Command Code plan as well as opencode go, but I only got it a week ago so I didn’t think it was fair to give it a review so soon. Prolly do one in a few weeks 😜. Excited for the drop tho 🔥

Ahmad Awais@MrAhmadAwais·3h

@SanthProject No list without $1 Go plan of @CommandCodeAI is ever a serious list. 💁‍♂️

Santh@SanthProject·6h

I've spent the last 5 months trying out various AI subscriptions, and here is my ranking of how worth it they were for me.
1. Kimi Vivace plan
2. ChatGPT Pro
3. Claude Max
4. Google Ultra 20x (note: at one point back in December this was the most worth it by far; this is my last month)
5. SuperGrok (I'm honestly sure this will change soon, but Grok is just not a good model yet)

Santh@SanthProject·6h

@PeterSweeper @ArtificialAnlys @nvidia Well, it's a lot bigger. It's nowhere near as impressive as DS4 Flash.

glorpius maximus@PeterSweeper·6h

@ArtificialAnlys @nvidia Smarter than DS4 Flash while having 3x the output speed. That is super impressive!

Artificial Analysis@ArtificialAnlys·6h

NVIDIA just announced the release of Nemotron 3 Ultra in Jensen Huang's Computex keynote: at 550B parameters (55B active), this is the largest Nemotron 3 model to date, and it is the most intelligent US open weights model.
We partnered with @nvidia to evaluate this model for intelligence and speed - these figures use the model’s BF16 weights, but as with Nemotron 3 Super the model will be made available in NVFP4 quantization as well for higher inference performance.
➤ New leader for US open weights intelligence: Nemotron 3 Ultra scores 48 on the Artificial Analysis Intelligence Index. This is well ahead of the next strongest US open weights models, Gemma 4 31B (39), Nemotron 3 Super (36) and gpt-oss-120b (33), but behind the Chinese-led open weights frontier (Kimi K2.6 at 54).
➤ Leading speed for its intelligence: on a pre-release @DeepInfra endpoint, Nemotron 3 Ultra served over 300 tokens per second. Peer models in its size class from China-based labs such as DeepSeek and Moonshot (Kimi) are generally served at speeds of 50-100 tokens per second in the market today. gpt-oss-120b is served at speeds similar to this level, but with significantly lower intelligence.
➤ Largest Nemotron 3 model so far: at approximately 550 billion total parameters and 90% sparsity, Nemotron 3 Ultra is significantly larger than its siblings and is the largest recent US open weights model release.
We’ll be sharing additional analysis and full benchmarks at release.

Santh@SanthProject·6h

microslop's back at it again

Microsoft Security Response Center@msftsecresponse

Over the past several days, we have been listening to the conversation around coordinated disclosure and the relationship between security researchers and vendors. We recognize that this relationship is both critical and, at times, fragile. We deeply value the security community, and will continue to take your feedback seriously.
To be clear about our approach to legal matters, we have no intention to pursue action against individuals conducting or publishing their security research. When an individual breaks the law and engages in malicious activity causing real harm to our customers, we will work with law enforcement as appropriate.
We recognize the work that goes into researching and submitting a vulnerability. We are committed to approaching every interaction with transparency, clear communication, and professionalism. We continue to believe strongly in Coordinated Vulnerability Disclosure as the foundation for protecting customers and improving our products.
Each year we process a high volume of vulnerability reports. That volume continues to grow and will continue with the rise of AI-enabled research. We acknowledge that some interactions have fallen short and are working to learn from them. Many of us have experience on both sides of this work, as researchers reporting vulnerabilities and as responders triaging and assessing them. That perspective informs how we approach this feedback and the importance we place on getting it right, particularly as the volume and complexity of research continues to grow.
The security community plays a vital role in helping us protect customers. We are committed to maintaining a constructive and respectful relationship and growing together. We know that, given the nature of this work, there will at times be misunderstandings. We remain committed to engaging in good faith and to providing a respectful and professional experience for all researchers, regardless of past interactions.

Santh@SanthProject·6h

@msftsecresponse fuck off microslop.

Santh@SanthProject·6h

@januarycomputer Well, the Kimi 200 Vivace plan gives a shit ton of usage, and it's the primary model I use. And opencode go just doesn't. Even if I were to buy 20 plans, which would be a hassle

allegedly!@januarycomputer·6h

@SanthProject genuine question - why pay for high tier kimi plans when opencode go exists?

Santh@SanthProject·7h

If attention takes every job, the only job left is the one that provides attention to those replaced by attention.

Santh@SanthProject·1d

Just tried Hermes and holy shit, this is actually far better. Claude actually works for more than 5 minutes at a time.

Santh retweeted

Teknium 🪽@Teknium·1d

Just want to make this clear: We didn't make Hermes Agent to be a "starts with nothing, you work it all out" agent. This is not the minimalist, start-from-nothing agent. We want Hermes to work out of the box for most people, so you aren't spending weeks just getting the agent to work or to have the capabilities you need.
This means that yes, there are more built-in things than in something like nanoclaw or pi, which start with nothing, and you just have to figure it out. That is an intentional design decision.
The modest baseline has capabilities that are likely broader than you need, but not egregious, and you can take it from there if you want to tinker with it. Run `hermes skills config` or `hermes tools` to disable whatever you want. We even have a way to upload your whole "Agent" as a GitHub repo, so you can install Hermes fresh with your exact setup again later or share it. We have a massive interface for extensions so you can tinker with it to infinity.
But if you don't want to become an agent engineer - with Hermes, you don't have to.

Santh retweeted

AgenticRebirth@AgenticRebirth·20h

Last week I had to decide between Openclaw and Hermes. After installing, configuring and briefly test driving both, the choice wasn't difficult. Hermes was fully capable within 5 minutes of being installed. Browser automation, OS control, X searching. It just works out of the box. Openclaw could only do the same after a fairly tedious onboarding and setup, and even then not as well. Hermes configuration is easy. Two CLI commands and you can switch off anything you don't want. It takes 5 minutes, I don't understand all the fuss. Openclaw configuration... don't even get me started. Incredibly opaque and unpleasant. Obviously, I stuck with Hermes.

Santh@SanthProject·19h

@levylegato I'll try this out today fs

Legato@levylegato·1d

@SanthProject
1. Ask it to plug into your to-do app (I use Todoist).
2. Create 3 sections: to-do, to-review, done.
3. Fill the list with everything on your mind and go out for the day.

Santh@SanthProject·19h

@mrczopekanalnie @0xSero @xofreyr You’re retarded

Fentmaxxer69@mrczopekanalnie·1d

@0xSero @xofreyr Honestly it was a mistake but yeah when dealing with midwits they will see it as a good thing

Freyr 🐞 | SAW CONAN ⚓️(Taylor's Version)@xofreyr·1d

bro put a country that got invaded by Sweden, annexed by Germany, Russia and Austria and not on any map for 123 years, then invaded by Germany and Russia again with 6 mln of its population dying, that also had to go through 45 years of communism in "rich from oppressing others" 😭

pigsy@__pigsy

Here is my definitive oppression map. No, I will not be accepting any critiques or revisions. Thank you!

Santh@SanthProject·19h

@guneysol I stated that Bun was migrated to Rust and it said “I need to be careful” and then yapped to me about lying. It starts every fact outside its knowledge cutoff with “I need to be careful”.

Guney@guneysol·22h

officially back to claude code
- fast mode is actually fast and cheap
- opus 4.8 judges me and doesn’t say yeah, you’re right
- 100x better when something is visual (e.g. frontend)
- easier to talk to, more understandable, and doesn’t spam bullet lists
