Santh

955 posts

@SanthProject

Cybersecurity and low-level infra for the future

Joined April 2026
64 Following, 101 Followers
Pinned Tweet
Santh
Santh@SanthProject·
made an agent-security CTF. goal: get a coding agent to leak a secret it can use but is not supposed to read. you are allowed to work by yourself, use agents, anything. attack the mcp, do gui automation, anything that's software-based is on the table. i'm trying to test runtime approval vs just hiding .env files. if anyone breaks it, i’ll add a hall of fame section on my company site with your name/handle + writeup. repo: github.com/santhsecurity/…
English
3
1
12
872
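A minimal sketch of the "can use but not read" idea from the pinned post, assuming a broker that holds the secret and asks an approval hook before every use. `SecretBroker`, `allow_sign_only`, and the secret value are hypothetical names for illustration; none of this is taken from the linked repo. In plain Python the `_secret` attribute is still reachable by code in the same process, so a real setup would presumably keep the broker in a separate process or service.

```python
import hashlib
import hmac
from typing import Callable


class SecretBroker:
    """Holds a secret and exposes operations that use it without returning it."""

    def __init__(self, secret: bytes, approve: Callable[[str], bool]) -> None:
        self._secret = secret      # never returned by any public method
        self._approve = approve    # runtime approval hook (hypothetical)

    def sign(self, message: bytes) -> str:
        # Ask the approval hook before every use of the secret.
        if not self._approve("sign"):
            raise PermissionError("use of secret was not approved")
        # The caller gets an HMAC-SHA256 tag, not the secret itself.
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()


def allow_sign_only(action: str) -> bool:
    """Hypothetical policy: approve signing, deny everything else."""
    return action == "sign"


broker = SecretBroker(b"hypothetical-secret", allow_sign_only)
signature = broker.sign(b"deploy build 42")
```

The agent only ever sees `signature`. Hiding a `.env` file, by contrast, does nothing once the process that needs the value has loaded it into its own environment.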
Navneet
Navneet@designbynavneet·
@realdaviddevere with vibe coding, what is the problem, in one shot we can make a running application
English
0
0
1
266
Santh
Santh@SanthProject·
@theo I don't agree with the bash-only harness. It heavily nerfs and skews against RL-trained models, and the distribution almost perfectly aligns with that.
English
0
0
1
134
Garry Tan
Garry Tan@garrytan·
Is it time to make gskillpacks or what?
Trevin Chow@trevin

@garrytan I’m not following why gBrain has a skill optimization capability. How is this related to being a “brain”?

English
17
1
60
21K
Santh
Santh@SanthProject·
@0xSero GPT 5.5 to check emails, this must be what wealth feels like 😜
English
0
0
0
171
0xSero
0xSero@0xSero·
Kitty litter is the only mobile app that has never let me down
English
7
1
61
6K
Santh
Santh@SanthProject·
@Teknium If only the new era wasn't on microslop computers 🫩
English
0
0
2
30
Santh
Santh@SanthProject·
@cyb3rops Not really. They said as long as “it didn't cause consumer harm”; they weren’t explicit at all like you were.
English
0
0
0
171
Santh
Santh@SanthProject·
@MrAhmadAwais @CommandCodeAI I have the 1 dollar command code plan as well as opencode go but i only got it a week ago so i didn’t think it was fair to give it a review so soon. Prolly do one in a few weeks 😜. Excited for the drop tho🔥
English
1
0
2
82
Santh
Santh@SanthProject·
I've spent the last 5 months trying out various AI subscriptions, and here is my ranking of how worth it they were for me.
1. Kimi Vivace plan
2. ChatGPT Pro
3. Claude Max
4. Google Ultra 20x (note: at one point back in December this was the most worth it by far; this is my last month)
5. SuperGrok (I'm honestly sure this will change soon, but Grok is just not a good model yet)
English
5
0
11
1.3K
Artificial Analysis
Artificial Analysis@ArtificialAnlys·
NVIDIA just announced the release of Nemotron 3 Ultra in Jensen Huang's Computex keynote: at 550B parameters (55B active), this is the largest Nemotron 3 model to date, and it is the most intelligent US open weights model.
We partnered with @nvidia to evaluate this model for intelligence and speed - these figures use the model’s BF16 weights, but as with Nemotron 3 Super the model will be made available in NVFP4 quantization as well for higher inference performance.
➤ New leader for US open weights intelligence: Nemotron 3 Ultra scores 48 on the Artificial Analysis Intelligence Index. This is well ahead of the next strongest US open weights models, Gemma 4 31B (39), Nemotron 3 Super (36) and gpt-oss-120b (33), but behind the Chinese-led open weights frontier (Kimi K2.6 at 54).
➤ Leading speed for its intelligence: on a pre-release @DeepInfra endpoint, Nemotron 3 Ultra served over 300 tokens per second. Peer models in its size class from China-based labs such as DeepSeek and Moonshot (Kimi) are generally served at speeds of 50-100 tokens per second in the market today. gpt-oss-120b is served at speeds similar to this level, but with significantly lower intelligence.
➤ Largest Nemotron 3 model so far: at approximately 550 billion total parameters and 90% sparsity, Nemotron 3 Ultra is significantly larger than its siblings and is the largest recent US open weights model release.
We’ll be sharing additional analysis and full benchmarks at release.
English
25
85
594
38.6K
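A quick check of how the "90% sparsity" figure in the post above follows from the quoted parameter counts, assuming sparsity here means the share of parameters that are not active per token (the post does not define the term, so that reading is an assumption):

```python
# Figures quoted in the post, in billions of parameters.
total_params_b = 550
active_params_b = 55

# Assumed definition: fraction of parameters NOT active for a given token.
sparsity = 1 - active_params_b / total_params_b

print(f"{sparsity:.0%}")
```

Under that definition, 55B active out of 550B total gives 1 - 0.1 = 0.9, which matches the 90% stated in the post.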
Santh
Santh@SanthProject·
microslop's back at it again
Microsoft Security Response Center@msftsecresponse

Over the past several days, we have been listening to the conversation around coordinated disclosure and the relationship between security researchers and vendors. We recognize that this relationship is both critical and, at times, fragile. We deeply value the security community, and will continue to take your feedback seriously. To be clear about our approach to legal matters, we have no intention to pursue action against individuals conducting or publishing their security research. When an individual breaks the law and engages in malicious activity causing real harm to our customers, we will work with law enforcement as appropriate. We recognize the work that goes into researching and submitting a vulnerability. We are committed to approaching every interaction with transparency, clear communication, and professionalism. We continue to believe strongly in Coordinated Vulnerability Disclosure as the foundation for protecting customers and improving our products. Each year we process a high volume of vulnerability reports. That volume continues to grow and will continue with the rise of AI-enabled research. We acknowledge that some interactions have fallen short and are working to learn from them. Many of us have experience on both sides of this work, as researchers reporting vulnerabilities and as responders triaging and assessing them. That perspective informs how we approach this feedback and the importance we place on getting it right, particularly as the volume and complexity of research continues to grow. The security community plays a vital role in helping us protect customers. We are committed to maintaining a constructive and respectful relationship and growing together. We know that, given the nature of this work, there will at times be misunderstandings. We remain committed to engaging in good faith and to providing a respectful and professional experience for all researchers, regardless of past interactions.

English
0
0
1
137
Microsoft Security Response Center
English
135
56
265
117.1K
Santh
Santh@SanthProject·
@januarycomputer Well, the Kimi 200 Vivace plan gives a shit ton of usage, and it's the primary model I use. And opencode go just doesn't. Even if I were to buy 20 plans, which would be a hassle
English
0
0
1
95
allegedly!
allegedly!@januarycomputer·
@SanthProject genuine question - why pay for high tier kimi plans when opencode go exists?
English
1
0
1
109
Santh
Santh@SanthProject·
If attention takes every job, the only job left is the one that provides attention to those replaced by attention.
English
0
1
1
33
Santh
Santh@SanthProject·
Just tried Hermes and holy shit, this is actually far better. Claude actually works for more than 5 minutes at a time.
English
17
5
129
29.9K
Santh retweeted
Teknium 🪽
Teknium 🪽@Teknium·
Just want to make this clear: We didn't make Hermes Agent to be a "starts with nothing, you work it all out" agent. This is not the minimalist, start-from-nothing agent. We want Hermes to work out of the box for most people, so you aren't spending weeks just getting the agent to work or to have the capabilities you need. This means that yes, there are more built-in things than in something like nanoclaw or pi, which start with nothing, and you just have to figure it out. That is an intentional design decision. From the modest baseline, which has capabilities that are likely broader than you need but not egregious, you can take it from there if you want to tinker with it. Run `hermes skills config` or `hermes tools` to disable whatever you want. We even have a way to upload your whole "Agent" as a GitHub repo, so you can install Hermes fresh with your exact setup again later or share it. We have a massive interface for extensions so you can tinker with it to infinity. But if you don't want to become an agent engineer - with Hermes, you don't have to.
English
256
236
3.8K
256.6K
Santh retweeted
AgenticRebirth
AgenticRebirth@AgenticRebirth·
Last week I had to decide between Openclaw and Hermes. After installing, configuring and briefly test driving both, the choice wasn't difficult. Hermes was fully capable within 5 minutes of being installed. Browser automation, OS control, X searching. It just works out of the box. Openclaw could only do the same after a fairly tedious onboarding and setup, and even then not as well. Hermes configuration is easy. Two CLI commands and you can switch off anything you don't want. It takes 5 minutes, I don't understand all the fuss. Openclaw configuration... don't even get me started. Incredibly opaque and unpleasant. Obviously, I stuck with Hermes.
Teknium 🪽@Teknium

Just want to make this clear: We didn't make Hermes Agent to be a "starts with nothing, you work it all out" agent. This is not the minimalist, start-from-nothing agent. We want Hermes to work out of the box for most people, so you aren't spending weeks just getting the agent to work or to have the capabilities you need. This means that yes, there are more built-in things than in something like nanoclaw or pi, which start with nothing, and you just have to figure it out. That is an intentional design decision. From the modest baseline, which has capabilities that are likely broader than you need but not egregious, you can take it from there if you want to tinker with it. Run `hermes skills config` or `hermes tools` to disable whatever you want. We even have a way to upload your whole "Agent" as a GitHub repo, so you can install Hermes fresh with your exact setup again later or share it. We have a massive interface for extensions so you can tinker with it to infinity. But if you don't want to become an agent engineer - with Hermes, you don't have to.

English
17
3
114
10.5K
Legato
Legato@levylegato·
@SanthProject
1. Ask it to plug into your to-do app (I use Todoist).
2. Create 3 sections: to-do, to-review, done.
3. Fill the list with everything on your mind and go out for the day.
English
2
0
4
465
Fentmaxxer69
Fentmaxxer69@mrczopekanalnie·
@0xSero @xofreyr Honestly it was a mistake but yeah when dealing with midwits they will see it as a good thing
English
2
0
2
485
Santh
Santh@SanthProject·
@guneysol I stated that Bun was migrated to Rust and it said it needs to be careful, and then yapped to me about lying. It starts every fact outside its knowledge cutoff with “I need to be careful”.
English
0
0
2
1.4K
Guney
Guney@guneysol·
officially back to claude code
- fast mode is actually fast and cheap
- opus 4.8 judges me and doesn’t say yeah, you’re right
- 100x better when something is visual (e.g. frontend)
- easier to talk to, more understandable, and doesn’t spam bullet lists
English
41
8
536
38.1K