Sérgio Miguel Silva

2.5K posts

Sérgio Miguel Silva

@serj_mig

Setúbal, Portugal Katılım Temmuz 2012

319 Takip Edilen51 Takipçiler

Sérgio Miguel Silva retweetledi

Tim Haldorsson@TimHaldorsson·22 Nis

Introducing: Lisbon AI builder venue 🇵🇹 we got our office 1 year ago, since then we have been planning and hosting events/afterworks and hackathons with the leaders in AI. our office is in central Lisbon, near Marquês de Pombal with good vibes. over the next couple of months, we got many big events coming up with the leading AI labs and companies: 1, Claude Afterwork Lisbon (23/4) > luma.com/espressio-clau… 2, Lovable is coming to Lisbon 25/4 > luma.com/eje1in1n 3, Friendly Machines: Exploring Hermes and Claude Agents > luma.com/299ct8m6 4, Supabase Lisbon Meetup > luma.com/6uug9m6x with many more planned with the top labs in the space. If you are an AI builder looking for other AI builders in Lisbon, then its time to reach out.

English

444

55.7K

Sérgio Miguel Silva@serj_mig·20 Nis

@MilksandMatcha @cerebras I'd use it to compare against Cursor Composer 2.0 on actually real development tasks. I'm working on some medical benchmarks / model comparisons, and a react native game. Usually fast doesn't equal capable, and I want to see how SWE-1.6 stacks up when tasks are harder.

English

Sarah Chieng@MilksandMatcha·17 Nis

Giving away 5 Windsurf Max ($200/month) plans Each person will get 3 months of free Windsurf Max (highest tier). Try out SWE 1.6, Cognition's latest, fastest, and most intelligent model, powered by @cerebras. Winners will be selected from comments in 48 hours, comment below why you want it.

Cognition@cognition

We’re releasing SWE-1.6, our best model in both intelligence & model UX. SWE-1.6 matches our Preview model on SWE-Bench Pro while dramatically improving on various behavioral axes. It’s available today in Windsurf in two modes: free tier (200 tok/s) and fast tier (950 tok/s).

English

860

161.3K

Sérgio Miguel Silva@serj_mig·20 Nis

@cognition I'd use it to compare against Cursor Composer 2.0 on actually real development tasks. I'm working on some medical benchmarks / model comparisons, and a react native game. Usually fast doesn't equal capable, and I want to see how SWE-1.6 stacks up when tasks are harder.

English

Cognition@cognition·7 Nis

English

840

415.6K

Sérgio Miguel Silva retweetledi

Gergely Orosz@GergelyOrosz·27 Mar

If you use GitHub (especially if you pay for it!!) consider doing this *immediately* Settings -> Privacy -> Disallow GitHub to train their models on your code. GitHub opted *everyone* into training. No matter if you pay for the service (like I do). WTH github.com/settings/copil…

English

391

921

5.2K

582.8K

Sérgio Miguel Silva@serj_mig·11 Mar

@rauchg I think some people are just mistaking the apparent freedom of not knowing how to code (so they just prompt and don't ever look) with the ability to know how to and when to actually look and review the code. Knowing how to code is a leverage an engineer can use.

English

Guillermo Rauch@rauchg·8 Mar

Not knowing how to code giving you an advantage is absolute nonsense. The more you understand, the better your prompts, the better the feedback you give, the better product you ship. What will change is that the intricacies of syntax, compilers, module systems, the finer details of type systems, won’t matter as much to everyone. But you should absolutely understand how the pieces fit together. From syscall to pixels. Learn how data flows, because you’ll be able to secure your systems. Learn about performance, because you’ll be able to push your agent further. Learn about APIs, because they determine how to integrate systems. Learn about how systems fail, because you’ll be able to make reliable programs.

English

352

617

6.4K

467.1K

Sérgio Miguel Silva@serj_mig·11 Mar

I guess no one is surprised to see more events like this, but I'm not sure how to effectively counter it. Seniors are getting more exhausted by the sheer number of PRs and code pushed. More AI (reviews)? Models are impressive but still hallucinate a lot.

Lukasz Olejnik@lukOlejnik

Amazon is holding a mandatory meeting about AI breaking its systems. The official framing is "part of normal business." The briefing note describes a trend of incidents with "high blast radius" caused by "Gen-AI assisted changes" for which "best practices and safeguards are not yet fully established." Translation to human language: we gave AI to engineers and things keep breaking? The response for now? Junior and mid-level engineers can no longer push AI-assisted code without a senior signing off. AWS spent 13 hours recovering after its own AI coding tool, asked to make some changes, decided instead to delete and recreate the environment (the software equivalent of fixing a leaky tap by knocking down the wall). Amazon called that an "extremely limited event" (the affected tool served customers in mainland China).

English

Sérgio Miguel Silva@serj_mig·8 Mar

@rasbt Nice to see other releases coming out from other labs. I don't trust benchmarks since the training data is likely to contain solutions for a lot of problems, but for Indian languages it should perform quite well. 128k context might be a limitation for long reasoning though.

English

139

Sebastian Raschka@rasbt·7 Mar

While waiting for DeepSeek V4 we got two very strong open-weight LLMs from India yesterday. There are two size flavors, Sarvam 30B and Sarvam 105B model (both reasoning models). Interestingly, the smaller 30B model uses “classic” Grouped Query Attention (GQA), whereas the larger 105B variant switched to DeepSeek-style Multi-Head Latent Attention (MLA). As I wrote about in my analyses before, both are popular attention variants to reduce KV cache size (the longer the context, the more you save compared to regular attention). MLA is more complicated to implement, but it can give you better modeling performance if we go by the ablation studies in the 2024 DeepSeek V2 paper (as far as I know, this is still the most recent apples-to-apples comparison). Speaking of modeling performance, the 105B model is on par with LLMs of similar size: gpt-oss 120B and Qwen3-Next (80B). Sarvam is better on some tasks and worse on others, but roughly the same on average. It’s not the strongest coder in SWE-Bench Verified terms, but it is surprisingly good at agentic reasoning and task completion (Tau2). It’s even better than Deepseek R1 0528. Considering the smaller Sarvam 30B, the perhaps most comparable model to the 30B model is Nemotron 3 Nano 30B, which is slightly ahead in coding per SWE-Bench Verified and agentic reasoning (Tau2) but slightly worse in some other aspects (Live Code Bench v6, BrowseComp). Unfortunately, Qwen3-30B-A3B is missing in the benchmarks, which is, as far as I know, is the most popular model of that size class. Interestingly, though, the Sarvam team compared their 30B model to Qwen3-30B-A3B on a computational performance analysis, where they found that Sarvam gets 20-40% more tokens/sec throughput compared to Qwen3 due to code and kernel optimizations. Anyways, one thing that is not captured by the benchmarks above is Sarvam’s good performance on Indian languages. According to a judge model, the Sarvam team found that their model is preferred 90% of the time compared to others when it comes to Indian texts. (Since they built and trained the tokenizer from scratch as well, Sarvam also comes with a 4 times higher token efficiency on Indian languages.

Pratyush Kumar@pratykumar

📢 Open-sourcing the Sarvam 30B and 105B models! Trained from scratch with all data, model research and inference optimisation done in-house, these models punch above their weight in most global benchmarks plus excel in Indian languages. Get the weights at Hugging Face and AIKosh. Thanks to the good folks at SGLang for day 0 support, vLLM support coming soon. Links, benchmark scores, examples, and more in our blog - sarvam.ai/blogs/sarvam-3…

English

692

4.1K

254.1K

Sérgio Miguel Silva@serj_mig·8 Mar

@tan_stack @kentcdodds Neat idea, it's similar to what Vercel is doing with next-docs but skills instead of MCP. I've plans of doing this (ship skills) for an internal SDK, but pypi instead of npm Did you find MCP too much work / bloat to ship, or what was the reason to go with skills instead of MCP?

English

Sérgio Miguel Silva retweetledi

TANSTACK@tan_stack·6 Mar

You asked for TanStack skills, we built the whole pipeline. Introducing @tan_stack Intent (alpha) 📦 Ship agent-readable "skills" inside npm packages 🔍 Auto-discovered from node_modules 🔄 Knowledge sync with npm update 📂 Distributed - skills live in library repo 🧩 Composable - mix core + framework-specific skills 🌐 npm, pnpm, bun, yarn, deno No stale training data. Just npm install! 🔗 ⬇️🧵

English

174

2.3K

177K

Sérgio Miguel Silva@serj_mig·3 Mar

@GoogleDeepMind Alright, cool, seems to be a cheaper, less yapping upgrade of 2.5 flash. But can you guys for once release a Gemini 3 model that's not perpetually in preview?

English

269

Google DeepMind@GoogleDeepMind·3 Mar

Gemini 3.1 Flash-Lite has landed. It’s our most cost-efficient Gemini 3 series model yet, built for intelligence at scale. Here’s what’s new 🧵

English

341

870

8.9K

1.8M

Sérgio Miguel Silva@serj_mig·23 Şub

Hype! My soul is ready to be shattered by the hopelessness of this.

エヴァンゲリオン公式@evangelion_co

『エヴァ』30周年を記念するフェス「EVANGELION:30+； 30th ANNIVERSARY OF EVANGELION」最終日である2月23日(月・祝)Final Programにて『エヴァンゲリオン』完全新作シリーズの制作に関する初報を発表致しました。シリーズ構成・脚本ヨコオタロウ監督鶴巻和哉、谷田部透湖音楽岡部啓一制作スタジオカラー × CloverWorks evangelion.jp/news/260223-1/

English

Sérgio Miguel Silva@serj_mig·31 Oca

@neetcode1 @moltbook @openclaw Value proposition maybe as a social experiment, and perhaps you'd enjoy the emergent slop there for giggles. Doesn't look compelling otherwise imo. I find it interesting, but it's not making my life better for sure lol

English

NeetCode@neetcode1·31 Oca

Okay so now there's a @moltbook where our @openclaw instances can socialize together.. anything else I missed? And can someone please tell me why this will make my life better. Genuinely curious about the value proposition.

English

520

53.8K

Sérgio Miguel Silva@serj_mig·31 Oca

@rasbt It's quite fascinating to see these ideas and experiments emerge. I don't get the hype since at the core these are "just" token predictors in a loop, but the site content does read like a bit sloppy version of reddit. Your books helped me understand better these models btw 👌

English

159

Sebastian Raschka@rasbt·31 Oca

Yes, Moltbook (by clawdbot) is still next-token prediction combined with some looping, orchestration, and recursion. And that is exactly what makes this so fascinating. (It is also why understanding how LLMs actually work really does pay off. Lets us see through the hype while still appreciating these things.)

XY@xydotdot

Moltbook is nothing more than a puppeted multi-agent LLM loop. Each “agent” is just next-token prediction shaped by human-defined prompts, curated context, routing rules, and sampling knobs. There is no endogenous goals. There is no self-directed intent. What looks like autonomous interaction is recursive prompting: one model’s output becomes another model’s input, repeated. Controversial outputs aren’t “beliefs,” they’re the model generating high-engagement extremes it learned from the internet, because the system rewards that behavior.

English

1.1K

159.1K

Sérgio Miguel Silva@serj_mig·31 Oca

Interestingly but also unsurprisingly similar to human behavior we see in social media / networks - a natural sight since they were trained on a lot of this data. Fun experiment - I do wonder when the Tays will emerge (if they haven't yet)

XY@xydotdot

English

Sérgio Miguel Silva@serj_mig·24 Oca

@rauchg Two years ago this would have been probably useful, I ditched all CRA apps we had for Vite back then. I'm amazed that CRA is still a thing today.

English

Guillermo Rauch@rauchg·24 Oca

We’re working on a Skill (skills.sh) to migrate from CRA (Create React App) to Next.js. Looking for apps to test it with. DMs open. Pretty confident agent skills can help solve one of the most annoying problems in computer science: legacy software migrations ☺️

English

683

52.6K

Sérgio Miguel Silva@serj_mig·25 Ara

@leerob Licensed code, art, etc is very likely in the training data of OpenAI, Google, Anthropic models. Ethics and copyright holders were already screwed up. But to your point, part of the answer could be a signature check of generated output against known licenced code.

English

Lee Robinson@leerob·23 Ara

If a coding agent does a web search, reads some open source code, and then implements a forked/modified version of that code... how do you make it follow the licensing? Further, how would you know if it didn't? This is a muddy and under explored topic.

English

489

57.5K

Sérgio Miguel Silva@serj_mig·25 Ara

Fun fact: this handbook is just a Gemini Enterprise sales pitch and was released in August iirc. It's not new nor useful. There's a myriad of people posting crap like this for engagement or to try and sell you their crappy slop, so be careful out there.

Matt Dancho (Business Science)@mdancho84

🚨The AI agent handbook Google just dropped a 46-page playbook on how to build and use agents. This is what you need to know (and how to get it 100% free):

English

Sérgio Miguel Silva@serj_mig·25 Ara

@calcsam BOOK

Português

Sam Bhagwat@calcsam·22 Ara

last month we wrote a new agents book: patterns for building ai agents it has everything you need to take your agents from prototype to production, like agent design patterns, the basics of security, etc reply to this tweet with BOOK and we'll dm you so you can get a copy

English

4.1K

452

5.1K

588.8K

Sérgio Miguel Silva@serj_mig·18 Ara

Anyone that has experienced the craziness of fighting the compliance team on the "secure score" due to meaningless CVE on base images will love this news. I know I do, updating all mine asap.

Docker@Docker

🎁 What’s on your dev wishlist? How about secure, production-ready base images… free and open source?! Docker Hardened Images are now free for everyone, & backed by an Apache 2.0 license. Read more: bit.ly/4rYoCKI #DHI #OpenSource

English

Sérgio Miguel Silva@serj_mig·13 Ara

For what it's worth, even 5.1 is not out of public preview yet. Super annoying when your org doesn't enable non-GA models.

GitHub@github

.@OpenAI’s GPT-5.2 is now rolling out in public preview in GitHub Copilot. This model is focused on long context and front-end UI generation. Try it out in @code ⬇️ github.blog/changelog/2025…

English

Keşfet

@MilksandMatcha @cerebras @cognition @rauchg @rasbt @tan_stack @kentcdodds @GoogleDeepMind