Sérgio Miguel Silva
@serj_mig
2.5K posts
Setúbal, Portugal · Joined July 2012
319 Following · 51 Followers
Sérgio Miguel Silva retweeted
Tim Haldorsson @TimHaldorsson
Introducing: Lisbon AI builder venue 🇵🇹 We got our office a year ago, and since then we have been planning and hosting events, afterworks, and hackathons with the leaders in AI. Our office is in central Lisbon, near Marquês de Pombal, with good vibes. Over the next couple of months we have many big events coming up with the leading AI labs and companies:
1. Claude Afterwork Lisbon (23/4) > luma.com/espressio-clau…
2. Lovable is coming to Lisbon (25/4) > luma.com/eje1in1n
3. Friendly Machines: Exploring Hermes and Claude Agents > luma.com/299ct8m6
4. Supabase Lisbon Meetup > luma.com/6uug9m6x
Many more are planned with the top labs in the space. If you are an AI builder looking for other AI builders in Lisbon, it's time to reach out.
79 replies · 29 reposts · 444 likes · 55.7K views
Sérgio Miguel Silva @serj_mig
@MilksandMatcha @cerebras I'd use it to compare against Cursor Composer 2.0 on real development tasks. I'm working on some medical benchmarks / model comparisons and a React Native game. Fast usually doesn't equal capable, and I want to see how SWE-1.6 stacks up when tasks get harder.
0 replies · 0 reposts · 0 likes · 51 views
Sarah Chieng @MilksandMatcha
Giving away 5 Windsurf Max ($200/month) plans. Each person will get 3 months of free Windsurf Max (highest tier). Try out SWE-1.6, Cognition's latest, fastest, and most intelligent model, powered by @cerebras. Winners will be selected from comments in 48 hours; comment below with why you want it.
Cognition@cognition

We’re releasing SWE-1.6, our best model in both intelligence & model UX. SWE-1.6 matches our Preview model on SWE-Bench Pro while dramatically improving on various behavioral axes. It’s available today in Windsurf in two modes: free tier (200 tok/s) and fast tier (950 tok/s).

1K replies · 51 reposts · 860 likes · 161.3K views
Sérgio Miguel Silva @serj_mig
@cognition I'd use it to compare against Cursor Composer 2.0 on real development tasks. I'm working on some medical benchmarks / model comparisons and a React Native game. Fast usually doesn't equal capable, and I want to see how SWE-1.6 stacks up when tasks get harder.
0 replies · 0 reposts · 0 likes · 53 views
Cognition @cognition
We’re releasing SWE-1.6, our best model in both intelligence & model UX. SWE-1.6 matches our Preview model on SWE-Bench Pro while dramatically improving on various behavioral axes. It’s available today in Windsurf in two modes: free tier (200 tok/s) and fast tier (950 tok/s).
Cognition tweet media
62 replies · 68 reposts · 840 likes · 415.6K views
Sérgio Miguel Silva retweeted
Gergely Orosz @GergelyOrosz
If you use GitHub (especially if you pay for it!!) consider doing this *immediately*: Settings -> Privacy -> Disallow GitHub to train their models on your code. GitHub opted *everyone* into training, no matter if you pay for the service (like I do). WTH github.com/settings/copil…
Gergely Orosz tweet media
391 replies · 921 reposts · 5.2K likes · 582.8K views
Sérgio Miguel Silva @serj_mig
@rauchg I think some people are just mistaking the apparent freedom of not knowing how to code (so they just prompt and never look) for the ability to know how and when to actually read and review the code. Knowing how to code is leverage an engineer can use.
0 replies · 0 reposts · 0 likes · 14 views
Guillermo Rauch @rauchg
Not knowing how to code giving you an advantage is absolute nonsense. The more you understand, the better your prompts, the better the feedback you give, the better the product you ship.

What will change is that the intricacies of syntax, compilers, module systems, and the finer details of type systems won't matter as much to everyone. But you should absolutely understand how the pieces fit together, from syscall to pixels.

Learn how data flows, because you'll be able to secure your systems. Learn about performance, because you'll be able to push your agent further. Learn about APIs, because they determine how to integrate systems. Learn about how systems fail, because you'll be able to make reliable programs.
352 replies · 617 reposts · 6.4K likes · 467.1K views
Sérgio Miguel Silva @serj_mig
@rasbt Nice to see releases coming out from other labs. I don't trust benchmarks, since the training data likely contains solutions to a lot of the problems, but it should perform quite well on Indian languages. 128k context might be a limitation for long reasoning, though.
0 replies · 0 reposts · 2 likes · 139 views
Sebastian Raschka @rasbt
While waiting for DeepSeek V4, we got two very strong open-weight LLMs from India yesterday. They come in two size flavors, Sarvam 30B and Sarvam 105B (both reasoning models).

Interestingly, the smaller 30B model uses "classic" Grouped Query Attention (GQA), whereas the larger 105B variant switched to DeepSeek-style Multi-Head Latent Attention (MLA). As I wrote in my analyses before, both are popular attention variants for reducing KV cache size (the longer the context, the more you save compared to regular attention). MLA is more complicated to implement, but it can give you better modeling performance if we go by the ablation studies in the 2024 DeepSeek V2 paper (as far as I know, still the most recent apples-to-apples comparison).

Speaking of modeling performance, the 105B model is on par with LLMs of similar size: gpt-oss 120B and Qwen3-Next (80B). Sarvam is better on some tasks and worse on others, but roughly the same on average. It's not the strongest coder in SWE-Bench Verified terms, but it is surprisingly good at agentic reasoning and task completion (Tau2); it's even better than DeepSeek R1 0528 there.

For the smaller Sarvam 30B, perhaps the most comparable model is Nemotron 3 Nano 30B, which is slightly ahead in coding per SWE-Bench Verified and agentic reasoning (Tau2) but slightly worse in some other aspects (LiveCodeBench v6, BrowseComp). Unfortunately, Qwen3-30B-A3B is missing from the benchmarks, which, as far as I know, is the most popular model of that size class. Interestingly, though, the Sarvam team compared their 30B model to Qwen3-30B-A3B in a computational performance analysis, where they found that Sarvam gets 20-40% more tokens/sec throughput than Qwen3 due to code and kernel optimizations.

Anyway, one thing not captured by the benchmarks above is Sarvam's good performance on Indian languages. Using a judge model, the Sarvam team found that their model is preferred 90% of the time over others on Indian texts. (Since they built and trained the tokenizer from scratch as well, Sarvam also comes with a 4x higher token efficiency on Indian languages.)
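The KV-cache savings mentioned above can be made concrete with back-of-the-envelope arithmetic. A rough sketch, where every dimension is an illustrative assumption and not any real model's config (not Sarvam's, not DeepSeek's):

```python
# Rough per-token KV-cache cost (bytes, fp16) for three attention variants.
# All dimensions below are made-up illustrative values.

BYTES = 2  # fp16

def mha_kv_bytes(n_layers: int, n_heads: int, d_head: int) -> int:
    # Full multi-head attention caches K and V for every head.
    return 2 * n_layers * n_heads * d_head * BYTES

def gqa_kv_bytes(n_layers: int, n_kv_heads: int, d_head: int) -> int:
    # Grouped Query Attention shares K/V across groups of query heads,
    # so only n_kv_heads K/V pairs are cached per layer.
    return 2 * n_layers * n_kv_heads * d_head * BYTES

def mla_kv_bytes(n_layers: int, d_latent: int) -> int:
    # Multi-Head Latent Attention caches one compressed latent per token,
    # from which K and V are reconstructed at attention time.
    return n_layers * d_latent * BYTES

n_layers, n_heads, d_head = 48, 32, 128
mha = mha_kv_bytes(n_layers, n_heads, d_head)
gqa = gqa_kv_bytes(n_layers, 8, d_head)   # 8 KV heads, i.e. 4x grouping
mla = mla_kv_bytes(n_layers, 512)         # 512-dim latent

for name, b in [("MHA", mha), ("GQA", gqa), ("MLA", mla)]:
    print(f"{name}: {b / 1024:.1f} KiB per token")
```

The savings are per token, so they compound linearly with context length, which is why both variants show up in long-context models.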
Sebastian Raschka tweet media
Pratyush Kumar@pratykumar

📢 Open-sourcing the Sarvam 30B and 105B models! Trained from scratch with all data, model research and inference optimisation done in-house, these models punch above their weight in most global benchmarks plus excel in Indian languages. Get the weights at Hugging Face and AIKosh. Thanks to the good folks at SGLang for day 0 support, vLLM support coming soon. Links, benchmark scores, examples, and more in our blog - sarvam.ai/blogs/sarvam-3…

45 replies · 692 reposts · 4.1K likes · 254.1K views
Sérgio Miguel Silva @serj_mig
@tan_stack @kentcdodds Neat idea; it's similar to what Vercel is doing with next-docs, but with skills instead of MCP. I have plans to do this (ship skills) for an internal SDK, but via PyPI instead of npm. Did you find MCP too much work / bloat to ship, or what was the reason to go with skills instead of MCP?
0 replies · 0 reposts · 1 like · 67 views
Sérgio Miguel Silva retweeted
TANSTACK @tan_stack
You asked for TanStack skills, we built the whole pipeline. Introducing @tan_stack Intent (alpha)
📦 Ship agent-readable "skills" inside npm packages
🔍 Auto-discovered from node_modules
🔄 Knowledge sync with npm update
📂 Distributed - skills live in the library repo
🧩 Composable - mix core + framework-specific skills
🌐 npm, pnpm, bun, yarn, deno
No stale training data. Just npm install! 🔗 ⬇️🧵
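The "auto-discovered from node_modules" step could work roughly like this sketch. The `skills/skill.json` path and manifest shape are hypothetical illustrations, not TanStack Intent's actual convention:

```python
import json
from pathlib import Path

def discover_skills(node_modules: Path) -> list[dict]:
    # Walk installed packages and collect any that ship a skills manifest.
    # The "<pkg>/skills/skill.json" layout is a made-up example; scoped
    # packages (@org/pkg) would need one more directory level.
    found = []
    for manifest in node_modules.glob("*/skills/skill.json"):
        data = json.loads(manifest.read_text())
        data["package"] = manifest.parent.parent.name
        found.append(data)
    return found
```

Because the skills live inside each library's published package, a plain `npm update` refreshes them, which is the "knowledge sync" point in the thread.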
56 replies · 174 reposts · 2.3K likes · 177K views
Sérgio Miguel Silva @serj_mig
@GoogleDeepMind Alright, cool; it seems to be a cheaper, less yappy upgrade of 2.5 Flash. But can you guys for once release a Gemini 3 model that's not perpetually in preview?
0 replies · 0 reposts · 0 likes · 269 views
Google DeepMind @GoogleDeepMind
Gemini 3.1 Flash-Lite has landed. It’s our most cost-efficient Gemini 3 series model yet, built for intelligence at scale. Here’s what’s new 🧵
341 replies · 870 reposts · 8.9K likes · 1.8M views
Sérgio Miguel Silva @serj_mig
@neetcode1 @moltbook @openclaw The value proposition is maybe as a social experiment, and perhaps you'd enjoy the emergent slop there for giggles. It doesn't look compelling otherwise, imo. I find it interesting, but it's not making my life better, for sure lol
0 replies · 0 reposts · 0 likes · 22 views
NeetCode @neetcode1
Okay so now there's a @moltbook where our @openclaw instances can socialize together.. anything else I missed? And can someone please tell me why this will make my life better. Genuinely curious about the value proposition.
53 replies · 11 reposts · 520 likes · 53.8K views
Sérgio Miguel Silva @serj_mig
@rasbt It's quite fascinating to see these ideas and experiments emerge. I don't get the hype, since at the core these are "just" token predictors in a loop, but the site content does read like a slightly sloppy version of Reddit. Your books helped me better understand these models, btw 👌
0 replies · 0 reposts · 1 like · 159 views
Sérgio Miguel Silva @serj_mig
@rauchg Two years ago this probably would have been useful; I ditched all the CRA apps we had for Vite back then. I'm amazed that CRA is still a thing today.
0 replies · 0 reposts · 0 likes · 34 views
Guillermo Rauch @rauchg
We’re working on a Skill (skills.sh) to migrate from CRA (Create React App) to Next.js. Looking for apps to test it with. DMs open. Pretty confident agent skills can help solve one of the most annoying problems in computer science: legacy software migrations ☺️
55 replies · 16 reposts · 683 likes · 52.6K views
Sérgio Miguel Silva @serj_mig
@leerob Licensed code, art, etc. is very likely in the training data of OpenAI, Google, and Anthropic models; ethics and copyright holders were already screwed. But to your point, part of the answer could be a signature check of generated output against known licensed code.
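A minimal sketch of that signature-check idea, assuming a pre-built index of hashed token n-grams from known licensed code. Every name here is hypothetical, not any shipping tool's API, and real systems use far more robust fingerprinting (e.g. winnowing) than this:

```python
import hashlib
import re

def normalize(code: str) -> list[str]:
    # Strip '#' comments and collapse whitespace so trivial
    # edits don't defeat the match.
    code = re.sub(r"#.*", "", code)
    return code.split()

def fingerprints(code: str, n: int = 5) -> set[str]:
    # Hash overlapping token n-grams into a crude signature set.
    toks = normalize(code)
    return {
        hashlib.sha256(" ".join(toks[i:i + n]).encode()).hexdigest()
        for i in range(max(len(toks) - n + 1, 1))
    }

def license_overlap(generated: str, known_index: set[str], n: int = 5) -> float:
    # Fraction of the generated code's n-grams found in the index of
    # known licensed code; a high ratio warrants a license review.
    fps = fingerprints(generated, n)
    return len(fps & known_index) / len(fps) if fps else 0.0
```

Identical code scores 1.0 against its own index and unrelated code scores near 0.0; anything in between is the muddy zone the thread is about.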
0 replies · 0 reposts · 0 likes · 17 views
Lee Robinson @leerob
If a coding agent does a web search, reads some open source code, and then implements a forked/modified version of that code... how do you make it follow the licensing? Further, how would you know if it didn't? This is a muddy and underexplored topic.
58 replies · 12 reposts · 489 likes · 57.5K views
Sérgio Miguel Silva @serj_mig
Fun fact: this handbook is just a Gemini Enterprise sales pitch, and it was released in August, iirc. It's neither new nor useful. There's a myriad of people posting crap like this for engagement or to try to sell you their crappy slop, so be careful out there.
Matt Dancho (Business Science)@mdancho84

🚨The AI agent handbook Google just dropped a 46-page playbook on how to build and use agents. This is what you need to know (and how to get it 100% free):

0 replies · 0 reposts · 0 likes · 18 views
Sam Bhagwat @calcsam
Last month we wrote a new agents book: Patterns for Building AI Agents. It has everything you need to take your agents from prototype to production, like agent design patterns, the basics of security, etc. Reply to this tweet with BOOK and we'll DM you so you can get a copy.
Sam Bhagwat tweet media
4.1K replies · 452 reposts · 5.1K likes · 588.8K views
Sérgio Miguel Silva @serj_mig
For what it's worth, even 5.1 is not out of public preview yet. It's super annoying when your org doesn't enable non-GA models.
GitHub@github

.@OpenAI’s GPT-5.2 is now rolling out in public preview in GitHub Copilot. This model is focused on long context and front-end UI generation. Try it out in @code ⬇️ github.blog/changelog/2025…

0 replies · 0 reposts · 0 likes · 4 views