Rafael David Tinoco

424 posts

@rafaeldtinoco

eBPF, kernel, security | Jibril Runtime Creator | Former Tracee Maintainer, Ubuntu Core Developer and Mainframer

Curitiba · Joined June 2019

532 Following · 676 Followers

Rafael David Tinoco@rafaeldtinoco·6 Apr

Almost there. Nice way to start the week.

Rafael David Tinoco retweeted

ngrok@ngrokHQ·25 Mar

Quantization can make an LLM 4x smaller and 2x faster, with barely any quality loss. But what *is* it? @samwhoo crafted a beautiful interactive essay explaining it from first principles, aimed at coders, not mathematicians. ngrok.com/blog/quantizat…

Rafael David Tinoco@rafaeldtinoco·25 Mar

Stressing poll(), timerfd, scheduling vruntime, bad work queue cleanup logic, maximizing a tiny race window, joining with a side channel, stack pivoting: this work is impressive (unfortunately it's red, but amazing ;)

Rafael David Tinoco@rafaeldtinoco·25 Mar

The only shame is that it requires KPTI to be disabled for the prefetch side-channel attack (which makes all the other wonderful non-deterministic conditions even more difficult).

Rafael David Tinoco@rafaeldtinoco·25 Mar

Sorry #security lovers, despite the seriousness and effects of #litellm and #trivy, I cannot stop thinking about how much more work and cognitive demand an async use-after-free of a deferred worker pointer of a protocol handler requires. Have fun with the #exploit.

V4bel@v4bel

I discovered a race-based vulnerability class in the Linux kernel: "Out-of-Cancel" A structural flaw where cancel_work_sync() is used as a barrier for object lifetime management, causing UAF across multiple networking subsystems. I wrote an exploit for CVE-2026-23239 (espintcp). It interleaves Delayed ACK timers, NET_RX softirqs, timerfd hardirqs, workqueue scheduling, and CFS scheduler manipulation to hit a ~Xµs race window. Blog: v4bel.github.io/linux/2026/03/… This is the race scenario diagram 😁:

Rafael David Tinoco retweeted

LlamaIndex 🦙@llama_index·19 Mar

We've spent years building LlamaParse into the most accurate document parser for production AI. Along the way, we learned a lot about what fast, lightweight parsing actually looks like under the hood.

Today, we're open-sourcing a light-weight core of that tech as LiteParse 🦙

It's a CLI + TS-native library for layout-aware text parsing from PDFs, Office docs, and images. Local, zero Python dependencies, and built specifically for agents and LLM pipelines.

Think of it as our way of giving the community a solid starting point for document parsing:

npm i -g @llamaindex/liteparse
lit parse anything.pdf

- preserves spatial layout (columns, tables, alignment)
- built-in local OCR, or bring your own server
- screenshots for multimodal LLMs
- handles PDFs, office docs, images

Blog: llamaindex.ai/blog/liteparse…
Repo: github.com/run-llama/lite…

Rafael David Tinoco retweeted

Piotr Mińkowski@piotr_minkowski·4 Mar

Are you looking for the best model to run locally on your hardware? Instead of pulling different models and trying, you can just get recommendations for a given category using LLM Checker (github.com/Pavelevich/llm…). 👇👇👇

Rafael David Tinoco retweeted

Vali Neagu@AmbsdOP·2 Mar

YES! Someone reverse-engineered Apple's Neural Engine and trained a neural network on it. Apple never allowed this. ANE is inference-only. No public API, no docs. They cracked it open anyway.

Why it matters:
• M4 ANE = 6.6 TFLOPS/W vs 0.08 for an A100 (80× more efficient)
• "38 TOPS" is a lie - real throughput is 19 TFLOPS FP16
• Your Mac mini has this chip sitting mostly idle

Translation: local AI inference that's faster AND uses almost no power. Still early research but the door is now open.

→ github.com/maderix/ANE

#AI #MachineLearning #AppleSilicon #LocalAI #OpenSource #ANE #CoreML #AppleSilicon #NPU #KCORES

Rafael David Tinoco retweeted

Bo Wang@BoWang87·27 Feb

DeepSeek just dropped a new paper, and it's not about a new model. It's about the infrastructure bottleneck that limits every agentic LLM at scale. And their fix nearly doubles throughput.

DualPath (arxiv.org/abs/2602.21548): the bottleneck in multi-turn agentic inference isn't compute anymore. It's storage bandwidth.

When agents run at scale, every new turn reloads the full KV-Cache from external storage. At long contexts + high concurrency, this I/O completely dominates. The GPU sits waiting on the NIC.

In standard disaggregated setups (prefill + decode engines), there's a brutal asymmetry:
- Prefill engine NIC: saturated loading KV-Cache
- Decode engine NIC: sitting completely idle

The bandwidth is right there. It's just not being used.

DualPath adds a second loading path:
Path 1 (existing): Storage → Prefill engine
Path 2 (new): Storage → Decode engine → Prefill engine via RDMA

The idle decode NIC now does real work. RDMA transfers over the compute network avoid congestion without touching latency-critical inference traffic. A global scheduler dynamically balances load across both paths in real time.

Results on 3 models with production agentic workloads:
📈 Offline throughput: up to 1.87×
📈 Online serving: 1.96× average, without violating SLO

As agents get more capable (longer context, more tool calls, persistent memory), KV-Cache I/O only gets worse. DeepSeek is solving this at the infrastructure layer while everyone else is focused on model benchmarks.

One of those "obvious in hindsight" ideas: the bandwidth was always there.

Rafael David Tinoco retweeted

Qwen@Alibaba_Qwen·25 Feb

The Qwen3.5 series maintains near-lossless accuracy under 4-bit weight and KV cache quantization.

In terms of long-context efficiency:
- Qwen3.5-27B supports 800K+ context length
- Qwen3.5-35B-A3B exceeds 1M context on consumer-grade GPUs with 32GB VRAM
- Qwen3.5-122B-A10B supports 1M+ context length on server-grade GPUs with 80GB VRAM

In addition, we have open-sourced the Qwen3.5-35B-A3B-Base model to better support research and innovation. We can't wait to see what the community builds next!

Qwen@Alibaba_Qwen

🚀 Introducing the Qwen 3.5 Medium Model Series
Qwen3.5-Flash · Qwen3.5-35B-A3B · Qwen3.5-122B-A10B · Qwen3.5-27B

✨ More intelligence, less compute.
• Qwen3.5-35B-A3B now surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B, a reminder that better architecture, data quality, and RL can move intelligence forward, not just bigger parameter counts.
• Qwen3.5-122B-A10B and 27B continue narrowing the gap between medium-sized and frontier models, especially in more complex agent scenarios.
• Qwen3.5-Flash is the hosted production version aligned with 35B-A3B, featuring:
  - 1M context length by default
  - Official built-in tools

🔗 Hugging Face: huggingface.co/collections/Qw…
🔗 ModelScope: modelscope.cn/collections/Qw…
🔗 Qwen3.5-Flash API: modelstudio.console.alibabacloud.com/ap-southeast-1…

Try in Qwen Chat 👇
Flash: chat.qwen.ai/?models=qwen3.…
27B: chat.qwen.ai/?models=qwen3.…
35B-A3B: chat.qwen.ai/?models=qwen3.…
122B-A10B: chat.qwen.ai/?models=qwen3.…

Would love to hear what you build with it.

Rafael David Tinoco retweeted

STH@ServeTheHome·16 Jan

The MikroTik CRS804 DDQ is a low-cost 4x 400GbE or 1.6Tbps class switch that is going to bring high-speed networking to new segments servethehome.com/mikrotik-crs80…

Rafael David Tinoco retweeted

exQUIZitely 🕹️@exQUIZitely·8 Jan

The most "claustrophobic" game ever? Descent (1995) is a first-person shooter developed by Parallax Software, notable for being the first FPS with fully true 3D graphics and six degrees of freedom movement. Players pilot the Pyro-GX spaceship through mineshafts on various planets, infected by a virus that has turned mining robots hostile. There you go, the whole story in one sentence! Movement is the game's hallmark: full six degrees of freedom allows free flight in any direction - forward/backward, left/right (slide/strafe), up/down, and 360° rotation - creating disorienting, stomach-churning zero-gravity combat. For someone like me, being claustrophobic, this was both tough to play yet highly fascinating. I feel Descent is an underrated game that got a bit lost in the shuffle of other great games around the mid 90s.

Rafael David Tinoco@rafaeldtinoco·7 Jan

@Grummz Why not make it scroll vertically? Coders would buy it ;).

Grummz@Grummz·7 Jan

Lenovo unveiled a bunch of concepts for their "Rollable" screens concept at CES today. This laptop has a screen that grows widescreen at the touch of a button. They have monitors and other configs too.

Rafael David Tinoco retweeted

cr0@Defensive-Security.com / EDRmetry / PurpleLabs@cr0nym·20 May

Having low-level real-time #Linux events visibility really changes the Linux hunting & detection game. I always liked to analyze network flows with Zeek + Splunk. Now, a similar approach is possible on system events thanks to Kunai Runtime Security, Jibril, or Tetragon. You can learn so much about the behavior of Linux internals, especially when EDRmetry powers your offensive mindset - an effective Linux EDR/SIEM Evaluation Testing Playbook, which allows for Detection Coverage/Incident Response testing by executing dedicated Linux offensive tests mapped to MITRE ATT&CK™ Framework. Detailed write-up is in progress. #redteam #blueteam #linux #purplelabs

Rafael David Tinoco retweeted

Proton@ProtonPrivacy·18 Aug

So here’s Perplexity’s playbook: 1️⃣ Collect everything 2️⃣ Train AI on it 3️⃣ Monetize through ads, pricing models, or partnerships. If it sounds familiar, it’s because that’s Google’s own playbook, but instead used by a less-than-3-year-old startup with VCs eager to get a return on their investments. 6/7

Rafael David Tinoco@rafaeldtinoco·19 Jul

@jorgemessiasagu Hello....

Jorge Messias@jorgemessiasagu·19 Jul

Statement from the Attorney General of the Union. As Attorney General of the Union of the Federative Republic of Brazil, I consider it a duty, at this moment, to express support for and solidarity with the justices of the Federal Supreme Court and the Prosecutor General of the Republic, who, together with their families, have been targeted by arbitrary visa revocations by a foreign nation for carrying out, within constitutional terms, their legitimate institutional duties. One cannot accept the distortion that seeks to attribute to these Brazilian authorities the violation of fundamental rights or the censorship of freedom of expression, when in truth their conduct stays within the strict limits of the legal order, in favor of preserving the integrity of our democracy and the attributes of the rule of law. The exercise of jurisdiction, in the context of a stable justice system aligned with the guarantees of citizenship, cannot under any circumstances be subjected to politically motivated harassment, much less with the involvement of a foreign state. I assure you that no improper expedient or sordid conspiratorial act will intimidate the judiciary of our country in its independent and dignified conduct. Jorge Messias, Attorney General of the Union

Rafael David Tinoco@rafaeldtinoco·18 Jul

@CocaCola_Br you have destroyed #CocaCola in Brazil. Coke tastes horrible. It used to be as good as or better than Mexican Coke made with real sugar. Now it tastes like dirty water. Blergh.