Rafael David Tinoco

424 posts


@rafaeldtinoco

eBPF, kernel, security | Jibril Runtime Creator | Former Tracee Maintainer, Ubuntu Core Developer and Mainframer

Curitiba · Joined June 2019
532 Following · 676 Followers
Rafael David Tinoco
Rafael David Tinoco@rafaeldtinoco·
Almost there. Nice way to start the week.
English
0
0
0
58
Rafael David Tinoco retweeted
ngrok
ngrok@ngrokHQ·
Quantization can make an LLM 4x smaller and 2x faster, with barely any quality loss. But what *is* it? @samwhoo crafted a beautiful interactive essay explaining it from first principles, aimed at coders, not mathematicians. ngrok.com/blog/quantizat…
English
16
204
1.6K
663.9K
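For readers who want the core idea behind the "4x smaller" figure, here is a minimal sketch of symmetric 8-bit quantization. It is not taken from the linked essay: the function names, the sample weights, and the choice of a single per-tensor scale are all assumptions made for illustration. The 4x comes from storing each weight in 1 byte instead of the 4 bytes of a 32-bit float.

```python
import struct


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor (an illustrative simplification).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for the example.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 weight vs 1 byte per int8 weight.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))

# Rounding to the nearest integer bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Real schemes usually keep one scale per block or per channel rather than per tensor, which is one way to keep the quality loss small at 4 bits.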
Rafael David Tinoco
Rafael David Tinoco@rafaeldtinoco·
Stressing poll(), timerfd, scheduling vruntime, bad work queue cleanup logic, maximizing a tiny race window, joining with a side channel, stack pivoting: this work is impressive (unfortunately it's red, but amazing ;)
English
0
0
0
86
Rafael David Tinoco
Rafael David Tinoco@rafaeldtinoco·
The only shame is that it requires KPTI disabled for the prefetch side channel attack (that makes all the other wonderful non-deterministic conditions even more difficult).
English
2
0
1
127
Rafael David Tinoco retweeted
LlamaIndex 🦙
LlamaIndex 🦙@llama_index·
We've spent years building LlamaParse into the most accurate document parser for production AI. Along the way, we learned a lot about what fast, lightweight parsing actually looks like under the hood.
Today, we're open-sourcing a light-weight core of that tech as LiteParse 🦙
It's a CLI + TS-native library for layout-aware text parsing from PDFs, Office docs, and images. Local, zero Python dependencies, and built specifically for agents and LLM pipelines.
Think of it as our way of giving the community a solid starting point for document parsing:
npm i -g @llamaindex/liteparse
lit parse anything.pdf
- preserves spatial layout (columns, tables, alignment)
- built-in local OCR, or bring your own server
- screenshots for multimodal LLMs
- handles PDFs, office docs, images
Blog: llamaindex.ai/blog/liteparse…
Repo: github.com/run-llama/lite…
English
13
57
416
580K
Rafael David Tinoco retweeted
Piotr Mińkowski
Piotr Mińkowski@piotr_minkowski·
Are you looking for the best model to run locally on your hardware? Instead of pulling different models and trying, you can just get recommendations for a given category using LLM Checker (github.com/Pavelevich/llm…). 👇👇👇
English
11
51
399
27.9K
Rafael David Tinoco retweeted
Vali Neagu
Vali Neagu@AmbsdOP·
YES! Someone reverse-engineered Apple's Neural Engine and trained a neural network on it.
Apple never allowed this. ANE is inference-only. No public API, no docs. They cracked it open anyway.
Why it matters:
• M4 ANE = 6.6 TFLOPS/W vs 0.08 for an A100 (80× more efficient)
• "38 TOPS" is a lie - real throughput is 19 TFLOPS FP16
• Your Mac mini has this chip sitting mostly idle
Translation: local AI inference that's faster AND uses almost no power.
Still early research but the door is now open.
→ github.com/maderix/ANE
#AI #MachineLearning #AppleSilicon #LocalAI #OpenSource #ANE #CoreML #AppleSilicon #NPU #KCORES
English
159
734
7.1K
550.4K
Rafael David Tinoco retweeted
Bo Wang
Bo Wang@BoWang87·
DeepSeek just dropped a new paper, and it's not about a new model. It's about the infrastructure bottleneck that limits every agentic LLM at scale. And their fix nearly doubles throughput.
DualPath (arxiv.org/abs/2602.21548): the bottleneck in multi-turn agentic inference isn't compute anymore. It's storage bandwidth.
When agents run at scale, every new turn reloads the full KV-Cache from external storage. At long contexts + high concurrency, this I/O completely dominates. The GPU sits waiting on the NIC.
In standard disaggregated setups (prefill + decode engines), there's a brutal asymmetry:
- Prefill engine NIC: saturated loading KV-Cache
- Decode engine NIC: sitting completely idle
The bandwidth is right there. It's just not being used.
DualPath adds a second loading path:
Path 1 (existing): Storage → Prefill engine
Path 2 (new): Storage → Decode engine → Prefill engine via RDMA
The idle decode NIC now does real work. RDMA transfers over the compute network avoid congestion without touching latency-critical inference traffic. A global scheduler dynamically balances load across both paths in real time.
Results on 3 models with production agentic workloads:
📈 Offline throughput: up to 1.87×
📈 Online serving: 1.96× average, without violating SLO
As agents get more capable (longer context, more tool calls, persistent memory), KV-Cache I/O only gets worse. DeepSeek is solving this at the infrastructure layer while everyone else is focused on model benchmarks.
One of those "obvious in hindsight" ideas: the bandwidth was always there.
English
34
100
580
57.4K
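The two-path idea in the tweet above can be illustrated with a toy load balancer. This is a hedged sketch only: the greedy "earliest finish" rule, the `Path` class, the path labels, and the bandwidth and request-size numbers are all hypothetical, and nothing here is taken from the DualPath paper's actual scheduler.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    bandwidth_gb_per_s: float  # hypothetical link bandwidth
    queued_gb: float = 0.0     # data already assigned to this path

    def finish_time(self, extra_gb: float = 0.0) -> float:
        """Seconds until everything queued (plus extra_gb) has been loaded."""
        return (self.queued_gb + extra_gb) / self.bandwidth_gb_per_s


def assign(paths, request_sizes_gb):
    """Greedy balancing: send each KV-cache load down the path that finishes it soonest."""
    plan = []
    for size in request_sizes_gb:
        best = min(paths, key=lambda p: p.finish_time(size))
        best.queued_gb += size
        plan.append(best.name)
    return plan


# Two paths with equal (made-up) bandwidth; the second stands in for the
# storage -> decode engine -> prefill engine route described in the tweet.
paths = [
    Path("storage->prefill", bandwidth_gb_per_s=50.0),
    Path("storage->decode->prefill", bandwidth_gb_per_s=50.0),
]
plan = assign(paths, [10.0, 10.0, 10.0, 10.0])

# Time to drain all loads when both paths are used.
makespan = max(p.finish_time() for p in paths)
```

In this idealized setup the four loads split evenly and finish in half the time a single path would need. The tweet's reported gains are lower than 2x, which is consistent with real systems having overheads this toy ignores.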
Rafael David Tinoco retweeted
Qwen
Qwen@Alibaba_Qwen·
The Qwen3.5 series maintains near-lossless accuracy under 4-bit weight and KV cache quantization.
In terms of long-context efficiency:
Qwen3.5-27B supports 800K+ context length
Qwen3.5-35B-A3B exceeds 1M context on consumer-grade GPUs with 32GB VRAM
Qwen3.5-122B-A10B supports 1M+ context length on server-grade GPUs with 80GB VRAM
In addition, we have open-sourced the Qwen3.5-35B-A3B-Base model to better support research and innovation. We can't wait to see what the community builds next!
Qwen@Alibaba_Qwen

🚀 Introducing the Qwen 3.5 Medium Model Series
Qwen3.5-Flash · Qwen3.5-35B-A3B · Qwen3.5-122B-A10B · Qwen3.5-27B
✨ More intelligence, less compute.
• Qwen3.5-35B-A3B now surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B: a reminder that better architecture, data quality, and RL can move intelligence forward, not just bigger parameter counts.
• Qwen3.5-122B-A10B and 27B continue narrowing the gap between medium-sized and frontier models, especially in more complex agent scenarios.
• Qwen3.5-Flash is the hosted production version aligned with 35B-A3B, featuring:
  - 1M context length by default
  - Official built-in tools
🔗 Hugging Face: huggingface.co/collections/Qw…
🔗 ModelScope: modelscope.cn/collections/Qw…
🔗 Qwen3.5-Flash API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Try in Qwen Chat 👇
Flash: chat.qwen.ai/?models=qwen3.…
27B: chat.qwen.ai/?models=qwen3.…
35B-A3B: chat.qwen.ai/?models=qwen3.…
122B-A10B: chat.qwen.ai/?models=qwen3.…
Would love to hear what you build with it.

English
59
169
2.1K
455.4K
Rafael David Tinoco retweeted
STH
STH@ServeTheHome·
The MikroTik CRS804 DDQ is a low-cost 4x 400GbE or 1.6Tbps class switch that is going to bring high-speed networking to new segments servethehome.com/mikrotik-crs80…
English
12
29
179
14.8K
Rafael David Tinoco retweeted
exQUIZitely 🕹️
exQUIZitely 🕹️@exQUIZitely·
The most "claustrophobic" game ever? Descent (1995) is a first-person shooter developed by Parallax Software, notable for being the first FPS with fully true 3D graphics and six degrees of freedom movement. Players pilot the Pyro-GX spaceship through mineshafts on various planets, infected by a virus that has turned mining robots hostile. There you go, the whole story in one sentence! Movement is the game's hallmark: full six degrees of freedom allows free flight in any direction - forward/backward, left/right (slide/strafe), up/down, and 360° rotation - creating disorienting, stomach-churning zero-gravity combat. For someone like me, being claustrophobic, this was both tough to play yet highly fascinating. I feel Descent is an underrated game that got a bit lost in the shuffle of other great games around the mid 90s.
English
1.4K
560
8.8K
831.6K
Grummz
Grummz@Grummz·
Lenovo unveiled a bunch of concepts for their "Rollable" screens concept at CES today. This laptop has a screen that grows widescreen at the touch of a button. They have monitors and other configs too.
English
317
667
8.1K
608.4K
Rafael David Tinoco retweeted
cr0@Defensive-Security.com / EDRmetry / PurpleLabs
Having low-level real-time #Linux events visibility really changes the Linux hunting & detection game. I always liked to analyze network flows with Zeek + Splunk. Now, a similar approach is possible on system events thanks to Kunai Runtime Security, Jibril, or Tetragon. You can learn so much about the behavior of Linux internals, especially when EDRmetry powers your offensive mindset - an effective Linux EDR/SIEM Evaluation Testing Playbook, which allows for Detection Coverage/Incident Response testing by executing dedicated Linux offensive tests mapped to MITRE ATT&CK™ Framework. Detailed write-up is in progress. #redteam #blueteam #linux #purplelabs
English
1
8
28
1.8K
Rafael David Tinoco retweeted
Proton
Proton@ProtonPrivacy·
So here’s Perplexity’s playbook: 1️⃣ Collect everything 2️⃣ Train AI on it 3️⃣ Monetize through ads, pricing models, or partnerships. If it sounds familiar, it’s because that’s Google’s own playbook, but instead used by a less-than-3-year-old startup with VCs eager to get a return on their investments. 6/7
English
3
31
268
22.8K
Jorge Messias
Jorge Messias@jorgemessiasagu·
Statement from the Attorney General of the Union. As Attorney General of the Union of the Federative Republic of Brazil, I consider it a duty, at this moment, to express support for and solidarity with the justices of the Supreme Federal Court and the Prosecutor General of the Republic, who, together with their family members, have been targeted by arbitrary visa revocations by a foreign nation for carrying out, in constitutional terms, their legitimate institutional functions. One cannot accept the distortion that seeks to attribute to these Brazilian authorities acts that violate fundamental rights or censor freedom of expression, when in truth their conduct stays within the strict limits of the legal order, in favor of preserving the integrity of our Democracy and the attributes of the rule of law. The exercise of jurisdiction, in the context of a stable justice system aligned with the guarantees of citizenship, cannot under any circumstances be subjected to politically motivated harassment, much less with the involvement of a foreign state. I assure you that no illegitimate expedient or sordid conspiratorial act will intimidate the Judiciary of our country in its independent and dignified conduct. Jorge Messias, Attorney General of the Union
Português
524
1.2K
7K
295.6K
Rafael David Tinoco
Rafael David Tinoco@rafaeldtinoco·
@CocaCola_Br you destroyed #CocaCola in Brazil. The taste of Coke is horrible. It used to be as good as or better than Mexican Coke made with real sugar. Now it tastes like dirty water. Blergh.
Português
0
0
0
19
Jorge Niedbalski
Jorge Niedbalski@niedbalski·
Hi @OrbStack Where can I get the source code of the linux kernel you are distributing for Mac OS? Thank you!
English
1
0
1
150