tanuja devadiga
@TanujaDeva47734
3.9K posts
Joined July 2023
3.1K Following · 102 Followers

tanuja devadiga retweeted
antirez @antirez
DeepSeek v4 Flash with *local inference*, after 24h of playing with it: even with the 2-bit selective quantization GGUF, it is the FIRST time I feel I have a frontier model running on my computer. This is *crazy*, and probably a much stronger change in the landscape than PRO.
31 replies · 55 reposts · 1K likes · 55.5K views
tanuja devadiga retweeted
antirez @antirez
@badlogicgames Yup: x.com/antirez/status… The GGUF tool-calling template is wrong, but I'm uploading a new GGUF file. Otherwise, the correct template file is in one of the latest commits, and you need to specify it when running llama.cpp. Uploading the fixed file to HF ASAP, btw.
antirez @antirez

Here you can find experimental support for DeepSeek v4 Flash in llama.cpp: github.com/antirez/llama.… And a GGUF file you can use in order to run the inference with just 128 GB of RAM: huggingface.co/antirez/deepse… Check the first part of the README for instructions

1 reply · 4 reposts · 29 likes · 13.8K views
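antirez's workaround, in short: until the fixed GGUF lands, override the broken embedded template with the corrected template file from the repo when launching llama.cpp. A minimal sketch of that, driving llama.cpp from TypeScript; the model filename and template path are placeholders, and the --jinja / --chat-template-file flags are assumptions based on recent llama.cpp builds (check llama-cli --help on your version):

```ts
// Sketch: run llama.cpp with an explicit chat-template file, overriding
// the (broken) template embedded in the GGUF. Paths are hypothetical.
import { spawn } from "node:child_process";

const args = [
  "-m", "deepseek-v4-flash-q2.gguf",        // placeholder model file
  "--jinja",                                 // enable Jinja chat templates
  "--chat-template-file", "template.jinja",  // corrected template from the repo
  "-p", "Write a haiku about quantization",
];

// Inherit stdio so the model's output streams straight to the terminal.
const proc = spawn("llama-cli", args, { stdio: "inherit" });
proc.on("exit", (code) => process.exit(code ?? 0));
```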
tanuja devadiga retweeted
antirez @antirez
This is DeepSeek v4 Flash quantized at 2 bit, running as the LLM of the pi agent. Tool calling is apparently perfect, so this model, at least with this specific quantization scheme I used, is capable of working very well. Now I need a real speedup, not in generation t/s but in prompt processing.
6 replies · 6 reposts · 122 likes · 8K views
tanuja devadiga retweeted
dex @dexhorthy
And when you don't understand the subject and don't have taste and judgement (e.g. a backend engineer making an iOS PR), that is the absolute worst way to use AI: you are vibing slop. The valuable and impressive use case with AI is: can you get it to honor your taste and judgement without being such a micromanager that your actual throughput isn't that much faster? This can be done, but it requires skill and the intuition that only comes from 1000+ hours working with an LLM.
Karri Saarinen @karrisaarinen

A common dynamic I observe with AI: it feels most impressive when you don't know much about the subject, don't care, or don't have a clear idea of what you want. This applies across design, code, legal, and more. If I don't know code very well, every piece of code it writes feels very impressive. Once you know what something should feel or look like, it becomes almost impossible to guide AI there. And you definitely can't one-shot it.

9 replies · 10 reposts · 109 likes · 10.4K views
tanuja devadiga retweeted
Cursor @cursor_ai
GPT-5.5 is now available in Cursor! It's currently the top model on CursorBench at 72.8%. We've partnered with OpenAI to offer it for 50% off through May 2.
168 replies · 270 reposts · 5.7K likes · 479.4K views
tanuja devadiga retweeted
Matt Pocock @mattpocockuk
One thing I wish harnesses did by default: When opening a file, FIRST pre-compile the file and extract only the type signatures and comments for that file (with tsgo this would be instant). Then, if you want to see the implementation, only unwrap the functions you're interested in. Essentially .d.ts for the first step, .ts for the second. Would save a ton of tokens and allow agents to explore more aggressively.
45 replies · 11 reposts · 403 likes · 33K views
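A minimal sketch of the first step Matt describes, using the standard TypeScript compiler API rather than tsgo (which is assumed here to expose an equivalent, faster path): emit a declaration-only (.d.ts) view of a file, which keeps type signatures and doc comments while dropping implementations. The file path is a placeholder; a real harness would presumably cache this and unwrap the full .ts source only on demand:

```ts
// Sketch: produce a .d.ts "signatures only" view of one file, in memory.
import ts from "typescript";

function signaturesOf(fileName: string): string {
  let dts = "";
  const program = ts.createProgram([fileName], {
    declaration: true,
    emitDeclarationOnly: true, // skip JS output entirely
  });
  // Capture the generated declaration text instead of writing it to disk.
  program.emit(undefined, (_outName, text) => { dts = text; });
  return dts;
}

console.log(signaturesOf("src/agent.ts")); // hypothetical path
```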
tanuja devadiga retweeted
Mario Zechner @badlogicgames
this is probably the most important piece of software of the decade next to vllm and sglang. i'm not joking.
Georgi Gerganov @ggerganov

llama.cpp at 100k stars

Now that 90% of the code worldwide is being written by AI agents, I predict that within 3-6 months, 90% of all AI agents will be running locally with llama.cpp 😄

Jokes aside, I am going to use this small milestone as an opportunity to reflect a bit on the project and the state of AI from the perspective of local applications.

There is a lot to say and discuss, and yet it feels less and less important to try to make a point. Opinions about the viability of local LLMs are strongly polarized, details are overlooked, and the scientific approach is lacking. Arguments are predominantly based on vibes and hype waves. One thing is clear though: local LLMs are used more and more. I expect this trend to continue, and 2026 will likely end up being one of the most important years for the local AI movement.

I admit that I didn't expect the agentic era to come so quickly to the local LLM space. One year ago, the available models were too computationally expensive for long-context tasks. There wasn't an obvious path towards meaningful agentic applications; the memory and compute requirements were huge. Last summer, with the release of gpt-oss, things started to change. It was the first time we saw a glimpse of tool calling that actually works well within the resource constraints of our daily devices. Later in the year, even better models were released, and by now useful local agentic workflows are a reality.

Comparing local vs hosted capabilities at a given moment in time is pointless. To try to put things into perspective:
- We don't need frontier intelligence to automate searches and send emails
- We don't need trillion-parameter models to summarize articles or technical documents
- We don't need massive GPU data centers to control our home appliances or turn the lights off in the garage

I believe there is a certain level of intelligence we as humans can comprehend and meaningfully utilize to improve our working process. Beyond that level, access to more intelligence becomes unnecessary at best and counterproductive at worst. I also believe that that level of useful artificial intelligence is completely within reach locally, and it has always been just a matter of implementing the right software stack to bring it to the end user. With llama.cpp, I am confident that we continue to be on the right track of building that software stack!

The llama.cpp project is going stronger than ever. With more than 1500 contributors, the project keeps growing steadily. From a technical point of view, I think that llama.cpp + ggml is the only solution that actually makes sense. That is, the software stack must run efficiently on every possible device, hardware, and operating system. The technology is too important to be vendor-locked. It has to be developed in the open, by the community, together with the independent hardware vendors. This is the only right way to build something that will truly make a difference in the long run.

I won't try to convince you about what is currently and will be possible with local AI. We will just continue to build as usual. I am confident that after the smoke clears and we look objectively at what we have built together, the benefits will be obvious to everyone.

Big shoutout to all llama.cpp maintainers. I feel extremely lucky to be able to work together with so many talented contributors. Every day I learn something new, and I feel there is so much more cool stuff that we are going to build.

Also, I am really thankful that the project continues to have reliable partners supporting it! Cheers!

28 replies · 69 reposts · 1.5K likes · 174.5K views
tanuja devadiga retweeted
Matt Pocock @mattpocockuk
A talk I gave a few weeks ago. Software fundamentals matter more than ever. Here's why: youtube.com/watch?v=v4F1gF…
36 replies · 143 reposts · 1.3K likes · 297K views
tanuja devadiga retweeted
Yuchen Jin @Yuchenj_UW
> Vercel got pwned
> severe enough to notify law enforcement
> the only advice: "review your environment variables"
> what does that even mean?
> $10B company, and this is how you communicate

Cyber attacks are ramping up fast; starting to see why Anthropic is scared to release Mythos.
Vercel @vercel

We’ve identified a security incident that involved unauthorized access to certain internal Vercel systems, impacting a limited subset of customers. Please see our security bulletin: vercel.com/kb/bulletin/ve…

37 replies · 23 reposts · 840 likes · 95.1K views
tanuja devadiga retweeted
Aadi Kulshrestha @MankyDankyBanky
I trained a 12M-parameter LLM on my own ML framework, using a Rust backend and CUDA kernels for flash attention, AdamW, and more. Wrote the full transformer architecture and BPE tokenizer from scratch.

The framework features:
- Custom CUDA kernels (Flash Attention, fused LayerNorm, fused GELU) for 3x increased throughput
- Automatic WebGPU fallback for non-NVIDIA devices (see the sketch below)
- TypeScript API with Rust compute backend
- One npm install to get started, prebuilt binaries for every platform

Try out the model for yourself: mni-ml.github.io/demos/transfor…

Built with @_reesechong. Check out the repos and blog if you want to learn more. Shoutout to @modal for the compute credits allowing me to train on 2 A100 GPUs without going broke. cc @sundeep @GavinSherry
131 replies · 258 reposts · 3.5K likes · 778.7K views
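On the "automatic WebGPU fallback": a generic, hedged sketch of that kind of backend selection, not the actual mni-ml API (the names and structure here are invented for illustration). The idea: prefer the native Rust/CUDA path when a prebuilt binary loaded, try WebGPU via navigator.gpu, and fall back to pure TypeScript:

```ts
// Generic backend-selection sketch (not mni-ml's real code).
type Backend = "native" | "webgpu" | "js";

async function pickBackend(nativeLoaded: boolean): Promise<Backend> {
  // Prebuilt Rust/CUDA binary found for this platform: fastest path.
  if (nativeLoaded) return "native";
  // WebGPU is exposed as navigator.gpu in browsers and some JS runtimes.
  const gpu = (globalThis as any).navigator?.gpu;
  const adapter = gpu ? await gpu.requestAdapter() : null;
  if (adapter) return "webgpu";
  // Last resort: a pure-TypeScript compute path, slow but universal.
  return "js";
}
```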
tanuja devadiga retweeted
Teknium 🪽 @Teknium
Welcome to the crew!
sprmn.base.eth @sprmn2024

I started contributing to @NousResearch Hermes Agent by doing one thing: reading the code. Then a small fix. Then another. Gateway platforms, skills, bug fixes... It kept going for a long time. Today I received the Developer role in the Nous Research Discord. 🎉 Special thanks to @Teknium for reviewing and valuing every contribution throughout this journey. For anyone thinking about contributing to open source: the best starting point is reading the code. The rest follows. github.com/NousResearch/h… 🤖

2 replies · 5 reposts · 178 likes · 11.6K views
tanuja devadiga retweeted
Ivan Velichko @iximiuz
3x playground uptime just landed at iximiuz Labs 🚀
- Work on any task for up to 24h
- Run sandboxed agents w/o interruption
- Take longer breaks while solving challenges or following course lessons without losing progress
1 reply · 7 reposts · 49 likes · 2.9K views
tanuja devadiga retweeted
Matthew Dabit @MattDabit
LLMs accelerate shipping. They don't replace thinking. Before you ship, review your code thoroughly. Think it through. Define success metrics upfront. AI hits a wall on something? Use it to explain, then master the concept yourself. A better human always beats a better prompt. Ship slop and your AI stays mediocre. Level up first. Your velocity and your customers will both win.
5 replies · 9 reposts · 102 likes · 2.5K views
tanuja devadiga retweeted
ClaudeDevs @ClaudeDevs
Some of you ran into Opus 4.7 refusing normal code edits with "this might be malware" warnings. That was a bug on our side, not the model being cautious. Older builds applied a stale safety prompt that Opus 4.7 doesn't need. Run claude update or relaunch the app.
169 replies · 170 reposts · 4.6K likes · 315.2K views
tanuja devadiga retweeted
Ivan Velichko @iximiuz
We just got our 200th challenge published 🚀 If you're learning Linux, containers, Kubernetes, or networking, check out our collection of practical problems at labs.iximiuz.com/challenges Learning by doing is the way! P.S. Many of these problems are completely free 😉
0 replies · 23 reposts · 150 likes · 6.7K views