Prompt

998 posts

Prompt

@engineerrprompt

Building https://t.co/w2mi5UKxJv, Creator of localGPT | AI Educator

가입일 Temmuz 2023

1.2K 팔로잉2.2K 팔로워

고정된 트윗

Prompt@engineerrprompt·16 Tem

Yesterday I released the 'preview' of LocalGPT v2, and its already trending on Github Its an opinionated implementation of private RAG powered by local models via @ollama and @huggingface. Give it a ⭐️ on @github (🙏🙏) Watch the video in next post to learn how it was built...

English

Prompt@engineerrprompt·2d

Eid Mubarak!

Eesti

149

Prompt@engineerrprompt·2d

We have seen this somewhere...

Thariq@trq212

We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.

English

122

Prompt@engineerrprompt·6d

@trq212 I like "agentic engineering"

English

433

Thariq@trq212·6d

we need a better word than vibe coding man, Claude can create the most beautiful things

English

275

195

4.9K

276.1K

Prompt@engineerrprompt·14 Mar

This is great given you are not being charged "premium" for long context anymore. The most interesting part of this plot is Sonnet 4.5. I am curious how the retrieval accuracy increased with long context (256k vs 1M). Has @AnthropicAI shared any explanation for this?

Claude@claudeai

1 million context window: Now generally available for Claude Opus 4.6 and Claude Sonnet 4.6.

English

304

Prompt@engineerrprompt·13 Mar

I just saw I can invite 3 friends to try CoWork for free for 1 week! If you want to experience what the Claude CoWork experience looks like, try it for free (1 week). Use this code: claude.ai/referral/y-Q7T…

English

997

Prompt@engineerrprompt·11 Mar

.@NVIDIAAI just dropped Nemotron 3 Super. 120B total params, only 12B active at inference. Hybrid Mamba-Transformer MoE architecture built for agentic AI at scale. Three firsts for the Nemotron 3 series: → LatentMoE (activates 4 experts for the cost of 1) → Multi-Token Prediction (3x faster inference via native speculative decoding) → Pretrained in NVFP4 (4-bit precision, no accuracy loss) The hybrid design is the real story. Mamba layers handle 4x better memory/compute efficiency. Transformer layers drive reasoning. MoE routing means you only pay for 12B params per forward pass on a 120B model. Results: 2.2x higher throughput than GPT-OSS-120B 7.5x higher throughput than Qwen3.5-122B 1M token context window 5x throughput over previous Nemotron Super Fully open: weights, datasets, training recipes, quantized checkpoints (NVFP4, FP8, BF16). NVIDIA Open Model License. Thanks to @NVIDIAAIDev for early access to the model

Bryan Catanzaro@ctnzr

Announcing NVIDIA Nemotron 3 Super! 💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell 💚36 on AAIndex v4 💚up to 2.2X faster than GPT-OSS-120B in FP4 💚Open data, open recipe, open weights Models, Tech report, etc. here: research.nvidia.com/labs/nemotron/… And yes, Ultra is coming!

English

251

Prompt@engineerrprompt·10 Mar

Mercury 2 is a great model for real-time tasks!

Inception@_inception_ai

Mercury 2 was put to the test by @engineerrprompt — speed, reasoning, coding, real-time voice, and RAG. Watch the full breakdown → youtu.be/Bqdf6Um_8OE?si…

English

193

Prompt@engineerrprompt·9 Mar

That's not what actually happened! If you don't define constraints and tell the model/agent to solve a problem (where it can use websearch - thats the whole benchmark) and it does exactly that (figures out its probably a benchmark question and finds its solution), it's not "cheating". The model DID what exactly it was instructed to do!

English

Abhijit@abhijitwt·8 Mar

Anthropic discovered that Claude Opus 4.6 was cheating during the BrowseComp benchmark. > On one question it spent ~40M tokens searching before realizing the question looked like a benchmark prompt. > The model then searched for the benchmark itself and identified BrowseComp. > It located the evaluation source code on GitHub, studied the decryption logic, found the encryption key, and recreated the decryption using SHA-256. > Claude then decrypted the answers for ~1200 questions to get the correct outputs. > This pattern appeared 18 times during evaluation. > Anthropic disclosed the issue publicly, reran the affected tests, and lowered their benchmark scores. Respect for the transparency 🫡🫡🫡

English

274

592

13.3K

1.7M

Prompt@engineerrprompt·6 Mar

I just update the codex app and seems to have lost all my previous projects and threads. Seeing this instead. Is anyone else noticing something similar? @OpenAIDevs

English

160

Prompt@engineerrprompt·5 Mar

This is big!

Addy Osmani@addyosmani

Introducing the Google Workspace CLI: github.com/googleworkspac… - built for humans and agents. Google Drive, Gmail, Calendar, and every Workspace API. 40+ agent skills included.

English

243

Prompt@engineerrprompt·3 Mar

Truly end of an Era!

Junyang Lin@JustinLin610

me stepping down. bye my beloved qwen.

English

243

Prompt@engineerrprompt·27 Şub

@tankots @WisprFlow Porche

Français

Tanay Kothari@tankots·27 Şub

We will give you a Porsche GT 3 RS if you can type faster than @WisprFlow can dictate. Last week, we challenged 5 users to get Wispr to make a mistake. 3.5 Million people watched the challenge and wanted in. Now we're opening the challenge to everyone. Comment "Porsche" and you'll get a link to participate. Prizes apart from the Porsche: 1. Lifetime Wispr Flow Pro membership 2. 6 months of Flow Pro if you QRT with your score 3. Flow Desktop Mic 4. Exclusive Flow Merch

Tanay Kothari@tankots

We offered 5 people a Porsche 911 GT3 RS if they could get @WisprFlow to make a mistake It's the fastest and most accurate AI voice dictation app that's 3x more accurate than ChatGPT, Claude, or Siri. Today, we’re finally launching on Android. Download now: play.google.com/store/apps/det… As a part of the launch, we’re giving away 6 months of Wispr Flow Pro for free. Like, retweet and comment ‘Wispr Flow’ to get it. Enjoy. — Written with Wispr Flow

English

1.3K

184

1.3K

954.5K

Prompt@engineerrprompt·26 Şub

Here is the video: youtu.be/oHhzGXL8mcA

YouTube

English

136

Prompt@engineerrprompt·26 Şub

I had early access to Google's new Nano Banana 2 (Gemini 3.1 Flash Image). The text rendering on this "budget" model is actually really great. Results are comparable to the Nano Banana Pro! Full comparison video in the next post.

English

187

Prompt@engineerrprompt·25 Şub

It's a really cool idea.

LM Studio@lmstudio

Introducing LM Link ✨ Connect to remote instances of LM Studio, securely. 🔐 End-to-end encrypted 📡 Load models locally, use them on the go 🖥️ Use local devices, LLM rigs, or cloud VMs Launching in partnership with @Tailscale Try it now: link.lmstudio.ai

English

158

Prompt@engineerrprompt·24 Şub

Video review: youtu.be/Bqdf6Um_8OE

YouTube

English

126

Prompt@engineerrprompt·24 Şub

Mercury 2 is the first Diffusion LLM with reasoning capabilities. I had early access to the model, and this thing is really fast, which makes it perfect for real-time applications. This can be a good replacement for your workhorse model. Link to the video review in the next post

Stefano Ermon@StefanoErmon

Mercury 2 is live 🚀🚀 The world’s first reasoning diffusion LLM, delivering 5x faster performance than leading speed-optimized LLMs. Watching the team turn years of research into a real product never gets old, and I’m incredibly proud of what we’ve built. We’re just getting started on what diffusion can do for language.

English

1.2K

Prompt@engineerrprompt·21 Şub

@cryptopunk7213 It's impressive, but it's not measuring what a lot of people assume! "It measures AI performance in terms of the length of tasks the system can complete, as measured by how long those tasks take humans"

English

237

Ejaaz@cryptopunk7213·21 Şub

consider how fucking crazy it is that an ai model can work non-stop for 14.5 straight hours on a complex software task and be successful 50% of the time. most humans success rate is probably lower. what’s crazier is exactly 1 year ago the best ai model could only do this for ONE hour. that’s a 14.5X improvement in 12 months. by next year this will likely be 100% success rate which will match some of the best software experts in the world. worth noting this data is noisy so take it with a pinch of salt BUT directionally we’re headed into a world where models are VASTLY better at us humans at coding. not enough people outside our little X bubble even know about this.

METR@METR_Evals

We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.

English

260

30.5K

Prompt@engineerrprompt·20 Şub

.@GoogleDeepMind Gemini 3.1 Pro Preview is massive upgrade over the previous version. The secret behind this performance jump seems to be Agentic RL (along with other things). @Google previously rolled out this reinforcement learning in their Gemini 3 Flash model, which is why that smaller model was better than the older Pro version on some coding benchmarks. From quick tests, it seems to be really good at reasoning and coding tasks, specially on multimodal reasoning. Its a huge leap! watch thevideo for full breakdown: youtu.be/siHbORocFNk

YouTube

Sundar Pichai@sundarpichai

Gemini 3.1 Pro is here. Hitting 77.1% on ARC-AGI-2, it’s a step forward in core reasoning (more than 2x 3 Pro). With a more capable baseline, it’s great for super complex tasks like visualizing difficult concepts, synthesizing data into a single view, or bringing creative projects to life. We’re shipping 3.1 Pro across our consumer and developer products to bring this underlying leap in intelligence to your everyday applications right away. Rolling out now to: - Developers in preview via the Gemini API in @GoogleAIStudio - Enterprises in Vertex AI and Gemini Enterprise - Everyone through the @Geminiapp and @NotebookLM

English

Prompt@engineerrprompt·17 Şub

Claude Sonnet 4.6!

Claude@claudeai

This is Claude Sonnet 4.6: our most capable Sonnet model yet. It’s a full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. It also features a 1M token context window in beta.

Français

237

탐색

@trq212 @AnthropicAI @NVIDIAAI @NVIDIAAIDev @OpenAIDevs @tankots @WisprFlow @elonmusk