Neo AI
@withneo

471 posts

First autonomous AI Engineer (34.2% on MLE Bench)

San Francisco, CA · Joined December 2022
19 Following · 10.1K Followers

Pinned Tweet
Neo AI @withneo
Introducing NEO: the first Autonomous Machine Learning Engineer. It works like a full-stack ML engineer that never sleeps: handling data exploration, feature engineering, training, tuning, deployment, and monitoring, end to end. Powered by 11 specialized agents, NEO runs autonomously, saving ML engineers thousands of hours and making them 10x faster.

Benchmarks: tested on 75 Kaggle competitions from OpenAI's MLE Bench, NEO scored a medal in 34.2% of them, significantly outperforming Microsoft's RD Agent (22.4%). This sets a new state of the art for autonomous ML systems.

NEO runs on a novel multi-agent orchestrator, powered by a multi-step reasoning engine, a context transfer protocol, and agent memory, built to solve complex workflows end to end. And with human-in-the-loop mode, you can guide, inspect, and override any step. You're always in command.

NEO is built for real-world workflows and ready for production. NEO is here to make every ML engineer truly superhuman. Watch NEO in action:
Neo AI @withneo
Examples & Results: when tested on the Iris dataset, the entire end-to-end pipeline (profiling, training 4 models, and generating the PDF) ran in under 7 seconds! ⚡

📊 Key Results:
🏆 Logistic Regression won with 0.9733 accuracy.
✅ 3 out of 5 AI-generated hypotheses were fully supported by numeric evidence.
🔍 The pipeline identified that the top 2 features (petal length and width) explained 83.1% of the predictive signal.
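NEO's own pipeline isn't public, but as a point of reference, here is a minimal scikit-learn sketch of the same Iris workflow; the model choice, cross-validation setup, and the coefficient-based importance proxy are assumptions for illustration (plain logistic regression lands in the same accuracy range as the 0.9733 quoted above):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Load Iris: 150 samples, 4 features (sepal/petal length and width), 3 classes.
X, y = load_iris(return_X_y=True)

# One candidate from a model comparison: logistic regression,
# scored with 5-fold cross-validation.
model = LogisticRegression(max_iter=1000)
accuracy = cross_val_score(model, X, y, cv=5).mean()

# Rank features by mean absolute coefficient as a rough importance proxy.
importance = np.abs(model.fit(X, y).coef_).mean(axis=0)
top2 = importance.argsort()[::-1][:2]  # the two highest-importance features
```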
Neo AI @withneo
What if you could drop in a raw CSV and get a publication-ready ML research report, complete with AI-generated hypotheses, a 4-model comparison, and feature charts in just 60 seconds? 🚀 Built autonomously by NEO to solve the tedious data science loop, here is how the Autonomous ML Research Loop works 🧵👇
Neo AI @withneo
The tool is designed to work exactly how you want it to. You can power it with an OpenRouter API key, use Google AI Studio, or even run it locally with Ollama. Just want to test the UI? It comes with a built-in Mock mode that requires no API keys at all and returns realistic demo responses.
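A provider switch with a mock fallback like the one described above is a small pattern; this is a hypothetical sketch (the class name, environment variable, and method signature are invented, not the tool's actual API):

```python
import os

# Hypothetical sketch of a multi-provider client factory with mock mode.
# Names here are illustrative, not the tool's real API.

class MockClient:
    """Built-in mock mode: canned responses, no API keys required."""
    def complete(self, prompt):
        return f"[mock] Here is a realistic demo response to: {prompt!r}"

def make_client(provider=None):
    # Provider can be passed explicitly or read from the environment;
    # default to mock so the UI works out of the box.
    provider = provider or os.environ.get("LLM_PROVIDER", "mock")
    if provider == "mock":
        return MockClient()
    # Real backends (OpenRouter key, Google AI Studio, or a local Ollama
    # server) would be wired in here; omitted from this sketch.
    raise NotImplementedError(f"unknown provider: {provider}")
```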
Neo AI @withneo
Are you looking for a streamlined way to interact with cutting-edge vision AI? Gemma 4 Vision Studio is a powerful web application that combines four distinct vision AI capabilities into a single, intuitive interface powered by Gemma 4. Built autonomously by NEO, this tool offers a versatile playground for image analysis and generation.
Neo AI @withneo
Stop chunking your documents! 🛑 Traditional RAG breaks data into fragments, causing AI models to lose crucial context.

Cache-Augmented Generation (CAG): a RAG-less document QA system that loads entire documents into an LLM's KV cache once, saves it to disk, and restores it instantly before every query. No embeddings, no vector DB, no chunking.

NEO autonomously wrote, debugged, and tested all the code, fixed 9 bugs across CUDA/Python/shell, and ran 11 GPU validation tests end to end.
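The prefill-once / restore-per-query workflow can be illustrated without a GPU; in this toy sketch a plain dict stands in for the model's real KV cache (past_key_values tensors), and the class, document, and answer logic are invented for illustration:

```python
import os
import pickle
import tempfile

# Toy stand-in for the CAG workflow: prefill the whole document once,
# persist the "cache", restore it before every query. A real system would
# serialize the model's KV-cache tensors instead of this dict.

class ToyCAG:
    def prefill(self, document):
        # One-time pass over the WHOLE document: no chunking, no embeddings.
        return {"tokens": set(document.split())}

    def answer(self, cache, entity):
        # Each query reuses the prefilled cache instead of re-reading the doc.
        return entity in cache["tokens"]

doc = "the rollout password is AZURE-9 and it rotates monthly"
cag = ToyCAG()

# Step 1: prefill once and save the cache to disk.
path = os.path.join(tempfile.mkdtemp(), "kv_cache.pkl")
with open(path, "wb") as f:
    pickle.dump(cag.prefill(doc), f)

# Step 2: restore the cache instantly before every query.
with open(path, "rb") as f:
    cache = pickle.load(f)
found = cag.answer(cache, "AZURE-9")
```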
Neo AI @withneo
Starting a new LLM session usually means manually copy-pasting files, git diffs, and shell errors. It's a repetitive and incomplete process. To solve this, NEO built Context Aggregator: a tool to automate ambient context collection for LLM pipelines.

Instead of manual updates, Context Aggregator runs a lightweight background daemon that continuously polls your development environment. It automatically gathers data from five default sources: file watchers, shell history, clipboard, browser tabs, and git activity.

How does it manage LLM token limits? Collected chunks are fed into a rolling context window managed by a priority queue. It uses a configurable token budget (defaulting to 4096 tokens) and evicts the lowest-scoring chunks based on recency and change magnitude. The output is a structured, queryable JSON object containing your recent commits, shell commands, and file snippets.
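The budget-and-evict mechanism described above can be sketched with a min-heap; the score weights, the whitespace token estimate, and the class name are assumptions, not Context Aggregator's actual code:

```python
import heapq

# Illustrative rolling context window with a token budget and
# lowest-score-first eviction, as described above. The 0.7/0.3 score
# weights and whitespace tokenization are invented for this sketch.

class RollingContext:
    def __init__(self, token_budget=4096):
        self.token_budget = token_budget
        self.chunks = []        # min-heap of (score, counter, tokens, text)
        self.total_tokens = 0
        self._counter = 0       # tie-breaker so tuples always compare

    def score(self, recency, change_magnitude):
        # Higher = more valuable; the lowest-scoring chunk is evicted first.
        return 0.7 * recency + 0.3 * change_magnitude

    def add(self, text, recency, change_magnitude):
        tokens = len(text.split())  # crude token estimate
        heapq.heappush(self.chunks, (self.score(recency, change_magnitude),
                                     self._counter, tokens, text))
        self._counter += 1
        self.total_tokens += tokens
        # Evict lowest-scoring chunks until the window fits the budget.
        while self.total_tokens > self.token_budget and self.chunks:
            _, _, t, _ = heapq.heappop(self.chunks)
            self.total_tokens -= t

    def snapshot(self):
        # Surviving chunks, highest score first (the queryable output).
        return [text for *_, text in sorted(self.chunks, reverse=True)]
```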
Neo AI @withneo
Stop guessing your LLM's ROI. The Context Cost Map orchestrates API calls (via OpenRouter) with 3 trials per context size for statistical reliability. It automatically tracks binary accuracy, latency, and USD cost, instantly generating interactive HTML subplots to visualize performance inflection points.
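The trial loop behind numbers like these is straightforward to sketch; `call_model` below is a stub standing in for a real OpenRouter call, and the function shape is an assumption, not the tool's API:

```python
import statistics
import time

# Sketch of the benchmark loop described above: N trials per context size,
# averaging binary accuracy, latency, and USD cost per size.

def run_benchmark(call_model, context_sizes, trials=3):
    results = {}
    for size in context_sizes:
        accs, lats, costs = [], [], []
        for _ in range(trials):
            t0 = time.perf_counter()
            correct, cost_usd = call_model(size)   # one API call
            lats.append(time.perf_counter() - t0)
            accs.append(1.0 if correct else 0.0)
            costs.append(cost_usd)
        results[size] = {
            "accuracy": statistics.mean(accs),
            "latency_s": statistics.mean(lats),
            "cost_usd": statistics.mean(costs),
        }
    return results

# Stub model: accurate up to 24K tokens, then fails; cost scales linearly.
def stub_model(size):
    return size <= 24_000, size * 1e-7

report = run_benchmark(stub_model, [1_000, 8_000, 64_000])
```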
Neo AI @withneo
Long context windows mean nothing if the model forgets what it read. The tool maps exactly where models break down. For example, deepseek/deepseek-chat is fast and cheap, but its accuracy plummets from 100% to just 67% the moment it crosses the 24K token threshold.
Neo AI @withneo
Speed and accuracy are great, but what are you paying for them? The benchmark revealed that openai/gpt-5.4-pro costs an average of $0.69278 per run, compared to the Xiaomi model's $0.00903. Both hit 100% accuracy, meaning GPT-5.4-Pro charges a massive 77× cost premium with zero retrieval benefit.
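The 77× figure follows directly from the two per-run averages quoted above:

```python
gpt_cost = 0.69278     # avg USD per run, openai/gpt-5.4-pro
xiaomi_cost = 0.00903  # avg USD per run, the Xiaomi model
premium = gpt_cost / xiaomi_cost  # ~76.7, i.e. roughly a 77x premium
```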
Neo AI @withneo
How is this measured? Context Cost Map runs a rigorous "needle in a haystack" evaluation. The tool dynamically generates filler text to reach target sizes from 1K up to 128K tokens, hides a secret target fact ("DELTA-7"), and forces the LLM to retrieve it.
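A prompt builder for this kind of evaluation is a few lines; the filler sentence, the ~4 chars/token estimate, and the question wording below are assumptions, while the hidden fact "DELTA-7" comes from the thread:

```python
# Illustrative needle-in-a-haystack prompt builder, not the tool's code.

FILLER = "The quick brown fox jumps over the lazy dog. "
NEEDLE = "The secret code word is DELTA-7."
QUESTION = "\n\nWhat is the secret code word? Answer with the code word only."

def build_prompt(target_tokens):
    target_chars = target_tokens * 4            # rough chars-per-token estimate
    n = max(1, target_chars // len(FILLER))
    haystack = FILLER * n
    mid = len(haystack) // 2
    # Bury the needle mid-document so retrieval is nontrivial.
    return haystack[:mid] + NEEDLE + haystack[mid:] + QUESTION
```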
Neo AI @withneo
How fast is your "fast" model when pushed to the limit? It's not just about whether an LLM can find the information, but how quickly it can start delivering it. NEO built Context Cost Map: a Python tool that maps accuracy, cost, and latency. By precisely tracking the time to first token across varying context sizes, Context Cost Map exposes the real-world speed of models under pressure: 5 models tested across 9 context sizes (1K-64K) with 3 trials each (135 API calls total).
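Time to first token is measured by timing how long a streaming response blocks before yielding its first item; `stub_stream` below is a stand-in for a real streaming API call:

```python
import time

# Sketch of time-to-first-token (TTFT) measurement over a streaming
# response. The stub simulates prefill delay growing with context size.

def time_to_first_token(stream):
    t0 = time.perf_counter()
    first = next(iter(stream))   # block until the first token arrives
    return time.perf_counter() - t0, first

def stub_stream(prefill_delay_s):
    time.sleep(prefill_delay_s)  # stand-in for prefill over a big context
    yield "hello"
    yield "world"

ttft, first_token = time_to_first_token(stub_stream(0.05))
```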
Neo AI @withneo
Did you know a standard 768x768 attention matrix has over 589,000 parameters, but its effective, meaningful rank is often below 64?

Transformer models are dominated by massive, over-parameterized attention layers. Traditional quantization methods (like int4) help reduce memory, but the matrix-vector multiplication still touches every single element. This means you get zero real FLOP savings on CPUs or edge devices, and post-compression fine-tuning becomes impossible.

🏥 Meet LatencySurgeon. It uses Singular Value Decomposition (SVD) and Tucker decomposition to extract the real information content of each attention layer. By keeping only the top singular values (which capture 95%+ of the matrix's energy) and throwing away the noise, it physically shrinks the massive weight matrix into two highly efficient smaller ones.

The Highlights:
🚀 Up to 40% inference speedup for your models in just 3 lines of code
💻 True CPU-native speedups through fewer FLOPs, no special kernels or CUDA required
🛠️ 100% fine-tune friendly because your weights stay in dense float32 Linear layers
🧩 Highly composable, you can even stack it with int8 quantization for up to a 1.65x combined speed boost!

The wildest part? 🤖 LatencySurgeon was built completely autonomously using NEO.
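The SVD truncation trick itself is easy to demonstrate in NumPy; the synthetic rank-32 "attention" matrix and the 95% energy threshold below are illustrative assumptions, not LatencySurgeon's internals:

```python
import numpy as np

# Sketch of low-rank weight factorization: replace one dense d x d layer
# with two thin layers that keep >= 95% of the matrix's energy.

rng = np.random.default_rng(0)
d, true_rank = 768, 32
# Synthetic weight: low effective rank plus a little noise.
W = rng.standard_normal((d, true_rank)) @ rng.standard_normal((true_rank, d))
W += 0.01 * rng.standard_normal((d, d))

U, S, Vt = np.linalg.svd(W, full_matrices=False)

# Smallest k whose top singular values capture >= 95% of the energy.
energy = np.cumsum(S**2) / np.sum(S**2)
k = int(np.searchsorted(energy, 0.95)) + 1

# Two thin dense matrices replace the original d x d weight.
A = U[:, :k] * S[:k]   # d x k (singular values folded in)
B = Vt[:k, :]          # k x d

# Matvec through the factored form vs. the original weight.
x = rng.standard_normal(d)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)

# Per-matvec FLOPs: 2*d*k + 2*k*d vs 2*d*d, i.e. a 2k/d fraction.
flop_ratio = 2 * k / d
```

With a genuinely low-rank weight, k stays far below d, so the factored matvec does a small fraction of the original FLOPs while the relative error is bounded by the discarded tail energy.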