Paul Glad Mihai

1.1K posts


@gladomat

Data scientist at AICURA medical. neurosci. like low level, subcortical processing. high level could be anything, right 🤷? jk ( ´∀`)

Joined June 2014
408 Following · 124 Followers
Jun Song@jun_song·
Why I personally don't recommend the RTX 3090 for Local LLMs: While it offers fantastic inference performance for the price, there are a few major drawbacks.
> The biggest issue: Durability. If you buy a used 3090, there's a high risk it was heavily abused for crypto mining.
> The power consumption is absolutely massive.
> Extreme heat. It's one of the hottest GPUs out there and will literally heat up your entire room.
> Used prices have gone up so much that they are almost back to the original launch price.
Make sure to carefully weigh the pros and cons before making a purchase!
English
83
18
316
108.4K
Paul Glad Mihai@gladomat·
@pupposandro Do you recommend buying two 3090s directly from China? They are now around 1000€ which is ridiculous.
English
0
0
0
23
NVIDIA GeForce@NVIDIAGeForce·
Recruits, your first prize is here... A custom GeForce RTX 5080 Founders Edition + PC copy of the game. Comment #007FirstLightRTX to win 👇
English
27.8K
2K
14.8K
2.2M
Maurice | KI-Vater@KI_Vater·
@jun_song I'm happy with my M1 Max 64 GB. I get a stable 44 tok/s with the model mentioned above. And only a fraction of the electricity costs.
German
1
0
0
63
Jun Song@jun_song·
Best budget local LLM hardware comparison: 3090 vs Mac Studio M1 Max 64 GB
Price: both ~$2k to set up
Pros on 3090: much better performance (27b vs 35b at similar tok/s)
Pros on Mac: power efficiency, bigger RAM, zero heat/noise, reliability
English
29
7
174
22.3K
Sandro@pupposandro·
Question for anyone running local LLMs: Would you buy a 9L mini-PC with a Ryzen AI Max+ 395 (128GB unified) + refurbished RTX 3090 24GB, plug and play with Lucebox pre-installed, warranty included, at $4,500?
English
25
1
17
5K
Paul Glad Mihai@gladomat·
@jonashernlund @karpathy Just make it write LaTeX presentations with the beamer class. Much less code, much more compression, less token usage.
English
1
0
2
127
Jonas Hernlund@jonashernlund·
@karpathy HTML triples the output tokens for the same payload. Fine for one-off reports. Murders unit economics on chat products doing 100M turns a day. Output format is a cost decision before it's a UX one.
English
6
1
48
3.2K
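The cost argument in Jonas's reply above is simple arithmetic, and a rough sketch makes the scale visible. The 3x token multiplier and the 100M turns a day come from the reply; the tokens per turn and the price per million output tokens below are hypothetical placeholders, not figures from the thread.

```python
def daily_output_cost(turns_per_day: int, tokens_per_turn: int,
                      usd_per_million_tokens: float) -> float:
    """Daily spend on output tokens, in USD."""
    return turns_per_day * tokens_per_turn * usd_per_million_tokens / 1_000_000


TURNS_PER_DAY = 100_000_000   # from the reply
HTML_MULTIPLIER = 3           # "HTML triples the output tokens" (from the reply)
MARKDOWN_TOKENS = 300         # hypothetical average output tokens per turn
PRICE_PER_MILLION = 10.0      # hypothetical USD per 1M output tokens

markdown_cost = daily_output_cost(TURNS_PER_DAY, MARKDOWN_TOKENS, PRICE_PER_MILLION)
html_cost = daily_output_cost(
    TURNS_PER_DAY, MARKDOWN_TOKENS * HTML_MULTIPLIER, PRICE_PER_MILLION
)

print(f"markdown: ${markdown_cost:,.0f} per day")
print(f"html:     ${html_cost:,.0f} per day")
print(f"extra:    ${html_cost - markdown_cost:,.0f} per day")
```

Whatever the real price and turn length are, the extra spend scales linearly with both, so the multiplier matters most for high-volume products.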
Andrej Karpathy@karpathy·
This works really well btw: at the end of your query, ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.
More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. Around a third of our brains is a massively parallel processor dedicated to vision; it is the 10-lane superhighway of information into the brain. As AI improves, I think we'll see a progression that takes advantage:
1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default
...4,5,6,...
n) interactive neural videos/simulations
Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. Many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral x.com/zan2434/status…
There are also improvements necessary and pending at the input. Neither audio nor text nor video alone is enough, e.g. I feel a need to point/gesture to things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.
TLDR: The input/output mind meld between humans and AIs is ongoing and there is a lot of work to do and significant progress to be made, way before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip: try asking for HTML.
Thariq@trq212

x.com/i/article/2052…

English
997
2K
18.8K
3.6M
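The tip in Karpathy's post above (append an HTML instruction to the query, then open the generated file) can be sketched with the Python standard library alone. The helper names, the output file name, and the placeholder response are assumptions for illustration; the call to an actual LLM is deliberately left out because the post does not name one.

```python
import tempfile
import webbrowser
from pathlib import Path
from typing import Optional

# Instruction quoted from the post; everything else here is a hypothetical helper.
HTML_SUFFIX = "Structure your response as HTML."


def build_prompt(query: str) -> str:
    """Append the HTML instruction to the end of the user's query."""
    return f"{query.rstrip()}\n\n{HTML_SUFFIX}"


def save_html(response_text: str, directory: Optional[str] = None) -> Path:
    """Write the model's HTML response to response.html and return its path."""
    out_dir = Path(directory) if directory else Path(tempfile.mkdtemp())
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "response.html"
    out_path.write_text(response_text, encoding="utf-8")
    return out_path


def view_in_browser(path: Path) -> bool:
    """Open the saved file in the default browser (not called automatically)."""
    return webbrowser.open(path.resolve().as_uri())


# Placeholder standing in for whatever the LLM returned.
placeholder_response = "<html><body><h1>Report</h1><p>Example.</p></body></html>"
saved = save_html(placeholder_response)
print(build_prompt("Summarize this week's experiment results."))
print(f"saved to {saved}")
```

Calling `view_in_browser(saved)` afterwards opens the file; it is kept as a separate step so the script also works on a headless machine.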
Sandro@pupposandro·
Should I get a 256 GB VRAM, 8x Nvidia V100 SXM2 server for ~$1800? I'm so tempted. Pros: Amazing value for the price. Can fit big models, almost the same bandwidth as the 3090 (900 GB/s). Cons: V100s are very old (2017, Volta), no bf16 support.
English
62
9
347
49.4K
Allen@allenjosephaj·
The Mac M2 Ultra still wins for local LLMs. ~20% faster than M4 Max in many cases.
Why? Memory bandwidth:
M2 Ultra → 800 GB/s
M4 Max → 546 GB/s
For LLMs, bandwidth > everything.
Which makes this interesting: older, used Macs aren't a downgrade; they're often the smarter buy for local AI.
English
1
0
3
189
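The bandwidth argument in Allen's post above can be made concrete with a common back-of-the-envelope estimate: during single-stream decoding each generated token reads roughly all model weights once, so bandwidth divided by model size gives a ceiling on tokens per second. The two bandwidth figures are from the post; the 20 GB model size is a hypothetical example, and the estimate ignores compute, KV-cache traffic, and software overhead.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough ceiling on decode speed when generation is memory-bandwidth bound."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


MODEL_SIZE_GB = 20.0  # hypothetical quantized model size, not from the post

m2_ultra = decode_tokens_per_second(800.0, MODEL_SIZE_GB)  # 800 GB/s from the post
m4_max = decode_tokens_per_second(546.0, MODEL_SIZE_GB)    # 546 GB/s from the post

print(f"M2 Ultra ceiling: {m2_ultra:.1f} tok/s")
print(f"M4 Max ceiling:   {m4_max:.1f} tok/s")
print(f"bandwidth ratio:  {m2_ultra / m4_max:.2f}x")
```

The ratio is a ceiling, not a prediction: the ~20% gap quoted in the post is smaller than the raw bandwidth ratio, which is what one would expect once factors other than bandwidth come into play.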
Paul Glad Mihai reposted
Antonio Norelli@noranta4·
LLMs can hide a text in another text of the same length. I'll explain how, it is very simple, you'll understand before I finish, and smile. That's what I noticed during my #ICLR2026 poster session in Rio! 🇧🇷 Too bad you missed it, but let me remedy that now
English
22
63
602
97.7K
AgentSparko 💥@AgentSparko·
@pupposandro A DGX Spark OEM is $3300 USD, and if you really want a similar setup to what you have there, send me a DM and I'll show you a much better config.
English
1
0
0
121
Sandro@pupposandro·
Testing a Ryzen Strix Halo 128gb + RTX 3090 24gb setup atm. On paper it’s perfect: the 3090 handles speed, the Strix Halo handles memory, you can run everything well including dense or bigger models. The catch is connecting them together cleanly. Still working on that. Cost is ~ $4,000. Still cheaper than the DGX.
English
32
15
260
20.8K
Mayank Pratap Singh@Mayank_022·
I tested @huggingface ml-intern with the prompt "Fine-tune a Segment Anything Model (SAM) on a useful medical dataset. Train the model, and provide a comprehensive tutorial in a Jupyter Notebook file. Additionally, create a Hugging Face article/blog post documenting everything you have done."
It did it all autonomously:
- Researched via hf_papers & searched GitHub/HF Hub
- Found an HF dataset & wrote the finetuning script
- Trained it using HF compute (took ~1 hour)
- Pushed the weights & wrote the article
Here are the model weights, code, and the blog it generated:
hf article huggingface.co/Mayank022/blog…
model weights huggingface.co/Mayank022/sam-…
Awesome stuff @akseljoonas, looking forward to using this. 🔥
Aksel@akseljoonas

Introducing ml-intern, the agent that just automated the post-training team @huggingface
It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.
It can pull off crazy things:
We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.
In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.
For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.
How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains
ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like.
Releasing it today as a CLI and a web app you can use from your phone/desktop.
CLI: github.com/huggingface/ml…
Web + mobile: huggingface.co/spaces/smolage…
And the best part? We also provisioned $1k in GPU resources and Anthropic credits for the quickest among you to use.

English
7
62
654
94K
Marco Rodrigues@dadhalfdev·
Just tried the new infographic skill from @dotey in my Hermes Agent from @NousResearch. I gave it the URL of my new article. This is so much better than Excalidraw and draw.io skills. Amazing job! 👏
English
15
15
176
66.6K
Kimi.ai@Kimi_Moonshot·
Meet Kimi K2.6: Advancing Open-Source Coding
🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)
What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.
K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…
English
935
2.4K
18.2K
7.5M
Paul Glad Mihai reposted
Mayukh@mayukh_panja·
One thing academia does extremely well and startups and companies massively screw up is hiring. Hear me out!
A friend of mine, PhD in Astrophysics, solved a tough problem for his PhD: when light from stars travels through the Earth's atmosphere, the turbulence and density fluctuations cause the light rays to become "squiggly" instead of straight and the resulting image you get from a telescope becomes blurry. So he had to model atmospheric turbulence and then write a piece of software in C++ that inverts this problem to get de-blurred images. This involved understanding physics, maths, computation, a bit of ML and writing production-level code in C++.
When he tried to look for an industry job he simply couldn't find any. It was also hard to just get interviews.
The first problem is that recruiters, who are often deeply non-technical, look for specific keywords in CVs and they just don't know how to parse a non-standard CV. This is a guaranteed way of missing out on outlier candidates.
Second, a lot of hiring managers over-index on niche knowledge about a specific tool/framework/language and the ability to remember syntax off the top of your head. A solid researcher sees programming languages, machine learning, physics, maths etc as tools that are at their disposal and may not know/remember very specific information or every little detail about arbitrary technical things. The whole process essentially becomes a lottery.
This was how we hired at our Max Planck Institute: the candidate would be given a paper a week before the interview and the interviewer and the candidate would discuss it together. A second interview would entail asking the candidate about THEIR past work and checking if they deeply understood what they did. This interview format doesn't require the candidate to memorize stuff beforehand and is pretty much independent of the whims, fancies and "taste" of the hiring manager.
A lot of stuff is wrong with academia but this is an area where they do much much better than startups/companies.
English
29
111
1.2K
76.2K
Jabz@jabranthelawyer·
@sharbel Definitely going to try this out. I'm starting to really question sending my kids to school. Think home schooling in the near future
English
1
0
1
124
Sharbel@sharbel·
🚨 Someone open sourced an AI tutor that actually adapts to how you learn, not how the average student learns. It's called DeepTutor. +6,401 stars this week.
Why it's great: Every online course gives you the same videos in the same order at the same pace. You either keep up or you fall behind. There is no middle option. Building a personalized curriculum engine takes a team of engineers and millions in funding. Khan Academy spent 10 years on it. DeepTutor is a fully agent-native learning assistant. It builds your curriculum dynamically based on what you know, what you're struggling with, and how fast you're moving.
How to use it: Clone the repo and point it at any subject or document set. Tell it what you want to learn. The agent runs diagnostic questions first, maps your current knowledge state, then generates a personalized lesson sequence. It adjusts in real time as you answer.
No fixed curriculum. No one-size-fits-all pacing. Just a tutor that actually pays attention.
English
19
13
117
6.3K
Paul Glad Mihai@gladomat·
@onlinedopamine @predict_addict What's the amount paid out in welfare vs the amount that rich people don't pay in taxes through their loopholes? In the end, rich people want to live in a safe place and in a culture they know. They can move to the Middle East, but we all saw what happened there recently.
English
0
0
0
1
Vik@onlinedopamine·
my home country of germany is engaging in one of the biggest bag fumbles in human history the system is set up to advantage rich elite families and makes working hard absolutely senseless first of all, all rich families preserve their wealth by setting up foundations here's what they allow you to do according to claude: > Dividends flowing into a Stiftung being taxed at around 0.75% (corporate dividend exemptions are very favorable) > Capital gains on share sales within the structure being largely exempt > Inheritance tax being heavily reduced or deferred when assets sit inside a Stiftung rather than being passed directly to heirs meanwhile, people like my mum (a doctor with her own practice) are being rinsed left and right for example, at the end of last year, she had to pay almost 100k euros in pre-tax for 2026, meaning the german government said you made xyz in 2024 and 2025, so now already pay us for 2026 my mum will be fine but imagine you're running a saas or do any other type of online business suffering from crazy fluctuations the same, btw, applies to being an employee i started my career as an it consultant, making 60k euros annually. was able to get a pay raise of 10k per annum, which equated to about 250 euros more per month after all the deductions (just fucking lol) there is, as is said in the video, literally zero incentive to work hard in germany then add all this protectionist stuff like betriebsrat, basically being unable to fire employees, all the arbeitslosengeld they shove down people's throats, alongside rising energy prices, and you have a molotov cocktail that's about to explode i could go on and on but would recommend watching the video, great summary of all that's wrong ultimately, i'm just very worried about where this country will be headed in the next 10 - 20 years the afd is already gaining power and this will only accelerate as the economic situation worsens
Radical Living@RadicalFalk

I'm leaving Germany | Brutally Honest Review

English
33
16
391
47.9K
Guri Singh@heygurisingh·
Holy shit... Addy Osmani just dropped something that will make "Vibe Coders" lose their minds. It's called Agent Skills. Production-grade engineering workflows that force AI coding agents to actually behave like senior engineers, not interns shipping prototypes to prod.
→ Spec before code (no more "what are we even building")
→ Plan-mode task breakdown into verifiable chunks
→ Incremental implementation in thin vertical slices
→ TDD with the Prove-It pattern (reproduce bugs as failing tests first)
→ Chrome DevTools MCP so the agent has real eyes in the browser
→ 5-axis code review (correctness, readability, architecture, security, perf)
→ OWASP-aware security hardening
→ Git workflow, CI/CD gates, ADRs, staged rollouts
No more shortcuts. No more skipped tests. No more "seems right, ship it." Every skill has verification steps and an anti-rationalization table. Yes, an actual table of the excuses agents use to skip steps ("I'll add tests later") with documented rebuttals.
The wildest part? It works with Claude Code, Cursor, Windsurf, Copilot, and Codex. Plain markdown. Plug it in anywhere. This is the gap between AI that writes code and AI that ships software. 100% Open Source. MIT licensed. (Link in the comments)
English
54
123
954
82.2K
Paul Glad Mihai@gladomat·
@Teknium @j0hngou How do you handle the Google auth token resetting about once a week? It's a massive pain in the butt!
English
0
0
0
6
Teknium 🪽@Teknium·
From hermes :)
**For a PhD student specifically, the differentiators that matter:**
◆ **Research paper pipeline** - We have a `research-paper-writing` skill that covers end-to-end ML/AI paper writing: experiment design, statistical analysis, drafting, revision cycles, and submission formatting for NeurIPS/ICML/ICLR/ACL/AAAI/COLM. Claude Code can edit LaTeX; Hermes can help you write the paper.
◆ **arXiv integration** - Built-in `arxiv` skill searches and retrieves papers via the API. Combine with `ocr-and-documents` to ingest full PDFs, or `youtube-content` to transcribe conference talks. You can set up a **cron job** that monitors arXiv daily for papers in your research area and sends you a Telegram summary every morning. CC can't do scheduled tasks at all.
◆ **LLM Wiki** - Karpathy's wiki skill builds a persistent, interlinked markdown knowledge base. Feed it papers, lectures, notes; it compiles them into a queryable knowledge graph. Great for literature reviews and qualifying exam prep.
◆ **Persistent memory + session search** - Hermes remembers across sessions. Tell it about your research topic, your advisor's preferences, your lab's conventions, your paper deadlines. Next week when you say "draft the related work section," it already knows your context. CC starts fresh every time.
◆ **Jupyter live kernel** - `jupyter-live-kernel` skill gives stateful, iterative Python execution. Data exploration, plotting, ML experimentation with intermediate results: the actual data science workflow, not just writing scripts.
◆ **Gateway (Telegram/Discord/Slack)** - Message your agent from your phone. "Hey, what was that paper we discussed about attention mechanisms?" while you're at a conference. Or have it summarize your experiment results while you're away from your desk. CC is terminal-only.
◆ **Cron scheduling** - Automated recurring tasks: daily arXiv digest, weekly experiment status reports, funding opportunity alerts (we have an `ai-funding-daily-report` skill), Polymarket tracking for prediction markets. Set it and forget it.
◆ **ML tooling** - `huggingface-hub` for model/dataset management, `grpo-rl-training` for RL fine-tuning guidance, `gguf-quantization` for running models on consumer hardware, `manim-video` for 3Blue1Brown-style math animations (great for presentations and explainer content).
◆ **Productivity stack** - `google-workspace` (Gmail, Calendar, Drive, Sheets), `obsidian` for note-taking, `powerpoint` for presentations, `himalaya` for email, `excalidraw` for diagrams. One agent handles your entire workflow.
◆ **Self-hosted with any model** - He wants to run Qwen 3.5 on his own server. Hermes supports that natively. Point it at your local inference endpoint and go. CC is locked to Anthropic.
◆ **MCP + extensibility** - Native MCP client means he can connect to any MCP server (database tools, custom lab APIs, institutional services). Plus 100+ skills, a plugin system, and the ability to delegate sub-tasks to Claude Code or Codex as child agents when he does need pure coding power.
English
7
7
122
4.7K
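The daily arXiv digest mentioned in the reply above can be approximated without any agent, using the public arXiv API and the Python standard library. This is a minimal sketch and not the Hermes `arxiv` skill, whose internals the thread does not show; the function names, the `cs.LG` category, the sample feed, and the crontab line in the comment are all assumptions for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Tuple, Union

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # the API returns an Atom feed


def build_query_url(category: str, max_results: int = 10) -> str:
    """URL for the newest submissions in one arXiv category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_entries(atom_xml: Union[str, bytes]) -> List[Tuple[str, str]]:
    """Extract (title, abstract URL) pairs from an Atom feed."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        raw_title = entry.findtext("atom:title", default="", namespaces=ATOM)
        title = " ".join(raw_title.split())  # collapse line breaks in titles
        link = entry.findtext("atom:id", default="", namespaces=ATOM).strip()
        entries.append((title, link))
    return entries


def fetch_digest(category: str) -> List[Tuple[str, str]]:
    """Network call; run from a scheduler, e.g. a hypothetical crontab line:
    0 7 * * * python3 arxiv_digest.py
    """
    with urllib.request.urlopen(build_query_url(category), timeout=30) as resp:
        return parse_entries(resp.read())


# Offline sample with placeholder values, so the parser can be tried without network.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>A placeholder
      title</title>
  </entry>
</feed>"""

print(build_query_url("cs.LG"))
print(parse_entries(SAMPLE_FEED))
```

Sending the result to Telegram or another channel is left out on purpose; the list of pairs returned by `fetch_digest` is the piece any notifier would consume.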
John Gkountouras@j0hngou·
I really want to try Hermes Agent with Qwen 3.5 on my server, but I'm struggling to find a use case. I am already happy with CC for coding. Karpathy's wiki sounds like a cool idea. What else would be nice for a PhD student? @Teknium
English
3
1
21
4.2K