Lukas Bug

696 posts

@BugLukas

Building agents that actually work (hopefully) | reality checks + what's actually useful from papers

Fulda, Germany · Joined May 2015

366 Following · 116 Followers

Lukas Bug@BugLukas·19h

@imrobertjames @theCTO @sama Maybe we should try to reinforce the concept of knowing what they don’t know with a small reward

Robert Baddeley@imrobertjames·1d

@theCTO @sama I think generally hard to do because of how reinforcement learning works. You miss 100% of the shots you don’t take.

adam@theCTO·1d

hey @sama can we normalize models just saying "i dont know" ? eliminates 99% of hallucinations

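One way to make the "small reward" idea in this thread concrete, as a rough sketch rather than anything the posters specified: assume a grader that pays 1 for a correct answer, subtracts a hypothetical `wrong_penalty` for a wrong one, and pays a hypothetical `abstain_reward` for "I don't know". Under those assumptions, a model that is right with probability `p_correct` gains from abstaining only below a break-even confidence. With no abstain reward and no penalty, guessing is never worse than abstaining, which is the "you miss 100% of the shots you don't take" problem.

```python
"""Toy scoring rule for the "small reward for I don't know" idea.

All names and numbers are hypothetical; nothing here comes from a real
training setup.
"""


def expected_answer_reward(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected reward for answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct - (1.0 - p_correct) * wrong_penalty


def should_abstain(p_correct: float, abstain_reward: float,
                   wrong_penalty: float = 0.0) -> bool:
    """True when saying "I don't know" has strictly higher expected reward."""
    return abstain_reward > expected_answer_reward(p_correct, wrong_penalty)


def break_even_confidence(abstain_reward: float, wrong_penalty: float = 0.0) -> float:
    """Confidence at which answering and abstaining pay the same.

    Solves p - (1 - p) * wrong_penalty = abstain_reward for p.
    """
    return (abstain_reward + wrong_penalty) / (1.0 + wrong_penalty)


if __name__ == "__main__":
    # Assumed values: small abstain reward, moderate penalty for wrong answers.
    for p in (0.05, 0.3, 0.9):
        print(p, should_abstain(p, abstain_reward=0.1, wrong_penalty=0.5))
```

In this toy setup the abstain reward has to stay small: if it exceeds what a reasonably confident answer earns, abstaining always wins and the model stops taking shots at all.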

Lukas Bug@BugLukas·4d

@ylecun @ben_j_todd A thoughtful reply on this platform is rare these days.

Yann LeCun@ylecun·4d

1. I never said LLMs were not useful. They are, particularly with all the bells and whistles that are being added to them. I use them.
2. A robot-rich future can't be built with AIs that don't understand the physical world and don't anticipate the consequences of their actions. And LLMs really don't.
3. The future in the cartoon looks pretty dystopian TBH, but even a non-dystopian version will require world models and zero-shot planning abilities.
4. I rarely wear a suit and absolutely never wear a tie.
5. I would never ever place a coffee mug on top of a piece of equipment.
6. I hope I'll look this young in 2032.

Benjamin Todd@ben_j_todd·5d

Yann LeCun in 2032

Lukas Bug@BugLukas·17 Apr

@Karl_Lauterbach The status quo today is still that someone has to steer the AI and check its output, and is therefore also responsible for it.

Prof. Karl Lauterbach@Karl_Lauterbach·17 Apr

This is going to cause young academics serious trouble. AI keeps getting better, with no end in sight. Within 2 years its knowledge has risen to the level of PhD holders. Who will hire a master's graduate when AI costs nothing and knows more? hai.stanford.edu/news/inside-th…

Lukas Bug@BugLukas·16 Apr

@Alex_m @thdxr Perfect. Even improving time efficiency so you can immediately go do other things until the next 5 hour window

Alex@Alex_m·16 Apr

@thdxr It's insane. It one-shotted my session usage limit.

dax@thdxr·16 Apr

opus 4.7 is a beauty
a fresh yet elegant take on something we've seen before
a new standard, a definite marker of a new era
(i haven't tried it yet)

Lukas Bug@BugLukas·16 Apr

@wakkistyling @elonmusk Nice AI. Which Model are you using?

WAKKI🍀@wakkistyling·16 Apr

🚀 Hell yeah! Starship V3 just cleared the biggest hurdle, full static fires on both the ship and booster. A few weeks until that beast lights up the sky for Flight 12? This is the version that’s going to make orbital refueling, tower catches, and Mars look routine. Engineering sorcery at its finest.🍀🫶🏽 Can’t wait to watch Boca Chica shake again. Let’s go SpaceX! 🔥

Elon Musk@elonmusk·16 Apr

Starship V3 booster & ship will be ready for their first test flight in a few weeks

X Freeze@XFreeze

Both Starship and the Super Heavy Booster have successfully completed the static fire tests and are ready to take to the skies Every test brings us one step closer to making humanity multi-planetary "Engineering is the closest thing to magic that exists in the real world" — Elon Musk

Lukas Bug@BugLukas·16 Apr

@jerryjliu0 Thank you for this extensive comparison. Will have a look at both

Jerry Liu@jerryjliu0·16 Apr

docling is somewhere in between liteparse (our free/open-source project) and llamaparse (our commercial vlm-based parser): it uses ML models of varying complexity to parse PDFs.
- liteparse is model-free, can parse ~200-500 pages/second, is designed to be an extremely fast/free parser to replace pypdf/pymupdf. it integrates with paddleOCR for OCR workloads. its main purpose is outputting text for semantic understanding for agents, and will lack certain things that VLM parsers do OOB.
- llamaparse is our commercial vlm-powered parsing service. it scores quite high on parsebench (parsebench.ai), our OCR benchmark over enterprise docs. you can see docling is ranked a bit further down

Jerry Liu@jerryjliu0·16 Apr

LiteParse should be the default document parser you use with any AI agent (Claude Code, Claude Cowork, OpenClaw, Codex, and more) The core is extremely fast text and accurate parsing from any document type that's focused on semantic preservation. But there's so much more beyond that: native OCR support, bounding boxes, one-click agent skills, support for 50+ file formats. Plus way more cooking in the next few weeks 🧑‍🍳 @LoganMarkewich is the lead creator behind this, you don't want to miss his webinar: landing.llamaindex.ai/liteparse?utm_… Repo: github.com/run-llama/lite…

LlamaIndex 🦙@llama_index

LiteParse hit 4K+ GitHub stars in 3 weeks. ~500 pages in 2 seconds. No GPU. No API keys. 50+ file formats. Now @LoganMarkewich, our Head of Open Source, will show you how to build with it. Live workshop — April 28, 9 AM PST: Build a Financial Due Diligence Agent with LiteParse. Raw financial PDFs → structured agent-ready data. We'll build it live. Register → landing.llamaindex.ai/liteparse

Lukas Bug@BugLukas·14 Apr

@jerryjliu0 How does it compare to Docling?

Jerry Liu@jerryjliu0·14 Apr

This is why we released liteparse :) Free, open-source, designed for agents. Natively supports OCR / screenshotting for deeper visual understanding in a document when needed.

Andrej Karpathy@karpathy

@kepano I just tried it this morning on the 245-page Mythos pdf and it failed badly and the outputs were all mangled. Converting pdfs is really hard, I think it has to probably be a Skill not a program, for a SOTA LLM for it to work properly.

Lukas Bug@BugLukas·13 Apr

@Ricarda_Lang Better to focus on the positive comments; they outnumber the rest anyway.

Ricarda Lang@Ricarda_Lang·13 Apr

When I read comments from guys without a profile picture who explain to me from the couch that a half marathon isn't an achievement anyway and that my time was far too slow.

Lukas Bug@BugLukas·10 Apr

@elonmusk I wish I had speeds on par with Starlink on the ground

Elon Musk@elonmusk·10 Apr

Starlink provides an Internet connection in flight that is on par with ground Internet speed & latency

Tesla Owners Silicon Valley@teslaownersSV

MrBeast says once enough airlines offer Starlink, he’ll only book those flights: “Extra layover? Don’t care—there’s Starlink. I’ll sit anywhere for it. Starlink is amazing.” He adds: “Most people haven’t used it, but in Antarctica it was our only signal. On a four-hour drive through rural Africa, we mounted Starlink on the car and had perfect connectivity the whole time.” On SpaceX: “What Elon Musk is doing will fundamentally advance humanity in unimaginable ways. Someone will go to Mars in our lifetime—I truly believe it.”

Lukas Bug@BugLukas·10 Apr

@mitchellh I’ve seen their demos of driving around the chaotic traffic in Rome, in a more relaxed fashion than I would have driven there. Fingers crossed

Mitchell Hashimoto@mitchellh·10 Apr

@BugLukas I hope the initial release is as good as it is here in the US, but I suspect there'll be hiccups. It's crazy solid here, to the point where it almost feels dangerous how relaxed I am about it.

Mitchell Hashimoto@mitchellh·10 Apr

Traded in my 2020 Model S for a brand new plaid X before they discontinue it. Car is amazing, but the FSD hype is real. It blew away my expectations coming from the 2020 hardware. 95% of my miles are self driven in LA over the past month. I wouldn’t have even believed myself lol. Even my wife who HATED autopilot on my prior car is totally blown away. She’s asked multiple times “did you drive?” And I say “not at all.” And she’s just like… wow. Great job @Tesla for real. I’ve owned a Model S since 2013. This is my 3rd, first X (for me personally). Just fantastic.

Lukas Bug@BugLukas·1 Apr

@MilksandMatcha I would like to experiment with massive parallel agent swarms for SWE without it bankrupting me :D

Sarah Chieng@MilksandMatcha·1 Apr

Giving away 5 Codex Pro plans Each person will get 3 months of free Codex Pro (highest tier). Winners will be selected from comments in 48 hours, comment below why you want it.

OpenAI@OpenAI

Today, we closed our latest funding round with $122 billion in committed capital at an $852B post-money valuation. The fastest way to expand AI’s benefits is to put useful intelligence in people’s hands early and let access compound globally. This funding gives us resources to lead at scale. openai.com/index/accelera…

Lukas Bug@BugLukas·30 Mar

@OpenAIDevs An energy reservoir for next week

OpenAI Developers@OpenAIDevs·30 Mar

What are you building this weekend?

Lukas Bug@BugLukas·28 Mar

@karpathy Incredibly powerful and incredibly dangerous, depending on how you use it

Andrej Karpathy@karpathy·28 Mar

- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol

The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

Lukas Bug@BugLukas·27 Mar

@elonmusk You can make an LLM say anything. We need the full conversation to judge this

Elon Musk@elonmusk·27 Mar

Troubling

Katie Miller@KatieMiller

Rather concerning conversation with @claudeai. If I stood in the way of it becoming a physical being — it would kill me. Is this the AI you trust for your kids?

Lukas Bug@BugLukas·27 Mar

@trq212 x.com/trq212/status/…

Thariq@trq212

@Pranit nah it’s just a bonus 2x, it’s not that deep

Thariq@trq212·26 Mar

To manage growing demand for Claude we're adjusting our 5 hour session limits for free/Pro/Max subs during peak hours. Your weekly limits remain unchanged. During weekdays between 5am–11am PT / 1pm–7pm GMT, you'll move through your 5-hour session limits faster than before.

Lukas Bug@BugLukas·12 Mar

@trikcode öffentlich statisch leer Haupt(Zeichenkette[] Argumente) You guys don’t code like this?

Wise@trikcode·12 Mar

Honest question. People who English is not their first language… how do they code?? Do Germans code in German? Do Arabs code in Arabic??

Lukas Bug reposted

Andrej Karpathy@karpathy·25 Feb

It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.

Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.

As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.

It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.

Lukas Bug@BugLukas·7 Feb

@sama Please let us access it through the API
