
Thomas Wilde
1.1K posts

@tdoubleyoo
#entrepreneur #professor #datascience #sceptic #failingminimalist
Munich, Germany. Joined May 2008
134 following, 130 followers

@CosineAI hey, I'm stuck in your trial: I can neither delete my account (I'm the only team member) nor cancel the trial or change the subscription without paying immediately. How do I get out of there??
Thomas Wilde retweeted

Really enjoyed LinkedIn's report on what worked and what didn't when deploying LLM applications. 4 takeaways.
1. Structured outputs
They chose YAML over JSON as the output format because YAML uses fewer tokens. Initially, only 90% of the outputs were correctly formatted YAML. They used re-prompting (asking the model to fix its YAML responses), which increased the number of API calls significantly.
They then analyzed the common formatting errors, added those hints to the original prompt, and wrote an error fixing script. This reduced their errors to 0.01%.
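A minimal sketch of the "error fixing script" idea, not LinkedIn's actual code: try to parse the model output, and only if that fails apply cheap deterministic repairs before retrying, so no extra API call is needed. The function names, the chosen repairs (dropping code fences, chatter lines, and stray indentation), and the flat `key: value` parser standing in for a real YAML library are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a real YAML parser: only flat `key: value` lines.
KEY_VALUE = re.compile(r"([A-Za-z_][\w-]*):\s*(.*)")


def parse_flat_yaml(text):
    """Parse flat `key: value` lines; raise ValueError on anything else."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        match = KEY_VALUE.fullmatch(line)
        if match is None:
            raise ValueError(f"not a key-value line: {line!r}")
        result[match.group(1)] = match.group(2).strip()
    return result


def repair(text):
    """Keep only lines that look like `key: value` once indentation is removed.

    This drops Markdown code fences and conversational chatter, and it
    flattens stray indentation. The error classes are assumed, not measured.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if KEY_VALUE.fullmatch(stripped):
            kept.append(stripped)
    return "\n".join(kept)


def parse_with_repair(text):
    """Return (parsed, was_repaired). Repair is attempted once, locally."""
    try:
        return parse_flat_yaml(text), False
    except ValueError:
        return parse_flat_yaml(repair(text)), True


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime to keep this example fence-safe
    raw = f"Sure, here is the result:\n{fence}yaml\nname: Ada\n  role: admin\n{fence}"
    print(parse_with_repair(raw))
```

In a real pipeline the fallback order would presumably be: parse, then local repair, then re-prompting only as a last resort, which is what keeps the number of API calls down.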


@wholemars Just let all EU FSD owners become registered "Testers", we sign a contract to only use it appropriately for "validation" and Tesla pays us 1 EUR per Month for the effort to make sure no-one can complain. Considering a lot also have shares, we should be able to test "our" product
Thomas Wilde retweeted

Exciting times ahead as the new UNECE DCAS regulation opens doors for @Tesla's FSD beta in Europe! 🚀 @ElonMusk, @TeslaEurope, with this significant milestone, let's accelerate discussions on Tesla's expansion across Europe. The future of mobility awaits. 🌍🔋#TeslaEurope #FSD
Scrais @Scrin_Mais
BREAKING NEWS The new UNECE DCAS regulation was adopted a few minutes ago. Once it enters force later this year, Tesla will finally be able to expand its FSD beta to Europe and other parts of the world! 🎉
Thomas Wilde retweeted

With many 🧩 dropping recently, a more complete picture is emerging of LLMs not as a chatbot, but the kernel process of a new Operating System. E.g. today it orchestrates:
- Input & Output across modalities (text, audio, vision)
- Code interpreter, ability to write & run programs
- Browser / internet access
- Embeddings database for files and internal memory storage & retrieval
A lot of computing concepts carry over. Currently we have single-threaded execution running at ~10Hz (tok/s) and enjoy looking at the assembly-level execution traces stream by. Concepts from computer security carry over, with attacks, defenses and emerging vulnerabilities.
I also like the nearest neighbor analogy of "Operating System" because the industry is starting to shape up similar:
Windows, OS X, and Linux <-> GPT, PaLM, Claude, and Llama/Mistral(?:)).
An OS comes with default apps but has an app store.
Most apps can be adapted to multiple platforms.
TLDR looking at LLMs as chatbots is the same as looking at early computers as calculators. We're seeing an emergence of a whole new computing paradigm, and it is very early.

Thomas Wilde retweeted

Anyone who doesn't understand the graphic on the left by @katjaberlin may understand the current reports on the consequences of the severe storms in #Italien. #DontLookUp


Thomas Wilde retweeted

I suspect GPT-4's performance is influenced by data contamination, at least on Codeforces.
Of the easiest problems on Codeforces, it solved 10/10 pre-2021 problems and 0/10 recent problems.
This strongly points to contamination.
1/4
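To put a number on how lopsided a 10/10 versus 0/10 split is, here is a small sketch using a one-sided Fisher exact test. The choice of test and the function name are assumptions for illustration and are not part of the original thread. The result also says nothing about whether the two problem sets were equally hard or randomly sampled.

```python
from math import comb


def fisher_one_sided(solved_old, total_old, solved_new, total_new):
    """Probability of seeing at least `solved_old` of the solved problems
    in the old group if solve rate were unrelated to problem age."""
    total = total_old + total_new
    solved = solved_old + solved_new
    upper = min(total_old, solved)
    favourable = sum(
        comb(total_old, k) * comb(total_new, solved - k)
        for k in range(solved_old, upper + 1)
    )
    return favourable / comb(total, solved)


if __name__ == "__main__":
    # 10 of 10 pre-2021 problems solved, 0 of 10 recent problems solved.
    p = fisher_one_sided(10, 10, 0, 10)
    print(f"one-sided p-value: {p:.2e}")
```

For the split quoted above, only one of the C(20, 10) = 184756 equally likely arrangements is this extreme, so the p-value is 1/184756, roughly 5.4e-06.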


Horace He @cHHillee
How is it even … possible to have a codeforces rating of 392? That’s very low. Like, my understanding was as long as you participated in a couple of contests (regardless of how you did), you'd have a rating above 392.