turboza 💭

2.5K posts

turboza 💭

@TurboZa

Wisdom & Love are the greatest gifts we can share

Bangkok, Thailand Katılım Ağustos 2009

1.7K Takip Edilen367 Takipçiler

turboza 💭 retweetledi

Oscar Balcells Obeso@OBalcells·9 Eyl

Imagine if ChatGPT highlighted every word it wasn't sure about. We built a streaming hallucination detector that flags hallucinations in real-time.

English

206

616

8.8K

745.2K

turboza 💭@TurboZa·28 May

Our team just did a small side project to create a tool to help us bring back the focus for a day. Currently available on Mac only. Let me know if you have any feedback. Download for Mac: tinkers.tech/fluo

English

turboza 💭 retweetledi

Josie Kins@Josikinz·28 Mar

I asked chatgpt's new image model to script and generate a series of comics starring itself as the main character. The results genuinely gave me chills. I'll post them all in a thread below.

English

1.6K

6.3K

61.3K

10M

turboza 💭@TurboZa·30 Oca

Talking to the Grab bike rider today helped me realize how fortunate I am. While I felt down trying to think of ways to accumulate more wealth, a lot of people are struggling just to make ends meet.

English

turboza 💭 retweetledi

Alex Reibman 🖇️@AlexReibman·5 Kas

OpenAI’s biggest rival is shaking things up. Anthropic invited 200+ elite hackers to their SF headquarters to see what’s possible with Claude Here’s what we saw at the @AnthropicAI x @MenloVentures Builder Day Hackathon (🧵):

English

667

7.5K

1.2M

turboza 💭 retweetledi

Massimo@Rainmaker1973·9 Nis

iPad mini + Mac Studio = Macintosh [📹 bbbigdeer]

Filipino

321

5.7K

41.2K

12.3M

turboza 💭 retweetledi

Peter Yang@petergyang·9 Mar

How @linear builds quality products: 1. Engineers and designers are empowered to make product decisions Relying only on PMs for product decisions is an anti-pattern. Instead, Linear hires the best and Nan (Linear's head of product) supports them by defining the problem clearly and sharing feedback along the way. 2. There are no durable product teams Instead, Linear puts temporary teams together to tackle a customer problem. At the company's current scale, this helps to avoid shipping the org chart. 3. Goals and OKRs only exist at the highest level Many companies introduce OKRs at the team and individual level too early. If you give two teams separate OKRs, they’ll likely go in two directions without care for the overall user experience. 4. Marketing is a product pillar Linear has a "magic team" that builds best-in-class landing pages that communicate why customers should care about a new feature. They put as much care into the marketing as the product. 📌 Learn more about Linear here: creatoreconomy.so/p/nan-yu-insid… P.S. They're hiring a PM!

English

675

146.6K

turboza 💭 retweetledi

elvis@omarsar0·9 Şub

The Gemini Advanced prompting guide is live! I've started to add prompt examples demonstrating Google's Gemini Advanced capabilities. If you are curious about prompts and tasks to try, this guide should be a good starting point. From preliminary experiments, Gemini Advanced shows promising capabilities around reasoning, math word problem solving, education tasks, code generation, image understanding, and a wide range of creative tasks. As I mentioned yesterday, it takes a bit of prompt tuning to get the right results but I think this will improve exponentially in future iterations. The safety guardrails are there and you will more often than not encounter them, especially when you are prompting tricky questions. These are just preliminary tests. We will continue to document capabilities and limitations. An extended model guide coming soon. Stay tuned! Link to guide below ↓

English

482

54.4K

turboza 💭 retweetledi

Crémieux@cremieuxrecueil·8 Şub

The relationship between parental degree choice and their kids' degree choices. It looks like some fields stay in the family much more than others.

English

286

26K

turboza 💭 retweetledi

AI at Meta@AIatMeta·24 Oca

Introducing 'Prompt Engineering with Llama 2' — an interactive guide covering prompt engineering & best practices for developers, researchers & enthusiasts working with large language models. Access the notebook in the llama-recipes repo ➡️ bit.ly/3Hu266D

English

469

236.1K

turboza 💭 retweetledi

Maxime Labonne@maximelabonne·22 Oca

🧑‍🔬👷 The LLM Course is now complete! I added the LLM Engineer Roadmap, a list of high-quality resources to build LLM-powered applications and deploy them. 💻 LLM Course: github.com/mlabonne/llm-c…

English

396

1.9K

162.1K

turboza 💭 retweetledi

Shubham Saboo@Saboo_Shubham_·20 Oca

Drag & Drop to build AI agents 🤯 Meet n8n, a visual UI for building AI agents with LangChain without writing a single line of Python code. Want to build automation around LLM apps with Nocode, get started now:

English

209

1.4K

222.5K

turboza 💭 retweetledi

Angry Tom@AngryTomtweets·14 Oca

The world's largest consumer tech event, CES 2024, has come to an end. Here are 10 reveals you don't want to miss from CES 2024 event: (Part II) 1. $1,400 Moonwalkers X by Swift that go 7 mph

English

449

1.5M

turboza 💭 retweetledi

Andrej Karpathy@karpathy·13 Oca

I touched on the idea of sleeper agent LLMs at the end of my recent video, as a likely major security challenge for LLMs (perhaps more devious than prompt injection). The concern I described is that an attacker might be able to craft special kind of text (e.g. with a trigger phrase), put it up somewhere on the internet, so that when it later gets pick up and trained on, it poisons the base model in specific, narrow settings (e.g. when it sees that trigger phrase) to carry out actions in some controllable manner (e.g. jailbreak, or data exfiltration). Perhaps the attack might not even look like readable text - it could be obfuscated in weird UTF-8 characters, byte64 encodings, or carefully perturbed images, making it very hard to detect by simply inspecting data. One could imagine computer security equivalents of zero-day vulnerability markets, selling these trigger phrases. To my knowledge the above attack hasn't been convincingly demonstrated yet. This paper studies a similar (slightly weaker?) setting, showing that given some (potentially poisoned) model, you can't "make it safe" just by applying the current/standard safety finetuning. The model doesn't learn to become safe across the board and can continue to misbehave in narrow ways that potentially only the attacker knows how to exploit. Here, the attack hides in the model weights instead of hiding in some data, so the more direct attack here looks like someone releasing a (secretly poisoned) open weights model, which others pick up, finetune and deploy, only to become secretly vulnerable. Well-worth studying directions in LLM security and expecting a lot more to follow.

Anthropic@AnthropicAI

New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through. arxiv.org/abs/2401.05566

English

211

682

4.9K

906.7K

turboza 💭 retweetledi

Andrej Karpathy@karpathy·9 Ara

# On the "hallucination problem" I always struggle a bit with I'm asked about the "hallucination problem" in LLMs. Because, in some sense, hallucination is all LLMs do. They are dream machines. We direct their dreams with prompts. The prompts start the dream, and based on the LLM's hazy recollection of its training documents, most of the time the result goes someplace useful. It's only when the dreams go into deemed factually incorrect territory that we label it a "hallucination". It looks like a bug, but it's just the LLM doing what it always does. At the other end of the extreme consider a search engine. It takes the prompt and just returns one of the most similar "training documents" it has in its database, verbatim. You could say that this search engine has a "creativity problem" - it will never respond with something new. An LLM is 100% dreaming and has the hallucination problem. A search engine is 0% dreaming and has the creativity problem. All that said, I realize that what people *actually* mean is they don't want an LLM Assistant (a product like ChatGPT etc.) to hallucinate. An LLM Assistant is a lot more complex system than just the LLM itself, even if one is at the heart of it. There are many ways to mitigate hallcuinations in these systems - using Retrieval Augmented Generation (RAG) to more strongly anchor the dreams in real data through in-context learning is maybe the most common one. Disagreements between multiple samples, reflection, verification chains. Decoding uncertainty from activations. Tool use. All an active and very interesting areas of research. TLDR I know I'm being super pedantic but the LLM has no "hallucination problem". Hallucination is not a bug, it is LLM's greatest feature. The LLM Assistant has a hallucination problem, and we should fix it. Okay I feel much better now :)

English

696

2.4K

14.8K

2.4M

turboza 💭 retweetledi

Peter Yang@petergyang·22 Ağu

Do you use slides or docs for making decisions? There's a 3rd choice that's better. A two-way writeup is a doc with 3 big improvements: 1. You know who has read it. A simple "Done reading?" button that readers can interact with tells you exactly who has read the doc. 2. You can gather structured feedback. A table labeled "How do you feel about this plan?" lets people leave their thoughts async and rate the plan out of 5. No more guessing what people think. 3. You can focus on the most important questions. Another table lets people add and upvote questions before the meeting. This way, only the most important topics are discussed live. My interview with @lshackleton (CPO @coda_hq) has a full breakdown of how two-way writeups work along with a template that you can try yourself. Sign up below to get it in your inbox tomorrow.

English

110

42.2K

turboza 💭 retweetledi

Tim Ferriss@tferriss·18 Ağu

If you consistently feel the counterproductive need for volume and doing lots of stuff, put these on a Post-it note: Being busy is a form of laziness—lazy thinking and indiscriminate action. Being busy is most often used as a guise for avoiding the few critically important but uncomfortable actions. And when—despite your best efforts—you feel like you’re losing at the game of life, remember: Even the best of the best feel this way sometimes. When I’m in the pit of despair, I recall what iconic writer Kurt Vonnegut said about his process: “When I write, I feel like an armless, legless man with a crayon in his mouth.”

English

291

1.8K

293.2K

turboza 💭 retweetledi

Lenny Rachitsky@lennysan·16 Ağu

4 ways for validate your B2B startup idea: 1. The do-it-manually path @christinacaci manually created compliance reports for a few companies and noticed (surprisingly) that they all found them very valuable. She then built @TrustVanta (last valued at $1.6B):