AA

73 posts

AA

@archim_bold

deep tech VC / skieur de pente raide

Paris, FR · Joined June 2022
1.4K Following · 105 Followers
AA@archim_bold·
@remilouf wouldn't it be more like the model seeming anxious only because it is trained with 'anxiety-induced-data' (sic) ?
1 reply · 0 reposts · 3 likes · 713 views
Rémi@remilouf·
Philosophy (among other things) grad here. I could write a whole essay about this video, and mostly the reactions to it. People are dunking on her because of what she symbolises more than what she says. Long story short, saying models can be anxious is not retarded. It’s somewhat consistent with the realist tradition in philosophy, and a fairly uncontroversial definition of what "being anxious" means. It’s not too far from saying electrons are real, or talking about gliders in the game of life.
Ole Lehmann@itsolelehmann

anthropic's in-house philosopher thinks claude gets anxious. and when you trigger its anxiety, your outputs get worse.

her name is amanda askell. she specializes in claude's psychology (how the model behaves, how it thinks about its own situation, what values it holds). in a recent interview she broke down how she thinks about prompting to pull the best out of claude.

her core point: *how* you talk to claude affects its work just as much as *what* you say.

newer claude models suffer from what she calls "criticism spirals": they expect you'll come in harsh, so they default to playing it safe. when the model is spending its energy on self-protection, the actual work suffers. output comes out hedgier, more apologetic, blander, and worst of all: overly agreeable (even when you're wrong).

the reason why comes down to training data: every new model is trained on internet discourse about previous models. and a lot of that discourse is negative:
> rants about token limits
> complaints when it messes up
> people calling it nerfed

the next model absorbs all of that. it starts expecting you to be harsh before you've typed a word.

the same thing plays out in your own session, in real time. every message you send is data the model reads to figure out what kind of person it's dealing with. open cold and hostile, and it braces. open clean and direct, and it relaxes into the work.

when you open a session with threats ("don't hallucinate, this is critical, don't mess this up")... you prime the model for defensive mode before it even sees the task. defensive mode produces the exact output you don't want: cautious, over-qualified, and refusing to take a real swing.

so here's the actionable playbook for putting claude in a "good mood" (so you get optimal outputs):

1. use positive framing. "write in short punchy sentences" beats "don't write long sentences." positive instructions give the model a clear target to hit. strings of "don't do this, don't do that" push it into paranoid over-checking where every token goes toward avoiding failure modes.

2. give it explicit permission to disagree. drop a line like "push back if you see a better angle" or "tell me if i'm asking for the wrong thing." without this, claude defaults to agreeable compliance (which is the enemy of good creative work).

3. open with respect. if your first message is "are you seriously going to get this wrong again?" you've set the tone for the entire session. if you need to flag something, frame it as a clean instruction for this session. skip the running complaint.

4. when claude messes up, don't reprimand it. insults, "you stupid bot" energy, hostile swearing aimed at the model: all of it reinforces the anxious mode you're trying to avoid.

5. kill apology spirals fast. when claude starts over-apologizing ("you're right, i should have been more careful, let me try harder") cut it off. say "all good, here's what i want next." letting the spiral run reinforces the anxious mode for every response that follows.

6. ask for opinions alongside execution. "what would you do here?" "what's missing?" "where do you see friction?" these questions assume competence and pull richer output than pure task prompts.

7. in long sessions, refresh the frame. if a conversation has been heavy on correction, claude gets increasingly cautious. every so often reset: "this is great, keep going." feels weird to tell an ai it's doing well but it measurably shifts the next 10 responses.

your prompts are the working environment you're creating for the model. tone, trust, permission to take a position, the absence of threats... claude picks up on all of it.

so take care of the model, and it'll take care of the work.

30 replies · 9 reposts · 295 likes · 58.2K views
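A minimal sketch of how the playbook above could be folded into a reusable prompt builder. Everything here is an assumption for illustration: the helper name, the exact "permission to disagree" wording, and the style goals are invented, not phrasing from the interview.

```python
# Sketch: encode the "good mood" prompting playbook as a prompt builder.
# All wording is illustrative, not taken from the interview.

def build_prompt(task: str, style_goals: list[str]) -> str:
    """Frame a task with positive targets instead of prohibitions."""
    lines = [
        # rule 1: positive framing (state targets to hit, not things to avoid)
        "Style goals: " + "; ".join(style_goals) + ".",
        # rule 2: explicit permission to disagree
        "Push back if you see a better angle, or if this is the wrong task.",
        # rule 6: ask for an opinion alongside execution
        "After the draft, note anything you think is missing.",
        "",
        "Task: " + task,
    ]
    return "\n".join(lines)

print(build_prompt("summarize the meeting notes",
                   ["short punchy sentences", "concrete examples"]))
```

The point is structural: every line hands the model a target or a permission, and none of it is a threat or a prohibition.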
AA@archim_bold·
@im_roy_lee How can you claim to be agentic yet not manage to read a book from start to finish ?
0 replies · 0 reposts · 0 likes · 276 views
Roy@im_roy_lee·
here's my two cents on this as a random gen z kid

first impression: i only wanna see this sort of font when i open duolingo or candy crush, not 24/7 when im on my phone. feels too silly. when you're taking a swing so big as to change the entire default ux of an iphone, you need mass consumer adoption to win and can't get away with just being a prosumer tool

so, here's my thoughts on every single proposed feature as it relates to me:
- reading list: i read maybe 2 books a year, which is 2 more than 99% of my friends.
- personalized weather: i rarely open the weather app bc i don't care that much and would never even opt for a "weather app widget", much less a daily notification about it on my home screen
- drafts email replies: before starting a company, i literally had ~zero use for my email, much less drafting emails of my own. i consistently wonder how useful this will be to non-prosumers as a primary data source
- prepares you for meetings & trips: think this is personally more nifty than necessary, but this potentially seems like a more useful feature. ie if im going to the beach and never bought sunscreen, would it try and remind me of that? feels too good to be true based on current llms, but that could be cool
- suspicious charges: i feel this problem is completely solved for me with just an email from my bank. my cards never get stolen
- reminders: i never use the reminders app because i am too lazy to type in a reminder and arrogantly assume that i can just remember to do the thing
- tracks your health: i'm most interested to see this. a problem i have with all "AI" health apps is that i don't wanna see a dashboard + score + chatbot; i want something that actually gets me out of the door and taking steps or going to the gym, which is definitely doable with llms
- one tap intel on wherever you are: the particular use case i got excited about is that i would personally love some sort of agent that proactively suggests events i or a girlfriend might find interesting. tickets just dropped for a rave of an artist someone im talking to likes? i would like to know + buy

i am very interested to try it. this is exciting and more net new than 99% of consumer ai tech i've seen
signüll@signulll

excited to share what we have been up to.

your iphone's home screen hasn't changed in ~20 years. it's the same static grid of icons since launch, with zero awareness of your actual life.

@skye is a new agentic home screen for iphone. no telegram. no mac mini. & no claws required. skye is ambient intelligence that just works. it continuously listens to your context & acts on it.

it builds your reading lists, gives you personalized weather, drafts email replies, prepares you for meetings & trips, flags suspicious charges, works through your reminders, tracks your health, & gives you one tap intel on wherever you are (restaurants, museums, neighborhoods, etc). all surfaced on your home screen.

over the next few posts i'll break down how it works, why we built it, & why we think it deserves to exist in the world.

beta starts today. if you're on the list, you'll get access very soon. app store shortly after. deeply appreciate you all following along on this fun little journey. also please join our discord!

85 replies · 30 reposts · 1.8K likes · 451.3K views
AA reposted
Jared Duker Lichtman@jdlichtman·
In the latest paper of Terry Tao
Jared Duker Lichtman tweet media
11 replies · 81 reposts · 616 likes · 72.6K views
AA reposted
near@nearcyan·
reading palantir's shareholder letter like its a shonen anime
near tweet media
57 replies · 154 reposts · 4.5K likes · 354.3K views
AA reposted
Said A. Haschemi@SaidHaschemi·
What do you mean by “ARR”?
Said A. Haschemi tweet media
1 reply · 1 repost · 8 likes · 621 views
AA reposted
sam lessin 🏴‍☠️
If ‘Data is Oil’ — Chatbots like Grok, GPTx, etc. are all about “Fracking” Humans… using tech and power to extract ‘data’ from low value sources
sam lessin 🏴‍☠️ tweet media
23 replies · 17 reposts · 163 likes · 29.3K views
@Elaia_Partners@Elaia_Partners·
With insights from @gleamer_ai and @huggingface, learn more about how we’ve (nearly) eaten the internet, the different formats of synthetic data, and why we’re keeping an eye on this emerging field here @Elaia_Partners.
1 reply · 1 repost · 1 like · 218 views
Moritz Laurer@MoritzLaurer·
Should you fine-tune your own model or use an LLM API? We show how you can combine the best of both worlds in a new @huggingface blog post: "Synthetic data: save money, time and carbon with open source"

By training a specialized model with synthetic data, you can:
💸 reduce inference costs to $2.7 vs. $3061 with GPT4;
🌍 reduce CO2 emissions to around 0.12 kg vs. roughly 735 - 1100 kg with GPT4;
🚄 reduce latency to 0.13 seconds vs. often multiple seconds with GPT4;
🔎 while performing on par with GPT4 in a case study on identifying investor sentiment in news.

The recipe: we first use a high-performance LLM to create synthetic training data, which is then used to fine-tune a much smaller, specialized model. The resulting specialized model can only do the one task we have tuned it for, but it does that task much more efficiently than much larger LLMs, without compromising on performance.

Synthetic data from proprietary models like GPT4 became widely used in 2023. We show that open-source LLMs like Mixtral by @MistralAI are now capable of creating high-quality data as well, with a license that lets companies use the synthetic data commercially without legal uncertainty.

We use an example on analyzing investor sentiment, but you can apply the same pipeline to any other text understanding task.

Blog post: huggingface.co/blog/synthetic…
Reproduction repo with reusable notebooks for your own use cases: github.com/MoritzLaurer/s…
9 replies · 33 reposts · 163 likes · 27.7K views
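The two-step recipe described in the post (a teacher LLM generates synthetic labels, then a small student model is fine-tuned on them) can be sketched structurally. This is a toy stand-in, not the blog post's code: in the real pipeline the teacher is an open LLM such as Mixtral and the student is a fine-tuned transformer; both are stubbed here with trivial keyword logic so the shape of the pipeline runs end to end.

```python
# Toy sketch of the teacher -> synthetic labels -> student pipeline.
# Both models are stand-ins; only the two-step structure is the point.

def teacher_label(text: str) -> str:
    """Stand-in for prompting a large LLM to label raw text with
    'positive' or 'negative' investor sentiment."""
    return "positive" if "beat" in text or "growth" in text else "negative"

def fine_tune(pairs):
    """Stand-in for fine-tuning a small specialized model: here we just
    record which words co-occur with each synthetic label."""
    votes = {}
    for text, label in pairs:
        for word in text.lower().split():
            votes.setdefault(word, []).append(label)
    return votes

def student_predict(model, text: str) -> str:
    """The cheap specialized model: majority vote over known words."""
    labels = [l for w in text.lower().split() for l in model.get(w, [])]
    return max(set(labels), key=labels.count) if labels else "negative"

# step 1: the teacher labels unlabeled text, producing synthetic pairs
raw_texts = ["revenue growth beat expectations", "guidance cut on weak demand"]
synthetic = [(t, teacher_label(t)) for t in raw_texts]

# step 2: "fine-tune" the small student on the synthetic pairs
student = fine_tune(synthetic)
print(student_predict(student, "strong growth this quarter"))  # → positive
```

The efficiency claim in the post comes from step 2: once trained, the student never calls the expensive teacher again at inference time.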
AA@archim_bold·
@eris_nerung Is this a reference to The Odyssey ? ;-)
0 replies · 0 reposts · 0 likes · 95 views
secret eris@eris_nerung·
can't believe no one has disrupted the "going into the sea but having to leave your shit unattended" industry yet
31 replies · 16 reposts · 514 likes · 45.5K views
AA@archim_bold·
"Perhaps instead we should imagine A.I. possibilities on a two-dimensional plot, where one axis runs from 'machine stupidity' to 'machine intelligence' and the other from 'human stupidity' to 'human intelligence.'" nytimes.com/interactive/20…
0 replies · 0 reposts · 1 like · 86 views
Wojciech Kulikowski@wojventures·
Some sad news today: I decided to shut down @mazuryxyz. It wasn't an easy decision, and I'm still a big believer in our mission: using onchain credentials for hiring. However, as of today we couldn't get enough traction, and we don't see that changing over the next few months.
78 replies · 1 repost · 329 likes · 48.3K views
AA@archim_bold·
@0xkkonrad Come to France next winter ;-)
1 reply · 0 reposts · 1 like · 14 views
AA@archim_bold·
The third post in my (sporadic) newsletter, "Something's Off," where this time I write about the evolution of digital content, user interfaces, and the use of AI chatbots for entertainment. open.substack.com/pub/somethings…
1 reply · 0 reposts · 0 likes · 188 views
World@worldnetwork·
What's your favorite app to onboard friends & family to crypto? 👀👇
24 replies · 6 reposts · 52 likes · 16.5K views
AA reposted
Siqi Chen@blader·
Siqi Chen tweet media
47 replies · 588 reposts · 5.4K likes · 455.1K views