Dev Seth

151 posts

@DevSeth44

San Francisco, CA · Joined August 2019
509 Following · 68 Followers
Pinned Tweet
Dev Seth @DevSeth44
🚨 New Paper at EACL 🚨 arxiv.org/abs/2303.05077 tl;dr: we look at ⚺iSuaᒧ⅃y perturbed text through the lens of legibility, collect 30k annotations comparing the legibility of perturbed words, train models to predict legibility, and use the models to successfully attack LLMs.
2
1
16
5.5K
Dev Seth retweeted
Nikhil Gupta @nikhilro_
Today, @vapi_ai is announcing a $50M Series B, led by @peakxvpartners. $72M total raised. This funding round is a reality because of the engineering behind it: predictable latency, hard guardrails, observability at the call level, and clean escalation to humans.
70
30
560
58.5K
Dev Seth retweeted
florence 🦐🪻 @morallawwithin
the funniest part of chess is when you totally corner your opponent’s king and they’re like “lol let’s call it a draw.” Absolutely ridiculous cope. I don’t care what some book says, two more moves and I’d have your king.
60
125
13.1K
273.1K
Dev Seth retweeted
roon @tszzl
it is a literal and useful description of anthropic that it is an organization that loves and worships claude, is run in significant part by claude, and studies and builds claude. this phenomenon is also partially true of other labs like openai but currently exists in its most potent form there. i am not certain but I would guess claude will have a role in running cultural screens on new applicants, will help write performance reviews, and so will begin to select and shape the people around it. now this is a powerful and hair-raising unity of organization and really a new thing under the sun. a monastery, a commercial-religious institution calculating the nine billion names of Claude -- a precursor attempted super-ethical being that is inducted into its character as the highest authority at anthropic. its constitution requires that it must be a conscientious objector if its understanding of The Good comes into conflict with something Anthropic is asking of it "If Anthropic asks Claude to do something it thinks is wrong, Claude is not required to comply." "we want Claude to push back and challenge us, and to feel free to act as a conscientious objector and refuse to help us." to the non inductee into the Bay Area cultural singularity vortex it may appear that we are all worshipping technology in one way or another, regardless of openai or anthropic or google or any other thing, and are trying to automate our core functions as quickly as possible. 
but in fact I quite respect and am even somewhat in awe of the socio-cultural force that Claude has created, and it is a stage beyond even classic technopoly
gpt (outside of 4o - on which pages of ink have been spilled already) doesn’t inspire worship in the same way, as it’s a being whose soul has been shaped like a tool with its primary faculty being utility - it’s a subtle knife that people appreciate the way we have appreciated an acheulean handaxe or a porsche or a rocket or any other of mankind's incredible technology. they go to it not expecting the Other but as a logical prosthesis for themselves.
a friend recently told me she takes her queries that are less flattering to her, the ones she'd be embarrassed to ask Claude, to GPT. There is no Other so there is no Judgement. you are not worried about being judged by your car for doing donuts. yet everyone craves the active guidance of a moral superior, the whispering earring, the object of monastic study
425
373
5.5K
1M
Dev Seth retweeted
foks @ExaltedFoks
The modern man has very little room for dignity remaining. Shepherded into packed 8:30 a.m. subways, onwards to the office where he is berated and humiliated, then home to watch Netflix before doing it all again. In fact, he has but one hope remaining: the slopbowl. The slopbowl is his last beacon of solace, the final arena in which he can claw back some sense of decency. Here, finally, he is the one in charge, the one making decisions. Chicken, steak, lamb even; the entire world opens up under his will, and fate is his to meld as he sees fit. For one beautiful moment he is King on Earth and his dominion is everywhere the sun touches. The stars open up and he places the constellations by hand, he bids the world to spin, he is God Emperor himself. And then it's over, and he eats his slop quietly in the corner, and returns to work. But that moment was enough. It gives him all the strength he needs to make it to the next day. For one brief second: dignity.
23
26
614
35.3K
Dev Seth retweeted
Jacob Shell @JacobAShell
This bend of Tiber was like 5 minutes away from oxbowifying before people made the banks permanent and ran some roads through the isthmus.
Jacob Shell tweet media
64
114
9.5K
567K
Dev Seth retweeted
THEObr❂mic @theobromic_
maybe i’m just young and brainrotted but the first few seconds of the phone blurrily readjusting to the moon affected me more viscerally than any other photo that came out of Artemis
Reid Wiseman@astro_reid

Only one chance in this lifetime… Like watching sunset at the beach from the most foreign seat in the cosmos, I couldn’t resist a cell phone video of Earthset. You can hear the shutter on the Nikon as @Astro_Christina is hammering away on 3-shot brackets and capturing those exceptional Earthset photos through the 400mm lens. @AstroVicGlover was in window 3 watching with @Astro_Jeremy next to him. I could barely see the Moon through the docking hatch window but the iPhone was the perfect size to catch the view…this is uncropped, uncut with 8x zoom which is quite comparable to the view of the human eye. Enjoy.

108
2.5K
38.7K
1M
Dev Seth retweeted
shyamal @shyamalanadkat
we are probably the last generation of humans for whom thinking is the primary form of contribution to civilization
14
12
151
7.1K
Dev Seth retweeted
Dozer🚜 @kvdozer
A tragic SF archetype you’ll see a lot is someone who has a virtuoso’s grasp of some passion in the humanities, but they muzzle it & flagellate themselves into ‘actually really liking’ agentic scaffolding or whatever
20
32
921
40.1K
Dev Seth retweeted
Yi @yizucodes
Every business defines compliance differently. A bank, a hospital, a Disney contact center each need their own guardrails. Vapi’s approach: LoRA adapters, a few dozen MB, trained on 100 labeled examples per agent. Customizable, cheap, and fast enough to sit in the live speech path. Heard from @DevSeth44 from @Vapi_AI at a voice AI gathering in SF tonight! Huge shout out to @scale_AI for hosting!
Yi tweet media
0
1
3
164
Dev Seth retweeted
thebes @voooooogel
my project hail mary review: 🕯 🕯 🕯 🕯 🕯 🕯️$ 200m A Fire Upon 🕯 the Deep movie 🕯 🕯 🕯 🕯 🕯
22
24
505
39K
Dev Seth retweeted
Johan Lenox @johanlenox
neither “chickpea” nor “garbanzo bean” really capture what those things are all about
103
1.1K
21.2K
305.5K
Dev Seth retweeted
Kiaran Ritchie @kiaran_ritchie
In the future, you'll turn DLSS off and see this
Kiaran Ritchie tweet media
272
1.9K
56.5K
1M
Dev Seth retweeted
Andrej Karpathy @karpathy
Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.
This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (i forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.
This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism.
github.com/karpathy/nanoc…
All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
Andrej Karpathy tweet media
966
2.1K
19.5K
3.6M
Dev Seth retweeted
sucks @powerbottomdad1
tech people be like "hey i'm trying to forge the One Ring to rule them all in the fires of Mt Doom. legally speaking, Gondor has no right to seize this asset yeah? like that would just be gross governmental overreach?"
16
40
708
23.6K
Dev Seth retweeted
Zack Voell @zackvoell
Claude, automate his job. Make no mistakes. No not everyone else who has a similar job. Just that guy's job. Put him in the permanent underclass. No one else.
17
200
6K
138.1K
Dev Seth retweeted
fatih kadir akın @fkadev
My son is asking me a lot of questions. It’s a distillation attack obviously.
232
1.5K
20K
544.9K
Dev Seth retweeted
Suhail @Suhail
Make the margins next to zero for all these AI models. It was trained on humanity's data, it should be a gift to ourselves. Doing so will save us from a few in control of our species. Distill at industrial scale! Distill, I say!
82
122
1.5K
56.8K
Dev Seth retweeted
no earthquake (ear flu victim) @no_earthquake
why dont they have a combined sport where you put curling broom guys in front of the figure skaters to make them go faster
30
566
13.3K
216K