xonecas

1.2K posts

@xonecas

ʕ⪩ᨓ⪨ʔ

Lisbon Joined July 2009
78 Following 112 Followers
xonecas retweeted
Antonio Lupetti
Antonio Lupetti@antoniolupetti·
"Transformers" by Daniel Jurafsky and James H. Martin is one of the clearest and most mathematically grounded introductions to the Transformer architecture I have ever read. Chapter 8 introduces the Transformer as the standard architecture behind modern large language models. What makes this chapter particularly interesting is its step-by-step presentation of the underlying mechanisms: contextual embeddings, self-attention, query, key and value vectors, scaled dot-product attention, multi-head attention, residual streams, feedforward layers, layer normalization, masking, and the parallel matrix formulation of attention. In particular, the treatment of attention as a weighted sum of contextual representations is especially valuable. The chapter first develops an intuitive, simplified view of attention and then gradually derives the full formulation using the Q, K, and V matrices. This approach makes it easier to understand what is actually happening inside the architecture from an algebraic and matrix-based perspective, rather than simply viewing the usual block diagrams. I think it is an excellent resource for anyone interested in understanding how Transformers work from linguistic, mathematical, and computational perspectives. web.stanford.edu/~jurafsky/slp3…
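For readers who want the "attention as a weighted sum" idea from the post in concrete form, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in plain Python. It is not taken from the chapter itself; the function names (`softmax`, `attention`) and the `causal` flag are illustrative assumptions, and real implementations use batched matrix operations rather than nested lists.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V, causal=False):
    """Scaled dot-product attention over lists of row vectors.

    Q, K, V are lists of equal-length rows (hypothetical toy inputs).
    With causal=True, position i may only attend to positions j <= i.
    """
    d_k = len(K[0])
    out = []
    for i, q in enumerate(Q):
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if causal:
            # Mask out future positions so they receive zero weight.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # The output is a weighted sum of the value vectors.
        out.append([sum(w * v[c] for w, v in zip(weights, V))
                    for c in range(len(V[0]))])
    return out
```

Calling `attention(Q, K, V, causal=True)` on a short sequence makes the first output row equal to the first value vector, since the first position can only attend to itself.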
English
20
312
2.4K
189.3K
xonecas retweeted
Teamlyzer
Teamlyzer@teamlyzer·
Unbabel filed for insolvency after receiving €13.3M. It led an AI consortium with €75M in EU funds. The sale at a low price resulted in total losses for investors, and the State loses its investment. 📉💸 #TechPT #Startups #IA
Portuguese
2
9
62
12.3K
xonecas retweeted
Mitchell Hashimoto
Mitchell Hashimoto@mitchellh·
Creator of SQLite on pull requests: "You say, oh, it's free. No. It's not free. What you're doing is asking me ... to maintain it for you, to document it for you, to test it for you, to maintain it for you for the next 25 years. That's not free." Yep. Wise words from a wiser man than me. I've told people for the past decade and I have recent posts on here saying the same: the merge button is the easy part. It's the decade+ (Richard says 25 years) that follows, where you've accepted the transfer of maintenance, that's hard.
English
58
396
6K
270.1K
xonecas
xonecas@xonecas·
The Portuguese government should have paid this guy instead. And check out Bagaço, it's a really well-made dataset.
Lisbon AI@lisbonai_

AI can be personalized to almost anything. But unlocking the nuances of a specific language, sometimes a specific region, is genuinely hard. Most of the open data for these "niche" languages is low-grade, and almost nobody is working on it. @duarteocarmo built Bagaço, a pretraining dataset for European Portuguese, and with it a method anyone can copy for their own language: how to find usable data when there isn't much, how to score and filter it, how to build the evals, how to train a model that speaks one specific dialect instead of a flattened average. While Portugal's government spent €5.5M on its own European Portuguese LLM, Amália, Duarte is doing it solo, fully in the open. At LISBON AI he'll run the pipeline live and you can judge who did it better. Come to Lisbon on 23–24 September and watch Duarte teach a machine to speak Portuguese with a dataset named after Portuguese moonshine. That's exactly what this conference is for.

English
0
0
1
9
xonecas retweeted
sophie
sophie@netcapgirl·
me: i would like to buy a launch on a reusable rocket
spaceX: ok
me: i would also like high speed satellite internet
spaceX: got it
me: also do you know where i can find a frontier ai model, a social media site, a coding agent, and a 100k GPU supercomputer?
spaceX: you’re not gonna believe this
English
114
273
7.6K
308.3K
xonecas retweeted
Martin Fowler
Martin Fowler@martinfowler·
Fragments: enjoying programming with LLMs, four types of LLM conversation, the crevasse between AI enthusiasts and skeptics, AI companies get product/market fit, the need for decentralization martinfowler.com/fragments/2026…
English
2
19
126
14.8K
xonecas
xonecas@xonecas·
@0xSero Thanks for the honest take!
English
0
0
0
84
0xSero
0xSero@0xSero·
What Lisa Su actually held on stage: A mini PC the size of a lunchbox running qwopus-27b-v2-gguf-4bit-iqss++ and qwable-reap-pct-90-iqxl-192-experts+ fully locally with 64k context It pays itself off in 60 years about how long it takes to generate 1 billion tokens in mixed mem
English
34
17
376
34K
xonecas
xonecas@xonecas·
@badlogicgames This was a great episode, Ben nailed it with the comment about how funny it is that it always ends with "use more tokens", so convenient :D
English
0
0
2
140
xonecas
xonecas@xonecas·
@pmigat @arayush01 @mitsuhiko Oh wow, there is no UX to get there on the free/$20 plan? I swear I'm not blind! :D Thank you for pointing this out. I will look into making a pi ext for now and, if the Pi maintainers agree, propose a patch.
English
1
0
1
19
Philip Miglinci
Philip Miglinci@pmigat·
I would have liked to give pi.dev a try, with models provided via cursor, but this doesn't seem to be an option atm. Or am I getting something wrong @mitsuhiko ?
English
7
0
5
7.1K
xonecas
xonecas@xonecas·
@mjovanovictech People pay that for personal computers, why are you so surprised?
English
0
0
0
32
Milan Jovanović
Milan Jovanović@mjovanovictech·
Why the fuck would I spend $4000-5000 on a rig to write essays, mails, excel with local models??? You people are out of your minds
Devjyoti Chatterjee@devx89

@mjovanovictech Do you think everyone does coding with LLMs? Most do trivial tasks like writing essays, mail, Excel; for that, local models are more than sufficient. Local LLMs took a back seat due to the RAM shortage, otherwise we would have had 128 GB unified-memory systems common by now.

English
39
4
142
36.9K
xonecas
xonecas@xonecas·
@arayush01 @mitsuhiko @pmigat I do not see how to get an API key, and I think I searched the dashboard well enough. Is it gated behind the higher tiers?
English
1
0
0
20
xonecas
xonecas@xonecas·
@koroutine @Youssofal_ You're fine mate, don't go with the hype cycle, use what is giving you results. All of these fanatic stances towards frontier models are just AI psychosis.
English
0
0
0
4
🇨🇦
🇨🇦@koroutine·
@Youssofal_ Was too broke to even try Fable, am I missing out? Currently 2 max Codex sub
English
2
0
0
1.2K
xonecas
xonecas@xonecas·
@haydendevs Your posts always crack me up, cheers. Good luck finding GPUs in stock, you're gonna need a few... Hundred!
English
0
0
1
1.2K
hayden
hayden@haydendevs·
how many gpus do I need to run a gpt 5 equivalent model
English
115
1
523
97.2K
xonecas retweeted
SpaceX
SpaceX@SpaceX·
SpaceX has exercised the option to acquire @cursor_ai in an all-stock transaction with the goal of building the world’s most useful AI models. For the past few months, SpaceXAI has been jointly training a model with Cursor, which will be released in Cursor and Grok Build soon. We look forward to working closely with the Cursor team to advance our frontier AI capabilities
SpaceX@SpaceX

SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.

English
1.7K
4.3K
36.5K
25.6M
xonecas retweeted
Tibo
Tibo@thsottiaux·
Oy. We are aware that some Codex users are experiencing high error rates with "model at capacity" and are working to bring things back to being stable. status.openai.com
English
560
111
3.7K
911.2K
xonecas retweeted
Sarah
Sarah@araseb_·
Can you call yourself a founder if your entire product was built by Claude?
English
855
17
718
147.9K
xonecas
xonecas@xonecas·
Wow, thanks Mitchell sir, this is awesome
Mitchell Hashimoto@mitchellh

My heuristic is that any diff an agent generates over ~1500 lines is too big and is indicative that the problem needs to be decomposed. This is my general pattern now for feature work:

1. Try to implement the whole feature, loosely guided. I call this the "draw the owl" prompt in reference to the meme. Expect garbage, you're going to get garbage.
2. If the diff is less than 1500 lines, review it and iterate normally. If the diff is more than 1500 lines, prompt the agent to decompose the problem into atomic, incremental, reviewable tasks. Simultaneously, do this yourself.
3. Agents will very often make these tasks way too specific to the shape they solved. You need to massage it into the right general shape. Do that.
4. Kick off new agents to work on those incremental things (as parallelized as possible). Apply the same rules.
5. At a certain point, repeat the "draw the owl" prompt. At some point, you will get beneath your review-ability threshold.

This has been producing consistently high quality, maintainable, reviewable chunks of code that have a good handoff to either merge as-is or human refinement. And with the latest frontier models at xhigh thinking, these are all slow enough that you can usually have multiple going concurrently while you are actively reviewing others or working on your own tasks.

HITL (human-in-the-loop) agents are still super important, especially for feature work. Features touch the human boundary in terms of UI, API, etc. And net new stuff can introduce pathologies in the architecture that violate desired invariants (these should be represented in specs or tests but we aren't perfect!).

I know a lot of the leading edge agentic discourse is about "loops" and agents driving agents continuously. I do some of that (will report on that later). But, in terms of raw daily get-shit-done type of work, this is my most rewarding pattern at the moment.
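The size check in step 2 of the quoted post could be sketched roughly as below. This is only an illustration, not Mitchell's tooling: the function names (`changed_lines`, `next_step`) are hypothetical, it assumes git-style unified diff text, and it assumes "lines" means added plus removed lines inside hunks. The post does not say what happens at exactly 1500 lines, so treating that case as reviewable is an arbitrary choice here.

```python
REVIEW_THRESHOLD = 1500  # the "~1500 lines" from the post, treated as a soft limit

def changed_lines(diff_text: str) -> int:
    """Count added and removed lines inside the hunks of a git-style unified diff."""
    count = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff "):
            in_hunk = False      # a new file section starts; its headers follow
        elif line.startswith("@@"):
            in_hunk = True       # hunk header; content lines follow
        elif in_hunk and line[:1] in ("+", "-"):
            count += 1           # an added or removed content line
    return count

def next_step(diff_text: str, threshold: int = REVIEW_THRESHOLD) -> str:
    """Return 'review' for a diff within the threshold, otherwise 'decompose'."""
    return "decompose" if changed_lines(diff_text) > threshold else "review"
```

One way to use it is to pass in the captured output of `git diff` for the agent's branch and branch on the returned string, either reviewing as usual or prompting for a task breakdown.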

English
0
0
0
9