devin ivy
796 posts

devin ivy
@devinivy
@bluesky / I know nothing about routes 🦋 https://t.co/imDd62ld71
Joined August 2009
707 Following · 375 Followers

@soyjuanarbol i still prefer CJS. it's just not worth the nuisance anymore though

@iavins @laneshetron actually hundreds of thousands! many of our pdses service 500k users each.

@laneshetron I had the same confusion!
but apparently it is one SQLite DB per user. A PDS will host thousands of them.
You may check this PR which replaced Postgres with SQLite:
github.com/bluesky-social…


so... they replaced Postgres with a combination of Scylla and SQLite, because Postgres couldn't scale with the site's growth.
There is one SQLite db per user, which keeps all the data and is source of truth. Scylla acts like a View layer on top of it, providing different views!

v @iavins
I learned that Bluesky uses one SQLite database per user, as God intended. Where can I read more about its high level architecture?

@connerdelights @dan_abramov2 there's also @martinkl's evergreen reminder 💎
> Whether something is decentralised or not is a function of the administrative control of different parts of the system, not a function of the network topology.
bsky.app/profile/martin…

@dan_abramov2 true. I like that a lot, "decentralizing the right thing" — maybe use that?

@jon_raRaRa @dan_abramov2 @ItamarGronich the hosts scale very well because each repository only needs to scale to 1 person's usage. so it plays nice with single-tenant architectures. we host >300k users on each of our 20 hosts today. apps sync and index/aggregate the data low-latency as needed, as a user you can't tell.

@dan_abramov2 @ItamarGronich How does this scale behind the scenes? Will each node/shard end up hosting terabytes of data from millions of apps? How reliable would this data be if it's fragmented, and what are common use cases for using fragmented data? Genuinely curious.

@stevekrouse @honojs @tursodatabase @bluesky this is a super active line of work for us! i expect to be releasing SDKs, docs, examples, and dogfooding it in production within the week. there's a tracking discussion for it here: github.com/bluesky-social…

Val Town apps need an auth solution so you can import users
import { auth } from 'somewhere'
async function handler(req: Request): Promise<Response> { ... }
export default auth(handler)
1) Works with web-standard req/res, @honojs , etc
2) Works with sqlite (@tursodatabase)
3) No OAuth provider registration or API keys needed
4) Popular & battle-tested
Looking at @nextauthjs, @lucia_auth... where else??

@bairun_ @dan_abramov2 @xc1427 the relay maintains a copy of the repository, and as long as the repository can be verified the relay will serve it. if the repo becomes invalid, e.g. by mistake, then it can still become valid again in the future and the relay would continue to serve it.

@devinivy @dan_abramov2 @xc1427 So what happens if a proof is invalid/incorrect or data verification fails? Does the relay just not emit data for that record?

@bairun_ @dan_abramov2 the repository looks like a tree, and the stream helps transmit incremental changes to that tree as users write to their repos. if a consumer jumps in the middle of the stream and sees writes to a repo it hasn't encountered before, it might fetch the full repo then pick back up.
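
The "fetch the full repo then pick back up" behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Commit`, `RepoState`, `createConsumer`, and `fetchFullRepo` are hypothetical names, not the atproto API, and a repo is modelled as a flat map rather than a real tree.

```typescript
// Hypothetical shapes for illustration only; not the real atproto types.
type Commit = {
  repo: string; // identifier of the repo that was written to
  ops: { path: string; value: string | null }[]; // null value means "delete"
};
type RepoState = Map<string, string>; // record path -> record value

// fetchFullRepo stands in for asking some host (a PDS, a relay, or anyone
// holding a copy) for a full snapshot of the repo.
function createConsumer(fetchFullRepo: (repo: string) => RepoState) {
  const repos = new Map<string, RepoState>();

  return {
    handle(commit: Commit): void {
      let state = repos.get(commit.repo);
      if (state === undefined) {
        // Joined mid-stream and never saw this repo: backfill it first.
        state = new Map(fetchFullRepo(commit.repo));
        repos.set(commit.repo, state);
      }
      // Then pick back up by applying the incremental change.
      for (const op of commit.ops) {
        if (op.value === null) state.delete(op.path);
        else state.set(op.path, op.value);
      }
    },
    get(repo: string, path: string): string | undefined {
      return repos.get(repo)?.get(path);
    },
  };
}
```

Applying an op that the fetched snapshot already contains is harmless here because set and delete are idempotent. A real consumer would presumably also compare revisions and verify the repo before trusting it.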

@devinivy @dan_abramov2 This might just be a fundamental misunderstanding of what the stream is actually emitting, but there's _some_ state that the relayer needs in order to emit all the events from the beginning. Every new relayer would need that state I'm assuming

@bairun_ @dan_abramov2 no problem 🙏 accurate! though it's a little nicer than the web in this case because my PDS can go offline and if someone has a copy on hand you can still verify it's really my data. i sometimes think of self-authenticating data as being "archival."

@devinivy @dan_abramov2 Cool that makes sense! Thanks for answering. I'm assuming there's no data availability guarantees or anything. If the data is at the specified place then great, but if not then oh well, similar to the web.

@bairun_ @dan_abramov2 similar to google discovering hosts on the web, there's not one but many ways! the data is addressed & contains links, and those links dereference to hosts, so you can crawl links. hosts can also announce themselves to a relay. our PLC identity system helps relays with discovery too.

@devinivy @dan_abramov2 Probably a deeper question: if someone needs to spin up a new relayer, how do they know where all the hosted data is? I'm assuming when you post to the stream you include that data somewhere.

@bairun_ @dan_abramov2 all the repository data is self-authenticating, so anyone can host it! my pds hosts my repo and the bsky relay rehosts it. if you want to grab my repo, you can get it from my pds, the relay, or someone else who happens to have a copy. hosting is designed to be cheap/commoditized.

@dan_abramov2 @devinivy As far as I can tell from the docs the data repositories are on IPLD, so whoever is hosting those nodes right now (assuming it’s bsky) would have to keep running those nodes and continue to pin the repository data indefinitely (or until someone can copy it)

@mattpocockuk @ukslim you can stream json value-by-value but it's slow, esp. compared to JSON.parse() which is highly optimized. my approach for many use cases is to break up the stream by array item: github.com/devinivy/clown…
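
One way to read "break up the stream by array item" is sketched below: scan the incoming text just enough to find the boundaries of each top-level array element, then hand each complete element to `JSON.parse()`. This is an independent sketch, not the code from the linked repository. `parseArrayItems` is a hypothetical name, and the input is assumed to be a single well-formed top-level JSON array arriving as string chunks.

```typescript
// Yields each element of a top-level JSON array as soon as its text is complete.
// Assumption: the chunks, concatenated, form one well-formed JSON array.
function* parseArrayItems(chunks: Iterable<string>): Generator<unknown> {
  let depth = 0; // nesting of [] and {}; 1 means directly inside the top-level array
  let inString = false; // currently inside a JSON string literal
  let escaped = false; // previous character in the string was a backslash
  let item = ""; // raw text of the element being collected

  for (const chunk of chunks) {
    for (const ch of chunk) {
      if (inString) {
        item += ch;
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') {
        inString = true;
        item += ch;
      } else if (ch === "[" || ch === "{") {
        depth++;
        if (depth > 1) item += ch; // skip the top-level array's own "["
      } else if (ch === "]" || ch === "}") {
        depth--;
        if (depth >= 1) {
          item += ch;
        } else if (item.trim() !== "") {
          // Closing "]" of the top-level array: flush the last element.
          yield JSON.parse(item);
          item = "";
        }
      } else if (ch === "," && depth === 1) {
        // A comma directly inside the top-level array ends an element.
        yield JSON.parse(item);
        item = "";
      } else if (depth >= 1) {
        item += ch;
      }
    }
  }
}
```

The scanner only tracks brackets, braces, and string state, so the actual value parsing still goes through `JSON.parse()` once per element, and memory use is bounded by the largest single element rather than the whole array.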

@matthewstoller is that inherited from the 2015 rules, or part of the new provisions?

This is not net neutrality. The FCC's new rules allow a fast lane for big tech if they pay for it.
FCC @FCC
Today we voted to restore Net Neutrality. This reestablishes a national open internet standard to protect consumers, defend national security, and advance public safety. fcc.gov/document/fcc-r…
