Pinned Tweet
Lars Strojny
8K posts

Lars Strojny
@lstrojny
@[email protected] Short-term pessimist, long-term optimist.
Munich, Bavaria · Joined March 2007
1.6K Following · 833 Followers
Lars Strojny retweeted

Hey #PHP,
Casual Saturday morning ...
0.03 seconds on bare metal with avx2, 0.05 (if the timer can be believed) in your browser with wasm simd ...
krakjoe.github.io/ort/
"The Solution" tab is where you run code ...
#php #ai

Lars Strojny retweeted

> our implementation was not developed by Mozilla employees, but was contributed entirely by a single volunteer, André Bargull.
Temporal is an absolutely massive spec and complicated implementation - it's been YEARS in the making.
André out here just beating out billion dollar corps

Wes Bos@wesbos
ITS HAPPENING Firefox 139 shipped today making it the first browser to support Temporal - the new API for working with Dates, times, timezones + durations

@jessfraz I’ve built a teeny-tiny abstraction so that I can have something I call universal apps in individual files, each of which is installation plus configuration. So hosts just select "install the git app" and get git plus config, incl. shell integration etc.

@jessfraz Yes. On the Mac I also use it to drive homebrew installation. GUI packages on Mac are a bit better maintained in homebrew than in nix, so I mix and match between nix and brew as a package source. Brew doesn’t provide the same deterministic guarantees but that’s alright for GUI.

@davefarley77 And I feel "semantic diffusion" puts it way too gently: given enough time, the worst interpretation of any good idea will prevail.
The question is: why is that the case, and what is there to be done about it?

@davefarley77 I’ve been thinking about exactly this a lot and there are countless more examples: object orientation came to mean writing classes, DevOps came to mean having a DevOps engineer, Observability means having a monitoring tool available.

"Semantic Diffusion" is everywhere and why this matters...
a #Thread
1/19
Lars Strojny retweeted

I was talking with my friend @nealriley about how much fun I’m having with LLMs lately, including the project I've written about before to interpret screenshots of podcasts and YouTube videos. I made the observation that working with LLMs seems to boil down to a pattern of creating ETL pipelines, over and over again.
He quickly corrected me, suggesting, "Don't you mean ELT pipelines?"
I laughed at his curious and unexpected comment, which seemed unusually pedantic. However, I found myself thinking about his comment for days afterward.
It reminded me of a quip from Rich Hickey, inventor of the Clojure programming language: "Information is simple. The only thing you can do with information is ruin it."
In my experience, what Rich said is so true, and shows that Neal was absolutely correct. What I meant to say is "ELT," not "ETL." In other words, you should Extract, Load/Store, and then Transform data, in that order (ELT). Unless you love misery, you should not Transform before you Load/Store (ETL).
Over many decades, I've ruined data in so many ways that, for me, the “T” in “ETL” stands for tarnish, trash, taint, etc.
I’ve accidentally converted date-time strings incorrectly (turning them all into nils). I’ve discarded data fields that I thought were irrelevant, only to wish years later that I had kept them, because they held priceless information that is impossible to reconstruct. I’ve accidentally overwritten data, deleted data, and done so many more terrible things. Thankfully, I've wiped those episodes from my memory, lest I start kicking myself again.
This is because I was Transforming the data (or Trashing it) before storing it.
One of the things I've learned from the Clojure and cloud community is the value of just storing data in its original form, often in storage buckets, deferring any transformation of the data to afterward, often at runtime (i.e., Extract, Load/Store, then Transform).
I almost made this mistake last week: I mentioned how I use @revai and @deepgram to create podcast audio transcripts. At first, I converted the audio transcripts to match the format used by the python-transcript-api program. That way, I could use the same code to render the transcript, regardless of what the transcript source was.
This is when I once again learned that the only thing I could do to the data is to ruin it.
So instead, I saved the entirety of the transcript in its original format, including the full API return payload. That way, I’m preserving optionality, enabling myself to use that data in new ways in the future. By discarding the data, I destroy that optionality.
Today, I saw one such potential option: I realized that transcript diarization could be helpful (this is when a transcript detects that multiple speakers are talking and, ideally, labels them). The YouTube-transcription-api doesn’t support speaker labels, and I would have discarded that data to match it.
My lessons:
- Store your data in as close to its original form as possible
- Defer data transformation to after you load/store it
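
A minimal Python sketch of the two lessons above, under stated assumptions: the payload shape (`segments`, `speaker`, `text`), the directory layout, and the function names `store_raw` and `to_render_lines` are all hypothetical and do not reflect the actual response format of any transcription service. The point is only the ordering: write the untouched API response to storage first, and derive the render format later, at read time.

```python
import json
import tempfile
from pathlib import Path


def store_raw(payload: dict, source: str, episode_id: str, root: Path) -> Path:
    """Load step: persist the full API response as received, with no field dropped."""
    path = root / source / f"{episode_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
    return path


def to_render_lines(path: Path) -> list[str]:
    """Transform step, run at read time: derive display lines from the raw file."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    lines = []
    for segment in payload["segments"]:  # hypothetical payload shape
        speaker = segment.get("speaker")
        # Speaker labels survive because the raw payload was never trimmed.
        prefix = f"Speaker {speaker}: " if speaker is not None else ""
        lines.append(prefix + segment["text"])
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        raw = {
            "model": "example",
            "segments": [{"speaker": 0, "text": "Hello."}, {"text": "Hi."}],
        }
        stored = store_raw(raw, "example-provider", "ep1", Path(tmp))
        print(to_render_lines(stored))
```

Because the stored file is the original payload, a new need such as diarization only requires a new transform function, not a new round of API calls.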
Lars Strojny retweeted

#HappyNewYear2024! No concrete resolutions but having German permanent residency (received on Dec 30!) means lots of restrictions just got lifted from my life and I’ll have much more freedom to focus on things I love the most this year.

@TotherAlistair Any class has two interfaces muddled together: a "production" interface and a "consumption" interface. A constructor is part of the production interface but not of the consumption interface. Traditional OOP languages don’t offer a clear separation of the two interfaces.

If you are handling Unicode user input, you are likely doing it half wrong without realizing it until it’s pretty costly to fix.
Congrats, you earned yourself a data migration project.
This is what uffff solves: a PHP library to filter Unicode user input. uffff.readthedocs.io/en/latest/
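
To illustrate one pitfall of this kind (an assumption on my part, since the post does not spell out which problems it means): the same visible text can arrive as different code point sequences, so unnormalized input can fail equality checks and unique constraints. The sketch below uses Python's standard `unicodedata` module; the helper name `normalize_input` is hypothetical, and this is not uffff's API.

```python
import unicodedata


def normalize_input(text: str) -> str:
    """Normalize to NFC so canonically equivalent strings compare equal."""
    return unicodedata.normalize("NFC", text)


composed = "caf\u00e9"      # "é" as a single code point (U+00E9)
decomposed = "cafe\u0301"   # "e" followed by combining acute accent (U+0301)

# Both render identically, but they differ until normalized.
print(composed == decomposed)
print(normalize_input(composed) == normalize_input(decomposed))
```

If values were stored before such a step existed, the existing rows have to be rewritten, which is the data migration the post warns about.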

@dan_abramov Next: feed your build process data from your web analytics data store to inform what to pre-generate
Lars Strojny retweeted

@channingwalton It is! And he was a very economical composer, reusing patterns wherever he could, @chillygonzales blames it on his many kids 😀