Ritish
85 posts

Ritish
@Ritish_1618
Machine Learning Practitioner | figuring out.
Joined July 2021
570 Following, 9 Followers

@civilianBTC Interested | Been working on a game recommendation system. Would be enjoyable to work on something else simultaneously.
Ritish retweeted

The web was never meant to be flattened into text.
Yet most web RAG systems start by parsing HTML, a complex and lossy process.
🔥 Introducing PixelRAG: the first RAG system that retrieves and reads 30M+ web pages as pixels.
Instead of extracting text, PixelRAG retrieves screenshots and lets a VLM read them directly.
PixelRAG not only preserves visual information, but also outperforms text-based RAG on text-only QA benchmarks by +18.1%.
Why?
(1) HTML-to-text conversion often discards layout, structure, tables, and other useful signals.
(2) We continued pretraining a VLM on web page screenshots and turned it into a surprisingly strong visual retriever.
(3) Recent VLMs are remarkably good at understanding web pages, often with better accuracy and token efficiency than text-only pipelines.
Takeaway: HTML parsing may be one of the biggest self-inflicted bottlenecks in web RAG.
Demo below 👇
Code: github.com/StarTrail-org/…
Paper: github.com/StarTrail-org/…
Playground: pixelrag.ai

@Old_But_Gold50s These are music sheets, right? If so, how can one start learning to read them? Even more broadly, how do I start with music theory? Basically all things music, so I can do cool stuff at the intersection of AI and music.


@freshlimesofa Agreed. Check out Expert Systems. Might ring a bell!

Just thinking out loud but,
I think the world is bored with LLMs and we're soon going to hit saturation.
Look, although I haven't gone really deep into the newer architectures that are coming up and have been a little out of touch with deep learning, I do feel that proposals like JEPA and world modelling are more impactful and more meaningful than doubling down on LLMs and post-training them to ace a specific benchmark, just so they can vomit out probabilistic representations.
How far are we going to get with this?
Rolling LLMs out across industries and enterprise solutions is already almost done.
In fact, we took LLMs and made agents out of them; that's the intersection of the architecture with core engineering principles.
Isn't that saturation already?
However, it isn't the saturation of machine intelligence; it's just that the low-hanging fruit of next-token prediction has been harvested.
We've plateaued.
TL;DR: random thoughts.

@dejavucoder Ask questions. Experiment. That's all!! The takeaway is to follow your curiosity (guided by an understanding of right and wrong). And definitely try computational thinking.
Ritish retweeted

@Ritish_1618 People are telling us to focus on slop products instead of the foundational work we're trying to build.
Ritish retweeted

Believing you slept well can boost your cognitive performance even after a bad night.
Mind over matter.
Nicholas Fabiano, MD @NTFabiano

@Harshit77406528 Suggest some which are specifically good at complex tasks. TIA!

@Ritish_1618 Try using Chinese models. U get more for the same price 🫡

@zuzanna_pathway @adrian_pathway @lukaszkaiser @YesThisIsLion @mlech26l Curious to know whether these two fields are comparable, since information retrieval is a much smaller problem than intelligence. Will listen to the talk anyway.

“We have not yet had a PageRank moment for intelligence.”
We’ve got so many comments and questions about this statement delivered by @adrian_pathway during our recent Transformer vs Post-Transformer debate with @lukaszkaiser @YesThisIsLion @mlech26l - thanks!
Let’s dig into it. In the 1990s, web search already existed. We could index information. AltaVista existed. The web was growing fast.
Then PageRank happened.
That moment combined three things:
1. A simple but deep mathematical idea: treat the web as a giant graph and compute a stationary distribution of a *random walk* on that *graph*
2. A scalable implementation: large-scale graph computation on huge clusters
3. A company that integrated and scaled the idea end-to-end: Google
That combination gave search a much clearer center. It stopped being just a pile of heuristics and started to look more like: here is the mathematical object we need to compute, now let’s build the systems needed to compute it well.
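For readers who want to see that "mathematical object" concretely, here is a minimal sketch of the random-walk idea on a toy graph. It is an illustration under assumptions, not Google's implementation: the damping factor of 0.85, the uniform handling of pages with no outgoing links, the function name `pagerank`, and the four-page `toy_web` graph are all choices made for this example and do not come from the post.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate the stationary distribution of a damped random walk.

    `links` maps each page to the list of pages it links to.
    `damping` is the probability of following a link rather than jumping
    to a uniformly random page (0.85 is an assumed, conventional value).
    """
    # Collect every page that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from the uniform distribution

    for _ in range(max_iter):
        # Pages without outgoing links spread their mass uniformly (assumption).
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each page splits its damped rank equally among its outgoing links.
        for src, targets in links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share

        # Stop once the distribution no longer changes (L1 distance).
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page web: "c" is linked from every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The loop is plain power iteration: apply one step of the walk to the current distribution until it stops moving. On this toy graph the page with the most incoming links ends up with the highest score, and a page nobody links to keeps only the random-jump share.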
Adrian asked Lukasz Kaiser directly whether he sees a PageRank-like idea inside the Transformer. Lukasz said no.
For intelligence, we still do not have that kind of unifying operator or process. We do not yet have an agreed mathematical object that says: this is the core computation behind it.
That missing unifier is what Adrian meant by the absent “PageRank moment for intelligence.”
That is also the main idea behind our work on BDH, our Post-Transformer architecture. We are after that fundamental “platform discovery” for intelligence.
The full Transformer vs Post-Transformer debate is a good place to go deeper on these topics. Link below.
