Suresh
5.9K posts
@_Suresh2
MSc Software Engineering @ Chongqing University ’26 | Researching AI x Software Engineering (AI for SE & SE for AI) | 🇵🇰➡️🇨🇳

Today we release all the data sources (and more) in one place: more than 1.4B query-document pairs. Plus a new high-quality web dataset built on FineWeb-Edu, replacing the outdated "common crawl" splits most mixtures still rely on (thanks @orionweller). huggingface.co/datasets/light…

Today at @LightOnIO, we release LateOn 💡 and DenseOn 💃: two open retrieval models at 149M params that set a new SOTA on BEIR! Plus a blog post packed with insights on pre-training data curation, filtering, ablations, and decontamination. 🧵

wrote a guide on getting compute grants as a student, something I wish I had done more at the beginning of my PhD. It's honestly one of the highest-ROI things you can do as a student (we've gotten 100k+ GPU hrs for roughly 2 weeks of writing work). nightingal3.github.io/blog/2026/04/1…

we're chatting with @_rajanagarwal and @evan_j_chu in ~1h20min about testing agents on super duper difficult tasks that take 20h to complete. tune in!
