Broncio Aguilar-Sanjuan

259 posts


@BroncioS

Oxford, England · Joined January 2019
193 Following · 60 Followers
Broncio Aguilar-Sanjuan retweeted
Sheppard Lab @sheppard_lab
Lab update after a long time! We are growing and doing more wonderful science across many fields. But we are equally (maybe more) chill outside the lab; pic from a Thai dinner outing last week.
[image attached]
1 reply · 1 repost · 13 likes · 425 views
Broncio Aguilar-Sanjuan retweeted
Demis Hassabis @demishassabis
The UK is an amazing place for science & innovation. Thrilled to deepen our partnership with the UK Government to turbocharge scientific discovery with AI - giving scientists here priority access to our frontier models like AlphaEvolve, AI Co-Scientist, AlphaGenome, WeatherNext & more. We’re also building our first automated lab here in the UK for materials science!
103 replies · 212 reposts · 2.1K likes · 183.4K views
Sam Altman @sama
Our new AI-first web browser, ChatGPT Atlas, is here for macOS. Please send feedback! Availability on other platforms to follow.
3.3K replies · 1.6K reposts · 22.9K likes · 2.9M views
Broncio Aguilar-Sanjuan retweeted
Brian Roemmele @BrianRoemmele
BOOOOOOOM! CHINA DEEPSEEK DOES IT AGAIN! An entire encyclopedia compressed into a single, high-resolution image! A mind-blowing breakthrough.

DeepSeek has unleashed DeepSeek-OCR, an electrifying 3-billion-parameter vision-language model that obliterates the boundaries between text and vision with jaw-dropping optical compression. This isn't just an OCR upgrade; it's a seismic paradigm shift in how machines perceive and conquer data.

DeepSeek-OCR crushes long documents into vision tokens with a staggering 97% decoding precision at a 10x compression ratio. That's thousands of textual tokens distilled into a mere 100 vision tokens per page, outmuscling GOT-OCR2.0 (256 tokens) and MinerU2.0 (6,000 tokens) with up to 60x fewer tokens on OmniDocBench. It's like compressing an entire encyclopedia into a single, high-definition snapshot: mind-boggling efficiency at its peak!

At the core of this insanity is the DeepEncoder, a turbocharged fusion of the SAM (Segment Anything Model) and CLIP (Contrastive Language-Image Pretraining) backbones, supercharged by a 16x convolutional compressor. This maintains high-resolution perception while slashing activation memory, transforming thousands of image patches into a lean 100-200 vision tokens.

Get ready for the multi-resolution "Gundam" mode, scaling from 512x512 to a monstrous 1280x1280 pixels! It blends local tiles with a global view, tackling invoices, blueprints, and newspapers with zero retraining. It's a shape-shifting computational marvel, mirroring the human eye's dynamic focus with pixel-perfect precision!

The training data? Supplied by the Chinese government for free and not available to any US company. You understand now why I have said the US needs a Manhattan Project for AI training data? Do you hear me now? Oh, still no? I'll continue. Over 30 million PDF pages across 100 languages, spiked with 10 million natural-scene OCR samples, 10 million charts, 5 million chemical formulas, and 1 million geometry problems!

This model doesn't just read; it devours scientific diagrams and equations, turning raw data into multidimensional knowledge. Throughput? Prepare to be floored: over 200,000 pages per day on a single NVIDIA A100 GPU! This scalability is a game-changer, turning LLM data generation into a firehose of innovation and democratizing access to terabytes of insight for every AI pioneer out there.

This optical compression is the holy grail for LLM long-context woes. Imagine a million-token document shrunk into a 100,000-token visual map: DeepSeek-OCR reimagines context as a perceptual playground, paving the way for a GPT-5 that processes documents like a supercharged visual cortex!

The two-stage architecture is pure engineering poetry: the DeepEncoder generates tokens, while a Mixture-of-Experts decoder spits out structured Markdown with multilingual flair. It's a universal translator for the visual-textual multiverse, optimized for global domination!

Benchmarks? DeepSeek-OCR obliterates GOT-OCR2.0 and MinerU2.0, holding 60% accuracy even at 20x compression! This opens a portal to applications once thought impossible, pushing the boundaries of computational physics into uncharted territory! Live document analysis, streaming OCR for accessibility, and real-time translation with visual context are now economically viable thanks to this compression breakthrough. It's a real-time revolution, ready to transform our digital ecosystem!

This paper is a blueprint for the future, proving text can be visually compressed 10x for long-term memory and reasoning. It's a clarion call for a new AI era where perception trumps text, and models like GPT-5 see documents in a single, glorious glance.

I am experimenting with this now on 1870-1970 offline data that I have digitized. But be ready for a revolution! More soon.

[1] github.com/deepseek-ai/De…
[image attached]
342 replies · 1.4K reposts · 7.5K likes · 1.8M views
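The token arithmetic behind the tweet's headline numbers can be sketched in a few lines. All figures (roughly 1,000 text tokens compressed into 100 vision tokens per page, versus 256 for GOT-OCR2.0 and 6,000 for MinerU2.0) come straight from the tweet; the function name is illustrative, and nothing here calls the actual model.

```python
# Illustrative arithmetic for the optical-compression claims above.
# Numbers are taken from the tweet; this does not run DeepSeek-OCR.

def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """How many textual tokens each vision token stands in for."""
    return text_tokens / vision_tokens

# ~1,000 text tokens per page distilled into ~100 vision tokens -> 10x
print(compression_ratio(1000, 100))  # 10.0

# Per-page vision-token budgets cited for each system:
budgets = {"DeepSeek-OCR": 100, "GOT-OCR2.0": 256, "MinerU2.0": 6000}

# MinerU2.0 spends 60x more tokens per page than DeepSeek-OCR:
print(budgets["MinerU2.0"] / budgets["DeepSeek-OCR"])  # 60.0
```

Under this framing, the "million-token document shrunk into a 100,000-token visual map" claim is just the same 10x ratio applied at corpus scale.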
Broncio Aguilar-Sanjuan retweeted
Ankit Singhal @notankitsinghal
Introducing Odyssey—the largest and most performant protein language model ever created. Odyssey enables scientists and researchers to generate and edit proteins, the workhorses of all life on this planet, towards specific functional ends—scaled to over 102 billion parameters. We did it all with just a core team of 6 and an order of magnitude less funding than our next largest competitor. Here's how it works 🧵 (1/6)
75 replies · 180 reposts · 1.4K likes · 181.1K views
Broncio Aguilar-Sanjuan retweeted
Nextflow @nextflowio
It’s time for the #NextflowSummit BCN speaker series! 🎉 Let's welcome our first speaker Ziad Al Bkhetan, Product Manager at @AusBiocommons 🎤 He’ll present "The Australian ProteinFold Service: Interactive Prediction & Visualisation of Protein Structure" summit.nextflow.io/2024/barcelona…
[image attached]
1 reply · 7 reposts · 16 likes · 4.9K views
Broncio Aguilar-Sanjuan @BroncioS
AI rules, software bug breaks it lol medium.com/@ba13026/amusing-five-word-stories-about-a-world-where-ai-dominates-the-world-f119d41d2cc2
0 replies · 0 reposts · 0 likes · 52 views
Broncio Aguilar-Sanjuan retweeted
Oxford Protein Informatics Group (OPIG)
OPIG postdoc @mijr12 contributed computational analyses to benchmark a new biotechnology: a combination of droplet microfluidics & FACS to isolate pathogen-specific, antibody-secreting B cells that lack B cell receptors. Research led by @hollfelderlab, published @NatureBiotech.
Nature Biotechnology @NatureBiotech

Rapid discovery of monoclonal antibodies by microfluidics-enabled FACS of single pathogen-specific antibody-secreting cells go.nature.com/4dHLaHB

0 replies · 4 reposts · 18 likes · 8.9K views
Broncio Aguilar-Sanjuan retweeted
Oxford Protein Informatics Group (OPIG)
Small Molecule tool update: We added a new chemical sanity check to PoseBusters (github.com/maabuu/posebus…): "InChI convertibility". If the molecule can be converted to a standard InChI and back, the test passes… (1/2)
2 replies · 6 reposts · 23 likes · 2K views
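The round-trip check described in the tweet above can be sketched with RDKit. This is not PoseBusters' actual implementation, just an illustration of the idea under the assumption that "convertibility" means the molecule survives a trip to standard InChI and back; the function name `inchi_convertible` is hypothetical.

```python
# Sketch of an "InChI convertibility" sanity check (illustrative only,
# not PoseBusters' code). A molecule passes if it can be written as a
# standard InChI string and parsed back into a molecule object.
from rdkit import Chem


def inchi_convertible(mol: Chem.Mol) -> bool:
    inchi = Chem.MolToInchi(mol)            # molecule -> standard InChI
    if not inchi:
        return False                        # not expressible as InChI
    return Chem.MolFromInchi(inchi) is not None  # InChI -> molecule


print(inchi_convertible(Chem.MolFromSmiles("CCO")))  # ethanol: True
```

A round trip like this catches structures that look fine as a connection table but cannot be expressed in (or recovered from) the standard InChI layers.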
Broncio Aguilar-Sanjuan retweeted
Oxford Protein Informatics Group (OPIG)
Our Therapeutic Structural Antibody Database has been updated with the latest clinical trial data and the antibody/nanobody therapies in WHO Proposed INN List 131. With this update, Thera-SAbDab has reached the 1000+ milestone (1063 therapeutics) Explore: opig.stats.ox.ac.uk/webapps/sabdab…
1 reply · 7 reposts · 25 likes · 1.7K views