Mathurin Dorel
@MathSRIsh

12.7K posts

Bioinformatician, Cellular Biologist, Techbio Founder. Collecting data in a complex world @[email protected] @mathsrish.bsky.social Also #boardgames

Berlin, Germany · Joined September 2016
1.5K Following · 357 Followers

Pinned Tweet
Mathurin Dorel@MathSRIsh·
Midnight reflection on how to improve scientific evaluation beyond the h-index: how about a vetting mechanism with PageRank based on, e.g., ORCID profiles? Each researcher can vouch for any number of other researchers. The importance of a researcher is then simply their PageRank.
1 · 0 · 1 · 1.7K
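The vouching idea above can be sketched as plain power-iteration PageRank over a "vouch graph". The researcher names and vouch edges below are made up for illustration, and the ORCID integration is only hypothetical; in the proposed scheme the nodes would be ORCID iDs.

```python
# PageRank over a vouch graph: an edge A -> B means researcher A vouches for B.
# Names and edges are illustrative placeholders, not real data.
def pagerank(vouches, damping=0.85, iters=100):
    nodes = sorted({n for a, bs in vouches.items() for n in [a, *bs]})
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for a, bs in vouches.items():
            if bs:  # distribute A's rank evenly over everyone A vouches for
                share = damping * rank[a] / len(bs)
                for b in bs:
                    new[b] += share
            else:  # dangling researcher: spread their rank uniformly
                for n in nodes:
                    new[n] += damping * rank[a] / len(nodes)
        rank = new
    return rank

vouches = {"alice": ["bob", "carol"], "bob": ["carol"],
           "carol": ["alice"], "dave": ["carol"]}
ranks = pagerank(vouches)
# carol is vouched for by three peers, so she ends up with the highest rank
```

Note the key property of the scheme: vouching for many people dilutes each individual vouch, so rank cannot be inflated simply by handing out endorsements.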
French Response@FrenchResponse·
5/ 8. Internet access expansion for nearly 3,000 Kenyan public institutions. 9. Agricultural resilience, innovation and climate adaptation projects. 10. Support for Kenyan tea producers to access higher-value global markets.
12 · 20 · 231 · 24.3K
Mathurin Dorel@MathSRIsh·
@shae_mcl @josiezayner Looks like Cas12a2 or something similar. The main challenge is not the enzyme, it's the delivery. And they will most likely target oncogenic mutations rather than somatic ones. It doesn't sound like they want to go the personalised route. nature.com/articles/s4158…
0 · 0 · 2 · 100
Shae McLaughlin@shae_mcl·
@josiezayner My takeaway was that they want to check cell nuclear genomes against a reference and then…destroy all cells with somatic mutations? I have so many questions…
2 · 0 · 4 · 1.3K
Mathurin Dorel@MathSRIsh·
@_AndrewLeduc Do you have a reference for those numbers? As far as I know, state of the art is closer to 1-2k proteins for a few thousand cells (100-300 ng). For 10k you need something in the 100 µg range, so closer to a million cells. But granted, my reference is 6 years old. nature.com/articles/s4146…
3 · 0 · 0 · 15
Andrew Leduc@_AndrewLeduc·
@MathSRIsh You can measure 200-300k distinct peptides today, from ~10k proteins, with just a few thousand cells of sample. I don't think you appreciate how incredibly challenging it would be for another tech to solve the various challenges inherent to this.
1 · 0 · 0 · 9
Andrew Leduc@_AndrewLeduc·
It's hilarious the extent to which people don't understand that mass specs actually work amazingly well, right now, today, for protein sequencing. I think it is primarily the fault of our community for doing a poor job of applying the technology.
Steve Jurvetson@FutureJurvetson

🐠 Everything we know about biology has been built on an incomplete picture. DNA tells us what a cell might do. Proteins tell us what it's actually doing. Pumpkinseed announced their $20M Series A today (led by Future Ventures and NfX) to build the platform that reads proteins directly, for the first time.

Proteomics has always faced a fundamental constraint: you can only measure what you already know to look for. The current workhorse, mass spectrometry, requires matching protein fragments against reference databases. If a protein isn't in the database, or doesn't ionize reliably, it's invisible. Other approaches rely on fluorescent labels or antibody-based affinity methods, which introduce their own biases and blind spots. The result is a field that has spent decades generating an increasingly detailed map of a small, well-lit corner of the proteome, while biology's most important data layer remains hidden. This isn't a sensitivity problem. It's a category problem. Existing tools were never designed to read proteins directly, de novo. They were designed to find what researchers already suspected was there. Pumpkinseed is built to find everything else.

And proteomics is harder than most people outside the field appreciate. When we account for post-translational modifications, non-canonical amino acids, and glycan decorations, there are roughly a thousand distinct chemical monomers in the proteomic alphabet, compared to the four bases of DNA.

deSIPHR (de novo Sequencing and Identification of Proteins with High-throughput Raman spectroscopy) is Pumpkinseed's proprietary nanophotonic chip platform, fabricated with semiconductor manufacturing. With over 100 million sensors per square centimeter, it reads proteins, known or unknown, letter by letter, amino acid by amino acid, without a reference catalog of proteins, and at high throughput. The result is direct, high-resolution proteomic data, including post-translational modifications, non-canonical amino acids, and single-cell detail, that mass spectrometry-based approaches cannot match.

What is Raman spectroscopy? Rather than tagging or fragmenting proteins, Raman spectroscopy reads the molecular vibrations of individual molecules. Each amino acid vibrates at a characteristic frequency, producing a unique physical signature that deSIPHR detects directly. This is physics reading biology in the most literal sense. With conventional Raman spectroscopy, only about one in ten million photons interacts with a molecule usefully, far too weak for single-molecule work. Pumpkinseed's answer is a silicon photonic chip patterned with a billion sensors per wafer. Those sensors concentrate light into volumes smaller than a single protein, amplifying Raman scattering efficiency by over 10 million-fold.

And the longer-term ambition? The virtual cell: a computational model that simulates not just how proteins fold but how they interact, respond to drugs, and behave under perturbation inside a living system. AlphaFold demonstrated what structural AI can do once a sequence is known. The gap that remains is determining the sequence itself from biological samples, particularly for proteins carrying modifications absent from existing databases. Pumpkinseed is designed to supply that input layer.

"If the Human Genome Project was the data infrastructure that enabled genomic medicine, we believe the high-resolution proteomic dataset Pumpkinseed is building could be the analogous foundation for AI-driven biological discovery," co-founder Dr. Jen Dionne says. "In our vision, the molecular signatures driving disease, aging, and ecosystem health become fully legible. Medicine shifts from reactive to proactive. Optimal healthspan moves from aspiration to achievable reality."

synbiobeta.com/read/pumpkinse… • The biology mining company: Pumpkinseed.Bio • Today's News: pumpkinseed.bio/news/pumpkinse…

6 · 2 · 25 · 4.3K
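As a toy illustration of the "each amino acid vibrates at a characteristic frequency" claim in the announcement above, a minimal read-out could be a nearest-shift lookup. The Raman shift values below are illustrative placeholders, not measured reference data, and this is in no way how deSIPHR actually works internally:

```python
# Toy nearest-neighbour lookup: map a measured Raman peak position (in cm^-1)
# to the amino acid with the closest characteristic shift.
# The shift values are ILLUSTRATIVE placeholders, not real spectroscopic data.
REFERENCE_SHIFTS = {
    "Phe": 1000.0,
    "Trp": 760.0,
    "Tyr": 850.0,
    "Cys": 510.0,
}

def classify_peak(measured_shift, tolerance=25.0):
    """Return the closest reference amino acid, or None if nothing is within tolerance."""
    best = min(REFERENCE_SHIFTS,
               key=lambda aa: abs(REFERENCE_SHIFTS[aa] - measured_shift))
    if abs(REFERENCE_SHIFTS[best] - measured_shift) > tolerance:
        return None  # unidentified peak, e.g. a modification absent from the table
    return best

# classify_peak(1004.0) -> "Phe"; classify_peak(600.0) -> None
```

The `None` branch hints at the hard part the announcement glosses over: any residue or modification absent from the reference table is exactly the signal a de novo method must still make sense of.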
Mathurin Dorel@MathSRIsh·
It's not virtual screening yet, but it's a platform that accurately mimics the patient population for a fraction of the price (and no paperwork). It's better than a model because you're not at risk of making wrong hypotheses. It's pure biology with high translatability.
0 · 0 · 0 · 15
Mathurin Dorel@MathSRIsh·
If you have a drug you're considering for a clinical trial in those diseases, you should definitely test it on their platform first. They'll tell you if you should go forward, and help with patient stratification.
1 · 0 · 0 · 18
Mathurin Dorel@MathSRIsh·
New paper detailing the platform of @orakldotbio. Orakl has built an impressive bank of patient-derived PDAC and CRC organoids in which they can screen any drug. This paper shows that they can accurately reproduce clinical trial results with such screens.
Gustave Ronteix@ronfleix

Introducing SCOPE (Screening-to-Clinical Outcome Prediction Engine): a translational platform integrating patient-derived organoid (PDO) drug screening with clinical prognostic features to forecast arm-level efficacy in oncology trials.

1 · 2 · 2 · 216
Mathurin Dorel retweeted
Gustave Ronteix@ronfleix·
Introducing SCOPE (Screening-to-Clinical Outcome Prediction Engine): a translational platform integrating patient-derived organoid (PDO) drug screening with clinical prognostic features to forecast arm-level efficacy in oncology trials.
1 · 2 · 1 · 1K
Mathurin Dorel@MathSRIsh·
@pfau My strategy, when there's a trendy thing outside my core expertise, is to find the people who can talk about it in a balanced tone, plus the strongest critics. They will usually follow the hypers, point out weaknesses that still need to be overcome, and filter out the relevant signals.
0 · 0 · 0 · 793
David Pfau@pfau·
I feel like my general life strategy of "ignore the trendy thing, focus on what you believe in" was predicated on the trendy thing only being, like, 10x bigger than whatever I was doing, and not a world-historically large technological revolution.
29 · 56 · 1.7K · 360K
Mathurin Dorel retweeted
Clément Molin@clement_molin·
Many people still often say "France 🇫🇷 surrendered", which is completely ridiculous. Here is the real version of what happened and why France lost the 6-week campaign:
🔹From 1919 to 1939, France was abandoned. It helped central Europe survive against the communists, and while Germany was not respecting the Versailles Treaty, France was abandoned by its allies (UK, US).
🔹When Hitler entered the Rhineland and invaded Austria and Czechoslovakia, no one said anything; France was alone, and an intervention was impossible due to internal pacifism.
🔹In 1939, France could have done much better against Nazi Germany, especially when it invaded Poland and Norway, that's right.
🔹In May 1940, Germany had twice France's population and a far bigger industrial capacity. German military strategy was also newer and better.
🔹However, in May 1940, France was the lone country effectively fighting the Germans and Italians in Europe: the French army fought, but lacked crucial air power, tank organisation, communication and movement.
🔹At Dunkirk, the French army fought firmly to allow the British troops to flee to the UK. Part of the French army was captured to save the British. The UK then refused to send more planes and troops to France. They did help a lot and managed to liberate Western Europe later with other allies.
🔹At the same time, the Italian army started attacking the French Alps but got crushed, losing 2,300 men against 37 for France.
🔹What was the Soviet Union doing at the time? It was an ally of Nazi Germany: invading Finland, occupying eastern Poland, Bessarabia and the Baltic countries, while supplying Hitler with all the goods he needed.
🔹What were the US doing? Nothing; they refused to step into the conflict and help their old allies, waiting until 1941-1942 to really get involved.
🔹What were the other European countries doing? Most were quickly occupied (faster than France), others chose to ally with Hitler, and some remained neutral.
There were absolute mistakes in French decision-making and war preparation, but the French soldiers did fight hard, losing 58,000 men. Repeating "French surrender" is simply ridiculous when you know a bit about history. The truth is that Americans and others say this about France (and not Poland, Denmark, Yugoslavia, the Netherlands, Belgium, Luxembourg...) because they hate the fact that France decided to be politically independent from the US after the war, not because it is the reality, and that's a shame...
282 · 527 · 2.9K · 223.5K
Mathurin Dorel@MathSRIsh·
@_AndrewLeduc I think you completely missed my point. Sure, you can get good sequence coverage for a specific protein, and even for up to hundreds of enriched peptides. But when you want to quantify between samples, the methods are highly variable. And that is the problem most people are interested in.
2 · 0 · 0 · 15
Mathurin Dorel@MathSRIsh·
@_AndrewLeduc You completely missed my point. Very few care about sequencing exactly a few peptides; most care about accurately quantifying thousands, with PTMs. And there it's still a dice roll, even for moderately abundant peptides. You dunked on a tech whose main use case you don't get.
0 · 0 · 0 · 10
Andrew Leduc@_AndrewLeduc·
@MathSRIsh I don't think you know much about the field and the inherent challenges of protein sequencing. The quant comment is just not true, and if you need to detect a specific part of a protein you can use multiple proteases and obtain pretty good sequence coverage these days.
2 · 0 · 0 · 21
Mathurin Dorel retweeted
Veera Rajagopal@doctorveera·
I recently vibecoded this plot in D3.js for a presentation. The funnel plot visualizes the attrition across different stages of drug development and the costs involved. The data and the concept behind this plot are now fairly common knowledge in the field. However, making this plot helped me find a few insights that may not be readily obvious in discussions about drug development costs and failures.

Before I share my insights, let me briefly walk through the plot for context. The plot visualizes the journey of a set of 22 nominated candidates, back-calculated from the attrition rates needed to reach one approved drug, through the phases of development. The midline spine marks the stages along with cost labels from DiMasi et al. 2016 (pubmed.ncbi.nlm.nih.gov/26928437/), the industry-standard reference, expressed per approved drug. The upper area curve is the cumulative cost (in 2013 dollars). The lower area curve shows the transition rates based on data from BIO/Informa (2021) (x.com/doctorveera/st…), Paul et al. (2010) (nature.com/articles/nrd30…) and Waring et al. (2015) (nature.com/articles/nrd46…), Nat Rev Drug Discov.

Cost of failures, not success
The most quoted number in drug development is also the most misunderstood one. When people say it costs billions to develop a drug, they picture a single molecule being shepherded from lab bench to pharmacy shelf at enormous expense. That is not what the number means. The billions are not the cost of one success. They are the cost of failures: all the failures that were necessary to produce that one success. Every candidate that got nominated, tested and quietly abandoned contributed to that figure. The billion-dollar headline is a measure of the failures a company must stomach for one success.

The invisible part of the funnel
Most widely discussed failure rates in drug development start the clock at Phase I. That is actually a generous starting point. Before a drug ever touches a human, it survives a brutal pre-clinical filter that never gets a mention. Based on the limited data available, around 40% of formally nominated drug candidates never make it to human trials. The famously quoted "1 in 10" drug success rate does not count the preclinical attrition. If you factor it in, the odds of success drop from 1 in 10 to 1 in 22. And even 1 in 22 is still optimistic, as it starts counting only from formal nomination. Before that there are phases like target exploration, hit identification and lead optimization. That earlier funnel, from first exploration to nominated candidate, is almost impossible to quantify at an industry level. It lives inside company R&D pipelines and remains proprietary. The nominated candidate is already a survivor before it enters the visible funnel. The true odds are therefore likely worse than 1 in 22.

Pre-clinical costs rival clinical
Clinical trials, especially late-stage, have a reputation for being expensive. DiMasi et al. estimate $255M per candidate entering Phase III, versus $59M in Phase II and $25M in Phase I. That steep cliff before Phase III is exactly what makes the "funnel". Every gate before III exists to prevent quarter-billion-dollar mistakes. But here is what that framing misses. Pre-clinical development is invisible in most cost discussions, yet in aggregate it is not cheap. DiMasi reports $430M out-of-pocket pre-clinical spend per approved drug, an aggregate cost spanning the entire pre-human pipeline; the data does not allow a per-compound breakdown. Now compare that to our portfolio-level trial costs: 7 candidates entering Phase II at $59M each is $413M; 2 entering Phase III at $255M each is $510M. The most expensive phase per trial and the most invisible phase in the pipeline cost roughly the same. And nobody talks about the second one.

The Phase II graveyard
If you look closely at the transition rates, one number stands out: only 29% of drugs in Phase II make it to III, the narrowest part of the funnel. The killer here is not safety, it's efficacy. Waring et al. found that pre-clinical failures are dominated by toxicology (59%) and Phase I failures by safety signals (25%), which makes sense, as we have reasonably good early tools for catching dangerous compounds before they cost too much. But Phase II failures are led by efficacy (35%), because there is no pre-clinical substitute for asking whether a drug actually works in humans at therapeutic doses. That question can only be answered in Phase II, expensively, after millions have already been spent getting there. The implication is to invest disproportionately in early efficacy signals. Not because safety does not matter; it does. But it usually declares itself early. Efficacy ambushes you late, during the most expensive stretch of the pipeline, and by then the bill is already large.

Buying the race, not the winner
We often come across news of billion-dollar acquisitions in the biotech field, which might make you wonder how all of this applies there. A company that began with just one target successfully navigated its way into late-stage trials and got acquired for billions of dollars. On the surface it might look like one company is being bought for its one success. But that's not the full story. That company is just one survivor out of dozens, if not hundreds, of parallel single-target companies that ran a similar race and quietly failed. They never show up at the deal table, but in reality they are all priced in. The buyer is not just paying for what that one company spent; it is also paying for what the other failed companies spent in that target space. The truth is the market ran a portfolio experiment across many bets, and this acquisition settles the tab. Whether the winner got there by conviction or pure luck does not matter. What matters is that the buyer bought its way to the end of the funnel by paying what it would have cost to run the race itself across hundreds of candidates.

Not broken. By design.
It is worth stepping back and wondering if the funnel reflects a broken system that needs fixing. Of course not. The shape you see is not a failure of the system; it is the deliberate design of drug development. The logic is front-loading of attrition: fail cheap, fail fast, and invest heavily only in the survivors. Pre-clinical cuts are inexpensive. Phase I cuts are manageable. By the time you reach Phase III and spend a quarter of a billion per compound, make sure your earlier gates have done brutal and honest work. The funnel is not broken. But its shape does raise an uncomfortable question: are the early filters aggressive enough? Every weak candidate that slips through the early gates carries an expensive price tag before it eventually fails anyway. The cost of a leaky funnel is not just the money. It is the time, the patients enrolled in trials for drugs that should not have made it that far, and the opportunity cost of resources not spent on better candidates.
3 · 14 · 68 · 8.8K
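The portfolio arithmetic in the thread above can be reproduced in a few lines, using only the counts and per-phase costs the thread itself quotes (DiMasi et al. 2016, in 2013 dollars):

```python
# Portfolio-level arithmetic from the funnel thread:
# 22 nominated candidates funnel down to 1 approved drug.
# Costs are out-of-pocket, per candidate ENTERING the phase, in $M.
nominated = 22
entering = {"Phase II": 7, "Phase III": 2}
cost = {"Phase I": 25, "Phase II": 59, "Phase III": 255}

portfolio = {phase: entering[phase] * cost[phase] for phase in entering}
assert portfolio["Phase II"] == 413    # 7 x $59M
assert portfolio["Phase III"] == 510   # 2 x $255M, rivaling the $430M pre-clinical aggregate

print(f"Overall odds: 1 in {nominated}")      # vs the famous '1 in 10' counted from Phase I
print(f"Phase II -> III rate: {2 / 7:.0%}")   # the narrowest gate in the funnel
```

The two asserts confirm the thread's headline comparison: the aggregate Phase III bill ($510M) and the invisible pre-clinical bill ($430M) are the same order of magnitude, and 2 of 7 Phase II entrants advancing gives the ~29% rate the thread highlights.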
Mathurin Dorel@MathSRIsh·
@Ronalfa Beautiful sample queue, Ron. Loading by hand into the imager, or somehow automated now?
1 · 0 · 1 · 284
Ron Alfa@Ronalfa·
Today's cut of arrayed tumor samples ready for processing. Hundreds of patients per week now.
6 · 7 · 132 · 10.4K
Mathurin Dorel@MathSRIsh·
@owl_posting @JasminKaur_ For neoantigens, we're working on something with @gama_search (actually a free lunch from our RNA isoforms research). Hopefully released before the end of the year; will keep you posted.
0 · 0 · 1 · 38
Mathurin Dorel@MathSRIsh·
@owl_posting @JasminKaur_ Oh, how I dream of a developmental atlas of humans from birth to centenarian. Gonna have to raid quite a few morgues. Another good, and as tragic, source of data is pediatric cancer patients. Some healthy tissue should be systematically collected during surgery.
1 · 1 · 1 · 55
owl@owl_posting·
this isn't to say they aren't important, but there's a *lot* of extremely interesting types of biological data outside of unconditional protein structures, sequences, and small molecules. it is good to leave the PDB bubble sometimes and explore what else is possible
5 · 9 · 129 · 8.8K
Mathurin Dorel@MathSRIsh·
Another approach is physics-based models, but if you think LLMs need a lot of compute, wait until we start simulating all molecules in all cells of a tissue at scale. Interpretability will be tricky too. Probably one of the most promising potential applications of quantum computers.
0 · 0 · 0 · 26
Mathurin Dorel@MathSRIsh·
A lot indeed: for example RNA, histology, DNA regulation, and variant annotation (which is actually the final boss). DeepMind has actually made some very good models for all of these. But ML models can only be as good as the data, because biology is messy.
owl@owl_posting

this isn't to say they aren't important, but there's a *lot* of extremely interesting types of biological data outside of unconditional protein structures, sequences, and small molecules. it is good to leave the PDB bubble sometimes and explore what else is possible

1 · 0 · 2 · 224