SujeevanRatnasingham

242 posts

@DNAdiversity

Principal on @LifeScanApp; Director of Informatics at CBG, University of Guelph, and Director of BOLD (https://t.co/2jK1DcJNYM)

Canada · Joined August 2008

63 Following · 321 Followers

SujeevanRatnasingham@DNAdiversity·Dec 22

@karlfilho @svpino I am surprised no one has mentioned Bayes' theorem as the solution.

Karl Richard@karlfilho·Dec 21

@svpino I got this: it’s the Positive Predictive Value, and you have a 0.98% chance of being sick if this test comes back positive. I try to teach this to my medical students, but apparently, either the math is too complex, or I’m a terrible teacher. Or both! 😂🤣

Santiago@svpino·Dec 21

99% of the people you know will answer this incorrectly. We need to start teaching statistics in middle school. Question: You go to the doctor and get tested for a disease that only 1 in 10,000 people get. The test is 99% effective in detecting both sick and healthy people. Your test comes back positive. Are you sick?
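
The 0.98% figure quoted in the reply follows from Bayes' theorem. Below is a minimal sketch of that calculation, assuming "99% effective in detecting both sick and healthy people" means a sensitivity and a specificity of 0.99 each (the post does not define the term). The function and parameter names are illustrative, not taken from the thread.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(sick | positive test) via Bayes' theorem."""
    # P(positive and sick): a sick person is correctly flagged.
    true_positive = sensitivity * prevalence
    # P(positive and healthy): a healthy person is wrongly flagged.
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Assumed reading of the question: 1 in 10,000 prevalence, 99% for both rates.
ppv = positive_predictive_value(
    prevalence=1 / 10_000,
    sensitivity=0.99,
    specificity=0.99,
)
print(f"P(sick | positive) = {ppv:.2%}")
```

Because the disease is so rare, false positives from the healthy majority far outnumber true positives, which is why a positive result still leaves the probability of being sick below 1%.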

SujeevanRatnasingham retweeted

Prof Lennart Nacke, PhD@acagamic·Aug 6

10 research gap types and how to bridge them

SujeevanRatnasingham@DNAdiversity·Aug 2

@buchner_dominik There is, and we’re looking at weeks.

DominikBuchner@buchner_dominik·Aug 2

@DNAdiversity Is there already a timeline for this? Are we talking weeks, months, or years?

DominikBuchner@buchner_dominik·Aug 1

BOLD... are you serious?! We will need an alternative soon!

SujeevanRatnasingham@DNAdiversity·Aug 2

@buchner_dominik I fully agree. Over 2.5M IPs request data services from BOLD annually. It has needed a significant software and hardware upgrade for some time, and recent funding has allowed that to happen. This outage is temporary and is part of the upgrade to BOLD5.

DominikBuchner@buchner_dominik·Aug 1

Don’t get me wrong, their work for the community has been fantastic. But limiting the API to 3 requests per minute, and now this… It’s a serious potential danger for the widespread use of genetic methods.

SujeevanRatnasingham retweeted

ACDB lab@acdblab·Jul 27

Thrilled to spotlight the 24 women of the NbS Guinean Forests Project in Côte d'Ivoire! Their efforts with Malaise traps in Divo Botanical Reserve will provide crucial insect data. 🦋🌿 @WorldUniService @CECI_Canada @iBOLConsortium @CIFOR_ICRAF_WCA #Biodiversity #WomenInScience

SujeevanRatnasingham@DNAdiversity·Jul 23

Didi was a bright young star whose light was extinguished early. She was a north star for many young women in South Africa. She was a beautiful person and a dedicated scientist. I will miss her. mg.co.za/news/2024-07-1…

SujeevanRatnasingham@DNAdiversity·Jul 20

@DrRDRattray Congratulations, Ryan! It is nice to have some positive news.

SujeevanRatnasingham@DNAdiversity·Jul 12

@acdblab Congratulations, Michelle! Well deserved!

SujeevanRatnasingham retweeted

ACDB lab@acdblab·Jul 11

Prof. Michelle Van der Bank has been awarded the SA Academy of Science and Culture’s Medal of Honour for contribution to science in South Africa. Read more: shorturl.at/2g0zq

SujeevanRatnasingham retweeted

Naiara@NaiaraLopezRojo·Jul 6

Extremely happy to share this article! Within the @DRYvER_H2020 project, we analysed the GHG emissions in European drying river networks. Drying had a legacy effect on both CO2 and CH4, and riverbeds represented >50% of total annual C emissions in 3 of the 6 case studies.

Grenoble, France 🇫🇷

SujeevanRatnasingham retweeted

Michal Rindos@Rindo77·Jun 15

Our nun moth study is out 🔎🧬🌲🦋🌍 onlinelibrary.wiley.com/doi/10.1111/zs…

SujeevanRatnasingham retweeted

Anish Kirtane@DNAsaur_·Jul 2

My PhD thesis is due in a couple of weeks. With so many things to take care of, I have adopted more effective to-do lists to manage my tasks, and I am never going back! Here are my tips for best practices to improve output (1/n) #phdlife #AcademicChatter #phdvoice

SujeevanRatnasingham@DNAdiversity·Jun 4

@JaneLubchenco46 @kirstenjharper It’s so energizing to see this White House advance science and technology towards biodiversity monitoring and protection. #sciencewin

Jane Lubchenco@JaneLubchenco46·Jun 3

I am excited to announce the National Aquatic eDNA Strategy. This strategy will accelerate the progress of fast, low-cost, and effective technologies for studying life in the ocean and how it’s changing. 🌊🧬 whitehouse.gov/ostp/news-upda…

SujeevanRatnasingham retweeted

Andrew Ng@AndrewYNg·May 2

Inexpensive token generation and agentic workflows for large language models (LLMs) open up intriguing new possibilities for training LLMs on synthetic data. Pretraining an LLM on its own directly generated responses to prompts doesn't help. But if an agentic workflow implemented with the LLM results in higher quality output than the LLM can generate directly, then training on that output becomes potentially useful.

Just as humans can learn from their own thinking, perhaps LLMs can, too. For example, imagine a math student who is learning to write mathematical proofs. By solving a few problems, even without external input, they can reflect on what does and doesn’t work and, through practice, learn how to more quickly generate good proofs.

Broadly, LLM training involves (i) pretraining (learning from unlabeled text data to predict the next word) followed by (ii) instruction fine-tuning (learning to follow instructions) and (iii) RLHF/DPO tuning to align the LLM’s output to human values. Step (i) requires many orders of magnitude more data than the other steps. For example, Llama 3 was pretrained on over 15 trillion tokens, and LLM developers are still hungry for more data. Where can we get more text to train on?

Many developers train smaller models directly on the output of larger models, so a smaller model learns to mimic a larger model’s behavior on a particular task. However, an LLM can’t learn much by training on data it generated directly, just like a supervised learning algorithm can’t learn from trying to predict labels it generated by itself. Indeed, training a model repeatedly on the output of an earlier version of itself can result in model collapse.

However, an LLM wrapped in an agentic workflow may produce higher-quality output than it can generate directly. In this case, the LLM’s higher-quality output might be useful as pretraining data for the LLM itself.

Efforts like these have precedents:
- When using reinforcement learning to play a game like chess, a model might learn a function that evaluates board positions. If we apply game tree search along with a low-accuracy evaluation function, the model can come up with more accurate evaluations. Then we can train that evaluation function to mimic these more accurate values.
- In the alignment step, Anthropic’s constitutional AI method uses RLAIF (RL from AI Feedback) to judge the quality of LLM outputs, substituting feedback generated by an AI model for human feedback.

A significant barrier to using LLMs prompted via agentic workflows to produce their own training data is the cost of generating tokens. Say we want to generate 1 trillion tokens to extend a pre-existing training dataset. Currently, at publicly announced prices, generating 1 trillion tokens using GPT-4-turbo ($30 per million output tokens), Claude 3 Opus ($75), Gemini 1.5 Pro ($21), and Llama-3-70B on Groq ($0.79) would cost, respectively, $30M, $75M, $21M and $790K.

Of course, an agentic workflow that uses a design pattern like Reflection would require generating more than one token per token that we would use as training data. But budgets for training cutting-edge LLMs easily surpass $100M, so spending a few million dollars more for data to boost performance is quite feasible. That’s why I believe agentic workflows will open up intriguing new opportunities for high-quality synthetic data generation.

[Original text: deeplearning.ai/the-batch/issu…]
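
The cost figures in the post above can be checked with a few lines of arithmetic. This is a rough sketch that uses only the per-million-token prices quoted in the post. The dictionary and function names are illustrative, and the calculation assumes exactly one generated token per training token, which the post itself notes an agentic workflow would exceed.

```python
# Output-token prices in USD per million tokens, as quoted in the post.
PRICE_PER_MILLION_USD = {
    "GPT-4-turbo": 30.0,
    "Claude 3 Opus": 75.0,
    "Gemini 1.5 Pro": 21.0,
    "Llama-3-70B on Groq": 0.79,
}

TARGET_TOKENS = 1_000_000_000_000  # 1 trillion tokens


def generation_cost_usd(total_tokens, price_per_million_usd):
    """Cost of generating total_tokens at a flat per-million-token price."""
    return total_tokens / 1_000_000 * price_per_million_usd


costs = {
    model: generation_cost_usd(TARGET_TOKENS, price)
    for model, price in PRICE_PER_MILLION_USD.items()
}

for model, cost in costs.items():
    print(f"{model}: ${cost:,.0f}")
```

Multiplying each result by the number of tokens a workflow actually generates per retained training token gives a less optimistic estimate.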

SujeevanRatnasingham@DNAdiversity·Apr 19

Incredible ceremony @ the Franklin Institute

SujeevanRatnasingham@DNAdiversity·Apr 19

The festivities continue

SujeevanRatnasingham@DNAdiversity·Apr 18

Privileged to be surrounded by so many luminaries at the Franklin Institute’s Committee on Science and the Arts Dinner. @CBG_UofG @iBOLConsortium

Philadelphia, PA 🇺🇸

SujeevanRatnasingham retweeted

University of Guelph@uofg·Mar 15

Today, the #UofG community honoured the groundbreaking achievement of evolutionary biologist Dr. Paul Hebert for the pioneering work he and the team at the Centre for Biodiversity Genomics have accomplished in DNA barcoding to catalogue life on Earth.

SujeevanRatnasingham@DNAdiversity·Feb 11

Long-standing frustration: The actions of the Democrats often fall short of their proclaimed ideals and rhetoric. #USpolitics youtube.com/watch?v=hNDgcj…
