Jasper Obico ๐Ÿ‰

4.2K posts

Jasper Obico ๐Ÿ‰ banner
Jasper Obico ๐Ÿ‰

Jasper Obico ๐Ÿ‰

@jasperidium

Associate Professor @UPManilaOnline. Plant systematics & conservation of Philippine flora. PhD Plant Biology @UCNZ. Views my own. ๐Ÿ‡ต๐Ÿ‡ญ๐Ÿ‡ณ๐Ÿ‡ฟ๐Ÿณ๏ธโ€๐ŸŒˆ

๊ฐ€์ž…์ผ Temmuz 2009
1.9K ํŒ”๋กœ์ž‰576 ํŒ”๋กœ์›Œ
Jasper Obico retweeted
Joachim Schork @JoachimSchork
When predictor variables are too closely related, your regression model struggles to determine which one truly matters. This issue, known as multicollinearity, inflates standard errors, distorts coefficient estimates, and weakens model reliability. The Variance Inflation Factor (VIF) helps detect and quantify this problem, ensuring more stable and interpretable results.

✔️ A VIF below 5 suggests low multicollinearity, while values between 5 and 10 indicate moderate correlation that may require attention. A VIF above 10 is considered problematic, as it can significantly distort regression estimates.
✔️ Addressing high VIF values improves model stability. Strategies include removing redundant variables, combining correlated predictors, using Principal Component Analysis (PCA), or applying regularization techniques like ridge regression.
❌ VIF only detects linear relationships, meaning nonlinear dependencies may go unnoticed. Alternative methods, such as Generalized Additive Models (GAMs) or mutual information, can capture nonlinear correlations.
❌ VIF does not indicate whether collinearity affects the target variable, so it should be used alongside domain knowledge and model evaluation techniques. Even if VIF is high, multicollinearity is only a concern if it negatively impacts model predictions or inference.

The image below was created in R and shows a VIF plot categorizing predictor variables into low (green), moderate (blue), and high (red) multicollinearity. Variables X1 and X3 have high VIF values, indicating strong collinearity that should be addressed before interpreting the model.

🔹 In R, vif() from the car package computes VIF, while check_collinearity() from performance provides visualization. Ridge regression with glmnet can mitigate multicollinearity by applying regularization.
🔹 In Python, variance_inflation_factor() from statsmodels.stats.outliers_influence quantifies multicollinearity, and ridge regression with sklearn.linear_model.Ridge() helps stabilize estimates by penalizing large coefficients.

Looking to improve your regression models? Check out my online course on Statistical Methods in R! Further details: statisticsglobe.com/online-course-… #R4DS #DataViz #Statistical #RStats #pythonlearning #Python #datavis #Rpackage
[image]
6 replies · 62 reposts · 371 likes · 16.3K views
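The VIF rule of thumb in the retweet above can be illustrated with a minimal pure-Python sketch. This is my own toy example, not from the thread: the helper names and data are invented, and it exploits the fact that with exactly two predictors the auxiliary R² is just the squared Pearson correlation, so VIF = 1 / (1 − r²).

```python
# Two-predictor VIF sketch (illustration only; data are made up).
# With exactly two predictors, the R-squared from regressing one on
# the other equals their squared Pearson correlation, so
# VIF = 1 / (1 - r^2).
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def vif_two_predictors(x1, x2):
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]  # nearly a copy of x1
print(vif_two_predictors(x1, x2))  # far above the 10 threshold
```

In real analyses you would use car::vif() in R or statsmodels' variance_inflation_factor() in Python, as the tweet notes; this sketch only shows why near-duplicate predictors blow the statistic up.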
Jasper Obico retweeted
Journal of Applied Ecology @JAppliedEcology
Urbanised landscape and microhabitat differences can influence flowering phenology and synchrony in an annual herb 🌿 Findings revealed that populations in urban sites exhibited earlier flowering onset compared to rural locations 🏢 👇 Read here: buff.ly/VWtG2kW
[image]
0 replies · 4 reposts · 10 likes · 659 views
Jasper Obico retweeted
Journal of Applied Ecology @JAppliedEcology
Validating habitat suitability models for pine marten (Martes martes) reintroductions to England and Wales 📈 The adaptive methodology can be applied to other species as reintroduction projects continue to grow in popularity 💭 🔗 doi.org/10.1111/1365-2…
[image]
0 replies · 10 reposts · 31 likes · 1.6K views
Jasper Obico retweeted
Nature Reviews Biodiversity @NatRevBiodiv
New online! Drivers and solutions to Southeast Asiaโ€™s biodiversity crisis bit.ly/4o4alKA
[image]
4 replies · 106 reposts · 247 likes · 15.1K views
Jasper Obico retweeted
Dr. Laura Bertola @LauraDBertola
Negative global-scale association between genetic diversity and speciation rates in mammals nature.com/articles/s4146…
2 replies · 22 reposts · 86 likes · 5.6K views
Jasper Obico retweeted
Joachim Schork @JoachimSchork
In statistics, Frequentist and Bayesian approaches are two major methods of inference. While they aim to solve similar problems, they differ in their interpretation of probability and handling of uncertainty.

Frequentist Approach: Frequentists interpret probability as the long-run frequency of events. Parameters (like the mean) are fixed but unknown, and inference relies on analyzing repeated samples.
✔️ Key Concept: Frequentist methods estimate a single, true parameter value based on hypothetical repeated sampling.
✔️ Confidence Intervals: A 95% confidence interval means that in repeated samples, 95% of intervals would contain the true value, not that there's a 95% chance for a single interval.
✔️ Hypothesis Testing: P-values measure how likely the observed data (or more extreme data) would be under the null hypothesis. If the p-value is low (e.g., < 0.05), we reject the null.
❌ Limitations (Frequentist): P-values can be misinterpreted and do not directly indicate the truth of a hypothesis, and frequentist methods do not incorporate prior knowledge, limiting flexibility when such information is available.

Bayesian Approach: Bayesians interpret probability as degrees of belief or certainty about an event, updated as new evidence emerges.
✔️ Key Concept: Bayesian methods start with a prior belief about a parameter, which is updated with data to produce the posterior distribution, reflecting the updated understanding.
✔️ Credible Intervals: A 95% credible interval means there's a 95% probability the parameter lies within this range, given the data and prior.
✔️ Incorporating Prior Knowledge: Bayesian methods incorporate prior information, making them flexible for combining expert opinions or past data.
❌ Limitations (Bayesian): The choice of prior can be subjective and influence results, and Bayesian methods often require intensive computation, especially for complex models like those using MCMC.

The graph compares Frequentist confidence intervals (blue) from 20 samples with a single Bayesian credible interval (red). Frequentist intervals vary across samples, showing where the true parameter would fall in repeated sampling. In contrast, the Bayesian interval shows the 95% probability that the parameter lies within the range, given the data and prior. This highlights their different approaches to uncertainty.

For more tips and insights on data science, sign up for my free newsletter! More details are available at this link: eepurl.com/gH6myT #datavis #DataViz #RStats #DataAnalytics
[image]
1 reply · 33 reposts · 182 likes · 6.8K views
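The two interval notions in the retweet above can be checked by simulation. This is a pure-Python sketch of my own, not from the thread: mu=10, sigma=2, n=25 and the conjugate normal prior N(0, 5²) are arbitrary illustration choices, and sigma is treated as known throughout.

```python
# Frequentist side: over repeated samples, the 95% CI
# xbar +/- 1.96*sigma/sqrt(n) should cover mu about 95% of the time.
import math
import random

random.seed(42)
mu, sigma, n = 10.0, 2.0, 25

trials, covered = 2000, 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    half = 1.96 * sigma / math.sqrt(n)
    if xbar - half <= mu <= xbar + half:
        covered += 1
print(covered / trials)  # close to 0.95 by construction

# Bayesian side: with a conjugate normal prior N(m0, s0^2), one sample
# yields a posterior, and the 95% credible interval is a direct
# probability statement about mu given that data and prior.
m0, s0 = 0.0, 5.0
sample = [random.gauss(mu, sigma) for _ in range(n)]
xbar = sum(sample) / n
post_var = 1.0 / (1.0 / s0**2 + n / sigma**2)
post_mean = post_var * (m0 / s0**2 + n * xbar / sigma**2)
half = 1.96 * math.sqrt(post_var)
print((round(post_mean - half, 2), round(post_mean + half, 2)))
```

The contrast mirrors the graph described in the tweet: the frequentist statement is about the procedure across repeated samples, the Bayesian statement is about the parameter given one sample and a prior.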
Jasper Obico retweeted
Florian Altermatt @florianaltermatt.bsky.social
Our study 'The #global #human impact on #biodiversity' is out @Nature nature.com/articles/s4158… 🌍🌏🐟🌿🪲 🫎🦋🪲 Unprecedented #synthesis of >2000 studies shows humans are not only shrinking species numbers—but reshaping entire communities across the planet. 🧵 1/4
[image]
4 replies · 81 reposts · 179 likes · 8.9K views
Richard Heydarian @RichHeydarian
(Fellow) MILLENNIALS — and the Gen Z will DECIDE this ELECTIONS!!!
[image]
68 replies · 372 reposts · 1.6K likes · 210.5K views
Jasper Obico retweeted
Selçuk Korkmaz @selcukorkmaz
Maximum Likelihood Estimation (MLE) is a fundamental method in statistics for estimating the parameters of a probability distribution based on observed data. The core idea is to determine the parameter values that make the observed data most probable under the assumed statistical model.

How MLE Works:
1. Define the Likelihood Function: Given a statistical model with an unknown parameter (or parameters) θ and observed data X, the likelihood function L(θ; X) represents the probability of observing the data X given the parameter θ. For independent observations, this is typically the product of individual probabilities or probability densities.
2. Maximize the Likelihood: MLE seeks the parameter value θ̂ that maximizes the likelihood function. In practice, it's often more convenient to maximize the natural logarithm of the likelihood function, known as the log-likelihood, due to its mathematical properties.
3. Solve for the Parameter Estimates: To find θ̂, take the derivative of the log-likelihood function with respect to θ, set it to zero, and solve for θ. This yields the parameter value that maximizes the likelihood of observing the given data.

Example: Estimating the Mean of a Normal Distribution
Suppose we have a sample of data points assumed to come from a normal distribution with unknown mean μ and known variance σ². The likelihood function for this sample is:

L(μ; X) = ∏ (1 / √(2πσ²)) exp[-(xi - μ)² / (2σ²)]

Taking the natural logarithm to obtain the log-likelihood:

log L(μ; X) = -(n/2) log(2πσ²) - (1 / (2σ²)) ∑ (xi - μ)²

Differentiating with respect to μ and setting the derivative to zero:

d(log L) / dμ = (1 / σ²) ∑ (xi - μ) = 0

Solving for μ gives:

μ̂ = (1 / n) ∑ xi

Thus, the MLE for the mean μ is the sample mean.

Properties of MLE:
• Consistency: As the sample size increases, the MLE converges to the true parameter value.
• Asymptotic Normality: For large samples, the distribution of the MLE approaches a normal distribution centered at the true parameter value.
• Efficiency: MLE achieves the lowest possible variance among unbiased estimators under certain regularity conditions.

#Statistics #DataScience #Research #Science
[image]
2 replies · 42 reposts · 256 likes · 14.6K views
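The normal-mean derivation in the retweet above admits a quick numerical sanity check. The data and helper name below are my own made-up illustration: the log-likelihood evaluated at the sample mean should beat any nearby candidate value.

```python
# Check numerically that the normal log-likelihood (known sigma)
# peaks at the sample mean, matching the closed-form MLE derivation.
import math

def normal_loglik(mu, xs, sigma=1.0):
    n = len(xs)
    return (-n / 2 * math.log(2 * math.pi * sigma**2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma**2))

xs = [2.1, 1.8, 2.5, 2.2, 1.9]  # made-up sample
mle = sum(xs) / len(xs)         # closed-form MLE: the sample mean

# The sample mean dominates shifted candidates on either side:
for delta in (-0.5, -0.1, 0.1, 0.5):
    assert normal_loglik(mle, xs) > normal_loglik(mle + delta, xs)
print(mle)  # prints 2.1
```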
Jasper Obico retweeted
Science Magazine @ScienceMagazine
In the 20 years since the term "microplastics" was first coined, a rapidly growing body of research has consistently shown how pervasive and problematic the pollutants have become. A new #ScienceReview provides an overview of this research and the progress made in understanding #microplastics. bit.ly/4doTICB
[image]
10 replies · 284 reposts · 610 likes · 74.9K views
Jasper Obico retweeted
Anna Brüniche-Olsen @AnnaBruniche
Interested in how genomic data can help improve conservation assessments and population monitoring? Check out our paper in @EvolAppJournal where we outline background, provide a bioinformatic pipeline, and suggest an analytical framework 🧵 doi.org/10.1111/eva.70…
[image]
1 reply · 28 reposts · 86 likes · 6.5K views
Jasper Obico retweeted
Prof Lennart Nacke, PhD @acagamic
Basic statistics concepts (that all researchers should know)
[image]
14 replies · 570 reposts · 3K likes · 360.1K views
Jasper Obico retweeted
Science Magazine @ScienceMagazine
A new #MachineLearning analysis has revealed the most effective climate policies out of 1500 implemented worldwide over the last two decades. Learn more in Science: scim.ag/89A
[image]
10 replies · 55 reposts · 152 likes · 35.3K views
Jasper Obico retweeted
Emerson Del Ponte @edelponte
The training tool used in this study is an online app made with R Shiny, currently featuring nine plant diseases. It is freely available here: delponte.shinyapps.io/traineR2/
[image]
Quoted tweet (Ignacio Cazon @IgnacioCazon1):
📣 New preprint 📣 Optimizing visual estimation of peanut late leaf spot severity with online training sessions and standard area diagrams. doi.org/10.31219/osf.i… Thanks @Juan_A_Paredes, @edelponte, Cinthia Conforto, Noelia González and Lautaro Suárez!
0 replies · 33 reposts · 130 likes · 9.8K views