rohan
219 posts

@bali2ro
earth observation machine learning and uncertainty under shift, asking when benchmarks fool us.



This is the best ML meme for me.





In Cowork, you give Claude access to a folder on your computer. Claude can then read, edit, or create files in that folder. Try it to create a spreadsheet from a pile of screenshots, or produce a first draft from scattered notes.



Claude Code and its ilk are coming for the study of politics like a freight train. A single academic is going to be able to write thousands of empirical papers (especially survey experiments or LLM experiments) per year. Claude Code can already essentially one-shot a full AJPS-style survey experiment paper (with access to Prolific API). We'll need to find new ways of organizing and disseminating political science research in the very near future for this deluge.


CS229 online : day 5 : support vector machines. continued naive Bayes and tried Laplace smoothing, which is just adding 1 to the numerator and the number of possible values of the variable (2 for a binary word feature) to the denominator of each probability estimate, so that a word never seen in training doesn't get probability zero and break Bayes rule with a 0/0. dived deeper into the multivariate Bernoulli event model for the email spam classification example, which broke down when a new word was provided. to go beyond it, looked at another generative model for the same task, called the multinomial event model. started with support vector machines, which in simple words means turning the input features into a high-dimensional feature vector and then applying a linear classifier over it. understood the concept of the optimal margin classifier and the relation between functional and geometric margin (the geometric margin is the functional margin divided by the norm of w). basically the optimal margin classifier is just choosing parameters to maximize the geometric margin.
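A rough sketch of two ideas from the day 5 notes, in plain Python: the Laplace-smoothed estimate, and functional versus geometric margin. The function names and the toy numbers are made up for illustration, not taken from the course, and labels are assumed to be -1 or +1.

```python
import math

def laplace_estimate(count, total, num_values):
    """Smoothed estimate (count + 1) / (total + num_values).

    count: how often the value was seen in this class (assumed input)
    total: number of training examples in this class
    num_values: how many values the variable can take (2 for a binary word feature)
    """
    return (count + 1) / (total + num_values)

def functional_margin(w, b, x, y):
    """y * (w . x + b), with the label y assumed to be -1 or +1."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def geometric_margin(w, b, x, y):
    """Functional margin divided by the Euclidean norm of w."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return functional_margin(w, b, x, y) / norm

# A word never seen in 10 spam emails still gets a nonzero probability.
unseen = laplace_estimate(0, 10, 2)

# Scaling (w, b) by 10 changes the functional margin but not the geometric one.
w, b, x, y = [3.0, 4.0], 0.0, [1.0, 1.0], 1
scaled_w, scaled_b = [10 * wi for wi in w], 10 * b
```

Here `unseen` comes out as 1/12 instead of 0. Rescaling `w` and `b` multiplies the functional margin by 10 and leaves the geometric margin alone, which is why the optimal margin classifier maximizes the geometric one.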







CS229 online : day 4 : gaussian discriminant analysis and naive bayes. studied the difference between discriminative and generative learning algorithms: a generative algorithm models how the inputs are distributed for each class, p(x|y), together with p(y), instead of modelling the target p(y|x) directly. dived deeper into Gaussian discriminant analysis, a generative learning algorithm, where we assume each class's inputs follow a Gaussian distribution and use the class means and the covariance as the parameters, fitted by maximum likelihood. the resulting posterior p(y=1|x) turns out to be a sigmoid function of x, so the decision boundary is linear. turns out any exponential family class-conditional leads to the same sigmoid form, but the reverse isn’t true: a sigmoid posterior doesn’t imply Gaussian inputs. basically, generative learning algorithms can be a good fit when the data clearly follows a known distribution, but when the distribution is unknown logistic regression is the safer choice. lastly, naive Bayes, another generative learning algorithm, on the classic email spam classification, where we assume the input words are conditionally independent given the label, i.e. once we know an email is spam, seeing “Ant” in it tells us nothing about whether “cant” is in it. so the other words drop out of the conditioning in the probability chain rule. then, fitting the parameters by maximum likelihood estimation gives, for each word, the fraction of spam emails that contain it.
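A small numeric check of the GDA-to-sigmoid claim from the day 4 notes, as a sketch only. It assumes one-dimensional inputs and a standard deviation shared by both classes. The function names and parameter values are invented for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def posterior_bayes(x, phi, mu0, mu1, sigma):
    """p(y=1 | x) from Bayes rule, with class prior phi = p(y=1)."""
    p1 = gaussian_pdf(x, mu1, sigma) * phi
    p0 = gaussian_pdf(x, mu0, sigma) * (1 - phi)
    return p1 / (p0 + p1)

def posterior_sigmoid(x, phi, mu0, mu1, sigma):
    """The same posterior written as sigmoid(theta1 * x + theta0)."""
    theta1 = (mu1 - mu0) / sigma ** 2
    theta0 = (mu0 ** 2 - mu1 ** 2) / (2 * sigma ** 2) + math.log(phi / (1 - phi))
    return 1 / (1 + math.exp(-(theta1 * x + theta0)))

# Hypothetical parameters; the two forms should agree at every x.
phi, mu0, mu1, sigma = 0.3, -1.0, 2.0, 1.5
max_gap = max(
    abs(posterior_bayes(x, phi, mu0, mu1, sigma) - posterior_sigmoid(x, phi, mu0, mu1, sigma))
    for x in [-3.0, -1.0, 0.0, 0.5, 2.0, 3.0]
)
```

`max_gap` should sit at floating-point noise level. The two forms agree because the quadratic terms in x cancel in the log-odds when both classes share a variance, leaving a linear function of x inside the sigmoid.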


