Glowing Python
5.1K posts

@JustGlowing
Tweeting about Python trickeries, Data Visualization and Machine Learning. Half human data scientist, half machine.
Cambridge, UK. Joined December 2010
397 Following, 2.8K Followers

Nice visualization done using MiniSom. From the paper "Cross-regional impact of land and ocean evaporation on extreme precipitation in North China" sciencedirect.com/science/articl…


🎉 MiniSom 2.3.6 is out!
🆕 Offline batch training is now implemented!
Faster & more robust PCA init, better numerical stability, and doc fixes.
github.com/JustGlowing/mi…
Thanks to @mariajmolina and @lorenzoferre for their contributions! 🙌

🎉 MiniSom now supports batch training with the new method train_batch_offline().
The changes are in the master branch and will soon be released.
github.com/JustGlowing/mi…

🚨PSA for #PySpark users:
df.limit(n) is NOT deterministic! ⚠️
It takes up to n rows from each partition and then keeps n of them, so the final set can vary between runs unless you explicitly orderBy().
If you need consistent results:
👉 df.orderBy("some_col").limit(n)
#BigData #SparkTips

Tackling class imbalance? Try imbalanced-learn (imblearn) — works seamlessly with #scikitLearn!
from imblearn.over_sampling import SMOTE
X_res, y_res = SMOTE().fit_resample(X, y)
It also supports under/over sampling, pipelines & ensembles. ⚖️
#Python #ML #DataScience

Handling missing data in #ML? #scikitLearn gives you several imputers:
SimpleImputer → mean/median/mode/constant
KNNImputer → use nearest neighbors
IterativeImputer → model-based (like MICE)
MissingIndicator → flag NaNs
Clean data = better models!
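
A minimal sketch of the four imputers side by side, assuming a small made-up array X (the data and variable names are illustrative only). Note that IterativeImputer is still marked experimental in scikit-learn, so it needs the explicit enable import first.

```python
import numpy as np
# IterativeImputer is experimental: this import must come before it.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import (
    SimpleImputer,
    KNNImputer,
    IterativeImputer,
    MissingIndicator,
)

# Toy data (hypothetical) with one NaN in each of the first two columns.
X = np.array([
    [1.0, 2.0, 3.0],
    [np.nan, 4.0, 5.0],
    [5.0, np.nan, 7.0],
    [7.0, 8.0, 9.0],
])

# Replace each NaN with its column mean.
mean_filled = SimpleImputer(strategy="mean").fit_transform(X)

# Replace each NaN with the average of the 2 nearest rows.
knn_filled = KNNImputer(n_neighbors=2).fit_transform(X)

# Model each feature with missing values from the other features.
iter_filled = IterativeImputer(random_state=0).fit_transform(X)

# Boolean mask of missing entries, only for columns that contain NaNs.
nan_mask = MissingIndicator().fit_transform(X)
```

A common pattern is to concatenate nan_mask with the imputed matrix so the model can still see where values were missing.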

Got messy numeric data like "N/A", "—", or "??"?
Use pd.to_numeric() to convert safely:
pd.to_numeric(df["col"], errors="coerce")
Non-numeric values become NaN — perfect for cleaning before modeling. 🧹📊
#DataCleaning #Python #ML
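
A small sketch of the conversion on a made-up column (the DataFrame and the col_num / n_bad names are assumptions for illustration). It also counts how many values were coerced, which is worth checking before the NaNs are filled or dropped.

```python
import pandas as pd

# Hypothetical messy column: numbers stored as text plus placeholder strings.
df = pd.DataFrame({"col": ["1", "2.5", "N/A", "-", "??", "7"]})

# Anything that cannot be parsed as a number becomes NaN instead of raising.
df["col_num"] = pd.to_numeric(df["col"], errors="coerce")

# How many entries could not be converted.
n_bad = int(df["col_num"].isna().sum())
```

With the default errors="raise", the same call would stop at the first unparseable value, which can be the better choice when bad values are not expected.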

sklearn has the function brier_score_loss, which measures the mean squared difference between the predicted probability and the actual outcome. Lower is better.
scikit-learn.org/stable/modules…
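
A quick sketch for the binary case, comparing the library call with the formula written out by hand. The labels and probabilities are made up for illustration.

```python
import numpy as np
from sklearn.metrics import brier_score_loss

# Hypothetical binary labels and predicted probabilities of the positive class.
y_true = np.array([0, 1, 1, 0])
y_prob = np.array([0.1, 0.9, 0.8, 0.3])

# Library implementation.
score = brier_score_loss(y_true, y_prob)

# Same quantity by hand: mean squared difference between
# predicted probability and actual outcome.
manual = np.mean((y_prob - y_true) ** 2)
```

Both values come out to 0.0375 here; a perfect forecaster would score 0.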

In Python, the with statement is elegance in action.
It handles setup and cleanup so you can focus on logic:
with open("data.txt") as f:
    data = f.read()
No need to call close(): the context manager takes care of it, even if an exception is raised.
Less noise, more clarity.
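
A small sketch showing that the cleanup still happens when the block fails. It uses a throwaway temp file rather than data.txt, and the simulated error is only there for illustration.

```python
import os
import tempfile

# Create a throwaway file to read from.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as f:
    f.write("hello")

try:
    with open(path) as f:
        data = f.read()
        # Simulate something going wrong inside the block.
        raise ValueError("simulated failure")
except ValueError:
    pass

# The file was closed on the way out, despite the exception.
closed_after_error = f.closed

os.remove(path)
```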

Ever needed the Kronecker product in #Python? numpy.kron() is your friend!
import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 5], [6, 7]])
np.kron(A, B)
Expand matrices in a snap!
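
For reference, a sketch of what that call produces: every entry of A is multiplied by the whole of B, giving a 4x4 block matrix. The names K and top_right are just for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 5], [6, 7]])

# Block layout: [[1*B, 2*B],
#                [3*B, 4*B]]
K = np.kron(A, B)

# The top-right 2x2 block is A[0, 1] * B.
top_right = K[:2, 2:]
```

In general, for A of shape (m, n) and B of shape (p, q), the result has shape (m*p, n*q).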

