The Crosstab Kite
84 posts

The Crosstab Kite
@CrosstabKite
A field guide for applied data science, machine learning, and AI
Joined January 2021
9 Following 66 Followers
The Crosstab Kite retweeted

A slightly tidied up version of my survival analysis applications article, a teaser for my upcoming ODSC West talk:
opendatascience.com/when-to-use-su…
The Crosstab Kite retweeted

ODSC West Conference Speaker: hubs.li/H0VfYvR0 #ODSCWest #DataScience #MachineLearning @brian_p_kent @CrosstabKite

The Crosstab Kite retweeted

Interested in survival analysis but not sure it applies to your work? Here's the article for you.
Short version: do you need to make decisions before seeing all the data? If yes, try survival analysis.
crosstab.io/articles/survi…
The Crosstab Kite retweeted

Why don't data scientists use survival analysis more often to understand and optimize how their businesses work? Maybe in part because survival and hazard curves are kinda tricky to estimate with SQL.
Here's how to do it.
crosstab.io/articles/sql-s…
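For readers who want to see what a survival curve estimate involves before opening the article, here is a minimal Kaplan-Meier sketch in plain Python. It is not the SQL approach from the linked post; the function name `kaplan_meier` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time each unit was followed.
    observed:  True if the event happened at that time, False if censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk units that survive past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: five units, two of them censored.
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
```

Censored units still count in the at-risk denominator until they leave, which is the part that tends to be awkward to express in a single SQL query.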
The Crosstab Kite retweeted

I wrote a blog post about how to build a data team in the form of a story (and it ended up being about 4x longer than I anticipated): erikbern.com/2021/07/07/the…

Super excited to attend #SciPy2021 next week!
SciPyConf @SciPyConf
T minus 10 days until #SciPy2021

Hot off the press! Check out our updated review of Amazon Textract, now with detailed code for converting the output into a Pandas DataFrame.
crosstab.io/articles/amazo…
The Crosstab Kite retweeted

It's here....🤩
Session State has officially landed 🥳 Now you can store information across app interactions and reruns. Upgrade to try it out!
📖 Read more: blog.streamlit.io/session-state-…
🕹 Sample app: share.streamlit.io/streamlit/rele…
#opensource #datascience #python
The Crosstab Kite retweeted

“Test & Roll: Profit-Maximizing A/B Tests” by Feit and Berman statmodeling.stat.columbia.edu/2021/06/30/tes…

Another nuts and bolts piece on The Crosstab Kite: How to draw survival curves in Python with Altair and Plotly. Lifelines and Convoys provide good Matplotlib-based convenience functions, but they hide all the good stuff!
crosstab.io/articles/survi…
The Crosstab Kite retweeted

"I think problem formulation is even more important than either data or models."
@reddup argues that data scientists should consider business goal-framing a core, crucial skill. buff.ly/3zG4KB8

What do pay stubs, invoices, and Covid vaccination cards have in common? They're all forms, they're all valuable sources of data, and they're all hard to work with programmatically.
Form Parser is Google's answer; here's our review and how-to guide.
crosstab.io/articles/googl…
The Crosstab Kite retweeted

Excited to announce a new large-scale data set for research into household energy usage. Lots of applications in studying how people use energy at home, using ML to give predictions and advice, etc... Congrats to the IDEAL team!
Martin Pullinger @DrMPullinger
Our IDEAL Household Energy Dataset now open access for energy research: electricity, gas, boiler & room temp. data, occupant surveys + weather. 255 homes (+ appliance usage for 35). Data: doi.org/10.7488/ds/2836. @ScientificData paper: https://doi.org/10.1038/s41597-021-00921-y.

Just realized our staged rollout analysis app is the featured @streamlit app for May - very cool!
blog.streamlit.io/monthly-rewind…

Our last article used Python to create duration tables from event logs, for survival modeling. Thing is, event logs are usually stored in databases we query with SQL, so in this post we do the same event-log-to-duration-table conversion with SQL.
crosstab.io/articles/event…
The Crosstab Kite retweeted

We asked our speakers what they're excited about for the future. @CrosstabKite's @reddup said: "From a methodological level, one thing I am excited about is causal #MachineLearning and causal thinking." #AnacondaCON


Based on some feedback and reflection, we've updated this article. Here's the new version.
crosstab.io/articles/event…

Time-to-conversion is the most natural way to model many applied data science problems, but creating conversion tables from raw data can be tricky. We show how to do it with event logs, using web browser activity as an example.
crosstab.io/articles/event…
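As a rough illustration of the event-log-to-duration-table idea, here is a minimal Python sketch. It is not the article's code: the function `duration_table`, the `(user, timestamp, kind)` event layout, the `"purchase"` endpoint, and the sample log are all assumptions made for this example.

```python
from datetime import datetime


def duration_table(events, endpoint, now):
    """Collapse an event log into one row per user.

    events:   iterable of (user, timestamp, kind) tuples (assumed layout).
    endpoint: the event kind that counts as a conversion.
    now:      cutoff time used for users who have not converted (censored).
    """
    first_seen, converted = {}, {}
    for user, ts, kind in sorted(events, key=lambda e: e[1]):
        first_seen.setdefault(user, ts)      # entry time = earliest event
        if kind == endpoint:
            converted.setdefault(user, ts)   # keep only the first conversion
    rows = []
    for user, start in first_seen.items():
        end = converted.get(user, now)
        rows.append({
            "user": user,
            "duration_days": (end - start).days,
            "converted": user in converted,
        })
    return rows


# Hypothetical browser activity log.
log = [
    ("a", datetime(2021, 6, 1), "visit"),
    ("a", datetime(2021, 6, 4), "purchase"),
    ("b", datetime(2021, 6, 2), "visit"),
    ("b", datetime(2021, 6, 3), "visit"),
]
table = duration_table(log, endpoint="purchase", now=datetime(2021, 6, 10))
```

Users who never reach the endpoint keep a row with `converted` set to False and a duration measured up to the cutoff, which is what lets a survival model treat them as censored rather than dropping them.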
The Crosstab Kite retweeted

Tabular Data: Deep Learning is Not All You Need. Nice comparison betw XGBoost & recent DNNs for tabular data. Surprise?! XGBoost comes out on top for most datasets (esp. those not incl in the DNN papers). What's even better? An ensemble of XGBoost & DNNs arxiv.org/abs/2106.03253


The Crosstab Kite retweeted

I finally got around to writing this tweet up 😅 as a post about the importance of understanding data-specific patterns veekaybee.github.io/2021/06/06/has…
vicki @vboykis
If I had to pick a single programming concept where understanding it is like a superpower, it would probably be the hash map (aka in Python, the humble dictionary) because I've seen the pattern come up in almost every kind of data/programming work I've ever done.
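One common form of the pattern, sketched here as an assumption about what the tweet has in mind: using a dictionary to group records by a key in a single pass. The helper `group_by_key` and the sample records are hypothetical.

```python
def group_by_key(records, key):
    """Group a list of dict records by the value stored under `key`."""
    groups = {}
    for record in records:
        # setdefault creates the list the first time a key value is seen.
        groups.setdefault(record[key], []).append(record)
    return groups


# Hypothetical page-view records.
views = [
    {"user": "a", "page": "/home"},
    {"user": "b", "page": "/pricing"},
    {"user": "a", "page": "/docs"},
]
by_user = group_by_key(views, "user")
```

Each lookup and insert is constant time on average, so the grouping takes one pass over the data instead of a nested scan.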
