I’m writing a talk for PyData NYC right now, and it’s the first talk I’ve ever written about what I do at work.
I’ve seen a lot of “training a model with scikit-learn for beginners”
talks. They are not the talk I’m going to give. If you’ve never done any
machine learning, it’s fun to realize that there are tools you can
use to start training models really easily. I made a tiny example of
generating some fake data and training a simple model that
you can look at.
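The original example isn’t reproduced here, but a minimal sketch of that kind of thing might look like this (the `make_classification` fake data and the logistic regression are my placeholder choices, not necessarily what the example used):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Make up 1000 fake data points: 5 features, a binary label
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)

# Train a simple model on the fake data
model = LogisticRegression()
model.fit(X, y)

# Ask the model for a prediction on the first point
print(model.predict(X[:1]))
```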
But honestly, how to use scikit-learn is not something I struggle with,
and I wanted to talk about something harder.
I want to talk about what happens after you train a model.
How well does it work?
If you’re building a model to predict something, the first question
anyone’s going to ask you is:
“So, how well does it work?”
I often feel like the only thing I’ve ever learned about machine
learning is how important it is to be able to answer this question, and
how hard it is. If you read Cathy O’Neil’s blog posts about why models
that measure teachers’ performance are flawed, you see this everywhere:
“We should never trust a so-called ‘objective mathematical model’ when
we can’t even decide on a definition of success.”

“If it were a good model, we’d presumably be seeing a comparison of
current VAM scores and current other measures of teacher success and
how they agree. But we aren’t seeing anything like that.”
If your model is actually doing something important (deciding whether
teachers should lose their jobs, or how risky a stock portfolio is, or
what the weather will be tomorrow), you have to measure whether it’s
working.
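To make “measuring it” a little more concrete, here’s a minimal sketch of one honest way to start answering the question, assuming a held-out test set and a trivial most-frequent-class baseline (the fake dataset and both models are placeholders I picked, not anything from the talk):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)

# Hold out data the model never sees during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# "How well does it work?" means scoring on held-out data
# *and* comparing against a trivial baseline, not just
# reporting one impressive-sounding number.
print("model accuracy:   ", accuracy_score(y_test, model.predict(X_test)))
print("baseline accuracy:", accuracy_score(y_test, baseline.predict(X_test)))
```

If your model barely beats the baseline, that’s worth knowing before anyone loses their job over its predictions.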