Julia Evans

Fun with machine learning: does your model actually work?


I'm writing a talk for PyData NYC right now, and it's the first talk I've ever written about what I do at work.

I've seen a lot of "training a model with scikit-learn for beginners" talks. They are not the talk I'm going to give. If you've never done any machine learning, it's fun to realize that there are tools you can use to start training models really easily. I made a tiny example that generates some fake data and trains a simple model, which you can look at.
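For readers who haven't seen this kind of thing, a minimal sketch of such an example might look like the following. This is not the notebook itself: the fake data comes from scikit-learn's `make_classification`, and the sample size, feature count, and choice of logistic regression are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Generate fake binary-classification data (sizes are arbitrary assumptions).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Hold out a test set so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train a simple model.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, model.predict(X_test))
print(matrix)
```

Getting to this point really is the easy part; the rest of this post is about what comes after.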

But honestly how to use scikit-learn is not something I struggle with, and I wanted to talk about something harder.

I want to talk about what happens after you train a model.

How well does it work?

If you're building a model to predict something, the first question anyone's going to ask you is:

"So, how well does it work?"

I often feel like the only thing I've ever learned about machine learning is how important it is to be able to answer this question, and how hard it is. If you read Cathy O'Neil's blog posts about why models to measure teachers' teaching are flawed, you see this everywhere:

we should never trust a so-called “objective mathematical model” when we can’t even decide on a definition of success

If it were a good model, we’d presumably be seeing a comparison of current VAM scores and current other measures of teacher success and how they agree. But we aren’t seeing anything like that.

If your model is actually doing something important (deciding whether teachers should lose their jobs, or how risky a stock portfolio is, or what the weather will be tomorrow), you have to measure if it's working.

There's no fixed answer to how to do this -- if it were easy, statisticians wouldn't have jobs. In the notebook I linked to, we looked at the confusion matrix for our classifier:

[[8953 3508]
 [3500 9039]]
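To turn those four numbers into the usual summary rates, here is a small sketch. It assumes the matrix follows scikit-learn's layout (rows are true classes, columns are predicted classes, negative class first); if the notebook used a different layout, the names below would need to be swapped.

```python
# Confusion matrix from the post, ASSUMED to be in scikit-learn's layout:
# [[true negatives, false positives],
#  [false negatives, true positives]]
matrix = [[8953, 3508],
          [3500, 9039]]

(tn, fp), (fn, tp) = matrix
total = tn + fp + fn + tp

accuracy = (tp + tn) / total          # fraction of all points classified correctly
precision = tp / (tp + fp)            # of the points flagged positive, fraction truly positive
recall = tp / (tp + fn)               # of the truly positive points, fraction that were caught
false_positive_rate = fp / (fp + tn)  # of the truly negative points, fraction wrongly flagged

print(f"total={total} accuracy={accuracy:.3f}")
print(f"precision={precision:.3f} recall={recall:.3f} fpr={false_positive_rate:.3f}")
```

Under that assumption the model gets roughly 72% of the 25,000 points right, which is only one of many numbers you might care about.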

We could have instead calculated a score (0.2, 0.8, ...) for each data point and looked at something called the ROC curve. (One day maybe I will explain how Steven Noble told me how to read one of these even though I thought I understood them already.)

Here's the ROC curve for the model we just built. It's much prettier than a real-life ROC curve will normally be, with no jagged edges.

[Figure: ROC curve for the model, plotting true positive rate against false positive rate]

This graph shows you the tradeoffs you're making between catching the stuff you want (true positive rate) and dealing with the stuff you don't want (false positive rate). It's a Very Useful Graph.
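As a rough sketch of what goes into a graph like this: pick a threshold, call everything scoring at or above it positive, and record the true positive rate and false positive rate; repeating that for every threshold traces out the curve. The scores and labels below are made up, and `roc_points` is a hypothetical helper written for this example (in practice `sklearn.metrics.roc_curve` does this job).

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    labels are 0/1 ground truth; scores are the model's per-point scores.
    Assumes both classes appear at least once.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    # Lower the threshold one distinct score at a time.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [label for label, score in zip(labels, scores) if score >= threshold]
        true_pos = sum(flagged)
        false_pos = len(flagged) - true_pos
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Made-up example data: 1 = the stuff you want to catch.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

for fpr, tpr in roc_points(labels, scores):
    print(f"false positive rate={fpr:.2f}  true positive rate={tpr:.2f}")
```

Each printed line is one point on the curve: moving down the list catches more of what you want at the price of more false alarms.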

You might have a notion of how much money this model would save you, and want to graph that. Or maybe you care that some data is classified correctly more than other data, and you need to express that in some way. Or you're predicting the weather for sailors and you need to make sure that really extreme weather is handled well so that nobody dies.
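For the money case, one way to sketch it is to attach a dollar value to each kind of outcome and add them up at each threshold. Everything here is hypothetical: the dollar amounts, the labels and scores, and the `savings_at_threshold` helper are invented for illustration only.

```python
# Hypothetical dollar values, chosen only for illustration.
VALUE_PER_TRUE_POSITIVE = 100.0  # money saved by catching a real case
COST_PER_FALSE_POSITIVE = 20.0   # money wasted acting on a false alarm


def savings_at_threshold(labels, scores, threshold):
    """Net savings if every point scoring at or above threshold is acted on."""
    savings = 0.0
    for label, score in zip(labels, scores):
        if score >= threshold:
            if label == 1:
                savings += VALUE_PER_TRUE_POSITIVE
            else:
                savings -= COST_PER_FALSE_POSITIVE
    return savings


# Made-up labels (1 = real case) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

for threshold in (0.25, 0.5, 0.75):
    print(threshold, savings_at_threshold(labels, scores, threshold))
```

Plotting those savings against the threshold gives a graph in units that the people asking "how well does it work?" may care about more than a rate.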

This also isn't quite what I want to talk about, though! It's more than I feel I can really do justice to, and I'm still slowly learning how to think about it. Here's what I actually want to talk about:

How well did it work in April?

Right now it's November. I'm working on a project that I started in October or so. I have some metrics we've decided on to measure whether the project is going well, and I want to know that it's making progress, and that the models we're building now are better than they were a month ago.

These are some questions I want to discuss in my talk:

  • How do you design a system where you can look up your model's performance from 6 months ago?
  • What if you change your mind after the fact about what metrics you wish you'd measured?
  • What if you use a lot of different tools to train models? (R! Python! Scala!)
  • How can you make it easy to use so that people, you know, actually use it?
  • And how do you do all of that without spending a lot of time building it?
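To make these questions a bit more concrete, here is one possible shape for such a system, offered as a sketch and not as what the talk will propose. The idea is to store the raw labels and scores for every trained model, so that any metric can be computed later, including ones nobody thought of at the time. The table layout, the model name, the "april"/"november" tags, and the helper functions are all hypothetical; SQLite is used only because it ships with Python and can also be written to from other languages.

```python
import sqlite3

# In-memory database for the sketch; a real system would use a shared store.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE predictions (
           model_name TEXT, trained_on TEXT, example_id INTEGER,
           label INTEGER, score REAL)"""
)


def record_predictions(conn, model_name, trained_on, rows):
    """Store raw (example_id, label, score) rows for one trained model."""
    conn.executemany(
        "INSERT INTO predictions VALUES (?, ?, ?, ?, ?)",
        [(model_name, trained_on, ex, label, score) for ex, label, score in rows],
    )


def accuracy(conn, model_name, trained_on, threshold=0.5):
    """Compute a metric after the fact from the stored labels and scores."""
    rows = conn.execute(
        "SELECT label, score FROM predictions WHERE model_name = ? AND trained_on = ?",
        (model_name, trained_on),
    ).fetchall()
    correct = sum((score >= threshold) == bool(label) for label, score in rows)
    return correct / len(rows)


# Made-up model name, training tags, and scores.
record_predictions(conn, "demo_model", "april",
                   [(1, 1, 0.9), (2, 0, 0.6), (3, 0, 0.2), (4, 1, 0.4)])
record_predictions(conn, "demo_model", "november",
                   [(1, 1, 0.9), (2, 0, 0.3), (3, 0, 0.2), (4, 1, 0.7)])

print(accuracy(conn, "demo_model", "april"))
print(accuracy(conn, "demo_model", "november"))
```

Because only labels and scores are stored, changing your mind about the metric means writing a new function like `accuracy`, not retraining old models.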

More on this later, maybe.