Julia Evans

Why Ruby’s Timeout is dangerous (and Thread.raise is terrifying)

This is already documented in Timeout: Ruby’s most dangerous API. And normally I don’t like making blanket statements about language features. But I had a bad day at work because of this issue. So today, we’re talking about Timeout! :)

First! What is Timeout? Let’s say you have a bunch of code that might be slow: a network request, a long computation, whatever. Ruby’s timeout documentation helpfully says:

Timeout provides a way to auto-terminate a potentially long-running operation if it hasn’t finished in a fixed amount of time.

require 'timeout'
status = Timeout::timeout(5) {
  # Something that should be interrupted if it takes more than 5 seconds...
}

AWESOME. This is so much easier than wrangling network socket options which might be set deep inside some client library. Seems great!

I tried using Timeout at work last week, and it resulted in an extremely difficult-to-track-down bug. I felt awesome about tracking it down, and upset with myself for creating it in the first place. Let’s talk a little more about this (mis)feature.

Timeout: how it works (and why Thread.raise is terrifying)

At first, the implementation seems kind of clever. You can read the code here. It starts a new thread, which saves a reference to the original thread in x, sleeps for 5 seconds, and then raises an exception on the original thread when it wakes up, interrupting whatever that thread was doing.

begin
  sleep sec
rescue => e
  x.raise e
else
  x.raise exception, message
end

Let’s look at the documentation on Thread.raise. It says:

Raises an exception (see Kernel::raise) from thr. The caller does not have to be thr.

This is where the implications get interesting, and terrifying. This means that an exception can get raised:

  • during a network request (ok, as long as the surrounding code is prepared to catch Timeout::Error)
  • during the cleanup for the network request
  • during a rescue block
  • while creating an object to save to the database afterwards
  • in any of your code, regardless of whether it could have possibly raised an exception before

Nobody writes code to defend against an exception being raised on literally any line. That’s not even possible. So Thread.raise is basically like a sneak attack on your code that could result in almost anything. It would probably be okay if it were pure-functional code that did not modify any state. But this is Ruby, so that’s unlikely :)
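To make the danger concrete, here’s a sketch of the kind of one-line window an asynchronous exception can land in. It’s in Python rather than Ruby because the hazard isn’t Ruby-specific, and all the names are made up:

import threading

lock = threading.Lock()
account = {"balance": 100}

def withdraw(amount):
    lock.acquire()
    # If an asynchronous exception lands RIGHT HERE -- after acquire() but
    # before we enter the try/finally -- release() never runs, and every
    # other thread that wants this lock deadlocks forever.
    try:
        account["balance"] -= amount
    finally:
        lock.release()

Using a with block shrinks the window, but no amount of care closes it completely.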

Timeout uses Thread.raise, so it is not safe to use.

Other languages and Thread.raise

So, how do other languages approach this? Go doesn’t have exceptions, and Javascript doesn’t have threads – so let’s talk about Python, Java, C#, and C++.

Java has java.lang.Thread.stop, which does essentially the same thing. It was deprecated in Java 1.2, in 1998 (and the variant that throws an arbitrary exception was disabled entirely in Java 8). Its documentation reads:

Deprecated. This method is inherently unsafe. See stop() for details. An additional danger of this method is that it may be used to generate exceptions that the target thread is unprepared to handle (including checked exceptions that the thread could not possibly throw, were it not for this method). For more information, see Why are Thread.stop, Thread.suspend and Thread.resume Deprecated?.

Python has thread.interrupt_main(), which does the same thing as Ctrl+C-ing a program from your terminal. I’m not really sure what to say about this – certainly using thread.interrupt_main() also isn’t really a good idea, and it’s more limited in what it can do. I can’t find any reference to anybody considering using it for anything serious.

C# has Thread.Abort() which throws a ThreadAbortException in the thread. Googling it finds me a series of StackOverflow discussions & forum posts about how it’s dangerous and should not be used, for the reasons we’ve learned about.

C++: std::threads are not interruptible.

Not just an implementation issue

This is not just an implementation issue in Ruby. The whole premise of a general timeout method that will interrupt an arbitrary block of code like this is flawed. Here’s the API again:

require 'timeout'
status = Timeout::timeout(5) {
  # Something that should be interrupted if it takes more than 5 seconds...
}

There is no way to safely interrupt an arbitrary block of code. Anything could be happening at the end of that 5 seconds.

However! All is not lost if we would like to interrupt our threads. Let’s turn to Java again! (you know all the times we say Ruby is more fun than Java? TODAY JAVA IS MORE FUN BECAUSE IT MAKES MORE SENSE.) Java has a Thread.interrupt method, which asks a thread to stop. But an InterruptedException can only be thrown at specific, well-defined points, for instance while the thread is blocked in Thread.sleep. Otherwise the thread needs to explicitly call Thread.interrupted() to see if it’s supposed to stop.
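You can approximate Java’s cooperative style in any language with threads. Here’s a minimal sketch in Python, with threading.Event standing in for Java’s interrupt flag (all the names here are mine):

import threading
import time

stop_requested = threading.Event()

def worker():
    # The thread checks the flag itself, like Java's Thread.interrupted(),
    # so it's never interrupted in the middle of something delicate.
    while not stop_requested.is_set():
        time.sleep(0.1)  # stand-in for one chunk of real work
    print("cleaning up properly!")

t = threading.Thread(target=worker)
t.start()
time.sleep(5)
stop_requested.set()  # politely ask the thread to stop
t.join()

The tradeoff is that the worker has to cooperate: if it never checks the flag, it never stops. That’s exactly what makes it safe.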

On documentation

I don’t know. It’s possible that everybody knew that Timeout was a disaster except for me, and that I should have thought more carefully about what the implications of this Thread.raise were. But I’m thinking of making a pull request on the Ruby documentation with slightly stronger language than

Timeout provides a way to auto-terminate a potentially long-running operation if it hasn’t finished in a fixed amount of time.


and

Raises an exception (see Kernel::raise) from thr. The caller does not have to be thr.

The Java approach (where they deprecated it with a strong warning and then disabled the method entirely) seems more like the right thing.

How I got better at debugging

in debugging

I had a performance review last week where I was told, among other things, that I’m very good at debugging, especially difficult & confusing problems. I thought about this and I was like YEAH I AM. But I didn’t used to be. What happened?!

I sometimes hear advice to be extremely systematic and organized. I think that’s good advice and I told my partner this and he laughed because I am not the most systematic and organized person. But here are some things that I think have helped me anyway:

Remember that the bug is happening for a logical reason

Sometimes when I hit a bug, especially a nondeterministic and difficult to reproduce bug, it’s tempting to think “oh you know, things just happen, who knows”. But everything on a computer does in fact happen for a logical reason (however much the computer may try to convince you otherwise). Reminding myself of that helps me fix bugs. Also known as “OK JULIA IT IS NOT FAIRIES WHAT ACTUAL REASON COULD BE CAUSING THIS?”

Be unreasonably confident in my ability to fix the bug

I recently dealt with a performance problem in a job at work that took me 3 weeks to fix (see a millisecond isn’t fast). If I hadn’t been able to fix it, I would have felt pretty bad and like it was a waste of 3 weeks.

But we were processing a relatively small number of records, and it was taking 15 hours to do it, and it was NOT REASONABLE and I knew that the job was too slow. And I figured it out, and now it’s faster and everyone is happy.

(since I can now often actually fix bugs I tackle, perhaps this confidence is now reasonable :D)

Know more things

This TCP bug I talked about yesterday? I wouldn’t have been able to fix that in my first job out of grad school. I just didn’t understand enough about how computer networks work, or computers (I had an awesome math & theoretical CS degree and I did not learn anything about computers there.). And I didn’t know strace.

There’s a service at work that sometimes takes a long time to respond because of JVM garbage collection pauses. If you don’t know that a common source of latency issues on the JVM is garbage collection pauses (or worse, if you don’t know that garbage collection pauses are even a thing that happen), then you’re going to have a really bad day trying to figure that out.

Understanding the structure of the system I’m trying to debug and what some of the common failure modes are has been really indispensable to me.

Talk to other people

I sometimes just ramble into the Slack channel at work about the problem I’m working on, which sometimes looks like

julia: i have no idea why this bug is happening
julia: i mean I tried X and it is still happening
julia: and also W
julia: and also Z
julia: yayy
someone else: :)

Also sometimes if I start talking about it then someone will come and talk to me and say something helpful! It’s the best.

I got really stuck on that 3 week bug we talked about before and got on the phone to Avi, which was VERY USEFUL because he wrote the code that I was optimizing. So in that case I didn’t just need a rubber duck, I needed to talk to someone who knew more about the code (“oh yeah we haven’t optimized that part at all yet so it’s not a surprise that it’s slow!”).

I’ve gotten way better at figuring out what I don’t understand, articulating it, and asking about it.

Use strace

Seriously I could not fix bugs without strace.

More generally, being able to directly observe what a program is actually doing is incredibly valuable. I was recently trying to debug why a request I was sending to Redis was invalid. And I read the code, and asked other people, and they were like “huh, that looks right”. AND THEN I REMEMBERED ABOUT TCPDUMP. (tcpdump shows you the TCP traffic coming in and out of a machine. it’s the best.)

So I ran tcpdump on a machine that I knew was sending (valid) requests to Redis, just looked at it as ASCII in my terminal, and then all the information was right there! And I copied the valid thing into what I was testing, and it totally worked and explained everything.

I like it more

I used to not really like debugging. But I started being able to solve harder bugs, and now when I find a thorny debugging problem it’s way more exciting to me than writing new code. Most of the code I write is really straightforward. A difficult bug is way more likely to teach me something I didn’t know before about how computers can break.

❤ debugging ❤

Why you should understand (a little) about TCP

in networking

This isn’t about understanding everything about TCP or reading through TCP/IP Illustrated. It’s about how a little bit of TCP knowledge is essential. Here’s why.

When I was at the Recurse Center, I wrote a TCP stack in Python (and wrote about what happens if you write a TCP stack in Python). This was a fun learning experience, and I thought that was all.

A year later, at work, someone mentioned on Slack “hey I’m publishing messages to NSQ and it’s taking 40ms each time”. I’d already been thinking about this problem on and off for a week, and hadn’t gotten anywhere.

A little background: NSQ is a queue that you send messages to. The way you publish a message is to make an HTTP request on localhost. It really should not take 40 milliseconds to send an HTTP request to localhost. Something was terribly wrong. The NSQ daemon wasn’t under high CPU load, it wasn’t using a lot of memory, it didn’t seem to be a garbage collection pause. Help.
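For concreteness, “publishing a message” here just means an HTTP POST to the local nsqd. A rough Python sketch (the topic name is made up; 4151 is nsqd’s default HTTP port):

import requests

# Publish one message to NSQ by POSTing to the local nsqd daemon.
requests.post("http://localhost:4151/pub?topic=some_topic",
              data=b"hello i am a message")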

Then I remembered an article I’d read a week before called In search of performance - how we shaved 200ms off every POST request. In that article, they talk about why every one of their POST requests was taking 200 extra milliseconds. That’s... weird. Here’s the key paragraph from the post:


Ruby’s Net::HTTP splits POST requests across two TCP packets - one for the headers, and another for the body. curl, by contrast, combines the two if they’ll fit in a single packet. To make things worse, Net::HTTP doesn’t set TCP_NODELAY on the TCP socket it opens, so it waits for acknowledgement of the first packet before sending the second. This behaviour is a consequence of Nagle’s algorithm.

Moving to the other end of the connection, HAProxy has to choose how to acknowledge those two packets. In version 1.4.18 (the one we were using), it opted to use TCP delayed acknowledgement. Delayed acknowledgement interacts badly with Nagle’s algorithm, and causes the request to pause until the server reaches its delayed acknowledgement timeout.

Let’s unpack what this paragraph is saying.

  • TCP is a protocol where you send data in packets
  • Their HTTP library was sending POST requests in 2 small packets

Here’s what the rest of the TCP exchange looked like after that:

application: hi! Here’s packet 1.
HAProxy: <silence, waiting for the second packet>
HAProxy: <well I’ll ack eventually but nbd>
application: <silence>
application: <well I’m waiting for an ACK maybe there’s network congestion>
HAProxy: ok i’m bored. here’s an ack
application: great here’s the second packet!!!!
HAProxy: sweet. we’re done here

That period where the application and HAProxy are both passive-aggressively waiting for the other to send information? That’s the extra 200ms. The application is doing it because of Nagle’s algorithm, and HAProxy because of delayed ACKs.

Delayed ACKs happen, as far as I understand, by default on every Linux system. So this isn’t an edge case or an anomaly – if you send your data in more than 1 TCP packet, it can happen to you.

in which we become wizards

So I read this article, and forgot about it. But I was stewing about my extra 40ms, and then I remembered.

And I thought – that can’t be my problem, can it? can it??? And I sent an email to my team saying “I think I might be crazy but this might be a TCP problem”.

So I committed a change turning on TCP_NODELAY for our application, and BOOM.

All of the 40ms delays instantly disappeared. Everything was fixed. I was a wizard.
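The fix itself is tiny. The socket option is the same everywhere, so here’s roughly what turning it on looks like in Python (the host and port here are placeholders):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Disable Nagle's algorithm: send small packets immediately instead of
# waiting for the ACK of the previous packet.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
sock.connect(("localhost", 4151))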

should we stop using delayed ACKs entirely?

A quick sidebar – I just read this comment on Hacker News from John Nagle (of Nagle’s algorithm) via this awesome tweet by @alicemazzy.

The real problem is ACK delays. The 200ms “ACK delay” timer is a bad idea that someone at Berkeley stuck into BSD around 1985 because they didn’t really understand the problem. A delayed ACK is a bet that there will be a reply from the application level within 200ms. TCP continues to use delayed ACKs even if it’s losing that bet every time.

He goes on to comment that ACKs are small and inexpensive, and that the problems caused in practice by delayed ACKs are probably much worse than the problems they solve.

you can’t fix TCP problems without understanding TCP

I used to think that TCP was really low-level and that I did not need to understand it. Which is mostly true! But sometimes in real life you have a bug and that bug is because of something in the TCP algorithm. So it turns out that understanding TCP is important. (as we frequently discuss on this blog, this turns out to be true for a lot of things, like, system calls & operating systems :) :))

This delayed ACKs / TCP_NODELAY interaction is particularly bad – it could affect anyone writing code that makes HTTP requests, in any programming language. You don’t have to be a systems programming wizard to run into this. Understanding a tiny bit about how TCP worked really helped me work through this and recognize that that thing the blog post was describing also might be my problem. I also used strace, though. strace forever.

Docker is amazing

I didn’t really understand what Docker was for until yesterday. I mean, containers. Cool. Whatever.

Yesterday I was trying to get an environment for some neural network experiments (exciting post upcoming featuring THE QUEEN!). And it was, well, horrible. I needed Ubuntu 14.04, and I thought I was going to have to reinstall my operating system, and I am apparently past the age when that was fun. I needed all these C++ things and nothing was working and I did not know what to do.


I downloaded an Ubuntu 14.04 image, and suddenly I just had Ubuntu 14.04! And I could INSTALL THINGS ON IT. Without worrying! And if I put them in a Dockerfile, it would let OTHER PEOPLE set up the same environment. Whoa.

And it’s fast! It’s not like a virtual machine which takes forever to start – starting a Docker container is just like, well, starting a program. Fast. Now I can run all my experiments inside the container and it’s not a problem.

Here’s what the Dockerfile looks like! It was really easy! I just do the same rando crap I might do to set up my environment normally (do you see where I install some packages, and then replace the sources.list from, uh, Debian wheezy, with an Ubuntu sources.list in the middle? yeah I did that.). But I don’t have to worry about screwing up my computer while doing it, and then even if it’s a mess, it’s a reproducible mess!

It includes the following gem:

RUN cd /opt/caffe \
   && cp Makefile.config.example Makefile.config \
   && echo '<long base64-encoded patch string>' \
   | base64 -d \
   | patch -u Makefile.config

where I edited a file manually, made a patch, base64 encoded it, and just pasted the string into the Dockerfile so that the edits I needed would work.

The next time I need to compile a thing with horrible dependencies that I don’t have on my computer and that conflict with everything, I’m totally using Docker.

PAPERS ARE AMAZING: Profiling threaded programs with Coz


I usually don’t read papers. I only read one paper so far this year (which I love). I only read it because my friend Maggie printed it out and gave it to me.

SO. Yesterday I got handed 3 (three!) printed out papers from the amazing organizers of Papers We Love Montreal. And I woke up this morning and started reading one (because, Saturday morning). And then I had to tell you all about it because this paper is so cool. Okay, enough backstory.

The paper we’re going to talk about is COZ: Finding Code that Counts with Causal Profiling (pdf). I found it super easy to read. Here is what I got out of it so far!

Profiling threaded applications is hard

Profiling single-threaded applications where everything happens synchronously is pretty straightforward. If one part of the program is slow, it’ll show up as taking 10% of the time or something, and then you can target that part of the program for optimization.

But, when you start to use threads, everything gets way more complicated. The paper uses this program as an example:

#include <cstddef>
#include <thread>
using std::thread;

void a() { // ~6.7 seconds
  for(volatile size_t x=0; x<2000000000; x++) {}
}
void b() { // ~6.4 seconds
  for(volatile size_t y=0; y<1900000000; y++) {}
}
int main() {
  // Spawn both threads and wait for them.
  thread a_thread(a), b_thread(b);
  a_thread.join(); b_thread.join();
}

Speeding up one of a() or b() won’t help you, because they both need to finish in order for the program to finish. (this is totally different from if we ran a(); b(), in which case speeding up a() could give you an up to 50% increase in speed).

Okay, so profiling threaded programs is hard. What next?

Speed up one thread to see if that thread is the problem

The core idea in this paper is – if you have a line of code in a thread, and you want to know if it’s making your program slow, speed up that line of code to see if it makes the whole program faster!

Of course, you can’t actually speed up a thread. But you can slow down all other threads! So that’s what they do. The implementation here is super super super interesting – they use Linux’s perf system to do this, and in particular they can do it without modifying the program’s code. So this is a) wizardry, and b) uses perf

Which are both things we love here (omg perf). I’m going to refer you to the paper for now to learn more about how they use perf to slow down threads, because I honestly don’t totally understand it myself yet. There are some difficult details like “if the thread is already waiting on another thread, should we slow it down even more?” that they get into.
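Just to make the core trick concrete, here’s a toy of the “virtual speedup” idea in Python. Everything in it is my own simplification (remember, the real Coz uses perf and needs no code changes): to pretend one operation is 20% faster, delay every other thread by 20% of that operation’s cost each time it runs.

import threading
import time

SPEEDUP = 0.20     # pretend hot_operation is 20% faster
HOT_COST = 0.001   # measured cost of one hot_operation, in seconds

penalty = 0.0      # total delay the other threads owe so far
penalty_lock = threading.Lock()

def hot_operation():
    global penalty
    time.sleep(HOT_COST)               # stand-in for the real work
    with penalty_lock:
        penalty += SPEEDUP * HOT_COST  # charge all other threads a delay

def other_thread():
    paid = 0.0
    for _ in range(100):
        time.sleep(HOT_COST)  # this thread's own work
        with penalty_lock:
            owed = penalty - paid
        time.sleep(owed)      # pay off the accumulated "virtual speedup"
        paid += owed

t1 = threading.Thread(target=lambda: [hot_operation() for _ in range(100)])
t2 = threading.Thread(target=other_thread)
start = time.time()
t1.start(); t2.start(); t1.join(); t2.join()
print("runtime with virtual speedup:", time.time() - start)

Comparing that runtime to a baseline run (and accounting for the delays you inserted) tells you roughly how much a real 20% speedup would help the whole program.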

omg it works

The thing that really impressed me about this paper is that they showed the results of running this profiler on real programs (SQLite! Memcached!), and then they could use the profiler results to detect

  • a problem with too many hash table collisions
  • unnecessary / inefficient uses of locking (“this is atomic anyway! no need to lock!”)
  • where it would be more efficient to move code from one thread to another

and speed up the program on the workload they were testing by, like, 10%!

They also find places where speeding up a line of code would introduce a slowdown (because of increased contention around some resource). This paradoxically also helps them make code faster, because a site like that is a good place to figure out why there’s a problem with contention and change the way the locks are set up.

Also, they claim that the overhead of this profiling is like 20%? How can this be. This seems like literally magic except that THEY EXPLAIN HOW IT WORKS. Papers. Wow.

Actually running the code

You can actually download the code on GitHub. I tried to compile it and it did not work the first time. I suspect this is because perf changes a little between different Linux versions (I get a bunch of errors about perf.h). It seems like this is something they’re working on. Maybe a future project will be to try to get it to compile and run it on a REAL PROGRAM and see if I can reproduce some of the things they talk about in the paper! We’ll see.

Async programming?!

Now I’m really curious about whether we could do something similar for profiling single-threaded but asynchronous applications (for all the javascript programmers in the world!). Like, if you identified a function call you were interested in speeding up, you could slow down everything else running in the event loop and see if it slowed down the overall program. Maybe someone has already tried this! If so I want to know about it. (I’m @b0rk on twitter).

Okay, papers are cool. If you know me and want to print a paper you love and give it to me I’d be into it.

A millisecond isn’t fast (and how we made it 100x faster)

in performance

Hi friends! For the first time today I’m going to tell you about my DAY AT WORK (machine learning at Stripe) yesterday. =D. This is a collaboration with Kamal Marhubi who did this profiling with me after work because I was so mad about the performance.

I used to think a millisecond was fast. At work, I have code that runs some VERY_LARGE_NUMBER of times. It’s distributed and split up into tasks, and an individual task runs the code more than 6 million times.

I wrote a benchmark for the Slow Code and found it could process ~1000 records/s. This meant that processing 6 million things would take about 1.7 hours, which is Slow. The code is kind of complicated, so originally we all thought this was a reasonable amount of time. But my heart was sad.

Yesterday Avi (who is the best) and I looked at why it was so damn slow (~1 millisecond/record) in some more depth. This code is open source so I can show it to you! We profiled using VisualVM and, after doing some optimizations, found out that it was spending all its time in DenseHLL$x$6. This is mystery Scala speak for this code block from Twitter’s Algebird library that estimates the size of a HyperLogLog:

  lazy val (zeroCnt, z) = {
    var count: Int = 0
    var res: Double = 0

    // goto while loop to avoid closure
    val arr: Array[Byte] = v.array
    val arrSize: Int = arr.size
    var idx: Int = 0
    while (idx < arrSize) {
      val mj = arr(idx)
      if (mj == 0) {
        count += 1
        res += 1.0
      } else {
        res += java.lang.Math.pow(2.0, -mj)
      }
      idx += 1
    }
    (count, 1.0 / res)
  }

from HyperLogLog.scala

This is a little inscrutable and I’m not going to explain what this code does, but arrSize in my case is 4096. So basically, we have something like 10,000 floating point operations, and it takes about 1ms to do. I am still new to performance optimizations, but I discussed it with Kamal and we decided it was outrageous. Since this loop is hardly doing anything omg, the obvious target is java.lang.Math.pow(2.0, -mj), because that looks like the hardest thing. (note: Java is pretty fast. if you are doing normal operations like adding and multiplying numbers it should go REALLY FAST. because computers are fast)

(note: Latency Numbers Every Programmer Should Know is great and useful in cases like this! Many CPU instructions take a nanosecond or something. so 10K of them should be on the order of 10 microseconds or so. Definitely not a millisecond.)

Kamal and I tried two things: replacing Math.pow(2, -mj) with 1.0 / (1 << mj), and writing a lookup table (since mj is a byte and has 256 possible values, we can just calculate 2^(-mj) for every possible value up front).
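Here’s the lookup table idea sketched in Python (the real fix is in Scala, but it’s the same trick in any language):

# mj is a byte, so 2^(-mj) can only take 256 distinct values.
# Precompute them all once; then the inner loop does one array index
# per record instead of one pow() call per record.
NEG_POWERS_OF_TWO = [2.0 ** -mj for mj in range(256)]

def harmonic_term(mj):
    return NEG_POWERS_OF_TWO[mj]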

The final performance numbers on the benchmark we picked were:

math.pow:         0.8ms
1.0 / (1 << mj):  0.017ms (!)
the lookup table: 0.008ms (!!!)

So we can literally make this code 100 times faster by just changing one line. Avi simultaneously came to the same conclusions and made this pull request Speed up HLL presentation by 100x. Hooray!

I’m learning intuitions for when code is slower than it should be and it is THE BEST. Being able to say “this code should not take 10s to process 10,000 records” is amazing. It is even more amazing when you can actually fix it.

If you’re interested in the rest of my day at work for some reason, I

  • worked with someone on understanding which of our machine learning models are doing the most work for us
  • wrote 2 SQL queries to help someone on the Risk team find accounts with suspicious activity
  • wrangled Scala performance (this) so that we can generate training sets for our machine learning models without tearing our hair out

Is machine learning safe to use?

in machinelearning

I’ve been thinking about this a lot because I do ML at work. Here are a few of my current thoughts. I’d like to hear what you think on twitter – I think being responsible for the accuracy of a system that you don’t fully understand is scary and very interesting.


FIRST: Can you understand your model? A regression with 10 variables is easy; a big random forest isn’t.

A model you don’t understand is

  • awesome. It can perform really well, and you can save time at first by ignoring the details.
  • scary. It will make unpredictable and sometimes embarrassing mistakes. You’re responsible for them.
  • only as good as your data. Often when I train a new model I think at some point “NO PLZ DON’T USE THAT DATA TO MAKE DECISIONS OH NOOOOO”

Some ways to make it less scary:

  • have a human double check the scariest choices
  • use complicated models when it’s okay to make unpredictable mistakes, simple models when it’s less okay
  • use ML for research, learn why it’s doing better, incorporate your findings into a less complex system

Some easy statistics: Bootstrap confidence intervals

I am not actually on a plane to Puerto Rico, but I wrote this post when I was :)

Hey friends! I am on a plane to Puerto Rico right now. When is a better time to think about statistics?

We’ll start with a confession: I analyze data, and I rarely think about what the underlying distribution of my data is. When I tell my awesome stats professor friend this, she kind of sighs, laughs, and says some combination of

  • “oh, machine learning people…”
  • “well, you have a lot of data so it probably won’t kill you”
  • “but be careful of {lots of things that could hurt you}!”

So let’s talk about being careful! One way to be careful is, when you come up with a number, to build a confidence interval that says how sure you are about that number. I think the normal way to do confidence intervals is that you use Actual Statistics and know what your distribution is. But we’re not going to do that, because I’m on a plane and I don’t know what any of my distributions are. (the technical term for not knowing your distributions is “nonparametric statistics” :D)

So, let’s say I have some numbers like: 0, 1, 3, 2, 8, 2, 3, 4 describing the number of no-shows for flights from New York to Puerto Rico. And that I also have no idea what kind of distribution this number should have, but some Important Person is asking me how much it’s okay to oversell the plane by.

And let’s say I think it’s okay to have to kick people off the flight, say, 5% of the time. Great! Let’s take the 5th percentile!

> np.percentile([0, 1, 3, 2, 8, 2, 3, 4], 5)
0.35

Uh, great. The 5th percentile says 0.35 people won’t make the plane. This is a) not really something I can take to management, and b) a number I have no idea how much confidence to have in, given that I only have 8 data points. And I have no distribution to use to reason about it.

Maybe I shouldn’t have switched to CS so I didn’t have to take statistics (true story). Or alternatively maybe I can BOOTSTRAP MY WAY TO A CONFIDENCE INTERVAL WITH COMPUTERS. If you’re paying close attention, this is like the A/A testing post I wrote a while back, but a more robust method.

The way you bootstrap is to sample with replacement from your data a lot of times (like 10000). So if you start with [1,2,3], you’d sample [1,2,2], [1,3,3], [3,3,1], [1,3,2], etc. Then you compute your target statistic on your new datasets. So if you were taking the maximum, you’d get 2,3,3,3, etc. This is great because you can use any statistic you want!

Here is some code to do that! n_bootstraps is intended to be a big number. I chose 10000 because I didn’t want to wait more than a few seconds. More is always better.

import numpy as np
import pandas as pd
from sklearn.utils import resample

def bootstrap_5th_percentile(data, n_bootstraps):
    bootstraps = []
    for _ in range(n_bootstraps):
        # Sample with replacement from data
        samples = resample(data)
        # Then we take the fifth percentile!
        bootstraps.append(np.percentile(samples, 5))
    return pd.Series(bootstraps)

So, let’s graph it:

data = [0, 1, 3, 2, 8, 2, 3, 4]
bootstraps = bootstrap_5th_percentile(data, 10000)
bootstraps.hist()  # histogram of the 10,000 bootstrapped estimates


This is actually way more useful! It’s telling me I can oversell by 0–2 people, and I don’t have enough data to decide which one. I don’t know if I’d take this graph to airline executives (though everyone loves graphs right?!?!), but it’s for sure more useful than just a 0.35.

Thankfully in real life I would probably have more flights than just 8 to use to make this decision. Let’s say I actually had, like, 1000! Let’s start by generating some data:

data = np.random.normal(5, 2, 1000)
data = np.round(data[data >= 0]).astype(int)

Here’s a histogram of that data:



Now let’s take the 5th percentile!

np.percentile(data, 5)

Again, I don’t really feel good about this number. How do I know I can trust this more than the 0.35 from before? Let’s bootstrap it!

bootstraps = bootstrap_5th_percentile(data, 10000)


I feel a little better about calling it at 2 here.
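And if I wanted an actual interval to report, instead of eyeballing a histogram, I could take percentiles of the bootstrap distribution itself. Something like this (the 95% level is an arbitrary choice on my part):

# The middle 95% of the bootstrapped estimates is a 95% bootstrap
# confidence interval for the 5th percentile.
lower, upper = bootstraps.quantile([0.025, 0.975])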

The math

I have not explained ANY of the math behind why you should believe this is a reasonable approach, which if you are like me then you are super uncomfortable right now. For instance, obviously if I only have 1 data point, sampling with replacement isn’t going to help me build a confidence interval. But what if I have 2 points, or 5? Why should you take these histograms I’m building seriously at all? And what’s this business with not even caring about what distribution you’re using?

All worthwhile questions that we will not answer here today :).

Be careful

If occasionally 100 people don’t make the flight because they’re all from the same group and that’s important and not represented in your sample, bootstrapping can’t save you.

This method is the bomb though. It is basically the only way I know to get error bars on my estimates and it works great.

AdaCamp Montreal 2015

I went to AdaCamp these last couple of days. I want to talk about some of the awesome stuff that happened!

AdaCamp is an unconference, which means that people decide what the sessions will be about on the first day of the conference. Here are some things I’m thinking about!


Software testing

I went to a really, really interesting session about software testing by someone who works as a software tester. I work as a developer, and I’ve never worked with a QA team! I didn’t know there were people who specialize in testing software, are really awesome at it, and don’t write programs! This was super cool to learn. I still don’t know how to think about separating out the responsibilities of writing the software and verifying the software – obviously individual developers also need to be responsible for writing correct software, and it still feels strange to me to hand any of that off.

But Camille Fournier told me on twitter about user acceptance testing and how you can have a QA team that checks that the software makes sense to users and, like, talks to them and stuff, not just software that’s theoretically correct. So that’s pretty cool.

Awesome people

I met a lot of really interesting people! I met sysadmins and people who had been programming for a long time and people who do software testing and people who know a lot about science fiction and ham radio and bikes and publishing and zines and Quebec and libraries and Wikipedia (someone wrote their dissertation on Wikipedia. Wow.). I learned SO MUCH about Wikipedia. And almost all of those people identified as women! A++ would meet delightful people again.

Codes of conduct

This session convinced me open spaces are a good idea.

Initially I didn’t want to go because I was interested in some very specific aspects of codes of conduct (deescalating situations + how to make CoCs less intimidating to people who are genuinely well-intentioned but not familiar with a given community + when to model behavior implicitly vs writing down explicit rules). And I told someone during a break that I didn’t want to go to the session because I thought people wouldn’t be discussing the thing I wanted to talk about.

And she said AWESOME. THOSE ARE AWESOME THINGS TO TALK ABOUT. COME WITH ME AND WE WILL TALK ABOUT THAT. And we did! And I don’t have answers about any of those things, but I got to hear some new perspectives and stories and now I know a couple more things. And the other people seemed to think the questions I had were interesting <3.

And it made me remember – when I think that I’m the only person who has a given concern or question or experience, I’m usually wrong :)


Moderation + facilitation

There were a lot of unstructured discussion sessions at AdaCamp. This was really cool, because it means you can cover a lot of ground. I was also reminded of how important good moderation + facilitation is, and how much I want to get better at it. I’m working on learning how to:

  • create some explicit structure around a session (“let’s discuss these 4 topics, and spend ~15 minutes on each one. does that sound good?”)
  • tell someone when they’ve said enough <3 (“thanks so much! I’d love to hear from some people who haven’t said as much yet”)
  • move the discussion back on track if it’s veered away (“okay awesome! Does anyone have anything else to say about $topic, or should we move on to $next_thing?”)

People take up really incredibly different amounts of space in discussions, and I really really want to get better at making sure people who are quieter get a chance to say their super interesting things. Interrupting people is hard for me!

After AdaCamp I felt like there are a lot of great people in the world who are trying their best to do what’s right and have a lot of good ideas about how to do that and want to have the same conversations that I want to have. A little more than usual =)