Julia Evans

Diving into concurrency: trying out mutexes and atomics

in concurrency

I hadn’t written any threaded programs before yesterday. I knew sort of abstractly about some concurrency concepts (mutexes! people say compare-and-swap but I don’t totally get it!), but actually understanding a Thing is hard if I’ve never done it. So yesterday I decided to write a program with threads! In this post, we’re going to:

  1. Write a threaded program that gets the wrong answer because of a race condition
  2. Fix that race condition in C and Rust, using 2 different approaches: mutexes and atomics (there’s a minimal C sketch of both right after this list)
  3. Find out why Rust is slower than C
  4. Talk a little about the actual system calls and instructions that make some of this work
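To make the race in step 1 (and the atomics fix in step 2) concrete, here’s a minimal C sketch; the names and counts are made up for illustration. Several threads increment a plain counter and an atomic counter the same number of times, and only the plain one loses updates:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NUM_THREADS 4
#define INCREMENTS 1000000

long unsafe_counter = 0;       // plain long: concurrent ++ can lose updates
atomic_long safe_counter = 0;  // atomic: every increment counts

void *increment(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        unsafe_counter++;                    // read-modify-write: the race!
        atomic_fetch_add(&safe_counter, 1);  // one indivisible operation
    }
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, increment, NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    // Both "should" be 4000000; the unsafe one usually comes up short.
    printf("unsafe: %ld, safe: %ld\n",
           unsafe_counter, atomic_load(&safe_counter));
    return 0;
}

Compile with gcc -pthread and run it a few times: the unsafe count comes out different (and wrong) almost every run. The mutex fix works the same way, just with pthread_mutex_lock/pthread_mutex_unlock around the increment instead of the atomic.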

Spying on Hadoop with strace

in hadoop, strace

As you may already know, I really like strace. (It has a whole category on this blog). So when the people at Big Data Montreal asked if I wanted to give a talk about stracing Hadoop, the answer was YES OBVIOUSLY.

I set up a small Hadoop cluster (1 master, 2 workers, replication set to 1) on Google Compute Engine to get this working, so that’s what we’ll be talking about. It has one 14GB CSV file, which contains part of this Wikipedia revision history dataset.

Let’s start diving into HDFS! (If this is familiar to you, I talked about a lot of this already in Diving into HDFS. There are new things, though! At the end of this we edit the blocks on the data node and see what happens and it’s GREAT.)

$ snakebite ls -h /
-rw-r--r--   1 bork       supergroup       14.1G 2014-12-08 02:13 /wikipedia.csv

Files are split into blocks

HDFS is a distributed filesystem, so a file can be split across many machines. I wrote a little module to help explore how a file is distributed. Let’s take a look!

You can see the source code for all this in hdfs_fun.py.

import hdfs_fun
fun = hdfs_fun.HDFSFun()
blocks = fun.find_blocks('/wikipedia.csv')
fun.print_blocks(blocks)

which outputs

     Bytes |   Block ID | # Locations |       Hostnames
 134217728 | 1073742025 |           1 |      hadoop-w-1
 134217728 | 1073742026 |           1 |      hadoop-w-1
 134217728 | 1073742027 |           1 |      hadoop-w-0
 134217728 | 1073742028 |           1 |      hadoop-w-1
 134217728 | 1073742029 |           1 |      hadoop-w-0
 134217728 | 1073742030 |           1 |      hadoop-w-1
 ....
 134217728 | 1073742136 |           1 |      hadoop-w-0
  66783720 | 1073742137 |           1 |      hadoop-w-1

This tells us that wikipedia.csv is split into 113 blocks, which are all 128MB except the last one, which is smaller. They have block IDs 1073742025 - 1073742137. The math checks out: 112 full blocks × 134217728 bytes, plus 66783720 bytes for the last one, is about 14.1GB. Some of them are on hadoop-w-0, and some are on hadoop-w-1.

Let’s see the same thing using strace!

 $ strace -f -o strace.out snakebite cat /wikipedia.csv | head

Part 1: talk to the namenode!

We ask the namenode where /wikipedia.csv is…

connect(4, {sa_family=AF_INET, sin_port=htons(8020),
    sin_addr=inet_addr("10.240.98.73")}, 16)
sendto(4,
    "\n\21getBlockLocations\22.org.apache.hadoop.hdfs.protocol.ClientProtocol\30\1",
    69, 0, NULL, 0) = 69
sendto(4, "\n\16/wikipedia.csv\20\0\30\350\223\354\2378", 24, 0, NULL, 0) = 24

… and get an answer!

recvfrom(4,
"\255\202\2\n\251\202\2\10\350\223\354\2378\22\233\2\n7\n'BP-572418726-10.240.98.73-1417975119036\20\311\201\200\200\4\30\261\t
\200\200\200@\20\0\32\243\1\nk\n\01610.240.146.168\22%hadoop-w-1.c.stracing-hadoop.internal\32$358043f6-051d-4030-ba9b-3cd0ec283f6b
\332\206\3(\233\207\0030\344\206\0038\0\20\200\300\323\356&\30\200\300\354\372\32
\200\240\377\344\4(\200\300\354\372\0320\374\260\234\276\242)8\1B\r/default-rackP\0X\0`\0
\0*\10\n\0\22\0\32\0\"\0002\1\0008\1B'DS-3fa133e4-2b17-4ed1-adca-fed4767a6e6f\22\236\2\n7\n'BP-572418726-10.240.98.73-1417975119036\20\312\201\200\200\4\30\262\t
\200\200\200@\20\200\200\200@\32\243\1\nk\n\01610.240.146.168\22%hadoop-w-1.c.stracing-hadoop.internal\32$358043f6-051d-4030-ba9b-3cd0ec283f6b
\332\206\3(\233\207\0030\344\206\0038\0\20\200\300\323\356&\30\200\300\354\372\32
\200\240\377\344\4(\200\300\354\372\0320\374\260\234\276\242)8\1B\r/default-rackP\0X\0`\0
\0*\10\n\0\22\0\32\0\"\0002\1\0008\1B'DS-3fa133e4-2b17-4ed1-adca-fed4767a6e6f\22\237\2\n7\n'BP-572418726-10.240.98.73-1417975119036\20\313\201\200\200\4\30\263\t
\200\200\200@\20\200\200\200\200\1\32\243\1\nk\n\01610.240.109.224\22%hadoop-w-0.c.stracing-hadoop.internal\32$bd6125d3-60ea-4c22-9634-4f6f352cfa3e
\332\206\3(\233\207\0030\344\206\0038\0\20\200\300\323\356&\30\200\240\342\335\35
\200\240\211\202\2(\200\240\342\335\0350\263\257\234\276\242)8\1B\r/default-rackP\0X\0`\0
\0*\10\n\0\22\0\32\0\"\0002\1\0008\1B'DS-c5ef58ca-95c4-454d-adf4-7ceaf632c035\22\237\2\n7\n'BP-572418726-10.240.98.73-1417975119036\20\314\201\200\200\4\30\264\t
\200\200\200@\20\200\200\200\300\1\32\243\1\nk\n\01610.240.146.168\22%hadoop-w-1.c.stracing-hadoop.inte"...,
33072, 0, NULL, NULL) = 32737

The hostnames in this answer totally match up with the table of where we think the blocks are!

Part 2: ask the datanode for data!

Next, we ask 10.240.146.168 (that’s hadoop-w-1) for the first block.

connect(5, {sa_family=AF_INET, sin_port=htons(50010), sin_addr=inet_addr("10.240.146.168")}, 16) = 0
sendto(5, "\nK\n>\n2\n'BP-572418726-10.240.98.73-1417975119036\20\311\201\200\200\4\30\261\t\22\10\n\0\22\0\32\0\"\0\22\tsnakebite\20\0\30\200\200\200@", 84, 0, NULL, 0) = 84
recvfrom(5, "title,id,language,wp_namespace,is_redirect,revision_id,contributor_ip,contributor_id,contributor_username,timestamp,is_minor,is_bot,reversion_id,comment,num_characters\nIvan Tyrrell,6126919,,0,true,264190184,,37486,Oddharmonic,1231992299,,,,\"Added defaultsort tag, categories.\",2989\nInazuma Raigor\305\215,9124432,,0,,224477516,,2995750,ACSE,1215564370,,,,/* Top division record */ rm jawp reference,5557\nJeb Bush,189322,,0,,299771363,66.119.31.10,,,1246484846,,,,/* See also */,43680\nTalk:Goranboy (city),18941870,,1,,", 512, 0, NULL, NULL) = 512
recvfrom(5, "233033452,,627032,OOODDD,1219200113,,,,talk page tag  using [[Project:AutoWikiBrowser|AWB]],52\nTalk:Junk food,713682,,1,,210384592,,6953343,D.c.camero,1210013227,,,,/* Misc */,13654\nCeline Dion (album),3294685,,0,,72687473,,1386902,Max24,1156886471,,,,/* Chart Success */,4578\nHelle Thorning-Schmidt,1728975,,0,,236428708,,7782838,Vicki Reitta,1220614668,,,,/* Member of Folketing */  updating (according to Danish wikipedia),5389\nSouthwest Florida International Airport,287529,,0,,313446630,76.101.171.136,,,125", 512, 0, NULL, NULL) = 512
If we trace just the connect calls while cat-ing the whole file, we can see exactly which datanode the client connects to for each block:

$ strace -e connect snakebite cat /wikipedia.csv > /dev/null
connect(5, {sa_family=AF_INET, sin_port=htons(50010), sin_addr=inet_addr("10.240.146.168")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(50010), sin_addr=inet_addr("10.240.146.168")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(50010), sin_addr=inet_addr("10.240.109.224")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(50010), sin_addr=inet_addr("10.240.146.168")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(50010), sin_addr=inet_addr("10.240.109.224")}, 16) = 0

This sequence matches up exactly with the order of the blocks in the table up at the top! So fun. Next, we can look at the message the client is sending to the datanodes:

sendto(5, "\nK\n>\n2\n'BP-572418726-10.240.98.73-1417975119036\20\311\201\200\200\4\30\261\t\22\10\n\0\22\0\32\0\"\0\22\tsnakebite\20\0\30\200\200\200@", 84, 0, NULL, 0) = 84

This is a little hard to read, but it turns out it’s a Protocol Buffer and so we can parse it pretty easily. Here’s what it’s trying to say:

OpReadBlockProto
header {
  baseHeader {
    block {
      poolId: "BP-572418726-10.240.98.73-1417975119036"
      blockId: 1073742025
      generationStamp: 1201
    }
    token {
      identifier: ""
      password: ""
      kind: ""
      service: ""
    }
  }
  clientName: "snakebite"
}

And then, of course, we get a response (notice that the blockId, 1073742025, is exactly the first block in our table from the beginning of this post):

recvfrom(5,"title,id,language,wp_namespace,is_redirect,revision_id,contributo
r_ip,contributor_id,contributor_username,timestamp,is_minor,is_bot
,reversion_id,comment,num_characters\nIvanTyrrell,6126919,,0,true,264190184,,
37486,Oddharmonic,1231992299,,,,\"Addeddefaultsorttag,categorie
s.\",2989\nInazumaRaigor\305\215,9124432,,0,,224477516,,2995750,ACSE,12155643
70,,,,/*Topdivisionrecord*/rmjawpreference,5557\nJebBush,1
89322,,0,,299771363,66.119.31.10,,,1246484846,,,,/*Seea

Which is just the beginning of a CSV file! How wonderful.

Part 3: Finding the block on the datanode.

Seeing the datanode send us the data is nice, but what if we want to get even closer to the data? It turns out that this is really easy. I sshed to my data node and ran

$ locate 1073742025

with the idea that maybe there was a file with 1073742025 in the name that had the block data. And there was!

$ cd /hadoop/dfs/data/current/BP-572418726-10.240.98.73-1417975119036/current/finalized
$ ls -l blk_1073742025
-rw-r--r-- 1 hadoop hadoop 134217728 Dec 8 02:08 blk_1073742025

It has exactly the right size (134217728 bytes), and if we look at the beginning, it contains exactly the data from the first 128MB of the CSV file. GREAT.

Super fun exciting part: Editing the block on the datanode

So I was giving this talk yesterday, and was doing a live demo where I was ssh’d into the data node, and we were looking at the file for the block. And suddenly I thought… WAIT WHAT IF WE EDITED IT GUYS?!

And someone commented “No, it won’t work, there’s metadata, the checksum will fail!”. So, of course, we tried it, because toy clusters are for breaking.

And it worked! Which perhaps wasn’t super surprising, since replication was set to 1 and maybe a 128MB file is too big to checksum every time you want to read from it, but REALLY FUN. I edited the beginning of the file to say AWESOME AWESOME AWESOME instead of whatever it said before (keeping the file size the same), and then snakebite cat /wikipedia.csv showed the file starting with AWESOME AWESOME AWESOME.
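If you want to make the same edit programmatically instead of in a text editor, here’s a little C sketch (using the block path from Part 3) that overwrites just the first few bytes of the block in place, without changing the file size. Only do this on a toy cluster, obviously.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    // The block file we found on the datanode in Part 3.
    const char *path = "/hadoop/dfs/data/current/"
        "BP-572418726-10.240.98.73-1417975119036/current/finalized/blk_1073742025";
    const char *msg = "AWESOME AWESOME AWESOME";
    int fd = open(path, O_WRONLY);  // no O_TRUNC, so the size stays the same
    if (fd < 0) { perror("open"); return 1; }
    // pwrite replaces bytes 0..22 in place; everything after is untouched.
    if (pwrite(fd, msg, strlen(msg), 0) < 0) { perror("pwrite"); return 1; }
    close(fd);
    return 0;
}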

So some lessons:

  • I’d really like to know more about data consistency in Hadoop clusters
  • live demos are GREAT
  • writing a blog is great because then people ask me to give talks about fun things I write about like stracing Hadoop

That’s all folks! There are slides for the talk I gave, though this post is guaranteed to be much better than the slides. And maybe video for that talk will be up at some point.

LD_PRELOAD is super fun. And easy!

On Monday I went to Hacker School, and as always it was the most fun time. I hung out with Chase and we had fun with dynamic linkers!

I’d been hearing for a while that you can override arbitrary function calls in a program using an environment variable called LD_PRELOAD. But I didn’t realize how easy it was! Chase and I started using it and we got it working in, like, 5 minutes!

I googled “LD_PRELOAD hacks”, clicked on Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs, and we were up and running.

The first example on that page has you write a replacement rand function that always returns 42.

int rand(){
    return 42; //the most random number in the universe
}

That is LITERALLY ALL THE CODE YOU HAVE TO WRITE. Then you

gcc -shared -fPIC unrandom.c -o unrandom.so
export LD_PRELOAD=$PWD/unrandom.so

and now every program you run will always return 42 for rand()!
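For example, with a throwaway test program (any dynamically linked program that calls rand() works just as well):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // With unrandom.so preloaded, this prints "42 42 42" every single time.
    printf("%d %d %d\n", rand(), rand(), rand());
    return 0;
}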

We did a bunch of investigations into how tmux works, which was super fun. Chase wrote it up on his blog, and now I understand about daemonization way better.

We very quickly ran into the question: okay, what if you want to call the original printf from your hacked printf? That’s also explained in the “Dynamic linker tricks” article! (in the “Being transparent” section, using dlsym)
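Here’s a minimal sketch of that trick, sticking with rand() so our logging can’t accidentally call our own hook: the wrapper does its spying, then uses dlsym with RTLD_NEXT to look up the real rand in the next library down the chain (libc) and calls that.

#define _GNU_SOURCE  // for RTLD_NEXT
#include <dlfcn.h>
#include <stdio.h>

// A rand() that spies on the program, then defers to the real rand().
int rand(void) {
    int (*real_rand)(void) = (int (*)(void))dlsym(RTLD_NEXT, "rand");
    fprintf(stderr, "rand() was called!\n");
    return real_rand();
}

Compile it the same way as unrandom.so (on older systems you may need to add -ldl for dlsym).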

Somebody explained to me at some point that if you work for the NSA and you’re trying to spy on what information a program is using internally, tools like LD_PRELOAD are VERY USEFUL.

How it works

There is a very wonderful 20 part series about linkers that I am going to keep recommending to everyone forever.

When you start a dynamically linked program, it doesn’t have all the code for the functions it needs! So what happens is:

  • the program gets loaded into memory
  • the dynamic linker figures out which other libraries that program needs to run (.so files)
  • it loads them into memory, too!
  • it connects everything up

LD_PRELOAD is an environment variable that says “Whenever you look for a function name, look in me first!”. So if you didn’t want your program to be attacked like this, you could:

  • statically link your program
  • check for the LD_PRELOAD environment variable, and complain (though the attacker could also LD_PRELOAD the function that lets you read environment variables… :) ); there’s a tiny sketch of that check right below
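Here’s roughly what that (pretty naive) check might look like:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // Naive self-defense: bail out if anything is preloaded.
    // (As noted above, a preloaded library could just override getenv too!)
    if (getenv("LD_PRELOAD") != NULL) {
        fprintf(stderr, "LD_PRELOAD is set, refusing to run\n");
        return 1;
    }
    printf("no LD_PRELOAD, carrying on\n");
    return 0;
}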

I’m sure there will be more Exciting Stories about LD_PRELOAD for you all in the future, but this is all the stories I have for today.

PyData NYC (I gave a machine learning talk! yay!)

in conferences, machinelearning, talks

This past weekend I went to PyData NYC. It was super fun! I got to meet some people who do machine learning in different fields than me (THE BEST) and a guy at the bar told me about his classmates’ adventures in hacking login systems in Novell Netware. (Also the best. It turns out that one of the nice things about having a blog is that people occasionally come up to me at conferences and tell me GREAT THINGS).

And I gave my first-ever machine learning talk! I often feel like the main (only?) thing I’ve learned about machine learning so far is how important it is to evaluate your models and make sure they’re actually getting better. So I gave a talk called “Recalling with precision” about that (a terrible pun courtesy of @avibryant).

The basic structure:

  1. You’re building models, and you do lots of experiments
  2. You want to remember all of the results forever, so that you can tell if you’re improving
  3. Some arguments for why it’s important to evaluate whether your models actually work (for an example of what can go wrong, see The Value Added Teacher Model Sucks).
  4. “Just remember everything forever” is much easier said than done
  5. So let’s describe a lightweight system to actually remember some things forever! (basically: store training results in S3 forever; make a webapp to draw lots of graphs using those results)
  6. It exists! It’s open source. http://github.com/stripe/topmodel

(It’s still pretty new. If you try it and have thoughts, let me know?)

People asked lots of questions, so I think it may have been useful. It’s hard to tell, with talks :)

Some things I learned at PyData: (links largely taken from Michael Becker’s great wrap-up post)

Harm reduction for developers

Harm reduction is an idea in public health that says basically: people are going to do risky activities (intravenous drug use, sex, drinking alcohol, maybe abusing alcohol), and instead of saying “just say no to drugs!”, we can choose to help make those activities less risky.

Some examples of measures:

  • needle exchanges and safe injection sites
  • designated drivers and free taxis home
  • safer sex classes

I don’t want to trivialize any of these issues, but I think the idea that switching out YOU’RE WRONG TO DO THAT JUST DON’T DO THAT for doing something to help people be safer is super powerful and useful, and I see it in discussions of software development all the time.

Fun with machine learning: does your model actually work?

in machinelearning

I’m writing a talk for PyData NYC right now, and it’s the first talk I’ve ever written about what I do at work.

I’ve seen a lot of “training a model with scikit-learn for beginners” talks. They are not the talk I’m going to give. If you’ve never done any machine learning it’s fun to realize that there are tools that you can use to start training models really easily. I made a tiny example of generating some fake data and training a simple model that you can look at.

But honestly how to use scikit-learn is not something I struggle with, and I wanted to talk about something harder.

I want to talk about what happens after you train a model.

How well does it work?

If you’re building a model to predict something, the first question anyone’s going to ask you is:

“So, how well does it work?”

I often feel like the only thing I’ve ever learned about machine learning is how important it is to be able to answer this question, and how hard it is. If you read Cathy O’Neil’s blog posts about why models to measure teachers’ teaching are flawed, you see this everywhere:

we should never trust a so-called “objective mathematical model” when we can’t even decide on a definition of success

If it were a good model, we’d presumably be seeing a comparison of current VAM scores and current other measures of teacher success and how they agree. But we aren’t seeing anything like that.

If your model is actually doing something important (deciding whether teachers should lose their jobs, or how risky a stock portfolio is, or what the weather will be tomorrow), you have to measure if it’s working.

What women in technology really think (150 of them, at least)

spoiler: EVERYONE IS DIFFERENT AND PEOPLE THINK LOTS OF DIFFERENT THINGS

I’ve read tons of articles about or by women who work as developers, and yesterday I read the best thing yet. I found a link to a survey on Twitter asking for the experiences of women who work in technology, and did the survey. There’s a question at the end that asks

Any other comments you’d like to make?

I didn’t think much of it at the time – I said something (that I’ve had an incredibly good experience, but that it makes me very angry that other people have bad experiences). But then I looked at everyone else’s responses. It’s an amazing example of a wide range of women’s opinions and experiences, and I think you should read it. I’ve formatted all of the responses to that question a little better, or you can see the full survey results.

Working remote, 8 months in (seeing humans is important!)

in remote

I wrote up what it was like to be working remote 3 months after I started. 5 months later, I have some new thoughts!

The worst thing about working remotely so far has just been feeling generally alienated. I talked a little about motivation in Don’t feel guilty about not contributing to open source, where I mentioned a theory that motivation is made up of competence (I know how to do this!), autonomy (I can make my own decisions!), and relatedness (I know why I’m doing this!).

It turns out that a) this is called self-determination theory, and b) I totally misunderstood what “relatedness” meant. It’s actually about feeling connected to the people around you (“the universal want to interact, be connected to, and experience caring for others”). It’s the opposite of feeling alienated :)

I didn’t visit the office for 4 months and that was a mistake! It turns out that if I don’t see a group of people for way too long, it’s easy to feel like nobody cares about me and everything I do is terrible, and that has all kinds of strange unforeseen negative consequences. Visiting SF for a while made everything approximately 100x better. In gifs (of course), seeing people can be the difference between [a gloomy gif] and [a delighted gif].

Right now I feel a lot more like gif #2, which is pretty great.

How to set up a blog in 5 minutes

Some people at Hacker School were asking for advice / directions for how to set up a blog. So here are some directions for a simple possible way!

There are lots of ways to set up a blog. This way will let you write posts with Markdown, version them with Git, publish them with git push origin gh-pages, and, most importantly, think for exactly 0 seconds about what your site should look like. You’ll need a working Ruby environment, which is the hardest part. I use rbenv to manage my Ruby. I’ve spent months being confused about how to make Ruby work, so if you also need to set up Ruby, it will take more than 5 minutes and this title will be a total lie.

But! If you already have Ruby working, there is no excuse for you to not have a blog within 5 minutes (or at least not more than an hour). It took me 40 minutes, but that was because I was also writing this blog post.

I used to worry a lot about what my website looked like, and then I realized if I wrote interesting blog posts basically nobody cared!