{ on programming and the internets }


by Louis Brandy

Using genetic algorithms to find Starcraft 2 build orders

UPDATE: Just to make this explicitly clear: I did not write this program. The fourth sentence makes this fairly clear, but some comments indicate that some people are a bit lazy :) . I also added a video of the rush to the bottom.

I’ve been on a bit of starcraft kick recently (see my replay aggregator: replayspider). I also work in computer vision, and general AI has always been an interest (see this post on “evolving” faces). Sometimes strange interests collide. Over on a forum called teamliquid, a user by the name of Lomilar posted a fairly long thread about a program he had written that optimized build orders for the zerg race in starcraft. He eventually cleaned up his code and posted the code to googlecode. The program is called EvolutionChamber (a clever name, as it’s the name of one of the buildings in the game), and it uses genetic algorithms to find build orders.

This I had to see.

A quick Starcraft 2 primer: setting up the rules

Since this is primarily a programming blog, I’m going to assume you know a fair bit about video games. Feel free to skip this section if  you know anything about starcraft 2, in particular. Essentially, SC2 is a real-time strategy game that starts you with a simple base, some workers, and some resources, and asks you to put those workers to work collecting resources that you can use to build things (including other workers), eventually building up an army and killing your opponent. A build order refers to the exact opening steps you take early in the game that best supports the strategy you are trying to conduct. These early games are all about balancing spending money on your economic foundation (making more workers), making units (rush!), or building new buildings or getting new upgrades (tech!).

Build orders generally only cover the very early game because once you’ve scouted the enemy, you have to begin to react to what he’s doing and modifying it as you go. In other words, and a perfect aphorism, no battle plan survives first contact with the enemy. In this way, build orders in real-time strategy are very much akin to openings in chess. They set up the soul of the entire game about to be played, and some players prefer to force certain types of games, depending on what kind of opening they choose to do.

One of the reasons build-order optimization is so important is that you can discover openings that “hard-counter” other openings. If I can get an army of N size into your base when you do opening X, you will always lose.

Enough with the abstract, here’s what you need to know:

  1. The program in question optimizes Zerg build orders (which is one race in starcraft), this is a rather significant choice because the mechanics of the zerg race are arguably the most difficult to manage (esp. for build order optimization).
  2. Of most interest are “rush” build orders. This means “how quickly can I get N of this type of unit?”.
  3. There are two primary resources that workers collect in starcraft: gas and minerals.
  4. Zerg also have a third de factor resource: larva. Larva are used to create ALL zerg units, including workers.  So long as you have less than three,  they regenerate at a fixed rate (note: this means any time spent at three larva delays all future larva production — very bad).
  5. Most units require some building to be constructed in order to be “unlocked” (and many of these buildings require others as prerequisites – this is the so-called tech tree)
  6. Creating a building causes you to lose the worker who creates it (so the longer you can wait, the more resources that worker can collect before building the building)

These “rules” provide for an extremely complicated search space to find optimal build orders. Essentially, you want to make the exact number of workers you need as quickly as possible (and then no more). Losing a worker when you make a building, and delaying all future larva when at three larva make the dynamics extremely complicated.

How EvolutionChamber works

At its core, the program is a genetic algorithm. For those of you who don’t know, a genetic algorithm is a type of optimization algorithm that tries to find optimal solutions using a method analogous to biologic evolution (to be specific: descent with modification & natural selection). Put simply, you take a “population” of initial build orders, evaluate them for fitness, and modify the population according to each element’s fitness. In other words, have the most successful reproduce.

The program’s input is simply the desired game state. In practice, this means “make N units” to determine some rush build order (but it also allows for other types of builds, like make N workers with some defensive structures and a small army). Here are some of the highlights:

  1. It’s written in Java using JGAP.
  2. A ‘chromosome’, in this case, is an array of ‘actions’ that can be done in game. (e.g, 1) Build a drone. 2) Build a drone. 3) Build a spawning pool. 4) Build an overlord. And so on.)
  3. Invalid actions (ie, trying to build a unit you cannot build because you do not have the tech necessary) are ignored (this allows for “junk dna”).
  4. An action that can’t be done YET (not enough minerals!) causes the simulation to wait until it can be done.
  5. It uses some fairly standard mutation types (deletion, insertion, and one strange one called “overlording” — heh)
  6. It uses the “many villages” approach where there are several separate populations evolving independently.
  7. Populations that are deemed to be stagnant are annihilated and replaced by a variant of the most successful.
  8. The fitness function is really a measure of distance from the “desired” state and the current state (this is measured by the difference in resources required to get there), taking into account the time required (less time is always better).

The 7-roach rush

For the starcraft nerds among you, here’s one of the very first builds constructed by the program:

10 extractor-trick to 11
11 overlord
11 spawning pool
15 extractor
16 queen (stop drones here)
18 overlord
18 roach warren
17 overlord (yes, two)
spawn-larva on queen when she pops
roach x7

This is a fairly fascinating build order in a number of respects.

First, from a starcraft perspective: it is incredibly strong. To be clear, I am certain virtually anyone who practiced this build and went onto the ladder and used it in every game would very easily rise to diamond level (currently the highest league). The seven roaches at that time in the game will destroy all but the most well-executed counter-builds. It caused an almost immediete stir on the starcraft forums, and had one player proclaiming that an all-in variant (“all-in” coming from poker and to mean that you win or lose right now, in this case it means attacking with all your workers as well as your army) was completely unstoppable in a certain matchup.

Second, as far as I know, this build was “discovered” by the program (or at least, it’s never been well known). There is a similar build that’s been well known called the 5-roach-rush (the 5-roach-rush comes later, and is a bit more economical). When comparing the two, the 5RR has certain situational advantages, but the 7RR build, above, has two staggeringly obvious advantages: 1) you get two more roaches, 2) you get them almost 45 seconds sooner. I’m not 100% certain this build was “discovered” by the program, but I do know it’s not been extremely popular or considered standard play so my guess is that it’s not been studied in too much detail.

The most interesting part of this build, however, is how counter-intuitive it is. It violates several well-known (and well-adhered-to) heuristics used by Starcraft players when creating builds. Some of this may be lost on you non-starcraft players, but I’ll do my best to explain.

Extractor trick.

The extractor trick is using a drone to build an extractor (remember this removes the drone), then build a replacement drone, then cancel the extractor, giving you the original drone back — this allows you to build one more worker than your supply would allow. The extractor trick, as used above, has been tested and seen to be economically inferior to a more standard play of buying an overlord on 9 supply. The extractor trick + very early spawning pool do some economic damage and induce a small larva delay, so they are almost never seen. In this case, however, the extractor trick + early pool end up speeding up the entire tech path (this is the primary reason why the 7RR produces roaches much sooner than the 5RR).

Double Overlord at 18/17.

First, a quick discussion about supply. Each unit you create costs supply. So long as you have supply, you can make units. Most RTS games have a similar concept. Overlords, for the zerg, are the unit that provides this supply. So for zerg, you have to spend 100 minerals to unlock additional supply. In general, it  is considered “optimal” for you to have just enough supply to not be supply-blocked. It’s generally considered wasteful to buy supply when you don’t need it (since you could spend that money, instead, on units or workers now, and buy the supply later, when you need it).

In almost all cases, it would be extremely wasteful to purchase supply twice in a row. I’m not sure any starcraft player would look at such a build and consider that a good idea. In this case though, it ends up  working out so perfectly that you’d actually have to try it any other way to understand how and why. Because your desired goal is 7 roaches, you will need to construct 2 more overlords at some point, but doing them both so early is certainly surprising. During that particular period of the build (around 18 supply), you end up waiting for your roach warren to finish so you can begin creating roaches.  This causes you to max out on larva, stopping regeneration. By moving the second overlord so far up into the build, this larva ends up being “free” — you’d lose it anyway because the regeneration would stop. So the only penalty to making the second overlord early is minerals, and during this portion of the build, you are not mineral-bound. The net result is that moving that overlord so far up into the build costs nothing and frees up larva regeneration to produce quicker roaches.

This is the type of non-obvious optimization that genetic algorithms excel at.

As for the 7-roach-rush, I’m certain if you are playing starcraft2, you’ll see this build quite a bit. As for whatever hidden and game-breaking builds remain undiscovered, that remains to be seen.

UPDATE:

Here’s a video of someone doing the ‘all-in’ variant of this build against a real player:

Thanks to rockpapershotgun.com for the writeup and pointing me to this video.

Things that I think I think

1. Not everything I think I think can be a “full” blog post.

2. Negative Information Theory

It’s a little appreciated fact that being frequently wrong is (adjusted for a prioris) as difficult as being frequently right. For work, I once wrote a pixel classifier that, given an eye region of an image, would identify red-eye pixels. My classifier worked exactly as I intended it (ie, it was debugged) and yet it was wrong about 80% of the time. It was simultaneously a terrible and terribly effective algorithm.

3. A Trick for Sales Projections

For internal use, halve it. For external use, double it. Twice.

4. The web is such a friggin hack

Imagine if I told you we were going to build a system to run our company. We’d use a domain specific language for the data-store, a second language for the back-end code which is run on a server written in a third language, a fourth xml-based approach to explain the documents, a fifth DSL to explain the styling of said document, a sixth language to write the front-end code, and a seventh “language” that is actually a library that makes the sixth language usable for our purposes. Would anyone think I was crazy?

Not that the web can easily be “fixed” but the existence of things like “reset.css” or tables vs css are not just excellent examples of the huge pile of sand people have built castles on, but more precisely, are a testament to a slaughtered paradigm.

5. The “one-page resume is a myth” myth

I’ve looked at so many resumes. I try not to be biased but I’ll be honest. Long resumes are not as good as focused resumes. There isn’t some rule where over a certain length goes into the trash, it’s just that everyone gets about the same amount of time, and if yours is long, I’m more likely to miss something good.

6. All my best work is done in the shower…

Programming work, that is. This is why I never feel guilty if I leave 15 minutes early. I traded 15 terribly unproductive minutes for 15 terribly productive ones.

Feeling productive…

I realized something important about productivity. There is an important difference between feeling it, and doing it. One of those things that feels productive, but isn’t necessarily, is making a list. If you write down all the things you need to do, somehow, somewhere, that feels like you’ve accomplished something. Except, of course, you haven’t.

Another thing that feels productive, but isn’t, is blogging. That is the primary reason I took a nice long vacation from writing here. I’m the kind of person that always has a side project. Sadly, I found that I was spending a few hours on a Sunday writing and that I felt “productive”, satisfying my need to be doing something. Except I wasn’t. My list was just getting longer.

So I took a few months “off” to actually focus on some side projects, and, lo and behold, things got done. Excellent. One of those side projects (not really “side”) is our work blog (not up yet) that will be focusing on face recognition issues. Face recognition is a fairly fascinating topic and so I’ll probably be writing a fair bit for the company blog on the issue in the coming weeks/months.

As for side projects, here’s one of them: replayspider, a starcraft 2 replay aggregator. If you aren’t a starcraft 2 replay junky, like me, you’ll likely be very uninterested. Scrape all the good sites (esp. the Korean and Chinese ones), download all the replays, extract the most useful metadata, and let people search for replays in a sane way.

It’s not even close to where I want it, but at least it’s launched. I’ll circle back to it when I’ve got a few more minutes. Next feature idea: trying to detect maphacks from the replays themselves.

Me vs Backspace: my war against a rogue character

Amazingly, this is my second story about me and a single character.

A while ago, I began receiving reports from various people in IRC that I was prepending a garbage ascii character to some of my messages. I found this very odd. The complaints didn’t stop. It became a running joke. Eventually, playful mockery. It is a bit demeaning to hang out in programmer corners and not have any idea how, what, or why your computer is misbehaving.

After a bit of digging, I discovered that the rogue in question was the character for backspace. Now, this is an odd character to be inserted into text because most often this character doesn’t actually go anywhere, it informs the program that you want to, you know, backspace. This story gets a bit tricky here because not only does this character rarely make it into text, but most programs won’t render it. Only certain programs, on certain platforms, will actually render the backspace character as anything (usually a box with a little 0×0008, or as ^H). So while other people were telling me about rogue characters, I never saw them.

My first suspicion was that the IRC program I was using, Colloquy, was bugged (spoiler alert: it’s not). I spent a long time asking colloquy/irc people, searching the googles, and otherwise trying to figure out how this rogue character was getting printed. I continued to fail. I continued to be mocked.

The plot thickens…

One day I recieved an instant message from my brother: “What is that weird character at the beginning of your messages?”. Wait? What? My instant messages have the same character? Surely Adium isn’t suffering from the same bug, is it? How have I not heard of this before?

It turns out that most instant messaging programs don’t render the backspace character. Pidgin, however, does. And that day my brother was using Pidgin, and he saw the character of doom. I eventually noticed that I was apparently inserting rogue character using Safari, and later Firefox, all on my work computer (a Mac). It’s important to note that none of my Mac programs — Chrome, Safari, Firefox, Adium, or Colloquy — would render the character, but they were all inserting it at the beginning of lines into random things. (You could see the character if you viewed my text using Firefox on windows or linux, for example).

I decided to wipe my Mac, reinstall the OS, and see if that fixed it. It didn’t. I began to suspect this was a hardware problem having to do with my Mac. I spent a long time asking mac people, searching the googles, and otherwise trying to figure out how this rogue character was getting printed. I continued to fail. I continued to be mocked.

Luckily, my Mac was breaking anyway…

After this went on for close to a year, my Mac began having all kinds of problems including its charging circuitry, so I decided to replace my mac (or rather, lobby to have my work replace my mac). I bought a bright shiny new macbook pro and was good to go. I was pretty sure this would solve the problem.

Very soon thereafter, before I setup my new computer, I went to my Hacker News profile and saw this message: http://news.ycombinator.com/item?id=1531448 A user by the name of ‘davidw’ informed me that the stupid character appeared in a comment I wrote. There’s only one problem: I wrote that message using my wife’s mac. (if my life was a tv show, this is where the screen would go black — end of episode).

So I get my new computer setup, and ask the people in IRC if the character was still happening. It was! Gahh!!! This was good news, and bad news. It finally meant that I’ve reproduced this problem on three different macbooks in a variety of programs. It had to be something I was doing that was causing this.

I spent the next half-hour in IRC, talking to myself, trying to reproduce the problem. I finally figured it out.

TLDR: It’s a Mac bug (I think)

You can see the nasty character, live, in this comment: http://news.ycombinator.com/item?id=1530832. For many of you, however, this comment will look completely normal. That is because the program you are using doesn’t render it. If you, for example, use firefox in linux, you will see the rogue character. I’ve got a screenshot for you:

Here’s how to reproduce the problem, exactly: On a mac laptop (and maybe others), you need to hold down LEFT SHIFT and an arrow key (LEFT ARROW works great). While holding those two buttons, press backspace (or, to be precise, DELETE). The tricky part, however, is verifying that you’ve actually sent a rogue character, since none of the standard mac programs (that I know of) will render the character. The way that I verified the problem was to use Colloquy to send IRC messages to myself, reading those messages using emacs-irc. There are other ways, I am sure.

Why I kept doing it: I often use the shift-<arrow> shortcut to highlight various bits of text. Most often I tend to do shift-UP to highlight an entire line and then hit DELETE to remove it. Obviously, if I do this incorrectly, it ends up replacing the entire line with the backspace character. As I type in the new line, it comes out as a perpended backspace character on my final submission.

The million dollar question: is this a bug, or is there some feature I’m inadvertantly using? Anyone got any bright ideas on how I could fix this (other than unlearning my shift-UP habits)?

Silence is not golden.

So RethinkDB wants to know where all the real programmers are. I do too. While I can commiserate with their post, there is one huge issue I have with it. That is the issue of silence.

Let me start by saying that the primary purpose of RethinkDB’s post wasn’t to discuss their interview philosophy (it was to generate good leads — if you are looking for work, by all means, go apply). That means this post is probably slightly unfair to them in that I’m about to construct a gigantic strawman out of their post. However, I believe this template is so common that it still applies, in general.

As I’ve said here from time to time, I do quite a bit of the HR stuff for our small (startup — when are we no longer a startup?) company. I’ve been doing technical interviews fairly frequently for a few years now. One of the biggest lessons I’ve learned as an interviewer is that silence is often times my fault, not theirs. If an interview is mostly silence, the interviewee was either woefully unqualified (ie, it should have never gotten this far), or the interviewer lost them.

Silence is not a sign of just ignorance, it is almost always a sign of ignorance mixed with confusion or panic. If you ask someone a direct question “do you know what X is?”, they may think for a minute, but they will usually say “no”. Silence, however, usually means “I do not 100% know what they want me to say, and I need to think of the answer that is least wrong”.

Let’s use one of their examples.

How would you implement a read-write lock?

It doesn’t surprise me such a question would result in silence. It’s actually two questions. Do you know what the term “read-write lock” means and how would you actually implement one? For someone who was recently a student, or someone who hasn’t done much concurrency recently, this sounds an awfully lot like an exam question.

On the inside it’s a bunch of panic…

What -is- a read-write lock? Is that what you are asking me? Am I supposed to know what a read-write lock is? What class was that in? The one with the philosopher’s dinner table? Yea, probably. It’s some kind of concurrency thing, probably. It’s probably related to reading and writing? Uhm, that’s not a very good hint… Are they really asking me for the definition? Am I going to get marked down for not knowing the definition?

On the outside, it’s silence.

Say no to silence, interviewer and interviewee alike

You don’t actually have to code it over the phone. Mentioning starvation issues is bonus points. For heaven’s sakes, just give us something.

We try to ask about the difference between cooperative and preemptive multitasking. We try to ask about condition variables. 19 out of 20 times there is silence on the other end.

There’s a good lesson for everyone here. Silence is pure poison in an interview. There should be all of five seconds of silence before you give up and answer their question with a question. If you are being interviewed, say something. “Let me make sure I understand what a read-write lock is, first. Is that when you can have many readers and no writers, or just one writer?” Even if you are wrong, you can at least get them to specify what they mean, and then you can try to answer the question that’s been asked.

Even saying “I don’t know what a read-write lock is.” is almost certainly better than pure silence. Yes, it’s a negative statement, but if I tell you, and then you proceed to discuss it flawlessly, I’ll chalk it up to your textbook using a different term (or you just forgetting the vocabulary). No harm, no foul.

And then there is the flipside of this conversation: the cold, hard, reality that every interviewer faces from time to time. If an inordinate amount of interviewees are troublingly silent, it might be because your standard are really high. There is, however, another explanation.

If you are the interviewer, you have to realize that silence breeds silence. It’s usually easy to break down questions into parts that make it much less likely to be ambiguous and promote that silent, confused, panicked state. First, ask them if they know what a read-write lock is. Then ask them what it is. And once you’ve both agreed on the definition, ask them how they’d implement one. This simple modification results in far more “yes” and “no”, and far less confused silence.


new server, if you see badness, please email me!

Need a new job?