Procedural Generation - How Does It Work?
A universe from a single seed

Procedural what?

Procedural generation is a set of techniques used heavily in video game development and computer graphics. It provides methods for generating large amounts of content algorithmically rather than manually: the software itself can generate your game's maps, weather and even music rather than these having to be hand-crafted by artists and designers. Thanks to this, the game's size on disk can be much reduced, it can include way more content and bigger worlds, and it can present highly unpredictable and replayable experiences.

As with any tool, pros come with cons: whilst procedural generation can bring great benefits in terms of memory usage, artist time and replayability, these experiences are in danger of feeling poorly designed or unnatural. You don't want your RPG's dungeon to be riddled with tediously empty rooms or overpowered monsters, so tuning the generated content to the intended gameplay is just as much a part of the game developer's craft here as creating it in the first place.

In this article we'll go through the basic concepts powering procgen games with some interactive examples. The few code snippets are in JavaScript, but it's by no means essential that you understand these in order to follow along.

Flatland

Let's work on a little project. We're going to make a game right here in the browser. Well, a game world at least - all that running about and jumping and animation can wait for another day.

We'll call it Flatland. Flatland is a 2-dimensional place with solid ground and clear skies. Scroll left and right to really drink in the vistas.

I think it's very exciting, but I made it so I would. You probably think it's a little boring. However, Flatland represents pretty much the prototypical, trivial example of procedural generation. It's just generated from some pretty uninspired procedures.

The game running up above is drawing its terrain by following a very simple process: for each x value (that is, horizontal coordinate), it calls a function called getGroundHeight with the value of x. The function returns a number representing how high or low the ground should be at that x value.

function getGroundHeight(x) {
  return 0;
}

This is the getGroundHeight powering Flatland - for any value of x imaginable, it just says "the ground should be at height 0". Hence the flatness. I've drawn in the result of each call to getGroundHeight with black dots to show you where they land.

But don't scoff just yet - this is an infinite world! I could say "hey, getGroundHeight, what's the height of the ground at the x value of 11924294719272?" and it'd be like "still 0". It wouldn't take it any time, it wouldn't need to go off and load a map file from disk or over the internet, it'd just be able to tell me.
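To make that drawing process a bit more concrete, here's a rough sketch of how a renderer might sample the height function across whichever stretch of the world is on screen. The names sampleTerrain, startX and count are made up for illustration; the demo's actual drawing code isn't shown in this article.

```javascript
// Flatland's height function, as above.
function getGroundHeight(x) {
  return 0;
}

// Hypothetical helper: ask the height function for `count` consecutive
// x values starting at `startX`, and collect the points to draw.
function sampleTerrain(heightFn, startX, count) {
  const points = [];
  for (let i = 0; i < count; i++) {
    const x = startX + i;
    points.push({ x: x, y: heightFn(x) });
  }
  return points;
}

// Sampling works just as well a very long way from the origin.
const nearby = sampleTerrain(getGroundHeight, 0, 5);
const farAway = sampleTerrain(getGroundHeight, 11924294719272, 5);
```

Nothing is stored between calls: each point is worked out from its x value alone, which is why scrolling to a distant part of the world costs no more than staying put.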

Making waves

Obviously the Steam reviews aren't going to be lighting up over Flatland. We need the land to be a little bit less flat. We'll need a new implementation of getGroundHeight (and... I guess a new name).

Introducing Waveland!

function getGroundHeight(x) {
  return Math.sin(x);
}

Now we're cooking! This is maths! By using a slightly more interesting and repeating function like sine (or Math.sin() in JavaScript), we've got ourselves some satisfying hills.

Just like in Flatland, this function provides us with an infinite amount of ground for our game - Math.sin(1347108) is nice and bounded somewhere between -1 and +1 just as much as Math.sin(0) is, so we're guaranteed never to get some crazy high hill or deep valley that the player can't traverse. Scroll the game left and right and check it out. Don't go looking for anything more interesting than those wavy hills, though - sine just repeats and repeats to infinity in both directions.

Randomness and unpredictability

Still not good enough, really, is it? Waveland's gently undulating hills might work in some specific cases but its terrain isn't very realistic or gripping.

So what's the problem?

Essentially, it's that the functions we've been using thus far are too predictable. Without even bothering to actually calculate any concrete values of getGroundHeight I can tell you what Flatland or Waveland will look like 10, 100 or 1000000 steps further in the x direction. "Flat" or "wavy", respectively.

With that in mind, we need to introduce some randomness to the equation. I'm going to throw a definition in here that we'll explore more in the coming examples: random noise functions.

A random noise function is a function N that can be given any real value and satisfies two conditions:

(1) N(x) is between -1 and 1 (inclusive) for any value of x.

(2) N has some perceived degree of randomness.

It's hard to find a precise definition that's agreed on by everyone who uses random noise, and obviously point (2) is quite subjectively phrased, but this is a form that I think most would be happy with. I hope you can see that by this definition Flatland and Waveland's generators are not noise functions, since although they satisfy point (1), they fall short of (2).

Let's take a crack with a new ground generator that satisfies both conditions to be our first fully-qualified noise function. Another function means another game name, so say hello to Noiseland!

function getGroundHeight(x) {
  return (Math.random() * 2) - 1;
}

Ah.

No, this won't do. This won't do at all.

What happened? This totally random noise is indeed random, but really it's a little bit too random. Mathematically, it's not continuous, meaning that close-together x values can lead to far-apart height values. The result is a set of hills you could fall over and impale yourself on.

(A little aside to explain that bit of code: Math.random() gives a random number between 0 and 1, so when we multiply by 2 and subtract 1, all we're doing is getting a random number between -1 and 1 as per our definition of noise).
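The same multiply-and-shift trick works for any range, and we can also put a number on how jagged Noiseland is. The sketch below is illustrative only: randomBetween and biggestJump are names I've invented here. Strictly speaking Math.random() can return 0 but never quite reaches 1, so the result can hit min but always lands just short of max.

```javascript
// Hypothetical helper: stretch Math.random()'s [0, 1) output onto [min, max).
function randomBetween(min, max) {
  return min + Math.random() * (max - min);
}

// Noiseland's generator is the special case min = -1, max = 1.
function getGroundHeight(x) {
  return randomBetween(-1, 1);
}

// Hypothetical helper: the largest height difference between any two
// neighbouring x values in 0 .. count - 1.
function biggestJump(heightFn, count) {
  let biggest = 0;
  let previous = heightFn(0);
  for (let x = 1; x < count; x++) {
    const current = heightFn(x);
    biggest = Math.max(biggest, Math.abs(current - previous));
    previous = current;
  }
  return biggest;
}
```

Calling biggestJump(getGroundHeight, 1000) will usually come out close to 2, the full height of the world, whereas Flatland's generator scores 0. Those cliff-sized jumps between neighbours are the spikes you can see above.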

Thankfully, a man named Ken Perlin did a lot of work on coming up with a continuous but still random-seeming noise function (as did a load of other people, but his is probably the most widely used noise function in modern practice). He produced the function now known as Perlin noise, for which he received an Academy Award for Technical Achievement, such is the usefulness of the function to the digital arts. He developed it after working on Disney's Tron, which is neat.

Perlin noise is a little bit too complex to include in a code snippet like I did above, but all you really need to know for now is that we have some function of x, just like all the ones above, which returns a value in [-1, 1], just like all the ones above. It's just that this particular function gives a much more natural feel. Take a look. Click regenerate and scroll around a few times to get a feel for the kind of output we get here.

That's a bit more like it! These hills (on the whole) look a lot more realistic, and they're unpredictable but guaranteed to be continuous. As far as our definition is concerned, this is great noise.
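For the curious, here's a heavily simplified one-dimensional sketch in the spirit of Perlin's idea. To be clear about what's assumed: this is not Perlin's reference implementation and not the code behind the demo above. The function names, the integer hash and the seed parameter are all my own illustrative choices. The gist is to pick a pseudo-random slope at every whole-number x and blend smoothly between neighbouring slopes.

```javascript
// Turn an integer into a repeatable pseudo-random number in [0, 1).
// This integer hash is an illustrative choice, not part of Perlin's algorithm.
function hashToUnit(i, seed) {
  let h = (i ^ seed) | 0;
  h = Math.imul(h ^ (h >>> 16), 0x45d9f3b);
  h = Math.imul(h ^ (h >>> 16), 0x45d9f3b);
  h ^= h >>> 16;
  return (h >>> 0) / 4294967296;
}

// Smooth "fade" curve: 0 at t = 0, 1 at t = 1, and flat at both ends.
function fade(t) {
  return t * t * t * (t * (t * 6 - 15) + 10);
}

// Simplified 1D gradient noise: returns a value in [-1, 1].
function gradientNoise1D(x, seed) {
  const x0 = Math.floor(x);
  const t = x - x0; // how far we are between the two whole numbers

  // A pseudo-random slope in [-1, 1) at each neighbouring whole number.
  const slope0 = hashToUnit(x0, seed) * 2 - 1;
  const slope1 = hashToUnit(x0 + 1, seed) * 2 - 1;

  // Each slope predicts a height at x; blend the two predictions smoothly.
  const fromLeft = slope0 * t;
  const fromRight = slope1 * (t - 1);
  const blended = fromLeft + fade(t) * (fromRight - fromLeft);

  // In 1D the blend stays within [-0.5, 0.5], so double it.
  return blended * 2;
}

function getGroundHeight(x) {
  // Stretch the hills out: this noise is 0 at every whole-number input,
  // so we sample in between rather than only at whole numbers.
  return gradientNoise1D(x / 20, 1234);
}
```

Because neighbouring x values share the same pair of slopes and the blend changes gradually, nearby heights stay close together, which is the continuity that Noiseland was missing.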

Another noise function named Simplex noise is very widely used. It aims to provide essentially the same effect as Perlin noise but with a few improvements to the speed at which it can be calculated. The terrain below is generated using Simplex noise: give it a couple of refreshes and you should see that it gives pretty similar results to the above.

Extra credit

The above is the basic concept of procedurally generated terrain for a two-dimensional world. Before we move on to talking about the practicalities of making a replayable game with random numbers, let's look at a few more neat techniques based around noise.

Since we can play arbitrarily with the amplitude (height) and frequency (wide...ness) of the wave shapes produced by these noise functions, we can play one final trick to make our hills even lovelier. In the world below (Perlinland?), we generate a "base" ground map from Perlin noise just like we have been up to this point (that's the black dots). We then generate another, lower-amplitude and higher-frequency Perlin noise (the pink / red dots) and add that noise to the base noise. This gives a cool, weathered-looking landscape with a bit more richness.
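Here's roughly what that layering could look like in code. This is a sketch under assumptions rather than the demo's real source: layeredHeight and the layer values are invented for illustration, and Math.sin stands in for the noise function so that the snippet runs on its own. In practice you'd pass in Perlin or Simplex noise instead.

```javascript
// Hypothetical helper: sum several copies of a noise function, each with
// its own amplitude (height) and frequency (how quickly it wiggles).
function layeredHeight(noiseFn, x, layers) {
  let total = 0;
  for (const layer of layers) {
    total += layer.amplitude * noiseFn(x * layer.frequency);
  }
  return total;
}

// Stand-in noise so the sketch runs on its own; swap in Perlin or Simplex.
const standInNoise = Math.sin;

// Illustrative values only.
const layers = [
  { amplitude: 1.0, frequency: 1 },  // the "base" hills
  { amplitude: 0.25, frequency: 4 }, // smaller, busier detail on top
];

function getGroundHeight(x) {
  return layeredHeight(standInNoise, x, layers);
}
```

One thing to watch: the summed result can stray outside [-1, 1] (here it can reach 1.25 in either direction). If the rest of your game relies on that range, divide the total by the sum of the amplitudes.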

Layering like this is a valuable technique in coercing the relatively smooth outputs of noise functions into the various shapes and textures desired for creative effect. I mentioned in the introduction that the tuning and shaping of procedurally generated content is a huge part of the skill of creating games that use it. It's one of the reasons that it's wrong to think of procedural generation as a way for games designers to escape creative, artistic work. It just takes a slightly different form than painting textures and hand-crafting maps.

What about some weather? Let's add some rain. Guess how we'll do it!

Let's add another noise-generated layer, this time for some measure of rain intensity. I've drawn on this layer as blue dots below, along with a rain threshold - when the rain intensity goes above this level, it rains. When it's below the line, it doesn't.

We can keep going on this theme as long as we like. For example, why don't we throw in a temperature function? It could tell us whether our rain should actually be snow! This time we have a yellow series and threshold - when the temperature's below the level, we make the rain into snow.
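Put together, the rain and temperature layers boil down to a couple of comparisons. The sketch below assumes you already have two noise functions to hand; the threshold values and the name getWeather are hypothetical and picked purely for illustration.

```javascript
// Hypothetical thresholds on the [-1, 1] noise scale.
const RAIN_THRESHOLD = 0.3;
const SNOW_TEMPERATURE = -0.2;

// rainNoise and temperatureNoise can be any two noise functions of x.
function getWeather(x, rainNoise, temperatureNoise) {
  // At or below the rain threshold, nothing falls from the sky.
  if (rainNoise(x) <= RAIN_THRESHOLD) {
    return "clear";
  }
  // It's raining: if it's cold enough, turn the rain into snow.
  return temperatureNoise(x) < SNOW_TEMPERATURE ? "snow" : "rain";
}

// Stand-in noise functions so the sketch runs on its own.
const weatherHere = getWeather(10, Math.sin, Math.cos);
```

Nudging the two thresholds up or down is exactly the kind of tuning mentioned earlier: it decides how often the player gets wet and how much of that turns into a snowball fight.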

Get it? We can keep using different noise functions here and there throughout our game to add loot, determine how much foliage should appear on our hillsides, calculate risk of enemy encounters, ... the list goes on.


True randomness

Let's talk about the "regenerate" buttons in those examples above. Every time you click them, you get a different set of Perlin or Simplex noise-powered hills. This works because noise functions such as these rely heavily on generating lots of random numbers, which it turns out are slightly tricky beasts.

To start with, truly random numbers sometimes don't look very random. It's stupid but it's true. A true random number generator (they do exist) can happily spit out a sequence like:

0, 0, 0, 0, 0, 0, 0.2, 0, 0, 0.1, 0, 0, ...

Which is pretty much a recipe for boring hills. Theoretically the above sequence is no less likely to be produced by a true random number generator than any other more interesting sequence.

Secondly, truly random numbers are too random. By which I mean that they're entirely unpredictable, and with unpredictability comes unreproducibility. It's great for your game to be able to spit out any one of millions of different possible worlds, but for many reasons it's very important to games developers (and software developers generally) to be able to have the game do the same thing over and over too. Automatic testing relies heavily on this quality since it's hard to define what should be produced by your code if by its very nature that changes every time it runs.

The players care, too: lots of games give players the ability to share unique IDs, usually referred to as seeds (we're getting to that), which will guaranteeably always generate the exact same world. There are communities dedicated to listing seeds that generate especially good worlds. How is this possible if the algorithms rely on random numbers?

The answer is... they don't. They rely on pseudorandom numbers, which are a different kettle of fish entirely.

Pseudorandom numbers are generated (shock) by pseudorandom number generators, or PRNGs. Avoiding a very formal technical definition, PRNGs are procedures for generating long lists of random-seeming numbers which avoid the two problems with truly random numbers I've spoken about above. Two consecutive numbers generated by a PRNG are vanishingly unlikely to be the same, and I can force a PRNG to generate the exact same sequence of numbers as I once saw it do in the past.

This is achieved by using something called a seed (sometimes called a random seed).

PRNGs are always created with an initial input. This input can take many forms - a number between 0 and 1, or an integer, or a string, whatever. The important thing is that once the PRNG has been seeded with this value, the sequence of pseudorandom numbers it produces after that point will be exactly the same every time.
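JavaScript's built-in Math.random() doesn't let you supply a seed, so to see this behaviour you need a PRNG of your own. The sketch below uses mulberry32, a small and widely shared generator, as one example; I'm not suggesting it's what any particular game uses, and the variable names are just for illustration.

```javascript
// mulberry32: a small seedable PRNG. Takes a 32-bit integer seed and
// returns a function that yields numbers in [0, 1), like Math.random().
function mulberry32(seed) {
  let state = seed | 0;
  return function next() {
    state = (state + 0x6D2B79F5) | 0;
    let t = Math.imul(state ^ (state >>> 15), 1 | state);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators created with the same seed walk through the same sequence.
const worldA = mulberry32(42);
const worldB = mulberry32(42);
const firstFiveA = [worldA(), worldA(), worldA(), worldA(), worldA()];
const firstFiveB = [worldB(), worldB(), worldB(), worldB(), worldB()];
```

Here firstFiveA and firstFiveB hold identical numbers, while a generator created with a different seed heads off down a different sequence. Feed numbers like these into a noise function in place of Math.random() and the same seed gives you the same hills every time.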

A scene from No Man's Sky featuring a spaceship on a beach

For a cool example, look no further than No Man's Sky, Hello Games's procedural poster child. In No Man's Sky there are no map seeds distributed to the players, but a seed is still integral to the entire experience. Every copy of No Man's Sky is distributed with the same base seed in the code. This means that the infinite universe of all possibility that is the game will be the same infinite universe of all possibility for every single player. And not a byte of it needs to be stored on some server, or on your hard disk, because every bit of it is procedurally generated.

The same way, every time.

Great, show me more

Gladly! All of the above is just the tip of a huge, infinite iceberg. Let's wrap up with some nice examples and pictures - procedural generation is a rich field and we've just talked about some of the basics.

A lot of this post focussed on what we can do with a one-dimensional noise function such as Perlin or Simplex Noise. In fact, we can generate two-, three-, and any general n-dimensional versions of these noises. If you imagine one-dimensional noise as a single line on a graph that rises and falls just like the hills in our simple games above, two-dimensional noise looks something like a sheet that's been laid over some hills in the real world.

A 2D heightmap drawn as a black and white mesh

The top half of that image is as I was describing, and the bottom half is another way to visualise two-dimensional noise. Usually called a heatmap, in this case low noise values have been coloured in black, high in white and everything in between in various grays. I've set up a live version of this for two-dimensional Simplex noise below:
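Turning two-dimensional noise into that kind of grayscale picture is mostly a matter of rescaling. The sketch below is illustrative rather than the live demo's code: noiseToGray and buildGrayscaleMap are invented names, and a simple sine and cosine product stands in for real 2D Simplex noise so that the snippet runs by itself.

```javascript
// Map a noise value in [-1, 1] to a gray level: 0 is black, 255 is white.
function noiseToGray(value) {
  const clamped = Math.max(-1, Math.min(1, value));
  return Math.round(((clamped + 1) / 2) * 255);
}

// Hypothetical helper: sample a 2D noise function on a width x height grid.
// `scale` controls how zoomed-in the picture is.
function buildGrayscaleMap(noise2D, width, height, scale) {
  const rows = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      row.push(noiseToGray(noise2D(x * scale, y * scale)));
    }
    rows.push(row);
  }
  return rows;
}

// Stand-in for real 2D Simplex noise so the sketch runs on its own.
const standInNoise2D = (x, y) => Math.sin(x) * Math.cos(y);
const grayMap = buildGrayscaleMap(standInNoise2D, 8, 4, 0.5);
```

Each entry of grayMap is one pixel's brightness. Read the same numbers as heights instead and you get the mesh from the top half of the image.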

There's loads of stuff we can do with this kind of noise! We could use it to generate height, temperature and precipitation maps for terrain over a 3-dimensional map's surface as we did in two dimensions above. Visual effects also become available to us - if we continuously generate similar two-dimensional slices like I showed above, we get something that looks pleasantly like the surface of water (press "Play" and excuse the resolution):
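One common way to get a run of similar slices is to treat time as an extra dimension: take three-dimensional noise and step slowly along the third axis, one frame at a time. I'm assuming that approach here for the sake of a sketch; it isn't necessarily how the demo is built, and buildFrame and the stand-in noise3D function are hypothetical.

```javascript
// Hypothetical helper: one animation frame is a 2D slice through 3D noise,
// with time as the third coordinate.
function buildFrame(noise3D, width, height, time) {
  const frame = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      row.push(noise3D(x * 0.1, y * 0.1, time));
    }
    frame.push(row);
  }
  return frame;
}

// Stand-in for real 3D noise so the sketch runs on its own.
const standInNoise3D = (x, y, t) => Math.sin(x + t) * Math.cos(y + t * 0.5);

// Small time steps give frames that differ only slightly from each other.
const frames = [];
for (let step = 0; step < 3; step++) {
  frames.push(buildFrame(standInNoise3D, 16, 16, step * 0.05));
}
```

Because the noise is continuous in every direction, including time, each frame is only a small change from the last, and that's what makes the result read as gentle motion rather than static.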

If you make it red, it serves as a pretty passable flame. These techniques allow for good graphical textures to be generated on the fly by the game rather than shipping with them in storage! In fact, the ability of Perlin noise to emulate various natural processes and textures is specifically what Ken Perlin won that Academy Award for.

I'll finish up there - I think this is a super interesting topic, and there's so much more to it than I've covered here. I've been deliberately light on procedural generation's application to graphics and sound since I wanted to focus on the example of world generation, but rest assured it's used for those and much more. For example, this little app lets you enter a seed from which to generate a piece of music. A quick search for "procedurally generated graphics" or "procedurally generated music" will throw up all kinds of rabbit holes for you to follow!

Anyway, ta-ra for now! See you next time. Don't forget to follow on Twitter or via RSS for updates in future.
