Computers generate random number for everything from cryptography to video games and gambling. There are two categories of random numbers — “true” random numbers and pseudorandom numbers — and the difference is important for the security of encryption systems.
Computers can generate truly random numbers by observing some outside data, like mouse movements or fan noise, which is not predictable, and creating data from it. This is known as entropy. Other times, they generate “pseudorandom” numbers by using an algorithm so the results appear random, even though they aren’t.
This topic has become more controversial recently, with many people questioning whether Intel’s built-in hardware random number generator chip is trustworthy. To understand why it might not be trustworthy, you’ll have to understand how random numbers are genreated in the first place, and what they’re used for.
What Random Numbers Are Used For
Random numbers have been used for many thousands of years. Whether it’s flipping a coin or rolling a dice, the goal is to leave the end result up to random chance. Random number generators in a computer are similar — they’re an attempt to achieve an unpredictable, random result.
Random number generators are useful for many different purposes. Aside from obvious applications like generating random numbers for the purposes of gambling or creating unpredictable results in a computer game, randomness is important for cryptography.
Cryptography requires numbers that attackers can’t guess. We can’t just use the same numbers over and over. We want to generate these numbers in a very unpredictable way so attackers can’t guess them. These random numbers are essential for secure encryption, whether you’re encrypting your own files or just using an HTTPS website on the Internet.
True Random Numbers
You may be wondering how a computer can actually generate a random number. Where does this “randomness” come from. If it’s just a piece of computer code, isn’t it possible the numbers the computer generates could be predictable?
We generally group the random numbers computers generate into two types, depending on how they’re generated: “True” random numbers and pseudo-random numbers.
To generate a “true” random number, the computer measures some type of physical phenomenon that takes place outside of the computer. For example, the computer could measure the radioactive decay of an atom. According to quantum theory, there’s no way to know for sure when radioactive decay will occur, so this is essentially “pure randomness” from the universe. An attacker wouldn’t be able to predict when radioactive decay would occur, so they wouldn’t know the random value.
For a more day-to-day example, the computer could rely on atmospheric noise or simply use the exact time you press keys on your keyboard as a source of unpredictable data, or entropy. For example, your computer might notice that you pressed a key at exactly 0.23423523 seconds after 2 p.m.. Grab enough of the specific times associated with these key presses and you’ll have a source of entropy you can use to generate a “true” random number. You’re not a predictable machine, so an attacker can’t guess the precise moment when you press these keys. The /dev/random device on Linux, which generates random numbers, “blocks” and doesn’t return a result until it gathers enough entropy to return a truly random number.
Pseudorandom numbers are an alternative to “true” random numbers. A computer could use a seed value and an algorithm to generate numbers that appear to be random, but that are in fact predictable. The computer doesn’t gather any random data from the environment.
This isn’t necessarily a bad thing in every situation. For example, if you’re playing a video game, it doesn’t really matter whether the events that occur in that game are cased by “true” random numbers or pseudorandom numbers. On the other hand, if you’re using encryption, you don’t want to use pseudorandom numbers that an attacker could guess.
For example, let’s say an attacker knows the algorithm and seed value a pseudorandom number generator uses. And let’s say an encryption algorithm gets a pseudorandom number from this algorithm and uses it to generate an encryption key without adding any additional randomness. If an attacker knows enough, they could work backwards and determine the pseudorandom number the encryption algorithm must have chosen in that case, breaking the encryption.
The NSA and Intel’s Hardware Random Number Generator
To make things easier for developers and help generate secure random numbers, Intel chips include a hardware-based random number generator known as RdRand. This chip uses an entropy source on the processor and provides random numbers to software when the software requests them.
The problem here is that the random number generator is essentially a black box and we don’t know what’s going on inside it. If RdRand contained an NSA backdoor, the government would be able to break encryption keys that were generated with only data supplied by that random number generator.
This is a serious concern. In December 2013, FreeBSD’s developers removed support for using RdRand directly as a source of randomness, saying they couldn’t trust it. [Source] The output of the RdRand device would be fed into another algorithm that adds additional entropy, ensuring that any backdoors in the random number generator wouldn’t matter. Linux already worked in this way, further randomizing the random data coming from RdRand so that it wouldn’t be predictable even if there was a backdoor. [Source] In a recent AMA (“Ask Me Anything”) on Reddit, Intel CEO Brian Krzanich did not answer questions about these concerns. [Source]
Of course, this likely isn’t just a problem with Intel chips. FreeBSD’s developers called out Via’s chips by name, too. This controversy shows why generating random numbers that are truly random and aren’t predictable is so important.
To generate “true” random numbers, random number generators gather “entropy,” or seemingly random data from the physical world around them. For random numbers that don’t really need to be random, they may just use an algorithm and a seed value.