fanf | getentropy() vs RAND

https://dotat.at/@/2024-10-01-getentropy.html

A couple of notable things have happened in recent months:

There is a new edition of POSIX for 2024. There's lots of good stuff in it, but today I am writing about getentropy() which is the first officially standardized POSIX API for getting cryptographically secure random numbers.
On Linux the getentropy(3) function is based on the getrandom(2) system call. In Linux 6.11 there is a new vDSO call, vgetrandom(), that makes it possible to implement getentropy() entirely in userland, which should make it significantly faster.

UUID v4 and v7 are great examples of the need for high performance secure random numbers: you don't want the performance of your database inserts to be limited by your random number generator! Another example is DNS source port and query ID randomization which help protect DNS resolvers against forged answers.

I was inspired to play with getentropy() by a blog post about getting a few secure random bytes in PostgreSQL without pgcrypto: it struck me that PostgreSQL doesn't use getentropy(), and I thought it might be fun (and possibly even useful!) to add support for it.

I learned a few things along the way!

what is getentropy()?

A cryptographically secure pseudorandom number generator basically generates bulk random bytes using a stream cipher that is keyed and periodically re-keyed using some source of high-quality randomness. In NIST standards a CSPRNG is often referred to as a DRBG, deterministic random bit generator.

In the kernel, the high-quality randomness comes from things like the unpredictable timing of interrupts, hardware random number generators, or maybe an underlying hypervisor. The word "entropy" has often been used to refer to this distilled essence of randomness. Random bytes are made available to userland via interfaces such as /dev/urandom or getentropy().

A userland CSPRNG such as OpenSSL RAND(7) gets its high-quality randomness from these kernel interfaces. A notable feature of getentropy() is that it will not produce an arbitrarily large number of bytes: it can provide just enough to securely key a userland CSPRNG.
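As a minimal sketch of that keying step, here is how a program might ask the kernel for a seed. The name fetch_seed() and the 32-byte SEED_LEN are made up for illustration; they are not taken from OpenSSL or any other library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/random.h>  /* older home of getentropy() */
#include <unistd.h>      /* where POSIX 2024 declares getentropy() */

#define SEED_LEN 32  /* hypothetical seed size for a userland CSPRNG */

/* Fill `seed` with SEED_LEN bytes from the kernel.
 * Returns 0 on success, -1 on failure with errno set. */
int fetch_seed(unsigned char seed[SEED_LEN])
{
    assert(seed != NULL);
    return getentropy(seed, SEED_LEN);
}
```

There is no file descriptor to manage and no partial result to handle: the call either fills the whole buffer or fails.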

portability of getentropy()

OpenBSD introduced getentropy() in 2014; it was added to Mac OS X in 2016, glibc in 2017, musl and FreeBSD in 2018, NetBSD and POSIX in 2024.

It's ubiquitous enough now that my code assumes that getentropy() exists without worrying.

There are a couple of issues that you are likely to encounter:

Originally getentropy() was declared in <sys/random.h> but POSIX declares it in <unistd.h>. You need to include both headers to be sure.
POSIX specifies a GETENTROPY_MAX macro in <limits.h> for the largest buffer getentropy() will fill. Most systems don't yet have this macro; if it isn't defined the limit is 256 bytes.
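Both issues can be handled with a small preamble, sketched below. The wrapper checked_getentropy() is a hypothetical helper for illustration, and its choice of EINVAL for oversized requests is an assumption of this sketch, not a statement about what any particular libc returns.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>      /* GETENTROPY_MAX, on systems that have it */
#include <stddef.h>
#include <sys/random.h>  /* pre-POSIX declaration of getentropy() */
#include <unistd.h>      /* POSIX 2024 declaration of getentropy() */

/* Fall back to the traditional limit where the macro is missing. */
#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256
#endif

/* Hypothetical wrapper: reject oversized requests before calling the libc.
 * Returns 0 on success, -1 on failure with errno set. */
int checked_getentropy(void *buf, size_t len)
{
    assert(buf != NULL);
    if (len > (size_t)GETENTROPY_MAX) {
        errno = EINVAL;  /* assumption: this sketch's own error choice */
        return -1;
    }
    return getentropy(buf, len);
}
```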

advantages of getentropy()

There are some annoying issues with /dev/random:

You have to ensure the special file is present in containers or chroot() jails
It requires multiple system calls and a retry loop to get a few bytes of randomness
It can fail if a process hits its file descriptor limit
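To make those annoyances concrete, here is a rough sketch of the device-file approach, using /dev/urandom (the device mentioned earlier). The function name urandom_bytes() is hypothetical, and real code would probably want more careful error reporting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read `len` random bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure. */
int urandom_bytes(void *buf, size_t len)
{
    assert(buf != NULL);
    unsigned char *p = buf;

    /* First system call: fails if the device node is missing (e.g. in a
     * chroot) or if the process is out of file descriptors. */
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1;

    /* One or more read() calls: retry on EINTR and on short reads. */
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Last system call: release the descriptor. */
    close(fd);
    return 0;
}
```

That is at least three system calls for a handful of bytes, compared with a single getentropy() call.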

Cryptographic algorithms often need nonces that absolutely must never be repeated, otherwise the private key is leaked. So a CSPRNG must also avoid repeating output, which can be difficult when a process fork()s.

While writing this blog post, I discussed this fork() issue with Rich Salz (who re-wrote OpenSSL's RAND to use a NIST FIPS DRBG algorithm). He said RAND_bytes() uses getpid() to detect when the CSPRNG must be re-keyed because the process fork()ed. (Another way not used by OpenSSL is pthread_atfork().)
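The getpid() idea can be sketched roughly as follows. This is not OpenSSL's code: struct rng_state and rng_check_fork() are hypothetical names, and the cipher that would actually generate output from the key is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical state for a userland CSPRNG; the cipher is omitted. */
struct rng_state {
    unsigned char key[32];
    pid_t owner;       /* pid that last keyed the generator; 0 = never */
    unsigned reseeds;  /* number of re-keys, kept only for illustration */
};

/* Call before producing output. Re-keys if the generator has never been
 * keyed, or if the pid differs from the one that keyed it, which means we
 * are in a forked child holding a copy of the parent's key.
 * Returns 0 on success, -1 if the kernel could not supply a key. */
int rng_check_fork(struct rng_state *st)
{
    assert(st != NULL);
    pid_t me = getpid();
    if (st->owner == me)
        return 0;  /* same process: the key is still ours */
    if (getentropy(st->key, sizeof st->key) != 0)
        return -1;
    st->owner = me;
    st->reseeds++;
    return 0;
}
```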

The kernel has to deal with the similar issue of ensuring its CSPRNG is re-keyed when a VM is cloned. However there isn't a way for a userland process to find out its VM has been cloned.

Unlike a stateful userland CSPRNG, if you call getentropy() directly, you don't have to worry about repeated output due to fork() or VM clones. You also don't have to worry about linking with a cryptographic library.

disadvantages of getentropy()

In principle, once a userland CSPRNG has been keyed it should be able to produce random bytes very fast. getentropy() is designed to do the keying; it is annoying to use for bulk random bytes because it needs a loop to get 256 bytes at a time.
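The loop in question looks something like this sketch. getentropy_bulk() is a hypothetical name, and the 256-byte fallback is the limit described above for systems that lack GETENTROPY_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/random.h>
#include <unistd.h>

#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256
#endif

/* Fill a buffer of any size by calling getentropy() in chunks.
 * Returns 0 on success, -1 on failure with errno set by getentropy(). */
int getentropy_bulk(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    unsigned char *p = buf;
    const size_t max = (size_t)GETENTROPY_MAX;

    while (len > 0) {
        size_t chunk = len < max ? len : max;
        if (getentropy(p, chunk) != 0)
            return -1;
        p += chunk;
        len -= chunk;
    }
    return 0;
}
```

With the 256-byte limit, a 1024-byte buffer costs four calls.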

So one might expect getentropy() to be slower than RAND_bytes().

Is that actually the case? Let's measure it!

performance of getentropy() vs RAND_bytes()

I wrote a simple benchmark to measure the performance of getentropy() and RAND_bytes() at a few different buffer sizes. It prints a table of nanoseconds per function call. It can be built with different versions of OpenSSL on a few different systems.

The results are more complicated than I expected!

Apple M1 pro / Mac OS 14 Sonoma

OpenSSL 1.1.1w / 3.3.1

 len entropy   3.3.1  1.1.1w
  16     495     244     626
  64     543     249     591
 256     550     277     613
1024    2183     392     730

The behaviour of getentropy() on Mac OS is curious. The first time I run bentropy after a pause, getentropy() takes about 1µs. If I run it in quick succession, like bentropy && bentropy && bentropy, then getentropy() speeds up to about 0.5µs, which is what you can see in the table above. This speed-up also affects the OpenSSL timings.

OpenSSL 3.3 is substantially faster than 1.1. The OpenSSL timings are dominated by their startup latency, and not much affected by the buffer size for these relatively small lengths.

The time of one call to getentropy() is not much affected by the buffer size, but large buffers require multiple calls, so the time for 1024 bytes is about 4x the time for 256 bytes.

AMD Ryzen 7950 / Debian 11 bullseye

Linux 5.10 / OpenSSL 1.1.1w / BoringSSL

 len entropy openssl  boring
  16     476    1081     393
  64     659    1118     417
 256    1411    1218     548
1024    5466    1593     916

BoringSSL is from Debian's android-libboringssl-dev package. I am mainly using it as a representative of more recent versions of OpenSSL to show that RAND_bytes() is a lot faster than it used to be.

It's weird that getentropy()'s time varies with the buffer size so much. Dunno what's up there!

Intel Xeon E3-1230 / FreeBSD 15 current

OpenSSL 3.0.14

 len entropy openssl
  16     707    1033
  64     697    1100
 256    1131    1080
1024    4416    1289

This mainly shows the performance numbers for OpenSSL 3.0 on FreeBSD are similar-ish to OpenSSL 1.1.1w on Linux.

conclusions

getentropy() and RAND_bytes() are pretty close in performance!
OpenSSL RAND_bytes() generally beats getentropy(), which is what I expected based on the general principles of how the two functions work.
The exception is that older versions of OpenSSL are slower than getentropy() for small buffers.
It will be interesting to see how vDSO-based getentropy() compares. I would expect its per-call overhead to be much lower, such that it might beat OpenSSL in more cases. Will it win for larger buffers, I wonder?