[personal profile] fanf

https://dotat.at/@/2024-10-01-getentropy.html

A couple of notable things have happened in recent months:

UUID v4 and v7 are great examples of the need for high performance secure random numbers: you don't want the performance of your database inserts to be limited by your random number generator! Another example is DNS source port and query ID randomization which help protect DNS resolvers against forged answers.
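
To make the UUID example concrete, here is a minimal sketch of building a version 4 UUID straight from getentropy(). The helper name uuid4_string() is made up for illustration, and it assumes a system where getentropy() is declared in <sys/random.h> or <unistd.h>.

```c
#include <stdio.h>
#include <sys/random.h>
#include <unistd.h>

/* Hypothetical helper: write a random (version 4) UUID into out,
 * which must have room for 37 bytes including the terminating NUL.
 * Returns 0 on success, -1 if getentropy() fails. */
int uuid4_string(char out[37])
{
    unsigned char b[16];

    if (getentropy(b, sizeof(b)) != 0)
        return -1;

    b[6] = (b[6] & 0x0f) | 0x40;  /* version field: 4 */
    b[8] = (b[8] & 0x3f) | 0x80;  /* variant field: binary 10xx */

    snprintf(out, 37,
        "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
        "%02x%02x%02x%02x%02x%02x",
        b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
        b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
    return 0;
}
```

Each call costs one trip into the kernel, which is exactly the per-UUID overhead that matters for database inserts.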

I was inspired to play with getentropy() by a blog post about getting a few secure random bytes in PostgreSQL without pgcrypto: it struck me that PostgreSQL doesn't use getentropy(), and I thought it might be fun (and possibly even useful!) to add support for it.

I learned a few things along the way!

what is getentropy()?

A cryptographically secure pseudorandom number generator basically generates bulk random bytes using a stream cipher that is keyed and periodically re-keyed using some source of high-quality randomness. In NIST standards a CSPRNG is often referred to as a DRBG, deterministic random bit generator.

In the kernel, the high-quality randomness comes from things like the unpredictable timing of interrupts, hardware random number generators, or maybe an underlying hypervisor. The word "entropy" has often been used to refer to this distilled essence of randomness. Random bytes are made available to userland via interfaces such as /dev/urandom or getentropy().

A userland CSPRNG such as OpenSSL RAND(7) gets its high-quality randomness from these kernel interfaces. A notable feature of getentropy() is that it will not produce an arbitrarily large number of bytes: it can provide just enough to securely key a userland CSPRNG.

portability of getentropy()

OpenBSD introduced getentropy() in 2014; it was added to Mac OS X in 2016, glibc in 2017, musl and FreeBSD in 2018, NetBSD and POSIX in 2024.

It's ubiquitous enough now that my code assumes that getentropy() exists without worrying.

There are a couple of issues that you are likely to encounter:

  • Originally getentropy() was declared in <sys/random.h> but POSIX declares it in <unistd.h>. You need to include both headers to be sure.

  • POSIX specifies a GETENTROPY_MAX macro in <limits.h> for the largest buffer getentropy() will fill. Most systems don't yet have this macro; if it isn't defined the limit is 256 bytes.
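
Both workarounds fit in a few lines of preamble. This is a sketch rather than a portability guarantee: it assumes <sys/random.h> exists on the target system, and get_seed_key() is a hypothetical helper, not part of any library.

```c
#include <limits.h>      /* GETENTROPY_MAX, on systems that define it */
#include <sys/random.h>  /* the original home of getentropy() */
#include <unistd.h>      /* where POSIX declares getentropy() */

/* Fall back to the traditional limit when the macro is missing. */
#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256
#endif

/* Hypothetical helper: fetch a 32-byte key, for example to seed a
 * userland CSPRNG. Returns 0 on success, -1 on failure. */
int get_seed_key(unsigned char key[32])
{
    return getentropy(key, 32);
}
```

A 32-byte request is well under the limit, so it needs only one call.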

advantages of getentropy()

There are some annoying issues with /dev/random:

  • You have to ensure the special file is present in containers or chroot() jails

  • It requires multiple system calls and a retry loop to get a few bytes of randomness

  • It can fail if a process hits its file descriptor limit
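
For comparison, here is roughly what reading a few bytes from the device file involves. This is a sketch that uses /dev/urandom (the same pattern applies to /dev/random); urandom_bytes() is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with len bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure with errno set. */
int urandom_bytes(void *buf, size_t len)
{
    unsigned char *p = buf;
    int fd;

    /* System call 1: fails if the device node is missing (chroot,
     * container) or the process has run out of file descriptors. */
    do {
        fd = open("/dev/urandom", O_RDONLY);
    } while (fd < 0 && errno == EINTR);
    if (fd < 0)
        return -1;

    /* System call 2, or more: read() can be interrupted or short. */
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            int saved = (n == 0) ? EIO : errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* System call 3. */
    close(fd);
    return 0;
}
```

With getentropy() the whole function collapses to a single call with a single failure mode.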

Cryptographic algorithms often need nonces that absolutely must never be repeated, otherwise the private key is leaked. So a CSPRNG must also avoid repeating output, which can be difficult when a process fork()s.

While writing this blog post, I discussed this fork() issue with Rich Salz (who re-wrote OpenSSL's RAND to use a NIST FIPS DRBG algorithm). He said RAND_bytes() uses getpid() to detect when the CSPRNG must be re-keyed because the process fork()ed. (Another way not used by OpenSSL is pthread_atfork().)
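
The getpid() trick can be sketched like this. It is an illustration of the general idea only, not OpenSSL's actual code: struct rng_state and rng_check_fork() are hypothetical names, and the stream cipher that would consume the key is left out.

```c
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical state for a userland CSPRNG: only the key and the
 * PID that owns it are shown; the generator itself is omitted. */
struct rng_state {
    pid_t owner;            /* 0 means "never keyed" */
    unsigned char key[32];
    unsigned long rekeys;   /* number of times the key was replaced */
};

/* Re-key from the kernel if this process is not the one that last
 * keyed the state, which is what a child sees after fork().
 * Returns 0 on success, -1 if getentropy() fails. */
int rng_check_fork(struct rng_state *st)
{
    pid_t me = getpid();

    if (st->owner == me)
        return 0;
    if (getentropy(st->key, sizeof(st->key)) != 0)
        return -1;
    st->owner = me;
    st->rekeys++;
    return 0;
}
```

A generator built this way would call rng_check_fork() at the start of every request for random bytes, so parent and child never continue from the same key.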

The kernel has to deal with the similar issue of ensuring its CSPRNG is re-keyed when a VM is cloned. However there isn't a way for a userland process to find out its VM has been cloned.

Unlike a stateful userland CSPRNG, if you call getentropy() directly, you don't have to worry about repeated output due to fork() or VM clones. You also don't have to worry about linking with a cryptographic library.

disadvantages of getentropy()

In principle, once a userland CSPRNG has been keyed it should be able to produce random bytes very fast. getentropy() is designed to do the keying; it is annoying to use for bulk random bytes because it needs a loop to get 256 bytes at a time.
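
The loop in question looks something like the following sketch. getentropy_bulk() is a hypothetical name, and the 256-byte fallback is only used when <limits.h> does not define GETENTROPY_MAX.

```c
#include <limits.h>
#include <stddef.h>
#include <sys/random.h>
#include <unistd.h>

#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256
#endif

/* Hypothetical helper: fill a buffer of any size, one chunk of at
 * most GETENTROPY_MAX bytes per getentropy() call.
 * Returns 0 on success, -1 if any call fails. */
int getentropy_bulk(void *buf, size_t len)
{
    unsigned char *p = buf;
    const size_t max = (size_t)GETENTROPY_MAX;

    while (len > 0) {
        size_t chunk = len < max ? len : max;
        if (getentropy(p, chunk) != 0)
            return -1;
        p += chunk;
        len -= chunk;
    }
    return 0;
}
```

With a 256-byte limit, a 1024-byte buffer costs four calls.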

So one might expect getentropy() to be slower than RAND_bytes().

Is that actually the case? Let's measure it!

performance of getentropy() vs RAND_bytes()

I wrote a simple benchmark to measure the performance of getentropy() and RAND_bytes() at a few different buffer sizes. It prints a table of nanoseconds per function call. It can be built with different versions of OpenSSL on a few different systems.

The results are more complicated than I expected!

  • Apple M1 Pro / Mac OS 14 Sonoma
  • OpenSSL 1.1.1w / 3.3.1

     len entropy   3.3.1  1.1.1w
      16     495     244     626
      64     543     249     591
     256     550     277     613
    1024    2183     392     730
    

The behaviour of getentropy() on Mac OS is curious. The first time I run bentropy after a pause, getentropy() takes about 1µs. If I run it in quick succession, like bentropy && bentropy && bentropy, then getentropy() speeds up to about 0.5µs, which is what you can see in the table above. This speed-up also affects the OpenSSL timings.

OpenSSL 3.3 is substantially faster than 1.1. The OpenSSL timings are dominated by their startup latency, and not much affected by the buffer size for these relatively small lengths.

The time of one call to getentropy() is not much affected by the buffer size, but large buffers require multiple calls, so the time for 1024 bytes is about 4x the time for 256 bytes: four calls at roughly 550ns each come to about 2200ns, close to the measured 2183ns.

  • AMD Ryzen 7950 / Debian 11 bullseye
  • Linux 5.10 / OpenSSL 1.1.1w / BoringSSL

     len entropy openssl  boring
      16     476    1081     393
      64     659    1118     417
     256    1411    1218     548
    1024    5466    1593     916
    

BoringSSL is from Debian's android-libboringssl-dev package. I am mainly using it as a representative of more recent versions of OpenSSL to show that RAND_bytes() is a lot faster than it used to be.

It's weird that getentropy()'s time varies with the buffer size so much. Dunno what's up there!

  • Intel Xeon E3-1230 / FreeBSD 15 current
  • OpenSSL 3.0.14

     len entropy openssl
      16     707    1033
      64     697    1100
     256    1131    1080
    1024    4416    1289
    

This mainly shows the performance numbers for OpenSSL 3.0 on FreeBSD are similar-ish to OpenSSL 1.1.1w on Linux.

conclusions

  • getentropy() and RAND_bytes() are pretty close in performance!

  • OpenSSL RAND_bytes() generally beats getentropy(), which is what I expected based on the general principles of how the two functions work.

  • The exception is that older versions of OpenSSL are slower than getentropy() for small buffers.

  • It will be interesting to see how vDSO-based getentropy() compares. I would expect its per-call overhead to be much lower, such that it might beat OpenSSL in more cases. Will it win for larger buffers, I wonder?

Maybe I should upgrade my Debian box to a newer kernel so I can try it out!
