![[personal profile]](https://www.dreamwidth.org/img/silk/identity/user.png)
https://dotat.at/@/2023-06-23-random-double.html
Here are a couple of algorithms for generating uniformly distributed
floating point numbers 0.0
<=
n <
1.0
using an unbiased
random bit generator and IEEE 754 double precision arithmetic. Both of
them depend on details of how floating point numbers work, so before
getting into the algorithms I'll review IEEE 754.
The first algorithm uses bit hacking and type punning. The second uses a hexadecimal floating point literal. They are both fun and elegant in their own ways, but these days the second one is better.
ieee 754
A floating point number has three parts:
- a sign bit
- an exponent
- a significand aka mantissa aka fraction
It works like a binary version of decimal scientific notation, so its value is
+/- 1.fff...fff * 2ᴱ
exponent
Instead of using two's complement, the exponent is stored as a biased integer,
E = eeeeeeeeeee - 1023
The maximum value of the exponent bits is 2047, and the bias is half
that, so 2⁰
is represented as 1023 in the exponent bits.
(A consequence of using biased integers instead of two's complement is that IEEE 754 floating point numbers sort correctly if you treat them as sign-magnitude integers.)
fraction
In decimal scientific notation, a number is normalized by adjusting its exponent so its leftmost non-zero digit (most significant digit) is immediately to the left of the decimal point.
Similarly, a floating point number has its leftmost non-zero bit immediately to the left of its binary point. Because a non-zero bit is always 1, there is no need to store it explicitly. So double-precision numbers actually have 53 significant bits, of which 52 appear in their representation.
the problem
To get a uniform distribution of random
floating point numbers 0.0
<=
n <
1.0
:
- half of the random results must be
0.5
<=
n<
1.0
, withE = -1
; - a quarter of the results must be
0.25
<=
n<
0.5
, withE = -2
; - an eighth must be
0.125
<=
n<
0.25
, withE = -3
; - etc. usw.
bithacking solution
Observe that floating point numbers 1.0
<=
n <
2.0
span the same range of values as we want,
but all the numbers have the same exponent E = 0
.
So the idea is to use bitwise integer operations to put together a number with,
- sign bit = 0
eeeeeeeeeee = 1023 = 0x3FF
so thatE = 0
fff...fff
filled with unbiased random bits
Then the bits of the integer are reinterpreted as the bits of a floating point number, subtract 1.0, and that's the result.
However, C's strict aliasing rules make it difficult to do this kind
of reinterpretation. For example, it is often done using a union
,
but the C standard says that is undefined behaviour. But there is a
loophole in the rules.
double pcg64_bithack_double(pcg64_t *rng) {
double d;
uint64_t u = pcg64_random(rng);
/* 52 bits of randomness in the little end */
u = u >> 12;
/* set exponent to 0 + bias = 1023 */
u = u | (1023 << 52);
/* reinterpret as floating point */
memmove(&d, &u, sizeof(d));
return(d - 1.0);
}
Modern compilers will inline the memmove()
so the function ends up
being compiled correctly and efficiently.
In practice, using a union
is OK because too much code would break
if compilers were as strict as they could be, and a union
will be
more efficient with older compilers.
hexadecimal solution
C99 added support for hexadecimal floating point literals, which look like
0xHHHH.HHHHp+dddd
Slightly weirdly, while the fraction is written in hex, the exponent
(p
for power of 2) is written in decimal.
Hex floating literals are useful when it's important to get exact control over the bit pattern in the number, which is much harder if you have to go via decimal.
The idea is to convert a random N-bit integer to floating point,
/* 53 bits of randomness in the little end */
uint64_t u = pcg64_random(rng) >> 11;
then divide it by the maximum range of the integer, 2N, so the result is between 0.0 and 1.0. Or get the same effect by multiplying by 2-N.
/* convert and scale to correct range */
double d = (double)u * 0x1.0p-53;
Note that this solution returns 53 bits of randomness because it is able to make use of the implicit most-significant bit, whereas the bithack solution only has 52 bits of randomness.
As well as using a hex floating literal, this solution assumes that double precision multiplication is fast. Which is true, because it has been the subject of important benchmarks for decades.
Putting it together gives us this beautifully lucid one-liner.
double pcg64_random_double(pcg64_t *rng) {
return((double)(pcg64_random(rng) >> 11) * 0x1.0p-53);
}
no subject
Date: 2023-06-24 12:26 (UTC)Great observations, thanks :-) In fact a few others made similar comments so I learned some more about random floating point numbers and posted a note https://dotat.at/@/2023-06-23-random-more.html which I have not yet cross-posted here.