Re: rand48 replacement - Mailing list pgsql-hackers
| From | Yura Sokolov |
|---|---|
| Subject | Re: rand48 replacement |
| Date | |
| Msg-id | 3136e2672d5ab395521bae05a80df469@postgrespro.ru Whole thread Raw |
| In response to | Re: rand48 replacement (Fabien COELHO <coelho@cri.ensmp.fr>) |
| Responses |
Re: rand48 replacement
|
| List | pgsql-hackers |
Fabien COELHO писал 2021-07-04 23:29:
>> The important property of determinism is that if I set a seed, and
>> then make an identical set of calls to the random API, the results
>> will be identical every time, so that it's possible to write tests
>> with predictable/repeatable results.
>
> Hmmm… I like my stronger determinism definition more than this one:-)
>
>>> I would work around that by deriving another 128 bit generator
>>> instead of splitmix 64 bit, but that is overkill.
>>
>> Not really relevant now, but I'm pretty sure that's impossible to do.
>> You might try it as an interesting academic exercise, but I believe
>> it's a logical impossibility.
>
> Hmmm… I was simply thinking of seeding a new pg_prng_state from the
> main pg_prng_state with some transformation, and then iterate over the
> second PRNG, pretty much like I did with splitmix, but with 128 bits
> so that your #states argument does not apply, i.e. something like:
>
> /* select in a range with bitmask rejection */
> uint64 pg_prng_u64_range(pg_prng_state *state, uint64 range)
> {
> /* always advance state once */
> uint64 next = xoroshiro128ss(state);
> uint64 val;
>
> if (range >= 2)
> {
> uint64 mask = mask_u64(range-1);
>
> val = next & mask;
>
> if (val >= range)
> {
> /* copy and update current prng state */
> pg_prng_state iterator = *state;
>
> iterator.s0 ^= next;
> iterator.s1 += UINT64CONST(0x9E3779B97f4A7C15);
>
> /* iterate till val in [0, range) */
> while ((val = xoroshiro128ss(&iterator) & mask) >= range)
> ;
> }
> }
> else
> val = 0;
>
> return val;
> }
>
> The initial pseudo-random sequence is left to proceed, and a new PRNG
> is basically forked for iterating on the mask, if needed.
I believe most "range" values are small, much smaller than UINT32_MAX.
In this case, according to [1] fastest method is Lemire's one (I'd take
original version from [2])
Therefore combined method pg_prng_u64_range could branch on range value
uint64 pg_prng_u64_range(pg_prng_state *state, uint64 range)
{
uint64 val = xoroshiro128ss(state);
uint64 m;
if ((range & (range-1) == 0) /* handle all power 2 cases */
return range != 0 ? val & (range-1) : 0;
if (likely(range < PG_UINT32_MAX/32)
{
/*
* Daniel Lamire's unbiased range random algorithm based on
rejection sampling
* https://lemire.me/blog/2016/06/30/fast-random-shuffling/
*/
m = (uint32)val * range;
if ((uint32)m < range)
{
uint32 t = -range % range;
while ((uint32)m < t)
m = (uint32)xoroshiro128ss(state) * range;
}
return m >> 32;
}
/* Apple's mask method */
m = mask_u64(range-1);
val &= m;
while (val >= range)
val = xoroshiro128ss(state) & m;
return val;
}
Mask method could also be faster when range is close to mask.
For example, fast check for "range is within 1/1024 to mask" is
range < (range/512 + (range&(range*2)))
And then method choose could like:
if (likely(range < UINT32_MAX/32 && range > (range/512 +
(range&(range*2)))))
But I don't know does additional condition worth difference or not.
[1] https://www.pcg-random.org/posts/bounded-rands.html
[2] https://lemire.me/blog/2016/06/30/fast-random-shuffling/
regards,
Sokolov Yura
pgsql-hackers by date: