Algorithmics / random stuff
Random choice
Random subset
Shuffling
✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.
You'ld think this isn't very hard, but the first thing you'll probably think of, the random-swaps approach, is demonstraby not uniform shuffling. In that all possible outputs do not appear equally. This is okay for some uses, but not for others. Since the better variants is just as fast, at least be aware of it.
Random swaps
for i in 1..length swap two random items
Pro:
- simple code to write
Con:
- not a uniform shuffling - roughly because his demonstrably isn't enough swaps. The number at which it becomes decent depends on the length (apparently n lg n?(verify))
Sort by random
Attach a random number to each entry, then sort by that
Pro:
- uniform shuffling (assuming no duplicate values in what we generate)
Con
- complexity is that of the sort, often O(n lg n)(verify)
Fisher-Yates / Knuth shuffle
There is an original pen-and-paper version which strikes through numbers. The modern computer variant puts them on the end of the list instead - easier bookkeeping.
The modern variant comes down to:
for i in 0..length-1 swap(i, rand_inclusive(0,i))
This can be done from both directions, this is equivalent.
Pro:
- uniform
- linear time
Con:
- Assuming you're using a PRNG, there is a sometimes-viable attack that could predict the sorted array
- related to seeding, and the amount of possible orderings versus the repeat interval.
- this only really matters when security or gambling are involved
Notes:
- common bug: random range needs to be inclusive of the one you're at. This can be thrown off by the implementation of your random function (so check its docs) and even the way you use the value (e.g. consider that int(rand()*length) effectively hides a floor(). And floating point details.)
- common bug: choosing from entire array - is actually the naive random-swaps variant, and works worse (can be explained semi-intuitively)
Notes
Bonus: physical shuffling https://www.youtube.com/watch?v=AxJubaijQbI