Shuffling a list

AMG: I guess RS wanted this page, so here goes... :^)

RS: Oops, sorry, what I meant was Shuffle a list ...

AMG: Ah, well, at least I got to write another un/sorting story page.

This page discusses shuffling algorithms and implementations thereof.

Version 1: sorting random reals edit

Here's a proc, adapted from unsort:

proc shuffle {data} {
    set rand_data [list]
    foreach elem $data {
        lappend rand_data [list [expr {rand()}] $elem]
    }

    set result [list]
    foreach elem [lsort -real -index 0 $rand_data] {
        lappend result [lindex $elem 1]
    }

    return $result
}

It works, but it's seems suboptimal.

Version 2: sorting random integers edit

For starters, the real sort bothers me. I imagine sorting by integer would be more efficient, but then again producing random integers is (currently) slower than producing random reals. Change rand() for int(0x7fffffff * rand()) and -real for -integer and it winds up taking longer. See the benchmark below. (By the way, how am I supposed to portably get the maximum and minimum integer?)

Next, having up to three copies of the list (one of which is augmented) appears to be a waste of RAM.

Version 3: in-place swaps edit

So would it be better to do an in-place shuffle? For each element of $data it could swap with another, randomly-selected element of $data. I'll take a shot at it:

proc shuffle {data} {
    set length [llength $data]
    for {set idx_1 0} {$idx_1 < $length} {incr idx_1} {
        set idx_2 [expr {int($length * rand())}]
        set temp [lindex $data $idx_1]
        lset data $idx_1 [lindex $data $idx_2]
        lset data $idx_2 $temp
    }
    return $data
}

Big improvement.

Benchmarking 1, 2, and 3 edit

Here's my magical test data:

 set data [lrepeat 100000 x y z z y]

If you're trying this at home from your interactive tclsh, you might want to type "; puts -nonewline {}" after the end of the above line or else its printed result will scroll for a very long time.

Now I time the shuffles:

 ..................................................
 :                    :   400 MHz   :  1500 MHz   :
 :....................:.............:.............:
 : 1, random reals    : 21.847024 s :  9.541117 s :
 : 2, random integers : 24.649960 s :  9.857447 s :
 : 3, in-place swaps  :  7.650484 s :  2.328508 s :
 :....................:.............:.............:

Wow, I think I will go with #3! Hold on while I update unsort... :^)

Version 4: in-place swaps, better random behavior edit

AMG: Per Lars H's suggestion, I improved the code again. As Lars explains in unsort, the old code doesn't have a perfectly uniform distribution because, for any given list, the size of the set of permutations from which it chooses is not a multiple of the size of the set of all possible permutations. This is similar to doing "random() % 3" in C and getting 0's more often than 2's.

proc shuffle {data} {
    set length [llength $data]
    for {} {$length > 1} {incr length -1} {
        set idx_1 [expr {$length - 1}]
        set idx_2 [expr {int($length * rand())}]
        set temp [lindex $data $idx_1]
        lset data $idx_1 [lindex $data $idx_2]
        lset data $idx_2 $temp
    }
    return $data
}

That ought to do it! However, it's (theoretically) a little bit slower due to modifying more variables in the inner loop. I also made a less-readable version that incr's idx_1 rather than recalculating it with expr, for comparison's sake. Here are the numbers, using the same benchmark.

 ..................................................
 :                    :   400 MHz   :  1500 MHz   :
 :....................:.............:.............:
 : 4 , more random    :  8.428660 s :  2.286190 s :
 : 4a, less readable  :  7.824165 s :  2.163990 s :
 :....................:.............:.............:

For reasonably-sized lists (under 500,000 elements, I guess), the speed difference between #3, #4, and #4a is negligible. So I choose #4, which has a more uniform distribution and doesn't have grody sources due to silly optimizations.

I wonder why the 400 MHz results for #4 and #4a are slower than for #3 yet the 1500 MHz results are the other way around...? I guess this proc is random in more than one way.

More? edit

Here's another algorithm. Basically you randomly select an element from the input list, remove it from the input list, and place it at the end of the output list. The reason I thought of it is, for unsort it's not necessary to maintain an output list; simply puts the elements as they come.

But this algorithm involves a lot of copying--- lset is nice for in-place modifies, but there's no in-place element removal. So it's liable to be slower than #4.

Category Algorithm

Category Performance