Front page | perl.beginners | Postings from April 2002

## RE: Scripts picks random elements from array , but it repeats sometimes

From:
Bob Showalter
Date:
April 29, 2002 13:03
Subject:
RE: Scripts picks random elements from array , but it repeats sometimes
Message ID:
2E4528861499D41199D200A0C9B15BC031B9D4@taylorwhite.com
```> -----Original Message-----
> From: Wagner-David@vikingfreight.com
> [mailto:Wagner-David@vikingfreight.com]
> Sent: Monday, April 29, 2002 3:18 PM
> To: Jim.Flaherty@cnet.navy.mil; beginners@perl.org
> Subject: RE: Scripts picks random elements from array , but it repeats
> som etimes
>
>
> 	Here is one option:
> #!perl -w
> my @numbers = qw( 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 );
> my \$numofques = scalar(@numbers);
> my @MySeen = ();
>
> while(\$numofques > 0) {
>     \$index   = rand @numbers;
>     next if ( defined\$MySeen[\$index] );
>     \$element = \$numbers[\$index];
>     printf "%-3d ", \$element;
>     \$MySeen[\$index] = 1;
>     \$numofques--;
>  }
> printf "\n";
>
> Output:
> 12  16  5   19  15  14  8   17  7   18  1   13  3   10  6   9
>   2   11  4

This algorithm is essentially:

1. generate a random index
2. if we have already used than index, go back to step 1
3. use the element at that index and record that fact

This algorithm would seem to suffer from a degeneration, due
to the fact that as you use more and more elements from the
list, it becomes increasingly unlikely that the index generated
in step 1 will result in an ununsed index.

Suppose the list has 100 numbers and you extract 99 of them,
leaving, say, index 37 as the only "unused" index. Now you have
to keep calling rand() over and over until it returns 37. If
rand() is giving you a random sequence, then you cannot predict
how long it will take to return 37, other than to say it will
return 37 *on average* once per 100 calls.

This degeneration could become significant as the size of the
list grows.

May I suggest a modification of the algorithm that avoids this
problem:

1. generate a random index from 1 to (number of remaining elements)
2. use the element at that index
3. move the last element into the index generated
4. decrement the number of remaining elements.

The modified code would look something like:

@numbers = 1..100;
\$numofques = @numbers;

while(\$numofques > 0) {
\$index = rand \$numofques;
\$element = \$numbers[\$index];
printf "%-3d ", \$element;
\$numofques--;
\$numbers[\$index] = \$numbers[\$numofques];
}
printf "\n"

This will call rand() exactly 100 times. The last call will
be rand(1), since only 1 item remains, and will definitely
return 0, the index of the only element. By shifting the
"last" element down to fill in each "hole", we avoid having
"misses".

```