Randomize a PGN file

Posted: Mon Feb 27, 2012 7:54 pm
by Jeremy Bernstein
Does anyone have a quick tip for randomizing the order of a PGN file? I don't feel like coding it up...

Re: Randomize a PGN file

Posted: Tue Feb 28, 2012 5:00 am
by BB+
Do you have a way of sorting the games in a PGN file? If so, then just sort based upon a random hash function... :geek:

Re: Randomize a PGN file

Posted: Tue Feb 28, 2012 3:02 pm
by Chris Whittington
Jeremy Bernstein wrote: Does anyone have a quick tip for randomizing the order of a PGN file? I don't feel like coding it up...
I recollect that the Oxford Softworks card games, such as Bridge, shuffled the deck with one pass from 1 to 52, swapping each entry with entry[random(52)].

Re: Randomize a PGN file

Posted: Tue Feb 28, 2012 7:22 pm
by hyatt
And that is a known bad way of shuffling a deck of cards. I'll leave it as an exercise to the reader to understand why.
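
One way to see the problem: the one-pass swap with random(52) has 52^52 equally likely swap sequences, and that number cannot be spread evenly over the 52! possible orderings, so some orderings come up more often than others. The sketch below (Python, all function names hypothetical) enumerates the three-card case to show the uneven counts, and includes the Fisher-Yates variant, which picks each swap partner only from the part of the deck that has not been fixed yet.

```python
import random
from collections import Counter
from itertools import product

def naive_shuffle(items, picks):
    """One pass: swap entry i with entry picks[i], where picks[i] is any index."""
    deck = list(items)
    for i, j in enumerate(picks):
        deck[i], deck[j] = deck[j], deck[i]
    return deck

def naive_outcome_counts(n):
    """Run the naive shuffle over all n**n possible pick sequences."""
    counts = Counter()
    for picks in product(range(n), repeat=n):
        counts[tuple(naive_shuffle(range(n), picks))] += 1
    return counts

def fisher_yates(items, rng=random):
    """Unbiased shuffle: position i swaps only with a position in 0..i."""
    deck = list(items)
    for i in range(len(deck) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        deck[i], deck[j] = deck[j], deck[i]
    return deck

if __name__ == "__main__":
    # 27 sequences over 6 orderings: the counts cannot all be equal.
    for order, count in sorted(naive_outcome_counts(3).items()):
        print(order, count)
    print(fisher_yates(range(10)))
```

In Python, random.shuffle already does the unbiased version, so there is rarely a reason to hand-roll either loop.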

Re: Randomize a PGN file

Posted: Tue Feb 28, 2012 7:25 pm
by hyatt
Chris Whittington wrote:
Jeremy Bernstein wrote: Does anyone have a quick tip for randomizing the order of a PGN file? I don't feel like coding it up...
I recollect that the Oxford Softworks card games, such as Bridge, shuffled the deck with one pass from 1 to 52, swapping each entry with entry[random(52)].


1. Read through the PGN file and look for a specific PGN tag (such as "Event"). Write the offset of the start of that game into a file, followed by an md5sum of the moves from that game.

2. Sort the new file on the md5sum, which is a pretty random number.

3. Read the sorted file, use the first offset to seek into the original PGN file, read the entire game, and write it to the new file as game 1. Continue until the list of offsets has been used up.
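
The three steps above might look roughly like this in Python. This is only a sketch under a few assumptions: every game starts with a line beginning with [Event, the whole file fits in memory, and the hash is taken over the full game text rather than just the moves. The function names and the optional salt argument (to get a different order on another run) are hypothetical, not part of the description above.

```python
import hashlib

def index_games(pgn_bytes, tag=b"[Event "):
    """Step 1: return (start, end) byte offsets of each game.

    A game is assumed to start at any line that begins with `tag`.
    """
    starts = []
    pos = 0
    for line in pgn_bytes.splitlines(keepends=True):
        if line.startswith(tag):
            starts.append(pos)
        pos += len(line)
    ends = starts[1:] + [len(pgn_bytes)]
    return list(zip(starts, ends))

def shuffle_pgn_by_md5(pgn_bytes, salt=b""):
    """Steps 2 and 3: sort the offsets by md5 and copy the games out in that order."""
    keyed = [
        (hashlib.md5(salt + pgn_bytes[start:end]).hexdigest(), start, end)
        for start, end in index_games(pgn_bytes)
    ]
    keyed.sort()
    # Normalise the spacing so games stay separated after reordering.
    return b"".join(pgn_bytes[s:e].rstrip() + b"\n\n" for _, s, e in keyed)

if __name__ == "__main__":
    with open("in.pgn", "rb") as src:      # file names are placeholders
        data = src.read()
    with open("out.pgn", "wb") as dst:
        dst.write(shuffle_pgn_by_md5(data))
```

Note that the order is deterministic for a given file and salt, and anything before the first [Event line is dropped.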

Re: Randomize a PGN file

Posted: Tue Feb 28, 2012 8:58 pm
by zwegner
This shouldn't be that hard... I was hoping that "sort -R -t '\n\n['" would do the trick, but sort doesn't allow multi-character separators.

Thus, a Python two-liner:

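import random
# Prepend '\n\n' and drop the empty first piece so every game loses its '[' and gets it back on output.
open('out.pgn', 'w').write('[' + '\n\n['.join(sorted(('\n\n' + open('in.pgn').read().strip()).split('\n\n[')[1:], key=lambda x: random.random())) + '\n')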

Re: Randomize a PGN file

Posted: Tue Feb 28, 2012 9:08 pm
by hyatt
zwegner wrote: This shouldn't be that hard... I was hoping that "sort -R -t '\n\n['" would do the trick, but sort doesn't allow multi-character separators.

Thus, a Python two-liner:

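import random
# Prepend '\n\n' and drop the empty first piece so every game loses its '[' and gets it back on output.
open('out.pgn', 'w').write('[' + '\n\n['.join(sorted(('\n\n' + open('in.pgn').read().strip()).split('\n\n[')[1:], key=lambda x: random.random())) + '\n')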

How is it going to know where the games are separated? I've had at least one big PGN file without a blank line separating the games... My book builder uses a "[" to mark the end of one game and the start of another. That works on all PGN.
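
A sketch of splitting on the tag rather than on blank lines, in Python. It assumes each game begins with a line starting with [Event (the first tag in the usual tag order); a bare "[" is not used here because some files carry bracketed annotations such as {[%clk 0:05:00]} inside move comments. The function names are hypothetical.

```python
import random
import re

# A game is assumed to begin wherever a line starts with "[Event ".
GAME_START = re.compile(r'^\[Event\s', re.MULTILINE)

def split_games(pgn_text):
    """Return the games as a list of strings, with or without blank-line separators."""
    starts = [m.start() for m in GAME_START.finditer(pgn_text)]
    ends = starts[1:] + [len(pgn_text)]
    return [pgn_text[s:e].strip() for s, e in zip(starts, ends)]

def shuffle_games(pgn_text, rng=random):
    """Shuffle the games and re-join them with one blank line between games."""
    games = split_games(pgn_text)
    rng.shuffle(games)
    return "\n\n".join(games) + "\n"

if __name__ == "__main__":
    with open("in.pgn") as src:            # file names are placeholders
        text = src.read()
    with open("out.pgn", "w") as dst:
        dst.write(shuffle_games(text))
```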

Re: Randomize a PGN file

Posted: Wed Feb 29, 2012 1:31 am
by User923005
Assuming that you have a list of N games, address them in this manner {from the Usenet C-FAQ}:

game_number_to_examine = (int)((double)rand() / ((double)RAND_MAX + 1) * N);

No need to actually scramble the games unless you want to.

Some implementations of rand() are iffy, so you might want to consider using the Mersenne Twister instead:
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
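
For comparison, a rough Python equivalent of the same idea. Python's random module uses the Mersenne Twister as its core generator, and randrange takes care of the scaling that the C expression does by hand. The function names are hypothetical; the second one assumes you want to visit several distinct games without rewriting the file.

```python
import random

def random_game_index(n_games, rng=random):
    """Pick one game number in the range 0 .. n_games - 1."""
    return rng.randrange(n_games)

def drill_order(n_games, how_many, rng=random):
    """Pick `how_many` distinct game numbers in random order, leaving the file untouched."""
    return rng.sample(range(n_games), how_many)

if __name__ == "__main__":
    print(random_game_index(500))
    print(drill_order(500, 10))
```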

Re: Randomize a PGN file

Posted: Wed Feb 29, 2012 1:44 am
by Jeremy Bernstein
User923005 wrote: Assuming that you have a list of N games, address them in this manner {from the Usenet C-FAQ}:

game_number_to_examine = (int)((double)rand() / ((double)RAND_MAX + 1) * N);

No need to actually scramble the games unless you want to.

Some implementations of rand() are iffy, so you might want to consider using the Mersenne Twister instead:
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
I ultimately just found some software for Android that does what your first suggestion does (although using an apparently iffy rand()): https://market.android.com/details?id=c ... ira.ichess. Unfortunately, SCID on the go doesn't do this yet.

Thanks for the suggestions, though -- if I decide to drill the tactical sets on paper, the various ideas here will definitely help to shuffle the PGN(s) in question.

jb