Randomize a PGN file

Jeremy Bernstein
Site Admin
Posts: 1226
Joined: Wed Jun 09, 2010 7:49 am
Real Name: Jeremy Bernstein
Location: Berlin, Germany

Randomize a PGN file

Post by Jeremy Bernstein » Mon Feb 27, 2012 7:54 pm

Does anyone have a quick tip for randomizing the order of a PGN file? I don't feel like coding it up...

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Randomize a PGN file

Post by BB+ » Tue Feb 28, 2012 5:00 am

Do you have a way of sorting the games in a PGN file? If so, then just sort based upon a random hash function... :geek:

Chris Whittington
Posts: 437
Joined: Wed Jun 09, 2010 6:25 pm

Re: Randomize a PGN file

Post by Chris Whittington » Tue Feb 28, 2012 3:02 pm

Jeremy Bernstein wrote:Does anyone have a quick tip for randomizing the order of a PGN file? I don't feel like coding it up...
I recollect that the Oxford Softworks card games, such as Bridge, shuffled the deck in a single pass from 1 to 52, swapping each entry with entry[random(52)].

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: Randomize a PGN file

Post by hyatt » Tue Feb 28, 2012 7:22 pm

And that is a known bad way of shuffling a deck of cards. I'll leave it as an exercise for the reader to understand why.
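For anyone who wants a hint on that exercise, here is a small sketch (not taken from any of the programs mentioned, and the function names are made up for illustration). Swapping each position with a random position anywhere in the deck gives n^n equally likely swap sequences, and for three cards that is 27 sequences spread over 6 orderings, which cannot come out even. Restricting the swap partner to positions i..n-1 (the Fisher-Yates shuffle) gives exactly n! sequences, one per ordering. The code simply enumerates both cases for a three-card deck:

```python
from collections import Counter
from itertools import product


def naive_outcomes(n):
    """Count final orderings over every swap sequence of the naive one-pass shuffle."""
    counts = Counter()
    # Step i swaps position i with ANY position 0..n-1, so there are n**n sequences.
    for choices in product(range(n), repeat=n):
        deck = list(range(n))
        for i, j in enumerate(choices):
            deck[i], deck[j] = deck[j], deck[i]
        counts[tuple(deck)] += 1
    return counts


def fisher_yates_outcomes(n):
    """Same enumeration, but step i may only swap with positions i..n-1."""
    counts = Counter()
    for choices in product(*(range(i, n) for i in range(n))):
        deck = list(range(n))
        for i, j in enumerate(choices):
            deck[i], deck[j] = deck[j], deck[i]
        counts[tuple(deck)] += 1
    return counts


naive = naive_outcomes(3)
fair = fisher_yates_outcomes(3)
print(sorted(naive.values()))  # how often each ordering occurs, naive version
print(sorted(fair.values()))   # same for Fisher-Yates
```

In Python, random.shuffle() already does the unbiased version, so there is rarely a reason to write the loop by hand.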

hyatt

Re: Randomize a PGN file

Post by hyatt » Tue Feb 28, 2012 7:25 pm

Chris Whittington wrote:
Jeremy Bernstein wrote:Does anyone have a quick tip for randomizing the order of a PGN file? I don't feel like coding it up...
I recollect that the Oxford Softworks card games, such as Bridge, shuffled the deck in a single pass from 1 to 52, swapping each entry with entry[random(52)].


1. Read through the PGN file and look for a specific PGN tag (such as "Event"). Write the offset of the start of that game into a new file, followed by an MD5 sum of the moves from that game.

2. Sort the new file on the MD5 sum, which is a pretty random number.

3. Read the sorted file, take the first offset, seek to it in the original PGN file, read the entire game, and write it to the output file as game 1. Continue until the list of offsets has been used up.
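The three steps above can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: every game starts with an [Event ...] tag at the beginning of a line, the whole file fits in memory (so the index lives in a list rather than a separate file), and the digest is taken over the whole game text rather than just the moves. The name shuffle_pgn_by_md5 is made up for the example.

```python
import hashlib
import re

# Assumption: each game begins with an [Event "..."] tag at the start of a line.
GAME_START = re.compile(rb'^\[Event ', re.MULTILINE)


def shuffle_pgn_by_md5(data: bytes) -> bytes:
    """Reorder the games in a PGN byte string by the MD5 digest of each game."""
    # Step 1: record where each game starts, plus a digest of its text.
    offsets = [m.start() for m in GAME_START.finditer(data)]
    bounds = offsets[1:] + [len(data)]
    index = []
    for start, end in zip(offsets, bounds):
        game = data[start:end].strip()
        index.append((hashlib.md5(game).hexdigest(), start, end))
    # Step 2: sort the index on the digest.
    index.sort()
    # Step 3: copy the games out in the sorted order.
    games = [data[start:end].strip() for _, start, end in index]
    return b'\n\n'.join(games) + b'\n'


sample = (b'[Event "A"]\n\n1. e4 e5 *\n\n'
          b'[Event "B"]\n\n1. d4 d5 *\n\n'
          b'[Event "C"]\n\n1. c4 c5 *\n')
shuffled = shuffle_pgn_by_md5(sample)
```

Note that the result is scrambled but not random in the usual sense: the same input always comes out in the same order, and identical games end up next to each other.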

zwegner
Posts: 57
Joined: Thu Jun 10, 2010 5:38 am

Re: Randomize a PGN file

Post by zwegner » Tue Feb 28, 2012 8:58 pm

This shouldn't be that hard... I was hoping that "sort -R -t '\n\n['" would do the trick, but sort doesn't allow multi-character separators.

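Thus, a short Python script:

import random
games = open('in.pgn').read().strip().lstrip('[').split('\n\n[')  # drop the first '[' so every chunk looks alike
random.shuffle(games)  # unbiased in-place shuffle
open('out.pgn', 'w').write('[' + '\n\n['.join(games) + '\n')  # put the '[' back on every game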

hyatt

Re: Randomize a PGN file

Post by hyatt » Tue Feb 28, 2012 9:08 pm

zwegner wrote:This shouldn't be that hard... I was hoping that "sort -R -t '\n\n['" would do the trick, but sort doesn't allow multi-character separators.

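Thus, a short Python script:

import random
games = open('in.pgn').read().strip().lstrip('[').split('\n\n[')  # drop the first '[' so every chunk looks alike
random.shuffle(games)  # unbiased in-place shuffle
open('out.pgn', 'w').write('[' + '\n\n['.join(games) + '\n')  # put the '[' back on every game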

How is it going to know where the games are separated? I've had at least one big PGN file without a blank line separating the games... My book builder uses a "[" to mark the end of one game and the start of another. That works on all PGN.
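A rough sketch of a splitter along those lines (not the book builder's actual code, and split_games is a made-up name): treat the first line starting with "[" that follows any movetext as the start of the next game, so blank lines are not needed at all. It assumes no movetext line, for example a wrapped comment, happens to begin with "[".

```python
def split_games(lines):
    """Split PGN lines into games; a '[' line after movetext starts a new game."""
    games, current, seen_moves = [], [], False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith('[') and seen_moves:
            # A tag line following movetext: the previous game is complete.
            games.append(current)
            current, seen_moves = [], False
        if stripped:
            if not stripped.startswith('['):
                seen_moves = True
            current.append(stripped)
    if current:
        games.append(current)
    return games


# Two games with no blank lines anywhere.
text = ('[Event "A"]\n[Result "*"]\n1. e4 e5 *\n'
        '[Event "B"]\n[Result "*"]\n1. d4 d5 *\n')
games = split_games(text.splitlines())
```

Each game comes back as a list of its non-blank lines, so the list of games can be shuffled and written back out with whatever spacing is preferred.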

User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

Re: Randomize a PGN file

Post by User923005 » Wed Feb 29, 2012 1:31 am

Assuming that you have a list of N games, address them in this manner {from the Usenet C-FAQ}:

game_number_to_examine = (int)((double)rand() / ((double)RAND_MAX + 1) * N);

No need to actually scramble the games unless you want to.

Some implementations of rand() are iffy, so you might want to consider using the Mersenne Twister instead:
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
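For comparison, a minimal Python sketch of the same idea (the helper names are invented for the example). Python's standard random module is itself based on the Mersenne Twister, so no extra library is needed. pick_game is the analogue of the C expression above, and random_game_order gives a visiting order that hits every game exactly once without touching the file:

```python
import random


def pick_game(n_games, rng=random):
    """Pick one game index uniformly from 0..n_games-1."""
    return rng.randrange(n_games)


def random_game_order(n_games, seed=None):
    """Return the indices 0..n_games-1 in random order, each exactly once."""
    rng = random.Random(seed)  # separate generator; pass a seed for a repeatable order
    return rng.sample(range(n_games), n_games)


order = random_game_order(10, seed=1)
```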

Jeremy Bernstein

Re: Randomize a PGN file

Post by Jeremy Bernstein » Wed Feb 29, 2012 1:44 am

User923005 wrote:Assuming that you have a list of N games, address them in this manner {from the Usenet C-FAQ}:

game_number_to_examine = (int)((double)rand() / ((double)RAND_MAX + 1) * N);

No need to actually scramble the games unless you want to.

Some implementations of rand() are iffy, so you might want to consider using the Mersenne Twister instead:
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
I ultimately just found some software for Android that does what your first suggestion does (although using an apparently iffy rand()): https://market.android.com/details?id=c ... ira.ichess. Unfortunately, SCID on the go doesn't do this yet.

Thanks for the suggestions, though -- if I decide to drill the tactical sets on paper, the various ideas here will definitely help to shuffle the PGN(s) in question.

jb
