LucenaTheLucid wrote: Is there anything I should be weary of? Especially considering the time control?
I would say that you should be aware of the randomness and noise you introduce when testing with a book (this happens with any book, not just Default.bin): it is unfair to the engines, because each one's results depend on which openings the book happens to give it.
It doesn't matter how good the positions out of book are for the engines; what matters is that they're different, and the engines react differently depending on the opening. So if, by chance, one engine gets to play more of the Spanish while another gets to play more of the Sicilian, this biases the results unnecessarily, especially if an engine happens to play those positions better.
Another problem is lack of variety. Say an engine is really good when coming out of book in an unbalanced position, but the book doesn't have any such position: the engine ends up with a lower Elo than it would under real-life conditions. But adding unbalanced positions at random just worsens the problem above.
A nice solution is to use a varied opening suite, one that includes all kinds of playable positions out of book, where every engine plays every position from both sides against every opponent. The only problem is getting hold of such a suite, especially when you need 3000 unique positions for 6000 games (or 4500 unique positions for a total of 9000 games). The point is that saying "I use the same exact settings" is not accurate if the openings are not the same for everyone, or if the openings are biased (too balanced out of book, too many Sicilians in there, etc.)
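To make the arithmetic concrete, here is a minimal sketch of such a schedule. It assumes the 6000 games come from a single pairing of two engines; the engine names and position labels are placeholders, not anything from your setup.

```python
from itertools import combinations

def build_schedule(engines, openings):
    """Return (white, black, opening) tuples: every pair of engines
    plays every opening twice, once with each colour."""
    games = []
    for first, second in combinations(engines, 2):
        for opening in openings:
            games.append((first, second, opening))   # first engine has White
            games.append((second, first, opening))   # colours reversed
    return games

# Hypothetical names, for illustration only.
engines = ["EngineA", "EngineB"]
openings = [f"position_{i}" for i in range(3000)]

schedule = build_schedule(engines, openings)
games_per_opening = len(schedule) // len(openings)
```

With two engines, 3000 positions give 6000 games, and every position is played exactly twice with colours reversed, so nobody gets a luckier draw of openings than the opponent. With more engines the same function covers every pairing, and the game count grows with the number of pairs.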
How many 1.g3 games do you have there? I think there should be at least half as many as 1.c4 games. If not, your book isn't representing the major openings fairly enough, and it's probably focusing too much on specific slices of chess that don't represent real-life scenarios well.
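If you want to check this on your own games, a rough sketch like the following tallies White's first move from PGN text. It assumes every game starts from the normal starting position and that the movetext begins with "1." right after the header tags (no leading comment); the sample games are made up for the example.

```python
import re
from collections import Counter

# White's first move: "1." followed by optional spaces and the move itself.
FIRST_MOVE = re.compile(r"^\s*1\.\s*([^\s{}]+)")

def count_first_moves(pgn_text):
    """Count White's first move in each game of a PGN string."""
    counts = Counter()
    awaiting_moves = False
    for line in pgn_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Header tag: the next non-empty line starts the movetext.
            awaiting_moves = True
        elif stripped and awaiting_moves:
            match = FIRST_MOVE.match(stripped)
            if match:
                counts[match.group(1)] += 1
            awaiting_moves = False
    return counts

# Made-up sample games, for illustration only.
sample = """[Event "a"]
[Result "1-0"]

1. g3 d5 2. Bg2 1-0

[Event "b"]
[Result "0-1"]

1.c4 e5 2. Nc3 0-1

[Event "c"]
[Result "1/2-1/2"]

1. c4 c5 1/2-1/2
"""

counts = count_first_moves(sample)
ratio_g3_to_c4 = counts["g3"] / counts["c4"]
```

Run it over the PGN of your test games and compare counts["g3"] with counts["c4"] to see how far you are from the half-as-many mark.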
Finally, the randomness you get rid of by using 1CPU engines is brought back by the book.