Rebel wrote:
BB+ wrote: CCRL has the following: [...]
CEGT has [...]
From the stipulations I understand that the intent of both CCRL and CEGT is to measure raw engine strength. While that choice has its merits, it does an injustice to the programmer's other efforts to add extra Elo points to his brainchild. Opening books, book learning and position learning are essential parts of a chess program; they are able to fix holes, adapt, and even avoid previously made mistakes.
IMO programs should be tested as a whole, as the programmer intended, and not be handicapped.
This has always been the policy of the SSDF.
Ed
I can understand that you, as a programmer, want all features that you have built into your chess program to be used in testing. However, that presents a problem for a rating list that is trying to test many engines. Comparing two engines by way of their head-to-head match does not really give an accurate idea of their relative strengths. So, a comparison of their results against other engines is also needed. If some of those other engines have the ability to learn, then accuracy in the comparison suffers.
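For what it's worth, here is a minimal sketch of why the common pool matters, assuming only the usual Elo expected-score formula; the pool ratings and scores below are invented purely for illustration, and this is not the actual method used by CCRL, CEGT or any other list:

def expected_score(own_rating, opp_rating):
    # Standard Elo expected score for a single game
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - own_rating) / 400.0))

def performance_rating(opponent_ratings, points, games):
    # Bisect for the rating whose expected total score against this fixed
    # pool matches the observed score (a crude performance estimate)
    lo, hi = 0.0, 4000.0
    games_per_opponent = games / len(opponent_ratings)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, r) for r in opponent_ratings) * games_per_opponent
        if expected < points:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical gauntlet: four opponents, 100 games against each, 220/400 scored
pool = [2500, 2550, 2600, 2650]
print(round(performance_rating(pool, 220, 400)))   # roughly 2611 with these invented numbers

The whole calculation leans on the assumption that the opponents' strengths are fixed; if one of them is quietly getting stronger, two engines tested at different times are not really being measured against the same yardstick.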
Let's say that I play Yace against a gauntlet of engines, including ProDeo with book learning and position learning turned on. Then, some time later, I play Trace against the same gauntlet. The comparison between Yace and Trace suffers to some degree because the ProDeo that Trace played against is not the same ProDeo that Yace played against. ProDeo has evolved during the time between the two gauntlets (this is assuming ProDeo has played more games during that time interval).
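To put rough numbers on that (all of them invented), suppose Yace and Trace are in truth exactly equally strong, and ProDeo's learning has made it effectively 50 Elo stronger by the time Trace runs its gauntlet. Using the standard Elo expected-score formula:

def expected_score(own_rating, opp_rating):
    # Standard Elo expected score for a single game
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - own_rating) / 400.0))

GAMES = 100
TRUE_STRENGTH = 2600             # assume Yace and Trace are identical in strength

prodeo_during_yace_run  = 2550   # ProDeo's effective strength in the first gauntlet
prodeo_during_trace_run = 2600   # hypothetically 50 Elo stronger after more learning

yace_points  = expected_score(TRUE_STRENGTH, prodeo_during_yace_run)  * GAMES
trace_points = expected_score(TRUE_STRENGTH, prodeo_during_trace_run) * GAMES

print(f"Yace  expected score vs ProDeo: {yace_points:.1f} / {GAMES}")   # about 57.1
print(f"Trace expected score vs ProDeo: {trace_points:.1f} / {GAMES}")  # exactly 50.0

From the ProDeo games alone the list would rate Trace roughly 50 Elo below Yace, even though the two engines are identical; spread over a full gauntlet the bias gets diluted, but it never averages out.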
A different method of testing is needed to show how an engine such as ProDeo improves over time.