Rating list question

Hood
Posts: 200
Joined: Thu Jun 10, 2010 2:36 pm
Real Name: Krzych C.

Rating list question

Post by Hood » Sat Jun 19, 2010 7:12 pm

Hi,

Are open-chess forum members able to provide an independent rating list?

I think it would be the shortest way to settle the 'known problem' :-).

Rgds Hood.
Smolensk 2010. Murder or accident... Cui bono ?

There are no bug-free programs. There are programs with undiscovered bugs.
Alleluia.

LetoAtreides82
Posts: 32
Joined: Thu Jun 10, 2010 12:46 am

Re: Rating list question

Post by LetoAtreides82 » Sat Jun 19, 2010 7:58 pm

I've been planning on testing engines like Houdini privately and in the same manner that I carry out the tests for CEGT Blitz, and compiling a private rating list. I might start testing Houdini 1.02 as soon as tomorrow.

LucenaTheLucid
Posts: 160
Joined: Thu Jun 10, 2010 2:14 am
Real Name: Luis Smith

Re: Rating list question

Post by LucenaTheLucid » Sat Jun 19, 2010 10:18 pm

I posted this idea in another topic. Currently I am gathering up some old CPUs to start out with. This is not going to be an easy task.

thorstenczub
Posts: 593
Joined: Wed Jun 09, 2010 12:51 pm
Real Name: Thorsten Czub
Location: United States of Europe, germany, NRW, Lünen

Re: Rating list question

Post by thorstenczub » Sun Jun 20, 2010 9:37 am

I don't think that one more rating list will make the results and Elo ratings any more accurate.

yanquis1972
Posts: 36
Joined: Wed Jun 09, 2010 9:15 pm

Re: Rating list question

Post by yanquis1972 » Sun Jun 20, 2010 9:58 am

i talked about something like this as well. we finally have an ippolit engine being 'officially tested' by ingo, but all other rating lists ignore these monsters afaik. ingo has a particular way of testing (ponder on, short time controls) & another 'open' rating list of some kind would be complementary and not redundant, i think. i myself would like to see one that puts as much emphasis on simulating basic human analysis as possible. not exactly sure what that would be; i've thought about time controls like 2+12, ponder off, but that may be too offbeat.

Chris Whittington
Posts: 437
Joined: Wed Jun 09, 2010 6:25 pm

Re: Rating list question

Post by Chris Whittington » Sun Jun 20, 2010 10:35 am

Rating lists as they are currently compiled are worse than useless and perpetuate, by positive feedback, the cul-de-sac that is chess program development.

Fundamentally a rating list should be independent of the development process. In other words it should measure something that can't be easily predicted by developers.

Think about a parallel system for a minute: education and exams. If the exam is known to teachers beforehand, and teachers are rewarded by class exam results, then they will teach "to the exam", even, at worst, teaching known answers to the known questions. This is no education that anyone could want, but it leads to successful exam grades (teachers like it, their pay depends on it), continual use of the same exam boards that set the questions (exam boards like that) and collusion between teachers and exam boards which can result in payoffs, manipulations, fraud and worse.

Now, translate this back to computer chess. Developers basically all work in the same way at present: they make a tweak or write some new code and then autoplay several tens of thousands of test games to detect whether there's a better win/draw/loss statistic. If there is, they keep the version; else they tweak again. That is the education process for a chess engine.

Testers then autoplay several tens of thousands or more games, and present a list of win/draw/loss statistics (the exam). These statistics tell the developers nothing they didn't already know, instead they provide positive feedback to developers to continue doing what they do. The exam is known and developers work to the exam.
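
For readers unfamiliar with how a win/draw/loss statistic becomes a rating figure, here is a minimal sketch using the standard logistic Elo formula. It is only an illustration, not the method of any rating list or developer mentioned in this thread; the function names and the sample game counts are hypothetical, and the error margin uses a simple normal approximation.

```python
import math

def score_fraction(wins, draws, losses):
    """Fraction of points scored: a win counts 1, a draw 0.5, a loss 0."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games

def elo_from_score(score):
    """Invert the logistic expectation E = 1 / (1 + 10 ** (-D / 400))."""
    if not 0.0 < score < 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_difference(wins, draws, losses):
    """Estimated Elo difference implied by a win/draw/loss record."""
    return elo_from_score(score_fraction(wins, draws, losses))

def elo_interval(wins, draws, losses, z=1.96):
    """Approximate interval for the Elo difference (normal approximation)."""
    games = wins + draws + losses
    s = score_fraction(wins, draws, losses)
    # Per-game variance of the result around the mean score s.
    variance = (wins * (1.0 - s) ** 2
                + draws * (0.5 - s) ** 2
                + losses * (0.0 - s) ** 2) / games
    margin = z * math.sqrt(variance / games)
    # Clamp so the inverse logistic stays defined.
    eps = 1e-9
    low = min(max(s - margin, eps), 1.0 - eps)
    high = min(max(s + margin, eps), 1.0 - eps)
    return elo_from_score(low), elo_from_score(high)

if __name__ == "__main__":
    # Hypothetical match: new version against old version, 10,000 games.
    w, d, l = 3100, 4000, 2900
    low, high = elo_interval(w, d, l)
    print(f"Elo difference: {elo_difference(w, d, l):+.1f} "
          f"(approx. interval {low:+.1f} to {high:+.1f})")
```

A developer following the loop described above would keep the tweak only if the whole interval sits above zero; with few games the interval is wide, which is one reason such tests need tens of thousands of games.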

We get continued use of favoured exam boards and continued use of a flawed (I think so) development process that emphasises playing in the comp-comp pool only (which favours tactical developments over strategic developments). It opens up all possibilities for collusion, manipulation and corruption between examiners (rating lists) and teachers (programmers) and others too. Meanwhile the feedback loop between rating lists and development pushes computer chess in a particular direction and pushes the rating lists to ever new and unrealistic highs as they get more and more out of tune with reality.

Solution is to dump rating lists and promote a different way to measure excellence. Even find a new measure of excellence.

BTO7
Posts: 101
Joined: Thu Jun 10, 2010 4:21 am

Re: Rating list question

Post by BTO7 » Sun Jun 20, 2010 11:23 am

[quote="LetoAtreides82"]I've been planning on testing engines like Houdini privately and in the same manner that I carry out the tests for CEGT Blitz, and compiling a private rating list. I might start testing Houdini 1.02 as soon as tomorrow.[/quote]

You need to talk the guys around there into ....just being testers and being independent of engine controversy. Being neutral and testing all engines is in the best interest of chess players. When you test only for the engine makers ...us chess players lose out.

Regards
BT

Hood
Posts: 200
Joined: Thu Jun 10, 2010 2:36 pm
Real Name: Krzych C.

Re: Rating list question

Post by Hood » Sun Jun 20, 2010 1:33 pm

BTO7 wrote:
You need to talk the guys around there into ....just being testers and being independent of engine controversy. Being neutral and testing all engines is in the best interest of chess players. When you test only for the engine makers ...us chess players lose out.

Regards
BT
That is what I wanted to hear and point out.
Time for an independent rating list :-). All programs on similar = equal hardware, not necessarily the newest. We need a comparison of the programs, not of the hardware.
Rgds Hood

thorstenczub
Posts: 593
Joined: Wed Jun 09, 2010 12:51 pm
Real Name: Thorsten Czub
Location: United States of Europe, germany, NRW, Lünen

Re: Rating list question

Post by thorstenczub » Sun Jun 20, 2010 2:45 pm

yanquis1972 wrote: 'officially tested'
There is nothing official about rating lists. They are not scientific.

Gerold
Posts: 73
Joined: Thu Jun 10, 2010 1:32 am

Re: Rating list question

Post by Gerold » Sun Jun 20, 2010 2:54 pm

LetoAtreides82 wrote:I've been planning on testing engines like Houdini privately and in the same manner that I carry out the tests for CEGT Blitz, and compiling a private rating list. I might start testing Houdini 1.02 as soon as tomorrow.
I have been testing Houdini and that family for 6 months. It's interesting to see how some have progressed and others have not.
Lately I have been recording the results but keeping them private.

Best,
Gerold.
