CCRL and CEGT entirely in-competent !

kranium · Post by **kranium** » Wed Aug 04, 2010 11:52 pm

yes...
a group of complete amateurs...
none of whom have even the slightest amount of programming experience... or expertise in any way, shape or form.
a bunch of overrated (ego-inflated, fat-headed) amateur hobby chess enthusiasts

Specifically I'm talking about the CCRL and the CEGT, namely: Graham Banks, Gabor Szots, Werner Schüle, Johan Havegheer, etc...

why?
because they figured out how to run engine vs engine matches in Arena!?
and now we all owe them an enormous debt of gratitude for the electricity and CPU cycles invested!?
ridiculously enough they all think they now deserve a Nobel prize for chess engine testing..!

but their testing methods are completely unscientific, and near atrocious!
please read this for more info:
http://talkchess.com/forum/viewtopic.ph ... =&start=60

Milos S. succinctly and intelligently outlines at least a dozen major flaws in CCRL testing procedures...
it's pitiful...

and to think: these are the corrupt 'good old boys' that dictate which engines are acceptable and which are not?!
God help us!

BB+ · Post by **BB+** » Thu Aug 05, 2010 4:28 am

kranium wrote: Milos S. succinctly and intelligently outlines at least a dozen major flaws in CCRL testing procedures...

Before I look at the list, I might say that this seems a large number of "major" flaws. OTOH, even one major flaw can invalidate various conclusions. My own opinion is that there is little "scientific" aspect to CCRL/CEGT, while SSDF is better in that respect. As I've said before, the main positive of CCRL/CEGT is that they cover quite a wide array of engines (hobbyists and pros alike) so that one can have a gross idea (within 50 ELO at least) of how good an engine is. To try to have them split hairs on questions revolving around the top engines is probably not feasible. Indeed, if they had testers who did nothing but test "Top 10" engines, it would almost be tangential to the greater focus of their work.

BB+ · Post by **BB+** » Thu Aug 05, 2010 5:02 am

Having now looked at the TalkChess thread, I can comment on a side issue: Rybka 3 has rather little "obfuscation" code. There is code that does nothing in the end, for instance updating of counters whose contents end up having no effect, but I would not call this obfuscation, but rather unused ideas (examples are the various extraneous positional gain updaters, or the 4 history slots with only the first 2 used). There are also a few arrays of zeros for scores with passed pawns, which can be eliminated. Some of the search functions have extra variables passed, which can at least be simplified if not eliminated. The 136-byte pawn-hash entries (VR padded it wrong, as it should be 128) are another minor slowdown. I do not think these abnormally affect fast time controls, though they do cause general slowness. My guess is that R3 could be 10% faster (still written in C), maybe 15% with reworking the data structures. I know nothing about the 32-bit version, other than that VR is rather uninterested in doing anything for it.
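To illustrate why a 136-byte entry might cost more than a 128-byte one, here is a minimal sketch. It assumes 64-byte cache lines and a packed table whose base address is line-aligned; neither is stated in the post, and the function names are hypothetical. It says nothing about Rybka's actual memory layout and only counts how many cache lines each entry would span under those assumptions.

```python
CACHE_LINE = 64  # ASSUMPTION: cache-line size in bytes, not taken from the post


def lines_touched(entry_size, index, line=CACHE_LINE):
    """Cache lines spanned by entry `index` of a packed, line-aligned table."""
    start = entry_size * index          # byte offset of the first byte
    end = start + entry_size - 1        # byte offset of the last byte
    return end // line - start // line + 1


def worst_case(entry_size, entries=1024):
    """Largest number of cache lines any of the first `entries` entries spans."""
    return max(lines_touched(entry_size, i) for i in range(entries))


if __name__ == "__main__":
    for size in (128, 136):
        print(size, "bytes per entry ->", worst_case(size), "cache lines at worst")
```

Under these assumptions every 128-byte entry stays within two cache lines, while a 136-byte entry always spills into a third, which would be consistent with a small but general slowdown rather than anything specific to fast time controls.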

I never did find any list of a dozen major flaws. The closest was a list of 7 (maybe 9 if you count #1 as 3 points) in http://talkchess.com/forum/viewtopic.ph ... 58&t=35579

The most exaggerated item (elsewhere in the thread) from Milos was: "[VR]'s real skills were clear to some ppl when he released the first and totally mediocre 1600 elo version of his engine. After that came fruit, and then external help financed from who knows whom with a huge amount of money to do the tuning for him (since he didn't have, and still doesn't have a clue about engine parameter tuning)..."

I can name at least three things that VR brought to the computer chess world: material imbalances, heavily asymmetric search trees (more than just LMR), and hyperbullet testing (I'm not even sure anyone was thinking within 2 orders of magnitude of what he was doing). ZW has mentioned that R1 looked to have tuned some values from Fruit, so I can't think VR is totally hopeless in that field [unless it is true that NASA did it for him -- in the R4 release, he thanked Alan Sassler for some help in dealing with correlated features]. The main (known) outsourcings were LK for evaluation [well, maybe also the material imbalance idea first came from him too, though not as an implementation] and Noomen (and others) for the opening book, and I guess you can mention M ANSARI, Kullberg, BigMomma (maybe others?) with some level of testing aid. Then again, his army of beta-testers never quite seem to find all the bugs.

BB+ · Post by **BB+** » Thu Aug 05, 2010 5:17 am

In fact, now I ran across Adam Hair's well-written response http://talkchess.com/forum/viewtopic.ph ... 18&t=35579
I see that he (like I) thinks that the "30 ELO" estimate of Milos for ELO resolution was, if anything, low, even in the "pure" list. Part of this is the catch-22 of publishing error margins. If you don't (like FIDE, or more so with a statistician like Sonas), you typically give an erroneous idea of your precision. However, if you do, and use completely "statistical" error bounds, you've really not addressed the nastier question of possible systematic bias.
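For concreteness, a purely "statistical" error bound of the kind mentioned above can be sketched as follows. This is a minimal illustration, assuming the usual logistic Elo model and a normal approximation on the match score; it is not the method any rating list is known to use, and the function names are hypothetical. By construction it captures only sampling noise and nothing about systematic bias (books, hardware, opponent pool).

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score, clamped away from 0 and 1."""
    score = min(max(score, 1e-9), 1.0 - 1e-9)
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high); z=1.96 corresponds to roughly 95% coverage."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


if __name__ == "__main__":
    elo, low, high = elo_with_margin(wins=300, draws=400, losses=300)
    print(f"Elo {elo:+.1f}, interval [{low:+.1f}, {high:+.1f}]")
```

The interval shrinks roughly with the square root of the number of games, so it can look impressively tight on a large list while any systematic bias stays exactly where it was.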

kingliveson · Post by **kingliveson** » Sun Aug 08, 2010 5:53 pm

I have to disagree with the premises of this topic. A couple of points first: CCRL bias toward a certain author due to personal relationship is the major issue. It is a private entity after all, so... The only other issue I see is using a book for testing; CEGT does it as well. In general, the top engine will still come out ahead, but the point difference won't be as accurate compared to using positions that give equal opportunity to both engines.

Here is where I disagree. First, regarding all testers using the same hardware, it's not a good idea as it does not represent reality. People have different hardware and setups all around the world. The second issue is time control. The current time controls used by CCRL, CEGT, SSDF are just fine. If you begin testing using time/move, you are not doing engine testing, because as Prof. Hyatt pointed out in another thread, there are personalities that are eliminated. One that comes to mind is time management -- most engines will play slower/faster given a position.

In all, the current rating lists are good enough at giving relative strength. We just need one that is not biased, especially when claiming to be independent.

Matthias Gemuh · Post by **Matthias Gemuh** » Sun Aug 08, 2010 8:14 pm

kingliveson wrote: so... The only other issue I see is using a book for testing; CEGT does it as well. In general, the top engine will still come out ahead, but the point difference won't be as accurate compared to using positions that give equal opportunity to both engines.

The remis books (*.cgb) of ChessGUI are actually start positions in obfuscated pgn format.

Peter C · Post by **Peter C** » Mon Aug 09, 2010 2:50 am

Matthias Gemuh wrote:
kingliveson wrote: so... The only other issue I see is using a book for testing; CEGT does it as well. In general, the top engine will still come out ahead, but the point difference won't be as accurate compared to using positions that give equal opportunity to both engines.
The remis books (*.cgb) of ChessGUI are actually start positions in obfuscated pgn format.

Why obfuscated?

It must be that ChessGUI is a clone! Matthias you scoundrel....

Peter

Roger Brown · Post by **Roger Brown** » Mon Aug 09, 2010 3:20 am

Peter C wrote:
Why obfuscated?

It must be that ChessGUI is a clone! Matthias you scoundrel....

Peter

I KNEW IT!

It is a Winboard/Arena/Chessb... clone. Uh, let us leave off the last one for the time being.

Later.

Matthias Gemuh · Post by **Matthias Gemuh** » Mon Aug 09, 2010 4:37 am

Peter C wrote:
Matthias Gemuh wrote:
kingliveson wrote: so... The only other issue I see is using a book for testing; CEGT does it as well. In general, the top engine will still come out ahead, but the point difference won't be as accurate compared to using positions that give equal opportunity to both engines.
The remis books (*.cgb) of ChessGUI are actually start positions in obfuscated pgn format.
Why obfuscated?

It must be that ChessGUI is a clone! Matthias you scoundrel....

Peter

The whole chess scene is full of book thieves, so ChessGUI obfuscates to protect the hard work of book creators.

Dr.Wael Deeb · Post by **Dr.Wael Deeb** » Mon Aug 09, 2010 8:01 am

Roger Brown wrote:
Peter C wrote:
Why obfuscated?

It must be that ChessGUI is a clone! Matthias you scoundrel....

Peter

I KNEW IT!

It is a Winboard/Arena/Chessb... clone. Uh, let us leave off the last one for the time being.

Later.

Matthias cloned the ChessBase GUI.

How did he manage to clone the 100 tons of bugs living in the software?

Now he has to release regular buggy updates like ChessBase to make it even more buggy.

OpenChess
