Adam Hair wrote:
I am asking for a coherent argument. It is easy to lob criticism and insults; that is how most people go about
dismissing others. What would be much more enlightening are well-thought-out counterpoints to what I am about to
write.
1) Engine books are used in computer chess tournaments, which are actual competitions between engine authors. Your
reference to FIDE and the ECF would apply to that scenario, but it is irrelevant to rating lists. The SWCR, IPON, CEGT, and
CCRL do not use engine books because the goal is to measure the strength of each engine, not of engine+book.
orgfert wrote:
Human rating lists are of the complete chess player, which includes memorized openings and endgames as well as learning.
Without a doubt, you are correct. However, most chess engines are not so complete. A choice has been made to test
what is common among all chess engines.
Adam Hair wrote:
2) The "alien" books you refer to are forced on every engine. The purpose is to create a balanced position for the
engines to begin play. Whether or not balanced opening positions are actually achieved is another question.
orgfert wrote:
Most human rating lists are not composed of such events.
Adam Hair wrote:
3) TBs are not removed in general. The question of whether TBs improve Elo has been tested in some cases, but
the CCRL (and the other rating groups, I think) uses TBs.
orgfert wrote:
Ok.
Adam Hair wrote:
4) On ponder strategies: when comparing the various rating lists, whether ponder is on or off does not appear to affect
the relative rankings much. And ponder off allows more games to be played and more engines to be tested.
orgfert wrote:
Again, for the convenience of the tester. But in principle, this should not be done at all for reasons listed below.
Adam Hair wrote:
5) One thing not explicitly named, but which is part of the "total product", is learning. Testing an engine to
determine its Elo rating with learning on creates problems. If an engine is continually changing, then the
comparison of different engines' results against that engine has little meaning. Bayeselo assumes that each
engine has an unchanging true Elo. If an engine is changing, then its true Elo is changing. Several engines with
learning on would make any rating list constructed from their games even more inaccurate than it already is.
orgfert wrote:
I take it this means Bayeselo cannot rate humans, since they are dynamic, learning entities. If a program is written to be like a human, i.e. dynamic, its benefits will be concealed by this flawed testing. My analogy to human rating practices remains apropos. A program is a chess player, and its strength is composed of design elements that are then arbitrarily turned off by testers. This is grossly incorrect.
How many programs are written to be dynamic? As far as I have seen, very few have been. Crafty, ProDeo, and (I think)
RomiChess come to mind. I am sure that there are some others. Yet, the vast majority of engines are static entities,
unlike humans. I fail to see how your analogy to humans and human rating lists applies. If engines like the three I named
were in the majority, then I would be in your camp on this issue. But they are not.
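To put some numbers behind the Bayeselo point above, here is a minimal sketch of why a learning engine breaks a fixed-Elo model. It is not Bayeselo, and the ratings, the learner's 100 point improvement, and the game counts are made-up assumptions (win/loss only, no draws). Two opponents of identical true strength come out roughly 100 points apart simply because one faced the learner early and the other late:
[code]
import math
import random

random.seed(1)

def expected(diff):
    # Standard Elo expected score for a rating difference "diff"
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def play(r_a, r_b, n):
    # Simulate n games and return A's total score.
    # Simplification: win/loss only, no draws.
    return sum(1 for _ in range(n) if random.random() < expected(r_a - r_b))

def elo_diff(score, n):
    # Invert the expected-score formula to get an apparent rating difference
    p = score / n
    return 400.0 * math.log10(p / (1.0 - p))

N = 500
# Engine L "learns": its true strength is 2400 during its early games, 2500 later.
# Engines A and B are both truly 2450; A meets L early, B meets L late.
score_A = play(2450, 2400, N)
score_B = play(2450, 2500, N)

print("A appears %+.0f Elo relative to L" % elo_diff(score_A, N))
print("B appears %+.0f Elo relative to L" % elo_diff(score_B, N))
# A and B are equally strong, yet a model that treats L as a single
# fixed-Elo opponent places them roughly 100 points apart.
[/code]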
Adam Hair wrote:
6) Each rating list is an attempt at something approaching a scientific measurement of engine strength. How close
the approach comes is open to opinion.
In each case, there is an attempt to eliminate sources of variation.
Sometimes there are trade-offs (more testers allow for more games and engines, but create more statistical
noise), but at least there is some idea of each engine's strength (there are many more that should have been tested).
orgfert wrote:
This approach fundamentally destroys many design elements of a computer chess player's strength. Even if you discover that specific elements tend to make little difference, you are blinding the test to potentially effective strategies when they arrive in newer, more innovative versions.
Therefore, testing should be careful to include all design elements in a system for evaluation, whether they are deemed to differentiate or not. This is a fundamental principle that should never be violated.
This is done quite often in science:
define what you are trying to measure, try to eliminate sources of variation, then
measure it. The scope of the testing, in this case, is narrowly defined. We are trying to find the relative strength of
each engine. And there are a lot of engines out there, many being updated and new engines arriving each month. The
CCRL has been trying to test as many of them as possible. This goal may be at odds with what you would like to see done.
It has been helpful to others.
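To give a sense of the statistical noise mentioned above, here is a rough back-of-the-envelope sketch of how the error on a measured rating difference shrinks with the number of games. It uses only the standard Elo expected-score relation plus a normal approximation of my own; it ignores draws (which in practice reduce the variance), so the figures are illustrative, not anything a rating group publishes:
[code]
import math

def elo_error(n_games, p=0.5):
    # Approximate one-sigma uncertainty (in Elo points) of a rating
    # difference estimated from n_games with mean score p, via the
    # delta method on d = 400*log10(p/(1-p)).
    sigma_p = math.sqrt(p * (1.0 - p) / n_games)
    d_elo_dp = 400.0 / (math.log(10.0) * p * (1.0 - p))
    return d_elo_dp * sigma_p

for n in (100, 1000, 10000):
    e = elo_error(n)
    print("%6d games: about +/-%.0f Elo (95%% interval roughly +/-%.0f)" % (n, e, 2 * e))
[/code]
Roughly speaking, 100 games pins a head-to-head rating difference down to a few tens of Elo, and it takes thousands of games to get within a handful of points, which is why the lists prefer setups that let many games be played.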
Are you also caught up in the notion that the CCRL is some kind of accreditation organization?
If we were, then our tests would need to include all design elements. Well, we are not, and we do not pretend to be.