A Talkchess thread: Misinformation being spread

General discussion about computer chess...
orgfert
Posts: 183
Joined: Fri Jun 11, 2010 5:35 pm
Real Name: Mark Tapley

Re: A Talkchess thread: Misinformation being spread

Post by orgfert » Fri Jan 14, 2011 9:08 am

You make it sound as if there is not very much left after turning off books, ponder, and learning. I believe many authors would disagree with you.
The real test is this: if you could do that to a human chess player, would he disagree with you?
Adam Hair wrote:Put at a disadvantage? No. Invalidate their standing? No. I would think some of the extras are there in order to make the chess engine play more interesting, not because it would help it be higher on any rating list.
Ask a human intelligence whether the things you turn off in an artificial intelligence are extras he can do without.
Adam Hair wrote:Do you really think that list makers determine what should be in a chess program?
They determine how to limit and hobble all the AI, which must surely affect differing designs by differing and unknowable quanta. The result is then considered a scientific effort at objective measurement, though how it could be considered so with such arbitrary interference is puzzling.
Adam Hair wrote:You give the whole group too much credit. Some authors undoubtedly strive to climb the lists. Others pay more attention to giving their program a full set of features.
I think you'll find I give them almost no credit (no offense intended) due to arbitrary tampering with the designs. I think this is done innocently in ignorance.
Adam Hair wrote:Anybody who does not understand that computer chess is artificial intelligence needs to do some reading. Yet,
simply testing for engine strength does not dismiss that connection. How do you think Bob Hyatt tests Crafty?
With books, ponder, and learning on? No. When he competes with Crafty, then yes.
Here you briefly take my point, but then immediately toss it aside with little attempt at explanation.
Adam Hair wrote:But when he wants to find out if some changes in the code make Crafty stronger, all of that is turned off. The same for other authors.
And the rating lists serve as a check for them.
But in this case, he is tuning search and eval only. To do this, he must isolate it from its dynamic AI functions. Why rating lists would only be interested in a subset of the total AI seems strange. Why is no one interested in the total AI?
Adam Hair wrote:We are not giving any program a UL listing. The fact is this: we are testing the chess engine, not the chess program.
Ok, but this has been clear from the start. What is not so clear is the reason why no one wants to know the relative strength of the AI.
Adam Hair wrote:Start testing all the bells and whistles yourself.
What you are calling bells and whistles are the holy grail of AI. One wonders what we are endeavoring to discover by crippling whatever abilities have been achieved. I don't understand the answers that have been given to this so far.
Adam Hair wrote:You certainly feel strongly about this. However, the strength of your convictions does not determine whether you are
right or wrong about an issue.
I'm somewhat at a loss since it seems completely obvious. Testing competing AIs would seem to be a goal with no shortage of champions, yet one finds it a goal of almost no one. And when it is suggested, eyebrows are raised as if the suggestion were utterly ludicrous (complete with laughing emoticons).

Adam Hair
Posts: 104
Joined: Fri Jun 11, 2010 4:29 am
Real Name: Adam Hair
Contact:

Re: A Talkchess thread: Misinformation being spread

Post by Adam Hair » Sat Jan 15, 2011 2:34 am

orgfert wrote:
Adam Hair wrote:But when he wants to find out if some changes in the code make Crafty stronger, all of that is turned off. The same for other authors.
And the rating lists serve as a check for them.
But in this case, he is tuning search and eval only. To do this, he must isolate it from its dynamic AI functions. Why rating lists would only be interested in a subset of the total AI seems strange. Why is no one interested in the total AI?
This is the crux of the discussion here. It is not the case that "nobody is interested in the total AI". There are people
who actually play against chess programs that make use of these functions. However, it does seem that many
engine authors are not interested in adding these functions. My interest is to test as many engines as possible. To
leave these functions on would provide extra information about the programs that have learning functions. But it
would give us less information about the engines that do not have these functions. Given my interest,
majority rules.
Adam Hair wrote:You certainly feel strongly about this. However, the strength of your convictions does not determine whether you are
right or wrong about an issue.
I'm somewhat at a loss since it seems completely obvious. Testing competing AIs would seem to be a goal with no shortage of champions, yet one finds it a goal of almost no one. And when it is suggested, eyebrows are raised as if the suggestion were utterly ludicrous (complete with laughing emoticons).

I haven't suggested that the idea is ludicrous (in contrast to your claim that anybody who does not agree with you on this is dumb). But there is only a small subset of engines that have those features. I am interested in all engines.
Perhaps you should test those things that you want to see tested.

orgfert
Posts: 183
Joined: Fri Jun 11, 2010 5:35 pm
Real Name: Mark Tapley

Re: A Talkchess thread: Misinformation being spread

Post by orgfert » Sat Jan 15, 2011 5:45 am

Adam Hair wrote:
orgfert wrote:
Adam Hair wrote:But when he wants to find out if some changes in the code make Crafty stronger, all of that is turned off. The same for other authors.
And the rating lists serve as a check for them.
But in this case, he is tuning search and eval only. To do this, he must isolate it from its dynamic AI functions. Why rating lists would only be interested in a subset of the total AI seems strange. Why is no one interested in the total AI?
This is the crux of the discussion here. It is not the case that "nobody is interested in the total AI". There are people
who actually play against chess programs that make use of these functions. However, it does seem that many
engine authors are not interested in adding these functions. My interest is to test as many engines as possible. To
leave these functions on would provide extra information about the programs that have learning functions. But it
would give us less information about the engines that do not have these functions. Given my interest,
majority rules.
See for yourself whether this is not the case: we wouldn't dream of disabling some particular genius for an aspect of the game in human players (or of excluding them from the group) because we thought a rating list that included them would tell us more about them than about those who lacked that particular genius. Or would we?

To me this seems a rhetorical question. But maybe I'm missing something.

Adam Hair
Posts: 104
Joined: Fri Jun 11, 2010 4:29 am
Real Name: Adam Hair
Contact:

Re: A Talkchess thread: Misinformation being spread

Post by Adam Hair » Sat Jan 15, 2011 6:50 am

orgfert wrote:
Adam Hair wrote:
orgfert wrote:
Adam Hair wrote:But when he wants to find out if some changes in the code make Crafty stronger, all of that is turned off. The same for other authors.
And the rating lists serve as a check for them.
But in this case, he is tuning search and eval only. To do this, he must isolate it from its dynamic AI functions. Why rating lists would only be interested in a subset of the total AI seems strange. Why is no one interested in the total AI?
This is the crux of the discussion here. It is not the case that "nobody is interested in the total AI". There are people
who actually play against chess programs that make use of these functions. However, it does seem that many
engine authors are not interested in adding these functions. My interest is to test as many engines as possible. To
leave these functions on would provide extra information about the programs that have learning functions. But it
would give us less information about the engines that do not have these functions. Given my interest,
majority rules.
See for yourself whether this is not the case: we wouldn't dream of disabling some particular genius for an aspect of the game in human players (or of excluding them from the group) because we thought a rating list that included them would tell us more about them than about those who lacked that particular genius. Or would we?

To me this seems a rhetorical question. But maybe I'm missing something.
All I can say is I don't believe apples are being compared to apples here. All human players can learn. Most chess
engines do not have learning functions. If most engines did have learning functions, then the way engines are tested
might be different.

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: A Talkchess thread: Misinformation being spread

Post by BB+ » Sat Jan 15, 2011 8:46 am

What kind of learning do Hiarcs and Naum have? I've never seen it (unless Hiarcs got to it in the 13 series).
Well, they at least advertise it.
HIARCS 10 (as in HIARCS 13): "Engine book learning and position learning capabilities are built into all HIARCS UCI engines helping HIARCS improve through its playing experience".
HIARCS 9 gives it as a major advance: "The latest version 9.0 of the program has been enhanced and extended in many ways, particularly in terms of implementing concrete chess knowledge and positional learning."
From the UCI options: http://www.hiarcs.com/pc_uci_options.htm
Position Learning (ON)
This setting allows HIARCS to learn from the games it plays or analyses. This can improve its play in future games. The default is ON.
Book Learning (ON)
This setting allows HIARCS to use its experiences with the current book to make decisions about which moves to make from the book. HIARCS has clever book learning so please use it! The default is ON and clearly best.

Naum has a UCI option called "EnableBookLearning".
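
Neither HIARCS nor Naum publishes how its learning actually works, but the usual idea behind book learning is easy to sketch. Something like the following (C++; all names here are hypothetical -- this illustrates the general technique, not anybody's actual code): after each game, the weight of every book move the engine actually played is nudged up or down by the result, so losing lines are gradually avoided.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical book entry: a move and its selection weight.
struct BookMove {
    std::string move;   // e.g. "e2e4"
    double weight;      // relative probability of being chosen
};

// Book keyed by position hash (a Zobrist key in a real engine).
using Book = std::map<std::uint64_t, std::vector<BookMove>>;

// After a game, nudge the weights of the book moves the engine played.
// result: 1.0 = win, 0.5 = draw, 0.0 = loss (from the engine's viewpoint).
void learnFromGame(Book& book,
                   const std::vector<std::pair<std::uint64_t, std::string>>& played,
                   double result, double rate = 0.1)
{
    for (const auto& [key, move] : played)
        for (BookMove& bm : book[key])
            if (bm.move == move) {
                // Wins reinforce the line, losses suppress it; draws leave it alone.
                bm.weight *= 1.0 + rate * (2.0 * result - 1.0);
                if (bm.weight < 0.01) bm.weight = 0.01;  // never prune a line entirely
            }
}
```

Real implementations presumably track per-move statistics and decay old results, but the weight-nudging principle is the same.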

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: A Talkchess thread: Misinformation being spread

Post by BB+ » Sat Jan 15, 2011 9:14 am

Most chess engines do not have learning functions. If most engines did have learning functions, then the way engines are tested might be different.
Is it fair to argue the opposite? If more testing coalitions included learning features, would the typical amateur engine be more likely to implement them? A number of commercial engines (HIARCS, Shredder, Naum) do have learning functions, particularly with opening books. Some of them even recommend this -- for instance, the Junior FAQ has (emphasis added):
Q. How can I get Deep Junior to use its own opening book?
A. Deep Junior 12 comes with its own huge chess opening book by GM Alon Greenfeld in Chessbase ctg format (511Mb download) for use in all Chessbase or compatible GUIs. The OwnBook engine parameter enables use of the Deep Junior own engine book which is much smaller than the ctg book. We recommend the use of the ctg book with book learning on for all official testing.
Admittedly, once you've decided, as per CCRL, to use generic books, much of this becomes essentially moot. [Also, I'm not sure if what is meant here is some CTG-based learning rather than something Junior-based -- similarly with Fritz, while Shredder, Naum, and HIARCS all seem to have at least some "learning" which goes beyond just CTG manipulation].

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: A Talkchess thread: Misinformation being spread

Post by hyatt » Sat Jan 15, 2011 7:12 pm

orgfert wrote:
You make it sound as if there is not very much left after turning off books, ponder, and learning. I believe many authors would disagree with you.
The real test is this: if you could do that to a human chess player, would he disagree with you?
Adam Hair wrote:Put at a disadvantage? No. Invalidate their standing? No. I would think some of the extras are there in order to make the chess engine play more interesting, not because it would help it be higher on any rating list.
Ask a human intelligence whether the things you turn off in an artificial intelligence are extras he can do without.
Adam Hair wrote:Do you really think that list makers determine what should be in a chess program?
They determine how to limit and hobble all the AI, which must surely affect differing designs by differing and unknowable quanta. The result is then considered a scientific effort at objective measurement, though how it could be considered so with such arbitrary interference is puzzling.
Adam Hair wrote:You give the whole group too much credit. Some authors undoubtedly strive to climb the lists. Others pay more attention to giving their program a full set of features.
I think you'll find I give them almost no credit (no offense intended) due to arbitrary tampering with the designs. I think this is done innocently in ignorance.
Adam Hair wrote:Anybody who does not understand that computer chess is artificial intelligence needs to do some reading. Yet,
simply testing for engine strength does not dismiss that connection. How do you think Bob Hyatt tests Crafty?
With books, ponder, and learning on? No. When he competes with Crafty, then yes.
This is _really_ mixing apples and oranges. In my testing, I am not trying to find out how much better or worse my program is when compared to another program. I am testing different versions of the _same_ program to see if the changes are good or bad. That is far different than the intent of a chess tournament or a rating list.

I eliminate pondering because it increases randomness. I eliminate the book because I don't want to have to test every opening and I don't want to deal with the interference learning can cause. I don't use SMP because that is a performance issue that has nothing to do with modifying the program's search or evaluation to improve them. All I am trying to measure is the change(s) we make to the evaluation or search: was the change good or bad?
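
Just to illustrate the arithmetic involved (a minimal sketch of the standard Elo conversion, not our actual test harness): given the wins, draws and losses from a version-vs-version match, you get an Elo difference and an error bar, and the error bar is why you need thousands of games before you can trust a small change.

```cpp
#include <cmath>
#include <cstdio>

// Convert a scoring fraction (0..1) into an Elo difference.
double scoreToElo(double s) {
    return -400.0 * std::log10(1.0 / s - 1.0);
}

int main() {
    // Example match result: new version vs. old version.
    double w = 1200, d = 1600, l = 1200;     // wins / draws / losses
    double n = w + d + l;
    double s = (w + 0.5 * d) / n;            // new version's scoring fraction

    // Per-game variance of the score, then a ~95% margin on the mean.
    double var = (w * 1.0 + d * 0.25) / n - s * s;
    double margin = 1.96 * std::sqrt(var / n);

    std::printf("Elo: %+.1f  (95%% interval: %+.1f to %+.1f)\n",
                scoreToElo(s), scoreToElo(s - margin), scoreToElo(s + margin));
    return 0;
}
```

Even with the numbers above (4,000 games, a dead-even score) the interval is still roughly +/- 8 Elo, which is exactly why a small eval change takes so many games to verify.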

A rating list or a tournament is a different thing. There, the "whole system" is under scrutiny: book, search, eval, speed (SMP included), endgame tables, learning, whatever else your engine does to make it play better.

So, as I said, this is apples and oranges. How I test has nothing to do with how one should conduct a tournament or a rating list...

I agree with your comments below. We already know that a book can make a huge difference in real games. A good book will both (a) guide the program into openings where it plays well and (b) guide the program away from openings that its eval or search seems ill-suited to handle. This means that one can either try to fix a hole in the evaluation, or use the book to avoid that hole. If you graft an odd book onto a program, you deny it this protection that the author depended on, and the results can be artificially worse. Or if your opponent uses a book that is ill-suited to it, your program might look better if that book forces the opponent into openings it would normally avoid.

The only question is, which is better? If you only want to compare engines, no books could work, assuming you expect all engines to play all openings equally skillfully. But none of the authors really believes that is possible. Humans don't play that way; we avoid what we don't understand or are unfamiliar with.


Here you briefly take my point, but then immediately toss it aside with little attempt at explanation.
Adam Hair wrote:But when he wants to find out if some changes in the code make Crafty stronger, all of that is turned off. The same for other authors.
And the rating lists serve as a check for them.
But in this case, he is tuning search and eval only. To do this, he must isolate it from its dynamic AI functions. Why rating lists would only be interested in a subset of the total AI seems strange. Why is no one interested in the total AI?
Adam Hair wrote:We are not giving any program a UL listing. The fact is this: we are testing the chess engine, not the chess program.
Ok, but this has been clear from the start. What is not so clear is the reason why no one wants to know the relative strength of the AI.
Adam Hair wrote:Start testing all the bells and whistles yourself.
What you are calling bells and whistles are the holy grail of AI. One wonders what we are endeavoring to discover by crippling whatever abilities have been achieved. I don't understand the answers that have been given to this so far.
Adam Hair wrote:You certainly feel strongly about this. However, the strength of your convictions does not determine whether you are
right or wrong about an issue.
I'm somewhat at a loss since it seems completely obvious. Testing competing AIs would seem to be a goal with no shortage of champions, yet one finds it a goal of almost no one. And when it is suggested, eyebrows are raised as if the suggestion were utterly ludicrous (complete with laughing emoticons).

orgfert
Posts: 183
Joined: Fri Jun 11, 2010 5:35 pm
Real Name: Mark Tapley

Re: A Talkchess thread: Misinformation being spread

Post by orgfert » Sun Jan 16, 2011 4:17 am

Adam Hair wrote: To leave these functions on would provide extra information about the programs that have learning functions. But it
would give us less information about the engines that do not have these functions.
I don't see how your conclusion follows logically. I'm not even sure that it's possible to demonstrate scientifically whether more or less information is being discovered or lost in either category. You are focusing on learning, but it seems even more logical that just in disabling pondering, a critical component of an AI's potential strength is being ignored, as there are at least a few differentiating nuances in the technique, and it's a widespread ability in chess AI. It can be extremely important in building momentum in a game that requires momentum to be successful.
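
For those unfamiliar with the mechanics: under the UCI protocol, pondering means the engine searches the position it expects after its predicted reply, and the GUI either confirms the guess or restarts the search. Roughly like this from the GUI side (a sketch only -- sendToEngine and the surrounding plumbing are placeholders, not any real GUI's API):

```cpp
#include <cstdio>
#include <string>

// Stand-in for however a real GUI writes to the engine's stdin.
void sendToEngine(const std::string& line) {
    std::printf("-> %s\n", line.c_str());
}

// The engine answered "bestmove e2e4 ponder e7e5": let it keep thinking
// on the predicted position while the opponent is on the move.
void startPondering(const std::string& fen, const std::string& bestmove,
                    const std::string& ponderMove) {
    sendToEngine("position fen " + fen + " moves " + bestmove + " " + ponderMove);
    sendToEngine("go ponder wtime 300000 btime 300000");
}

// Called when the opponent's actual move arrives.
void opponentMoved(const std::string& actual, const std::string& predicted) {
    if (actual == predicted) {
        // Guessed right: the ponder search becomes a normal search, so
        // everything computed on the opponent's time was effectively free.
        sendToEngine("ponderhit");
    } else {
        // Guessed wrong: abort, then send the real position and a fresh "go".
        sendToEngine("stop");
    }
}

int main() {
    startPondering("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
                   "e2e4", "e7e5");
    opponentMoved("c7c5", "e7e5");   // opponent played the Sicilian instead
}
```

On a ponder hit the engine has, in effect, been given the opponent's thinking time for free, which is presumably the randomness testers are trying to avoid by switching it off.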

It seems as if the intention is to create a rating list of crippled AIs -- scientifically devised to provide as little useful information on the relative strengths of the AIs as can possibly be arranged. I'm sure this was not the intention, but rather an accident of not thinking the matter through long enough before deciding upon a procedure, and of applying scientific principles that do not pertain to accurate measurement of dynamic systems, such as trying to eliminate variables that are native to the intelligence. That simply defeats the purpose of measuring the intelligence in the first place.

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: A Talkchess thread: Misinformation being spread

Post by BB+ » Mon Jan 17, 2011 12:27 pm

We already know that a book can make a huge difference in real games
The CCC thread from 2005 had various guesses, from 20-300 Elo (the lower coming from VR, comparing an "extremely good" book to one prepared automatically in a few hours), and even VD's claim of 700 Elo for "book" versus "no book" at the WCCC level (perhaps when tuned against a specific opponent). I'm not sure there was any consensus, though book testers seem to show 50-100 Elo differentials between subsequent generations of a book as quite possible.
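
For perspective, the standard Elo model turns a rating gap into an expected score via E = 1/(1 + 10^(-gap/400)); a quick sketch:

```cpp
#include <cmath>
#include <cstdio>
#include <initializer_list>

// Expected score for the stronger side under the standard Elo model.
double expectedScore(double eloGap) {
    return 1.0 / (1.0 + std::pow(10.0, -eloGap / 400.0));
}

int main() {
    for (double gap : {20.0, 50.0, 100.0, 300.0, 700.0})
        std::printf("%+5.0f Elo -> %4.1f%% expected score\n",
                    gap, 100.0 * expectedScore(gap));
}
```

So the 50-100 Elo between book generations corresponds to roughly a 57-64% score, while a 700 Elo edge would mean scoring about 98% -- which may help explain why no consensus was reached.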

orgfert
Posts: 183
Joined: Fri Jun 11, 2010 5:35 pm
Real Name: Mark Tapley

Re: A Talkchess thread: Misinformation being spread

Post by orgfert » Wed Jan 19, 2011 12:43 am

Adam Hair wrote:All I can say is I don't believe apples are being compared to apples here. All human players can learn. Most chess
engines do not have learning functions. If most engines did have learning functions, then the way engines are tested
might be different.
This would not seem to follow, since most engines do have pondering, yet that fact has not changed the practice of testing chess AIs with pondering switched off.
