To kick off some technical discussions

Richard Vida · Post by **Richard Vida** » Sat Jun 12, 2010 12:17 pm

mcostalba wrote:
Rebel wrote: I stopped developing mine some years ago; if memory serves me well, my exclusion list is as follows:

1) No LMR in the last 3 plies of the search, this because of the use of futility pruning;

2) Always search at least 3 moves;

3) Hash-move;

4) Captures, guess I have to try your idea skipping bad captures;

5) Queen promotions (no minors);

6) Extended moves;

7) Moves that give check;

8) Moves that escape from check;

9) Static mate threats;

10) Killer moves;
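
Read as code, the exclusion list above amounts to a single predicate that decides whether a move is eligible for reduction. What follows is a minimal Python sketch, not Rebel's actual code: the `Move` record, its flag names, and the reading of "at least 3 moves" as a 1-based move-count threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Move:
    # Hypothetical flags; a real engine derives these from its own move and board types.
    is_hash_move: bool = False
    is_capture: bool = False
    is_queen_promotion: bool = False
    is_extended: bool = False
    gives_check: bool = False
    is_killer: bool = False


def lmr_allowed(move, depth, move_count, in_check, mate_threat):
    """Return True if `move` may be searched with a reduced depth."""
    if depth <= 3:                  # 1) no LMR in the last 3 plies
        return False
    if move_count <= 3:             # 2) always search at least 3 moves in full
        return False
    if in_check or mate_threat:     # 8) check evasions, 9) static mate threats
        return False
    excluded = (
        move.is_hash_move           # 3)
        or move.is_capture          # 4)
        or move.is_queen_promotion  # 5)
        or move.is_extended         # 6)
        or move.gives_check         # 7)
        or move.is_killer           # 10)
    )
    return not excluded
```

Keeping the cheap node-level tests (depth, move count, check) ahead of the per-move flags is only a readability choice here; the order does not change the result.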

Apart from bad captures, which we still have to test (but I guess that because SF has razoring starting from depth 4 plies, the benefit of reducing the search of bad captures should be mitigated, given that in that case the position evaluation is far below beta and so razored anyway), your list is the same as SF's with the exception of point (10): we currently do not have special code to avoid reducing killer moves (perhaps something else to try).

As for not reducing killer moves: In Critter I am deciding this by looking at SEE values of good captures. If there was any capture with SEE >= 100 then killer moves are subject to LMR. If there were only SEE == 0 captures then killer moves are not reduced. Of course other usual rules are also considered (movecnt > 3 && !extension ...)

Richard
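
Richard's killer-move rule can be sketched roughly as below. This is a hedged Python illustration, not Critter's code: the function name, the idea of passing in the SEE values of the node's good captures as a list, and the treatment of a node with no captures at all (handled here like the "only SEE == 0" case) are assumptions.

```python
PAWN_VALUE = 100  # assumed centipawn scale, matching the "SEE >= 100" threshold in the post


def killer_may_be_reduced(good_capture_see, move_count, extended):
    """Decide whether a killer move is subject to LMR at this node.

    good_capture_see: SEE values of the good (non-losing) captures at the node.
    """
    # Usual LMR preconditions mentioned in the post: movecnt > 3 && !extension
    if move_count <= 3 or extended:
        return False
    # If some capture wins at least a pawn, the killer is reduced like other quiet moves;
    # if every good capture is merely an even exchange (SEE == 0), the killer is not reduced.
    return any(see >= PAWN_VALUE for see in good_capture_see)
```

For example, `killer_may_be_reduced([0, 300], 5, False)` allows the reduction, while `killer_may_be_reduced([0, 0], 5, False)` does not.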

Chris Whittington · Post by **Chris Whittington** » Sat Jun 12, 2010 12:17 pm

I've bolded the sections below where both posters are saying the same thing without actually saying it.

The testing methodology can give ELO increases but these increases are NOT mapping to strength or chess skill.

Computer ELO and computer ELO lists are a kind of misleading nonsense.

mcostalba wrote:
thorstenczub wrote:
i doubt that the 15 ELO you "realized" by testing against yourself will be shown against other engines too.

I doubt it too; normally it is smaller, but the key aspect is that it is normally smaller but with the same sign.

This is fundamental because it allows self-testing to be used as a kind of lever, a magnifying lens, to see whether a patch is good or bad, even if the absolute value of a patch is very small against other engines but turns out to be measurable in self-play.

So the bottom line is that as long as the sign is the same, the "incestuous" effect is a good thing to have IMHO, because it artificially increases the difference enough to move it above the noise level.
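
To put rough numbers on "above the noise level": the logistic Elo model converts a match score into a rating difference, and the uncertainty of that score shrinks with the square root of the number of games. The Python sketch below is only an illustration; the function name and the 1.96 normal-approximation interval are choices made here, not something taken from the thread, and it assumes the score is strictly between 0 and 1.

```python
import math


def elo_from_results(wins, draws, losses):
    """Return (elo_diff, margin95) from game counts, using the logistic Elo model."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0)
    var = (
        wins * (1 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0 - score) ** 2
    ) / n
    stderr = math.sqrt(var / n)

    def to_elo(s):
        return -400 * math.log10(1 / s - 1)

    # 95% interval under a normal approximation, mapped through the Elo curve
    low, high = to_elo(score - 1.96 * stderr), to_elo(score + 1.96 * stderr)
    return to_elo(score), (high - low) / 2
```

Because the margin falls only as one over the square root of the game count, a patch whose effect is amplified in self-play needs far fewer games to rise above the margin than the same patch measured against other engines.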

thorstenczub wrote: in the stoneage we had to play test games by hand.
then the autoplayer was invented and we needed a pc for each program.
so 8 or 12 pc's.

then came those wonderful GUIs such as ARENA for windows. suddenly you were able
to test eng-eng matches and engine-tournaments on 1 pc.

and the next industrial method of testing was the autotester with very fast games
and only statistical measurement without even looking into the games.

Yes, then came cutechess-cli (far better and higher quality than crappy Arena), and the road map you have summarized continues in that direction. So I don't see any reason to turn and look back; instead, go ahead along the lines you have already laid out.

thorstenczub wrote: maybe one should combine the methods.
maybe not

IMHO this is just an anthropocentric view that has no scientific basis apart from historical reasons, and it is just a waste of resources.

The sooner you realize that the quality metric applied to engines cannot have human elements, the quicker your engine will progress.

thorstenczub · Post by **thorstenczub** » Sat Jun 12, 2010 12:37 pm

mcostalba wrote: IMHO this is just an anthropocentric view that has no scientific basis apart from historical reasons, and it is just a waste of resources.

The sooner you realize that the quality metric applied to engines cannot have human elements, the quicker your engine will progress.

forgive me for being an imperfect human

but as long as i see an engine lose because it has no clue that KBB-K is a draw when the bishops have the same color, or that you cannot mate with 2 knights, or that the wrong-colored bishop is a draw,
...
IMO a good chess engine should identify those things without tablebases.

and when i cannot see the games, i cannot identify those weaknesses in 2800 or 3000 ELO engines.

it still astonishes me to see those things happen in programs that are that strong.

it gives them such a strange mechanical computerized skin... it does not fit in my paradigm of intelligent programs or intelligent methods. it's not human. it's machine-like.

i am still not used to this.

mcostalba · Post by **mcostalba** » Sat Jun 12, 2010 12:50 pm

Richard Vida wrote: As for not reducing killer moves: In Critter I am deciding this by looking at SEE values of good captures. If there was any capture with SEE >= 100 then killer moves are subject to LMR. If there were only SEE == 0 captures then killer moves are not reduced. Of course other usual rules are also considered (movecnt > 3 && !extension ...)

Richard

This is very sophisticated indeed. I think I will try the simpler form of not reducing, and then add more logic if needed.

I have seen that LMR and the static value of the position do not mix easily. Namely, I tried reducing less at nodes with a static value near beta... it failed.

It seems LMR does not care whether a node is near beta or very far from it... this is strange indeed and counter-intuitive, but so far we have had no success with testing in that direction. I say this to give a bit of rationale for my first attempt not to consider SEE (and indirectly the position value) when exempting killers from LMR.

BTW, thanks Richard
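
For readers unfamiliar with the idea, "reducing less at nodes with static value near beta" might look something like the sketch below. Everything in it is hypothetical: the function name, the `margin` parameter and its default, and the one-ply adjustment are invented for illustration and are not Stockfish code. The post reports that this kind of adjustment did not test well.

```python
def lmr_reduction(base_reduction, static_eval, beta, margin=50):
    """Reduce one ply less when the static evaluation is within `margin` centipawns of beta."""
    if abs(static_eval - beta) <= margin:
        return max(0, base_reduction - 1)
    return base_reduction
```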

mcostalba · Post by **mcostalba** » Sat Jun 12, 2010 12:53 pm

thorstenczub wrote: it's not human. it's machine-like.

I think we are saying the same thing

I am just a bit more used to this idea, so I accept that for machine-like stuff you need a machine-like metric.

Anyhow, your observation that KBB-K is a draw is correct: it is a missing endgame feature if an engine doesn't spot a draw in that case.
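
A material-only recognizer for two of the cases mentioned (same-colored bishops and two knights against a bare king) needs no tablebases. The Python sketch below is one possible shape for it; the board representation (a dict from square name to piece letter) and the function names are assumptions made for this example. Note that KNN-K is not a dead position under the rules, since mate positions exist, but mate cannot be forced, so scoring it as a draw is the usual practical choice. The wrong-colored bishop with a rook pawn depends on king placement and is not covered here.

```python
def square_color(square):
    """Return 0 for a dark square and 1 for a light one, given a name such as 'c1'."""
    file_index = ord(square[0]) - ord("a")
    rank_index = int(square[1]) - 1
    return (file_index + rank_index) % 2


def is_material_draw(strong_side, weak_side):
    """Return True if the strong side cannot force mate against a bare king.

    Both arguments map square names to piece letters, kings included.
    Only bishops on a single square color and up to two knights are recognized.
    """
    if set(weak_side.values()) != {"K"}:
        return False
    others = {sq: p for sq, p in strong_side.items() if p != "K"}
    pieces = sorted(others.values())
    if pieces and all(p == "B" for p in pieces):
        # Any number of bishops confined to one square color cannot mate
        return len({square_color(sq) for sq in others}) == 1
    return pieces in ([], ["N"], ["N", "N"])
```

For instance, bishops on c1 and e3 (both dark) against a lone king are reported as a draw, while bishops on c1 and f1 are not.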

Rebel · Post by **Rebel** » Sat Jun 12, 2010 1:08 pm

thorstenczub wrote:
forgive me for being an imperfect human

but as long as i see an engine lose because it has no clue that KBB-K is a draw when the bishops have the same color, or that you cannot mate with 2 knights, or that the wrong-colored bishop is a draw,
...
IMO a good chess engine should identify those things without tablebases.

and when i cannot see the games, i cannot identify those weaknesses in 2800 or 3000 ELO engines.

it still astonishes me to see those things happen in programs that are that strong.

it gives them such a strange mechanical computerized skin... it does not fit in my paradigm of intelligent programs or intelligent methods. it's not human. it's machine-like.

i am still not used to this.

It's really simple. Programmers spend their time on issues that matter. That has been search from day one, and it is still valid. The issues you mentioned above, good for 0.25 Elo at best, have a low priority. The culprit is the competition between computers; if there were only the human aspect, things would be different. Which program has solved the blocked-pawn-string problem and simply plays 1.Qxf4 from scratch?

4k3/8/3p3p/p1pPp1pP/PpP1PpP1/1P3P2/3Q4/4K3 w - -
No programmer cares because situations like these don't happen, but you are right, it does look stupid.

Ed

64x · Post by **64x** » Sat Jun 12, 2010 1:10 pm

FEN string seems to have a problem. The script triggers an error.
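
A quick structural check of the FEN posted above can narrow down what a parser might object to. The Python sketch below is a minimal validator written for this purpose, not the forum script; the function name and the messages are made up for the example, and it checks only the field count and the piece-placement field.

```python
def check_fen(fen):
    """Return a list of problems found in a FEN string (empty if none were found)."""
    problems = []
    fields = fen.split()
    if len(fields) != 6:
        problems.append(f"expected 6 fields, found {len(fields)}")
    ranks = fields[0].split("/") if fields else []
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, found {len(ranks)}")
    for i, rank in enumerate(ranks):
        # Digits count as that many empty squares, letters as one piece each
        width = sum(int(c) if c.isdigit() else 1 for c in rank)
        bad_char = any(not c.isdigit() and c not in "pnbrqkPNBRQK" for c in rank)
        if width != 8 or bad_char:
            problems.append(f"bad rank {8 - i}: {rank!r}")
    return problems
```

Run on `4k3/8/3p3p/p1pPp1pP/PpP1PpP1/1P3P2/3Q4/4K3 w - -`, this reports only that four of the six standard fields are present (the halfmove clock and fullmove number are missing); the piece placement itself is well-formed. Whether that is what the forum script tripped over is a guess.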

Jeremy Bernstein · Post by **Jeremy Bernstein** » Sat Jun 12, 2010 1:12 pm

64x wrote:FEN string seems to have a problem. The script triggers an error.

I just removed that error dialog from the script which does the generation, since it doesn't affect what's displayed. I'll do some more hacking on it as time permits.

Uly · Post by **Uly** » Sat Jun 12, 2010 1:27 pm

Jeremy Bernstein wrote:I just removed that error dialog

I'm still getting it.

orgfert · Post by **orgfert** » Sat Jun 12, 2010 5:12 pm

Chris Whittington wrote: I've bolded the sections below where both posters are saying the same thing without actually saying it.

The testing methodology can give ELO increases but these increases are NOT mapping to strength or chess skill.

Computer ELO and computer ELO lists are a kind of misleading nonsense.

How do you know?
