Strange results when TT's and quiescence are combined??

Code, algorithms, languages, construction...
Post Reply
Bombax
Posts: 3
Joined: Tue Mar 26, 2013 2:03 am

Strange results when TT's and quiescence are combined??

Post by Bombax » Tue Mar 26, 2013 2:43 am

Hi all.

We're having some problems with a chess program we're developing. The program right now uses negascout (fail-soft) with PV move ordering, iterative deepening and an always-replace transposition table. It also uses SEE-ordered quiescence search (fail-hard).

The problem occurs when both quiescence search and transposition tables are activated. In this case the program produces moves/scores that seem "off". The test position at the moment is this one:

r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1

The program plays the same move (Bxa6) but has a worse score with both activated (-226) than with transposition tables off (732) or both off (1034), and plays worse moves after this one. We're wondering if we've made an implementation mistake.

I have taken the time to produce a clean pseudo-code version of our program below. I would be delighted if anyone could explain what we might be doing wrong, or what some good steps to debug the program is, or test positions to debug, and so on, so we can proceed. Thanks!

Code: Select all

int maxd;
void start(Position pos, int d) {
    for (maxd=1; maxd<d; d++)
        negascout_best(pos, 0);
}

int negascout_best(Position pos, int d) {
    if (fifty_move_rule(pos)) return 0;
    if (repetition_rule(pos)) return 0;
    
    if (tt_contains_entry(pos.zobrist)) {
        entry = tt_get_entry(pos.zobrist);
        if (entry.depth >= maxd-d && is_exact_score(entry)) {
            return entry.score;
        }
    }

    if (d == maxd) {
        score = quiescence(pos, 0, -INF_SCORE, INF_SCORE);
        tt_replace(pos.zobrist, score, EXACT_SCORE);
        return score;
    }

    b = INF_SCORE; maxscore = -INF_SCORE;
    children = get_children_pv(pos); // ordered using the pv
    for (all children) {
        if (child_is_pv(child))
            score = -negascout_best(child, d+1);
        else
            score = -negascout(child, d+1, -b, -maxscore);
        if (score > maxscore) { // score improved
            if (b == INF_SCORE || d >= maxd-2) { // acc. to negascout paper score is always correct up to this depth
                maxscore = score;
            } else { // re-search
                maxscore = -negascout(child, d+1, -INF_SCORE, -score);
            }
            savepv(child);
        }
        b = maxscore+1; // set null window
    }

    if (had_children) {
        tt_replace(pos.zobrist, maxscore, EXACT_SCORE);
        return maxscore;
    } else {
        if (checkmate(pos)) {
            tt_replace(pos.zobrist, -CHECKMATE_SCORE, EXACT_SCORE);
            return -CHECKMATE_SCORE;
        } else { // stalemate
            tt_replace(pos.zobrist, 0, EXACT_SCORE);
            return 0;
        }
    }
}

int negascout(Position pos, int d, int alpha, int beta) {
    if (fifty_move_rule(pos)) return 0;
    if (repetition_rule(pos)) return 0;
    
    if (tt_contains_entry(pos.zobrist)) {
        entry = tt_get_entry(pos.zobrist);
        if (entry.depth >= maxd-d) {
            if (is_exact_score(entry))
                return entry.score;
            else if (is_lower_bound(entry) && entry.score >= beta)
                return entry.score;
            else if (is_upper_bound(entry) && entry.score <= alpha)
                return entry.score;
        }
    }

    if (d == maxd) {
        score = quiescence(pos, 0, alpha, beta);
        tt_replace(pos.zobrist, score, get_score_type(score, alpha, beta));
        return score;
    }

    b = beta; maxscore = -INF_SCORE; a = alpha;
    children = get_children(pos);
    for (all children) {
        score = -negascout(child, d+1, -b, -a);
        if (score > maxscore) { // score improved
            if (b == beta || d >= maxd-2) { // acc. to negascout paper score is always correct up to this depth
                maxscore = score;
            } else { // re-search
                maxscore = -negascout(child, d+1, -beta, -score);
            }
            savepv(child);
        }
        if (maxscore >= beta) {
            tt_replace(pos.zobrist, maxscore, LOWER_BOUND)
            return maxscore;
        }
        a = max(maxscore, alpha);
        b = a+1; // set null window
    }

    if (had_children) {
        tt_replace(pos.zobrist, maxscore, get_score_type(maxscore, alpha, beta));
        return maxscore;
    } else {
        if (checkmate(pos)) {
            tt_replace(pos.zobrist, -CHECKMATE_SCORE, get_score_type(-CHECKMATE_SCORE, alpha, beta));
            return -CHECKMATE_SCORE;
        } else { // stalemate
            tt_replace(pos.zobrist, 0, get_score_type(0, alpha, beta));
            return 0;
        }
    }
}

int quiescence(Position pos, int depth, int alpha, int beta) {
    stand_pat = pos.to_move * evaluate(pos);

    if (stand_pat >= beta) {
        return beta;
    }
    if (stand_pat > alpha) {
        alpha = stand_pat;
    }

    children = get_dramatic_moves(pos); // gets captures whose SEE value is larger than 0 (and ordered after them)
    for (all children) {
        score = -quiescence(child, depth+1, -beta, -alpha);
        if (score >= beta) {
            return beta;
        }
        if (score > alpha) {
            alpha = score;
        }
    }
    return alpha;
}

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Strange results when TT's and quiescence are combined??

Post by BB+ » Tue Mar 26, 2013 4:59 am

Admittedly pseudocode is not always perfect, but I don't see where you set entry.depth in the tt?

There is only tt_replace(), which doesn't seem to take a depth argument.

Code: Select all

tt_replace(pos.zobrist, maxscore, EXACT_SCORE);

Bombax
Posts: 3
Joined: Tue Mar 26, 2013 2:03 am

Re: Strange results when TT's and quiescence are combined??

Post by Bombax » Tue Mar 26, 2013 2:24 pm

My bad, I did indeed not write the pseudo-code out completely correctly. Here it is with that problem corrected. Thanks for looking at it!

Code: Select all

int maxd;

tt_replace(u64 zobrist, int score, int scoretype, int depth);

void start(Position pos, int d) {
    for (maxd=1; maxd<d; d++)
        negascout_best(pos, 0);
}

int negascout_best(Position pos, int d) {
    if (fifty_move_rule(pos)) return 0;
    if (repetition_rule(pos)) return 0;
    
    if (tt_contains_entry(pos.zobrist)) {
        entry = tt_get_entry(pos.zobrist);
        if (entry.depth >= maxd-d && is_exact_score(entry)) {
            return entry.score;
        }
    }

    if (d == maxd) {
        score = quiescence(pos, 0, -INF_SCORE, INF_SCORE);
        tt_replace(pos.zobrist, score, EXACT_SCORE, 0);
        return score;
    }

    b = INF_SCORE; maxscore = -INF_SCORE;
    children = get_children_pv(pos); // ordered using the pv
    for (all children) {
        if (child_is_pv(child))
            score = -negascout_best(child, d+1);
        else
            score = -negascout(child, d+1, -b, -maxscore);
        if (score > maxscore) { // score improved
            if (b == INF_SCORE || d >= maxd-2) { // acc. to negascout paper score is always correct up to this depth
                maxscore = score;
            } else { // re-search
                maxscore = -negascout(child, d+1, -INF_SCORE, -score);
            }
            savepv(child);
        }
        b = maxscore+1; // set null window
    }

    if (had_children) {
        tt_replace(pos.zobrist, maxscore, EXACT_SCORE, maxd-d);
        return maxscore;
    } else {
        if (checkmate(pos)) {
            tt_replace(pos.zobrist, -CHECKMATE_SCORE, EXACT_SCORE, maxd-d);
            return -CHECKMATE_SCORE;
        } else { // stalemate
            tt_replace(pos.zobrist, 0, EXACT_SCORE, maxd-d);
            return 0;
        }
    }
}

int negascout(Position pos, int d, int alpha, int beta) {
    if (fifty_move_rule(pos)) return 0;
    if (repetition_rule(pos)) return 0;
    
    if (tt_contains_entry(pos.zobrist)) {
        entry = tt_get_entry(pos.zobrist);
        if (entry.depth >= maxd-d) {
            if (is_exact_score(entry))
                return entry.score;
            else if (is_lower_bound(entry) && entry.score >= beta)
                return entry.score;
            else if (is_upper_bound(entry) && entry.score <= alpha)
                return entry.score;
        }
    }

    if (d == maxd) {
        score = quiescence(pos, 0, alpha, beta);
        tt_replace(pos.zobrist, score, get_score_type(score, alpha, beta), 0);
        return score;
    }

    b = beta; maxscore = -INF_SCORE; a = alpha;
    children = get_children(pos);
    for (all children) {
        score = -negascout(child, d+1, -b, -a);
        if (score > maxscore) { // score improved
            if (b == beta || d >= maxd-2) { // acc. to negascout paper score is always correct up to this depth
                maxscore = score;
            } else { // re-search
                maxscore = -negascout(child, d+1, -beta, -score);
            }
            savepv(child);
        }
        if (maxscore >= beta) {
            tt_replace(pos.zobrist, maxscore, LOWER_BOUND)
            return maxscore;
        }
        a = max(maxscore, alpha);
        b = a+1; // set null window
    }

    if (had_children) {
        tt_replace(pos.zobrist, maxscore, get_score_type(maxscore, alpha, beta), maxd-d);
        return maxscore;
    } else {
        if (checkmate(pos)) {
            tt_replace(pos.zobrist, -CHECKMATE_SCORE, get_score_type(-CHECKMATE_SCORE, alpha, beta), maxd-d);
            return -CHECKMATE_SCORE;
        } else { // stalemate
            tt_replace(pos.zobrist, 0, get_score_type(0, alpha, beta), maxd-d);
            return 0;
        }
    }
}

int quiescence(Position pos, int depth, int alpha, int beta) {
    stand_pat = pos.to_move * evaluate(pos);

    if (stand_pat >= beta) {
        return beta;
    }
    if (stand_pat > alpha) {
        alpha = stand_pat;
    }

    children = get_dramatic_moves(pos); // gets captures whose SEE value is larger than 0 (and ordered after them)
    for (all children) {
        score = -quiescence(child, depth+1, -beta, -alpha);
        if (score >= beta) {
            return beta;
        }
        if (score > alpha) {
            alpha = score;
        }
    }
    return alpha;
}

Bombax
Posts: 3
Joined: Tue Mar 26, 2013 2:03 am

Re: Strange results when TT's and quiescence are combined??

Post by Bombax » Tue Mar 26, 2013 6:37 pm

After doing some debugging, I started to wonder if a node might put into the table an *exact* score when it is done with its searching, yet in the children nodes that very score was not exact. Could someone look through my pseudo-code and perhaps see if there's a problem with the windows, or such?

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Strange results when TT's and quiescence are combined??

Post by BB+ » Wed Mar 27, 2013 5:36 am

What happens in negascout when the maxscore is less than alpha? You return maxscore (not in the [alpha,beta] interval), and store alpha? (Again the code is not clear, as get_store_type() returns "bound" I guess, but is that "bound" stored using alpha or maxscore?). Note that in quiescense you return alpha when all moves are insufficient, which is then passed up in the "d==maxd" case of negascout. I don't know if this is a problem or not.

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: Strange results when TT's and quiescence are combined??

Post by hyatt » Wed Mar 27, 2013 5:15 pm

BB+ wrote:What happens in negascout when the maxscore is less than alpha? You return maxscore (not in the [alpha,beta] interval), and store alpha? (Again the code is not clear, as get_store_type() returns "bound" I guess, but is that "bound" stored using alpha or maxscore?). Note that in quiescense you return alpha when all moves are insufficient, which is then passed up in the "d==maxd" case of negascout. I don't know if this is a problem or not.

Should work fine either way. If you store or return a score outside the alpha/beta window, this is called "fail-soft". If you only store the actual bound when a score lies outside the a-b window, this is called fail-hard. If you do something like mtd(f) then fail-soft is better as a fail-high at the root gives you some idea how far to shift the window for the next search, where with fail-hard, you just have the current beta score which is not much of a hint since it is just alpha+1...

I use fail-hard myself, as it makes the tree search easier to understand on those occasions where I have to dump the tree while debugging...

Post Reply