Ideal hash size for short time controls

General discussion about computer chess...
hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: Ideal hash size for short time controls

Post by hyatt » Tue Jun 29, 2010 8:20 pm

Mincho Georgiev wrote:
hyatt wrote:Some misinformation above, so to clarify.

1. Bigger hash can definitely be bad, because the larger your program's memory footprint, the more physical pages of RAM you use, and the more TLB entries you need to avoid "walking the tables" to convert virtual to physical addresses. If you blow the TLB, you can (on 64-bit hardware) add 4 extra memory accesses to each primary memory access because of the virtual-to-physical mapping that must be done. So more RAM = more TLB entries needed, and when you run out, memory access time goes _way_ up.

2. Bigger hash makes the search more efficient, until you reach the point where you can store the entire tree you are searching. Going beyond that only stresses the TLB and slows things down. The idea is that you want a hash size that is about the size of the tree you want to store in the table. If you search 1M nodes per second, and you are playing 1 second per move, you need a hash size of 1M. A little bigger might help marginally, but you can quickly lose this marginal gain due to TLB thrashing. If the engine does not hash q-search nodes (Crafty is an example), then you need about 10% of the total nodes searched as a hash size. In the above, for Crafty, 100K entries would be fine. 1M would not hurt either, as that is not big enough to cause problems.

A good processor might have 1024 TLB entries. That means for 1024 virtual pages, you get instant virtual-to-real translations. At a 4K-byte page size, that means going beyond 4MB total will start to stress the TLB. Modern processors can also use "large pages", which can be either 2MB or 4MB depending on the architecture. That lets you address roughly 500-1000x as much memory using the same TLB.
I'll say it again: I don't see how a smaller hash table would bring a benefit over a bigger one. I didn't say that bigger is better. Maybe I'm missing something, so help me if I am, but we are not talking about a human playing against an engine. The topic was about an engine match, right? So the conditions (hash size) for the two engines will be equal? So the lookaside-buffer issues will apply to both engines. Probably this will have some impact on randomness, but I don't see what else?
Several issues. Say program A is Crafty, and program B is something that hashes q-search. To reach the same level of efficiency, program B needs a hash table 10x larger than Crafty's, in terms of entries. It is also not a given that an entry is the same size in both. So it is not quite as cut-and-dried as you might think. You are assuming all else is equal, but it might not be, as you can see...
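
To make the arithmetic above concrete, here is a small throwaway C sketch (my own illustration, not code from any of the engines mentioned) that turns the rule of thumb into numbers: a suggested entry count from nodes-per-second and time-per-move, with the ~10x reduction for an engine that does not hash q-search nodes, plus the TLB coverage for 4KB and 2MB pages. The 90% q-search share, the 1024 TLB entries, and the rates are the figures assumed in the posts, not measurements.

    #include <stdio.h>

    int main(void) {
        double nps           = 1e6;  /* nodes per second (assumed, as in the post) */
        double seconds       = 1.0;  /* seconds per move (assumed) */
        double qsearch_share = 0.9;  /* ~90% of nodes are q-search nodes (assumed) */
        int hashes_qsearch   = 0;    /* 0 = Crafty-style: q-search not hashed */

        double nodes   = nps * seconds;
        double entries = hashes_qsearch ? nodes : nodes * (1.0 - qsearch_share);
        printf("suggested hash entries: %.0f\n", entries);

        /* TLB coverage = TLB entries * page size: memory reachable without a
           page-table walk.  1024 entries is the "good processor" of the post. */
        double tlb = 1024.0;
        printf("TLB coverage, 4KB pages: %.1f MB\n", tlb * 4096 / (1024.0 * 1024));
        printf("TLB coverage, 2MB pages: %.1f GB\n",
               tlb * 2 * 1024 * 1024 / (1024.0 * 1024 * 1024));
        return 0;
    }

With those inputs it prints 100000 entries, 4.0 MB of 4KB-page coverage, and 2.0 GB of 2MB-page coverage, which matches the figures quoted above.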

Mincho Georgiev
Posts: 31
Joined: Thu Jun 10, 2010 7:30 am
Contact:

Re: Ideal hash size for short time controls

Post by Mincho Georgiev » Tue Jun 29, 2010 9:06 pm

Sentinel wrote:If you carefully read Charles's post that started this thread, you'll see that the hash size was not the same for both engines, and the thing Charles noticed is that the engine with the smaller hash was playing stronger at the given TC.
That explains things; I'd missed that.
hyatt wrote: Several issues. Say program A is Crafty, and program B is something that hashes q-search. To reach the same level of efficiency, program B needs a hash table 10x larger than Crafty's, in terms of entries. It is also not a given that an entry is the same size in both. So it is not quite as cut-and-dried as you might think. You are assuming all else is equal, but it might not be, as you can see...
Another misreading, sorry about that. I assumed too quickly that we were talking about two versions of the same program; honestly, I don't know why. You're right, of course.
Thank you!

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Ideal hash size for short time controls

Post by BB+ » Wed Jun 30, 2010 7:18 am

The idea is that you want a hash size that is about the size of the tree you want to store in the table. If you search 1M nodes per second, and you are playing 1 second per move, you need a hash size of 1M.
I'm not sure whether by "size" you mean "entries", or "bytes". I think 1 hash entry per node is a bit of overkill, even if you use hash in qsearch. As I noted above, IvanHoe claims to fill hash at 10-15 MB/s (on one cpu), while it searches somewhat under 2Mn/s, yielding a factor of 5 or so -- the hash entries are 128-bit, and qsearch is hashed, so this could be part of any difference. There is also the question of how long (that is, for how many consecutive searches) a hash entry is likely to be useful.
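
For what it's worth, BB+'s reasoning can be written down as a couple of lines of arithmetic. The sketch below is only illustrative: the 16-byte entry size is the 128-bit figure stated above, but the fill rate and node rate are placeholders, and the resulting ratio depends on whether the reported "fill" counts every store or only previously-empty entries, so it need not reproduce the quoted factor exactly.

    #include <stdio.h>

    int main(void) {
        double fill_mb_per_s = 12.5;   /* placeholder inside the 10-15 MB/s range */
        double entry_bytes   = 16.0;   /* 128-bit entries, as stated above */
        double nodes_per_s   = 1.8e6;  /* "somewhat under 2Mn/s" (placeholder) */

        double stores_per_s = fill_mb_per_s * 1e6 / entry_bytes;
        printf("entries written per second: %.2fM\n", stores_per_s / 1e6);
        printf("nodes searched per entry written: %.1f\n",
               nodes_per_s / stores_per_s);
        return 0;
    }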

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: Ideal hash size for short time controls

Post by hyatt » Wed Jun 30, 2010 8:46 pm

BB+ wrote:
The idea is that you want a hash size that is about the size of the tree you want to store in the table. If you search 1M nodes per second, and you are playing 1 second per move, you need a hash size of 1M.
I'm not sure whether by "size" you mean "entries", or "bytes". I think 1 hash entry per node is a bit of overkill, even if you use hash in qsearch. As I noted above, IvanHoe claims to fill hash at 10-15 MB/s (on one cpu), while it searches somewhat under 2Mn/s, yielding a factor of 5 or so -- the hash entries are 128-bit, and qsearch is hashed, so this could be part of any difference. There is also the question of how long (that is, for how many consecutive searches) a hash entry is likely to be useful.
It may well be a bit of overkill, since we obviously get significant "hits" which don't require additional stores, and we also overwrite the same position multiple times as the depth advances. But it is a reasonable first-order approximation. One could refine it to some fraction N times the total nodes expected to be searched, but it is doubtful N would change by a whole order of magnitude, so the first approximation would not be that far off.

This number would probably vary program by program, since the deeper you go (selectively), the greater the probability of re-using old entries.

Also, "hash-full" is not a very good measure. When you reach 90% you are into heavy-overwriting territory, losing important information. A better measure is "how many times do I overwrite a different position from the current search?" since that is an issue of lost information. Even 50% full will result in some lost information due to different positions hashing to the same entry, it will just happen less frequently.

Mincho Georgiev
Posts: 31
Joined: Thu Jun 10, 2010 7:30 am
Contact:

Re: Ideal hash size for short time controls

Post by Mincho Georgiev » Thu Jul 01, 2010 7:51 am

hyatt wrote:Also, "hash-full" is not a very good measure. When you reach 90% you are into heavy-overwriting territory, losing important information. A better measure is "how many times do I overwrite a different position from the current search?" since that is an issue of lost information. Even 50% full will result in some lost information due to different positions hashing to the same entry, it will just happen less frequently.
The 'hashfull' concept was always quite unclear to me, the way it's described in the UCI 'protocol', I mean. It could be interpreted in two ways: 1. measuring the entries that were stored for the first time ('key = 0'), which makes no sense at all, since 100% will be reached pretty fast, unless someone needs to see when the hash gets cleared and the counter starts all over again; and 2. measuring the entries replaced by the current search. Does anyone have an option 3 in mind?

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: Ideal hash size for short time controls

Post by hyatt » Thu Jul 01, 2010 6:31 pm

Mincho Georgiev wrote:
hyatt wrote:Also, "hash-full" is not a very good measure. When you reach 90% you are into heavy-overwriting territory, losing important information. A better measure is "how many times do I overwrite a different position from the current search?" since that is an issue of lost information. Even 50% full will result in some lost information due to different positions hashing to the same entry, it will just happen less frequently.
The 'hashfull' concept was always quite unclear to me, the way it's described in the UCI 'protocol', I mean. It could be interpreted in two ways: 1. measuring the entries that were stored for the first time ('key = 0'), which makes no sense at all, since 100% will be reached pretty fast, unless someone needs to see when the hash gets cleared and the counter starts all over again; and 2. measuring the entries replaced by the current search. Does anyone have an option 3 in mind?
What I believe everyone does is to use an "age" field that indicates whether an entry was saved during this search (not iteration but full search) or during a previous search. Then you can look at this field to see how many entries are old and how many are new, to get this hash full %. How you actually compute it is not so important as what it actually means.

It is easily possible to see more critical overwrites at 80% full than at 95% full, depending on how uniform your hash signature computation is.
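
Continuing the toy structures from the sketch a few posts up, the age-based bookkeeping could look roughly like this. It assumes the table has at least 1000 entries and reports the fraction of sampled entries written by the current search; UCI's "hashfull" info value is conventionally a per-mille number, which is why 1000 is a convenient sample size.

    void new_search(void) {
        search_age++;   /* one "age" per root search, not per iteration */
    }

    int hashfull_permille(void) {
        /* Sample the first 1000 slots; an entry counts as "full" only if it
           was written by the current search (same age). */
        int filled = 0;
        for (int i = 0; i < 1000; i++)
            if (table[i].key != 0 && table[i].age == search_age)
                filled++;
        return filled;   /* 0..1000 */
    }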

Mincho Georgiev
Posts: 31
Joined: Thu Jun 10, 2010 7:30 am
Contact:

Re: Ideal hash size for short time controls

Post by Mincho Georgiev » Thu Jul 01, 2010 7:21 pm

hyatt wrote:
Mincho Georgiev wrote:
hyatt wrote:Also, "hash-full" is not a very good measure. When you reach 90% you are into heavy-overwriting territory, losing important information. A better measure is "how many times do I overwrite a different position from the current search?" since that is an issue of lost information. Even 50% full will result in some lost information due to different positions hashing to the same entry, it will just happen less frequently.
The 'hashfull' concept was always quite unclear to me, the way it's described in the UCI 'protocol', I mean. It could be interpreted in two ways: 1. measuring the entries that were stored for the first time ('key = 0'), which makes no sense at all, since 100% will be reached pretty fast, unless someone needs to see when the hash gets cleared and the counter starts all over again; and 2. measuring the entries replaced by the current search. Does anyone have an option 3 in mind?
What I believe everyone does is to use an "age" field that indicates whether an entry was saved during this search (not iteration but full search) or during a previous search. Then you can look at this field to see how many entries are old and how many are new, to get this hash full %. How you actually compute it is not so important as what it actually means.

It is easily possible to see more critical overwrites at 80% full than at 95% full, depending on how uniform your hash signature computation is.
Yes, that is what I meant by the 2nd interpretation. Thank you!

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: Ideal hash size for short time controls

Post by hyatt » Thu Jul 01, 2010 10:05 pm

Mincho Georgiev wrote:
hyatt wrote:Some misinformation above, so to clarify.

1. Bigger hash can definitely be bad, because the larger your program's memory footprint, the more physical pages of RAM you use, and the more TLB entries you need to avoid "walking the tables" to convert virtual to physical addresses. If you blow the TLB, you can (on 64-bit hardware) add 4 extra memory accesses to each primary memory access because of the virtual-to-physical mapping that must be done. So more RAM = more TLB entries needed, and when you run out, memory access time goes _way_ up.

2. Bigger hash makes the search more efficient, until you reach the point where you can store the entire tree you are searching. Going beyond that only stresses the TLB and slows things down. The idea is that you want a hash size that is about the size of the tree you want to store in the table. If you search 1M nodes per second, and you are playing 1 second per move, you need a hash size of 1M. A little bigger might help marginally, but you can quickly lose this marginal gain due to TLB thrashing. If the engine does not hash q-search nodes (Crafty is an example), then you need about 10% of the total nodes searched as a hash size. In the above, for Crafty, 100K entries would be fine. 1M would not hurt either, as that is not big enough to cause problems.

A good processor might have 1024 TLB entries. That means for 1024 virtual pages, you get instant virtual-to-real translations. At a 4K-byte page size, that means going beyond 4MB total will start to stress the TLB. Modern processors can also use "large pages", which can be either 2MB or 4MB depending on the architecture. That lets you address roughly 500-1000x as much memory using the same TLB.
I'll say it again: I don't see how a smaller hash table would bring a benefit over a bigger one. I didn't say that bigger is better. Maybe I'm missing something, so help me if I am, but we are not talking about a human playing against an engine. The topic was about an engine match, right? So the conditions (hash size) for the two engines will be equal? So the lookaside-buffer issues will apply to both engines. Probably this will have some impact on randomness, but I don't see what else?

Back to my original point: virtual address space, the size of which determines how many TLB entries we need to avoid the "table walk" when translating virtual to physical addresses. If the virtual address space of both programs fits within what your TLB entries can cover, this is not an issue. But what about a program that is quite large, whether because it has large code blocks (C++ templates can turn a small amount of C++ into a huge number of binary procedures) or large arrays (bitbases, magic tables, etc.), and then there are the various hash tables on top of that?

So just saying "if hash size is equal there is no problem" is not quite correct. It would be better phrased as "if the TLB requirements are equal, then there is no problem." Unfortunately this is quite difficult to detect unless you want to use some asm code to access the hardware performance-monitor registers to see what is happening with regard to address translations.
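
One practical way to shrink the TLB requirements of a big hash on Linux is the "large pages" mentioned earlier in the thread. The following is only a sketch of one approach, not something from any of the posts: it asks mmap for 2MB huge pages (the size must be a multiple of the huge-page size, and huge pages must have been reserved, e.g. via vm.nr_hugepages) and falls back to ordinary pages if that fails.

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>

    void *alloc_hash(size_t bytes) {
        /* Try 2MB huge pages first: far fewer TLB entries needed to cover
           the table than with 4KB pages. */
        void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED) {
            fprintf(stderr, "huge pages unavailable, falling back to 4KB pages\n");
            p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        }
        return p == MAP_FAILED ? NULL : p;
    }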

Mincho Georgiev
Posts: 31
Joined: Thu Jun 10, 2010 7:30 am
Contact:

Re: Ideal hash size for short time controls

Post by Mincho Georgiev » Fri Jul 02, 2010 6:18 am

Yes, I completely forgot that TLB entries will be needed for the rest of the program's data, for the pawn hash and so on. There is no easy way to determine this. An unaligned table in one of the programs, for example, changes everything.
Another issue: the virtual memory backing a large hash table could be split by the OS between physical memory and the page file, right? If that happens, my above comment becomes completely irrelevant. :roll:

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham
Contact:

Re: Ideal hash size for short time controls

Post by hyatt » Fri Jul 02, 2010 12:47 pm

Hopefully no one goes large enough to actually start paging, or performance dies anyway.

:)
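
On the paging point: if someone does configure a hash big enough to risk hitting the page file, one defensive option (my aside, not something from the thread) is to pin it with mlock(); that usually needs a raised RLIMIT_MEMLOCK or elevated privileges, so failure should be treated as harmless.

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>

    void pin_hash(void *hash, size_t bytes) {
        /* Ask the OS to keep the hash resident in RAM; non-fatal if refused,
           the table may simply be paged out under memory pressure. */
        if (mlock(hash, bytes) != 0)
            perror("mlock");
    }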
