Houdini 1.03 is available

Robert Houdart
Posts: 180
Joined: Thu Jun 10, 2010 4:55 pm

Re: Houdini 1.03 is available

Post by Robert Houdart » Sat Jul 17, 2010 10:27 pm

Odeus37 wrote:About the "lockless hash table access" version, it would be nice to have more results with other processors.

If it's about the same speed on recent CPUs, and faster on older ones (like the 6% on my q6600), you could maybe promote this version to the new default one. :)
Exactly my thoughts.
There is one more parameter to consider: the size of the hash table. My above-mentioned values were for a fairly small hash size of 128 MB. With a 1024 MB hash (type "setoption name hash value 1024") I find that the standard version is slightly (1% to 2%) faster than the LOCKLESS. Complicated, isn't it? :?

Anyway, for the 6% performance gain you've found I'm quite willing to make an extra compile.
Odeus37 wrote:I noticed one more difference with the "lockless hash table access" version: it fills its hash more slowly than the normal version.

With 512 MB hash, 4 cores, on my q6600, in 30s :

- lockless version is about 65% hashtable full
- normal version is about 97% full
The hashfull value reported by the LOCKLESS version is incorrect (meaningless); please ignore it.

Robert

Post by Robert Houdart » Sat Jul 17, 2010 10:46 pm

Theo wrote:Hi Robert, you're absolutely wrong. You have made Houdini ONLY for Intel "i" and AMD processors. This is why the speed is much, much slower on the Intel QX Skulltrail and on Intel Xeons with 8 threads.
The user has been testing most compiles of the Ippolit family for months and has never had such problems.
Hello Theo, welcome to the forum. Are you the owner of the 8-core machine sirabc was reporting about?

You're correct that I could be absolutely wrong about this system; that's why I wrote in the conditional, "IF this processor is a double quad core". As you can see higher in the thread, this information came from sirabc.

It is also correct that I can only optimize and test Houdini on the computers to which I have access, which include Intel Core Duo and Core-i.

Note that the "tests made with most compiles of the Ippolit family" are not very relevant to Houdini. Houdini uses quite a different SMP architecture. As discussed above, it is quite possible that the LOCKLESS version provides a solution to the very low node counts obtained with Houdini on this 8-core system. We will not know until someone actually installs the LOCKLESS version on the 8-core system and performs some tests...

Robert

sirabc
Posts: 11
Joined: Wed Jun 23, 2010 8:18 pm

Post by sirabc » Sat Jul 17, 2010 10:48 pm

Robert Houdart wrote:here's the download link for the 8_CPU version compiled with lockless hash table access ("Hyatt hashing"): http://www.cruxis.com/chess/download/Ho ... S_8CPU.zip

It may or may not solve the performance problem on the x5355 hardware; please keep me updated about success or failure.
Thanks for the quick response and your work on this compile. I hope I can report the results on the x5355 soon.
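
For readers unfamiliar with the term: the "Hyatt hashing" mentioned in the quote usually refers to the XOR trick described by Robert Hyatt. The key is stored XORed with the data word, so an entry torn by two threads writing at the same time fails verification on read instead of needing a lock. The sketch below only illustrates that idea in a single thread. Houdini's actual implementation is not shown in this thread, and the names (`LocklessTable`, `store`, `probe`) are invented for the example.

```python
MASK64 = (1 << 64) - 1

class LocklessTable:
    """Illustrative table: each slot holds two 64-bit words, (key ^ data) and data."""

    def __init__(self, size: int):
        self.size = size
        self.check = [0] * size   # word 1: key XOR data
        self.data = [0] * size    # word 2: packed move/score/depth (hypothetical layout)

    def store(self, key: int, data: int) -> None:
        i = key % self.size
        # In a real multi-threaded engine these two writes are not atomic together.
        self.check[i] = (key ^ data) & MASK64
        self.data[i] = data & MASK64

    def probe(self, key: int):
        i = key % self.size
        data = self.data[i]
        # Accept the entry only if both words come from the same store.
        if self.check[i] ^ data == key & MASK64:
            return data
        return None

table = LocklessTable(1024)
key = 0x123456789ABCDEF0
table.store(key, 0xCAFE)
print(table.probe(key))          # intact entry: the stored data comes back

# Simulate a torn write: another thread replaced only the data word.
table.data[key % table.size] = 0xBEEF
print(table.probe(key))          # corrupted entry: rejected, returns None
```

A rejected probe is simply treated as a miss, so the search continues as if the position had never been stored.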

Odeus37
Posts: 43
Joined: Mon Jun 14, 2010 5:38 pm

Post by Odeus37 » Sat Jul 17, 2010 11:17 pm

Robert Houdart wrote:There is one more parameter to consider: the size of the hash table. My above-mentioned values were for a fairly small hash size of 128 MB. With a 1024 MB hash (type "setoption name hash value 1024") I find that the standard version is slightly (1% to 2%) faster than the LOCKLESS. Complicated, isn't it? :?
Well, I tested with 1024 MB hash :

Houdini 1.03a x64 4_CPU
1) info multipv 1 depth 20 seldepth 47 score cp 19  time 29993 nodes 91350982 nps 3045000 [1:54 CPU]
2) info multipv 1 depth 21 seldepth 48 score cp 14  time 29995 nodes 92839134 nps 3095000 [1:55 CPU]
3) info multipv 1 depth 20 seldepth 43 score cp 12  time 29993 nodes 93634994 nps 3121000 [1:55 CPU]

Houdini 1.03a x64 LOCKLESS 8_CPU
1) info multipv 1 depth 20 seldepth 51 score cp 14  time 29994 nodes 94889953 nps 3163000 [1:52 CPU]
2) info multipv 1 depth 20 seldepth 44 score cp 19  time 29991 nodes 96489415 nps 3217000 [1:55 CPU]
3) info multipv 1 depth 21 seldepth 47 score cp 13  time 29991 nodes 96610146 nps 3221000 [1:53 CPU]

Houdini 1.03a x64 4_CPU
1) 3045000 * 120 / 114 = 3205 kN/s
2) 3095000 * 120 / 115 = 3230 kN/s
3) 3121000 * 120 / 115 = 3257 kN/s
=> Average speed = 3231 kN/s

Houdini 1.03a x64 LOCKLESS 8_CPU
1) 3163000 * 120 / 112 = 3389 kN/s
2) 3217000 * 120 / 115 = 3357 kN/s
3) 3221000 * 120 / 113 = 3421 kN/s
=> Average speed = 3389 kN/s
Still about 5% better for the lockless version on my q6600 with 1024 MB hash then.
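
The normalization used in the figures above can be reproduced with a short script. This is a minimal sketch, assuming the factor 120 stands for 30 s of wall time on 4 cores and that the bracketed CPU time is the CPU time actually consumed; the function names are made up for illustration.

```python
def normalized_knps(nps: int, cpu_seconds: int, full_cpu_seconds: int = 120) -> float:
    """Scale the reported nps to full CPU utilisation and convert to kN/s."""
    return nps * full_cpu_seconds / cpu_seconds / 1000

# (reported nps, CPU seconds) taken from the six runs above
standard = [(3045000, 114), (3095000, 115), (3121000, 115)]
lockless = [(3163000, 112), (3217000, 115), (3221000, 113)]

def average_knps(runs) -> int:
    # Round each run to whole kN/s first, as in the hand calculation.
    per_run = [round(normalized_knps(nps, cpu)) for nps, cpu in runs]
    return round(sum(per_run) / len(per_run))

std_avg = average_knps(standard)
ll_avg = average_knps(lockless)
gain = (ll_avg / std_avg - 1) * 100
print(f"standard {std_avg} kN/s, lockless {ll_avg} kN/s, gain {gain:.1f}%")
```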

But I also ran quick tests with 256 MB and 512 MB, and I noticed that the more hash I had, the lower my nps was. I would have expected the opposite... I have 4 GB on my PC and nothing else was running, so I had more than enough memory to avoid swapping.

Post by Robert Houdart » Sat Jul 17, 2010 11:37 pm

Odeus37 wrote:But I also ran quick tests with 256 MB and 512 MB, and I noticed that the more hash I had, the lower my nps was. I would have expected the opposite... I have 4 GB on my PC and nothing else was running, so I had more than enough memory to avoid swapping.
That is normal behaviour: the more memory you use, the slower it gets. It's related to the translation lookaside buffers (TLBs) in the CPU, which is why "Large Pages" are useful.
See http://en.wikipedia.org/wiki/Translatio ... ide_buffer and http://en.wikipedia.org/wiki/Page_size.
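
To get a feeling for the effect, one can count how many memory pages the hash table spans. This is a rough sketch, assuming the usual x86-64 page sizes of 4 KiB (standard) and 2 MiB (large pages); the number of TLB entries differs per CPU and is not modelled here, and `pages_needed` is a name made up for the example.

```python
KIB = 1024
MIB = 1024 * KIB

def pages_needed(hash_mb: int, page_bytes: int) -> int:
    """Number of pages needed to map a hash table of hash_mb megabytes."""
    return (hash_mb * MIB + page_bytes - 1) // page_bytes

for hash_mb in (128, 256, 512, 1024):
    small = pages_needed(hash_mb, 4 * KIB)
    large = pages_needed(hash_mb, 2 * MIB)
    print(f"{hash_mb:5d} MB hash: {small:7d} pages of 4 KiB, {large:4d} pages of 2 MiB")
```

Each page needs its own address translation, so random probes into a 1024 MB table are spread over eight times as many pages as with 128 MB, while 2 MiB pages cut the page count by a factor of 512.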

Robert

Theo
Posts: 2
Joined: Sat Jul 17, 2010 9:57 pm
Real Name: Theodor

Post by Theo » Sun Jul 18, 2010 12:16 am

Robert Houdart wrote:
Theo wrote:Hi Robert, you're absolutely wrong. You have made Houdini ONLY for Intel "i" and AMD processors. This is why the speed is much, much slower on the Intel QX Skulltrail and on Intel Xeons with 8 threads.
The user has been testing most compiles of the Ippolit family for months and has never had such problems.
Hello Theo, welcome to the forum. Are you the owner of the 8-core machine sirabc was reporting about?

You're correct that I could be absolutely wrong about this system; that's why I wrote in the conditional, "IF this processor is a double quad core". As you can see higher in the thread, this information came from sirabc.

It is also correct that I can only optimize and test Houdini on the computers to which I have access, which include Intel Core Duo and Core-i.

Note that the "tests made with most compiles of the Ippolit family" are not very relevant to Houdini. Houdini uses quite a different SMP architecture. As discussed above, it is quite possible that the LOCKLESS version provides a solution to the very low node counts obtained with Houdini on this 8-core system. We will not know until someone actually installs the LOCKLESS version on the 8-core system and performs some tests...

Hi Robert,
I'm not the owner, but he sends me the technical details by private message:
The compile for the X5355 brings only 10% (up to 3900 kN/s instead of 3500), but that is still a very, very long way from the normal 6800-7000.
His Skulltrail QX is 50% faster and shows the same effect with your X5355 compile (up to 5900 kN/s instead of 5300), but that is also a long way from the normal 10000-11000.


robbolito
Posts: 601
Joined: Thu Jun 10, 2010 3:48 am

Post by robbolito » Sun Jul 18, 2010 3:03 am

Best engines-3 2010

1  Deep Rybka 4 x64         3190  +54  ****************  ½½010½11½½½½10½1  ½11½½1½½½½½1½010  18.5/32
2  Houdini 1.03 x64 4_CPU   3190    0  ½½101½00½½½½01½0  ****************  ½½0½½½½1½½½½½1½1  16.0/32
3  Ivanhoe-B57d_whm01_w64   3190  -54  ½00½½0½½½½½0½101  ½½1½½½½0½½½½½0½0  ****************  13.5/32

Post by Odeus37 » Sun Jul 18, 2010 6:49 am

Robert Houdart wrote:
Odeus37 wrote:But I also ran quick tests with 256 MB and 512 MB, and I noticed that the more hash I had, the lower my nps was. I would have expected the opposite... I have 4 GB on my PC and nothing else was running, so I had more than enough memory to avoid swapping.
That is normal behaviour: the more memory you use, the slower it gets. It's related to the translation lookaside buffers (TLBs) in the CPU, which is why "Large Pages" are useful.
See http://en.wikipedia.org/wiki/Translatio ... ide_buffer and http://en.wikipedia.org/wiki/Page_size.

Robert
So, until now, I have been setting my hash size to 512 MB or 1024 MB, adapting it to the time control, to avoid the hash filling up. Apparently I was totally wrong to do this, and I would do better to run with less hash (like 128 MB) to be more efficient?

Post by Robert Houdart » Sun Jul 18, 2010 11:04 am

Odeus37 wrote:So, until now, I have been setting my hash size to 512 MB or 1024 MB, adapting it to the time control, to avoid the hash filling up. Apparently I was totally wrong to do this, and I would do better to run with less hash (like 128 MB) to be more efficient?
No, your approach is correct. As long as the hash table is effectively used by the engine, it will bring a larger advantage than the performance loss due to TLB misses.
What should be avoided is using a 1024 MB hash table for 1'+0 games: only a fraction of the hash table will then actually be in use, but you will still see the performance hit.

Robert

Post by Robert Houdart » Sun Jul 18, 2010 11:16 am

Theo wrote:Hi Robert,
I'm not the owner, but he sends me the technical details by private message:
The compile for the X5355 brings only 10% (up to 3900 kN/s instead of 3500), but that is still a very, very long way from the normal 6800-7000.
His Skulltrail QX is 50% faster and shows the same effect with your X5355 compile (up to 5900 kN/s instead of 5300), but that is also a long way from the normal 10000-11000.
Thanks for the information; it's interesting to see that the LOCKLESS version brings more than 10% improvement.
But the speed remains very low: clearly Houdini is not well adapted to this hardware and will underperform by at least 50 Elo compared to its normal strength.
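
The figures behind "more than 10%" and "very low" can be checked quickly. This is a small sketch using the speeds Theo reported; taking the lower end of his "normal" ranges (6800 and 10000 kN/s) as the expected speed is an assumption made only for this comparison.

```python
def gain_percent(before_knps: float, after_knps: float) -> float:
    """Relative speed-up of the LOCKLESS compile over the standard one, in percent."""
    return (after_knps / before_knps - 1) * 100

# (machine, standard kN/s, LOCKLESS kN/s, assumed expected kN/s)
reports = [
    ("X5355", 3500, 3900, 6800),
    ("Skulltrail QX", 5300, 5900, 10000),
]

for machine, before, after, expected in reports:
    print(f"{machine}: +{gain_percent(before, after):.1f}% with LOCKLESS, "
          f"{after / expected:.0%} of the expected speed")
```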

Robert
