Page 1 of 1
LZCNT and bitboards
Posted: Mon Feb 07, 2011 12:47 am
by arkon
some of you guys prbly did this already, but im wondering:
the LZCNT cmd in the SSE4a extensions (sadly only available on AMDs) offers a count of leading zeros. interesting for bitboards? so two questions:
- how would one force the c++ compiler of choice to use the LZCNT-command? code sample? im stuck in the interface between c++ and asm. it should be kinda straight forward, but im blacking out
- whats the fastest way to get the TRAILING zero count? my bitboard is kinda relaying on that. i have a sample from
http://graphics.stanford.edu/~seander/b ... tModLookup
which has around 4 ops. modded for 64 bit it comes to 6 ops. i could reduce that overhead, sure, but whats the state of the art on that frontier? smbdy using some crazy optimization? input welcome
Re: LZCNT and bitboards
Posted: Mon Feb 07, 2011 1:48 am
by BB+
I always thought LZCNT was just the same as "63 minus BSR", except if the input is 0. The opposite of BSR is BSF (bit scan forward), which should return the lowest set-bit (if any), and this should be the same as the number of trailing zeros.
Re: LZCNT and bitboards
Posted: Mon Feb 07, 2011 5:29 pm
by hyatt
BB+ wrote:I always thought LZCNT was just the same as "63 minus BSR", except if the input is 0. The opposite of BSR is BSF (bit scan forward), which should return the lowest set-bit (if any), and this should be the same as the number of trailing zeros.
That's the way it works. And it is an instruction taken directly from the Cray-1 and later machines (leadz was the opcode). Only problem was there was no good way to find the last one bit (leadz basically finds the first 1 from the left when you think about it). But one could use a bit-twiddling trick to extract just the rightmost bit and then apply leadz to that. Now, thankfully, we also have popcnt which also came from the Cray and was demanded by the crypto users if Intel wanted to sell them machines..
Re: LZCNT and bitboards
Posted: Thu Feb 10, 2011 10:41 pm
by Gerd Isenberg
arkon wrote:some of you guys prbly did this already, but im wondering:
the LZCNT cmd in the SSE4a extensions (sadly only available on AMDs) offers a count of leading zeros. interesting for bitboards? so two questions:
- how would one force the c++ compiler of choice to use the LZCNT-command? code sample? im stuck in the interface between c++ and asm. it should be kinda straight forward, but im blacking out
- whats the fastest way to get the TRAILING zero count? my bitboard is kinda relaying on that. i have a sample from
http://graphics.stanford.edu/~seander/b ... tModLookup
which has around 4 ops. modded for 64 bit it comes to 6 ops. i could reduce that overhead, sure, but whats the state of the art on that frontier? smbdy using some crazy optimization? input welcome
The
cpw Bitscan page has some suggestions, covers also
LZCNT of K10, which is two cycles direct path with reciprocal throughput of one. 64-bit BSR/BSF are 4 cycles vector path.
Re: LZCNT and bitboards
Posted: Thu Feb 10, 2011 11:15 pm
by arkon
thanks to all of you. since im currently developing using gcc and the assembler there is just a bitch im postponing any such optimizations for now. when im returning to that point i know where to start now
Re: LZCNT and bitboards
Posted: Fri Feb 11, 2011 10:08 am
by RobertP
arkon wrote:[...] im currently developing using gcc and the assembler there is just a bitch [...]
These built-ins work nicely with gcc and clang. No assembler; no look-up tables; just plain function calls.
Code: Select all
static inline int GetLowestBit( BitBoard b ) { return __builtin_ctzll( b ); }
static inline int GetHighestBit( BitBoard b ) { return 63 - __builtin_clzll( b ); }
static inline int CountBits( BitBoard b ) { return __builtin_popcountll( b ); }
Robert P.
Re: LZCNT and bitboards
Posted: Fri Feb 11, 2011 12:06 pm
by arkon
wow... thanks!
that saved me a headache