LZCNT and bitboards
LZCNT and bitboards
some of you guys prbly did this already, but im wondering:
the LZCNT cmd in the SSE4a extensions (sadly only available on AMDs) offers a count of leading zeros. interesting for bitboards? so two questions:
- how would one force the c++ compiler of choice to use the LZCNT-command? code sample? im stuck in the interface between c++ and asm. it should be kinda straight forward, but im blacking out
- whats the fastest way to get the TRAILING zero count? my bitboard is kinda relaying on that. i have a sample from http://graphics.stanford.edu/~seander/b ... tModLookup
which has around 4 ops. modded for 64 bit it comes to 6 ops. i could reduce that overhead, sure, but whats the state of the art on that frontier? smbdy using some crazy optimization? input welcome
the LZCNT cmd in the SSE4a extensions (sadly only available on AMDs) offers a count of leading zeros. interesting for bitboards? so two questions:
- how would one force the c++ compiler of choice to use the LZCNT-command? code sample? im stuck in the interface between c++ and asm. it should be kinda straight forward, but im blacking out
- whats the fastest way to get the TRAILING zero count? my bitboard is kinda relaying on that. i have a sample from http://graphics.stanford.edu/~seander/b ... tModLookup
which has around 4 ops. modded for 64 bit it comes to 6 ops. i could reduce that overhead, sure, but whats the state of the art on that frontier? smbdy using some crazy optimization? input welcome
Re: LZCNT and bitboards
I always thought LZCNT was just the same as "63 minus BSR", except if the input is 0. The opposite of BSR is BSF (bit scan forward), which should return the lowest set-bit (if any), and this should be the same as the number of trailing zeros.
-
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: LZCNT and bitboards
That's the way it works. And it is an instruction taken directly from the Cray-1 and later machines (leadz was the opcode). Only problem was there was no good way to find the last one bit (leadz basically finds the first 1 from the left when you think about it). But one could use a bit-twiddling trick to extract just the rightmost bit and then apply leadz to that. Now, thankfully, we also have popcnt which also came from the Cray and was demanded by the crypto users if Intel wanted to sell them machines..BB+ wrote:I always thought LZCNT was just the same as "63 minus BSR", except if the input is 0. The opposite of BSR is BSF (bit scan forward), which should return the lowest set-bit (if any), and this should be the same as the number of trailing zeros.
-
- Posts: 37
- Joined: Wed Jul 07, 2010 11:11 pm
- Real Name: Gerd Isenberg
Re: LZCNT and bitboards
The cpw Bitscan page has some suggestions, covers also LZCNT of K10, which is two cycles direct path with reciprocal throughput of one. 64-bit BSR/BSF are 4 cycles vector path.arkon wrote:some of you guys prbly did this already, but im wondering:
the LZCNT cmd in the SSE4a extensions (sadly only available on AMDs) offers a count of leading zeros. interesting for bitboards? so two questions:
- how would one force the c++ compiler of choice to use the LZCNT-command? code sample? im stuck in the interface between c++ and asm. it should be kinda straight forward, but im blacking out
- whats the fastest way to get the TRAILING zero count? my bitboard is kinda relaying on that. i have a sample from http://graphics.stanford.edu/~seander/b ... tModLookup
which has around 4 ops. modded for 64 bit it comes to 6 ops. i could reduce that overhead, sure, but whats the state of the art on that frontier? smbdy using some crazy optimization? input welcome
Re: LZCNT and bitboards
thanks to all of you. since im currently developing using gcc and the assembler there is just a bitch im postponing any such optimizations for now. when im returning to that point i know where to start now
Re: LZCNT and bitboards
These built-ins work nicely with gcc and clang. No assembler; no look-up tables; just plain function calls.arkon wrote:[...] im currently developing using gcc and the assembler there is just a bitch [...]
Code: Select all
static inline int GetLowestBit( BitBoard b ) { return __builtin_ctzll( b ); }
static inline int GetHighestBit( BitBoard b ) { return 63 - __builtin_clzll( b ); }
static inline int CountBits( BitBoard b ) { return __builtin_popcountll( b ); }
Re: LZCNT and bitboards
wow... thanks!
that saved me a headache
that saved me a headache