The Bit Twiddling Hacks website collects useful code fragments that implement very specific computations very efficiently. Here we collect references to similarly handy code fragments for SIMD-based computation.
- AVX2 Population Count: Mula’s algorithm
- AVX2 Count Leading Zeros for 8-bit integers
- AVX2 Count Leading Zeros for 32-bit integers
- AVX512 Alternatives faster than VP2INTERSECT
- AVX2 Setting single bit in SIMD register
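
To give a flavour of the kind of fragment collected here, the following is a minimal sketch of one possible way to set a single bit in a 256-bit AVX2 register. It is not necessarily the method used in the referenced article. The function names `single_bit_mask` and `single_bit_words` are hypothetical, and the `target("avx2")` attribute assumes GCC or Clang (it lets the file compile without `-mavx2`; the code still needs an AVX2-capable CPU at run time).

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: returns a 256-bit vector with only bit i (0..255) set.
 * Bit i lives in 32-bit lane i / 32 at position i % 32.
 * Assumption: GCC/Clang target attribute enables AVX2 for this function. */
__attribute__((target("avx2")))
static __m256i single_bit_mask(unsigned i)
{
    /* First bit index covered by each 32-bit lane. */
    const __m256i lane_base = _mm256_setr_epi32(0, 32, 64, 96, 128, 160, 192, 224);

    /* Per-lane shift count: i minus the lane's base index. Only the lane
     * that contains bit i gets a count in the range 0..31. */
    const __m256i count = _mm256_sub_epi32(_mm256_set1_epi32((int)i), lane_base);

    /* VPSLLVD treats counts as unsigned and yields 0 for any count > 31,
     * so negative or too-large counts clear every other lane. */
    return _mm256_sllv_epi32(_mm256_set1_epi32(1), count);
}

/* Hypothetical wrapper: writes the mask to eight 32-bit words so callers
 * compiled without AVX2 enabled can inspect the result. */
__attribute__((target("avx2")))
void single_bit_words(unsigned i, uint32_t out[8])
{
    _mm256_storeu_si256((__m256i *)out, single_bit_mask(i));
}
```

The sketch relies on the variable shift producing zero for out-of-range counts, which avoids a separate compare-and-mask step to select the lane. Indices of 256 or more simply yield an all-zero vector.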