{"id":2730,"date":"2023-11-25T14:13:51","date_gmt":"2023-11-25T14:13:51","guid":{"rendered":"https:\/\/blogs.qub.ac.uk\/dipsa\/?p=2730"},"modified":"2024-08-13T06:44:00","modified_gmt":"2024-08-13T05:44:00","slug":"simd-bit-twiddling-hacks","status":"publish","type":"post","link":"https:\/\/blogs.qub.ac.uk\/dipsa\/simd-bit-twiddling-hacks\/","title":{"rendered":"SIMD Bit Twiddling Hacks"},"content":{"rendered":"\n<p>The <a href=\"https:\/\/graphics.stanford.edu\/~seander\/bithacks.html\">Bit Twiddling Hacks<\/a> website collects an array of useful code fragments that implement some very specific computations very efficiently. Here we collect references to some handy code fragments for SIMD based computation.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AVX2 Population Count: <a href=\"https:\/\/arxiv.org\/pdf\/1611.07612.pdf\">Mula&#8217;s algorithm<\/a><\/li>\n\n\n\n<li>AVX2 <a href=\"https:\/\/stackoverflow.com\/questions\/68985039\/avx2-bitscanreverse-or-countleadingzeros-on-8-bit-elements-in-avx-register\/68987092#68987092\">Count Leading Zeros<\/a> for 8-bit integers<\/li>\n\n\n\n<li>AVX2 <a href=\"https:\/\/stackoverflow.com\/questions\/58823140\/count-leading-zero-bits-for-each-element-in-avx2-vector-emulate-mm256-lzcnt-ep\/58827596#58827596\">Count Leading Zeroes<\/a> for 32-bit integers<\/li>\n\n\n\n<li>AVX512 <a href=\"https:\/\/arxiv.org\/pdf\/2112.06342.pdf\">Alternatives faster than VP2INTERSECT<\/a><\/li>\n\n\n\n<li>AVX2 <a href=\"https:\/\/stackoverflow.com\/questions\/72424660\/best-way-to-mask-a-single-bit-in-avx2\">Setting single bit in SIMD register<\/a><\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong><a href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/technical-posts\/\" target=\"_blank\" rel=\"noreferrer noopener\">Technical Posts<\/a><\/strong><\/p>\n\n\n<ul class=\"wp-block-latest-posts__list has-dates wp-block-latest-posts\"><li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/08\/cones2-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/random-vertex-relabelling-in-laganlighter\/\">Random Vertex Relabelling in LaganLighter<\/a><time datetime=\"2024-08-21T12:37:52+01:00\" class=\"wp-block-latest-posts__post-date\">21 August 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/08\/trees-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/minimum-spanning-forest-of-ms-biographs\/\">Minimum Spanning Forest of MS-BioGraphs<\/a><time datetime=\"2024-08-09T14:11:36+01:00\" class=\"wp-block-latest-posts__post-date\">9 August 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/08\/affinity-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/topology-based-thread-affinity-setting-thread-pinning-in-openmp\/\">Topology-Based Thread Affinity Setting (Thread Pinning) in OpenMP<\/a><time datetime=\"2024-08-03T18:07:31+01:00\" class=\"wp-block-latest-posts__post-date\">3 August 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/04\/hedera-helix-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/an-incomplete-list-of-publicly-available-graph-datasets-generators\/\">An (Incomplete) List of Publicly Available Graph Datasets\/Generators<\/a><time datetime=\"2024-06-21T13:25:49+01:00\" class=\"wp-block-latest-posts__post-date\">21 June 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/04\/passerine-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/an-evaluation-of-bandwidth-of-different-storage-types-hdd-vs-ssd-vs-lustrefs-for-different-block-sizes-and-different-read-methods-mmap-vs-pread-vs-read\/\">An Evaluation of Bandwidth of Different Storage Types (HDD vs. SSD vs. LustreFS) for Different Block Sizes and Different Parallel Read Methods (mmap vs pread vs read)<\/a><time datetime=\"2024-04-20T09:48:10+01:00\" class=\"wp-block-latest-posts__post-date\">20 April 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/11\/ni2-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/simd-bit-twiddling-hacks\/\">SIMD Bit Twiddling Hacks<\/a><time datetime=\"2023-11-25T14:13:51+00:00\" class=\"wp-block-latest-posts__post-date\">25 November 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2022\/12\/cactus-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/laganlighter-source-code\/\">LaganLighter Source Code<\/a><time datetime=\"2022-11-14T13:17:17+00:00\" class=\"wp-block-latest-posts__post-date\">14 November 2022<\/time><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>The Bit Twiddling Hacks website collects an array of useful code fragments that implement some very specific computations very efficiently. Here we collect references to some handy code fragments for SIMD based computation. Technical Posts<\/p>\n","protected":false},"author":974,"featured_media":2711,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[28,127],"tags":[118],"class_list":{"0":"post-2730","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-graptor","8":"category-technical-posts","9":"tag-simd-2","10":"czr-hentry"},"jetpack_featured_media_url":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/11\/ni2.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/2730","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/users\/974"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/comments?post=2730"}],"version-history":[{"count":2,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/2730\/revisions"}],"predecessor-version":[{"id":3266,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/2730\/revisions\/3266"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/media\/2711"}],"wp:attachment":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/media?parent=2730"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/categories?post=2730"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/tags?post=2730"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}