LOTUS: Locality Optimizing Triangle Counting – PPOPP’22

27th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2022)
April 2-6, 2022

DOI: 10.1145/3503221.3508402

Authors’ Copy (PDF Format)

Triangle Counting (TC) is a basic graph algorithm and is widely used in different fields of science, humanities and technology. The great size of real-world graphs with skewed degree distribution is a prohibiting factor in efficient TC.

In this paper, we study the implications of the power-law degree distribution of graph datasets on memory utilization in TC and we explain how a great percentage of triangles are formed around a small number of vertices with very great degrees (that are called hubs).

Using these novel observations, we present the LOTUS algorithm as a structure-aware and locality optimizing TC that separates counting triangles formed by hub and non-hub edges. Lotus introduces bespoke TC algorithms and data structures for different edges in order (i) to reduce random memory accesses, (ii) to optimize cache space utilization, and (iii) to provide better load balance.

To provide better CPU utilization, we introduce the Squared Edge Tiling algorithm that divides a large task into smaller ones when the work for each edge in the neighbour-list depends on the index of that edge.

We evaluate the Lotus algorithm on 3 different processor architectures and for real-world graph datasets with up to 162 billion edges that shows LOTUS provides 2.2-5.5 times speedup in comparison to the previous works.

Code Availability
The source-code will be published soon.


  author={Koohi Esfahani, Mohsen and Kilpatrick, Peter and Vandierendonck, Hans},
  booktitle={27th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2022)},  
  title={LOTUS: Locality Optimizing Triangle Counting}, 


Related Posts