High-Performance Graph Processing

A large and fast-growing amount of data is produced from different sources and in different fields of research, science, humanities and industry: telescopes and interferometers in radio astronomy, cameras and sensors of different devices such as  smart phones, security systems, and healthcare systems, interactions between proteins in biology, purchases and ratings of products in e-commerce websites, posts and connections between users in social networks, interactions between grains in studying crack growth resistance in materials engineering, fMRI datasets in Alzheimer classification, … .

These raw data are only beneficial after being analysed by high performance software systems to extract special information and patterns. Our group investigates and introduces novel and effective data structures, algorithms, and techniques to enhance high performance graph processing systems:

– Knowledge Graph [2019 – ongoing]

– Transprecision Graph Processing for Precision Medicine [2021 – ongoing]

– Graph Isomorphism [2021 – ongoing]

Graptor [2019 – ongoing] enhances GraphGrind and VEBO by providing Single Instruction Multiple Data (SIMD) vectorization for graph analytics using the SSE, AVX2 and AVX-512 instruction sets families.

LaganLighter [2019 – 2022]: Having the experience of GraphGrind and VEBO, we understood the requirement of re-studying graph algorithms based on the implications imposed by the structure of datasets into the execution of graph analytics. LaganLighter introduces structure-aware high performance graph algorithms.

VEBO [2017 – 2018] provides load balancing and better CPU utilization by applying a new graph relabeling and partitioning algorithm. The assumption behind VEBO is that load balance is a more important consideration for shared-memory graph partitioning than memory locality. VEBO has been implemented for GraphGrind and also for Polymer.

GraphGrind [2014 – 2017] optimizes Ligra and Polymer for systems with Non-Uniform Memory-Architecture (NUMA) by adapting the parallelisation approach (vertices and edges partitioning and scheduling algorithms) to the graph algorithm, by compressing of topology data, and by introducing a new NUMA-aware work-stealing algorithm that extends Cilk.