high-performance graph processing – DIPSA: Data-Intensive Parallel Systems and Algorithms

DOI: 10.1109/BigData66926.2025.11401782. Whereas the literature describes an increasing number of graph algorithms, loading graphs remains a time-consuming component of the end-to-end execution time. Graph frameworks often rely on custom graph storage formats that are not optimized for efficient loading of large-scale graph datasets. Furthermore, graph loading is often not optimized […]

ParaGrapher

ParaGrapher: A Parallel and Distributed Graph Loading Library for Large-Scale …

DOI: 10.48550/arXiv.2501.06872. PDF version. This paper investigates the shared-memory Graph Transposition (GT) problem, a fundamental graph algorithm that is widely used in graph analytics and scientific computing. Previous GT algorithms have significant memory requirements that are proportional to the number of vertices and threads, which obstructs their use on large graphs. […]

LaganLighter

On Optimizing Locality of Graph Transposition on Modern Architectures

PDF version. DOI: 10.48550/arXiv.2507.00716. ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. This capability accelerates the design and implementation of new high-performance graph algorithms and their evaluation on a wide range of graphs and across different frameworks. However, […]

ParaGrapher

Accelerating Loading WebGraphs in ParaGrapher

To evaluate the impact of locality-optimizing reordering algorithms, a baseline is required. To create the baseline, a random assignment of IDs to vertices may be used to produce a representation of the graph with reduced locality [DOI:10.1109/ISPASS57527.2023.00029, DOI:10.1109/IISWC53511.2021.00020]. To that end, we create the random_ordering() function in relabel.c […]

LaganLighter Technical Posts

Random Vertex Relabelling in LaganLighter
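The idea of a random relabelling baseline can be illustrated with a minimal sketch. This is not the random_ordering() function from relabel.c; the name `random_relabel`, its parameters, and the edge-list representation are assumptions made here for illustration only. It draws a uniform random permutation of vertex IDs and rewrites each edge with the new IDs, which keeps the graph structure but discards any locality in the original numbering.

```python
import random


def random_relabel(num_vertices, edges, seed=0):
    """Assign random new IDs to vertices (hypothetical helper, not LaganLighter's API).

    Returns (new_edges, new_id), where new_id[v] is the new ID of old vertex v.
    """
    rng = random.Random(seed)          # seeded so the baseline is reproducible
    new_id = list(range(num_vertices))
    rng.shuffle(new_id)                # uniform random permutation of the IDs
    new_edges = [(new_id[u], new_id[v]) for (u, v) in edges]
    return new_edges, new_id


# A 4-cycle: every vertex has degree 2 before and after relabelling.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
relabelled, new_id = random_relabel(4, edges, seed=42)
```

Because the mapping is a permutation, vertex degrees and connectivity are unchanged; only the ID assignment, and therefore the memory access pattern of a traversal, differs.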

We use MASTIFF to compute the weight of the Minimum Spanning Forest (MSF) of MS-BioGraphs while ignoring self-edges of the graphs.
– MS1: using a machine with 24 cores. MSF weight: 109,915,787,546
– MS50: using a machine with 128 cores. MSF weight: 416,318,200,808

LaganLighter MS-BioGraphs Technical Posts

Minimum Spanning Forest of MS-BioGraphs
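What "MSF weight while ignoring self-edges" means can be shown on a toy graph. The sketch below uses a plain sequential Kruskal algorithm with union-find, not MASTIFF; the function name `msf_weight` and the `(u, v, weight)` edge format are assumptions for illustration.

```python
def msf_weight(num_vertices, weighted_edges):
    """Total weight of a minimum spanning forest, skipping self-edges.

    weighted_edges: iterable of (u, v, weight) tuples (assumed format).
    Uses sequential Kruskal, not the MASTIFF algorithm.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Find the component root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    # Drop self-edges (u == v), then visit edges in increasing weight order.
    for w, u, v in sorted((w, u, v) for (u, v, w) in weighted_edges if u != v):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:           # edge joins two different components
            parent[root_u] = root_v
            total += w
    return total


# Two components: {0, 1, 2} and {3, 4}; (2, 2, 9) is a self-edge and is ignored.
example_edges = [(0, 1, 4), (1, 2, 1), (0, 2, 3), (2, 2, 9), (3, 4, 2)]
example_weight = msf_weight(5, example_edges)  # 1 + 3 + 2 = 6
```

The result is a forest rather than a single tree because the input graph may be disconnected; each connected component contributes its own minimum spanning tree.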

In applications such as graph processing, how threads are pinned to CPU cores matters: threads that share resources (such as memory and cache) can improve performance by processing consecutive blocks of the input dataset, especially when the dataset has a high level of locality. In LaganLighter, we […]

LaganLighter Technical Posts

Topology-Based Thread Affinity Setting (Thread Pinning) in OpenMP

Short URL of this post: https://blogs.qub.ac.uk/DIPSA/graphs-list-2024
Real-World Graphs, Smaller Graphs, Synthetic Graph Generators, Technical Posts

Technical Posts

An (Incomplete) List of Publicly Available Graph Datasets/Generators

PDF version. DOI: 10.48550/arXiv.2404.19735. Comprehensive evaluation is one of the bases of experimental science. In High-Performance Graph Processing, a thorough evaluation of contributions becomes more achievable by supporting common input formats over different frameworks. However, each framework creates its own specific format, which may not support reading large-scale real-world graph datasets. This […]

ParaGrapher

Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher – …

MS-BioGraph sequence similarity graph datasets are now publicly available on IEEE DataPort: https://doi.org/10.21227/gmd9-1534 . To access the files, you need to register/login to IEEE DataPort and then visit the MS-BioGraphs page. By saving the page as an HTML file such as dp.html, you may download the datasets (as an example […]

MS-BioGraphs

MS-BioGraphs on IEEE DataPort

2023 IEEE International Symposium on Workload Characterization (IISWC’23), October 1-3, 2023, Ghent, Belgium. DOI: 10.1109/IISWC59245.2023.00029. PDF version. Progress in High-Performance Computing in general, and High-Performance Graph Processing in particular, is highly dependent on the availability of publicly accessible, relevant, and realistic data sets. In this paper, we announce the publication of MS-BioGraphs, a new […]

MS-BioGraphs