ParaGrapher: A Parallel and Distributed Graph Loading Library for Large-Scale Compressed Graphs – BigData’25 (Short Paper)

Posted on 8 November 2025 by Mohsen Koohi Esfahani

DOI:

Whereas the literature describes an increasing number of graph algorithms, loading graphs remains a time-consuming component of the end-to-end execution time. Graph frameworks often rely on custom graph storage formats that are not optimized for efficient loading of large-scale graph datasets. Furthermore, graph loading is often left unoptimized because it is time-consuming to implement.

This shows a demand for high-performance libraries capable of efficiently loading graphs to (i) accelerate the design of new graph algorithms, (ii) evaluate contributions across a wide range of graph datasets, and (iii) facilitate easy and fast comparisons across different graph frameworks.

We present ParaGrapher, a library for loading large-scale compressed graphs in parallel and distributed graph frameworks. ParaGrapher supports (a) loading the graph while the caller is blocked and (b) interleaving graph loading with graph processing. ParaGrapher is designed to support loading graphs in shared-memory, distributed-memory, and out-of-core graph processing.
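The two loading modes can be illustrated with a small sketch. This is not ParaGrapher's API: `load_blocks`, `blocking_load`, and `interleaved_out_degrees` are hypothetical names, and the "graph" is a plain list of edge blocks. The sketch only shows the difference between waiting for the whole graph and processing blocks as they arrive.

```python
import queue
import threading

def load_blocks(edge_blocks, on_block):
    # Hypothetical stand-in for a loader: delivers the graph block by block,
    # handing each block of (source, destination) edges to a callback.
    for block in edge_blocks:
        on_block(list(block))

def blocking_load(edge_blocks):
    # (a) The caller waits until every block has been delivered.
    edges = []
    load_blocks(edge_blocks, edges.extend)
    return edges

def interleaved_out_degrees(edge_blocks, num_vertices):
    # (b) A loader thread delivers blocks while the caller processes them.
    pending = queue.Queue()

    def producer():
        load_blocks(edge_blocks, pending.put)
        pending.put(None)  # sentinel: no more blocks

    loader = threading.Thread(target=producer)
    loader.start()
    degrees = [0] * num_vertices
    while (block := pending.get()) is not None:
        for src, _dst in block:
            degrees[src] += 1
    loader.join()
    return degrees
```

In the interleaved variant, the processing loop starts as soon as the first block is available instead of waiting for the last one.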

We explain the design of ParaGrapher and present a performance model of graph decompression. Our evaluation shows that ParaGrapher delivers up to 3.2 times speedup in loading and up to 5.2 times speedup in end-to-end execution (i.e., through interleaved loading and execution).

Source Code

https://github.com/DIPSA-QUB/ParaGrapher

API Documentation

Please refer to the Wiki, https://github.com/DIPSA-QUB/ParaGrapher/wiki/API-Documentation, or download the PDF file using https://github.com/DIPSA-QUB/ParaGrapher/raw/main/doc/api.pdf .

BibTex

@article{paragrapher-bigdata,

}

Related Posts & Source Code

ParaGrapher Web Page

ParaGrapher: A Parallel and Distributed Graph Loading Library for Large-Scale Compressed Graphs – BigData’25 (Short Paper), 8 November 2025
Accelerating Loading WebGraphs in ParaGrapher, 2 December 2024
Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher – arXiv Version, 1 May 2024
An Evaluation of Bandwidth of Different Storage Types (HDD vs. SSD vs. LustreFS) for Different Block Sizes and Different Parallel Read Methods (mmap vs pread vs read), 20 April 2024
ParaGrapher Integrated to LaganLighter, 16 February 2024
ParaGrapher Source Code For WebGraph Types, 16 February 2024

On Optimizing Locality of Graph Transposition on Modern Architectures

Posted on 15 January 2025 by Mohsen Koohi Esfahani

DOI: 10.48550/arXiv.2501.06872
PDF version

This paper investigates the shared-memory Graph Transposition (GT) problem, a fundamental graph algorithm that is widely used in graph analytics and scientific computing.

Previous GT algorithms have significant memory requirements that are proportional to the number of vertices and threads, which obstructs their use on large graphs. Moreover, atomic memory operations have become comparatively fast on recent CPU architectures, which creates new opportunities for improving the performance of concurrent atomic accesses in GT.

We design PoTra, a GT algorithm which leverages graph structure and processor and memory architecture to optimize locality and performance. PoTra limits the size of additional data structures close to CPU cache sizes and utilizes the skewed degree distribution of graph datasets to optimize locality and performance. We present the performance model of PoTra to explain the connection between cache and memory response times and graph locality.
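For readers unfamiliar with the problem, the following is a minimal sketch of a textbook sequential transposition of a graph in CSR form. It is not PoTra and uses none of its optimizations; the function name `transpose_csr` and the `offsets`/`neighbors` layout are assumptions made for illustration. The scattered writes into `t_neighbors` show where the locality of the access pattern becomes important.

```python
def transpose_csr(offsets, neighbors):
    # offsets[v]..offsets[v+1] delimit the out-neighbors of vertex v.
    n = len(offsets) - 1

    # Step 1: count the in-degree of every vertex.
    in_degree = [0] * n
    for dst in neighbors:
        in_degree[dst] += 1

    # Step 2: a prefix sum over the in-degrees gives the transposed offsets.
    t_offsets = [0] * (n + 1)
    for v in range(n):
        t_offsets[v + 1] = t_offsets[v] + in_degree[v]

    # Step 3: scatter each edge (src, dst) into the slot range of dst.
    cursor = t_offsets[:n]  # next free slot per vertex (a copy)
    t_neighbors = [0] * len(neighbors)
    for src in range(n):
        for i in range(offsets[src], offsets[src + 1]):
            dst = neighbors[i]
            t_neighbors[cursor[dst]] = src  # write position depends on dst
            cursor[dst] += 1
    return t_offsets, t_neighbors
```

For the graph with edges 0→1, 0→2, 1→2, and 2→0, `transpose_csr([0, 2, 3, 4], [1, 2, 2, 0])` returns `([0, 1, 2, 4], [2, 0, 0, 1])`.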

Our evaluation of PoTra on three CPU architectures and 20 real-world and synthetic graph datasets with up to 128 billion edges demonstrates that PoTra achieves up to 8.7 times speedup compared to previous works and that, where there is a performance loss, it remains limited to 15.7% on average.

Source code

The source code of PoTra is available in the LaganLighter repository.

BibTex

@misc{PoTra,
     title={On Optimizing Locality of Graph Transposition on Modern Architectures}, 
     author={Mohsen {Koohi Esfahani} and Hans Vandierendonck},
     year={2025},
     eprint={2501.06872},
     archivePrefix={arXiv},
     primaryClass={cs.DC},
     url={https://arxiv.org/abs/2501.06872},
     doi={10.48550/arXiv.2501.06872} 
}

LaganLighter

On Optimizing Locality of Graph Transposition on Modern Architectures, 15 January 2025
Random Vertex Relabelling in LaganLighter, 21 August 2024
Minimum Spanning Forest of MS-BioGraphs, 9 August 2024
Topology-Based Thread Affinity Setting (Thread Pinning) in OpenMP, 3 August 2024
ParaGrapher Integrated to LaganLighter, 16 February 2024
On Designing Structure-Aware High-Performance Graph Algorithms (PhD Thesis), 8 December 2022
LaganLighter Source Code, 14 November 2022
MASTIFF: Structure-Aware Minimum Spanning Tree/Forest – ICS’22, 28 June 2022
SAPCo Sort: Optimizing Degree-Ordering for Power-Law Graphs – ISPASS’22 (Poster), 23 May 2022
LOTUS: Locality Optimizing Triangle Counting – PPOPP’22, 5 April 2022
Locality Analysis of Graph Reordering Algorithms – IISWC’21, 8 November 2021
Thrifty Label Propagation: Fast Connected Components for Skewed-Degree Graphs – IEEE CLUSTER’21, 9 September 2021
Exploiting in-Hub Temporal Locality in SpMV-based Graph Processing – ICPP’21, 9 August 2021
How Do Graph Relabeling Algorithms Improve Memory Locality? ISPASS’21 (Poster), 28 March 2021

Accelerating Loading WebGraphs in ParaGrapher

Posted on 2 December 2024 by Mohsen Koohi Esfahani

PDF version
DOI: 10.48550/arXiv.2507.00716

ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. This capability accelerates the design and implementation of new high-performance graph algorithms and their evaluation on a wide range of graphs and across different frameworks.

However, our previous study identified two major limitations in ParaGrapher: inefficient utilization of high-bandwidth storage and reduced decompression bandwidth due to increased compression ratios. To address these limitations, we present two optimizations for ParaGrapher in this paper.

To improve storage utilization, particularly for high-bandwidth storage, we introduce ParaGrapher-FUSE (PG-Fuse), a filesystem based on FUSE (Filesystem in Userspace). PG-Fuse optimizes storage access by increasing the size of requested blocks, reducing the number of calls to the underlying filesystem, and caching the received blocks in memory for future calls.
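The idea of serving many small reads from a few large cached blocks can be sketched as follows. This is an illustration of the general technique only, not the PG-Fuse implementation: the class `BlockCachedReader`, its methods, and the unbounded dictionary cache are assumptions, and no FUSE calls are involved.

```python
import io

class BlockCachedReader:
    """Serves small reads from large, cached blocks of an underlying file."""

    def __init__(self, raw, block_size):
        self.raw = raw                # any seekable binary file object
        self.block_size = block_size  # size of each request to the storage
        self.cache = {}               # block index -> bytes (unbounded here)
        self.underlying_reads = 0     # number of calls to the underlying file

    def _block(self, index):
        if index not in self.cache:
            self.raw.seek(index * self.block_size)
            self.cache[index] = self.raw.read(self.block_size)
            self.underlying_reads += 1
        return self.cache[index]

    def read_at(self, offset, length):
        out = bytearray()
        end = offset + length
        while offset < end:
            index, within = divmod(offset, self.block_size)
            chunk = self._block(index)[within:within + (end - offset)]
            if not chunk:  # reached the end of the file
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

A short usage example: with `reader = BlockCachedReader(io.BytesIO(data), 256)`, the calls `reader.read_at(10, 4)` and `reader.read_at(0, 8)` both fall in block 0, so only the first one touches the underlying file.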

To improve the decompression bandwidth, we introduce CompBin, a compact binary representation of the CSR format. CompBin facilitates direct accesses to neighbors while preventing storage usage for unused bytes.
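As a rough illustration of a compact binary neighbor array, the sketch below stores every neighbor ID in the smallest whole number of bytes that can hold the largest vertex ID, so that the i-th neighbor sits at a fixed byte position. The exact CompBin layout is not given in this post; the fixed per-graph byte width, the little-endian order, and the function names are assumptions.

```python
def bytes_per_id(num_vertices):
    # Smallest whole number of bytes that can represent the largest vertex ID.
    max_id = max(num_vertices - 1, 0)
    return max(1, (max_id.bit_length() + 7) // 8)

def pack_neighbors(neighbors, num_vertices):
    # Assumed layout: fixed-width little-endian IDs, back to back.
    width = bytes_per_id(num_vertices)
    packed = b"".join(v.to_bytes(width, "little") for v in neighbors)
    return packed, width

def neighbor_at(packed, width, index):
    # Direct access: the position of a neighbor is index * width.
    start = index * width
    return int.from_bytes(packed[start:start + width], "little")
```

With 70,000 vertices, each ID takes 3 bytes rather than the 4 bytes of a 32-bit integer, and `neighbor_at` still reads any neighbor without decoding the ones before it.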

Our evaluation on 12 real-world and synthetic graphs with up to 128 billion edges shows that PG-Fuse and CompBin achieve up to 7.6 and 21.8 times speedup, respectively.

Source Code

https://github.com/DIPSA-QUB/ParaGrapher

API Documentation

Please refer to the Wiki, https://github.com/DIPSA-QUB/ParaGrapher/wiki/API-Documentation, or download the PDF file using https://github.com/DIPSA-QUB/ParaGrapher/raw/main/doc/api.pdf .

BibTex

@misc{pg_fuse,
      title={Accelerating Loading WebGraphs in ParaGrapher}, 
      author={Mohsen {Koohi Esfahani}},
      year={2025},
      eprint={2507.00716},
      archivePrefix={arXiv},
      primaryClass={cs.DC},
      url={https://arxiv.org/abs/2507.00716}, 
}

Random Vertex Relabelling in LaganLighter

Posted on 21 August 2024 by Mohsen Koohi Esfahani

To evaluate the impact of locality-optimizing reordering algorithms, a baseline is required. To create the baseline, a random assignment of IDs to vertices may be used to produce a representation of the graph with reduced locality [DOI:10.1109/ISPASS57527.2023.00029, DOI:10.1109/IISWC53511.2021.00020].

To that end, we create the random_ordering() function in the relabel.c file. It consists of a number of iterations. In each iteration, concurrent threads traverse the list of vertices and assign them new IDs. The function uses xoshiro to produce random numbers.
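A minimal sequential sketch of random relabelling is shown below. It is not the code in relabel.c: it swaps in a Fisher-Yates shuffle for the concurrent traversal and Python's `random.Random` for xoshiro, and the names `random_relabel` and `apply_relabel` are hypothetical.

```python
import random

def random_relabel(num_vertices, iterations=1, seed=0):
    # new_id[v] is the ID assigned to vertex v; start from the identity.
    rng = random.Random(seed)  # stand-in for xoshiro
    new_id = list(range(num_vertices))
    for _ in range(iterations):
        # Fisher-Yates shuffle: one pass over the vertices per iteration.
        for i in range(num_vertices - 1, 0, -1):
            j = rng.randint(0, i)
            new_id[i], new_id[j] = new_id[j], new_id[i]
    return new_id

def apply_relabel(edges, new_id):
    # Rewrite every edge endpoint with its new ID.
    return [(new_id[src], new_id[dst]) for src, dst in edges]
```

Because `new_id` is always a permutation of the original IDs, the relabelled graph is isomorphic to the input; only the placement of neighbors in the ID space changes.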

alg4_randomize tests this function on a number of graphs. For each dataset, an initial plot of the degree distribution of Neighbor to Neighbor Average ID Distance (N2N AID) [DOI:10.1109/IISWC53511.2021.00020] is created. In addition, the N2N AID distribution is plotted after each iteration of random_ordering(). This shows the impact of randomization.

The complete results for all graphs can be seen in this PDF file. The results for some graphs are in the following.

The algorithm has been executed on a machine with two AMD 7401 CPUs, 128 cores, 128 threads. The report created by the launcher is in the following.

Technical Posts

Random Vertex Relabelling in LaganLighter, 21 August 2024
Minimum Spanning Forest of MS-BioGraphs, 9 August 2024
Topology-Based Thread Affinity Setting (Thread Pinning) in OpenMP, 3 August 2024
An (Incomplete) List of Publicly Available Graph Datasets/Generators, 21 June 2024
An Evaluation of Bandwidth of Different Storage Types (HDD vs. SSD vs. LustreFS) for Different Block Sizes and Different Parallel Read Methods (mmap vs pread vs read), 20 April 2024
SIMD Bit Twiddling Hacks, 25 November 2023
LaganLighter Source Code, 14 November 2022

Minimum Spanning Forest of MS-BioGraphs

Posted on 9 August 2024 by Mohsen Koohi Esfahani

We use MASTIFF to compute the weight of the Minimum Spanning Forest (MSF) of MS-BioGraphs while ignoring self-edges of the graphs.
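To make the computed quantity concrete, here is a small sketch that returns the MSF weight of an edge list while skipping self-edges. It uses textbook Kruskal with union-find, not MASTIFF, and the function name `msf_weight` and the `(u, v, weight)` edge format are assumptions.

```python
def msf_weight(num_vertices, edges):
    # edges: iterable of (u, v, weight); self-edges (u == v) are ignored.
    parent = list(range(num_vertices))

    def find(v):
        # Find the component root, halving the path on the way up.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    total = 0
    # Kruskal: scan edges by increasing weight, keep those joining components.
    for weight, u, v in sorted((w, u, v) for u, v, w in edges if u != v):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            total += weight
    return total
```

On a disconnected graph the result is the sum over one spanning tree per component, which is why the weight is reported for a forest rather than a single tree.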

– MS1

Using a machine with 24 cores.

MSF weight: 109,915,787,546

– MS50

Using a machine with 128 cores.

MSF weight: 416,318,200,808

MS-BioGraphs
Related Posts

Minimum Spanning Forest of MS-BioGraphs, 9 August 2024
MS-BioGraphs on IEEE DataPort, 17 April 2024
ParaGrapher Source Code For WebGraph Types, 16 February 2024
On Overcoming HPC Challenges of Trillion-Scale Real-World Graph Datasets – BigData’23 (Short Paper), 15 December 2023
Dataset Announcement: MS-BioGraphs, Trillion-Scale Public Real-World Sequence Similarity Graphs – IISWC’23 (Poster), 2 October 2023
MS-BioGraphs: Sequence Similarity Graph Datasets, 30 August 2023
MS-BioGraphs MS, 10 August 2023
MS-BioGraphs MSA500, 10 August 2023
MS-BioGraphs MS200, 10 August 2023
MS-BioGraphs MSA200, 10 August 2023
MS-BioGraphs MS50, 10 August 2023
MS-BioGraphs MSA50, 10 August 2023
MS-BioGraphs MSA10, 10 August 2023
MS-BioGraphs MS1, 10 August 2023
MS-BioGraphs Validation, 10 August 2023

Topology-Based Thread Affinity Setting (Thread Pinning) in OpenMP

Posted on 3 August 2024 by Mohsen Koohi Esfahani

In applications such as graph processing, how threads are pinned to CPU cores matters: threads that share resources (such as memory and cache) can improve performance by processing consecutive blocks of the input dataset, especially when the dataset has a high level of locality.

In LaganLighter, we read the CPU topology to specify how OpenMP threads are pinned. In the omp.c file, the block starting with the comment “Reading sibling groups of each node” reads the “/sys/devices/system/cpu/cpu*/topology/thread_siblings” files to identify the sibling threads, and three arrays (“node_sibling_groups_start_ID”, “sibling_group_cpus_start_offsets”, and “sibling_groups_cpus”) are used to store the sibling CPUs.

Then, in the block starting with the comment “Setting affinity of threads”, the sibling groups are read and, based on the total number of threads requested by the user, a number of threads with consecutive IDs are pinned to sibling CPUs.
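The two steps can be sketched outside of C as follows. This is not the code in omp.c: the function names are hypothetical, and the policy in `assign_threads` (split the threads evenly over the sibling groups, then cycle over the CPUs of a group) is an assumption about what "consecutive IDs pinned to sibling CPUs" could look like. The sketch relies only on the format of a thread_siblings file, a hexadecimal CPU bitmask written in comma-separated groups.

```python
def parse_sibling_mask(text):
    # "00000000,00001001" -> bits 0 and 12 are set -> CPUs [0, 12].
    mask = int(text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]

def sibling_groups(masks_by_cpu):
    # Every CPU of a core reports the same mask, so deduplicate the groups.
    return sorted({tuple(parse_sibling_mask(m)) for m in masks_by_cpu.values()})

def assign_threads(groups, num_threads):
    # Assumed policy: consecutive thread IDs share a sibling group.
    per_group = -(-num_threads // len(groups))  # ceiling division
    placement = {}
    for thread in range(num_threads):
        group = groups[thread // per_group]
        placement[thread] = group[(thread % per_group) % len(group)]
    return placement
```

On Linux, the masks could be read from the files named above and a thread could then be bound with a call such as `os.sched_setaffinity`; both steps are left out here so that the sketch runs on any machine.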

For a machine with 24 cores, 48 hyperthreads, when 48 threads are requested, we have:

If 96 threads are created, we have:

An (Incomplete) List of Publicly Available Graph Datasets/Generators

Posted on 21 June 2024 by Mohsen Koohi Esfahani

Short URL of this post: https://blogs.qub.ac.uk/DIPSA/graphs-list-2024

Real-World Graphs

Smaller Graphs

Synthetic Graph Generators

Graph500, [OSTI ID: 1014641], https://github.com/graph500/graph500/tree/newreference/generator
KaGen, [DOI: 10.1109/IPDPS.2018.00043], https://github.com/KarlsruheGraphGeneration/KaGen
GTgraph, https://github.com/Bader-Research/GTgraph
Smooth Kronecker, [DOI: 10.1145/3398682.3399161], https://github.com/dmargo/smooth_kron_gen

Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher – arXiv Version

Posted on 1 May 2024 by Mohsen Koohi Esfahani

PDF version
DOI: 10.48550/arXiv.2404.19735

Comprehensive evaluation is one of the bases of experimental science. In High-Performance Graph Processing, a thorough evaluation of contributions becomes more achievable by supporting common input formats across different frameworks. However, each framework creates its own format, which may not support reading large-scale real-world graph datasets. This shows a demand for high-performance libraries capable of loading graphs to (i) accelerate the design of new graph algorithms, (ii) evaluate contributions on a wide range of graph algorithms, and (iii) facilitate easy and fast comparison across different graph frameworks.

To that end, we present ParaGrapher, a high-performance API and library for loading large-scale and compressed graphs. ParaGrapher supports different types of requests for accessing graphs in shared- and distributed-memory and out-of-core graph processing. We explain the design of ParaGrapher and present a performance model of graph decompression, which is used for evaluation of ParaGrapher over three storage types.

Our evaluation shows that by decompressing compressed graphs in WebGraph format, ParaGrapher delivers up to 3.2 times speedup in loading and up to 5.2 times speedup in end-to-end execution in comparison to the binary and textual formats.

ParaGrapher is available online at https://blogs.qub.ac.uk/DIPSA/ParaGrapher/.

Source Code

https://github.com/DIPSA-QUB/ParaGrapher

API Documentation

Please refer to the Wiki, https://github.com/DIPSA-QUB/ParaGrapher/wiki/API-Documentation, or download the PDF file using https://github.com/DIPSA-QUB/ParaGrapher/raw/main/doc/api.pdf .

BibTex

@misc{paragrapher-arxiv,
  title = { Selective Parallel Loading of Large-Scale 
            Compressed Graphs with {ParaGrapher}}, 
  author = { {Mohsen} {Koohi Esfahani} and Marco D'Antonio and 
             Syed Ibtisam Tauhidi and Thai Son Mai and 
             Hans Vandierendonck},
  year = {2024},
  eprint = {2404.19735},
  archivePrefix = {arXiv},
  primaryClass = {cs.AR},
  doi = {10.48550/arXiv.2404.19735},
  url={https://arxiv.org/abs/2404.19735}, 
}

MS-BioGraphs on IEEE DataPort

Posted on 17 April 2024 by Mohsen Koohi Esfahani

MS-BioGraphs sequence similarity graph datasets are now publicly available on IEEE DataPort: https://doi.org/10.21227/gmd9-1534 .

To access the files, you need to register/login to IEEE DataPort and then visit the MS-BioGraphs page. After saving the page as an HTML file such as dp.html, you may download a dataset (for example, MS1) using the following script:

dsname="MS1"
html_file="dp.html"

# Extract the unique download URLs of the selected dataset from the saved page
urls=$(sed -e 's/&amp;/\&/g' "$html_file" | grep -Eo "(http|https)://[a-zA-Z0-9./?&=_%:-]*" | grep amazonaws | sort -u | grep -E "${dsname}[-_.]")

# Download each file and stop at the first failure
for u in $urls; do
    wget "$u" || break
done

# Remove query strings from the names of the downloaded files; the directory
# is the first argument of the script and defaults to the current directory
find "${1:-.}" -type f -name '*[?]*' | while IFS= read -r f; do
    mv "$f" "${f%%\?*}"
done

# Link offsets.bin so that it can be found by ParaGrapher
ln -s "${dsname}_offsets.bin" "${dsname}-underlying_offsets.bin"

Instead of wget you may use axel -n 10 to use multiple connections (here, 10) for downloading each file (https://manpages.ubuntu.com/manpages/noble/en/man1/axel.1.html).

Dataset Announcement: MS-BioGraphs, Trillion-Scale Public Real-World Sequence Similarity Graphs – IISWC’23 (Poster)

Posted on 2 October 2023 by Mohsen Koohi Esfahani

2023 IEEE International Symposium on Workload Characterization (IISWC’23)
October 1-3, 2023, Ghent, Belgium

DOI: 10.1109/IISWC59245.2023.00029
PDF Version

Progress in High-Performance Computing in general, and High-Performance Graph Processing in particular, is highly dependent on the availability of publicly-accessible, relevant, and realistic data sets.

In this paper, we announce the publication of MS-BioGraphs, a new family of publicly available real-world edge-weighted graph datasets with up to 2.5 trillion edges, that is, 6.6 times greater than the largest graph published recently.

We briefly review the two main challenges we faced in generating large graph datasets and our solutions, namely (i) optimizing data structures and algorithms for this multi-step process and (ii) the WebGraph parallel compression technique. We also study some characteristics of MS-BioGraphs.

The datasets are available on https://blogs.qub.ac.uk/DIPSA/MS-BioGraphs .

Please visit https://blogs.qub.ac.uk/DIPSA/MS-BioGraphs-Sequence-Similarity-Graph-Datasets/ for a complete version of this paper.

BibTex

@INPROCEEDINGS{10.1109/IISWC59245.2023.00029,
   author = {Koohi Esfahani, Mohsen and Boldi, Paolo and Vandierendonck, Hans and Kilpatrick,  Peter and  Vigna, Sebastiano},  
  booktitle={2023 IEEE International Symposium on Workload Characterization (IISWC'23)},  
  title={Dataset Announcement: {MS-BioGraphs}, Trillion-Scale Public Real-World Sequence Similarity Graphs}, 
  year={2023},
  volume={},
  number={},
  pages={},
  location={Belgium, Ghent},
  publisher={IEEE Computer Society},
  doi={10.1109/IISWC59245.2023.00029}
}

DIPSA: Data-Intensive Parallel Systems and Algorithms

Tag Archives: high-performance graph processing
