{"id":3529,"date":"2024-12-02T05:51:29","date_gmt":"2024-12-02T05:51:29","guid":{"rendered":"https:\/\/blogs.qub.ac.uk\/dipsa\/?p=3529"},"modified":"2025-11-09T10:52:22","modified_gmt":"2025-11-09T10:52:22","slug":"accelerating-loading-webgraphs-in-paragrapher","status":"publish","type":"post","link":"https:\/\/blogs.qub.ac.uk\/dipsa\/accelerating-loading-webgraphs-in-paragrapher\/","title":{"rendered":"Accelerating Loading WebGraphs in ParaGrapher"},"content":{"rendered":"\n<p><strong><a href=\"https:\/\/arxiv.org\/pdf\/2507.00716\" target=\"_blank\" rel=\"noreferrer noopener\">PDF version<\/a><\/strong><br><strong>DOI: <\/strong><a href=\"https:\/\/doi.org\/10.48550\/arXiv.2507.00716\"><strong>10.48550\/arXiv.2507.00716<\/strong><\/a><\/p>\n\n\n\n<p class=\"has-text-align-justify\">ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. This capability accelerates the design and implementation of new high-performance graph algorithms and their evaluation on a wide range of graphs and across different frameworks.<\/p>\n\n\n\n<p class=\"has-text-align-justify\">However, our previous study identified two major limitations in ParaGrapher: inefficient utilization of high-bandwidth storage and reduced decompression bandwidth due to increased compression ratios. To address these limitations, we present two optimizations for ParaGrapher in this paper.<\/p>\n\n\n\n<p class=\"has-text-align-justify\">To improve storage utilization, particularly for high-bandwidth storage, we introduce ParaGrapher-FUSE (PG-Fuse), a filesystem based on FUSE (Filesystem in Userspace). 
PG-Fuse optimizes storage access by increasing the size of requested blocks, reducing the number of calls to the underlying filesystem, and caching the received blocks in memory for future calls.<\/p>\n\n\n\n<p class=\"has-text-align-justify\">To improve the decompression bandwidth, we introduce CompBin, a compact binary representation of the CSR format. CompBin facilitates direct access to neighbors while avoiding storage for unused bytes.<\/p>\n\n\n\n<p class=\"has-text-align-justify\">Our evaluation on 12 real-world and synthetic graphs with up to 128 billion edges shows that PG-Fuse and CompBin achieve speedups of up to 7.6 and 21.8 times, respectively.<\/p>\n\n\n\n<p class=\"has-text-align-justify\"><strong>Source Code<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\"><strong><a href=\"https:\/\/github.com\/DIPSA-QUB\/ParaGrapher\">https:\/\/github.com\/DIPSA-QUB\/ParaGrapher<\/a><\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading has-medium-font-size\"><strong>API Documentation<\/strong><\/h2>\n\n\n\n<p class=\"has-text-align-justify\">Please refer to the Wiki, <a href=\"https:\/\/github.com\/DIPSA-QUB\/ParaGrapher\/wiki\/API-Documentation\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/github.com\/DIPSA-QUB\/ParaGrapher\/wiki\/API-Documentation<\/a>, or download the PDF file from <a href=\"https:\/\/github.com\/DIPSA-QUB\/ParaGrapher\/raw\/main\/doc\/api.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/github.com\/DIPSA-QUB\/ParaGrapher\/raw\/main\/doc\/api.pdf<\/a>.<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>BibTeX<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>@misc{pg_fuse,\n      title={Accelerating Loading WebGraphs in ParaGrapher}, \n      author={Mohsen {Koohi Esfahani}},\n      year={2025},\n      eprint={2507.00716},\n      archivePrefix={arXiv},\n      primaryClass={cs.DC},\n      url={https:\/\/arxiv.org\/abs\/2507.00716}, \n}<\/code><\/pre>\n\n\n\n<p 
class=\"has-medium-font-size\"><strong>Related Posts &amp; Source Code<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\"><a href=\"https:\/\/blogs.qub.ac.uk\/DIPSA\/ParaGrapher\"><strong>ParaGrapher Web Page<\/strong><\/a> <\/p>\n\n\n<ul class=\"wp-block-latest-posts__list has-dates wp-block-latest-posts\"><li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2025\/11\/1-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/paragrapher-a-parallel-and-distributed-graph-loading-library-for-large-scale-compressed-graphs-bigdata25\/\">ParaGrapher: A Parallel and Distributed Graph Loading Library for Large-Scale Compressed Graphs &#8211; BigData&#8217;25 (Short Paper)<\/a><time datetime=\"2025-11-08T12:02:51+00:00\" class=\"wp-block-latest-posts__post-date\">8 November 2025<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/12\/1-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/accelerating-loading-webgraphs-in-paragrapher\/\">Accelerating Loading WebGraphs in ParaGrapher<\/a><time datetime=\"2024-12-02T05:51:29+00:00\" class=\"wp-block-latest-posts__post-date\">2 December 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/05\/fern-150x150.jpg\" class=\"attachment-thumbnail 
size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/selective-parallel-loading-of-large-scale-compressed-graphs-with-paragrapher\/\">Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher &#8211; arXiv Version<\/a><time datetime=\"2024-05-01T05:44:14+01:00\" class=\"wp-block-latest-posts__post-date\">1 May 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/04\/passerine-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/an-evaluation-of-bandwidth-of-different-storage-types-hdd-vs-ssd-vs-lustrefs-for-different-block-sizes-and-different-read-methods-mmap-vs-pread-vs-read\/\">An Evaluation of Bandwidth of Different Storage Types (HDD vs. SSD vs. 
LustreFS) for Different Block Sizes and Different Parallel Read Methods (mmap vs pread vs read)<\/a><time datetime=\"2024-04-20T09:48:10+01:00\" class=\"wp-block-latest-posts__post-date\">20 April 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/02\/loriini-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/paragrapher-integrated-to-laganlighter\/\">ParaGrapher Integrated to LaganLighter<\/a><time datetime=\"2024-02-16T08:29:26+00:00\" class=\"wp-block-latest-posts__post-date\">16 February 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/02\/poplar2-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:75px;max-height:75px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/paragrapher-source-code-for-webgraph-types\/\">ParaGrapher Source Code For WebGraph Types<\/a><time datetime=\"2024-02-16T08:13:13+00:00\" class=\"wp-block-latest-posts__post-date\">16 February 2024<\/time><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>PDF versionDOI: 10.48550\/arXiv.2507.00716 ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. This capability accelerates the design and implementation of new high-performance graph algorithms and their evaluation on a wide range of graphs and across different frameworks. 
However, our previous study identified two [&hellip;]<\/p>\n","protected":false},"author":1315,"featured_media":3633,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[119],"tags":[34,35,38,64,122,66,65,19,123,124],"class_list":{"0":"post-3529","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-paragrapher","8":"tag-algorithm-design-and-engineering","9":"tag-graph-processing","10":"tag-high-performance-computing","11":"tag-high-performance-graph-processing","12":"tag-parallel-io","13":"tag-real-world-graphs","14":"tag-sequence-similarity-graphs","15":"tag-source-code","16":"tag-storage","17":"tag-trillion-scale-graph-datasets","18":"czr-hentry"},"jetpack_featured_media_url":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/12\/1.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/3529","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/users\/1315"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/comments?post=3529"}],"version-history":[{"count":3,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/3529\/revisions"}],"predecessor-version":[{"id":3630,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/3529\/revisions\/3630"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/media\/3633"}],"wp:attachment":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/media?parent=3529"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":
"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/categories?post=3529"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/tags?post=3529"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}