The ParaGrapher source code for accessing WebGraphs has been published. The supported graph types are:
PARAGRAPHER_CSX_WG_400_AP
: graphs compressed in WebGraph format with 4-byte vertex IDs. Graphs in this category: LAW web graphs (https://law.di.unimi.it/datasets.php).
PARAGRAPHER_CSX_WG_404_AP
: graphs compressed in WebGraph format with 4-byte vertex IDs and 4-byte integer edge weights. Graphs in this category: MS-BioGraphs (https://blogs.qub.ac.uk/DIPSA/MS-BioGraphs/).
PARAGRAPHER_CSX_WG_800_AP
: graphs compressed in Big WebGraph format with 8-byte vertex IDs. Graphs in this category: (i) WDC Hyper Link 2012 (https://webdatacommons.org/hyperlinkgraph/) and (ii) SWH graphs (https://docs.softwareheritage.org/devel/swh-dataset/graph/dataset.html).
ParaGrapher implements these graph types through its asynchronous, parallel API. The user implements a callback function, which the API calls each time it finishes reading a block of edges. ParaGrapher uses shared memory for interaction between its C library and the Java library that runs the WebGraph framework.
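To make the callback-per-block pattern concrete, here is a minimal sketch in C. Every name in it (`edge_block`, `block_callback`, `read_edges_in_blocks`, `count_block`, `demo_read`) is hypothetical and is not the ParaGrapher API. The stand-in reader is also single-threaded and walks an in-memory array, whereas an asynchronous, parallel library may invoke the callback from its own threads, so a real callback should be written with that in mind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; NOT the ParaGrapher API. */

/* One block of edges handed to the consumer. */
typedef struct {
    uint64_t first_edge;      /* index of the first edge in this block */
    uint64_t edge_count;      /* number of edges in this block */
    const uint32_t *targets;  /* destination vertex IDs, 4 bytes each */
} edge_block;

/* User-supplied callback, invoked once per completed block. */
typedef void (*block_callback)(const edge_block *block, void *user_data);

/* Stand-in reader: walks an in-memory edge array in fixed-size blocks and
   invokes the callback after each block is "read". A real library would
   decompress each block and might call back from worker threads. */
static void read_edges_in_blocks(const uint32_t *edges, uint64_t total_edges,
                                 uint64_t block_size, block_callback cb,
                                 void *user_data)
{
    assert(cb != NULL);
    if (block_size == 0)
        return;
    for (uint64_t start = 0; start < total_edges; start += block_size) {
        uint64_t remaining = total_edges - start;
        edge_block blk;
        blk.first_edge = start;
        blk.edge_count = remaining < block_size ? remaining : block_size;
        blk.targets = edges + start;
        cb(&blk, user_data);
    }
}

/* State accumulated by the example callback. */
typedef struct {
    uint64_t blocks_seen;
    uint64_t edges_seen;
    uint64_t target_sum;
} block_stats;

/* Example callback: counts blocks and edges and sums the target IDs. */
static void count_block(const edge_block *block, void *user_data)
{
    block_stats *s = user_data;
    s->blocks_seen++;
    s->edges_seen += block->edge_count;
    for (uint64_t i = 0; i < block->edge_count; i++)
        s->target_sum += block->targets[i];
}

/* Reads 7 edges in blocks of 3 and returns what the callback observed. */
static block_stats demo_read(void)
{
    static const uint32_t edges[] = {1, 2, 3, 4, 5, 6, 7};
    block_stats stats = {0, 0, 0};
    read_edges_in_blocks(edges, 7, 3, count_block, &stats);
    return stats;
}
```

Calling `demo_read()` delivers the seven edges as three blocks, the last one partial. Passing state through a `void *user_data` argument is the usual C idiom for letting a callback accumulate results without global variables.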
For further details, please refer to the ParaGrapher source code repository, https://github.com/DIPSA-QUB/ParaGrapher, particularly the src/webgraph.c and src/WG*.java files.
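As a rough picture of how two components can exchange data through shared memory, the following C sketch maps the same file twice with `MAP_SHARED`, writes vertex IDs through one mapping, and reads them back through the other. It is an assumption-laden illustration, not the mechanism used in src/webgraph.c: the helper names (`map_shared`, `shared_region_demo`), the temporary file path, and the use of two mappings inside one process in place of separate C and Java processes are all made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not taken from ParaGrapher.
   Maps the file behind fd as a shared, writable region of len bytes. */
static void *map_shared(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Writes four IDs through one mapping and reads them through a second
   mapping of the same file, standing in for a producer and a consumer.
   Returns the sum seen by the consumer, or -1 on error. */
static int64_t shared_region_demo(void)
{
    char path[] = "/tmp/shm_sketch_XXXXXX";  /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file persists until fd and mappings are released */

    const size_t count = 4;
    const size_t len = count * sizeof(uint32_t);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    uint32_t *producer = map_shared(fd, len);
    uint32_t *consumer = map_shared(fd, len);
    if (producer == NULL || consumer == NULL) {
        if (producer) munmap(producer, len);
        if (consumer) munmap(consumer, len);
        close(fd);
        return -1;
    }

    /* Producer side: publish the IDs 10, 20, 30, 40. */
    for (size_t i = 0; i < count; i++)
        producer[i] = (uint32_t)(10 * (i + 1));

    /* Consumer side: the same bytes are visible through the other mapping. */
    int64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += consumer[i];

    munmap(producer, len);
    munmap(consumer, len);
    close(fd);
    return sum;
}
```

With two real processes the writer would additionally need some way to signal that a region is ready (a flag in the region, a semaphore, or similar), which this sketch leaves out.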