DOI: 10.1109/BigData66926.2025.11401782. Whereas the literature describes an increasing number of graph algorithms, loading graphs remains a time-consuming component of the end-to-end execution time. Graph frameworks often rely on custom graph storage formats that are not optimized for efficient loading of large-scale graph datasets. Furthermore, graph loading is often not optimized […]
sequence similarity graphs
PDF version. DOI: 10.48550/arXiv.2507.00716. ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. This capability accelerates the design and implementation of new high-performance graph algorithms and their evaluation on a wide range of graphs and across different frameworks. However, […]
PDF version. DOI: 10.48550/arXiv.2404.19735. Comprehensive evaluation is one of the bases of experimental science. In High-Performance Graph Processing, a thorough evaluation of contributions becomes more achievable by supporting common input formats across different frameworks. However, each framework creates its own specific format, which may not support reading large-scale real-world graph datasets. This […]
MS-BioGraphs sequence similarity graph datasets are now publicly available on IEEE DataPort: https://doi.org/10.21227/gmd9-1534. To access the files, you need to register or log in to IEEE DataPort and then visit the MS-BioGraphs page. By saving the page as an HTML file such as dp.html, you may download the datasets (as an example […]
ParaGrapher source code has been integrated into LaganLighter, and access to different WebGraph formats is available in LaganLighter. For further details, please refer to:
– LaganLighter source code repository: https://github.com/DIPSA-QUB/LaganLighter, particularly the graph.c file.
– ParaGrapher source code repository: https://github.com/DIPSA-QUB/ParaGrapher, particularly the src/webgraph.c and src/WG*.java files.
Read more about ParaGrapher and […]
ParaGrapher source code for accessing WebGraphs has been published. The supported graph types are: ParaGrapher uses its asynchronous and parallel API to implement these graph types. The user needs to implement a callback function that is called by the API upon completion of reading a block of edges. Poplar uses […]
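The callback pattern described in this excerpt can be illustrated with a small Python sketch. This is not ParaGrapher's API (ParaGrapher is a C library, and its real function names and signatures are in its repository); `read_edge_blocks_async` and `on_block` are hypothetical names invented here, and an in-memory edge list stands in for a compressed graph on disk. The sketch only shows the shape of the interaction: blocks are read in parallel, and the user's callback runs whenever one block is complete.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def read_edge_blocks_async(edges, block_size, on_block, workers=4):
    """Hypothetical stand-in for an asynchronous, parallel block loader.

    Splits `edges` into blocks, "reads" each block on a worker thread and
    calls on_block(start, block) as soon as that block is ready.
    Returns immediately with the list of futures.
    """
    pool = ThreadPoolExecutor(max_workers=workers)

    def read_one(start):
        # Slicing stands in for decompressing one block of edges.
        block = edges[start:start + block_size]
        on_block(start, block)

    futures = [pool.submit(read_one, s)
               for s in range(0, len(edges), block_size)]
    pool.shutdown(wait=False)  # already-submitted blocks still run
    return futures


# User side: the callback may run concurrently on several threads and in
# any order, so shared state is protected with a lock.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 1)]
out_degree = {}
blocks_seen = []
lock = threading.Lock()


def on_block(start, block):
    with lock:
        blocks_seen.append(start)
        for src, _dst in block:
            out_degree[src] = out_degree.get(src, 0) + 1


futures = read_edge_blocks_async(edges, block_size=2, on_block=on_block)
wait(futures)          # the caller decides when to wait for completion
for f in futures:
    f.result()         # re-raise any exception from a worker
print(out_degree)
```

Because blocks may complete out of order, the callback receives the block's starting offset rather than relying on arrival order.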
2023 IEEE International Conference on Big Data (BigData’23), December 15-18, 2023, Sorrento, Italy. DOI: 10.1109/BigData59044.2023.10386309. PDF (Authors Copy). Progress in High-Performance Computing in general, and High-Performance Graph Processing in particular, is highly dependent on the availability of publicly accessible, relevant, and realistic data sets. To ensure continuation of this progress, we (i) investigate […]
2023 IEEE International Symposium on Workload Characterization (IISWC’23), October 1-3, 2023, Ghent, Belgium. DOI: 10.1109/IISWC59245.2023.00029. PDF Version. Progress in High-Performance Computing in general, and High-Performance Graph Processing in particular, is highly dependent on the availability of publicly accessible, relevant, and realistic data sets. In this paper, we announce publication of MS-BioGraphs, a new […]
DOI: 10.48550/arXiv.2308.16744. PDF Version. arXiv Link. Progress in High-Performance Computing in general, and High-Performance Graph Processing in particular, is highly dependent on the availability of publicly accessible, relevant, and realistic data sets. To ensure continuation of this progress, we (i) investigate and optimize the process of generating large sequence similarity graphs as […]
Name: MS-BioGraphs – MS
URL: https://blogs.qub.ac.uk/DIPSA/MS-BioGraphs-MS
Download Link: https://doi.org/10.21227/gmd9-1534
Script for Downloading All Files: https://blogs.qub.ac.uk/DIPSA/MS-BioGraphs-on-IEEE-DataPort/
Validating and Sample Code: https://blogs.qub.ac.uk/DIPSA/MS-BioGraphs-Validation/
Graph Explanation: Vertices represent proteins and each edge represents the sequence similarity between its two endpoints
Edge Weighted: Yes
Directed: No
Number of Vertices: 1,757,323,526
Number of Edges: 2,488,069,027,875
Maximum […]
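To give a sense of scale, the vertex and edge counts above can be turned into a rough back-of-the-envelope estimate. The sketch below uses only the two counts from the listing; the 4-byte neighbour ID is an assumption made here for illustration, not the dataset's actual storage format, and edge weights and offsets are left out. The listing also does not say whether an undirected edge is counted once or twice, so the ratio is reported as edges per vertex rather than as an average degree.

```python
# Counts taken from the dataset listing above.
NUM_VERTICES = 1_757_323_526
NUM_EDGES = 2_488_069_027_875

# Ratio of listed edges to listed vertices.
edges_per_vertex = NUM_EDGES / NUM_VERTICES

# Every vertex ID fits in an unsigned 32-bit integer.
fits_in_32_bits = NUM_VERTICES <= 2**32

# Assumption for illustration only: one uncompressed 4-byte neighbour ID
# per listed edge, ignoring weights and offsets.
BYTES_PER_NEIGHBOUR = 4
raw_neighbour_bytes = NUM_EDGES * BYTES_PER_NEIGHBOUR
raw_neighbour_tib = raw_neighbour_bytes / 2**40

print(f"edges per vertex: {edges_per_vertex:.1f}")
print(f"vertex IDs fit in 32 bits: {fits_in_32_bits}")
print(f"uncompressed neighbour IDs: {raw_neighbour_tib:.2f} TiB")
```

Under these assumptions the neighbour IDs alone come to roughly 9 TiB uncompressed, which illustrates why compressed formats and efficient loading matter for a graph of this size.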