{"id":2288,"date":"2023-08-10T09:50:00","date_gmt":"2023-08-10T08:50:00","guid":{"rendered":"https:\/\/blogs.qub.ac.uk\/dipsa\/?p=2288"},"modified":"2024-06-20T16:29:18","modified_gmt":"2024-06-20T15:29:18","slug":"ms-biographs-msa200","status":"publish","type":"post","link":"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-msa200\/","title":{"rendered":"MS-BioGraphs MSA200"},"content":{"rendered":"\n<p><\/p>\n\n\n\n\n\t\t<div id='msb'>\n\t<table id='tb1'>\n<tr><td>Name<\/td><td>MS-BioGraphs &#8211; MSA200<\/td><\/tr>\n<tr><td>URL<\/td><td style='font-weight:normal'><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-MSA200' rel=\"noopener\">https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-MSA200<\/a><\/td><\/tr>\n<tr><td>Download Link<\/td><td><a href='https:\/\/doi.org\/10.21227\/gmd9-1534'>https:\/\/doi.org\/10.21227\/gmd9-1534<\/a><\/td><\/tr><tr><td>Script for Downloading All Files<\/td><td><a href='https:\/\/blogs.qub.ac.uk\/dipsa\/MS-BioGraphs-on-IEEE-DataPort\/'>https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-on-IEEE-DataPort\/<\/a><\/td><\/tr><tr><td>Validating and Sample Code<\/td><td><a href='https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Validation\/'>https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Validation\/<\/a><\/td><\/tr><tr><td>Graph Explanation<\/td><td style='font-weight: normal'>Vertices represent proteins and each edge represents the sequence similarity between its two endpoints<\/td><\/tr><tr><td>Edge Weighted<\/td><td>Yes<\/td><\/tr>\n<tr><td>Directed<\/td><td>Yes<\/td><\/tr>\n<tr><td>Number of Vertices<\/td><td>1,757,323,526<\/td><\/tr>\n<tr><td>Number of Edges<\/td><td>500,444,322,597<\/td><\/tr>\n<tr><td>Maximum In-Degree<\/td><td>658,879<\/td><\/tr>\n<tr><td>Maximum Out-Degree<\/td><td>709,176<\/td><\/tr>\n<tr><td>Minimum Weight<\/td><td>98<\/td><\/tr>\n<tr><td>Maximum Weight<\/td><td>634,925<\/td><\/tr>\n<tr><td>Number of Zero In-Degree Vertices<\/td><td>6,437,984<\/td><\/tr>\n<tr><td>Number of Zero Out-Degree 
Vertices<\/td><td>7,471,315<\/td><\/tr>\n<tr><td>Average In-Degree<\/td><td>285.8<\/td><\/tr>\n<tr><td>Average Out-Degree<\/td><td>286.0<\/td><\/tr>\n<tr><td>Size of The Largest Weakly Connected Component<\/td><td>496,880,685,957<\/td><\/tr>\n<tr><td>Number of Weakly Connected Components<\/td><td>221,467,156<\/td><\/tr>\n<tr><td>Creation Details<\/td><td style='font-weight: normal'><a href='https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Sequence-Similarity-Graph-Datasets\/' target='_blank' rel=\"noopener\">MS-BioGraphs: Sequence Similarity Graph Datasets<\/a><\/td><\/tr><tr><td>Format<\/td><td><a href='https:\/\/webgraph.di.unimi.it\/' target='_blank' rel=\"noopener\">WebGraph<\/a><\/td><\/tr><tr><td>License<\/td><td><a target='_blank' href='https:\/\/creativecommons.org\/licenses\/by-nc-sa\/2.0\/' rel=\"noopener\">CC BY-NC-SA<\/a><\/td><\/tr><tr><td>QUB IDF<\/td><td style='font-weight: normal'>2223-052<\/td><\/tr>\n<tr><td>DOI<\/td><td><a target='_blank' href='https:\/\/doi.org\/10.5281\/zenodo.7820815' rel=\"noopener\">10.5281\/zenodo.7820815<\/a><\/td><\/tr>\n<tr><td>Citation<\/td><td style='font-weight:normal'><pre>Mohsen Koohi Esfahani, Sebastiano Vigna, \nPaolo Boldi, Hans Vandierendonck, Peter Kilpatrick, March 13, 2024, \n\"MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets\", \nIEEE Dataport, doi: https:\/\/doi.org\/10.21227\/gmd9-1534.<\/pre><\/td><\/tr>\n<tr><td>Bibtex<\/td><td style='font-weight:normal'><pre>@data{gmd9-1534-24,\ndoi = {10.21227\/gmd9-1534},\nurl = {https:\/\/doi.org\/10.21227\/gmd9-1534},\nauthor = {Koohi Esfahani, Mohsen and Vigna, Sebastiano and Boldi, \nPaolo and Vandierendonck, Hans and Kilpatrick, Peter},\npublisher = {IEEE Dataport},\ntitle = {MS-BioGraphs: Trillion-Scale Sequence Similarity Graph Datasets},\nyear = {2024} }<\/pre><\/td><\/tr>\n<\/table><br><br><h2>Files<\/h2><table id='files'>\n\n\t\t<tr>\n\t\t\t<td>Underlying Graph<\/td>\n\t\t\t<td>\n\t\t\t\tThe underlying graph in WebGraph format: 
\n\t\t\t\t<ul>\n\t\t\t\t\t<li>File: MSA200-underlying.graph, Size: 1,558,147,532,780 Bytes<\/li>\n\t\t\t\t\t<li>File: MSA200-underlying.offsets, Size: 4,319,801,854 Bytes<\/li>\n\t\t\t\t\t<li>File: MSA200-underlying.properties, Size: 1,517 Bytes<\/li>\n\t\t\t\t\t\n\t\t\t\t<\/ul>\n\t\t\t\tTotal Size: 1,562,467,336,151 Bytes<br>\n\t\t\t\tThese files are validated using the &#8216;Edge Blocks SHAs File&#8217; described below.\n\t\t\t<\/td>\n\t\t<\/tr>\n\n\t\t<tr>\n\t\t\t<td>Weights (Labels)<\/td>\n\t\t\t<td>\n\t\t\t\tThe weights of the graph in WebGraph format: \n\t\t\t\t<ul>\n\t\t\t\t\t<li>File: MSA200-weights.labels, Size: 1,105,784,580,128 Bytes<\/li>\n\t\t\t\t\t<li>File: MSA200-weights.labeloffsets, Size: 4,123,546,304 Bytes<\/li>\n\t\t\t\t\t<li>File: MSA200-weights.properties, Size: 187 Bytes<\/li>\n\t\t\t\t<\/ul>\n\t\t\t\tTotal Size: 1,109,908,126,619 Bytes<br>\n\t\t\t\tThese files are validated using the &#8216;Edge Blocks SHAs File&#8217; described below.\n\t\t\t<\/td>\n\t\t<\/tr>\n\n\t\t<tr>\n\t\t\t<td>Edge Blocks SHAs File (Text) <\/td>\n\t\t\t<td>\n\t\t\t\tThis file contains the shasums of edge blocks, where each block contains \n\t\t\t\t64 million consecutive edges and has one shasum for its 64M endpoints and \n\t\t\t\tone for its 64M edge weights. <br>\n\t\t\t\tThe file is used to validate the underlying graph and the weights.\n\t\t\t\tFor further explanation of the validation process, please visit\n\t\t\t\t<a href='https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Validation'>\n\t\t\t\thttps:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Validation<\/a>.\n\n\t\t\t\t<ul>\n\t\t\t\t\t<li>Name: MSA200_edges_shas.txt<\/li>\n\t\t\t\t\t<li>Size: 895,200 Bytes<\/li>\n\t\t\t\t\t<li>SHASUM: de1ac0ddce536168881ca2e49e6d5f0cf5b82bb5<\/li>\n\t\t\t\t\t\n\t\t\t\t<\/ul>\n\t\t\t<\/td>\n\t\t<\/tr>\n\n\t\t<tr>\n\t\t\t<td>Offsets (Binary)<\/td>\n\t\t\t<td>\n\t\t\t\tThe offsets array of the CSX (Compressed Sparse Rows\/Columns) graph in binary \n\t\t\t\tformat and little endian order. 
It consists of |V|+1 8-byte elements. <br>\n\t\t\t\tThe first and last values are 0 and |E|, respectively.<br>\n\t\t\t\tThis array helps convert the graph (or parts of it) from WebGraph format \n\t\t\t\tto binary format in one pass over the (related) edges.<br>\n\t\t\t\t<ul>\n\t\t\t\t\t<li>Name: MSA200_offsets.bin<\/li>\n\t\t\t\t\t<li>Size: 14,058,588,216 Bytes<\/li>\n\t\t\t\t\t<li>SHASUM: c241d2dc4bdf46f60c1cd889ac367504d3f58805<\/li>\n\t\t\t\t\t\n\t\t\t\t<\/ul>\n\t\t\t<\/td>\n\t\t<\/tr>\n\n\t\t<tr>\n\t\t\t<td>WCC (Binary)<\/td>\n\t\t\t<td>\n\t\t\t\tThe Weakly-Connected Component (WCC) array in binary format and little endian order.<br>\n\t\t\t\tThis array consists of |V| 4-byte elements.<br>\n\t\t\t\tThe vertices in the same component have the same value in the WCC array.<br>\n\t\t\t\t<ul>\n\t\t\t\t\t<li>Name: MSA200-wcc.bin<\/li>\n\t\t\t\t\t<li>Size: 7,029,294,104 Bytes<\/li>\n\t\t\t\t\t<li>SHASUM: 2cb256d5e49e5dd0989715cb909fd8f27bfbd04c<\/li>\n\t\t\t\t\t\n\t\t\t\t<\/ul>\n\t\t\t<\/td>\n\t\t<\/tr>\n\n\t\t\t<tr>\n\t\t\t\t<td>Transposed&#8217;s Offsets (Binary)<\/td>\n\t\t\t\t<td>\n\t\t\t\t\tThe offsets array of the transposed graph in binary format and little endian order.\n\t\t\t\t\tIt consists of |V|+1 8-byte elements. 
The first and last values are 0 and |E|, respectively.<br>\n\t\t\t\t\tIt helps to transpose the graph in one pass over the edges.<br>\n\n\t\t\t\t\t<ul>\n\t\t\t\t\t\t<li>Name: MSA200_trans_offsets.bin<\/li>\n\t\t\t\t\t\t<li>Size: 14,058,588,216 Bytes<\/li>\n\t\t\t\t\t\t<li>SHASUM: 47787ac64fb4485da02e3bcdc1696a814adfdb86<\/li>\n\t\t\t\t\t\t\n\t\t\t\t\t<\/ul>\n\t\t\t\t<\/td>\n\t\t\t<\/tr>\n\n\t\t<tr>\n\t\t\t<td>Names (tar.gz) <\/td>\n\t\t\t<td>\n\t\t\t\tThis compressed file contains 120 files in CSV format using &#8216;;&#8217; as the separator.\n\t\t\t\tEach row has two columns: the ID of the vertex and the name of the sequence.<br>\n\t\t\t\tNote: If the graph has an &#8216;N2O Reordering&#8217; file, the n2o array should \n\t\t\t\tbe used to convert the vertex ID to the old vertex ID, which identifies the\n\t\t\t\tname of the protein in the `names.tar.gz` file.\n\t\t\t\t\n\t\t\t\t<ul>\n\t\t\t\t\t<li>Name: names.tar.gz<\/li>\n\t\t\t\t\t<li>Size: 27,130,045,933 Bytes<\/li>\n\t\t\t\t\t<li>SHASUM: ba00b58bbb2795445554058a681b573c751ef315<\/li>\n\t\t\t\t\t\n\t\t\t\t<\/ul>\n\t\t\t<\/td>\n\t\t<\/tr>\n\n\t\t<tr>\n\t\t\t<td>OJSON<\/td>\n\t\t\t<td>\n\t\t\t\tThe characteristics of the graph and the shasums of the files.<br>\n\t\t\t\tIt is in the open JSON format and needs a closing brace (}) to be appended\n\t\t\t\tbefore being passed to a JSON parser.\n\n\t\t\t\t<ul>\n\t\t\t\t\t<li>Name: MSA200.ojson<\/li>\n\t\t\t\t\t<li>Size: 897 Bytes<\/li>\n\t\t\t\t\t<li>SHASUM: 18e371cbb4bd9dbe6515e4528956ff32fb2e30c4<\/li>\n\t\t\t\t\t\n\t\t\t\t<\/ul>\n\t\t\t<\/td>\n\t\t<\/tr>\n<\/table><br><br><h2>Plots<\/h2><p style='font-size:1.2em'>For an explanation of the plots, please refer to the <a href='https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Sequence-Similarity-Graph-Datasets'>MS-BioGraphs paper<\/a>.<br>For a higher-resolution view, please click on the images.<\/p><table id='plots'><tr><td>In-Degree Distribution<\/td><td><a target='_blank' 
href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-1.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-1.png' \/><\/a><\/td><\/tr>\n<tr><td>Out-Degree Distribution<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-2.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-2.png' \/><\/a><\/td><\/tr>\n<tr><td>Weight Distribution<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-3.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-3.png' \/><\/a><\/td><\/tr>\n<tr><td>Vertex-Relative Weight Distribution<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-4.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-4.png' \/><\/a><\/td><\/tr>\n<tr><td>Degree Decomposition<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-5.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-5.png' \/><\/a><\/td><\/tr>\n<tr><td>Push and Pull Locality<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-6.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-6.png' \/><\/a><\/td><\/tr>\n<tr><td>Cell-Binned Average Weight  Degree Distribution<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-7.png' rel=\"noopener\"><img 
src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/MSA200-7.png' \/><\/a><\/td><\/tr>\n<tr><td>Weakly-Connected Components Size Distribution<\/td><td><a target='_blank' href='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/wcc-asym.png' rel=\"noopener\"><img src='https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/wcc-asym.png' \/><\/a><\/td><\/tr>\n<\/table><\/div>\n\n\n\n<p class=\"has-medium-font-size\"><br><strong><a rel=\"noreferrer noopener\" href=\"https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs\/\" target=\"_blank\">MS-BioGraphs<\/a><\/strong><\/p>\n\n\n\n<p class=\"has-medium-font-size\"><br><strong>Related Posts<\/strong><\/p>\n\n\n<ul class=\"wp-block-latest-posts__list has-dates wp-block-latest-posts\"><li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/08\/trees-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/minimum-spanning-forest-of-ms-biographs\/\">Minimum Spanning Forest of MS-BioGraphs<\/a><time datetime=\"2024-08-09T14:11:36+01:00\" class=\"wp-block-latest-posts__post-date\">9 August 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/04\/ivy-2-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-on-ieee-dataport\/\">MS-BioGraphs on IEEE DataPort<\/a><time datetime=\"2024-04-17T07:26:23+01:00\" 
class=\"wp-block-latest-posts__post-date\">17 April 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2024\/02\/poplar2-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/paragrapher-source-code-for-webgraph-types\/\">ParaGrapher Source Code For WebGraph Types<\/a><time datetime=\"2024-02-16T08:13:13+00:00\" class=\"wp-block-latest-posts__post-date\">16 February 2024<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/11\/goldcrest-1-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/on-overcoming-hpc-challenges-of-trillion-scale-real-world-graph-datasets\/\">On Overcoming HPC Challenges of  Trillion-Scale Real-World Graph Datasets \u2013 BigData&#8217;23 (Short Paper)<\/a><time datetime=\"2023-12-15T02:47:00+00:00\" class=\"wp-block-latest-posts__post-date\">15 December 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/10-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" 
href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/dataset-announcement-ms-biographs-trillion-scale-public-real-world-sequence-similarity-graphs\/\">Dataset Announcement: MS-BioGraphs, Trillion-Scale Public Real-World Sequence Similarity Graphs &#8211; IISWC&#8217;23 (Poster)<\/a><time datetime=\"2023-10-02T00:26:00+01:00\" class=\"wp-block-latest-posts__post-date\">2 October 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/2-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-sequence-similarity-graph-datasets\/\">MS-BioGraphs: Sequence Similarity Graph Datasets<\/a><time datetime=\"2023-08-30T06:52:00+01:00\" class=\"wp-block-latest-posts__post-date\">30 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/1-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-ms\/\">MS-BioGraphs MS<\/a><time datetime=\"2023-08-10T09:53:42+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/6-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" 
style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-msa500\/\">MS-BioGraphs MSA500<\/a><time datetime=\"2023-08-10T09:52:00+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/3-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-ms200\/\">MS-BioGraphs MS200<\/a><time datetime=\"2023-08-10T09:51:00+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/7-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-msa200\/\">MS-BioGraphs MSA200<\/a><time datetime=\"2023-08-10T09:50:00+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/4-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" 
href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-ms50\/\">MS-BioGraphs MS50<\/a><time datetime=\"2023-08-10T09:49:00+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/8-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-msa50\/\">MS-BioGraphs MSA50<\/a><time datetime=\"2023-08-10T09:48:00+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/9-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-msa10\/\">MS-BioGraphs MSA10<\/a><time datetime=\"2023-08-10T09:44:41+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/5-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-ms1\/\">MS-BioGraphs MS1<\/a><time datetime=\"2023-08-10T09:41:14+01:00\" class=\"wp-block-latest-posts__post-date\">10 
August 2023<\/time><\/li>\n<li><div class=\"wp-block-latest-posts__featured-image alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/11-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" style=\"max-width:60px;max-height:60px;\" \/><\/div><a class=\"wp-block-latest-posts__post-title\" href=\"https:\/\/blogs.qub.ac.uk\/dipsa\/ms-biographs-validation\/\">MS-BioGraphs Validation<\/a><time datetime=\"2023-08-10T09:40:00+01:00\" class=\"wp-block-latest-posts__post-date\">10 August 2023<\/time><\/li>\n<\/ul>\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Name MS-BioGraphs &#8211; MSA200 URL https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-MSA200 Download Link https:\/\/doi.org\/10.21227\/gmd9-1534 Script for Downloading All Files https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-on-IEEE-DataPort\/ Validating and Sample Code https:\/\/blogs.qub.ac.uk\/DIPSA\/MS-BioGraphs-Validation\/ Graph Explanation Vertices represent proteins and each edge represents the sequence similarity between its two endpoints Edge Weighted Yes Directed Yes Number of Vertices 1,757,323,526 Number of Edges 500,444,322,597 Maximum In-Degree 658,879 Maximum Out-Degree 709,176 
[&hellip;]<\/p>\n","protected":false},"author":1315,"featured_media":2291,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[63],"tags":[116,68,67,35,38,64,66,65],"class_list":{"0":"post-2288","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-ms-biographs","8":"tag-biological-networks","9":"tag-dataset","10":"tag-graph-datasets","11":"tag-graph-processing","12":"tag-high-performance-computing","13":"tag-high-performance-graph-processing","14":"tag-real-world-graphs","15":"tag-sequence-similarity-graphs","16":"czr-hentry"},"jetpack_featured_media_url":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-content\/uploads\/sites\/14\/2023\/08\/7.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/2288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/users\/1315"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/comments?post=2288"}],"version-history":[{"count":11,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/2288\/revisions"}],"predecessor-version":[{"id":3100,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/posts\/2288\/revisions\/3100"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/media\/2291"}],"wp:attachment":[{"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/media?parent=2288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/categories?post=2288"},{"taxonomy":"post_tag","embeddable":true,"href":"
https:\/\/blogs.qub.ac.uk\/dipsa\/wp-json\/wp\/v2\/tags?post=2288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}