There is a [GraphBLAS implementation that supports MPI](https://people.eecs.berkeley.edu/~aydin/CombBLAS/html/index.html). We can use it to implement distributed CFPQ.

- [ ] Assign a pair of matrices to each node for independent pairs, if any exist.
- [ ] Distributed multiplication of two matrices.
- [ ] Evaluate the implementation on the [data set](https://github.com/SokolovYaroslav/CFPQ-on-GPGPU/tree/master/data/graphs) and compare it with other implementations.
- [ ] Evaluate scalability on an MPI cluster.
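
To make the "distributed multiplication of two matrices" item more concrete, here is a minimal sketch of one possible partitioning scheme: the left matrix is split into contiguous row blocks, each block is multiplied by the full right matrix, and the partial results are concatenated. This is only an illustration under stated assumptions. It uses plain Python lists and simulates the nodes with a loop instead of MPI, it does not use the CombBLAS API, and the names `split_rows`, `local_multiply`, and `distributed_multiply` are hypothetical.

```python
from typing import List

Matrix = List[List[bool]]


def split_rows(n: int, parts: int) -> List[range]:
    """Split row indices 0..n-1 into `parts` contiguous blocks, one per simulated node."""
    base, extra = divmod(n, parts)
    blocks, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def local_multiply(a_rows: Matrix, b: Matrix) -> Matrix:
    """Boolean product of a block of rows of A with the full matrix B.

    This is the work a single node would do in the assumed scheme.
    """
    n_cols = len(b[0]) if b else 0
    out = []
    for row in a_rows:
        out_row = [False] * n_cols
        for k, a_ik in enumerate(row):
            if a_ik:
                # OR row k of B into the output row.
                for j, b_kj in enumerate(b[k]):
                    if b_kj:
                        out_row[j] = True
        out.append(out_row)
    return out


def distributed_multiply(a: Matrix, b: Matrix, nodes: int) -> Matrix:
    """Simulate a row-block distributed boolean product with `nodes` workers.

    In a real MPI program each block would be sent to a different rank
    and the partial results gathered; here the ranks are a sequential loop.
    """
    result: Matrix = []
    for block in split_rows(len(a), nodes):
        result.extend(local_multiply([a[i] for i in block], b))
    return result
```

The result does not depend on the number of simulated nodes, so `distributed_multiply(a, b, 1)` can serve as a reference when checking a real distributed version.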