PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs

Latest News

[14/04/2014] +Async: a new hybrid asynchronous engine on PowerLyra.
[27/02/2014] BiGraph: a set of new distributed graph partition algorithms designed for bipartite applications
[19/11/2013] PowerLyra: a new graph analytics engine that differentiates graph computation and partitioning.

Related Projects

Hybrid Paritioning and Computation

Natural graphs with skewed distribution raise unique challenges to graph computation and partitioning. Existing graph analytics frameworks usually use a "ONE size fits ALL" design that uniformly processes all vertices, which either suffer from notable load imbalance and high contention for high-degree vertices (e.g., Pregel and GraphLab), or incur high communication cost among vertices even for low-degree vertices (e.g., PowerGraph and GraphX).

We argue that skewed distribution in natural graphs also calls for differentiated processing of high-degree and low-degree vertices. We then introduce PowerLyra, a new graph analytics engine that embraces the best of both worlds of existing frameworks, by dynamically applying different computation and partitioning strategies for different vertices. PowerLyra uses Pregel/GraphLab-like computation models for process low-degree vertices to minimize computation, communication and synchronization overhead, and uses PowerGraph-like computation model for process high-degree vertices to reduce load imbalance, contention and memory pressure. PowerLyra follows the interface of GAS (Gather, Apply and Scatter) model and can seamlessly support various graph algorithms (e.g., all GraphLab toolkits).

PowerLyra additionally proposes a new hybrid graph cut algorithm that embraces the best of both worlds in edge-cut and vertex-cut, which evenly distributes low-degree vertices along with their edges like edge-cut, and evenly distributes edges of high-degree vertices like vertex-cut. Both theoretical analysis and empirical validation show that the expected replication factor of random hybrid-cut is alway better than both random (Hash-based), contrained (e.g., Grid), and heuristic (e.g., Oblivious or Coordinated) vertex-cut for skewed power-law graphs. We also develop a new distributed greedy heuristic hybrid-cut algorithm, namely Ginger, inspired by Fennel. Note that hybrid graph partitioning algorithms also work with existing synchronous and asynchronous enignes of GraphLab.

Finally, PowerLyra adopts locality-conscious data layout optimization in graph ingress phase to mitigate poor locality during communication. For graph computation that processes a graph multiple iterations and even multiple times in memory, we argue that a small increase of graph ingress time is more worthwhile for an often larger speedup in execution time.

PowerLyra is implemented as separate execution engines and partitioning algorithms of GraphLab, and can seamlessly support all GraphLab toolkits. A detailed evaluation on a VM-based 48-node EC2-like cluster and a physical 6-node in-house cluster using three graph analytics and two MLDM algorithms show that PowerLyra outperforms PowerGraph (Mar. 26, 2014. commit:6f787c7e35) by up to 5.53X (from 1.24X) and 3.26X (from 1.49X) for real-world and synthetic graphs accordingly, and is much faster than other systems (e.g., Giraph, GPS, GraphX and CombBLAS) in both graph ingress and computation time, yet with less memory consumption.

Bipartite-aware Paritioning

Many machine learning and data mining (MLDM) problems like recommendation, topic modeling and medical diagnosis can be modeled as computing on bipartite graphs, whose vertices can be separated as two disjoint sets U and V and every edge connects a vertex in U and V. Due to the wide existence and importance of bipartite graphs, there have been a number of popular MLDM algorithms that operate on such graphs, including Singular Value Decomposition (SVD), Alternating Least Squares (ALS), Stochastic Gradient Descent (SGD), Belief Propagation (BP) and latent Dirichlet allocation (LDA).

However, most existing systems simply apply general graph partitioning algorithms that are oblivious to the unique features of bipartite graphs. This results in suboptimal graph partition with significant replicas of vertices and/or edges, leading to not only redundant computation, but also excessive network communication to synchronize graph states.

we make a systematic study on the characteristics of real-world bipartite graphs and the related MLDM algorithms, and describe why existing online distributed graph partitioning algorithms fail to produce an optimal graph partition for bipartite graphs. First, real-world bipartite graphs for MLDM are usually imbalanced. This means that the size of two subsets in bipartite graph is significantly skewed, even in the scale of several orders of magnitude. Second, the computation load of many MLDM algorithms for bipartite graphs may also be skewed among vertices from the two subsets. Finally, the size of data associated with vertices from the two subsets can be significantly skewed.

Based on the study, we argue that the unique properties of bipartite graphs and special requirement of bipartite algorithms demand differentiated partitioning of the two disjoint sets of vertices. We introduce BiGraph, a set of distributed graph partition algorithms designed for bipartite applications. The key of BiGraph is to partition graphs in a differentiated way and loading data according to the data affinity.

BiGraph is implemented as separate graph partition modules of GraphLab, and can seamlessly work with existing synchronous and asynchronous engines. Our experimental results on a VM-based 48-node EC2-like cluster and a physical 6-node in-house cluster using three typical MLDM algorithms (SVD, ALS and SGD) show that BiGraph reduces up to 81% vertex replication, and saves up to 96% network traffic. This transforms to a speedup up to 17.75X (from 1.16X) compared with the default partitioning algorithms (e.g., Oblivious and Grid) using the same default synchronous engine.