Hybrid Paritioning and Computation
Natural graphs with skewed distribution raise unique challenges to graph computation and partitioning.
Existing graph analytics frameworks usually use a "ONE size fits ALL"
design that uniformly processes all vertices, which either suffer from notable load imbalance and high contention for high-degree vertices
(e.g., Pregel and
or incur high communication cost among vertices even for low-degree vertices
We argue that skewed distribution in natural graphs also calls for differentiated processing of high-degree and low-degree vertices.
We then introduce PowerLyra,
a new graph analytics engine that embraces the best of both worlds of existing frameworks,
by dynamically applying different computation and partition strategies for different vertices.
PowerLyra uses Pregel/GraphLab like computation models for process low-degree vertices to minimize computation, communication and synchronization overhead,
and uses PowerGraph-like computation model for process high-degree vertices to reduce load imbalance and contention.
To seamless support all PowerLyra application, PowerLyra further introduces an adaptive unidirectional graph communication.
PowerLyra additionally proposes a new hybrid graph cut algorithm that
embraces the best of both worlds in edge-cut and vertex-cut, which adopts edge-cut for low-degree vertices and vertex-cut for high-degree vertices.
Both theoretical analysis and empirical validation show that the expected replication factor of random hybrid-cut is alway better than both random (Hash-based)
and heuristic (e.g., Oblivious or Grid) vertex-cut for skewed power-law graphs.
We also develop a new distributed greedy heuristic hybrid-cut algorithm, namely Ginger,
inspired by Fennel.
Compared to Gird vertex-cut, Ginger can reduce the replication factor by up to 3.11X (from 1.26X) and 2.92X (from 2.03X) for real-world and synthetic graphs accordingly.
Note that hybrid graph partitions also work with existing synchronous and asynchronous enignes of GraphLab.
Finally, PowerLyra adopts locality-conscious data layout optimization in graph ingress phase to mitigate poor locality during vertex communication.
we argue that a small increase of graph ingress time (less than 5% and 10% for real-world and power-law graphs respectively) is more worthwhile for an often larger speedup
in execution time (usually more than 10% speedup, specially 21% for Twitter follow graph).
PowerLyra is implemented as an execution engine and graph partitions of GraphLab, and can seamlessly support all GraphLab applications.
A detailed evaluation on a 48-node cluster using three different graph algorithms (PageRank, Approximate Diameter and Connected Components) show that
PowerLyra outperforms current synchronous engine with Grid partition of PowerGraph (Jul. 8, 2013. commit:fc3d6c6) by up to 5.53X (from 1.97X) and 3.26X (from 1.49X)
for real-world (Twitter, UK-2005,
and synthetic (10-million vertex power-law graph ranging from 1.8 to 2.2) graphs accordingly,
due to significantly reduced replication factor, less communication cost and improved load balance.
Many machine learning and data mining (MLDM) problems like recommendation, topic modeling and medical diagnosis can be modeled as computing on
bipartite graphs, whose vertices can be separated as two disjoint sets U and V and
every edge connects a vertex in U and V. Due to the wide existence and importance of bipartite graphs, there have been a number of popular MLDM algorithms
that operate on such graphs, including Singular Value Decomposition (SVD), Alternating Least Squares (ALS), Stochastic Gradient Descent (SGD),
Belief Propagation (BP) and latent Dirichlet allocation (LDA).
However, most existing systems simply apply general graph partitioning algorithms that are oblivious to the
unique features of bipartite graphs.
This results in suboptimal graph partition with significant replicas of vertices and/or edges, leading to not only redundant computation,
but also excessive network communication to synchronize graph states.
we make a systematic study on the characteristics of real-world bipartite graphs and the related MLDM algorithms,
and describe why existing online distributed graph partitioning algorithms fail to produce an optimal graph partition for bipartite graphs.
First, real-world bipartite graphs for MLDM are usually imbalanced.
This means that the size of two subsets in bipartite graph is significantly skewed, even in the scale of several orders of magnitude.
Second, the computation load of many MLDM algorithms for bipartite graphs may also be skewed among vertices from the two subsets.
Finally, the size of data associated with vertices from the two subsets can be significantly skewed.
Based on the study, we argue that the unique properties of bipartite graphs and special requirement of bipartite algorithms demand differentiated partitioning
of the two disjoint sets of vertices. We introduce BiGraph,
a set of distributed graph partition algorithms designed for bipartite applications.
The key of BiGraph is to partition graphs in a differentiated way and loading data according to the
BiGraph is implemented as separate graph partition modules of GraphLab, and can seamlessly work with existing synchronous
and asynchronous engines. Our experiment using three MLDM algorithms (SVD, ALS and SGD) on an in-house 6-machine multicore cluster with 144 CPU cores
shows that BiGraph reduces up to 62% vertex replication, and saves up to 96% network traffic. This transforms to a speedup up to 17.75X (from 1.38X)
compared to default partitioning algorithms (e.g., Oblivious and Grid) using the same default synchronous engine.
PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs (slides) updated version
Rong Chen, Jiaxin Shi, Yanzhe Chen, Haibing Guan, Binyu Zang and Haibo Chen.
Institute of Parallel and Distributed Systems Technical Report, Number: IPADSTR-2013-001.
Shanghai Jiao Tong University, November 2013.
BiGraph: Bipartite-aware Distributed Graph Partition for Big Learning
Rong Chen, Jiaxin Shi, Haibo Chen and Haibing Guan.
Institute of Parallel and Distributed Systems Technical Report, Number: IPADSTR-2013-002.
Shanghai Jiao Tong University, December 2013.
If you use PowerLyra in your work or research, please let us know about it.
We also encourage you to reference our papers.
The PowerLyra source code is distributed with a Apache license 2.0. The copyright is held by Shanghai Jiao Tong University.
PowerLyra is provided "as is" without any warranties and conditions of any kind.
The latest release is based on GraphLab 2.2 (Nov. 17, 2013. commit:42bfa8e0b252a0f36), but without locality-conscious graph layout optimization.
You can check out the newest code from IPADS's gitlab server:
git clone http://ipads.se.sjtu.edu.cn:1312/opensource/powerlyra.git
or from the github mirror.
If you are interested in estimating or working with powerlyra hybrid engine and graph partitions in paper,
you can also obtain a snapshot (~80MB)
from Sep. 25, 2013, which is based on GraphLab 2.2 (Jul. 8, 2013. commit:fc3d6c6).
NOTE: the name of engines and graph partitions are different in latest release and snapshot code (sorry for confusion).
Once you've downloaded PowerLyra, you can find instructions for installing and building it on the GraphLab's documentation page.
You just need to set the hybrid engine and graph partition in command line
Usage#1: Running PageRank algorithm on Twitter Follower graph using PowerLyra engine with Ginger graph partition.
$mpiexec -f ~/mpd.hosts -n 48 ./pagerank --format=snap --graph=/path/to/twitter_
--engine plsync --engine_opts max_iterations=20
--engine engine : set execution engine (plsync: powerlyra synchronous engine)
--graph_opts : set graph options
ingress : set ingress algorithms (hybrid: random | hybrid_ginger: heuristic)
threshold : set threshold to differentiate low-degree and high-degree vertices, default is 100
interval : set interval of coordination, default is never (only for hybrid_ginger)
nedges : set total number of edges for input graph (only for hybrid_ginger)
nverts : set total number of vertices for input graph (only for hybrid_ginger)
Usage#2: Running Approximate Diameter algorithm on synthetic power-law graph using PowerLyra engine with Hybrid graph partition.
$mpiexec -f ~/mpd.hosts -n 48 ./approximate_diameter --format=snap --graph=/path/to/powerlaw-2.2-10m_
--graph_opts ingress=hybrid,threshold=100 --engine plsync
Usage#3: Running LBP algorithm on synthetic matrix graph using PowerLyra engine with Ginger graph partition.
$mpiexec -f ~/mpd.hosts -n 48 ./lbp_structured_prediction --graph /path/to/synth_edata2500w_ --engine plsync
--prior /path/to/synth_vdata2500w_ --graph_opts ingress=hybrid_ginger,interval=1000,nedges=49990000,nverts=25000000
Usage#4: Running ALS algorithm on Small Netflix graph using GraphLab engine with Bipartite graph partition.
$mpiexec -f ~/mpd.hosts -n 6 ./als --matrix /path/to/smallnetflix/smallnetflix_mm --max_iter=5 --ncpus=24
--predictions=out_predictions --minval=1 --maxval=5 --D=100 --engine sync
--graph_opts : set graph options
ingress : set ingress algorithms (bipartite: random | bipartite_aweto: heuristic)
favorite : set favorite subset of bipartite graph ("source" or "target"), default is source
affinity : set data affinity or not for bipartite partition, default is 0 (only for bipartite)
Usage#5: Running SVD algorithm on AS-Skitter graph using GraphLab engine with Aweto graph partition.
$mpiexec -f ~/mpd.hosts -n 6 ./svd --matrix /path/to/as --max_iter=3 --ncpus=24 --rows=1696416 --cols=1696417
--nsv=2 --nv=7 --tol=1e-2 --binary=true --save_vectors=0 --input_file_offset=0 --ortho_repeats=1 --engine sync
Please contact Rong Chen for further information.
Welcome to let us know about bug reports and suggested improvements, while we may not be able to provide technical support timely.
If you make some correction or improvement to PowerLyra, we would appreciate receiving a patch that we can include in the next release.
PowerSwitch: adaptive prediction and mode switch (synchronous and asynchronous) on parallel graph computation.
This page has been visited
times since 11/9/2013.
Stats provided by the Easy Counter