Theses Doctoral

# New Primitives for Tackling Graph Problems and Their Applications in Parallel Computing

Zhong, Peilin

We study fundamental graph problems under parallel computing models. In particular, we consider two parallel computing models: Parallel Random Access Machine (PRAM) and Massively Parallel Computation (MPC). The PRAM model is a classic model of parallel computation. The efficiency of a PRAM algorithm is measured by its parallel time and the number of processors needed to achieve the parallel time. The MPC model is an abstraction of modern massive parallel computing systems such as MapReduce, Hadoop and Spark. The MPC model captures well coarse-grained computation on large data --- data is distributed to processors, each of which has a sublinear (in the input data) amount of local memory and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. We usually desire fully scalable MPC algorithms, i.e., algorithms that can work for any local memory size. The efficiency of a fully scalable MPC algorithm is measured by its parallel time and the total space usage (the local memory size times the number of machines).

Consider an 𝑛-vertex 𝑚-edge undirected graph 𝐺 (either weighted or unweighted) with diameter 𝐷 (the largest diameter of its connected components). Let 𝑁=𝑚+𝑛 denote the size of 𝐺. We present a series of efficient (randomized) parallel graph algorithms with theoretical guarantees. Several results are listed as follows:

1) Fully scalable MPC algorithms for graph connectivity and spanning forest using 𝑂(𝑁) total space and 𝑂(log 𝐷loglog_{𝑁/𝑛} 𝑛) parallel time.

2) Fully scalable MPC algorithms for 2-edge and 2-vertex connectivity using 𝑂(𝑁) total space where 2-edge connectivity algorithm needs 𝑂(log 𝐷loglog_{𝑁/𝑛} 𝑛) parallel time, and 2-vertex connectivity algorithm needs 𝑂(log 𝐷⸱log²log_{𝑁/𝑛} n+\log D'⸱loglog_{𝑁/𝑛} 𝑛) parallel time. Here 𝐷' denotes the bi-diameter of 𝐺.

3) PRAM algorithms for graph connectivity and spanning forest using 𝑂(𝑁) processors and 𝑂(log 𝐷loglog_{𝑁/𝑛} 𝑛) parallel time.

4) PRAM algorithms for (1 + 𝜖)-approximate shortest path and (1 + 𝜖)-approximate uncapacitated minimum cost flow using 𝑂(𝑁) processors and poly(log 𝑛) parallel time.

These algorithms are built on a series of new graph algorithmic primitives which may be of independent interests.

## Files

• Zhong_columbia_0054D_16549.pdf application/pdf 2.3 MB Download File