Path: utzoo!mnetor!uunet!husc6!bbn!gatech!hubcap!ut-sally!kumar From: ut-sally!kumar@uunet.UU.NET (Vipin Kumar) Newsgroups: comp.parallel Subject: Re: Breakthrough at Sandia? + Iso-efficiency function Message-ID: <1373@hubcap.UUCP> Date: 12 Apr 88 12:33:07 GMT Sender: fpst@hubcap.UUCP Lines: 64 Summary: Iso-efficiency functions & Parallel Efficient Algorithms Approved: parallel@hubcap.clemson.edu In article <1317@hubcap.UUCP>, kruskal@mimsy.UUCP (Clyde P. Kruskal) writes: > > > We define an algorithm to be in the class PE (parallel efficient) > if the parallel time for a problem of size N on P processors is > T(P,N) = O(t(N)/P + P^k) , > where t(N) is the time of the ("best") sequential algorithm. > I.e., PE captures the problems with parallel algorithms > that achieve optimal speedup over the optimal sequential time > for all problems whose execution times are relatively (polynomially) > large with respect to the number of processors. Notice that > it is sequential time t(N) we compare to rather problem size N; this > is relevant for problems that have super polynomial executions times. > (Actually, for other reasons, we define PE somewhat differently > but this definition is equivalent.) We argue that this is the concept > of parallel performance that users of parallel machine desire. > > We prove that the class PE is invariant under the shared > memory models (PRAMs), independent of whether the machines > are synchronous or asynchronous, SIMD or MIMD, or have small > or large latency for the processors to communicate. > We also show that PE is invariant under the completely > connected memory module machines for probabilistic algorithms. > According to the above definition of PE, any parallel algorithm with polynomial iso-efficiency function is a parallel efficient algorithm. This definition of PE appears to be too weak. The ideal parallel formulation should have a linear iso-efficiency function (O(P), where P is the number of processors). In references [1] and [2], we have 3 different parallel formulations of depth-first search on the ring architecture with iso-efficiency functions of O(k^P), O(P^3 log P), O(P^2 log P), respectively. Clearly the first formulation is not in PE (its iso-efficiency function is exponential). The second and third formulations are in PE, but the third formulation is far superior to the second, as it has a better iso-efficiency function. (See [1] and [2]). The choice of architecture is also very important. For a shared memory machine (with message combining), we have a parallel depth-first search formulation whose iso-efficiency function is O(P log P), which is close to linear (and is significantly better than O(P^2 log P)). I think that the users of parallel machines should look for parallel formulations whose iso-efficiency functions are linear (or almost linear). Vipin ----- References: [1] Parallel Depth First Search on the Ring Architecture by Vipin Kumar, V. Nageshwara Rao and K. Ramesh. To appear in the proc. of 1988 Parallel Processing Conference. [2] Parallel Depth-First on Multiprocessors, Part II: Analysis by Vipin Kumar and V. Nageshwar Rao. To appear in International Journal of Parallel programming.