Compiler-Controlled Extraction of Computation-Communication Overlap in MPI Applications

Das, Dibyendu; Gupta, Manish; Ravindran, Rajan; Shivani, W; Sivakeshava, P; Uppal, Rishabh

Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/1549

Full metadata record

DC Field	Value	Language
dc.contributor.author	Das, Dibyendu	-
dc.contributor.author	Gupta, Manish	-
dc.contributor.author	Ravindran, Rajan	-
dc.contributor.author	Shivani, W	-
dc.contributor.author	Sivakeshava, P	-
dc.contributor.author	Uppal, Rishabh	-
dc.date.accessioned	2024-11-19T11:32:58Z	-
dc.date.available	2024-11-19T11:32:58Z	-
dc.date.issued	2008	-
dc.identifier.citation	10.1109/IPDPS.2008.4536193	en_US
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/1549	-
dc.description	NITW	en_US
dc.description.abstract	Exploiting computation-communication overlap is a well-known requirement for speeding up distributed applications. However, efforts so far have relied on programmer expertise rather than on an automatic tool. In our work we propose the use of an aggressive optimizing compiler (IBM's xl series) to automatically extract opportunities for computation-communication overlap. We depend on aggressive inlining, dominator trees and SSA-based use-def analyses provided by the compiler framework for exploiting such overlap. Our target is MPI applications. In such applications, we try to automatically move mpi_waits as well as split blocking mpi_send/recv to create more opportunities for overlap. Our objective is two-fold: first, our tool should relieve the programmer, as much as possible, of the burden of hunting for overlap manually, and second, it should help identify quickly the parallel applications that benefit from such overlap. These are necessary as MPI applications are rapidly becoming large and complex, and manual overlap extraction is becoming cumbersome. Our early experience shows that exploiting an overlap does not always lead to a performance improvement. This reinforces the case for an automatic tool: with one, we can quickly discard such applications (or certain configurations of such applications) without spending person-hours manually rewriting MPI applications to introduce non-blocking calls. Our initial experiments with the industry-standard NAS parallel benchmarks show that we can get small-to-moderate improvements by utilizing overlap even in such highly tuned benchmarks. This augurs well for real-world applications that do not exploit overlap optimally.	en_US
dc.language.iso	en	en_US
dc.publisher	IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM	en_US
dc.subject	Computation-Communication Overlap	en_US
dc.subject	Compiler Optimization	en_US
dc.title	Compiler-Controlled Extraction of Computation-Communication Overlap in MPI Applications	en_US
dc.type	Other	en_US
Appears in Collections:	Computer Science & Engineering

Files in This Item:

File	Description	Size	Format
Compiler-controlled_extraction_of_computation-communication_overlap_in_MPI_applications.pdf		530.9 kB	Adobe PDF
