Algorithm-based fault-tolerant parallel sorting
by Edson T. Camargo; Elias P. Duarte Junior
International Journal of Critical Computer-Based Systems (IJCCBS), Vol. 11, No. 1/2, 2024

Abstract: High performance computing (HPC) systems often require substantial resources, and can take up to several hours or days to execute. Upon a failure, it is important to loose as little computation as possible. In this work we present an algorithm-based fault tolerance (ABFT) strategy for hypercube-based parallel algorithms. The strategy assumes the virtual VCube topology, which has several logarithmic properties that are preserved even as nodes fail. The strategy guarantees that the algorithm does not halt even after up to (N - 1) nodes crash, in a system of N nodes. We use parallel sorting as a case study, describing how to make a fault-tolerant version of three parallel sorting algorithms: HyperQuickSort, QuickMerge and Bitonic Sort. The algorithms were implemented in MPI using ULMF to handle faults. Experimental results are presented showing the performance and robustness of the solution for sorting up to a billion integers in scenarios with faults.

Online publication date: Thu, 13-Jun-2024

The full text of this article is only available to individual subscribers or to users at subscribing institutions.

 
Existing subscribers:
Go to Inderscience Online Journals to access the Full Text of this article.

Pay per view:
If you are not a subscriber and you just want to read the full contents of this article, buy online access here.

Complimentary Subscribers, Editors or Members of the Editorial Board of the International Journal of Critical Computer-Based Systems (IJCCBS):
Login with your Inderscience username and password:

    Username:        Password:         

Forgotten your password?


Want to subscribe?
A subscription gives you complete access to all articles in the current issue, as well as to all articles in the previous three years (where applicable). See our Orders page to subscribe.

If you still need assistance, please email subs@inderscience.com