Year of publication: 1384 SH (2005/2006)

Published in: 11th Annual Conference of the Computer Society of Iran

Number of pages: 5

Author(s):

Mehdi Aminian – Department of Computer Engineering and IT, Amirkabir University of Technology
Mohammad K. Akbari – Department of Computer Engineering and IT, Amirkabir University of Technology
Bahman Javadi –

Abstract:

MPI applications running on clusters and Grid deployments suffer from node and network failures, which motivates the use of fault-tolerant MPI implementations. Two categories of techniques have been introduced to make these systems fault-tolerant: checkpoint-based techniques and log-based recovery protocols. Sender-based pessimistic logging, which falls into the second category, is the best choice from the recovery viewpoint but not for fault-free execution, because synchronous logging imposes a time overhead. Therefore, the relaxing logging atomicity (RLA) method was studied and modified to decrease this overhead. Our method was tested on MPICH-V2, a free platform implementing pessimistic logging with uncoordinated checkpointing. Experimental results showed a considerable decrease in run time for MPI applications with more communication.
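
To make the source of the synchronous-logging overhead concrete, the following is a minimal, single-threaded sketch of sender-based pessimistic logging. It is an illustration under stated assumptions, not the MPICH-V2 implementation and not the modified RLA method of the paper: the names `EventLogger`, `Process`, `payload_log`, `unacked`, and `synchronous_waits` are all hypothetical, and message delivery is simulated by a direct method call. The sender keeps each payload in its own volatile log, while the receiver must get its reception determinants onto stable storage before it is allowed to send anything itself.

```python
from dataclasses import dataclass, field


@dataclass
class EventLogger:
    """Hypothetical stand-in for stable storage of reception determinants."""
    determinants: list = field(default_factory=list)

    def log(self, determinant):
        # In a real system this would be a network round trip; here it
        # simply records the determinant and acknowledges immediately.
        self.determinants.append(determinant)
        return True


@dataclass
class Process:
    """Hypothetical process model; not an MPI or MPICH-V2 API."""
    rank: int
    logger: EventLogger
    send_seq: int = 0
    recv_seq: int = 0
    payload_log: list = field(default_factory=list)  # sender-side volatile log
    unacked: list = field(default_factory=list)      # determinants not yet stable
    synchronous_waits: int = 0                       # counts blocking log calls

    def send(self, dest, payload):
        # Pessimistic rule: every pending determinant must be stable
        # before this process may influence another one.
        while self.unacked:
            self.logger.log(self.unacked.pop(0))
            self.synchronous_waits += 1
        self.send_seq += 1
        # Sender-based logging: the payload stays in the sender's memory
        # so it can be replayed if the receiver fails.
        self.payload_log.append((dest.rank, self.send_seq, payload))
        dest.receive(self.rank, self.send_seq, payload)

    def receive(self, src_rank, send_seq, payload):
        self.recv_seq += 1
        # Determinant: (sender, sender sequence, receiver, delivery order).
        self.unacked.append((src_rank, send_seq, self.rank, self.recv_seq))


def demo():
    """Two receptions on rank 1 followed by one send from rank 1."""
    logger = EventLogger()
    p0, p1 = Process(0, logger), Process(1, logger)
    p0.send(p1, "a")
    p0.send(p1, "b")
    p1.send(p0, "ack")  # blocks until both determinants are logged
    return logger, p0, p1
```

In this sketch, `synchronous_waits` grows with the number of receptions that precede a send, which is the cost that is paid even when no failure occurs. Relaxing the atomicity of this logging step is the direction the abstract describes; how the paper does so is not shown here.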