Abstract: In today's cloud computing environments, Hadoop has become the de facto standard for big data processing. However, cloud environments are large-scale, highly complex, and dynamic, so faults are common and often disrupt jobs running on Hadoop. Although Hadoop has built-in fault detection and recovery mechanisms, the load on individual nodes in a cloud environment changes over time, so scheduled jobs can still fail. This paper proposes an adaptive, fault-aware scheduling method that takes into account the differing load capacities of nodes in a heterogeneous environment, classifies server nodes as fast or slow, assigns scheduled jobs to appropriate nodes, and adjusts task scheduling decisions to prevent task failures. Experiments on the Hadoop framework, comparing the method against the basic scheduler, show that it reduces the job failure rate by up to 19%, shortens job execution time, and reduces CPU and memory usage.
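
The fast/slow node judgment and load-aware assignment described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the scheduler evaluated in the paper: the `Node` fields, the equal-weight `load_score`, the 0.7 threshold, and the function names are all hypothetical, since the abstract does not specify how load is measured or how nodes are ranked.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_load: float  # fraction of CPU in use, 0.0 to 1.0 (assumed metric)
    mem_load: float  # fraction of memory in use, 0.0 to 1.0 (assumed metric)


def load_score(node):
    """Combine CPU and memory load with equal weights (an assumption)."""
    return 0.5 * node.cpu_load + 0.5 * node.mem_load


def classify(nodes, threshold=0.7):
    """Split nodes into 'fast' (score below threshold) and 'slow' (the rest)."""
    fast = [n for n in nodes if load_score(n) < threshold]
    slow = [n for n in nodes if load_score(n) >= threshold]
    return fast, slow


def pick_node(nodes, threshold=0.7):
    """Prefer the least-loaded fast node; fall back to the least-loaded slow node."""
    fast, slow = classify(nodes, threshold)
    candidates = fast if fast else slow
    if not candidates:
        return None  # no nodes available
    return min(candidates, key=load_score)


# Hypothetical cluster snapshot used only to exercise the sketch.
cluster = [
    Node("node-a", cpu_load=0.9, mem_load=0.8),
    Node("node-b", cpu_load=0.2, mem_load=0.3),
    Node("node-c", cpu_load=0.5, mem_load=0.4),
]
chosen = pick_node(cluster)
```

With this snapshot, `node-a` is judged slow and the task goes to `node-b`, the least-loaded fast node. A real scheduler would refresh these load figures continuously and revisit its decision as they change.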