Abstract: With the massive growth of data, research on clustering algorithms faces the challenge of mining and processing very large data sets. The K-means clustering algorithm depends too strongly on the initial cluster centers and has poor global search ability. To address these shortcomings, an improved artificial bee colony (ABC) algorithm is combined with K-means, and the ABC_Kmeans clustering algorithm is proposed to improve clustering performance. To improve the algorithm's ability to handle massive data, ABC_Kmeans is parallelized with the MapReduce model, and Map, Combine, and Reduce functions are designed for it. Experiments on several massive data sets show that the parallel ABC_Kmeans algorithm achieves good speedup and scalability, making it applicable to today's massive data mining and processing tasks.
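
As an illustration of the Map, Combine, and Reduce roles mentioned above, the following is a minimal single-machine sketch of one K-means center update in that style. It is not the design proposed here: the abstract does not specify the functions, the ABC search step is omitted, and all names (`map_point`, `combine`, `reduce_center`, `kmeans_iteration`) are hypothetical. It assumes the commonly used decomposition in which Map assigns each point to its nearest center, Combine pre-aggregates sums within one input split, and Reduce computes the new centers.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_point(point, centers):
    """Map: emit (index of the nearest center, (point, 1))."""
    nearest = min(range(len(centers)), key=lambda i: dist(point, centers[i]))
    return nearest, (point, 1)


def combine(pairs):
    """Combine: within one split, sum vectors and counts per center index."""
    partial = {}
    for idx, (vec, count) in pairs:
        if idx in partial:
            acc, n = partial[idx]
            partial[idx] = ([a + b for a, b in zip(acc, vec)], n + count)
        else:
            partial[idx] = (list(vec), count)
    return list(partial.items())


def reduce_center(partials):
    """Reduce: merge the partial sums for one center and return their mean."""
    total, n = None, 0
    for vec, count in partials:
        total = list(vec) if total is None else [a + b for a, b in zip(total, vec)]
        n += count
    return tuple(s / n for s in total)


def kmeans_iteration(splits, centers):
    """Run one Map/Combine/Reduce pass over the input splits (assumed layout)."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for split in splits:
        mapped = [map_point(p, centers) for p in split]
        for idx, value in combine(mapped):
            grouped[idx].append(value)
    new_centers = [tuple(c) for c in centers]  # empty clusters keep their center
    for idx, values in grouped.items():
        new_centers[idx] = reduce_center(values)
    return new_centers
```

In this sketch the Combine step sends only one (sum, count) pair per center from each split to the Reduce step, which is the usual reason for adding a combiner: it reduces the data moved during the shuffle.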