Abstract: In multiple sequence alignment of biological genes, existing parallel algorithms calculate the parameters of only a single Spark cluster, which results in poor parallel performance. To address this, a parallel algorithm for multiple sequence alignment of biological genes based on Spark cloud computing is designed. The acquired biological gene sequence data are first optimized, and dynamic programming is then applied to the multiple sequence alignment task by calculating the matching degree between different sequences. Spark cloud computing technology is used to build Spark clusters and to calculate the parameters of multiple Spark clusters. The optimal matching path is selected according to the similarities and differences among the gene sequences, and a parallel computing model for multiple sequence alignment is established and solved to obtain the corresponding parallel alignment algorithm. Experimental results show that the proposed algorithm achieves better parallelism and effectively improves parallel performance.
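As an illustration of the matching-degree step, the following is a minimal sketch that assumes the matching degree between two sequences can be modelled as a standard global alignment score computed by dynamic programming (Needleman-Wunsch style). The abstract does not give the scoring scheme, so the `match`, `mismatch`, and `gap` values and the function names `alignment_score` and `pairwise_scores` are hypothetical. The sketch runs on a single machine; in a Spark setting the independent sequence pairs could plausibly be distributed across workers, but that part is not shown here.

```python
from itertools import combinations


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b (assumed scoring scheme)."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or insert a gap in b (up) or in a (left).
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def pairwise_scores(seqs):
    """Score every unordered pair of sequences; each pair is independent."""
    return {
        (i, j): alignment_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


if __name__ == "__main__":
    sequences = ["GATTACA", "GATTCA", "GCTTACA"]
    for pair, score in pairwise_scores(sequences).items():
        print(pair, score)
```

Because each pair is scored independently of the others, this pairwise step is the natural unit of work to parallelize; how the paper actually partitions it across clusters is not stated in the abstract.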