Abstract: Due to the increasing risk of privacy leakage, data collectors are reluctant to share their private data, which results in "data silos". Federated learning enables data sharing without the data leaving its local site, but some problems remain. On the one hand, centralized processing on a central server incurs a high time cost and creates a single point of failure. On the other hand, in multi-institutional data sharing, malicious nodes mixed in among the participating nodes may affect model training and lead to data privacy leakage. Therefore, this paper proposes a new distributed federated learning architecture that combines blockchain and federated learning for efficient node selection and communication. The architecture enables direct communication between participating nodes instead of relying on a central server. Based on the proposed architecture, a reputation-based node selection algorithm (RBLNS) is proposed to screen the participating nodes and ensure their privacy and security. Experimental results demonstrate that RBLNS improves model performance.