Abstract: Medical image retrieval is the basis for the effective use of medical resources, and the massive, continually growing volume of medical images brings new challenges and requirements for image retrieval. To improve the efficiency of the medical image retrieval process, a Flink-based medical image retrieval system is designed and implemented. Firstly, the system uses a web application as the user's entry point and builds a data platform and a business cluster at the back end. Secondly, the system stores the medical image data in HBase in a distributed manner and extracts medical image features using a deep convolutional neural network model. Thirdly, the extracted medical image feature data is multiplied and encoded, and then stored in HBase. Finally, retrieval requests are passed through Kafka and consumed by Flink, which uses in-memory computing to perform real-time image retrieval as well as feature index encoding for batch-imported images. The system was deployed as a distributed cluster on four server nodes and tested with a real medical image data set. Comparison experiments against MapReduce and Spark show that the system achieves better retrieval efficiency.
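The encode-then-retrieve idea summarized above can be illustrated with a minimal single-process sketch. Everything in it is an assumption made for illustration: the abstract does not specify the encoding scheme, so mean-threshold binarization and Hamming-distance ranking are swapped in here, and the names `encode`, `hamming`, and `retrieve` are hypothetical. A plain dictionary stands in for the HBase feature store, and no Kafka or Flink component is modeled.

```python
from typing import Dict, List, Sequence, Tuple


def encode(feature: Sequence[float]) -> int:
    """Binarize a feature vector into an integer code.

    Assumption: bit i is 1 when feature[i] exceeds the vector mean.
    This is a stand-in, not the encoding used by the system.
    """
    mean = sum(feature) / len(feature)
    code = 0
    for value in feature:
        code = (code << 1) | (1 if value > mean else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def retrieve(
    query: Sequence[float], index: Dict[str, int], k: int = 3
) -> List[Tuple[str, int]]:
    """Return the k image ids whose codes are closest to the query code."""
    query_code = encode(query)
    # Sort by distance first, then by image id to break ties deterministically.
    ranked = sorted(
        (hamming(query_code, code), image_id) for image_id, code in index.items()
    )
    return [(image_id, dist) for dist, image_id in ranked[:k]]


# Hypothetical in-memory index standing in for encoded features kept in a store.
index = {
    "img_a": encode([0.9, 0.1, 0.8, 0.2]),
    "img_b": encode([0.1, 0.9, 0.2, 0.8]),
    "img_c": encode([0.9, 0.2, 0.1, 0.8]),
}
results = retrieve([0.8, 0.2, 0.9, 0.1], index, k=2)
```

Compact binary codes are one common reason to encode features before storing them: comparing two codes reduces to an XOR and a bit count, which is cheap enough to run over many candidates in memory.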