An improved data mining algorithm is proposed. First, the ICTCLAS system is used to preprocess the text, and a term vector is constructed from term-frequency features. Next, the term-frequency features are fused with term frequency-inverse document frequency (TF-IDF) features to construct the feature matrix of the training sample set. Singular value decomposition is then applied to this matrix to obtain a semantic space, and each text feature vector is transformed into that space to yield a semantic vector. Finally, a combined support vector machine classifier is constructed to automatically classify the Chinese bibliographic records corresponding to the semantic vectors. Extensive simulation experiments show that the classification accuracy of this method is higher than that of existing methods.
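
The feature-construction step can be illustrated with a minimal sketch. The abstract does not say how the term-frequency and TF-IDF features are fused, so the code below assumes simple concatenation of the two blocks. The function names, the unsmoothed IDF formula, and the sample tokens (English stand-ins for the words a segmenter such as ICTCLAS would produce) are hypothetical and chosen only for illustration. The later singular value decomposition and classification steps are not shown.

```python
import math
from collections import Counter


def term_frequencies(docs):
    """Return the sorted vocabulary and one relative term-frequency row per document."""
    vocab = sorted({term for doc in docs for term in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        rows.append([counts[term] / total for term in vocab])
    return vocab, rows


def inverse_document_frequencies(docs, vocab):
    """Unsmoothed IDF: log(N / number of documents containing the term)."""
    n_docs = len(docs)
    doc_sets = [set(doc) for doc in docs]
    return [
        math.log(n_docs / sum(1 for terms in doc_sets if term in terms))
        for term in vocab
    ]


def fused_feature_matrix(docs):
    """Build one row per document: [TF block | TF-IDF block].

    Assumption: fusion is plain concatenation; the source does not
    specify the actual fusion rule.
    """
    vocab, tf_rows = term_frequencies(docs)
    idf = inverse_document_frequencies(docs, vocab)
    matrix = [
        row + [tf * weight for tf, weight in zip(row, idf)]
        for row in tf_rows
    ]
    return vocab, matrix


# Hypothetical pre-segmented bibliography titles (stand-ins for segmenter output).
docs = [
    ["data", "mining", "algorithm"],
    ["support", "vector", "machine", "algorithm"],
    ["data", "classification"],
]
vocab, matrix = fused_feature_matrix(docs)
```

Under this assumption each row has twice as many columns as there are vocabulary terms. A matrix of this shape is what the decomposition step would take as input, with the leading singular vectors defining the semantic space.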