Abstract: The big data era has revolutionized data processing, and handling big data with the Hadoop distributed framework has become one of the most popular research topics. Cluster-based cloud computing compensates for the heavy computation and long running time of traditional non-distributed algorithms, while huge amounts of unstructured data make the data harder to use. To address large-scale classification in data mining, this paper proposes an algorithm called Bi-Measurement Central Index KNN Classification, which mainly targets data whose classes cross or overlap. First, the algorithm finds the centers of the training data, then calculates the Euclidean distance between the data to be classified and these centers to determine the three most similar categories. Next, it selects the k nearest neighbor points using the cosine distance metric and computes the result with MapReduce. Finally, the algorithm is compared and verified on the UCI database. The results show that although the proposed algorithm improves accuracy only modestly, it greatly improves efficiency.
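The two-measure procedure described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the MapReduce distribution is omitted, and the function names (`class_centers`, `classify`), the parameter `n_candidates`, the use of per-class mean vectors as centers, the restriction of the cosine search to the candidate categories, and majority voting are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def class_centers(samples, labels):
    """Return {label: mean vector}; using the mean as the center is an assumption."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {y: [sum(col) / len(rows) for col in zip(*rows)]
            for y, rows in grouped.items()}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance, defined here as 1 - cosine similarity."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    # Assumption: a zero vector is treated as maximally distant.
    return 1.0 if norm == 0 else 1.0 - dot / norm


def classify(query, samples, labels, k=3, n_candidates=3):
    """Two-stage classification: Euclidean center filter, then cosine KNN."""
    centers = class_centers(samples, labels)
    # Stage 1: keep the n_candidates categories whose centers are
    # closest to the query under Euclidean distance.
    ranked = sorted(centers, key=lambda y: euclidean(query, centers[y]))
    candidates = set(ranked[:n_candidates])
    # Stage 2: rank training points of the candidate categories by
    # cosine distance and take a majority vote over the k nearest.
    pool = [(cosine_distance(query, x), y)
            for x, y in zip(samples, labels) if y in candidates]
    pool.sort(key=lambda t: t[0])
    votes = Counter(y for _, y in pool[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data with four categories.
samples = [[1.0, 0.1], [1.0, 0.2], [0.9, 0.1],
           [0.1, 1.0], [0.2, 1.0], [0.1, 0.9],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
           [-5.0, -5.0], [-5.1, -5.0]]
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d"]
```

Calling `classify([1.0, 0.15], samples, labels)` first discards the category whose center is farthest from the query and then votes among the cosine-nearest points of the remaining three. Filtering by centers first means the more expensive neighbor search runs over only part of the training set, which is consistent with the efficiency gain the abstract reports.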