SURVEY ON K-NEAREST NEIGHBOR APPROACH FOR BIG DATA CLASSIFICATION BASED ON MAP REDUCE
Vigneshwaran.R1, Balaji.V2, Manikandan.A3, Dr. Danapaquiame.N3

Journal Title | : | Asian Journal of Applied Research |
---|---|---|
DOI | : | |
Page No | : | 1-4 |
Volume | : | 3 |
Issue | : | 1 |
Month/Year | : | 1/2017 |
Keywords
Classification algorithm, MapReduce, Parallelism algorithm, algorithm based on MapReduce
Abstract
The K-Nearest Neighbor (KNN) classifier is one of the best-known methods in data mining because of its simplicity and effectiveness. Because of the way it works, its application may be restricted to problems with a limited number of examples, especially when runtime matters. However, classifying large amounts of data is becoming a necessary task in a great number of real-world applications. We identify three generic steps for KNN computation on MapReduce: data preprocessing, data partitioning, and computation. We then analyze each step in terms of load balancing, accuracy, and complexity. This paper uses a variety of datasets and analyzes the impact of data volume, data dimension, and the value of k from several perspectives, including time complexity, space complexity, and accuracy. Overall, this paper can be used as a guide to tackle KNN-based practical problems in the context of big data.
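
The three generic steps named in the abstract can be illustrated with a minimal single-machine sketch in Python. It is not the implementation of any surveyed method: the function names (`fit_scaler`, `partition`, `map_knn`, `reduce_knn`, `knn_classify`), the min-max scaling, the round-robin partitioning, and the majority vote are all assumptions chosen for illustration, and the map and reduce phases are simulated with plain function calls rather than run on a MapReduce cluster.

```python
import heapq
import math
from collections import Counter


def fit_scaler(train):
    """Preprocessing (assumed choice): min-max scale each feature to [0, 1]."""
    features = [x for x, _ in train]
    lows = [min(col) for col in zip(*features)]
    highs = [max(col) for col in zip(*features)]

    def scale(x):
        return tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(x, lows, highs)
        )

    return scale


def partition(train, num_splits):
    """Partitioning (assumed choice): round-robin split of the training set."""
    return [train[i::num_splits] for i in range(num_splits)]


def map_knn(split, query, k):
    """Map phase: the k nearest (distance, label) pairs inside one split."""
    return heapq.nsmallest(k, ((math.dist(x, query), y) for x, y in split))


def reduce_knn(partials, k):
    """Reduce phase: merge local candidates, keep the global k, majority vote."""
    best = heapq.nsmallest(k, (c for part in partials for c in part))
    votes = Counter(label for _, label in best)
    return votes.most_common(1)[0][0]


def knn_classify(train, query, k=3, num_splits=2):
    """Run preprocessing, partitioning and computation for one query point."""
    scale = fit_scaler(train)
    scaled_train = [(scale(x), y) for x, y in train]
    splits = partition(scaled_train, num_splits)
    partials = [map_knn(s, scale(query), k) for s in splits]
    return reduce_knn(partials, k)


# Toy data: two well-separated clusters with string labels.
train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"),
]
print(knn_classify(train, (1, 1)))
print(knn_classify(train, (9, 9)))
```

Because every split returns its own k nearest candidates, the global k nearest neighbors are always among the merged candidates, so the result matches a sequential KNN over the full training set. The cost is that up to k candidates per split reach the reduce phase, which is one reason the number of partitions and the value of k both affect runtime.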