
Smart Removal of Redundant Data using Progressive Techniques

Author(s):

Kirti Mane, S.B Patil COE Indapur; Manisha More, S.B Patil COE Indapur; Vishakha Bansode, S.B Patil COE Indapur; Pallavi Gawade, S.B Patil COE Indapur

Keywords:

Data Cleaning, Data Duplication, Progressiveness

Abstract

Data are among the most important assets of a company, but due to data changes and sloppy data entry, errors such as duplicate entries might occur, making data cleansing, and in particular duplicate detection, indispensable. However, the sheer size of today's data sets renders duplicate detection processes expensive. Online retailers, for example, offer huge catalogs comprising a constantly growing set of items from many different suppliers. As independent persons change the product portfolio, duplicates arise, so there is an obvious need for de-duplication. Progressive duplicate detection identifies most duplicate pairs early in the detection process. Instead of reducing the overall time needed to finish the entire process, progressive approaches try to reduce the average time after which a duplicate is found. Early termination, in particular, then yields more complete results with a progressive algorithm than with any traditional approach.
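The progressive idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a sorted-neighbourhood style ordering, in which records are sorted by a key and the closest neighbours in that order are compared first, and it uses the string similarity from Python's standard `difflib` module as a stand-in matcher. The function names, the 0.85 threshold, the window size, and the sample records are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher
from itertools import islice


def progressive_pairs(records, max_window=4):
    """Yield index pairs, nearest neighbours in sorted order first.

    Assumption: records that sort close together are the most likely
    duplicates, so pairs at rank distance 1 are emitted before pairs
    at rank distance 2, and so on.
    """
    order = sorted(range(len(records)), key=lambda i: records[i].lower())
    for distance in range(1, max_window):
        for pos in range(len(order) - distance):
            yield order[pos], order[pos + distance]


def find_duplicates(records, threshold=0.85, budget=None, max_window=4):
    """Return duplicate index pairs in the order they are discovered.

    `budget` caps the number of comparisons, which models early
    termination: the pairs found so far are still returned.
    """
    pairs = progressive_pairs(records, max_window=max_window)
    if budget is not None:
        pairs = islice(pairs, budget)
    found = []
    for i, j in pairs:
        a, b = records[i].lower(), records[j].lower()
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            found.append((min(i, j), max(i, j)))
    return found


# Hypothetical catalog entries from different suppliers.
records = [
    "Apple iPhone 7 32GB",
    "Dell XPS 13 Laptop",
    "apple iphone 7 32 GB",
    "Dell XPS 13 laptop",
    "Sony Bravia TV",
]

print(find_duplicates(records))            # full run over all windows
print(find_duplicates(records, budget=4))  # stop after four comparisons
```

In this toy data set both duplicate pairs sit next to each other after sorting, so stopping after the first pass of four comparisons already reports the same pairs as the full run. A real catalog would need a more careful choice of sorting key and similarity measure.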

Other Details

Paper ID: IJSRDV4I100125
Published in: Volume 4, Issue 10
Publication Date: 01/01/2017
Page(s): 762-763
