Smart Removal of Redundant Data using Progressive Techniques |
Author(s): |
| Kirti Mane, S.B Patil COE Indapur; Manisha More, S.B Patil COE Indapur; Vishakha Bansode, S.B Patil COE Indapur; Pallavi Gawade, S.B Patil COE Indapur |
Keywords: |
| Data Cleaning, Data Duplication, Progressiveness |
Abstract |
|
Data are among the most important assets of a company. However, due to data changes and sloppy data entry, errors such as duplicate entries might occur, making data cleansing, and in particular duplicate detection, indispensable. The sheer size of today's data sets renders duplicate detection processes expensive. Online retailers, for example, offer huge catalogs comprising a constantly growing set of items from many different suppliers. As independent persons change the product portfolio, duplicates arise. There is therefore an obvious need for de-duplication. Progressive duplicate detection identifies most duplicate pairs early in the detection process. Instead of reducing the overall time needed to finish the entire process, progressive approaches try to reduce the average time after which a duplicate is found. Early termination, in particular, then yields more complete results on a progressive algorithm than on any traditional approach. |
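The progressive idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify an algorithm, so everything here is an assumption for illustration: the records are sorted by a key, neighbours at rank distance 1 are compared first, then distance 2, and so on, so that the most promising comparisons run early and the caller may stop at any time. The names `similar` and `progressive_duplicates`, the `difflib` similarity measure, the 0.8 threshold, and the sample catalog are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterator, List, Tuple


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical similarity test based on difflib's matching ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def progressive_duplicates(
    records: List[str],
    key: Callable[[str], str] = str.lower,
    max_window: int = 3,
) -> Iterator[Tuple[str, str]]:
    """Lazily yield likely duplicate pairs, closest sort-neighbours first."""
    ordered = sorted(records, key=key)
    # Distance 1 compares direct neighbours in sort order, distance 2 the
    # next-nearest ones, and so on; likely duplicates are therefore tried early.
    for distance in range(1, max_window + 1):
        for i in range(len(ordered) - distance):
            a, b = ordered[i], ordered[i + distance]
            if similar(a, b):
                yield a, b


# Assumed sample data, not taken from the paper.
catalog = [
    "Acme Widget 10mm",
    "Bolt M8 Steel",
    "acme widget 10 mm",
    "Bolt M8 steel",
    "Garden Hose 15m",
]

pairs = progressive_duplicates(catalog)
first = next(pairs)  # the caller may stop here: early termination
```

Because the function is a generator, a caller with a limited time budget can consume only as many pairs as it can afford; the remaining, less promising comparisons are never executed.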
Other Details |
|
Paper ID: IJSRDV4I100125; Published in: Volume 4, Issue 10; Publication Date: 01/01/2017; Page(s): 762-763 |