Data Preparation in Strategic Business Intelligence
August 10, 2015. Data Mining: Concepts and Techniques

Why Data Preprocessing?
- Data in the real world is dirty
  - incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
    - e.g., occupation=" "
  - noisy: containing errors or outliers
    - e.g., Salary="-10"
  - inconsistent: containing discrepancies in codes or names
    - e.g., Age="42" Birthday="03/07/1997"
    - e.g., Was rating "1,2,3", now rating "A, B, C"
    - e.g., discrepancy between duplicate records

Why Is Data Dirty?
- Incomplete data may come from
  - "Not applicable" data value when collected
  - Different considerations between the time when the data was collected and when it is analyzed
  - Human/hardware/software problems
- Noisy data (incorrect values) may come from
  - Faulty data collection instruments
  - Human or computer error at data entry
  - Errors in data transmission
- Inconsistent data may come from
  - Different data sources
  - Functional dependency violation (e.g., modify some linked data)
- Duplicate records also need data cleaning

Why Is Data Preprocessing Important?
- No quality data, no quality mining results
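
The kinds of dirty data listed in the slides above (incomplete, noisy, inconsistent, duplicate) can each be caught with a simple check. The following is a minimal sketch, not a method taken from the slides: the record layout, the field names, and the `find_issues` and `age_on` helpers are hypothetical, the birthday "03/07/1997" is assumed to mean March 7, 1997, and the reference date is assumed to be the slide date, August 10, 2015.

```python
from datetime import date

# Hypothetical records modeled on the slide examples (field names are assumptions).
records = [
    {"id": 1, "occupation": "", "salary": -10, "age": 42, "birthday": date(1997, 3, 7)},
    {"id": 2, "occupation": "engineer", "salary": 52000, "age": 18, "birthday": date(1997, 3, 7)},
    {"id": 2, "occupation": "engineer", "salary": 52000, "age": 18, "birthday": date(1997, 3, 7)},
]

AS_OF = date(2015, 8, 10)  # assumed reference date (the date shown on the slides)


def age_on(birthday, as_of):
    """Whole years elapsed between birthday and as_of."""
    before_birthday = (as_of.month, as_of.day) < (birthday.month, birthday.day)
    return as_of.year - birthday.year - before_birthday


def find_issues(rows, as_of=AS_OF):
    """Return (row_index, issue) pairs for four illustrative checks."""
    issues = []
    seen = set()
    for i, row in enumerate(rows):
        # Incomplete: a missing or blank attribute value.
        if not row.get("occupation"):
            issues.append((i, "incomplete: occupation is blank"))
        # Noisy: a value outside the plausible range.
        if row["salary"] < 0:
            issues.append((i, "noisy: negative salary"))
        # Inconsistent: stored age disagrees with the age implied by the birthday.
        if row["age"] != age_on(row["birthday"], as_of):
            issues.append((i, "inconsistent: age does not match birthday"))
        # Duplicate: an identical record has already been seen.
        key = tuple(sorted(row.items()))
        if key in seen:
            issues.append((i, "duplicate record"))
        seen.add(key)
    return issues


for index, issue in find_issues(records):
    print(index, issue)
```

The checks only flag problems; deciding whether to fill, correct, or drop a flagged record is a separate cleaning step.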