Big Data Analytics!!
Special Topics for Computer Science CSE 4095-001 CSE 5095-005
Fei Wang Associate Professor
Department of Computer Science and Engineering [email protected]
Jan 27
TF-IDF
Term Frequency TF(t,d): the number of times term t appears in document d
Inverse Document Frequency IDF(t,D): the logarithmically scaled inverse fraction of the documents in D that contain term t, log(# documents in D / # documents containing term t)
TF-IDF(t,d,D) = TF(t,d) * IDF(t,D)
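The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming documents are already tokenized into lists of words; the toy corpus and the choice to return 0 for a term that appears in no document are assumptions made for the example, not part of the slide.

```python
import math
from collections import Counter

def tf(term, doc):
    """Raw count of `term` in a tokenized document (a list of words)."""
    return Counter(doc)[term]

def idf(term, corpus):
    """log(# documents in corpus / # documents containing term)."""
    n_containing = sum(1 for doc in corpus if term in doc)
    if n_containing == 0:
        return 0.0  # assumption: a term absent from the corpus gets weight 0
    return math.log(len(corpus) / n_containing)

def tf_idf(term, doc, corpus):
    """TF-IDF(t,d,D) = TF(t,d) * IDF(t,D)."""
    return tf(term, doc) * idf(term, corpus)

# Hypothetical toy corpus of three tokenized tweets.
corpus = [
    "smoking e-cigarette quit smoking".split(),
    "hookah bar tonight".split(),
    "quit smoking today".split(),
]

print(tf_idf("smoking", corpus[0], corpus))  # TF = 2, IDF = log(3/2)
print(tf_idf("hookah", corpus[0], corpus))   # TF = 0, so the product is 0
```

A term that occurs in every document gets IDF = log(1) = 0, which is the intended effect: ubiquitous words carry no discriminating weight.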
Text Representation
Myslín, Mark, et al. "Using Twitter to examine smoking behavior and perceptions of emerging tobacco products." Journal of medical Internet research 15.8 (2013).
Electronic Health Records
An Electronic Health Record (EHR) is an evolving concept defined as a systematic collection of electronic health information about individual patients or populations.
Jensen, Peter B., Lars J. Jensen, and Søren Brunak. "Mining electronic health records: towards better research applications and clinical care." Nature Reviews Genetics (2012).
Vector Based Representation
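A vector is a collection of numbers.
Row vector: v = [v1, v2, …, vd]
Column vector: v' = [v1, v2, …, vd]', the same entries stacked vertically
Dimensionality: d, the number of entries
Transpose: v' denotes the transpose of v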
Patient Diagnosis Vector: d is the number of distinct diagnosis codes, and xi is the frequency of the i-th diagnosis code in the patient's historical records
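A patient diagnosis vector of this kind can be built by counting codes against a fixed code index. The sketch below is only illustrative: the code list and the patient history are hypothetical, and any consistent ordering of the d distinct codes would work.

```python
from collections import Counter

def diagnosis_vector(history, code_index):
    """Return [x1, ..., xd], where xi counts the i-th code in `history`."""
    counts = Counter(history)
    return [counts[code] for code in code_index]

# Hypothetical index of d = 3 distinct diagnosis codes.
code_index = ["250.00", "401.9", "428.0"]
# Hypothetical sequence of codes recorded for one patient.
history = ["401.9", "250.00", "401.9", "401.9"]

x = diagnosis_vector(history, code_index)
print(x)       # one count per code, in code_index order
print(len(x))  # dimensionality d
```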
Matrix Based Representation
[Figure: Patient EHR Matrix (raw medical features over time, within an observation window) and the resulting Patient Feature Vector, used for patient similarity, predictive modeling, risk stratification, ...]
• Jimeng Sun, Fei Wang, Jianying Hu, Shahram Ebadollahi: Supervised patient similarity measure of heterogeneous patient records. SIGKDD Explorations 14(1): 16-24 (2012)
• Fei Wang, Jimeng Sun, Shahram Ebadollahi: Composite distance metric integration by leveraging multiple experts' inputs and its application in patient similarity assessment. Statistical Analysis and Data Mining 5(1): 54-69 (2012)
• J. Wu, J. Roy, W. F. Stewart: Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches. Medical Care 48: S106-S113 (2010)
[Figure: Temporal Matrices and Aggregated Vectors]
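One simple way to turn a temporal patient matrix into an aggregated vector is to sum each feature over the time steps inside the observation window. The sketch below assumes rows are features and columns are time steps; the matrix values and the use of a sum (rather than, say, a mean or maximum) are illustrative assumptions.

```python
def aggregate_window(ehr_matrix, start, end):
    """Sum each feature row over time columns start..end-1."""
    return [sum(row[start:end]) for row in ehr_matrix]

# Hypothetical patient EHR matrix: 3 features (rows) x 5 time steps (columns).
ehr_matrix = [
    [1, 0, 0, 2, 1],  # e.g. counts of a diagnosis code per time step
    [0, 1, 1, 0, 0],  # e.g. counts of a medication per time step
    [0, 0, 3, 1, 0],  # e.g. counts of a lab test per time step
]

# Observation window covering time steps 1, 2 and 3.
feature_vector = aggregate_window(ehr_matrix, 1, 4)
print(feature_vector)
```

Aggregation discards the ordering of events inside the window, which is exactly the information the sequence based representation tries to keep.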
Sequence Based Representation
Diagnosis A → Medication B → Lab Test C → Diagnosis D → Medication B → . . .
The sequentiality of those events may indicate some impending disease conditions
How to interpret and make use of the sequentiality of the events?
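One elementary way to expose sequentiality, offered here only as an illustration and not as the method the lecture goes on to use, is to count pairs of consecutive events (bigrams). The event names follow the slide; the function name is hypothetical.

```python
from collections import Counter

def event_bigrams(events):
    """Count each ordered pair of consecutive events in the sequence."""
    return Counter(zip(events, events[1:]))

events = ["Diagnosis A", "Medication B", "Lab Test C",
          "Diagnosis D", "Medication B"]

bigrams = event_bigrams(events)
for (first, second), count in bigrams.items():
    print(f"{first} -> {second}: {count}")
```

Unlike a plain frequency vector, these counts distinguish "Diagnosis A followed by Medication B" from "Diagnosis D followed by Medication B".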
Feature Distribution
https://www.linkedin.com/pulse/20140215200145-131079-the-myth-of-the-bell-curve
Power-law distributions in empirical data
Box 1: Recipe for analyzing power-law distributed data
This paper contains much technical detail. In broad outline, however, the recipe we propose for the analysis of power-law data is straightforward and goes as follows.
1. Estimate the parameters x_min and α of the power-law model using the methods described in Section 3.
2. Calculate the goodness-of-fit between the data and the power law using the method described in Section 4. If the resulting p-value is greater than 0.1 the power law is a plausible hypothesis for the data, otherwise it is rejected.
3. Compare the power law with alternative hypotheses via a likelihood ratio test, as described in Section 5. For each alternative, if the calculated likelihood ratio is significantly different from zero, then its sign indicates whether the alternative is favored over the power-law model or not.
Step 3, the likelihood ratio test for alternative hypotheses, could in principle be replaced with any of several other established and statistically principled approaches for model comparison, such as a fully Bayesian approach [32], a cross-validation approach [59], or a minimum description length approach [20], although none of these methods are described here.
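Step 1 of the recipe can be illustrated for the continuous case with the standard maximum-likelihood estimator α = 1 + n / Σ ln(x_i / x_min). The sketch below assumes x_min is already known (the excerpt does not show how it is chosen) and uses synthetic data drawn by inverse-transform sampling; the function names are hypothetical.

```python
import math
import random

def fit_alpha(data, xmin):
    """MLE of the continuous power-law exponent, for a given xmin."""
    tail = [x for x in data if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def sample_power_law(n, alpha, xmin, rng):
    """Draw n values with Pr(X >= x) = (x / xmin) ** (1 - alpha)."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
data = sample_power_law(20000, alpha=2.5, xmin=1.0, rng=rng)
alpha_hat = fit_alpha(data, xmin=1.0)
print(alpha_hat)  # should land close to the true value 2.5
```

Steps 2 and 3 (goodness-of-fit and likelihood ratio tests) are not covered by this sketch.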
In the discrete case, x can take only a discrete set of values. In this paper we consider only the case of integer values with a probability distribution of the form

$$p(x) = \Pr(X = x) = C x^{-\alpha}. \tag{2.3}$$

Again this distribution diverges at zero, so there must be a lower bound x_min > 0 on the power-law behavior. Calculating the normalizing constant, we then find that

$$p(x) = \frac{x^{-\alpha}}{\zeta(\alpha, x_{\min})}, \tag{2.4}$$

where

$$\zeta(\alpha, x_{\min}) = \sum_{n=0}^{\infty} (n + x_{\min})^{-\alpha} \tag{2.5}$$

is the generalized or Hurwitz zeta function. Table 2.1 summarizes the basic functional forms and normalization constants for these and several other distributions that will be useful.
In many cases it is useful to consider also the complementary cumulative distribution function or CDF of a power-law distributed variable, which we denote P(x) and which for both continuous and discrete cases is defined to be P(x) = Pr(X ≥ x). For instance, in the continuous case

$$P(x) = \int_{x}^{\infty} p(x')\,dx' = \left(\frac{x}{x_{\min}}\right)^{-\alpha+1}. \tag{2.6}$$

In the discrete case

$$P(x) = \frac{\zeta(\alpha, x)}{\zeta(\alpha, x_{\min})}. \tag{2.7}$$
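The discrete formulas (2.4), (2.5) and (2.7) can be checked numerically. The sketch below approximates the Hurwitz zeta function with a truncated sum plus an Euler-Maclaurin tail correction; that approximation, the helper names and the parameter values are assumptions made for illustration, and a library routine could be substituted where one is available.

```python
import math

def hurwitz_zeta(alpha, q, n_terms=1000):
    """Approximate sum_{n>=0} (n + q) ** -alpha for alpha > 1, q > 0."""
    head = sum((n + q) ** -alpha for n in range(n_terms))
    a = n_terms + q
    # Tail: integral of the remaining terms plus half the first omitted term.
    tail = a ** (1 - alpha) / (alpha - 1) + 0.5 * a ** -alpha
    return head + tail

def pmf(x, alpha, xmin):
    """Eq. (2.4): p(x) = x ** -alpha / zeta(alpha, xmin), for integer x >= xmin."""
    return x ** -alpha / hurwitz_zeta(alpha, xmin)

def ccdf(x, alpha, xmin):
    """Eq. (2.7): P(x) = Pr(X >= x) = zeta(alpha, x) / zeta(alpha, xmin)."""
    return hurwitz_zeta(alpha, x) / hurwitz_zeta(alpha, xmin)

alpha, xmin = 2.5, 2  # illustrative values
print(ccdf(xmin, alpha, xmin))  # everything lies at or above xmin
print(pmf(5, alpha, xmin))
print(ccdf(5, alpha, xmin) - ccdf(6, alpha, xmin))  # should match pmf(5)
```

The last two printed values agree because Pr(X = x) = Pr(X ≥ x) − Pr(X ≥ x + 1).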
Feature Normalization
[...] the others according to the empirical retrieval results that will be presented in Section 4.

3.1. Linear scaling to unit range
Given a lower bound l and an upper bound u for a feature component x,

$$\tilde{x} = \frac{x - l}{u - l} \tag{1}$$

results in $\tilde{x}$ being in the [0,1] range.

3.2. Linear scaling to unit variance
Another normalization procedure is to transform the feature component x to a random variable with zero mean and unit variance as

$$\tilde{x} = \frac{x - \mu}{\sigma} \tag{2}$$

where $\mu$ and $\sigma$ are the sample mean and the sample standard deviation of that feature, respectively (Jain and Dubes, 1988).
If we assume that each feature is normally distributed, the probability of $\tilde{x}$ being in the [-1,1] range is 68%. An additional shift and rescaling as

$$\tilde{x} = \frac{\dfrac{x - \mu}{3\sigma} + 1}{2} \tag{3}$$

guarantees 99% of $\tilde{x}$ to be in the [0,1] range. We can then truncate the out-of-range components to either 0 or 1.

3.3. Transformation to a Uniform [0,1] random variable
Given a random variable x with cumulative distribution function $F_x(x)$, the random variable $\tilde{x}$ resulting from the transformation $\tilde{x} = F_x(x)$ is uniformly distributed in the [0,1] range (Papoulis, 1991).

3.4. Rank normalization
Given the sample for a feature component for all images as $x_1, \ldots, x_n$, first we find the order statistics $x_{(1)}, \ldots, x_{(n)}$ and then replace each image's feature value by its corresponding normalized rank, as

$$\tilde{x}_i = \frac{\operatorname{rank}_{x_1, \ldots, x_n}(x_i) - 1}{n - 1} \tag{5}$$

where $x_i$ is the feature value for the ith image. This procedure uniformly maps all feature values to the [0,1] range. When there are more than one image with the same feature value, for example after quantization, they are assigned the average rank for that value.

3.5. Normalization after fitting distributions
The transformations in Section 3.2 assume that a feature has a Normal $(\mu, \sigma^2)$ distribution. The sample values can be used to find better estimates for the feature distributions. Then, these estimates can be used to find normalization methods based particularly on these distributions.
The following sections describe how to fit Normal, Lognormal, Exponential and Gamma densities to a random sample. We also give the difference distributions because the image similarity measures use feature differences. After estimating the parameters of a distribution, the cut-off value that includes 99% of the feature values is found and the sample values are scaled and truncated so that each feature component have the same range.
Since the original feature values are positive, we use only the positive section of the Normal density after fitting. Lognormal, Exponential and Gamma densities are defined for random variables with only positive values. Other distributions that are commonly encountered in the statistics literature are the Uniform, $\chi^2$ and Weibull (which are special cases of Gamma), Beta (which is defined only for [0,1]) and Cauchy (whose moments do not exist). Although these distributions can also be used by first estimating their parameters and then finding the cut-off values, we will show that the distributions used in this paper can quite generally model features from different feature extraction algorithms.

S. Aksoy, R.M. Haralick / Pattern Recognition Letters 22 (2001) 563-582
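Rank normalization with average ranks for ties can be sketched as follows. This is a minimal illustration using only the standard library; the function name and sample values are hypothetical, and returning zeros for samples with fewer than two values is an assumption, since the formula divides by n − 1.

```python
def rank_normalize(values):
    """Map each value to (rank - 1) / (n - 1), averaging ranks over ties."""
    n = len(values)
    if n < 2:
        return [0.0] * n  # assumption: normalized rank is undefined for n < 2
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to the end of the run of equal values starting at i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [(r - 1) / (n - 1) for r in ranks]

print(rank_normalize([10, 30, 20, 30]))
```

In this example the two tied values share rank 3.5, so both map to 2.5 / 3.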
Linear Scaling to [0,1]
Scaling to standard normal distribution with zero mean and unit variance
99% of the data in [0,1] range
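The three scalings named on this slide can be sketched with the standard library. The function names and sample data are hypothetical; the code assumes the feature is not constant, so that the range and the sample standard deviation are nonzero.

```python
import statistics

def scale_unit_range(xs, lower=None, upper=None):
    """Linear scaling to [0,1]: (x - l) / (u - l)."""
    lower = min(xs) if lower is None else lower
    upper = max(xs) if upper is None else upper
    return [(x - lower) / (upper - lower) for x in xs]

def scale_unit_variance(xs):
    """Zero mean, unit variance: (x - mean) / sample standard deviation."""
    mu = statistics.mean(xs)
    sigma = statistics.stdev(xs)
    return [(x - mu) / sigma for x in xs]

def scale_three_sigma(xs):
    """((x - mean) / (3 * sd) + 1) / 2, truncated to [0,1]."""
    mu = statistics.mean(xs)
    sigma = statistics.stdev(xs)
    scaled = [((x - mu) / (3 * sigma) + 1) / 2 for x in xs]
    return [min(1.0, max(0.0, y)) for y in scaled]

xs = [2, 4, 6]
print(scale_unit_range(xs))
print(scale_unit_variance(xs))
print(scale_three_sigma(xs))
```

Under a normality assumption the third scaling places about 99% of values inside [0,1] before truncation, so the truncation step only affects outliers.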