bharat-khatri
Machine Learning Overview
Let’s attempt a definition ...
“Algorithms for inferring unknowns from knowns”
What type of inference are we talking about?
Statistical Inference
Where do I spot Machine Learning?
● Spam Identification
● Handwriting Recognition
● Image Recognition
● Speech Recognition
● Recommendation Systems
● Climate Modelling
Can I group these applications into abstract categories?
● Supervised Learning
● Unsupervised Learning
Supervised Learning
● Classification
● Regression
Unsupervised Learning
● Clustering
● Density Estimation
● Dimensionality Reduction
More abstract categories ...
● Semi-supervised Learning
● Active Learning
● Reinforcement Learning
Generative vs Discriminative Models
Generative models contrast with discriminative models, in that a generative model is a full probabilistic model of all variables, whereas a discriminative model provides a model only for the target variable(s) conditional on the observed variables.
Discriminative model uses P(y|x)
Generative model uses P(x,y)
P(x,y) = P(x|y) * P(y) = f(x|y) * P(y)
P(x,y) = P(y|x) * P(x) = P(y|x) * f(x)
(f denotes the density of x, which takes the place of P when x is continuous)
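The two factorizations can be checked numerically on a small discrete example. The sketch below is illustrative only: the joint table `joint` and the helper names are made up for this example and do not come from the slides.

```python
# Hypothetical joint distribution P(x, y) over x in {0, 1} and y in {0, 1}.
# The numbers are invented for illustration; they only need to sum to 1.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}


def marginal_x(x):
    """P(x): sum the joint over all values of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def marginal_y(y):
    """P(y): sum the joint over all values of x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def cond_x_given_y(x, y):
    """P(x | y) = P(x, y) / P(y)."""
    return joint[(x, y)] / marginal_y(y)


def cond_y_given_x(y, x):
    """P(y | x) = P(x, y) / P(x)."""
    return joint[(x, y)] / marginal_x(x)


# Both factorizations should reproduce every entry of the joint table.
for (x, y), p in joint.items():
    via_y = cond_x_given_y(x, y) * marginal_y(y)  # P(x|y) * P(y)
    via_x = cond_y_given_x(y, x) * marginal_x(x)  # P(y|x) * P(x)
    assert abs(p - via_y) < 1e-12
    assert abs(p - via_x) < 1e-12
```

A generative model stores enough to compute every quantity above; a discriminative model only provides `cond_y_given_x`.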
Thus a generative model can be used, for example, to simulate (i.e. generate) values of any variable in the model, whereas a discriminative model allows only sampling of the target variables conditional on the observed quantities.
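As a rough illustration of this difference, the sketch below assumes a toy generative model with a prior P(y) and a Gaussian class-conditional density f(x|y). The parameters `prior`, `mean` and `std` are hypothetical choices for this example, not values from the slides.

```python
import math
import random

rng = random.Random(0)  # fixed seed so runs are repeatable

# Hypothetical generative model: prior P(y) and Gaussian f(x | y).
prior = {0: 0.5, 1: 0.5}
mean = {0: -1.0, 1: 1.0}
std = 1.0  # shared standard deviation (an assumption of this sketch)


def sample_joint():
    """Generative use: draw y from P(y), then x from f(x | y)."""
    y = 1 if rng.random() < prior[1] else 0
    x = rng.gauss(mean[y], std)
    return x, y


def p_y1_given_x(x):
    """P(y=1 | x) by Bayes' rule; the Gaussian normalising constant
    cancels because both classes share the same std."""
    def unnormalised_density(y):
        return math.exp(-0.5 * ((x - mean[y]) / std) ** 2)

    numerator = unnormalised_density(1) * prior[1]
    denominator = numerator + unnormalised_density(0) * prior[0]
    return numerator / denominator


def sample_y_given_x(x):
    """Discriminative use: x must be supplied; only y is sampled."""
    return 1 if rng.random() < p_y1_given_x(x) else 0
```

`sample_joint()` produces complete (x, y) pairs from nothing, while `sample_y_given_x(x)` cannot run without an observed x, which is the limitation the slide describes.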
Generative and Discriminative in Classification
Generative model:
● Typically more flexible than discriminative models in expressing dependencies in complex learning tasks.
● More powerful, as it models all variables.
● Estimating densities takes a lot of data and can be difficult, so performance may be worse.
Examples: Naive Bayes, Hidden Markov Model
Discriminative model:
● For tasks such as classification and regression that do not require the joint distribution, discriminative models can yield superior performance.
Examples: Linear Regression, Logistic Regression
k Nearest Neighbour
D = {(x1,y1); (x2,y2); …; (xn,yn)}
where each xi belongs to R^d and each yi is 0 or 1 (binary classification).
● Classifies a new point x according to a majority vote of the k nearest points in D.
● Requires a distance metric d(xi, xj), for example Euclidean distance.
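The majority-vote rule can be sketched in a few lines. This is a minimal illustration, not an efficient implementation; the function names and the toy dataset `D` are invented for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(D, x, k, dist=euclidean):
    """Return the majority label among the k points in D nearest to x.

    D is a list of (point, label) pairs. With binary labels, an odd k
    avoids tied votes.
    """
    neighbours = sorted(D, key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy dataset: two well-separated groups in R^2.
D = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
     ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1)]

label_near_origin = knn_predict(D, (0.05, 0.05), k=3)
label_near_ones = knn_predict(D, (1.0, 0.95), k=3)
```

Sorting all of D costs O(n log n) per query; that is fine for a sketch, though larger datasets usually call for a spatial index.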
Probabilistic Interpretation
For some fixed parameter k, Y is a random variable whose pmf is defined as
P(y) = P(y | x, D) = fraction of points xi in Nk(x) such that yi = y
where Nk(x) is the set of the k nearest neighbours of x in D.
y_est = arg-max over y of P(y | x, D)
This is a discriminative model, as we don't have any distribution for generating x.
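The pmf and the arg-max estimate can be computed directly from the neighbour labels. The sketch below is one possible reading of the definition above; `knn_pmf`, `knn_estimate` and the toy dataset are hypothetical names and values chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_pmf(D, x, k, labels=(0, 1)):
    """P(y | x, D): fraction of the k nearest neighbours of x with label y."""
    neighbours = sorted(D, key=lambda pair: euclidean(pair[0], x))[:k]
    return {y: sum(1 for _, yi in neighbours if yi == y) / k for y in labels}


def knn_estimate(D, x, k):
    """y_est: the label that maximises P(y | x, D)."""
    pmf = knn_pmf(D, x, k)
    return max(pmf, key=pmf.get)


# Hypothetical toy dataset: two groups in R^2.
D = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
     ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1)]

pmf = knn_pmf(D, (0.4, 0.4), k=5)
y_est = knn_estimate(D, (0.4, 0.4), k=5)
```

Note that nothing here models how x itself is distributed, which is why the slide classifies kNN as discriminative.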
The parameter k should be chosen with the bias-variance trade-off in mind, for example by cross-validation: a small k gives low bias but high variance, and a large k gives the opposite.
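One simple way to pick k is leave-one-out cross-validation: classify each point using all the others and count the mistakes. The sketch below assumes that scheme; the slides do not specify which cross-validation variant to use, and all names and data here are illustrative.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(D, x, k):
    """Majority label among the k points in D nearest to x."""
    neighbours = sorted(D, key=lambda pair: euclidean(pair[0], x))[:k]
    return Counter(y for _, y in neighbours).most_common(1)[0][0]


def loo_error(D, k):
    """Leave-one-out error rate: each point is predicted from the rest."""
    errors = 0
    for i, (x, y) in enumerate(D):
        rest = D[:i] + D[i + 1:]
        if knn_predict(rest, x, k) != y:
            errors += 1
    return errors / len(D)


def choose_k(D, candidates):
    """Return the candidate k with the lowest leave-one-out error.

    Ties go to the earliest candidate in the list.
    """
    return min(candidates, key=lambda k: loo_error(D, k))


# Hypothetical toy dataset: two groups of three points in R^2.
D = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
     ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1)]

best_k = choose_k(D, candidates=[1, 3, 5])
```

On this tiny dataset k = 5 fails on every held-out point, because the held-out point's own class is always outnumbered among the remaining five; this is an extreme case of an overly large k washing out local structure.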