rahul-jaiman
Machine Learning
“A YEAR SPENT IN ARTIFICIAL INTELLIGENCE IS ENOUGH TO MAKE ONE BELIEVE IN GOD” – ALAN PERLIS
Agenda
Introduction
Basics
Types of Machine Learning
Machine Learning Technologies
Application
Vision in next few years
Quick Questionnaire
How many people have heard about Machine Learning?
How many people know about Machine Learning?
How many people are using Machine Learning?
What is Machine Learning?
A subfield of Artificial Intelligence.
Arthur Samuel introduced the concept of Machine Learning in 1959:
"Field of study that gives computers the ability to learn without being explicitly programmed".
Tom Mitchell's definition: a computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P) if its performance at tasks in T, as measured by P, improves with E.
What is Machine Learning?
Explores the study and construction of algorithms that can learn from and make predictions on data.
Such algorithms operate by building a model from example inputs.
They make data-driven predictions or decisions rather than following strictly static program instructions.
Artificial Intelligence
Machine Learning is a branch of Artificial Intelligence.
It aims to give machines learning capabilities like those of humans.
By some estimates, even the fastest supercomputer is 32 times slower than the human brain.
Some predictions say that by 2060 we will be able to build a digital brain like a human's.
NLP (Natural Language Processing) is also based on Machine Learning: the more data the machine has, the more accurate its predictions become.
The Titanic disaster might have been averted with Machine Learning.
Use of Machine Learning
Google Search, Google News and page ranking are driven by Machine Learning.
When you upload images, your friends' faces are detected automatically.
Spam filters separate our mail from tons of spam.
Recommending the right product to the right customers.
More applications
Speech and handwriting recognition
Autonomous robot control
Data mining and bioinformatics: motifs, alignment, …
Playing games
Fault detection
Clinical diagnosis
Credit scoring, fraud detection
Web mining: search engines
Market basket analysis
Why Machine Learning
Human expertise does not exist (navigating on Mars), e.g. TARS in Interstellar.
Humans are unable to explain their expertise (speech recognition).
Solution changes in time (routing on a computer network).
Solution needs to be adapted to particular cases (user biometrics).
Terminology / Basic Terms
Features – The distinct traits that can be used to describe each item in a quantitative manner.
Samples – A sample is an item to process. It can be a document, picture, sound, video or any other file that contains data.
Feature Vector – An n-dimensional vector of numerical features that represents some object.
Training Set – A set of data used to discover potentially predictive relationships.
Terminology with Example
Features: Color – Red, Type – Logo, Shape
Features: Color – Light Blue, Type – Logo, Shape
Here the samples are both apples, Feature Vector = [Color, Type, Shape], Training Set = all the samples taken together.
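The apple example above can be made concrete by turning each sample into a numeric feature vector. The following is only a sketch: the one-hot encoding, the helper names `build_vocabulary` and `encode`, and the shape value "apple" (the slide does not state a shape value) are assumptions made for illustration.

```python
# Two samples described by the features named on the slide.
# The shape value "apple" is an assumption; the slide leaves it blank.
samples = [
    {"color": "red", "type": "logo", "shape": "apple"},
    {"color": "light blue", "type": "logo", "shape": "apple"},
]

FEATURES = ["color", "type", "shape"]


def build_vocabulary(samples):
    """Collect the distinct values seen for each feature, in sorted order."""
    return {f: sorted({s[f] for s in samples}) for f in FEATURES}


def encode(sample, vocabulary):
    """One-hot encode a sample into a flat numeric feature vector."""
    vector = []
    for f in FEATURES:
        for value in vocabulary[f]:
            vector.append(1 if sample[f] == value else 0)
    return vector


vocabulary = build_vocabulary(samples)
# The training set is simply all encoded samples taken together.
training_set = [encode(s, vocabulary) for s in samples]
```

Each vector has one slot per (feature, value) pair, so the two apples differ only in the slots that encode color.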
Categories
Types of Problems and Tasks
Depending on the nature of the learning "signal" or "feedback" available to a learning system.
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Example of Supervised Learning
Supervised Learning
Learning from labelled data, using a set of training examples.
The inputs and their desired outputs are given.
The goal is to learn a general rule that maps inputs to outputs, i.e. to find the relationship between input and output that holds across all the training examples.
The input data is called a vector and the output value is called the supervisory signal.
Requires the presence of an expert or teacher.
E.g. Neural Networks, Decision Trees, Bayesian Classification.
To solve Supervised Learning problem
Determine the type of training examples.
Decide what kind of data is to be used as a training set.
Gather a training set. A set of input objects and their corresponding outputs is gathered.
Determine the input feature representation of the learned function. The input object is transformed into a feature vector, which contains a number of features that are descriptive of the object.
To solve Supervised Learning problem
Determine the structure of the learned function and the corresponding learning algorithm.
Find the function or algorithm that maps the inputs of the training set to their outputs, like a bridge connecting input to output.
Complete the design. Some algorithms need control parameters, which are adjusted by optimizing performance.
Evaluate the accuracy of the learned function. Check whether it works properly; if not, redesign.
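The steps above can be sketched end to end with a very small classifier. This is a minimal illustration, not a recommended design: the k-nearest-neighbour rule, the made-up height/weight data, and the names `predict` and `accuracy` are all assumptions chosen to keep the example short.

```python
import math

# Gather a training set: input feature vectors [height_cm, weight_kg]
# paired with their known outputs (hypothetical data).
training_set = [
    ([150.0, 50.0], "small"),
    ([155.0, 55.0], "small"),
    ([160.0, 58.0], "small"),
    ([180.0, 85.0], "large"),
    ([185.0, 90.0], "large"),
    ([190.0, 95.0], "large"),
]

# Held-out examples, used only to evaluate the learned function.
test_set = [
    ([152.0, 52.0], "small"),
    ([188.0, 92.0], "large"),
]


def predict(x, examples, k=1):
    """Learned function: majority label among the k nearest training examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(x, ex[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


def accuracy(examples, train, k):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum(predict(x, train, k) == y for x, y in examples)
    return hits / len(examples)


# k is the control parameter of this design; 3 is an arbitrary choice here.
score = accuracy(test_set, training_set, k=3)
```

If `score` turned out to be too low, the last step on the slide applies: go back and change the features, the algorithm, or the control parameter.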
Supervised Learning Flow Chart
[Flow chart: Raw Data, Algorithm, Sample Data, Trained Product, Verification, Production]
Application
Bioinformatics
Database marketing
Handwriting recognition
Spam detection
Pattern recognition
Speech recognition
Unsupervised Learning
No labels are given to the learning algorithm.
It finds structure in its input, for example through clustering.
It discovers hidden patterns in data.
As the input is unlabelled, there is no error or reward signal to evaluate a potential solution. This distinguishes it from the other categories.
A self-guided learning approach.
Plays an important role in data mining, where it is used to preprocess the data.
Approaches to Unsupervised Learning – K-means, hierarchical clustering, mixture models.
Unsupervised Learning
K-means / Hierarchical
K-means is a method of vector quantization.
It partitions n observations into k clusters, where each observation belongs to the cluster with the nearest mean.
Popular for cluster analysis in data mining.
Finding the optimal partition is an NP-hard problem.
Hierarchical clustering builds a hierarchy of clusters.
Agglomerative (bottom-up approach)
Divisive (top-down approach)
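Because the exact k-means problem is NP-hard, practical implementations use a heuristic. The sketch below shows the common one (Lloyd's algorithm), which alternates between assigning points to the nearest mean and recomputing the means. The function name `k_means`, the sample points, and the hand-picked starting centroids are assumptions for illustration; the result is a local optimum, not a guaranteed best partition.

```python
import math


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    centroids = [list(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each observation joins the cluster with the nearest mean.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(axis) / len(members) for axis in zip(*members)]
    return centroids, assignment


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
# Starting centroids are picked by hand here; they are usually chosen at random.
centroids, assignment = k_means(points, centroids=[points[0], points[3]])
```

Here `assignment[i]` is the index of the cluster that `points[i]` belongs to, and `centroids` holds the k cluster means.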
Applications
Difference Supervised Vs Unsupervised
Reinforcement Learning
The program interacts with a dynamic environment.
It receives no explicit instructions and must decide on its own whether it is getting nearer to its goal.
Also called “Approximate Dynamic Programming”.
Unlike supervised learning, correct input/output pairs are never presented.
Unlike supervised learning, no step explicitly tells the program that it has reached its goal.
There is a focus on on-line performance.
It must find a balance between exploration (of uncharted territory) and exploitation (of current knowledge).
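One simple way to picture the balance between exploration and exploitation is an epsilon-greedy rule on a multi-armed bandit. This is a sketch under assumptions: the bandit setting, the function name `epsilon_greedy`, the reward means, and the value `epsilon=0.1` are not from the slides.

```python
import random


def epsilon_greedy(true_means, steps=2000, epsilon=0.1, seed=0):
    """Estimate each action's value while mixing exploration and exploitation."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # the agent's current knowledge
    counts = [0] * len(true_means)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action, possibly uncharted territory.
            action = rng.randrange(len(true_means))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(true_means)), key=lambda a: estimates[a])
        # The environment only returns a noisy scalar reward,
        # never the "correct" action.
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = epsilon_greedy(true_means=[0.2, 0.5, 1.5])
```

With a small `epsilon` the agent spends most steps on the action it believes is best, while still sampling the others often enough to correct a wrong belief.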
Basic Reinforcement Learning Model
Set of environment states S.
Set of actions A.
Rules of transitions between states.
Rules that determine the scalar immediate reward of transition.
Rules that describe what the agent observes.
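The five ingredients listed above can be written down directly for a toy problem. The corridor world below, its reward values, and the function names are hypothetical; the sketch only shows how the pieces fit together.

```python
# Set of environment states S and set of actions A (hypothetical corridor world).
STATES = [0, 1, 2, 3]          # state 3 is the terminal goal state
ACTIONS = ["left", "right"]


def transition(state, action):
    """Rule of transition: move one cell, staying inside the corridor."""
    step = 1 if action == "right" else -1
    return min(max(state + step, 0), STATES[-1])


def reward(state, action, next_state):
    """Scalar immediate reward of a transition: +10 at the goal, -1 otherwise."""
    return 10 if next_state == STATES[-1] else -1


def observe(state):
    """What the agent observes: here the state itself (fully observable)."""
    return state


def run_episode(policy, start=0, max_steps=20):
    """Let the agent interact with the environment and sum up its rewards."""
    state, total = start, 0
    for _ in range(max_steps):
        if state == STATES[-1]:
            break
        action = policy(observe(state))
        next_state = transition(state, action)
        total += reward(state, action, next_state)
        state = next_state
    return total


always_right = run_episode(lambda obs: "right")
always_left = run_episode(lambda obs: "left")
```

A policy is just a function from observations to actions; comparing `always_right` with `always_left` shows how the reward signal separates good behaviour from bad.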
Algorithms used for Reinforcement Learning
Criterion of optimality
The problem studied is episodic, with an episode ending when some terminal state is reached.
Brute force (two steps)
1. For each possible policy, sample returns while following it.
2. Choose the policy with the largest expected return.
The two main alternatives to brute force are value function estimation and direct policy search.
Value function approaches
They look for a policy that maximizes the return by maintaining a set of estimates of expected returns.
Based on the theory of MDPs (Markov Decision Processes).
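Both ideas can be compared on a tiny episodic problem. The sketch below is illustrative only: the four-state corridor, the discount factor `GAMMA`, and all function names are assumptions. Because this toy environment is deterministic, one sampled episode per policy is enough; a stochastic environment would need many samples per policy.

```python
import itertools

N_STATES = 4                    # states 0..3, state 3 is terminal
ACTIONS = (-1, +1)              # move left or move right
GAMMA = 0.9                     # discount factor (an assumption)


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (10.0 if nxt == N_STATES - 1 else -1.0)


def episode_return(policy, start=0, max_steps=20):
    """Discounted return of one episode that follows `policy` from `start`."""
    state, total, discount = start, 0.0, 1.0
    for _ in range(max_steps):
        if state == N_STATES - 1:
            break
        state, r = step(state, policy[state])
        total += discount * r
        discount *= GAMMA
    return total


def brute_force():
    """Try every deterministic policy and keep the one with the largest return."""
    policies = itertools.product(ACTIONS, repeat=N_STATES - 1)
    return max(policies, key=episode_return)


def value_iteration(sweeps=50):
    """Maintain a table of expected returns, then act greedily on it."""
    values = [0.0] * N_STATES   # the terminal state keeps value 0
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            outcomes = [step(s, a) for a in ACTIONS]
            values[s] = max(r + GAMMA * values[n] for n, r in outcomes)
    policy = tuple(
        max(ACTIONS, key=lambda a: step(s, a)[1] + GAMMA * values[step(s, a)[0]])
        for s in range(N_STATES - 1)
    )
    return policy, values


best_by_search = brute_force()
best_by_values, values = value_iteration()
```

Brute force has to evaluate every one of the 2^3 policies, which grows exponentially with the number of states; the value table reaches the same policy by updating one estimate per state.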
Applications of Reinforcement Learning
Game theory
Control theory
Operations research
Information theory
Simulation-based optimization
Multi-agent systems
Swarm intelligence
Statistics
Genetic algorithms
Semi-Supervised Learning
Semi-supervised learning is a class of supervised learning tasks.
It also uses a large amount of unlabelled data together with the labelled data.
It falls between unsupervised learning and supervised learning.
Assumptions used in semi-supervised learning:
Smoothness assumption - Points which are close to each other are more likely to share a label.
Cluster assumption - The data tend to form discrete clusters, and points in the same cluster are more likely to share a label.
Manifold assumption - The data lie approximately on a manifold of much lower dimension than the input space.
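The smoothness and cluster assumptions can be illustrated with a simple self-training loop that spreads labels from the few labelled points to their nearest unlabelled neighbours. This is one possible sketch, not a standard library routine: the function names and the sample points are hypothetical.

```python
import math


def nearest_labelled(point, labelled):
    """Return the (coordinates, label) pair closest to `point`."""
    return min(labelled, key=lambda ex: math.dist(point, ex[0]))


def self_training(labelled, unlabelled):
    """Label unlabelled points one at a time, closest-to-a-known-label first."""
    labelled = list(labelled)
    remaining = list(unlabelled)
    while remaining:
        # Pick the unlabelled point that lies closest to any labelled point.
        point = min(
            remaining,
            key=lambda u: math.dist(u, nearest_labelled(u, labelled)[0]),
        )
        # Smoothness assumption: nearby points are likely to share a label.
        _, label = nearest_labelled(point, labelled)
        labelled.append((point, label))
        remaining.remove(point)
    return dict(labelled)


# Two labelled points and a chain of unlabelled points (hypothetical data).
seed_labels = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
chain = [(2.0, 2.0), (4.0, 4.0), (6.0, 6.0)]
labels = self_training(seed_labels, chain)
```

The point (6, 6) is closer to the labelled "B" point than to the labelled "A" point, yet it ends up labelled "A" because the unlabelled chain connects it to "A". That is the kind of structure the unlabelled data contributes.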
How ML is used in Hospitals
Machine Learning Methods Based on the Output of a Machine-Learned System
Another Categorization
Based on “desired output” of a machine-learned system
Classification
Regression
Clustering
Classification
Predict a class from observations.
Inputs are divided into two or more classes.
The model assigns unseen inputs to one (or, in multi-label classification, more) of these classes.
Spam filtering is an example of classification, where the inputs are email (or other) messages and the classes are "spam" and "not spam".
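The spam example can be sketched with a naive Bayes classifier over word counts. The slides do not say which algorithm a spam filter uses, so naive Bayes is an assumption here, as are the four training messages and the function names `train` and `classify`.

```python
import math
from collections import Counter

# Tiny hand-made training set (hypothetical messages).
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]


def train(examples):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, class_counts


def classify(text, word_counts, class_counts):
    """Return the class with the highest log-probability for the message."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["not spam"])
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        # Start from the prior probability of the class.
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for word in text.split():
            # Laplace smoothing so unseen words do not zero out the product.
            score += math.log((counts[word] + 1) / (total + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, class_counts = train(training)
verdict = classify("cheap money", word_counts, class_counts)
```

The output is always one of the predefined classes, which is what distinguishes classification from regression.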
Regression
Estimates the relation between the mean value of one variable and the corresponding values of other variables.
A statistical method for finding the relation between different variables.
Predicts the output from the training data and observations.
Popular method – linear regression. Logistic (binary) regression, despite its name, predicts the probability of a discrete class.
The outputs are continuous rather than discrete.
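For a single input variable, the relation can be fitted with ordinary least squares. The sketch below assumes a straight-line model and uses made-up house sizes and prices; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size (square metres) and price (thousands).
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]

slope, intercept = fit_line(sizes, prices)
# The prediction for an unseen input is a continuous value, not a class.
predicted = slope * 100.0 + intercept
```

The fitted slope is the sample covariance of input and output divided by the variance of the input, which is the closed-form least-squares solution for one variable.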
Clustering
Grouping a set of objects in such a way that objects in the same group are similar to each other.
The groups are not predefined; objects are collected into meaningful groups.
Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.
Example – men's shoes, women's shoes, men's t-shirts, women's t-shirts. They can be grouped in two ways: "men & women" or "t-shirts & shoes".
Popular Framework / Tools
Weka
Carrot2
GATE
OpenNLP
LingPipe
Mallet – Topic Modelling
Gensim – Topic Modelling (Python)
Apache Mahout
MLlib – Apache Spark
scikit-learn – Python
Difference
Classification
Classification means grouping the output into a class.
Example: predicting the type of a tumor, i.e. harmful or not, using the training data sets.
If the output is a discrete / categorical variable, it is a classification problem.
Regression
Regression means predicting the output value using training data.
Example: predicting the price of a house from training data sets.
If the output is a real / continuous value, it is a regression problem.
Approaches
Decision Tree Learning
A predictive model.
Maps observations about an item to conclusions about the item's target value.
Used in statistics and data mining.
Tree models where the target variable can take a finite set of values are called classification trees.
Leaves represent class labels and branches represent conjunctions of features that lead to those class labels.
When the target variable can take continuous values, they are called regression trees.
In data mining, a decision tree describes data but not decisions.
Example – Wikipedia
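A classification tree can be grown by repeatedly choosing the feature test that best separates the classes. The sketch below uses Gini impurity as the splitting criterion, which is one common choice and an assumption here; the tumor measurements and all function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most common class label in the list."""
    return Counter(labels).most_common(1)[0][0]


def best_split(rows, labels):
    """Find the (score, feature, threshold) test with the lowest weighted impurity."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, labels, depth=0, max_depth=2):
    """Recursively grow a classification tree; leaves hold class labels."""
    split = best_split(rows, labels)
    if len(set(labels)) == 1 or depth == max_depth or split is None:
        return majority(labels)
    _, f, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow branches until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical observations: [tumor size in mm, patient age].
rows = [[5, 30], [8, 45], [12, 50], [20, 35], [25, 60], [30, 40]]
labels = ["harmless", "harmless", "harmless", "harmful", "harmful", "harmful"]
tree = build_tree(rows, labels)
```

In the resulting nested dictionary each inner node is a test on one feature and each leaf is a class label, matching the description on the slide.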
Artificial Neural Networks
Inspired by biological neural networks (the central nervous system of animals).
Used to estimate functions that depend on a large number of inputs and are generally unknown.
ANNs are generally presented as systems of interconnected "neurons" which exchange messages between each other.
Used to solve computer vision, speech recognition and handwriting recognition problems.
E.g. in handwriting recognition:
1. Input neurons are activated by the pixels of an input image.
2. After being weighted and transformed by a function, the activations of these neurons are passed on to other neurons.
3. This process is repeated until finally an output neuron is activated. This determines which character was read.
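The three steps above describe a forward pass through the network. The sketch below follows them on a 2x2 "image" with two possible characters. It is purely illustrative: the weights are set by hand rather than learned, and the sigmoid function, layer sizes, and names are assumptions.

```python
import math


def sigmoid(z):
    """Transforming function applied to each neuron's weighted input."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One layer: weight the incoming activations, add a bias, transform."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def forward(pixels, network):
    """Pass activations from layer to layer until the output layer is reached."""
    activations = pixels
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations


# Hand-set (not trained) weights for a 4-pixel input, 2 hidden and 2 output neurons.
network = [
    # Hidden layer: neuron 0 responds to a vertical bar, neuron 1 to a horizontal bar.
    ([[2.0, -2.0, 2.0, -2.0], [2.0, 2.0, -2.0, -2.0]], [-2.0, -2.0]),
    # Output layer: one neuron per character.
    ([[4.0, -4.0], [-4.0, 4.0]], [0.0, 0.0]),
]
characters = ["|", "-"]

# Pixels are listed row by row: this 2x2 image has its left column filled.
outputs = forward([1.0, 0.0, 1.0, 0.0], network)
read = characters[outputs.index(max(outputs))]
```

The most strongly activated output neuron decides which character was read; training a real network means adjusting the weights automatically instead of writing them by hand.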
Artificial Neural Networks Structure
Any queries?
For more information:
machinecanthink.blogspot.in