What is Machine Learning?
Machine Learning
A machine learns when
• it improves its performance
• on a specific task
• with experience
Central to Artificial Intelligence
There can be no intelligence without learning
Machine = AlphaGo Player Program
Task = playing GO
Performance = % of won games
Experience = huge database of games + self-play
Lee Sedol
AlphaGo
Machine = e-mail program, spamfilter
Task = classify e-mails
Performance = accuracy
Experience = your past input
Spam Filter
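The spam-filter setting above (task = classify e-mails, performance = accuracy, experience = your past input) can be made concrete with a tiny naive Bayes classifier. This is a minimal sketch, not a production filter; the toy e-mails and the add-one smoothing are illustrative choices, not taken from the slides.

```python
from collections import Counter
import math

def train(emails):
    """emails: list of (text, label) pairs with label in {"spam", "ham"}.
    Returns per-class word counts and class counts (the 'experience')."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in emails:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Pick the class with the highest log-posterior, add-one smoothed."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    best, best_score = None, float("-inf")
    for label in ("spam", "ham"):
        n = sum(counts[label].values())
        score = math.log(totals[label] / sum(totals.values()))
        for w in text.lower().split():
            score += math.log((counts[label][w] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

data = [
    ("win a free prize now", "spam"),
    ("cheap prize click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
counts, totals = train(data)
print(classify("free prize click", counts, totals))    # "spam"
print(classify("team meeting monday", counts, totals))  # "ham"
```

Accuracy on held-out mail would be the performance measure; more past input (experience) moves the word statistics closer to the user's actual mail.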
Automating Science
Eve, an artificially-intelligent ‘robot scientist’, can make drug discovery faster and much cheaper. [King et al. Nature 04, Science 09]
Robot Scientist
Automating Data Science
Why is it useful?
Why Machine Learning?
It applies to any application where there is (a lot of) data …
It is very practical
• some programs too complex to program by hand
• easier to generate data than to build programs by hand
• adaptation and personalisation
The enabling technology in
• natural language processing, web search / information retrieval
• computer vision & speech understanding
• robotics (& self-driving cars)
• bioinformatics
• analysing medical EHR & images
• …
How does it work?
Machine learning is all about learning functions
• different types of functions
• different types of data (supervised, unsupervised, reinforcement …)
• different criteria (loss or value function)
• Different schools in machine learning make different choices
f(input) => output.
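As a minimal illustration of learning a function f(input) => output, the sketch below fits f(x) = a*x + b to four (input, output) pairs by minimising a squared loss with plain gradient descent. The data, learning rate, and iteration count are made up for the example; they are not from the slides.

```python
# Learn f(x) = a*x + b from (input, output) pairs by minimising
# the mean squared loss with gradient descent.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # generated by f(x) = 2x + 1

a, b = 0.0, 0.0   # initial guess for the function's parameters
lr = 0.05         # learning rate (an illustrative choice)
for _ in range(2000):
    grad_a = sum(2 * (a * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (a * x + b - y) for x, y in data) / len(data)
    a -= lr * grad_a
    b -= lr * grad_b

print(round(a, 2), round(b, 2))  # close to 2.0 and 1.0
```

Changing the function class (e.g. a neural network instead of a line), the data regime (supervised, unsupervised, reinforcement), or the loss gives the different flavours of machine learning listed above.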
Where does the data come from?
learning from examples (supervised / unsupervised)
• good/bad moves? just moves?
learning by imitation (Behavioral cloning)
• imitate the world champion
learning from rewards (Reinforcement learning)
• just play, reward = board config. / wins / losses
• the whole AI problem in a nutshell
Donald Michie’s Menace
Donald Michie (2007) Menace (1961)
Machine Educable Noughts And Crosses Engine
slides on Menace: thanks to Johannes Fürnkranz
[Slide sequence omitted: a series of noughts-and-crosses board diagrams showing Menace in action. On X's move, the machine chooses a box on the basis of the current position, then executes the move drawn from that box.]
Menace Machine = 287 “boxes” + pearls
Encodes probabilistic function
• P(box, color) = probability of move
Learning a function
• upon a loss: all used pearls are confiscated (kept out)
• upon a win: the used pearls are put back, plus an extra one of the same colour
Q^*(s, a) = R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q^*(s', a')

(Richard Bellman)
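The pearl-update rule above can be sketched in a few lines. The bead counts, position keys, and move names below are hypothetical stand-ins; the real Menace had one matchbox per reachable position and was operated by hand.

```python
import random

# Toy Menace-style learner: one "box" (bag of pearls) per board position,
# one pearl colour per legal move. Initial counts (3 per move) are hypothetical.
boxes = {}  # position key -> {move: pearl_count}

def choose_move(position, legal_moves):
    """Draw a move with probability proportional to its pearl count."""
    box = boxes.setdefault(position, {m: 3 for m in legal_moves})
    moves, weights = zip(*box.items())
    return random.choices(moves, weights=weights)[0]

def update(history, won):
    """history: list of (position, move) pairs used in this game."""
    for position, move in history:
        if won:
            boxes[position][move] += 1  # put pearl back + one extra
        else:
            boxes[position][move] = max(boxes[position][move] - 1, 0)  # pearl confiscated

# One imaginary game: the machine played one move from the empty board and won.
move = choose_move("empty", ["centre", "corner", "edge"])
update([("empty", move)], won=True)
print(boxes["empty"][move])  # one pearl more than the initial 3
```

Winning moves thus become more probable over time, which is exactly the probabilistic function P(box, colour) being learned.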
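The Bellman optimality equation above can be solved by iterating it as a fixed-point update (value iteration). The two-state MDP below (its states, rewards, transitions, and the discount factor gamma = 0.9) is invented purely to show the mechanics.

```python
# Value iteration on a toy 2-state, 2-action MDP, repeatedly applying
# Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * max_a' Q(s',a').
states, actions = [0, 1], [0, 1]
R = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.0}
P = {  # P[(s, a)] = {next_state: probability} (deterministic here)
    (0, 0): {0: 1.0}, (0, 1): {1: 1.0},
    (1, 0): {0: 1.0}, (1, 1): {1: 1.0},
}
gamma = 0.9

Q = {(s, a): 0.0 for s in states for a in actions}
for _ in range(200):  # the update is a contraction, so Q converges to Q*
    Q = {(s, a): R[(s, a)] + gamma * sum(p * max(Q[(s2, a2)] for a2 in actions)
                                         for s2, p in P[(s, a)].items())
         for s in states for a in actions}

print({sa: round(q, 2) for sa, q in Q.items()})
```

The greedy policy reads off the best action per state (here: action 1 in state 0, action 0 in state 1), which is the reasoning-free analogue of what Menace's pearl counts converge to.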
Three important points
Learning AND Reasoning needed
• System 1 — thinking fast — can do things like solve “2+2=?” and recognise a car
• System 2 — thinking slow — can reason about complex logic problems (IQ tests) and reason about priority in traffic
• Alternative terms: learning vs reasoning, data-driven vs knowledge-driven, symbolic vs sub-symbolic
• AlphaGo incorporates learning and reasoning
• A machine that has merely learned to play a video game cannot cope with a change in the rules of the game
There are five schools in ML
Pedro Domingos found it both exciting and scary to see that President Xi Jinping of China reads his book.
Tribe          | Origins              | Master Algorithm
Symbolists     | Logic, philosophy    | Inverse deduction
Connectionists | Neuroscience         | Backpropagation
Evolutionaries | Evolutionary biology | Genetic programming
Bayesians      | Statistics           | Probabilistic inference
Analogizers    | Psychology           | Kernel machines
There are many remaining challenges
• Getting the right data
• bias, fairness, privacy, etc. (ethical concerns)
• Combining learning and reasoning
• Providing explanations and interpretable models
• beyond the deep neural network black-boxes
• Providing guarantees for software
• verification and validation
N. Akhtar, A. Mian: Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey
perturbations to make them imperceptible for humans. However, Papernot et al. [60] also created an adversarial attack by restricting the ℓ0-norm of the perturbations. Physically, it means that the goal is to modify only a few pixels in the image instead of perturbing the whole image to fool the classifier. The crux of their algorithm to generate the desired adversarial image can be understood as follows. The algorithm modifies pixels of the clean image one at a time and monitors the effects of the change on the resulting classification. The monitoring is performed by computing a saliency map using the gradients of the outputs of the network layers. In this map, a larger value indicates a higher likelihood of fooling the network to predict ℓ_target as the label of the modified image instead of the original label ℓ. Thus, the algorithm performs targeted fooling. Once the map has been computed, the algorithm chooses the pixel that is most effective to fool the network and alters it. This process is repeated until either the maximum number of allowable pixels are altered in the adversarial image or the fooling succeeds.
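The greedy loop just described (score pixels, alter the most effective one, repeat until the fooling succeeds or a pixel budget is exhausted) can be caricatured as follows. Everything here is hypothetical: the "network" is a made-up linear scorer over a 4-pixel binary "image", and the saliency map is a brute-force what-if score rather than the gradient-based map of Papernot et al.

```python
def target_score(image, weights):
    """Made-up 'network': a linear score for the target label."""
    return sum(w * p for w, p in zip(weights, image))

def greedy_pixel_attack(image, weights, max_pixels=3):
    image = list(image)
    for _ in range(max_pixels):
        base = target_score(image, weights)
        # crude saliency map: how much does setting each still-unset
        # pixel to 1 raise the target-label score?
        effects = [(target_score(image[:i] + [1] + image[i + 1:], weights) - base, i)
                   for i, p in enumerate(image) if p == 0]
        if not effects:
            break
        gain, i = max(effects)
        image[i] = 1                          # alter the most effective pixel
        if target_score(image, weights) > 0:  # "fooling succeeded"
            break
    return image

weights = [0.5, -1.0, 2.0, 0.1]  # hypothetical per-pixel influence on the target label
adv = greedy_pixel_attack([0, 0, 0, 0], weights)
print(adv)  # pixel 2 has the largest effect, and flipping it already succeeds
```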
5) ONE PIXEL ATTACK
An extreme case for the adversarial attack is when only one pixel in the image is changed to fool the classifier. Interestingly, Su et al. [68] claimed successful fooling of three different network models on 70.97% of the tested images by changing just one pixel per image. They also reported that the average confidence of the networks on the wrong labels was found to be 97.47%. We show representative examples of the adversarial images from [68] in Fig. 3. Su et al. computed the adversarial examples by using the concept of Differential Evolution [146]. For a clean image I_c, they first created a set of 400 vectors in R^5 such that each vector contained xy-coordinates and RGB values for an arbitrary candidate pixel. Then, they randomly modified the elements of the vectors to create children such that a child competes with its parent for fitness in the next iteration, while the probabilistic predicted label of the network is used as the fitness criterion. The last surviving child is used to alter the pixel in the image.
FIGURE 3. Illustration of one-pixel adversarial attacks [68]: The correct label is mentioned with each image. The corresponding predicted label is given in parentheses.
Even with such a simple evolutionary strategy, Su et al. [68] were able to show successful fooling of deep networks. Notice that differential evolution enables their approach to generate adversarial examples without having access to any information about the network parameter values or their gradients. The only input their technique requires is the probabilistic labels predicted by the targeted model.
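The differential-evolution search described above can be sketched against a toy stand-in classifier. Everything below is hypothetical: the 3x3 "image", the classifier, and the population and iteration sizes (the real attack used 400 candidate vectors in R^5 against deep networks).

```python
import random

random.seed(0)
SIZE = 3  # toy 3x3 "image"; real attacks target full-size images

def true_label_prob(image):
    """Hypothetical stand-in classifier: its confidence in the true label
    drops as the brightest pixel in the image gets brighter."""
    return 1.0 - max(sum(px) for row in image for px in row) / 765.0

def perturb(image, cand):
    """Apply a candidate (x, y, r, g, b): overwrite one pixel with one colour."""
    x, y, r, g, b = cand
    img = [row[:] for row in image]
    img[y][x] = (r, g, b)
    return img

def one_pixel_attack(image, pop_size=20, iters=30):
    def clip(v, hi):
        return int(min(max(v, 0), hi))

    def fitness(cand):  # lower probability on the true label = better attack
        return true_label_prob(perturb(image, cand))

    population = [(random.randrange(SIZE), random.randrange(SIZE),
                   random.randrange(256), random.randrange(256),
                   random.randrange(256)) for _ in range(pop_size)]
    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = random.sample(population, 3)
            # differential-evolution mutation, then child-vs-parent selection
            child = tuple(clip(a[k] + 0.5 * (b[k] - c[k]),
                               SIZE - 1 if k < 2 else 255) for k in range(5))
            if fitness(child) < fitness(population[i]):
                population[i] = child
    return min(population, key=fitness)

clean = [[(0, 0, 0)] * SIZE for _ in range(SIZE)]
best = one_pixel_attack(clean)
print(round(true_label_prob(clean), 2), "->",
      round(true_label_prob(perturb(clean, best)), 2))
```

As in the paper's setting, the search only queries the model's predicted probabilities; no gradients or parameters are needed.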
6) CARLINI AND WAGNER ATTACKS (C&W)
A set of three adversarial attacks were introduced by Carlini and Wagner [36] in the wake of defensive distillation against the adversarial perturbations [38]. These attacks make the perturbations quasi-imperceptible by restricting their ℓ2, ℓ∞ and ℓ0 norms, and it is shown that defensive distillation for the targeted networks almost completely fails against these attacks. Moreover, it is also shown that the adversarial examples generated using the unsecured (un-distilled) networks transfer well to the secured (distilled) networks, which makes the computed perturbations suitable for black-box attacks.

Whereas it is more common to exploit the transferability property of adversarial examples to generate black-box attacks, Chen et al. [41] also proposed 'Zeroth Order Optimization (ZOO)' based attacks that directly estimate the gradients of the targeted model for generating the adversarial examples. These attacks were inspired by C&W attacks. We refer to the original papers for further details on C&W and ZOO attacks.
7) DEEPFOOL
Moosavi-Dezfooli et al. [72] proposed to compute a minimal norm adversarial perturbation for a given image in an iterative manner. Their algorithm, i.e. DeepFool, initializes with the clean image that is assumed to reside in a region confined by the decision boundaries of the classifier. This region decides the class label of the image. At each iteration, the algorithm perturbs the image by a small vector that is computed to take the resulting image to the boundary of the polyhedron that is obtained by linearizing the boundaries of the region within which the image resides. The perturbations added to the image in each iteration are accumulated to compute the final perturbation once the perturbed image changes its label according to the original decision boundaries of the network. The authors show that the DeepFool algorithm is able to compute perturbations that are smaller than the perturbations computed by FGSM [23] in terms of their norm, while having similar fooling ratios.
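For the special case of an affine binary classifier f(x) = w·x + b, the minimal perturbation DeepFool aims for has a closed form, r = -(f(x)/||w||^2) w, which moves x exactly onto the decision boundary (plus a tiny overshoot to cross it). The weights and input below are made up for illustration; the full algorithm applies this step iteratively to a linearized multi-class deep network.

```python
import math

# DeepFool's core step for an affine binary classifier f(x) = w.x + b:
# the minimal (L2) perturbation reaching the decision boundary is
# r = -(f(x) / ||w||^2) * w.  Toy 2-D example with made-up numbers.
w = [3.0, 4.0]
b = -2.0
x = [2.0, 1.0]

f = sum(wi * xi for wi, xi in zip(w, x)) + b        # f(x) = 8.0, class +1
norm_sq = sum(wi * wi for wi in w)                   # ||w||^2 = 25
r = [-(f / norm_sq) * wi for wi in w]                # minimal perturbation
x_adv = [xi + (1 + 1e-4) * ri for xi, ri in zip(x, r)]  # small overshoot to cross

f_adv = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
print(round(f, 2), round(f_adv, 6), round(math.hypot(*r), 3))  # sign of f flips; ||r|| = 1.6
```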
8) UNIVERSAL ADVERSARIAL PERTURBATIONS
Whereas the methods like FGSM [23], ILCM [35], DeepFool [72] etc. compute perturbations to fool a network on a single image, the 'universal' adversarial perturbations computed by Moosavi-Dezfooli et al. [16] are able to fool a network on 'any' image with high probability. These image-agnostic perturbations also remain quasi-imperceptible for the human vision system, as can be observed in Fig. 1. To formally define these perturbations, let us assume that
Akhtar & Mian, IEEE Access, vol. 6, 2018
What to expect?
What does this imply?
“AI is the new electricity” (Andrew Ng)
Much like the rise of electricity, which started about 100 years ago, AI will revolutionize every major industry. (Industry 4.0)
We will see many intelligent assistants for specific (routine) tasks.
The potential is high: AI can bring a lot of good to society.
But there are also some caveats.
What does this imply?
AI as the magic wand
• There is a lot of hype; the expectations are often unrealistic
• The press (and the GAFA companies doing AI) create sensational stories — on purpose (?)
• Abuse of the term AI:
• everything is AI and everybody is jumping on the bandwagon
AI summers and winters
cf. Gartner hype cycle for emerging technologies
Take away
• Insight into the nature of AI and ML
• AI & ML have a lot of potential, they are here to stay
• Go for a broad view on AI, we need all schools of ML, we need learning and reasoning, there are remaining challenges
• Beware of the hype & learn from the past!