Hubert Curien Laboratory, UMR CNRS 5516
University of Saint-Etienne
Marc Sebban and Alain Tremeau
Hubert Curien Laboratory, UMR CNRS 5516, University Jean Monnet - Saint-Etienne (France)
Induction week - September 7-11, 2015
Marc Sebban and Alain Tremeau (LaHC) LAHC September, 7-11 2015 1 / 22
Hubert Curien Laboratory (LAHC)
Synopsis of the LaHC
The LAHC (Head: Florent Pigeon) is a joint research unit, created in 2006, within the Jean Monnet University, Saint-Etienne and the CNRS.
100 researchers and research lecturers.
25 engineers and administrative staff.
100 PhD and postdoc students.
This makes the LAHC, with a total of more than 230 staff, the most important of all Saint-Etienne's university poles.
Synopsis of the LaHC
Two scientific departments
Optics, Photonics & Hyper-frequencies (Head: A. Boukenter)
Computer Science, Telecom & Image (Head: M. Sebban)
Project-Team Model
The Hubert Curien laboratory makes use of the notion of project-team:
made up of 6-10 permanent staff,
well defined scientific objectives,
well defined lifecycles,
own budget.
Project-Team Model
5 Project-Teams in Computer Science
Machine Learning
Data Mining & Information Retrieval
Knowledge Representation
Multi-agent systems
Virtual Communities and Social Networks
3 Project-Teams in Image Science and Computer Vision
Optical Design and Image Reconstruction
Macroscopic Modelisation of Images
Image analysis and understanding
with strong interdisciplinary collaborations in computer vision between Computer Science and Image Processing
Research activities in Machine Learning and Data Mining
Research activities in Machine Learning and Data Mining
13 permanent members - 12 PhD students
Permanent staff       PhD students
Leonor Becerra        Irina Nicolae
Marc Bernard          Valentina Zantedeschi
Catherine Combes      Michael Perrot
Elisa Fromont         Maria Batista
Mathias Gery          Romain Deville
Amaury Habrard        Adrien Dulac
Francois Jacquenet    Damien Fourure
Baptiste Jeudy        O. Benyahia
Christine Largeron    Jordan Frery
Emilie Morvant        Guillaume Metzler
Marc Sebban           Riadhy Benamar
Remi Emonet           Anil Goyal
Michel Beigbeder
What is Machine Learning?
Field of study that gives computers the ability to learn without being explicitly programmed.
Machine learning explores the construction and study of algorithms that can learn from training data and make predictions on test data.
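As a rough illustration of "learning from training data and predicting on test data", here is a minimal sketch of a 1-nearest-neighbour classifier. The toy points, the labels "A" and "B", and the helper names are all hypothetical and chosen only for this example; the slides do not prescribe any particular algorithm.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def predict_1nn(train, x):
    """Return the label of the training example closest to x."""
    _, label = min(train, key=lambda example: euclidean(example[0], x))
    return label

# Hypothetical toy data: (features, label) pairs.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((4.0, 4.2), "B"), ((3.8, 4.0), "B")]
test = [((0.9, 1.1), "A"), ((4.1, 3.9), "B")]

# "Learning" here is simply storing the training set; prediction uses it on unseen points.
predictions = [predict_1nn(train, x) for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

No classification rule is written by hand here: the behaviour on the test points comes entirely from the stored training examples.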
Some Machine Learning applications
Main research topics in Machine Learning in Saint-Etienne
Main research topics
Domain Adaptation and Transfer Learning
Metric Learning
Ensemble methods - Theory of Boosting - PAC-Bayesian Theory
Domain Adaptation and Transfer Learning
Domain Adaptation and Transfer Learning
Assumption in Machine Learning
Training and test data must be in the same feature space and have the same distribution. Otherwise, transfer learning is required.
Domain Adaptation
Definition
Domain adaptation is a subfield of transfer learning that makes use of some labeled source (training) data and many unlabeled target (test) data. Typically, it requires moving the source and target distributions closer to each other.
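One way to bring the source distribution closer to the target is instance weighting under covariate shift. The sketch below is only an illustration: it assumes both distributions are known exactly over a hypothetical three-valued feature, and it uses a made-up per-example loss. In practice the weights would have to be estimated from samples.

```python
# Hypothetical discrete feature with three possible values.
p_source = {0: 0.5, 1: 0.3, 2: 0.2}  # assumed source (training) distribution
p_target = {0: 0.2, 1: 0.3, 2: 0.5}  # assumed target (test) distribution

def loss(x):
    """Hypothetical loss of some fixed model on an example with feature value x."""
    return {0: 0.1, 1: 0.4, 2: 0.9}[x]

# Importance weight w(x) = p_target(x) / p_source(x).
weights = {x: p_target[x] / p_source[x] for x in p_source}

# Expected loss under each distribution, and the reweighted source expectation.
source_risk = sum(p_source[x] * loss(x) for x in p_source)
target_risk = sum(p_target[x] * loss(x) for x in p_target)
weighted_source_risk = sum(p_source[x] * weights[x] * loss(x) for x in p_source)

print(source_risk, target_risk, weighted_source_risk)
```

The unweighted source risk differs from the target risk, while the reweighted source risk matches it. This is the sense in which reweighting source examples makes them behave like target examples.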
Matching between the main strategies in TL and our skills
Sample bias
Covariate shift
Instance weighting
Feature Representation
Domain invariant features
Latent features
Iterative Models
Self-training
EM-based methods
Statistical Learning Theory
PAC-Bayesian Theory
Boosting-based models
Metric Learning
Subspace Alignment
Latent Pattern Mining
Metric Learning
How to discriminate between humans and dogs?
Predicted label?
Limitations of standard metrics (e.g. Euclidean distance)
It's not what it looks like...
Limitations of standard metrics
Why are standard metrics unable to deal with such situations?
$$d_2(x, x') = \sqrt{\sum_{i=1}^{d} (x_i - x'_i)^2}.$$
These distances cannot take into account ground truth. Therefore, they often fail to capture the idiosyncrasies of the data of interest.
An important part of the research activity of the Machine Learning team in Saint-Etienne is dedicated to the design of new metric learning algorithms.
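The limitation of the Euclidean distance described above can be sketched with a small example. The two training points, their "human" and "dog" labels, and the query are hypothetical; the sketch assumes that only the first feature carries class information while the second is large-scale noise.

```python
import math

def euclidean(a, b):
    """Standard Euclidean distance: every feature counts equally."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def first_feature_distance(a, b):
    """Distance that looks only at feature 0, assumed here to be the relevant one."""
    return abs(a[0] - b[0])

# Hypothetical data: feature 0 determines the class, feature 1 is irrelevant noise.
train = [((0.0, 50.0), "human"), ((1.0, 0.0), "dog")]
query = (0.1, 5.0)  # on the relevant feature, the query is close to the "human" example

nearest = min(train, key=lambda ex: euclidean(ex[0], query))
nearest_informed = min(train, key=lambda ex: first_feature_distance(ex[0], query))

print(nearest[1], nearest_informed[1])
```

The Euclidean distance is dominated by the noisy feature and selects the "dog" example, whereas a distance that knows which feature matters selects "human". Metric learning aims to obtain that kind of knowledge from labeled data instead of fixing it by hand.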
Metric Learning in a Nutshell
Metric learning aims at optimizing parameterized distances like the Mahalanobis distance.
It typically induces a change of representation space to satisfy constraints.
Very popular approach
Find the matrix $M \in \mathbb{R}^{d \times d}$ of the Mahalanobis distance
$$d_M(x, x') = \sqrt{(x - x')^T M (x - x')},$$
such that $d_M^2$ best satisfies the constraints and where $M$ is positive semi-definite (PSD).
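A minimal sketch of how $d_M$ can be computed for a given matrix is shown below. The matrix `M` here is hand-picked for illustration rather than learned, and the three points are hypothetical. With the identity matrix the formula reduces to the Euclidean distance.

```python
import math

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)), with M given as a list of rows (assumed PSD)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    n = len(diff)
    M_diff = [sum(M[i][j] * diff[j] for j in range(n)) for i in range(n)]
    squared = sum(diff[i] * M_diff[i] for i in range(n))
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding errors

# Hypothetical points: feature 0 is assumed relevant, feature 1 is large-scale noise.
human, dog, query = (0.0, 50.0), (1.0, 0.0), (0.1, 5.0)

identity = [[1.0, 0.0], [0.0, 1.0]]  # recovers the Euclidean distance
M = [[1.0, 0.0], [0.0, 0.0001]]      # hand-picked PSD matrix that down-weights feature 1

d_euclid = (mahalanobis(query, human, identity), mahalanobis(query, dog, identity))
d_weighted = (mahalanobis(query, human, M), mahalanobis(query, dog, M))

print(d_euclid, d_weighted)
```

Under the identity matrix the query is closer to `dog`; under `M` it is closer to `human`. A metric learning algorithm would choose `M` from constraints on labeled data rather than by inspection.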
Mahalanobis distance learning = Learning a linear projection
Using the Cholesky decomposition, one can rewrite $M$ as $L^T L$.
$$d_M(x, x') = \sqrt{(x - x')^T L^T L (x - x')} = \sqrt{(Lx - Lx')^T (Lx - Lx')}$$
A Mahalanobis distance implicitly corresponds to computing the Euclidean distance after the linear projection of the data by $L$.
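The equivalence above can be checked numerically. The following sketch uses a hand-written Cholesky routine and a hypothetical 2x2 matrix; it assumes $M$ is strictly positive definite, since this simple routine would divide by zero on a singular PSD matrix.

```python
import math

def cholesky_lower(M):
    """Return lower-triangular C with M = C C^T (M assumed symmetric positive definite)."""
    n = len(M)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(C[i][k] * C[j][k] for k in range(j))
            if i == j:
                C[i][j] = math.sqrt(M[i][i] - s)
            else:
                C[i][j] = (M[i][j] - s) / C[j][j]
    return C

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def mahalanobis(x, y, M):
    diff = [xi - yi for xi, yi in zip(x, y)]
    return math.sqrt(sum(d * md for d, md in zip(diff, matvec(M, diff))))

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

M = [[4.0, 2.0], [2.0, 3.0]]      # hypothetical positive definite matrix
L = transpose(cholesky_lower(M))  # upper triangular, so that M = L^T L
x, x_prime = [1.0, 2.0], [3.0, -1.0]

d_direct = mahalanobis(x, x_prime, M)                       # sqrt((x - x')^T M (x - x'))
d_projected = euclidean(matvec(L, x), matvec(L, x_prime))   # Euclidean distance after projecting by L

print(d_direct, d_projected)
```

Both values agree up to floating-point rounding, which is why learning a Mahalanobis distance can be viewed as learning the linear projection $L$.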
Illustration
[Video: animation_lunes.mov (QuickTime)]
Main scientific topics in MLDM
Metric Learning: optimization of metrics to improve classification tasks.
Domain Adaptation: transfer learning from a source domain to a target domain.
Ensemble methods: Theory of Boosting - PAC-Bayesian Theory
Pattern Mining: temporal motif mining, social network mining.
Internship proposals on these topics will soon be available!