Mead Learning: Big Data & Deep Learning Projects


1. B2B Recommendation (similar to Indiamart)

● Dataset: weblog file with approximately 700,000 search queries from different users across different categories of a B2B search engine.

● Each search record in the weblog file contains the following fields: date time s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
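
As a minimal sketch of how one such record could be read, the Java snippet below maps a single log line onto the field names listed above. It assumes the values are separated by whitespace, appear in exactly the order given on the slide, and contain no embedded spaces; the class name `WeblogLineParser` is hypothetical and not taken from the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: maps one whitespace-separated weblog line onto the listed field names. */
final class WeblogLineParser {
    // Field order copied from the slide; assumed to match the column order in the log file.
    static final String[] FIELDS = {
        "date", "time", "s-computername", "s-ip", "cs-method", "cs-uri-stem",
        "cs-uri-query", "s-port", "cs-username", "c-ip", "cs(User-Agent)",
        "cs(Cookie)", "cs(Referer)", "cs-host", "sc-status", "sc-substatus",
        "sc-win32-status", "sc-bytes", "cs-bytes", "time-taken"
    };

    /** Returns field name -> value, or an empty map for comment lines and malformed rows. */
    static Map<String, String> parse(String line) {
        Map<String, String> record = new LinkedHashMap<>();
        if (line == null || line.startsWith("#")) {
            return record; // lines starting with '#' are treated as comments, not requests
        }
        String[] values = line.trim().split("\\s+");
        if (values.length != FIELDS.length) {
            return record; // skip rows that do not have exactly one value per field
        }
        for (int i = 0; i < FIELDS.length; i++) {
            record.put(FIELDS[i], values[i]);
        }
        return record;
    }
}
```

Calling `WeblogLineParser.parse(line).get("cs-uri-query")` would then return the raw search query for that request, which is the field a recommender over search behaviour would most likely start from.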


Tools and Technology

● Dataset: Weblog of the Fibre2Fashion domain
● Technology: Java
● Algorithm: Collaborative Deep Learning for Recommendation Systems


B2B search engine

[Figure label: cs-uri-query]

Weblog Analysis


Unique page visitor data with timestamp

[Figure: unique pages in the weblog]

Visitor client address

[Figure: unique client IPs in the weblog]

Recommendation

[Figure columns: Server IP, Client IP, Page, Query, Time taken]

This figure shows the pages on which clients spent the most time.
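
A ranking like the one in this figure could be produced along the following lines. This is only a sketch: it assumes each log row has already been reduced to a `{page, time-taken}` pair with an integer time value, and the class name `PageTimeRanking` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: ranks pages by the total time-taken recorded for them. */
final class PageTimeRanking {
    /**
     * Each row is assumed to be {page, timeTaken}, e.g. {"/search.aspx", "46"}.
     * Returns (page, total time) entries sorted by total time, largest first.
     */
    static List<Map.Entry<String, Long>> rank(List<String[]> rows) {
        Map<String, Long> totals = new HashMap<>();
        for (String[] row : rows) {
            // Add this request's time to the running total for its page.
            totals.merge(row[0], Long.parseLong(row[1]), Long::sum);
        }
        List<Map.Entry<String, Long>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return ranked;
    }
}
```

The first entries of the returned list are the pages with the highest accumulated time. Grouping by client IP instead of page would only require a different key in `row[0]`.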


2. Pingax Recommendation

● We created a Google Chrome plug-in with a recommendation system that gives the best recommendations, based on the user's search, for items across different websites.

● Technology: Java, Apache Mahout, Apache Hadoop, JSON, JavaScript


Pingax Digital Recommendation Settings


Recommendation

[Figure: 1st recommendation]

Recommendation

[Figure: 2nd recommendation]

Recommendation

[Figure: recommended item]

3. IMDB movie review classification

● Dataset: IMDB movie review text files
● Tools & technology: Eclipse, Java
● Machine learning & deep learning technique: Deep Learning using Linear Support Vector Machines, and Conditional Random Fields as Recurrent Neural Networks
● Class labels: Positive, Negative, Neutral


Dataset

[Figure panels: POSITIVE, NEGATIVE]

SVM train

[Figure: training parameters]

Accuracy


CRF Train

[Figure: CRF train parameters]

CRF Test

[Figure: CRF test parameters]

Accuracy


4. A Collaborative Approach for Web Personalized Recommendation System

● Dataset: MovieLens data
● Technology: Java
● Algorithm: User-based collaborative deep learning filtering technique
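
To illustrate the user-based collaborative filtering idea in general terms, here is a minimal plain-Java sketch. It is not the project's implementation: it uses Pearson correlation over co-rated items as the similarity, predicts a rating as the similarity-weighted average over all positively correlated users (no neighbourhood-size limit), and leaves out the deep learning component entirely. The class name `UserBasedCf` and the nested-map data layout are assumptions.

```java
import java.util.Map;

/** Hypothetical sketch of user-based collaborative filtering with Pearson similarity. */
final class UserBasedCf {
    /** Pearson correlation over the items both users rated; 0 when it is undefined. */
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) continue; // item not rated by both users
            double ra = e.getValue(), rb = other;
            n++;
            sumA += ra; sumB += rb;
            sumAA += ra * ra; sumBB += rb * rb; sumAB += ra * rb;
        }
        if (n < 2) return 0.0;
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        double denom = Math.sqrt(varA * varB);
        return denom == 0 ? 0.0 : cov / denom;
    }

    /**
     * Predicts user's rating for item from ratings (userId -> itemId -> rating).
     * Returns NaN if no positively correlated user has rated the item.
     */
    static double predict(int user, int item, Map<Integer, Map<Integer, Double>> ratings) {
        Map<Integer, Double> target = ratings.get(user);
        if (target == null) return Double.NaN;
        double weighted = 0, weights = 0;
        for (Map.Entry<Integer, Map<Integer, Double>> other : ratings.entrySet()) {
            if (other.getKey() == user) continue;
            Double rating = other.getValue().get(item);
            if (rating == null) continue;
            double sim = pearson(target, other.getValue());
            if (sim <= 0) continue; // ignore dissimilar users
            weighted += sim * rating;
            weights += sim;
        }
        return weights == 0 ? Double.NaN : weighted / weights;
    }
}
```

Restricting the loop to the N most similar users would turn this into a nearest-neighbour variant, and swapping `pearson` for another function would change the similarity measure.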


Dataset

● The full data set: 100,000 ratings by 943 users on 1,682 items.
● Each user has rated at least 20 movies.
● Users and items are numbered consecutively from 1.
● The data is randomly ordered and split 80%/20% into training and test data.

Record format: User ID, Item (Movie) ID, Rating
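
The random 80%/20% split described above can be sketched as follows. The class name `TrainTestSplit`, the generic row type, and the fixed random seed are assumptions made for illustration; the slides do not say how the split was actually generated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper: random train/test split of rating records. */
final class TrainTestSplit {
    /**
     * Shuffles a copy of the rows with a seeded generator (so the split is repeatable)
     * and returns two lists: index 0 is the training part, index 1 the test part.
     */
    static <T> List<List<T>> split(List<T> rows, double trainFraction, long seed) {
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }
}
```

With 100,000 ratings and `trainFraction = 0.8`, this would yield 80,000 training records and 20,000 test records.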


Evaluation of System


Analysis with different similarity measures

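| Similarity measure | Preprocessing | Neighbours | SCORE | RMSE |
|---|---|---|---|---|
| Pearson Coefficient | With | 20 | 0.80281969 | 0.807538041 |
| Pearson Coefficient | With | 100 | 0.82893594 | 0.83615161 |
| Pearson Coefficient | With | 1000 | 0.9604098849 | 0.957224878 |
| Pearson Coefficient | Without | 20 | 0.9146494635 | 0.882693501 |
| Pearson Coefficient | Without | 100 | 0.843550186 | 0.8463632529 |
| Pearson Coefficient | Without | 1000 | 0.86570804 | 0.86579376 |
| Euclidean Distance | With | 20 | 0.7694310985 | 0.767654578 |
| Euclidean Distance | With | 100 | 0.7399951798 | 0.746315315 |
| Euclidean Distance | With | 1000 | 0.724689055 | 0.740014834 |
| Euclidean Distance | Without | 20 | 0.89754363 | 0.900235117 |
| Euclidean Distance | Without | 100 | 0.83637523 | 0.8443356 |
| Euclidean Distance | Without | 1000 | 0.89751163 | 0.90023111 |
| Log Likelihood | With | 20 | 0.795861932 | 0.7785749268 |
| Log Likelihood | With | 100 | 0.7467147307 | 0.759681185 |
| Log Likelihood | With | 1000 | 0.726946821 | 0.74111319 |
| Log Likelihood | Without | 20 | 0.823518133 | 0.82825220 |
| Log Likelihood | Without | 100 | 0.8034275434 | 0.80880505 |
| Log Likelihood | Without | 1000 | 0.805596293 | 0.814149532 |
| Tanimoto Coefficient | With | 20 | 0.7945015649 | 0.7798661428 |
| Tanimoto Coefficient | With | 100 | 0.7490188732 | 0.76143536582 |
| Tanimoto Coefficient | With | 1000 | 0.7290753292 | 0.74004491 |
| Tanimoto Coefficient | Without | 20 | 0.832473341 | 0.8394239511 |
| Tanimoto Coefficient | Without | 100 | 0.8064418071 | 0.813111737 |
| Tanimoto Coefficient | Without | 1000 | 0.8037747631 | 0.811900568 |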
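
For reference, the RMSE column is the root mean squared error between predicted and actual ratings on the test data, where lower is better. A minimal sketch of that computation is shown below; the class name `Rmse` and the array-based interface are assumptions, and the slides do not define how SCORE is computed, so it is not covered here.

```java
/** Hypothetical helper: root mean squared error between predicted and actual ratings. */
final class Rmse {
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumSquared = 0;
        for (int i = 0; i < predicted.length; i++) {
            double error = predicted[i] - actual[i];
            sumSquared += error * error; // squaring penalises large errors more heavily
        }
        return Math.sqrt(sumSquared / predicted.length);
    }
}
```

On a 1 to 5 rating scale, an RMSE of about 0.74 means predictions are typically within roughly three quarters of a star of the true rating.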


5. Flower grain image classification using supervised classification algorithms

● Dataset: Magnified images of flowers
● Technology: Java
● Algorithm: Grain analysis (image processing)
● Machine learning technique: Neural network, deep belief network, and SVM
● Microscope magnification: 100X


Deep Neural Network

Here we extended a simple neural network architecture to a deep belief network.

[Figure panels: Simple Neural Network, Deep Belief Network]

Flower and its grain images

Model Parameter

[Figure: SVM training parameters]

Code & Accuracy


6. Large-scale medical text classification and identification in healthcare

● Dataset: Medical text files


Healthcare

ENTITY RELATIONSHIP DETECTION FROM LARGE TEXT FILES

In this project, we developed an algorithm that predicts whether a relationship exists between two entities in medical text files (for example, "leg pain" or "pain in leg"). We used a deep learning Support Vector Machine algorithm (binary classifier) to identify such relationships accurately.

Accuracy: 87% on unseen test data


Healthcare

ENTITY DETECTION IN CLINICAL DOMAIN

In this project, we detected different keywords (modifiers) such as negation, conditional, severity, temporal, body measurements, some disease names, and others in large medical text files using NLP algorithms, and classified them using probabilistic graphical models such as Deep CRF networks and Hidden Markov Models.

Accuracy: 93% on unseen test data.


7. Neural network design for rock image recognition

The objective of this project is to develop a method for a rock image classification system using microscopic imaging of surface parameters. The rock surface parameters are color, grain, and texture. The combined features extracted from these parameters are used to uniquely identify the rock type, or to recognize its signature. We designed and developed a multi-layer feed-forward deep neural network to classify non-linear, complex data.

Accuracy: 95% on test rock images not used in training
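
The forward pass of a multi-layer feed-forward network of the kind described here can be sketched as follows. Everything specific in this snippet is an assumption for illustration: the class name `FeedForwardNet`, the sigmoid activation, and the idea that the input is a combined color, grain, and texture feature vector with one output unit per rock type. Training (for example by backpropagation) is omitted.

```java
/** Hypothetical sketch: forward pass of a fully connected feed-forward network. */
final class FeedForwardNet {
    private final double[][][] weights; // weights[layer][outputUnit][inputUnit]
    private final double[][] biases;    // biases[layer][outputUnit]

    FeedForwardNet(double[][][] weights, double[][] biases) {
        this.weights = weights;
        this.biases = biases;
    }

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Propagates a feature vector through every layer and returns the output activations. */
    double[] forward(double[] features) {
        double[] activation = features;
        for (int layer = 0; layer < weights.length; layer++) {
            double[] next = new double[weights[layer].length];
            for (int j = 0; j < next.length; j++) {
                double sum = biases[layer][j];
                for (int i = 0; i < activation.length; i++) {
                    sum += weights[layer][j][i] * activation[i];
                }
                next[j] = sigmoid(sum); // the non-linearity lets stacked layers model non-linear data
            }
            activation = next;
        }
        return activation;
    }

    /** Index of the largest output, interpreted here as the predicted class. */
    static int argMax(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) best = i;
        }
        return best;
    }
}
```

A classifier built this way would call `FeedForwardNet.argMax(net.forward(features))` to pick the most likely rock type for one image's feature vector.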