Mead Learning - Projects

Mead Learning: Big data & deep learning projects

1. B2B Recommendation (similar to Indiamart)

● Dataset: weblog file with approximately 700,000 search queries from different users across different categories of a B2B search engine.

● Each search record in the weblog file contains the following fields: date time s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
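A minimal sketch of how one such record could be split into named fields, assuming the weblog is space-separated in the field order listed above and that values contain no embedded spaces. The class name `WeblogRecord` and the sample line are hypothetical, not taken from the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WeblogRecord {
    // Field order as listed in the dataset description.
    static final String[] FIELDS = {
        "date", "time", "s-computername", "s-ip", "cs-method", "cs-uri-stem",
        "cs-uri-query", "s-port", "cs-username", "c-ip", "cs(User-Agent)",
        "cs(Cookie)", "cs(Referer)", "cs-host", "sc-status", "sc-substatus",
        "sc-win32-status", "sc-bytes", "cs-bytes", "time-taken"
    };

    // Splits one log line on whitespace and maps each value to its field name.
    // Returns null for comment lines (starting with '#') and for lines whose
    // token count does not match the expected number of fields.
    static Map<String, String> parse(String line) {
        if (line == null || line.startsWith("#")) return null;
        String[] parts = line.trim().split("\\s+");
        if (parts.length != FIELDS.length) return null;
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) record.put(FIELDS[i], parts[i]);
        return record;
    }

    public static void main(String[] args) {
        // Hypothetical sample line; all values are invented for illustration.
        String line = "2015-01-01 10:15:30 WEB01 10.0.0.1 GET /search.aspx q=cotton+yarn 80 - "
            + "192.0.2.7 Mozilla/5.0 - - www.example.com 200 0 0 5120 340 187";
        Map<String, String> r = parse(line);
        System.out.println(r.get("c-ip") + " " + r.get("cs-uri-query") + " " + r.get("time-taken"));
    }
}
```

Keeping the field names in one array means the parser follows the header order directly, so a differently ordered log would only need that array changed.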

Tools and Technology

● Dataset: Weblog of the Fibre2Fashion domain

● Technology: Java

● Algorithm: Collaborative Deep Learning for Recommendation Systems

B2B search engine

Cs-uri-query

Weblog Analysis

Unique page visitor data with timestamp

Unique Page in weblog

Visitor client address

Unique Client IP in weblog

Recommendation

Server IP

Client IP Page

Query Time taken

This figure shows the pages on which clients spent the most time.
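The source does not say how the ranking in the figure was computed. One plausible approach, sketched here as an assumption, is to sum the `time-taken` field per `cs-uri-stem` and sort pages by the total. The class name `PageTimeRanking` and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PageTimeRanking {
    // Sums time-taken per page and returns the pages sorted by total time,
    // highest first. Each row is {cs-uri-stem, time-taken}.
    static List<String> rankByTotalTime(String[][] rows) {
        Map<String, Long> totals = new HashMap<>();
        for (String[] row : rows) {
            totals.merge(row[0], Long.parseLong(row[1]), Long::sum);
        }
        List<String> pages = new ArrayList<>(totals.keySet());
        pages.sort((a, b) -> Long.compare(totals.get(b), totals.get(a)));
        return pages;
    }

    public static void main(String[] args) {
        // Invented rows for illustration only.
        String[][] rows = {
            {"/search.aspx", "100"}, {"/product.aspx", "300"}, {"/search.aspx", "250"}
        };
        System.out.println(rankByTotalTime(rows));
    }
}
```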

2. Pingax Recommendation

● We created a Google Chrome plug-in with a recommendation system that gives the best recommendations, based on the user's search, for items across different websites.

● Technology: Java, Apache Mahout, Apache Hadoop, JSON, JavaScript

Pingax Digital Recommendation Settings

Recommendation

1st Recommendation

Recommendation

2nd Recommendation

Recommendation

Recommended item

3. IMDB movie review classification

● Dataset: IMDB movie review text files

● Tools & technology: Eclipse, Java

● Machine Learning & Deep Learning Technique: Deep Learning using Linear Support Vector Machines and Conditional Random Fields as Recurrent Neural Networks

● Class labels: Positive, Negative, Neutral
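To illustrate only the linear SVM component named above, here is a minimal sketch that trains a binary linear SVM by sub-gradient descent on the L2-regularised hinge loss. It is not the project's implementation: the two-word count features, the hyperparameters, and the class name `LinearSvmSketch` are assumptions, and it handles two classes rather than the three labels listed.

```java
class LinearSvmSketch {
    // Trains weights with sub-gradient descent on the L2-regularised hinge
    // loss. The last entry of the returned array is the bias.
    // Labels must be +1 (positive) or -1 (negative).
    static double[] train(double[][] x, int[] y, double lambda, double rate, int epochs) {
        int d = x[0].length;
        double[] w = new double[d + 1];
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < x.length; i++) {
                double margin = y[i] * score(w, x[i]);
                for (int j = 0; j < d; j++) {
                    // Regularisation always applies; the hinge term only
                    // applies when the example is inside the margin.
                    double grad = lambda * w[j] - (margin < 1 ? y[i] * x[i][j] : 0);
                    w[j] -= rate * grad;
                }
                if (margin < 1) w[d] += rate * y[i]; // bias is not regularised
            }
        }
        return w;
    }

    // Linear decision value: w . x + bias.
    static double score(double[] w, double[] x) {
        double s = w[w.length - 1];
        for (int j = 0; j < x.length; j++) s += w[j] * x[j];
        return s;
    }

    static int predict(double[] w, double[] x) {
        return score(w, x) >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        // Hypothetical features: {count of "good", count of "bad"} per review.
        double[][] x = {{2, 0}, {1, 0}, {3, 1}, {0, 2}, {0, 1}, {1, 3}};
        int[] y = {1, 1, 1, -1, -1, -1};
        double[] w = train(x, y, 0.01, 0.1, 100);
        System.out.println(predict(w, new double[] {2, 0}));
        System.out.println(predict(w, new double[] {0, 2}));
    }
}
```

A three-class setup such as Positive, Negative, Neutral could be built from binary classifiers like this one in a one-vs-rest arrangement, though the source does not state which scheme was used.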

Dataset

POSITIVE

NEGATIVE

SVM train

Training Parameters

Accuracy

CRF Train

CRF Train parameters

CRF Test

CRF Test parameters

Accuracy

4. A Collaborative Approach for Web Personalized Recommendation System

● Dataset: MovieLens data

● Technology: Java

● Algorithm: User-based collaborative Deep Learning filtering technique
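As a rough illustration of the user-based collaborative filtering idea, the sketch below predicts a rating as a similarity-weighted average over other users, using Pearson correlation on co-rated items. It is an assumption-laden simplification: it uses every positively correlated user rather than a fixed neighbourhood size, it has no deep learning component, and the class name `UserBasedCf` and the sample ratings are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class UserBasedCf {
    // Pearson correlation over the items both users rated.
    // Returns 0 when fewer than two items are shared or a variance is zero.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        double sumA = 0, sumB = 0;
        int n = 0;
        for (int item : a.keySet()) {
            if (b.containsKey(item)) { sumA += a.get(item); sumB += b.get(item); n++; }
        }
        if (n < 2) return 0;
        double meanA = sumA / n, meanB = sumB / n, cov = 0, varA = 0, varB = 0;
        for (int item : a.keySet()) {
            if (!b.containsKey(item)) continue;
            double da = a.get(item) - meanA, db = b.get(item) - meanB;
            cov += da * db; varA += da * da; varB += db * db;
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0 ? 0 : cov / denom;
    }

    // Predicts the user's rating for an item as the similarity-weighted average
    // of ratings from positively correlated users who rated that item.
    // Returns NaN when no such user exists.
    static double predict(Map<Integer, Map<Integer, Double>> ratings, int user, int item) {
        double num = 0, den = 0;
        for (Map.Entry<Integer, Map<Integer, Double>> e : ratings.entrySet()) {
            if (e.getKey() == user || !e.getValue().containsKey(item)) continue;
            double sim = pearson(ratings.get(user), e.getValue());
            if (sim <= 0) continue;
            num += sim * e.getValue().get(item);
            den += sim;
        }
        return den == 0 ? Double.NaN : num / den;
    }

    public static void main(String[] args) {
        // Invented ratings: user ID -> (movie ID -> rating).
        Map<Integer, Map<Integer, Double>> ratings = new HashMap<>();
        ratings.put(1, Map.of(1, 5.0, 2, 3.0, 3, 4.0));
        ratings.put(2, Map.of(1, 5.0, 2, 3.0, 3, 4.0, 4, 5.0));
        ratings.put(3, Map.of(1, 1.0, 2, 5.0, 3, 2.0, 4, 1.0));
        System.out.println(predict(ratings, 1, 4));
    }
}
```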

Dataset

• The full data set: 100,000 ratings by 943 users on 1,682 items.

• Each user has rated at least 20 movies.

• Users and items are numbered consecutively from 1.

• The data is randomly ordered and split 80%/20% into training and test data.

User ID, Item (Movie) ID, Rating
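The random 80%/20% split described above can be sketched as follows. This is a minimal illustration, assuming each row is one (user ID, item ID, rating) record; the class name `TrainTestSplit` and the fixed seed are assumptions made so the split is reproducible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class TrainTestSplit {
    // Shuffles a copy of the rows with a fixed seed and returns two lists:
    // index 0 is the training part, index 1 is the test part.
    static <T> List<List<T>> split(List<T> rows, double trainFraction, long seed) {
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) rows.add(i); // stand-ins for rating records
        List<List<Integer>> parts = split(rows, 0.8, 42L);
        System.out.println(parts.get(0).size() + " training, " + parts.get(1).size() + " test");
    }
}
```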

Evaluation of System

Analysis with different similarity measures

Similarity measure     Preprocessing   Neighbours   SCORE          RMSE
Pearson Coefficient    With            20           0.80281969     0.807538041
Pearson Coefficient    With            100          0.82893594     0.83615161
Pearson Coefficient    With            1000         0.9604098849   0.957224878
Pearson Coefficient    Without         20           0.9146494635   0.882693501
Pearson Coefficient    Without         100          0.843550186    0.8463632529
Pearson Coefficient    Without         1000         0.86570804     0.86579376
Euclidean Distance     With            20           0.7694310985   0.767654578
Euclidean Distance     With            100          0.7399951798   0.746315315
Euclidean Distance     With            1000         0.724689055    0.740014834
Euclidean Distance     Without         20           0.89754363     0.900235117
Euclidean Distance     Without         100          0.83637523     0.8443356
Euclidean Distance     Without         1000         0.89751163     0.90023111
Log-likelihood         With            20           0.795861932    0.7785749268
Log-likelihood         With            100          0.7467147307   0.759681185
Log-likelihood         With            1000         0.726946821    0.74111319
Log-likelihood         Without         20           0.823518133    0.82825220
Log-likelihood         Without         100          0.8034275434   0.80880505
Log-likelihood         Without         1000         0.805596293    0.814149532
Tanimoto Coefficient   With            20           0.7945015649   0.7798661428
Tanimoto Coefficient   With            100          0.7490188732   0.76143536582
Tanimoto Coefficient   With            1000         0.7290753292   0.74004491
Tanimoto Coefficient   Without         20           0.832473341    0.8394239511
Tanimoto Coefficient   Without         100          0.8064418071   0.813111737
Tanimoto Coefficient   Without         1000         0.8037747631   0.811900568
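For reference, the RMSE reported above is the square root of the mean squared difference between predicted and actual ratings on the test data. A minimal sketch follows; the class name `Rmse` and the sample values are hypothetical, and the source does not define how the SCORE column is computed, so it is not reproduced here.

```java
class Rmse {
    // Root mean squared error between predicted and actual ratings.
    // Both arrays must be non-empty and have the same length.
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length == 0 || predicted.length != actual.length) {
            throw new IllegalArgumentException("arrays must be non-empty and equal in length");
        }
        double sumSquared = 0;
        for (int i = 0; i < predicted.length; i++) {
            double error = predicted[i] - actual[i];
            sumSquared += error * error;
        }
        return Math.sqrt(sumSquared / predicted.length);
    }

    public static void main(String[] args) {
        // Invented ratings for illustration only.
        System.out.println(rmse(new double[] {4, 2}, new double[] {3, 3}));
    }
}
```

Lower values mean predictions are closer to the held-out ratings.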

5. Flower grain image classification using supervised classification algorithms

● Dataset: Magnified images of flowers

● Technology: Java

● Algorithm: Grain Analysis (Image Processing)

● Machine learning technique: Neural Network, Deep Belief Network, SVM

● Microscope magnification: 100X

Deep Neural Network

Here we extended a simple neural network architecture to a deep belief network.

Simple Neural Network Deep Belief Network

Flower and its grain images

Model Parameter

SVM training Parameters

Code & Accuracy

6. Large scale medical text classification and identification in Healthcare

● Dataset : Medical Text files

Healthcare

ENTITY RELATIONSHIP DETECTION FROM LARGE TEXT FILES

In this project, we developed an algorithm that predicts whether a relationship exists between two entities in medical text files (for example, "leg pain" or "pain in leg"). We used a deep learning Support Vector Machine algorithm (binary classifier) to identify it accurately.

Accuracy: 87% on unseen test data

Healthcare

ENTITY DETECTION IN CLINICAL DOMAIN

In this project, we detected different keywords (modifiers) such as negation, conditional, severity, temporal, body measurements, some disease names, and others in large medical text files using NLP algorithms, and classified them using probabilistic graphical models such as Deep CRF networks and Hidden Markov Models.

Accuracy: 93% on unseen test data.
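To show how a Hidden Markov Model assigns a label to each token, here is a minimal Viterbi decoding sketch. It covers only the HMM side, not the Deep CRF networks, and everything specific in it is an assumption: the two states (0 = other, 1 = negation), the two observation symbols (0 = ordinary word, 1 = negation cue), the probabilities, and the class name `ViterbiSketch`.

```java
class ViterbiSketch {
    // Returns the most likely hidden-state sequence for the observations,
    // working in log space to avoid underflow on long sequences.
    // start[s]: initial probability, trans[p][s]: transition p -> s,
    // emit[s][o]: probability that state s emits observation symbol o.
    static int[] decode(double[] start, double[][] trans, double[][] emit, int[] obs) {
        int n = start.length, t = obs.length;
        double[][] best = new double[t][n];
        int[][] back = new int[t][n];
        for (int s = 0; s < n; s++) {
            best[0][s] = Math.log(start[s]) + Math.log(emit[s][obs[0]]);
        }
        for (int i = 1; i < t; i++) {
            for (int s = 0; s < n; s++) {
                best[i][s] = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < n; p++) {
                    double candidate = best[i - 1][p] + Math.log(trans[p][s]);
                    if (candidate > best[i][s]) { best[i][s] = candidate; back[i][s] = p; }
                }
                best[i][s] += Math.log(emit[s][obs[i]]);
            }
        }
        // Pick the best final state, then follow the back-pointers.
        int last = 0;
        for (int s = 1; s < n; s++) if (best[t - 1][s] > best[t - 1][last]) last = s;
        int[] path = new int[t];
        path[t - 1] = last;
        for (int i = t - 1; i > 0; i--) path[i - 1] = back[i][path[i]];
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical model parameters, invented for illustration.
        double[] start = {0.8, 0.2};
        double[][] trans = {{0.7, 0.3}, {0.6, 0.4}};
        double[][] emit = {{0.9, 0.1}, {0.1, 0.9}};
        int[] obs = {0, 1, 0}; // e.g. ordinary word, negation cue, ordinary word
        System.out.println(java.util.Arrays.toString(decode(start, trans, emit, obs)));
    }
}
```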

7. Neural network design for rock image recognition

The objective of this project is to develop a method for a rock image classification system using microscopic imaging of surface parameters. The rock surface parameters are color, grain, and texture. The combined features extracted from each of these parameters are used to uniquely identify the rock type, or to recognize its signature. We designed and developed a multi-layer feed-forward deep neural network to classify non-linear, complex data.

Accuracy: 95% on rock images not seen during training
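The forward pass of a multi-layer feed-forward network can be sketched as below. This is an illustration rather than the project's network: the weights are set by hand to compute XOR, a small non-linearly separable problem, instead of being trained on rock features, and the ReLU activation and the class name `FeedForwardSketch` are assumptions.

```java
class FeedForwardSketch {
    // One dense layer: out[j] = sum_i(w[j][i] * in[i]) + b[j],
    // followed by ReLU when applyRelu is true.
    static double[] layer(double[] in, double[][] w, double[] b, boolean applyRelu) {
        double[] out = new double[w.length];
        for (int j = 0; j < w.length; j++) {
            double sum = b[j];
            for (int i = 0; i < in.length; i++) sum += w[j][i] * in[i];
            out[j] = applyRelu ? Math.max(0, sum) : sum;
        }
        return out;
    }

    // Runs the input through every layer; hidden layers use ReLU,
    // the final layer is left linear.
    static double[] forward(double[] in, double[][][] weights, double[][] biases) {
        double[] activation = in;
        for (int l = 0; l < weights.length; l++) {
            boolean hidden = l < weights.length - 1;
            activation = layer(activation, weights[l], biases[l], hidden);
        }
        return activation;
    }

    public static void main(String[] args) {
        // Hand-set weights: two hidden units, one output unit.
        double[][][] weights = {{{1, 1}, {1, 1}}, {{1, -2}}};
        double[][] biases = {{0, -1}, {0}};
        for (double[] x : new double[][] {{0, 0}, {0, 1}, {1, 0}, {1, 1}}) {
            System.out.println(x[0] + " xor " + x[1] + " = " + forward(x, weights, biases)[0]);
        }
    }
}
```

No single linear layer can produce this output pattern, which is why the hidden layer with a non-linear activation is needed.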
