14
CS525 DATA MINING CS525 DATA MINING COURSE INTRODUCTION COURSE INTRODUCTION YÜCEL SAYGIN YÜCEL SAYGIN SABANCI UNIVERSITY SABANCI UNIVERSITY

CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

Embed Size (px)

Citation preview

Page 1: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

CS525 DATA MINING CS525 DATA MINING COURSE INTRODUCTIONCOURSE INTRODUCTION

YÜCEL SAYGINYÜCEL SAYGIN

SABANCI UNIVERSITYSABANCI UNIVERSITY

Page 2: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

2Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Contact InfoContact Info

[email protected] http://people.sabanciuniv.edu/~ysaygin Tel : 9576Tel : 9576 No Specific office hours. You can drop by No Specific office hours. You can drop by

anytime you like. Email or call me to make anytime you like. Email or call me to make sure I am at the office.sure I am at the office.

Page 3: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

3Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Course InfoCourse Info

Reference BookReference Book:: Data Mining Concepts and Techniques

Author: Jiawei Han and Micheline Kamber

Publisher: Morgan Kaufmann

Page 4: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

4Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Course InfoCourse Info

Grading: Midterm : 30% (April 14-18) Homework : 10% Project : 30% Paper presentation : 10% Term Paper : 10% Attendance during paper presentations:

10%

Page 5: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

5Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Topics that will be coveredTopics that will be covered

Different Data Mining TechniquesDifferent Data Mining Techniques Association RulesAssociation Rules ClassificationClassification ClusteringClustering

Data Mining and Security IssuesData Mining and Security Issues Applications of Data MiningApplications of Data Mining Data WarehousingData Warehousing

Page 6: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

6Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Aim of the course Aim of the course

Knowledge:Knowledge: To introduce data mining conceptsTo introduce data mining concepts

Skills:Skills: paper reading and presentationpaper reading and presentation research and/or project workresearch and/or project work

Page 7: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

7Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

A Rough ScheduleA Rough Schedule

March, April, First Week of May:March, April, First Week of May: Lectures on various data mining techniquesLectures on various data mining techniques Invited Speakers form Industry to share Invited Speakers form Industry to share

their experiencestheir experiences Remaining 4 weeks: Paper Remaining 4 weeks: Paper

presentations and discussions in class presentations and discussions in class about research issuesabout research issues

Page 8: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

8Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

What I will doWhat I will do

Give the basics on data miningGive the basics on data mining broad data mining concepts broad data mining concepts research issuesresearch issues

Project supervisionProject supervision Give directions and advise on the projects I Give directions and advise on the projects I

proposed (will be provided in the next slides)proposed (will be provided in the next slides) Coordination of the presentationsCoordination of the presentations

Page 9: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

9Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

What I expect you to doWhat I expect you to do

I expect you to do things wrt your I expect you to do things wrt your background and expertise.background and expertise.

Students with CS background will do Students with CS background will do projects involving implementation and/or projects involving implementation and/or researchresearch

Others can do application projects Others can do application projects On a real applicationOn a real application That will involve data collection, cleaning etcThat will involve data collection, cleaning etc With at least two data mining tools that will be With at least two data mining tools that will be

compared in terms of functionality for the chosen compared in terms of functionality for the chosen applicationapplication

Page 10: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

10Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

What I expect you to doWhat I expect you to do

Understand the basic data mining conceptsUnderstand the basic data mining concepts Choose a specific area and two related papers Choose a specific area and two related papers

on the same topic for presentation in classon the same topic for presentation in class Attendance is required for paper presentations Attendance is required for paper presentations

and you will loose 2% of your overall for each and you will loose 2% of your overall for each presentation you missed.presentation you missed.

Write a term paper on the two papers Write a term paper on the two papers presented.presented.

Do a project and a final report describing what Do a project and a final report describing what you learned or achieved in the scope of the you learned or achieved in the scope of the project.project.

Page 11: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

11Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

ProjectsProjects

Data Mining and Game Theory. Will be co-Data Mining and Game Theory. Will be co-supervised with Ozgur Kibris from Economics supervised with Ozgur Kibris from Economics ((Mostly research, and survey, may involve Mostly research, and survey, may involve algorithms design. Good for students in SLPalgorithms design. Good for students in SLP))

Implementation of algorithms for data security Implementation of algorithms for data security against data mining methods (against data mining methods (pure algorithms pure algorithms survey and implementation, good for CS survey and implementation, good for CS students who like implementationstudents who like implementation))

Page 12: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

12Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

ProjectsProjects

Development of algorithms for protecting Development of algorithms for protecting sensitive data against various data mining sensitive data against various data mining algorithms (algorithms (research and implementation, research and implementation, good for CS studentsgood for CS students)) Hiding Sequential patterns in temporal data Hiding Sequential patterns in temporal data

by changing time granularities is an by changing time granularities is an exampleexample

Survey and Implementation of the existing Survey and Implementation of the existing Privacy preserving data mining methods (Privacy preserving data mining methods (pure pure implementation, good for CS studentsimplementation, good for CS students))

Page 13: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

13Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Introduction to Data MiningIntroduction to Data Mining

Why do we collect and process historical Why do we collect and process historical data?data?

What is the purpose of data mining?What is the purpose of data mining? What are the applications?What are the applications?

Page 14: CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY

14Faculty of Engineering and Natural Sciences, Computer Science and Engineering Program

Introduction to Data MiningIntroduction to Data Mining

Data is mostly stored in data Data is mostly stored in data warehouseswarehouses

Data Mining Techniques are used to Data Mining Techniques are used to analyse the data:analyse the data: Association rule finding from transactional Association rule finding from transactional

datadata Clustering of data with multiple dimensions Clustering of data with multiple dimensions Classification of given data into predefined Classification of given data into predefined

classesclasses