Romi Dm 05 Research June2012


  • 8/13/2019 Romi Dm 05 Research June2012


    Data Mining: Data Mining Research

    Romi Satria Wahono

    http://romisatriawahono.net

    +6281586220090


    SD Sompok Semarang (1987)

    SMPN 8 Semarang (1990)

    SMA Taruna Nusantara, Magelang (1993)

    Bachelor's, master's, and doctoral (on-leave) studies (S1, S2, and S3)

    Department of Computer Sciences, Saitama University, Japan (1994-2004)

    Research Interests: Software Engineering and Intelligent Systems

    Founder of IlmuKomputer.Com

    Researcher at LIPI (2004-2007)

    Founder and CEO of PT Brainmatics Cipta Informatika

    Romi Satria Wahono


    Course Outline

    1. Introduction to Data Mining

    2. The Data Mining Process

    3. Evaluation and Validation in Data Mining

    4. Data Mining Methods and Algorithms

    5. Data Mining Research


    Data Mining Research


    Data Mining Research

    1. Standard Research Process in Data Mining

    2. Journal Publications on Data Mining

    3. Research on Classification

    4. Research on Clustering

    5. Research on Prediction

    6. Research on Association Rule


    Standard Research Process in Data Mining


    Data Mining Standard Process (CRISP-DM)

    A cross-industry standard was clearly required that is industry-neutral, tool-neutral, and application-neutral

    The Cross-Industry Standard Process for Data Mining (CRISP-DM) was developed in 1996 (Chapman, 2000)

    CRISP-DM provides a nonproprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research unit


    CRISP-DM


    1. Business Understanding Phase

    Enunciate the project objectives and requirements clearly in terms of the business or research unit as a whole

    Translate these goals and restrictions into the formulation of a data mining problem definition

    Prepare a preliminary strategy for achieving these objectives


    2. Data Understanding Phase

    Collect the data

    Use exploratory data analysis to familiarize yourself with the data and discover initial insights

    Evaluate the quality of the data

    If desired, select interesting subsets that may contain actionable patterns
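The data understanding steps above can be sketched in a few lines of Python. This is only an illustration using the standard library: the `records` list, its column names, and the `profile` helper are hypothetical and do not come from the course material. The sketch counts missing values per column (a simple data quality check) and reports basic statistics as a first look at the data.

```python
from statistics import mean

# Hypothetical toy records (assumption); None marks a missing value.
records = [
    {"age": 25, "income": 3200.0, "approved": "yes"},
    {"age": 41, "income": None, "approved": "no"},
    {"age": 33, "income": 5100.0, "approved": "yes"},
    {"age": None, "income": 2700.0, "approved": "no"},
]


def profile(rows):
    """Return per-column missing counts plus basic summary information."""
    summary = {}
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        info = {"missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: report range and mean of the observed values.
            info.update(min=min(present), max=max(present), mean=mean(present))
        else:
            # Categorical column: report the distinct observed values.
            info["distinct"] = sorted(set(present))
        summary[col] = info
    return summary


report = profile(records)
for col, info in report.items():
    print(col, info)
```

In practice a data frame library would usually do this work; the point here is only that data quality checks and first insights come before any modeling.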


    3. Data Preparation Phase

    Prepare from the initial raw data the final data set that is to be used for all subsequent phases. This phase is very labor intensive

    Select the cases and variables you want to analyze and that are appropriate for your analysis

    Perform transformations on certain variables, if needed

    Clean the raw data so that it is ready for the modeling tools
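A minimal sketch of the preparation steps above, assuming hypothetical records and column names (`raw`, `FEATURES`, `TARGET`, and `prepare` are illustrative only). It selects the variables of interest, drops incomplete cases as one simple cleaning strategy, and applies min-max scaling as one possible transformation. Other choices, such as imputing missing values, may fit a real project better.

```python
# Hypothetical raw records (assumption); None marks a missing value.
raw = [
    {"id": 1, "age": 25, "income": 3200.0, "approved": "yes"},
    {"id": 2, "age": 41, "income": None, "approved": "no"},
    {"id": 3, "age": 33, "income": 5100.0, "approved": "yes"},
    {"id": 4, "age": 58, "income": 2700.0, "approved": "no"},
]

FEATURES = ["age", "income"]  # variables chosen for analysis (assumption)
TARGET = "approved"


def prepare(rows, features, target):
    """Select variables, drop incomplete cases, and min-max scale features."""
    # Select: keep only the chosen variables (copies, so raw stays untouched).
    selected = [{k: r[k] for k in features + [target]} for r in rows]
    # Clean: drop cases that contain a missing value.
    complete = [r for r in selected if all(v is not None for v in r.values())]
    # Transform: rescale each feature to the range [0, 1].
    for f in features:
        lo = min(r[f] for r in complete)
        hi = max(r[f] for r in complete)
        span = hi - lo
        for r in complete:
            r[f] = (r[f] - lo) / span if span else 0.0
    return complete


prepared = prepare(raw, FEATURES, TARGET)
for row in prepared:
    print(row)
```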


    4. Modeling Phase

    Select and apply appropriate modeling techniques

    Calibrate model settings to optimize results

    Remember that often, several different techniques may be used for the same data mining problem

    If necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique
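The idea of trying several techniques and calibrating their settings can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the toy `train` and `valid` sets, the majority-class baseline, and the choice of k-nearest neighbors with candidate values of k. The slides do not prescribe any of these.

```python
from collections import Counter
from math import dist

# Hypothetical toy data (assumption): (feature vector, class label) pairs.
train = [((1.0, 1.0), "no"), ((1.2, 0.8), "no"), ((0.8, 1.1), "no"),
         ((3.0, 3.2), "yes"), ((3.1, 2.9), "yes"), ((2.9, 3.0), "yes")]
valid = [((1.1, 0.9), "no"), ((3.0, 3.0), "yes"), ((2.0, 2.2), "yes")]


def majority_baseline(train_rows):
    """Technique 1: always predict the most frequent training label."""
    label = Counter(y for _, y in train_rows).most_common(1)[0][0]
    return lambda x: label


def knn(train_rows, k):
    """Technique 2: k-nearest neighbors with Euclidean distance."""
    def predict(x):
        nearest = sorted(train_rows, key=lambda row: dist(row[0], x))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Apply several techniques and settings to the same problem.
candidates = {"baseline": majority_baseline(train)}
for k in (1, 3, 5):
    candidates[f"knn(k={k})"] = knn(train, k)

scores = {name: accuracy(model, valid) for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Scoring on a separate validation set, rather than on the training data, keeps the calibration from simply rewarding memorization.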


    5. Evaluation Phase

    Evaluate the one or more models delivered in the modeling phase for quality and effectiveness before deploying them for use in the field

    Determine whether the model in fact achieves the objectives set for it in the first phase

    Establish whether some important facet of the business or research problem has not been accounted for sufficiently

    Come to a decision regarding use of the data mining results
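Checking a model against the objectives from the first phase can be sketched as follows. The label lists, the `TARGET_RECALL` threshold, and the helper names are hypothetical. A real project would take the threshold from the business understanding phase and would likely use more than one metric.

```python
# Hypothetical held-out labels and model predictions (assumption).
actual = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "yes", "no"]


def confusion(actual, predicted, positive="yes"):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(actual, predicted, positive="yes"):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion(actual, predicted, positive)
    return {
        "accuracy": (tp + tn) / len(actual),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical objective from the business understanding phase (assumption).
TARGET_RECALL = 0.70

metrics = evaluate(actual, predicted)
meets_objective = metrics["recall"] >= TARGET_RECALL
print(metrics, "meets objective:", meets_objective)
```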


    6. Deployment Phase

    Make use of the models created: model creation does not signify the completion of a project

    Example of a simple deployment: generate a report

    Example of a more complex deployment: implement a parallel data mining process in another department

    For businesses, the customer often carries out the deployment based on your model


    Exercise

    Study and understand Case Studies 1-5 in Chapter 1 of Larose (2005)

    Study and understand how CRISP-DM is applied in the thesis by Firmansyah (2011) on using the C4.5 algorithm to determine creditworthiness


    Journal Publications on Data Mining


    Transactions and Journals

    Review Paper (survey and state-of-the-art):

    ACM Computing Surveys (CSUR)

    Research Paper (technical):

    ACM Transactions on Knowledge Discovery from Data (TKDD)

    ACM Transactions on Information Systems (TOIS)

    IEEE Transactions on Knowledge and Data Engineering

    Springer Data Mining and Knowledge Discovery

    International Journal of Business Intelligence and Data Mining (IJBIDM)


    Cognitive Assignment III

    1. Read one scientific paper, published in a journal between 2010 and 2012, that relates to the data mining methods we have studied

    2. Summarize it as a set of slides with the following structure:

    1. Research Background

    2. Problem Statements

    3. Research Questions

    4. Research Objective

    5. Existing Methods

    6. Proposed Method

    7. Results

    8. Conclusion

    3. Present it to the class at the next session


    References

    1. Ian H. Witten, Eibe Frank, and Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Elsevier, 2011

    2. Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley & Sons, 2005

    3. Florin Gorunescu, Data Mining: Concepts, Models and Techniques, Springer, 2011

    4. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd Edition, Elsevier, 2006

    5. Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, 2nd Edition, Springer, 2010

    6. Warren Liao and Evangelos Triantaphyllou (eds.), Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, World Scientific, 2007