IMPORTANT QUESTIONS OF dwdmdwd.docx

IMPORTANT QUESTIONS OF UNIT-I TO UNIT-IV

Prepare the three-star (***) questions for the external lab exam and the remaining questions for the external end exams.

***1. Define the terms data mining and data warehousing.

***2. Explain the fundamentals of data mining.

***3. Explain data mining functionalities.

***4. Explain the major issues of data mining.

***5. Explain the classification of data mining systems.

***6. Explain data preprocessing techniques.

***7. Explain data cleaning and data integration.

***8. Explain data transformation and data reduction.

***9. Explain data mining task primitives.

10. Explain discretization and concept hierarchy generation.

11. Explain the multidimensional data model.

12. Define the terms lattice of cuboids, OLAM, MOLAP, HOLAP, and ROLAP.

13. Explain the slice, dice, and pivot operations.

***14. Compare OLAP vs. OLTP.

15. (Or) Differentiate between an operational database and a data warehouse.

***16. Explain the importance of a data mining query language.

***17. Explain the star schema, snowflake schema, and fact constellation schema.

18. Explain the major types of concept hierarchies.

19. Explain central tendency. (Not in the syllabus, but a previous JNTU question.)

20. Explain dispersion of data. (Not in the syllabus, but a previous JNTU question.)

***21. Explain data warehouse architecture.

***22. Explain data warehouse implementation.

23. Explain the further development of data cube technology.

24. Explain the attribute-oriented induction technique.

25. Explain mining of frequent patterns.

25. Define association rule and explain the types of association rules.

26. Explain constraint-based association mining.

***27. Explain the Apriori algorithm.

***28. Explain the FP-growth algorithm.

29. Explain classification and prediction techniques.

30. Explain the Bayesian classification technique.

31. Explain support vector machines.

***32. Explain the backpropagation algorithm.

33. Explain support vector machines.

34. Explain other classification methods.

***35. Explain decision tree and rule-based classification.

***36. Explain the steps of KDD.

Viva questions

1. Data mining

2. Data warehousing

3. KDD

4. KDD steps

5. Preprocessing

6. Data cleaning

7. Data integration

8. Data transformation

9. Data reduction

10. Cluster

11. Principle of clustering

12. Regression

13. Classification

14. Prediction

15. OLTP

16. OLAP

17. Data mining applications

18. WEKA

19. ARFF

20. CSV

21. Cross-validation

22. Decision tree algorithms

23. Visualization tools

24. Data mining tools

25. Data warehousing tools

26. Categorical attribute

27. DMQL

28. What is accuracy

29. Redundancy

30. Types of databases

31. Types of data mining

32. Graph mining

33. Spatial and multimedia data mining

34. Sequence data mining and time-series data mining

35. Ranking mechanism

36. Process of creating an ARFF file

1. a. List all the categorical (or nominal) attributes and the real-valued attributes separately from the credit risk assessment ARFF file.

b. ***Explain the fundamentals of data mining.

2. a. What attributes do you think might be crucial in making the credit assessment? Come up with some simple rules in plain English using your selected attributes.

b. ***Explain data mining functionalities.

3. a. One type of model that you can create is a decision tree. Train a decision tree using the complete dataset as the training data. Report the model obtained after training.

b. ***Explain the major issues of data mining.

4. a. Create an ARFF file for credit risk assessment, perform classification, and display the decision tree.

b. ***Explain the classification of data mining systems.

5. a. Suppose you use your above model, trained on the complete dataset, to classify credit as good or bad for each of the examples in the dataset. What percentage of examples can you classify correctly? (This is also called testing on the training set.) Why do you think you cannot get 100% training accuracy?

b. Explain data preprocessing techniques.

6. a. One approach for solving the problem encountered in the previous question is cross-validation. Briefly describe what cross-validation is. Train a decision tree again using cross-validation and report your results. Does your accuracy increase or decrease? Why?

b. Explain data mining task primitives.
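
As a rough illustration of the cross-validation idea in question 6, the sketch below splits a toy list of class labels into k folds and scores a trivial majority-class learner on each held-out fold. The helper names (`k_fold_indices`, `train_majority`, `cross_val_accuracy`) and the toy labels are assumptions made for this example. WEKA uses stratified, randomized folds and a real decision tree learner, which this sketch does not attempt to reproduce.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        # The first n % k folds receive one extra example.
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_majority(labels):
    """Hypothetical stand-in learner: always predict the most common label."""
    return Counter(labels).most_common(1)[0][0]


def cross_val_accuracy(labels, k):
    """Fraction of examples predicted correctly while held out of training."""
    n = len(labels)
    correct = 0
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        # Train only on the examples that are NOT in the current test fold.
        train_labels = [labels[i] for i in range(n) if i not in held_out]
        prediction = train_majority(train_labels)
        correct += sum(1 for i in test_idx if labels[i] == prediction)
    return correct / n


# Toy class column: 7 "good" and 3 "bad" credit examples (made-up data).
LABELS = ["good"] * 7 + ["bad"] * 3
CV_ACCURACY = cross_val_accuracy(LABELS, k=5)
```

Every example is tested exactly once, by a model that never saw it during training, so the resulting figure estimates accuracy on unseen data rather than on the training set.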

7. a. Check to see if the data shows a bias against "foreign workers" (attribute 20) or "personal-status" (attribute 9). One way to do this (perhaps rather simple-minded) is to remove these attributes from the dataset and see if the decision tree created in those cases is significantly different from the full-dataset case which you have already done. To remove an attribute, you can use the Preprocess tab in WEKA's GUI Explorer. Did removing these attributes have any significant effect? Discuss.

b. Compare OLAP vs. OLTP.

Or: Differentiate between an operational database and a data warehouse.

8. a. Another question might be: do you really need to input so many attributes to get good results? Maybe only a few would do. For example, you could try just having attributes 2, 3, 5, 7, 10, 17 (consider your own attributes) and 21, the class attribute (naturally). Try out some combinations. (You had removed two attributes in problem 7. Remember to reload the ARFF data file to get all the attributes initially before you start selecting the ones you want.)

b. Explain the importance of a data mining query language.

9. a. Sometimes, the cost of rejecting an applicant who actually has good credit (case 1) might be higher than accepting an applicant who has bad credit (case 2). Instead of counting the misclassifications equally in both cases, give a higher cost to the first case (say, cost 5) and a lower cost to the second case. You can do this by using a cost matrix in WEKA. Train your decision tree again and report the decision tree and cross-validation results. Are they significantly different from the results obtained in problem 6 (using equal cost)?

b. Explain data warehouse architecture.
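
To make the cost-matrix idea in question 9 concrete, the sketch below totals misclassification costs with case 1 (rejecting a good applicant) weighted at 5, as the question suggests. The cost of 1 for case 2, the function name `total_cost`, and the four toy examples are assumptions made for illustration; this is plain arithmetic, not WEKA's cost-sensitive evaluation.

```python
def total_cost(actual, predicted, cost):
    """Sum cost[(actual_label, predicted_label)] over all examples."""
    return sum(cost[(a, p)] for a, p in zip(actual, predicted))


# Keys are (actual class, predicted class). Correct decisions cost nothing.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 5,  # case 1: rejecting an applicant with good credit
    ("bad", "good"): 1,  # case 2: accepting a bad applicant (assumed value)
    ("bad", "bad"): 0,
}

# Made-up labels: one case-1 error and one case-2 error.
ACTUAL = ["good", "good", "bad", "bad"]
PREDICTED = ["bad", "good", "good", "bad"]

ERRORS = sum(1 for a, p in zip(ACTUAL, PREDICTED) if a != p)
WEIGHTED_COST = total_cost(ACTUAL, PREDICTED, COST)
```

With equal costs the two errors would count the same. Under this matrix the single case-1 error contributes five times as much as the case-2 error, which is what can push a cost-sensitive learner toward accepting more borderline applicants.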

10. a. Do you think it is a good idea to prefer simple decision trees instead of having long, complex decision trees? How does the complexity of a decision tree relate to the bias of the model?

b. Explain the Apriori algorithm.

Explain the FP-growth algorithm.

11. a. List all the categorical (or nominal) attributes and the real-valued attributes separately from the weather data set ARFF file.

b. ***Explain the backpropagation algorithm.

12. a. What attributes do you think might be crucial in the student ARFF file? Come up with some simple rules in plain English using your selected attributes from the student ARFF file.

b. Explain decision tree and rule-based classification.

13. a. One type of model that you can create is a decision tree. Train a decision tree using the complete dataset as the training data. Report the model obtained after training on the student data set ARFF file.

b. Define the terms data mining and data warehousing.

14. a. Create an ARFF file for an employee data set, perform classification, and display the decision tree.

b. ***Explain the star schema, snowflake schema, and fact constellation schema.

15. a. Differentiate between an ARFF file and a CSV file with one example each, and run them in the WEKA tool.

b. Explain the steps of knowledge discovery in data.
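
For question 15a, the main difference is that a CSV file holds only a header row and data rows, while an ARFF file adds a `@relation` name and an explicit `@attribute` declaration (nominal value set or `numeric`) for each column before the `@data` section. The sketch below converts a small CSV string into ARFF text to show the two side by side. The function `csv_to_arff`, the type-inference rule, and the three-row weather sample are assumptions made for this example; it does not handle quoting, missing values, or date and string attributes.

```python
import csv
import io


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Build ARFF text from CSV text with a header row (simplified sketch)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    for col, name in enumerate(header):
        values = [row[col] for row in data]
        if all(is_number(v) for v in values):
            lines.append("@attribute " + name + " numeric")
        else:
            # Nominal attribute: distinct values in order of first appearance.
            distinct = list(dict.fromkeys(values))
            lines.append("@attribute " + name + " {" + ",".join(distinct) + "}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines)


# A tiny made-up weather sample in CSV form.
CSV_TEXT = "outlook,temperature,play\nsunny,85,no\novercast,83,yes\nrainy,70,yes\n"
ARFF_TEXT = csv_to_arff(CSV_TEXT, "weather")
```

Printing `ARFF_TEXT` shows the same three data rows as the CSV, preceded by the relation name and one attribute declaration per column.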
