dasariorama
IMPORTANT QUESTIONS OF UNIT-I TO UNIT-IV
Prepare the three-star (***) questions for the external lab exam; prepare the remaining questions for the end-of-semester external exams.
***1. Define the terms data mining and data warehousing.
***2. Explain the fundamentals of data mining.
***3. Explain data mining functionalities.
***4. Explain the major issues in data mining.
***5. Explain the classification of data mining systems.
***6. Explain data preprocessing techniques.
***7. Explain data cleaning and data integration.
***8. Explain data transformation and data reduction.
***9. Explain data mining task primitives.
10. Explain discretization and concept hierarchy generation.
11. Explain the multidimensional data model.
12. Define the terms lattice of cuboids, OLAM, MOLAP, HOLAP, and ROLAP.
13. Explain the slice, dice, and pivot operations.
***14. Compare OLAP and OLTP.
15. (Or) Differentiate between an operational database and a data warehouse.
***16. Explain the importance of the Data Mining Query Language.
***17. Explain the star schema, snowflake schema, and fact constellation schema.
18. Explain the major types of concept hierarchies.
19. Explain central tendency. (Not in the syllabus, but a previous JNTU question.)
20. Explain dispersion of data. (Not in the syllabus, but a previous JNTU question.)
***21. Explain data warehouse architecture.
***22. Explain data warehouse implementation.
23. Explain the further development of data cube technology.
24. Explain the attribute-oriented induction technique.
25. Explain the mining of frequent patterns.
26. Define association rule and explain the types of association rules.
27. Explain constraint-based association mining.
***28. Explain the Apriori algorithm.
***29. Explain the FP-growth algorithm.
30. Explain classification and prediction techniques.
31. Explain the Bayesian classification technique.
32. Explain support vector machines.
***33. Explain the backpropagation algorithm.
34. Explain other classification methods.
***35. Explain decision tree and rule-based classification.
***36. Explain the steps of KDD.
Viva questions
1. Data mining
2. Data warehousing
3. KDD
4. KDD steps
5. Preprocessing
6. Data cleaning
7. Data integration
8. Data transformation
9. Data reduction
10. Cluster
11. Principle of clustering
12. Regression
13. Classification
14. Prediction
15. OLTP
16. OLAP
17. Data mining applications
18. WEKA
19. ARFF
20. CSV
21. Cross-validation
22. Decision tree algorithms
23. Visualization tools
24. Data mining tools
25. Data warehousing tools
26. Categorical attribute
27. DMQL
28. What is accuracy?
29. Redundancy
30. Types of databases
31. Types of data mining
32. Graph mining
33. Spatial and multimedia data mining
34. Sequence data mining and time-series data mining
35. Ranking mechanism
36. Process of creating an ARFF file
1. a. List all the categorical (or nominal) attributes and the real-valued attributes separately from the credit risk assessment ARFF file.
b. Explain the fundamentals of data mining.
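For part (a) of question 1, the attribute types can be read directly from the `@attribute` lines of the ARFF header. The following is a minimal Python sketch under stated assumptions: the function name `split_attributes` and the three-attribute sample header are hypothetical illustrations (not the actual credit risk file), and the parser only handles simple, unquoted attribute names.

```python
def split_attributes(arff_text):
    """Return (nominal, numeric) attribute-name lists from an ARFF header."""
    nominal, numeric = [], []
    for line in arff_text.splitlines():
        line = line.strip()
        if line.lower().startswith("@data"):
            break  # the header ends where the data section begins
        if not line.lower().startswith("@attribute"):
            continue
        # "@attribute <name> <type>"; assumes the name has no spaces or quotes
        _, name, type_spec = line.split(None, 2)
        type_spec = type_spec.strip()
        if type_spec.startswith("{"):
            nominal.append(name)  # nominal: allowed values listed in braces
        elif type_spec.lower() in ("numeric", "real", "integer"):
            numeric.append(name)  # real-valued attribute
    return nominal, numeric


# Hypothetical sample header, not the real credit risk assessment file.
SAMPLE = """@relation credit-sample
@attribute checking_status {low, medium, high}
@attribute duration numeric
@attribute class {good, bad}
@data
low,12,good
"""

nominal, numeric = split_attributes(SAMPLE)
print("nominal:", nominal)
print("numeric:", numeric)
```

The same information is visible in WEKA's GUI Explorer after loading the file; the sketch only shows how the header encodes it.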
2. a. Which attributes do you think might be crucial in making the credit assessment? Come up with some simple rules in plain English using your selected attributes.
b. Explain data mining functionalities.
3. a. One type of model that you can create is a decision tree. Train a decision tree using the complete dataset as the training data. Report the model obtained after training.
b. Explain the major issues in data mining.
4. a. Create an ARFF file for credit risk assessment, perform classification, and display a decision tree.
b. Explain the classification of data mining systems.
5. a. Suppose you use the above model, trained on the complete dataset, to classify credit as good or bad for each of the examples in the dataset. What percentage of examples can you classify correctly? (This is also called testing on the training set.) Why do you think you cannot get 100% training accuracy?
b. Explain data preprocessing techniques.
6. a. One approach for solving the problem encountered in the previous question is cross-validation. Briefly describe what cross-validation is. Train a decision tree again using cross-validation and report your results. Does your accuracy increase or decrease? Why?
b. Explain data mining task primitives.
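For part (a) of question 6, the idea of k-fold cross-validation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`k_fold_indices`, `majority_class`, `cross_validated_accuracy`) and the ten toy labels are hypothetical, a majority-class predictor stands in for the decision tree, and the folds are not stratified.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def majority_class(labels):
    """Stand-in 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validated_accuracy(labels, k):
    """Train on k-1 folds, test on the held-out fold, pool the results."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        guess = majority_class([labels[j] for j in train])
        correct += sum(1 for j in test if labels[j] == guess)
    return correct / len(labels)


# Hypothetical toy labels, not taken from the credit dataset.
labels = ["good"] * 7 + ["bad"] * 3
cv_accuracy = cross_validated_accuracy(labels, k=5)
print(cv_accuracy)
```

Each example is tested exactly once, and always by a model that did not see it during training, which is why the estimate is usually lower than the accuracy measured on the training set.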
7. a. Check whether the data shows a bias against "foreign workers" (attribute 20) or "personal-status" (attribute 9). One way to do this (perhaps rather simple-minded) is to remove these attributes from the dataset and see whether the decision tree created in those cases is significantly different from the full-dataset case, which you have already done. To remove an attribute, you can use the Preprocess tab in WEKA's GUI Explorer. Did removing these attributes have any significant effect? Discuss.
b. Compare OLAP and OLTP.
(Or) Differentiate between an operational database and a data warehouse.
8. a. Another question might be: do you really need to input so many attributes to get good results? Maybe only a few would do. For example, you could try using just attributes 2, 3, 5, 7, 10, and 17 (or choose your own attributes), plus 21, the class attribute (naturally). Try out some combinations. (You removed two attributes in problem 7. Remember to reload the ARFF data file to get all the attributes back before you start selecting the ones you want.)
b. Explain the importance of the Data Mining Query Language.
9. a. Sometimes, the cost of rejecting an applicant who actually has good credit (case 1) might be higher than the cost of accepting an applicant who has bad credit (case 2). Instead of counting the misclassifications equally in both cases, give a higher cost to the first case (say, cost 5) and a lower cost to the second case. You can do this by using a cost matrix in WEKA. Train your decision tree again and report the decision tree and cross-validation results. Are they significantly different from the results obtained in problem 6 (using equal cost)?
b. Explain data warehouse architecture.
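For part (a) of question 9, the effect of a cost matrix can be illustrated by weighting a confusion matrix, as in the rough Python sketch below. Several things here are assumptions for illustration only: the row/column layout (rows actual, columns predicted, ordered good then bad), the cost of 1 for case 2 (the question only says "lower"), the confusion counts, and the helper names `total_cost` and `error_count`.

```python
# Rows are the actual class, columns the predicted class, ordered (good, bad).
COST = [
    [0, 5],  # actual good: predicting bad (case 1) costs 5
    [1, 0],  # actual bad: predicting good (case 2) costs 1 (assumed value)
]


def total_cost(confusion, cost):
    """Sum of each confusion-matrix count weighted by its cost entry."""
    size = len(cost)
    return sum(confusion[i][j] * cost[i][j]
               for i in range(size) for j in range(size))


def error_count(confusion):
    """Number of misclassified examples (all off-diagonal counts)."""
    size = len(confusion)
    return sum(confusion[i][j]
               for i in range(size) for j in range(size) if i != j)


# Hypothetical confusion counts, not results from the credit dataset.
confusion = [
    [60, 10],  # actual good: 60 predicted good, 10 predicted bad
    [15, 15],  # actual bad: 15 predicted good, 15 predicted bad
]

print("errors:", error_count(confusion))
print("weighted cost:", total_cost(confusion, COST))
```

With equal costs both kinds of error count the same, so only the number of errors matters; with the weighted matrix, the ten case 1 errors contribute more to the total than the fifteen case 2 errors.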
10. a. Do you think it is a good idea to prefer simple decision trees over long, complex decision trees? How does the complexity of a decision tree relate to the bias of the model?
b. Explain the Apriori algorithm.
Explain the FP-growth algorithm.
11. a. List all the categorical (or nominal) attributes and the real-valued attributes separately from the weather dataset ARFF file.
b. Explain the backpropagation algorithm.
12. a. Which attributes do you think might be crucial in the student ARFF file? Come up with some simple rules in plain English using your selected attributes from the student ARFF file.
b. Explain decision tree and rule-based classification.
13. a. One type of model that you can create is a decision tree. Train a decision tree using the complete dataset as the training data. Report the model obtained after training on the student dataset ARFF file.
b. Define the terms data mining and data warehousing.
14. a. Create an ARFF file for an employee dataset, perform classification, and display a decision tree.
b. Explain the star schema, snowflake schema, and fact constellation schema.
15. a. Differentiate between an ARFF file and a CSV file with one example each, and run both in the WEKA tool.
b. Explain the steps of knowledge discovery from data (KDD).
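For part (a) of question 15, the main difference is that a CSV file carries only a header row and data rows, while an ARFF file also declares a relation name and a type for every attribute. The Python sketch below converts a small CSV string into ARFF text to make that visible. It is an illustration under stated assumptions: the function name `csv_to_arff`, the relation name, and the two-row sample are hypothetical, and the type inference (all-numeric column means `numeric`, otherwise nominal) is a simplification.

```python
import csv
import io


def is_number(text):
    """True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Build ARFF text from CSV text, inferring numeric vs nominal columns."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    for col, name in enumerate(header):
        values = [row[col] for row in data]
        if all(is_number(v) for v in values):
            lines.append("@attribute %s numeric" % name)
        else:
            # Nominal attribute: list the distinct values in braces
            distinct = sorted(set(values))
            lines.append("@attribute %s {%s}" % (name, ",".join(distinct)))
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical two-row sample in CSV form.
CSV_TEXT = "outlook,temperature,play\nsunny,85,no\nrainy,70,yes\n"

arff_text = csv_to_arff(CSV_TEXT, "weather_sample")
print(arff_text)
```

The data rows are identical in both formats; only the header section differs.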