MC0088-MQP-

Model Question Paper

Subject Code: MC0088 Book ID : B1009

Subject Name: Data Mining

Credits: 4 Marks: 140

Part A (One mark questions) (50 * 1 = 50 Marks)

1. Typical techniques for data mining involve ___________.

A. Decision trees

B. Neural networks

C. Genetic algorithms

D. All of the above

2. POS collects the information on the item_________.

A. Brand name

B. Size

C. Category

D. All of the above

3. Which of the following industries use data mining techniques?

A. Chemical

B. Finance

C. Marketing

D. None of the above

4. ______ is the process of analyzing data from different perspectives and summarizing it into useful information. A. Data process B. Data management C. Data mining D. Database management

5. _________data is included in the transactional data. A. Sales B. Cost C. Inventory D. All of the above

6. Data warehousing deals with subjects like __________. A. Supplier B. Product C. Sales D. A, B and C

7. ________operation is used by the data warehousing in accessing of data. A. Data control B. Data access C. Data mining D. Data processing

8. Data warehousing brings_________ performance to the integrated heterogeneous database system. A. High B. Low C. Moderate D. Very less

9. Which of the following encompasses a broad range of analytical software and provides solutions for gathering information? A. RI B. BI C. KI D. None of the above

10. Identify business intelligence tool from the following. A. OLAP B. Data mining tools C. Query tools D. All of the above

11. Data cleaning helps to fill _______values. A. Missing B. Routine C. Old D. New

12. Data mining techniques are classified based on_____________. A. Database B. Knowledge to be discovered C. Techniques to be utilized D. All of the above

13. ____________software provides the ability to store, access and modify the data. A. DBMS B. RDBMS C. Data warehousing D. None of the above

14. _______ Language is supported by DBMS. A. C++ B. Query C. C D. Java

15. Detecting anomalies is a ___________ technique. A. Data warehousing B. Data stage C. Data mining D. Data cleaning

16. Identify data mining techniques from the following. A. Clustering B. Data summarization C. Classification D. All of the above

17. _______can manage the data on physical storage devices. A. Data stage B. DBMS C. Data mining D. Data warehousing

18. Who proposed the Apriori algorithm? A. Agrawal B. Srikant C. Srikaran D. Agrawal and Srikant

19. _______rule can describe associations between quantitative items or attributes. A. Qualitative B. Quantitative C. Multilevel D. Multidimensional

20. Identify the prediction technique from the following. A. Nearest neighbor B. Clustering C. Data mining D. Data warehousing

21. Identify a link analysis algorithm from the following. A. Web Agent B. Page rank C. Page view D. None of the above

22. The ____________ approach mainly concentrates on improving information finding and filtering. A. Agent-based approach B. Database approach C. Attribute selection approach D. Testing approach

23. Which of the following systems can store and manage a large collection of multimedia data? A. DBMS B. RDBMS C. Multimedia database D. ROM

24. ___________ are the queries of a content-based image retrieval system.

A. Image sample-based queries

B. Image feature specification queries

C. Both A and B

D. Image sample specification queries

25. ______ strategy has the capacity to reduce the overall data mining cost without loss of quality.

A. Resolution

B. Mining

C. Multi resolution mining

D. Multi resolution

26. _______involving multimedia objects can be mined in image and video database.

A. Data mining

B. Association rules

C. Web mining

D. Clustering

27. Multimedia database is a combination of_______.

A. Audio

B. Video

C. Image

D. All of the above

28. ________can build indices.

A. Description based retrieval systems

B. Content based retrieval systems

C. Data mining

D. Relational database management systems

29. _________ are the approaches of similarity based retrieval in image databases. A. Color histogram–based signature B. Multi feature composed signature C. Wavelet-based signature D. All of the above

30. ___________ Model is helpful to predict the future purchasing behavior of an individual. A. Data mining B. Clustering C. Predictive profile D. Web mining

31. Telecommunications and media deals with__________. A. Response Scoring B. Customer segmentation C. Profitability analysis D. All of the above

32. _____________contains customer-shopping transactions. A. Retail databases B. Relational databases C. Data mining D. Relational database management system

33. Which of the following methods are constructed to predict the outcomes of a variety of decision alternatives? A. Predictive B. Database C. Multi feature D. None of the above

34. __________ industries mainly rely on data analysis to make profitable business decisions. A. Insurance B. Direct mail C. Both A and B D. Chemical

35. Identify the various tasks that risk management deals with. A. Forecasting B. Customer retention C. Improved underwriting D. All of the above

36. _______techniques are used by the investors in order to predict the stock performance. A. Web mining B. Data mining C. Data warehousing D. Data stage

37. The individual tuples making up the training set are referred to as______________. A. Training samples B. Samples C. Data samples D. Nearest samples

38. _________method uses test set of class-labeled samples.

A. Holdout method B. Labeled method C. Clustering technique D. Classifier

39. If the class label of each training sample is known, the learning is referred to as _________. A. Supervised learning B. Unsupervised learning C. Clustering D. None of the above

40. Which of the following measures can be used to select the test attribute at each node in the tree?

A. Attribute measure

B. Selection measure

C. Attribute selection measure

D. Testing measure

41. Which of the following is also an input to the POS?

A. Customers program

B. Rewards program

C. Customer rewards program

D. Induction program

42. Which of the following is the bottom-line of business intelligence?

A. Data stage

B. Database management system

C. Data mining

D. Database management

43. Which of the following techniques is useful to ensure consistency in naming conventions?

A. Data mining

B. Data processing

C. Data warehousing

D. Data cleaning

44. ___________types of operations are required by data warehousing in data accessing.

A. Three

B. Four

C. Two

D. None of the above

45. _________ tools software allows the user to ask questions about the patterns or details in the data.

A. Query

B. Data mining

C. OLAP

D. OLTP

46. Which of the following languages is supported by DBMS?

A. C++

B. C

C. Java

D. Query

47. The Apriori algorithm is a ______ approach.

A. Top-down

B. Down

C. Top

D. Down-top

48. Which of the following are the drawbacks of k-means clustering?

A. It does not do well with overlapping clusters.

B. The clusters are easily pulled off-center by outliers.

C. Each record is either inside or outside of a given cluster.

D. A,B and C

49. The data tuples analyzed to build the model collectively form the______.

A. Training

B. Data

C. Training data set

D. Data set

50. Which of the following approaches mainly concentrates on improving information finding and filtering?

A. Agent-based approach

B. Database approach

C. Attribute selection approach

D. Testing approach

Part B (Two mark questions) (25 * 2 = 50)

51. ___________ helps the people to do business in an effective way.

A. Data mining

B. DBMS

C. RDBMS

D. Data processing

52. Which of the following can automatically discover the new relationships?

A. Data mining

B. Data processing

C. Database

D. None of the above

53. ________ are the association techniques of data mining.

A. One dimensional

B. Multidimensional

C. Multilevel

D. All of the above

54. Which of the following is defined as a process of centralized data management and retrieval?

A. Data mining

B. Data warehousing

C. Data stage

D. Database management

55. Data mining is also known as____________.

A. Knowledge discovery in databases

B. Known discovery in databases

C. Knowledge domain databases

D. Knowledge discovery in data

56. Data mining tasks can be specified by using________ language.

A. Data mining

B. Data mining query

C. Query

D. None of the above

57. Identify the major task of an online operational database system.

A. Online interaction

B. Line transformation

C. Query processing

D. Online operations

58. Identify the most popular schema from the following?

A. Star schema

B. Snowflake schema

C. Snow Schema

D. None of the above

59. _________ can improve the effectiveness of marketing campaigns.

A. Business intelligence

B. Query tools

C. OLAP

D. OLTP

60. Which of the following layers consists of relational and OLAP cube services?

A. Information layer

B. Warehouse layer

C. Intelligence layer

D. Business layer

61. Business organizations can gain competitive advantage with well-designed __________.

A. RI

B. BI

C. DM

D. None

62. Which of the following is helpful to identify outliers?

A. Clustering analysis

B. Combined computer and human inspection

C. Binning

D. Regression

63. Identify a problem which occurs during data integration?

A. Identification problem

B. Identity problem

C. Entity problem

D. Entity identification problem

64. ____________ are the methods of data preprocessing.

A. Data cleaning

B. Data integration and transformation

C. Data reduction

D. All of the above

65. ______ Data mining technique is used to address the problems.

A. Business intelligence

B. Association

C. Clustering

D. Mining

66. Which of the following techniques is used to predict group membership for data instances?

A. Smoothing

B. Classification

C. Generalization

D. Regression

67. Identify the methods that are classified based on the driven method?

A. Data driven mining

B. Query-driven mining

C. Autonomous knowledge mining

D. All of the above

68. _____ algorithm can take advantage of both the top-down and bottom-up approaches.

A. Partition

B. Apriori

C. Pincer-Search

D. Mining

69. Which of the following is the desirable feature of an effective algorithm?

A. To reduce the I/O operations

B. To moderate I/O operations

C. To increase the I/O operations

D. Not much efficient in computing

70. __________are the dimensions for multidimensional rules.

A. Age

B. Income

C. Buys

D. All of the above

71. Which of the following can be useful in discovering similarities between websites?

A. Web structure mining

B. Content mining

C. Data mining

D. Web mining

72. Which of the following are simple text files that are automatically generated every time someone accesses a website?

A. Log File

B. Page File

C. Page Rank

D. Web Agent

73. Which of the following queries finds all the images that are similar to the given image sample?

A. Sample based

B. Image based

C. Image-sample-based

D. Image specification

74. Which of the following works well with standard clustering algorithms?

A. Correlation

B. Regression

C. Linear correlation

D. Clustering techniques

75. Which of the following can produce large amounts of data for business processes?

A. Back office

B. Front office

C. Network applications

D. A,B and C

Part C (Descriptive questions) (4 * 10 = 40)

1. A) Explain the working and meaning of data mining [5 marks]

B) Explain the various definitions of data mining and also explain how data mining works [5 marks]

2. A) What are the various techniques of data mining? [4 marks]

B) Explain each technique of data mining briefly [6 marks]

3. A) Explain the terms classification and prediction [5 marks]

B) Discuss the various issues regarding classification and prediction [5 marks]

4. A) What are the various applications of data mining? [3 marks]

B) Discuss the various scientific applications using data mining [7 marks]

Answer Keys

Part - A                                                                    Part - B
Q. No.  Ans. Key  Unit no./Page no.   Q. No.  Ans. Key  Unit no./Page no.   Q. No.  Ans. Key  Unit no./Page no.
1       D         1/03                26      B         10/189              51      A         1/02
2       D         1/02                27      D         10/189              52      A         1/02
3       B         1/03                28      A         10/190              53      D         1/05
4       C         1/01                29      D         10/191              54      B         1/05
5       D         1/04                30      C         11/208              55      A         1/02
6       D         2/15                31      D         11/205              56      B         2/27
7       B         2/16                32      A         11/205              57      C         2/17
8       A         2/16                33      A         11/206              58      A         2/23
9       B         3/43                34      C         11/206              59      A         3/44
10      D         3/43                35      D         11/204              60      B         3/46
11      A         4/52                36      B         11/205              61      B         3/45
12      D         5/74                37      A         8/159               62      B         4/54
13      A         5/73                38      A         8/159               63      D         4/56
14      B         5/73                39      A         8/159               64      D         4/51
15      C         5/72                40      C         8/166               65      B         5/76
16      D         5/75                41      C         1/02                66      B         5/77
17      B         5/73                42      C         1/03                67      D         5/75
18      D         6/94                43      D         2/16                68      C         6/100
19      B         6/87                44      C         2/16                69      A         6/89
20      A         7/137               45      A         3/43                70      D         6/125
21      B         7/137               46      D         5/73                71      A         9/178
22      A         9/177               47      A         6/94                72      A         9/176
23      C         10/189              48      D         7/146               73      C         10/191
24      C         10/189              49      C         8/159               74      C         11/213
25      C         10/190              50      A         9/177               75      D         11/214