
Mb0036 set 1&2


Master of Business Administration - MBA Semester III

MI0036 – Business Intelligence Tools - 4 Credits
Assignment - Set- 1 (60 Marks)

Note: Each question carries 10 Marks. Answer all the questions.

Q.1 Define the term business intelligence tools. Briefly explain how the data from one end gets transformed into information at the other end. [10 Marks]

Business Intelligence (BI) is a generic term for leveraging an organization's internal and external data and information to make the best possible business decisions. The field of business intelligence is very diverse and comprises the tools and technologies used to access and analyze various types of business information. These tools gather and store data and allow users to view and analyze the information from a wide variety of dimensions, thereby helping decision-makers make better business decisions. Thus BI systems and tools play a vital role in helping organizations make improved decisions in the current cut-throat competitive scenario.

In simple terms, Business Intelligence is an environment in which business users receive reliable, consistent, meaningful and timely information. This information enables business users to conduct analyses that yield an overall understanding of how the business has been, how it is now and how it will be in the near future. BI tools also monitor the financial and operational health of the organization through the generation of various types of reports, alerts, alarms, key performance indicators and dashboards.

Business intelligence tools are a type of application software designed to help in making better business decisions. These tools aid in the analysis and presentation of data in a more meaningful way and so play a key role in the strategic planning process of an organization. They support business intelligence in areas such as market research and segmentation, customer profiling, customer support, profitability, and inventory and distribution analysis, to name a few. Various types of BI systems, viz. Decision Support Systems, Executive Information Systems (EIS), multidimensional analysis software or OLAP (On-Line Analytical Processing) tools, and data mining tools, are discussed further. Whatever the type, the Business Intelligence capability of a system is to let its users slice and dice the information from their organization's numerous databases without having to wait for their IT departments to develop complex queries and elicit answers.

Although it is possible to build BI systems without the benefit of a data warehouse, in practice most are an integral part of the user-facing end of a data warehouse. In fact, we can hardly think of building a data warehouse without BI systems. That is the reason the terms 'data warehousing' and 'business intelligence' are sometimes used interchangeably.


The figure below depicts how data from one end gets transformed into information at the other end for business use.

Roles in a Business Intelligence project: A typical BI project consists of the following roles; the responsibilities of each role are detailed below.

Project Manager: Monitors progress on a continuous basis and is responsible for the success of the project.

Technical Architect: Develops and implements the overall technical architecture of the BI system, from the backend hardware/software to the client desktop configurations.

Database Administrator (DBA): Keeps the database available for the applications to run smoothly, and is also involved in planning and executing a backup/recovery plan, as well as performance tuning.

ETL Developer: Plans, develops, and deploys the extraction, transformation, and loading routines for the data warehouse from the legacy systems.

Front End Developer: Develops the front end, whether it is client-server or over the web.

OLAP Developer: Develops the OLAP cubes.

Data Modeler: Is responsible for taking the data structures that exist in the enterprise and modeling them into a schema that is suitable for OLAP analysis.

QA Group: Ensures the correctness of the data in the data warehouse.

Trainer: Works with the end users to familiarize them with how the front end is set up, so that they can get the most benefit out of the system.

Q.2 What do you mean by data warehouse? What are the major concepts and terminology used in the study of data warehouses? [10 Marks]

Q.3 What are the data modeling techniques used in a data warehousing environment? [10 Marks]

Q.4 Discuss the categories into which data is divided before structuring it into a data warehouse. [10 Marks]

Q.5 Discuss the purpose of an executive information system in an organization. [10 Marks]

Q.6 Discuss the challenges involved in the data integration and coordination process. [10 Marks]


Master of Business Administration - MBA Semester III
MI0036 – Business Intelligence Tools - 4 Credits
Assignment - Set- 2 (60 Marks)

Note: Each question carries 10 Marks. Answer all the questions.

Q.1 Explain the business development life cycle in detail. [10 Marks]

The Business Intelligence (BI) lifecycle refers to the computer-based techniques used in gathering and evaluating business information, such as sales revenue by products and divisions, along with the associated prices and profits. The BI lifecycle aims to enable superior business decision-making. The lifecycle model mainly highlights the iterative approach that is necessary to effectively extract the greatest profit from an investment in Business Intelligence. The approach is iterative because the BI solution needs to progress as and when the business develops. A quick example is a customer who developed a good metric, standard selling cost per unit. This made sense and enabled them to discover price demands in their market and take suitable action. On the other hand, when they acquired a company whose standard selling cost was 100 times greater, the metric became distorted and required a second thought.

The BI lifecycle model begins with the Design phase, which chooses suitable key performance indicators and implements the structure with proper methodologies and equipment. The Utilise phase involves its own performance cycle: the Plan step decides what value the key performance indicators should have, and the Monitor step observes what they actually are. The Analyse step recognizes the variations. Monitoring uses dashboards and scorecards to present the information quickly and clearly, while analysis takes advantage of BI tools such as Microsoft Excel, ProClarity, QlikView or other professional software to investigate the information and genuinely understand trends in the essential data. A significant step is then to take stock of the procedure: to find out how it is performing and whether it requires any alterations. The Refine phase is critical as it takes BI to the next significant phase, and is often missing from a BI program in the excitement of a successful execution.

The Business Intelligence lifecycle acts similarly to the Data Maturity lifecycle in offering a complete maturation model for information intelligence. This model also gives a roadmap of rules that help in the development of a business intelligence program. A BI program is a long-term scheme, not a short-term one. As the business keeps changing, and the knowledge that supports analytics improves, this lifecycle can repeat and form a complete new performance management cycle.

Note: Key performance indicators (KPIs) are very significant factors in business intelligence operation. A KPI is an aspect that gives information about the present status of the business and the future action plan to develop it.

Q.2 Discuss the various components of a data warehouse. [10 Marks]

The data warehouse architecture is based on a relational database management system server that functions as the central repository for informational data. Operational data and processing are completely separated from data warehouse processing. This central information repository is surrounded by a number of key components designed to make the entire environment functional, manageable and accessible, both by the operational systems that source data into the warehouse and by end-user query and analysis tools.
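The Plan, Monitor, and Analyse steps of the Utilise phase can be sketched around a KPI such as the standard selling cost per unit mentioned above. The following is a minimal, hypothetical sketch in Python; the sales records, target, and tolerance are illustrative assumptions, not values from the text:

```python
# Hypothetical KPI cycle: Plan sets a target, Monitor computes the
# actual value, Analyse flags variations that exceed a tolerance.

def average_selling_price(sales):
    """Monitor: total revenue divided by total units sold."""
    revenue = sum(s["revenue"] for s in sales)
    units = sum(s["units"] for s in sales)
    return revenue / units

def analyse_kpi(actual, target, tolerance=0.10):
    """Analyse: 'on track' if actual is within tolerance of the target."""
    variance = (actual - target) / target
    return "on track" if abs(variance) <= tolerance else "variation"

sales = [
    {"revenue": 5000.0, "units": 100},   # illustrative transactions
    {"revenue": 2600.0, "units": 50},
]
asp = average_selling_price(sales)
print(analyse_kpi(asp, target=50.0))
```

Note how the acquired-company scenario above would surface here: feeding in a selling cost 100 times greater than the target immediately reports a "variation", signalling that the metric needs a second thought.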


Data Warehouse Database
The central data warehouse database is the cornerstone of the data warehousing environment. This database is almost always implemented on relational database management system (RDBMS) technology. Approaches used to meet warehouse requirements include:

Parallel relational database designs for scalability that include shared-memory, shared disk, or shared-nothing models implemented on various multiprocessor configurations (symmetric multiprocessors or SMP, massively parallel processors or MPP, and/or clusters of uni- or multiprocessors).

An innovative approach to speed up a traditional RDBMS by using new index structures to bypass relational table scans.

Multidimensional databases (MDDBs) that are based on proprietary database technology; alternatively, a dimensional data model can be implemented using a familiar RDBMS. Multi-dimensional databases are designed to overcome any limitations placed on the warehouse by the nature of the relational data model.

Sourcing, Acquisition, Cleanup and Transformation Tools
A significant portion of the implementation effort is spent extracting data from operational systems and putting it in a format suitable for informational applications that run off the data warehouse. The data sourcing, cleanup, transformation and migration tools perform all of the conversions, summarizations, key changes, structural changes and condensations needed to transform disparate data into information that can be used by the decision support tool. These tools also maintain the meta data. The functionality includes:

Removing unwanted data from operational databases
Converting to common data names and definitions
Establishing defaults for missing data
Accommodating source data definition changes
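The cleanup steps listed above can be sketched minimally in Python. The column mapping, default values, and sample records below are illustrative assumptions, not part of any particular tool:

```python
# Hypothetical cleanup/transformation routine: drop unusable rows,
# map source-specific column names to common warehouse names, and
# establish defaults for missing data.

COMMON_NAMES = {"cust_no": "customer_id", "amt": "amount"}  # assumed mapping
DEFAULTS = {"region": "UNKNOWN"}                            # assumed default

def cleanse(records):
    cleaned = []
    for row in records:
        if row.get("amt") is None:        # remove unwanted/unusable rows
            continue
        # convert to common data names and definitions
        out = {COMMON_NAMES.get(k, k): v for k, v in row.items()}
        for col, default in DEFAULTS.items():   # defaults for missing data
            out.setdefault(col, default)
        cleaned.append(out)
    return cleaned

source = [
    {"cust_no": 1, "amt": 250.0, "region": "EMEA"},
    {"cust_no": 2, "amt": None},          # dropped: no usable amount
    {"cust_no": 3, "amt": 99.0},          # region will be defaulted
]
print(cleanse(source))
```

A real tool would drive these rules from the meta data rather than hard-coded dictionaries, but the shape of the work is the same.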

The data sourcing, cleanup, extract, transformation and migration tools have to deal with some significant issues including:

Database heterogeneity. DBMSs are very different in data models, data access language, data navigation, operations, concurrency, integrity, recovery etc.

Data heterogeneity. This is the difference in the way data is defined and used in different models - homonyms, synonyms, unit compatibility (U.S. vs metric), different attributes for the same entity and different ways of modeling the same fact.

These tools can save a considerable amount of time and effort. However, significant shortcomings do exist.

Meta Data
Meta data is data about data that describes the data warehouse. It is used for building, maintaining, managing and using the data warehouse. Meta data can be classified into:

Technical meta data, which contains information about warehouse data for use by warehouse designers and administrators when carrying out warehouse development and management tasks.

Business meta data, which contains information that gives users an easy-to-understand perspective of the information stored in the data warehouse.

Access Tools
The principal purpose of data warehousing is to provide information to business users for strategic decision-making. These users interact with the data warehouse using front-end tools. Many of these tools require an information specialist, although many end users develop expertise in the tools. Tools fall into four main categories: query and reporting tools, application development tools, online analytical processing tools, and data mining tools.

Query and reporting tools can be divided into two groups: reporting tools and managed query tools. Reporting tools can be further divided into production reporting tools and report writers. Production reporting tools let companies generate regular operational reports or support high-volume batch jobs such as calculating and printing paychecks. Report writers, on the other hand, are inexpensive desktop tools designed for end users. A critical success factor for any business today is the ability to use information effectively. Data mining is the process of discovering meaningful new correlations, patterns and trends by digging into large amounts of data stored in the warehouse, using artificial intelligence, statistical and mathematical techniques.

Data Marts
As data warehouses come to contain larger amounts of data, organizations often create 'data marts' that are precise and specific to a department or product line. A data mart is thus a physical and logical subset of an enterprise data warehouse, and is also termed a department-specific data warehouse. Generally, data marts are organized around a single business process. There are two types of data marts: independent and dependent. The data is fed directly from the legacy systems in the case of an independent data mart, and from the enterprise data warehouse in the case of a dependent data mart. In the long run, dependent data marts are much more stable architecturally than independent data marts.

Q.3 Discuss the data extraction process. What are the various methods used for data extraction? [10 Marks]

Data extraction is the act or process of extracting data out of data sources, which are usually unstructured or badly structured, for further data processing, data storage or data migration. This data can be extracted from the web. Internet pages in HTML, XML, etc. can be considered unstructured data sources because of the wide variety of coding styles, including exceptions to and violations of standard coding practices.

Logical Extraction Methods
There are two kinds of logical extraction methods:
Full Extraction
Incremental Extraction

Full Extraction
In full extraction the data is extracted totally from the source system. Since this extraction reflects all the data presently available on the source system, there is no need to keep track of changes to the data source since the previous successful extraction. The source data is given as it is, and no additional logical information (for example, timestamps) is required on the source site.

Incremental Extraction
At a particular point in time, only the data that has been altered since a well-defined event back in history is extracted. This event might be the last time of extraction, or a more complex business event like the last booking day of a fiscal period. To recognize this delta change, there must be a way to identify all the information changed since that particular time event. This information can either be provided by the source data itself, such as an application column reflecting the last-changed timestamp, or by a change table where an appropriate additional mechanism keeps track of the changes apart from the originating transactions. In most cases, using the latter method means adding extraction logic to the source system.

Physical Extraction Methods
The data can be extracted either online from the source system or from an offline structure. Such an offline structure may already exist, or it may be created by an extraction routine. The following are the methods of physical extraction:
Online Extraction
Offline Extraction

Online Extraction
In online extraction the data is extracted directly from the source system itself. The extraction process can connect directly to the source system to access the source tables themselves, or to an intermediate system that keeps the data in a preconfigured manner, for example snapshot logs or change tables.

Offline Extraction
The data is not extracted directly from the source system but is staged explicitly outside the original source system. The data already has an existing structure, for example redo logs, archive logs or transportable tablespaces. The following structures can be considered:

Flat files: In flat files the data is in a defined, generic format. Additional information about the source object is required for further processing.

Dump files: In dump files the information about the containing objects is included.

Redo and archive logs: In redo and archive logs the information is in a special, additional dump file.

Transportable tablespaces: Transportable tablespaces are a powerful way to extract and move large volumes of data between Oracle databases. Oracle Corporation suggests that transportable tablespaces be used whenever possible, because they can provide significant advantages in performance and manageability over the other extraction techniques.
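The incremental extraction described earlier, keyed off a last-changed timestamp column, can be sketched as follows. The row structure, column names, and bookmark handling are illustrative assumptions:

```python
from datetime import datetime

# Hypothetical incremental (delta) extraction: pull only the rows
# changed since the last successful extraction, then advance the
# bookmark for the next run.

def extract_incremental(rows, last_extracted):
    """Return rows whose last_changed timestamp is after the bookmark,
    plus the new bookmark value to record for the next run."""
    delta = [r for r in rows if r["last_changed"] > last_extracted]
    new_bookmark = max((r["last_changed"] for r in delta),
                       default=last_extracted)
    return delta, new_bookmark

source_rows = [
    {"id": 1, "last_changed": datetime(2024, 1, 1)},
    {"id": 2, "last_changed": datetime(2024, 1, 5)},
    {"id": 3, "last_changed": datetime(2024, 1, 9)},
]
delta, bookmark = extract_incremental(source_rows, datetime(2024, 1, 3))
print([r["id"] for r in delta])   # only rows changed after the bookmark
```

A full extraction, by contrast, would simply take every row and never consult the bookmark.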

Q.4 Discuss the need for developing OLAP tools in detail. [10 Marks]

Online Analytical Processing (OLAP) is the ability to store and manage data in a way that allows it to be used effectively to generate actionable information. OLAP sits between the data warehouse and the end-user tools. OLAP makes Business Intelligence happen by enabling the following:

Changing the data into multi-dimensional cubes.
Summarizing the pre-aggregated and delivered data.
Establishing strong query management.
Providing a multitude of calculation and modeling functions.

The following describes the place of OLAP within the BI architecture:

Multi-Dimensional Online Analytical Processing (MOLAP): OLAP data is stored in multi-dimensional form; there is one array for each combination of dimensions and the associated measures. In this method there is no link between the MOLAP database and the data warehouse database for query purposes. This means that a user cannot drill down from the MOLAP summary data to the transaction-level data of the data warehouse.

Relational Online Analytical Processing (ROLAP): OLAP data is stored in relational form in a dimensional model, that is, in a de-normalised relational data structure. The ROLAP database of the OLAP server can be linked to the data warehouse database.

Hybrid Online Analytical Processing (HOLAP): The aggregated data is stored in a multi-dimensional model in the OLAP database, while the transaction-level data is kept in relational form in the data warehouse database. There is a link between the summary MOLAP database of OLAP and the relational transactional database of the data warehouse.

OLAP Defined
OLAP can be stated in terms of just five keywords: Fast Analysis of Shared Multidimensional Information. Fast means that even the most complex queries should be processed in not more than five seconds. Analysis is the process of analysing all relevant kinds of information, in order to process complex queries and to set up clear criteria for the results of such queries. The information used for analysis is normally obtained from a shared source, such as a data warehouse. Presented in multidimensional detail, such data can be useful and important to managerial decision-making.

OLAP Techniques
Online analytical processing can be implemented in many different ways, but the most common way is to stage the information obtained from various corporate databases, for example data warehouses; that is, it is stored temporarily in OLAP multi-dimensional databases for retrieval by the front-end systems. The multidimensional database can be optimized for fast retrieval, and several techniques for speeding up data retrieval and analysis are implemented on the procedural side of database management. OLAP can be implemented by using the following techniques:

Consolidation or Roll-up

Consolidation involves data aggregation, which can range from simple roll-ups to complex groupings of interrelated data. For example, sales offices can be rolled up to districts, and districts rolled up to regions.

Drill-down

OLAP can also go in the reverse direction and display the detailed data of which consolidated data consists. This is known as drill-down. For example, the sales by individual products, or by the sales representatives that make up a region's total sales, can be easily accessed.

Slicing and dicing

Slicing and dicing refers to the ability to look at the database from different viewpoints. One slice of the sales database may show all sales of a product type within a region. Another slice may show all sales by sales channel within each product type. Slicing and dicing is often performed along the time axis in order to analyse trends and find patterns.

Benefits of using OLAP
OLAP provides several benefits for businesses:

OLAP helps managers in decision-making through the multidimensional data views it is capable of giving, hence increasing their productivity.

OLAP applications are self-sufficient due to the inbuilt flexibility given to the organised databases.

OLAP allows simulation of the business models and problems, through the widespread use of analysis capabilities.

OLAP in conjunction with data warehousing can be used to reduce the application backlog, leading to faster information retrieval and a reduction in query drag.
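The roll-up, drill-down, and slicing operations described above can be sketched over a tiny in-memory "cube". The dimension values and sales figures below are illustrative assumptions:

```python
# Hypothetical OLAP operations over a small sales cube keyed by
# (region, district, product) -> sales amount.

cube = {
    ("West", "Pune",   "laptop"): 120,
    ("West", "Mumbai", "laptop"): 200,
    ("West", "Mumbai", "phone"):  80,
    ("North", "Delhi", "phone"):  90,
}

def roll_up(cube, axis):
    """Consolidation: aggregate totals along one dimension
    (0=region, 1=district, 2=product)."""
    totals = {}
    for key, amount in cube.items():
        totals[key[axis]] = totals.get(key[axis], 0) + amount
    return totals

def slice_cube(cube, region):
    """One slice: all cells belonging to a single region."""
    return {k: v for k, v in cube.items() if k[0] == region}

# Roll-up: district and product figures consolidated into regions.
print(roll_up(cube, 0))
# Drill-down is the reverse direction: from a region total back to
# the detailed cells that make it up.
print(slice_cube(cube, "West"))
```

A production OLAP server pre-aggregates and indexes such cells rather than scanning them, but the operations it exposes are exactly these.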

Q.5 What do you understand by the term statistical analysis? Discuss the most important statistical techniques. [10 Marks]

Data mining is a relatively new data analysis technique. It is very different from query-and-reporting and multidimensional analysis in that it uses what is called a discovery technique. That is, you do not ask a particular question of the data, but rather use specific algorithms that analyze the data and report what they have discovered. Unlike query-and-reporting and multidimensional analysis, where the user has to create and execute queries based on hypotheses, data mining searches for answers to questions that may not have been previously asked. This discovery could take the form of finding significance in relationships between certain data elements, a clustering together of specific data elements, or other patterns in the usage of specific sets of data elements. After finding these patterns, the algorithms can infer rules. These rules can then be used to generate a model that can predict a desired behavior, identify relationships among the data, discover patterns, and group clusters of records with similar attributes.

Data mining is most typically used for statistical data analysis and knowledge discovery. Statistical data analysis detects unusual patterns in data and applies statistical and mathematical modeling techniques to explain the patterns. The models are then used to forecast and predict. Types of statistical data analysis techniques include linear and nonlinear analysis, regression analysis, multivariate analysis, and time series analysis. Knowledge discovery extracts implicit, previously unknown information from the data. This often results in uncovering unknown business facts.

Data mining is data driven (see Figure 4 on page 13). There is a high level of complexity in stored data and data interrelations in the data warehouse that are difficult to discover without data mining. Data mining offers new insights into the business that may not be discovered with query-and-reporting or multidimensional analysis. Data mining can help discover new insights about the business by giving us answers to questions we might never have thought to ask.

Even within the scope of your data warehouse project, when mining data you want to define a data scope, or possibly multiple data scopes. Because patterns are based on various forms of statistical analysis, you must define a scope in which a statistically significant pattern is likely to emerge. For example, buying patterns that show different products being purchased together may differ greatly in different geographical locations. To simply lump all of the data together may hide the patterns that exist in each location. Of course, by imposing such a scope you are defining some, though not all, of the business rules. It is therefore important that data scoping be done in concert with someone knowledgeable in both the business and in statistical analysis, so that artificial patterns are not imposed and real patterns are not lost. Data architecture modeling and advanced modeling techniques, such as those suitable for multimedia databases and statistical databases, are beyond the scope of this discussion.
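As one concrete instance of the statistical techniques mentioned (regression analysis), a least-squares linear fit can be sketched in plain Python. The data points below are illustrative, not warehouse data:

```python
# Hypothetical simple linear regression (least squares) -- one of the
# statistical techniques a mining tool might apply to explain a
# pattern and then forecast from it.

def linear_fit(xs, ys):
    """Return slope and intercept of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# e.g. advertising spend vs. sales, in arbitrary units
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope, intercept = linear_fit(xs, ys)
print(slope, intercept)   # a perfectly linear relationship: 2.0 0.0
```

The fitted model (here y = 2x) is exactly the kind of artifact used afterwards to forecast and predict.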

Q.6 What are the methods for determining executive needs? [10 Marks]

An Executive Information System (EIS) is a set of management tools supporting the information and decision-making needs of management, by combining information available within the organisation with external information in an analytical framework. EIS are targeted at management's need to quickly assess the status of a business or a section of the business. These packages are aimed firmly at the type of business user who needs an instant and up-to-date understanding of critical business information to aid decision making.

The idea behind an EIS is that information can be collated and displayed to the user without manipulation or further processing. The user can then quickly see the status of his chosen department or function, enabling him to concentrate on decision making. Generally an EIS is configured to display data such as order backlogs, open sales, purchase order backlogs, shipments, receipts and pending orders. This information can then be used to make executive decisions at a strategic level.

The emphasis of the system as a whole is the easy-to-use interface and the integration with a variety of data sources. It offers strong reporting and data mining capabilities which can provide all the data the executive is likely to need. Traditionally the interface was menu driven, with either reports or text presentation. Newer systems, and especially the newer Business Intelligence systems which are replacing EIS, have a dashboard- or scorecard-type display.

Before these systems became available, decision makers had to rely on disparate spreadsheets and reports, which slowed down the decision-making process. Now massive amounts of relevant information can be accessed in seconds. The two main aspects of an EIS are integration and visualisation. The newest methods of visualisation are the Dashboard and the Scorecard. The Dashboard is one screen that presents key data and organisational information on an almost real-time and integrated basis. The Scorecard is another one-screen display, with measurement metrics which can give a percentile view of whatever criteria the executive chooses.

Behind these two front-end screens can be an immense data processing infrastructure, or a couple of integrated databases, depending entirely on the organisation that is using the system. The backbone of the system is traditional server hardware and a fast network. The EIS software itself is run from here and presented to the executive over this network. The databases need to be fully integrated into the system and have real-time connections both in and out. This information then needs to be collated, verified, processed and presented to the end user, so a real-time connection into the EIS core is necessary.

Executive Information Systems come in two distinct types: ones that are data driven, and ones that are model driven. Data-driven systems interface with databases and data warehouses. They collate information from different sources and present it to the user in an integrated, dashboard-style screen. Model-driven systems use forecasting, simulations and decision-tree-like processes to present the data.

As with any emerging and progressive market, service providers are continually improving their products and offering new ways of doing business. Modern EIS systems can also present industry trend information and competitor behaviour trends if needed. They can filter and analyse data; create graphs, charts and scenario generations; and offer many other options for presenting data. There are a number of ways to link decision making to organisational performance. From a decision maker's perspective these tools provide an excellent way of viewing data. Outcomes displayed include single metrics, trend analyses, demographics, market shares and a myriad of other options. The simple interface makes it quick and easy to navigate and call up the information required.

For a system that seems to offer business so much, it is used by relatively few organisations. Current estimates indicate that as few as 10% of businesses use EIS systems. One of the reasons for this is the complexity of the system and support infrastructure. It is difficult to create such a system and populate it effectively. Combining all the necessary systems and data sources can be a daunting task, and seems to put many businesses off implementing an EIS. The system vendors have addressed this issue by offering turnkey solutions for potential clients. Companies like Actuate and Oracle are both offering complete out-of-the-box Executive Information Systems, and these aren't the only ones. Expense is also an issue. Once the initial cost is calculated, there is the additional cost of support infrastructure, training, and the means of making the company data meaningful to the system.

Does an EIS warrant all of this expense? Greene King certainly thinks so. They installed a Cognos system in 2003, and their first few reports illustrated business opportunities in excess of £250,000. The AA is also using a Business Objects variant of an EIS system, and they expect a return of 300% in three years. (Guardian, 31/7/03)

An effective Executive Information System isn't something you can just set up and leave to do its work. Its success depends on the support and the timely, accurate data it receives to be able to provide something meaningful. It can provide the information executives need to make educated decisions quickly and effectively. An EIS can provide a competitive edge to business strategy that can pay for itself in a very short space of time.
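The one-screen Scorecard idea described above (metrics shown as a percentage view against whatever criteria the executive chooses) can be sketched minimally as follows. The metric names, actuals, and targets are illustrative assumptions, not figures from the systems mentioned above:

```python
# Hypothetical EIS scorecard: express each key metric as a
# percentage of its target so an executive sees status at a glance.

def scorecard(actuals, targets):
    """Return {metric: percent-of-target} for each chosen metric."""
    return {m: round(100.0 * actuals[m] / targets[m], 1)
            for m in targets}

actuals = {"open_sales": 90, "shipments": 240, "pending_orders": 55}
targets = {"open_sales": 100, "shipments": 200, "pending_orders": 50}
print(scorecard(actuals, targets))
```

In a real EIS the actuals would arrive over the real-time connections into the EIS core described above; the one-screen presentation is what makes the status readable at a glance.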