Data virtualization

Three reasons why the time has come for Data Virtualization to play a pivotal role in Data Management

Of late, data management challenges have been growing rapidly. The reasons are many: the need for quick responses; ever-increasing volumes of data; different types of data (text, objects, etc.); and a growing variety of sources (in addition to traditional sources such as ERPs and legacy systems, there are now sources such as social media, web logs and sensors). On top of this, the number and variety of tools that handle data is also increasing day by day. Enterprises need to continuously evaluate and explore better data management technologies, processes and frameworks to handle these challenges and deliver business value.

Data Virtualization is well suited to address most of these challenges: it can provide a unified and secure data access layer, business agility in meeting customer needs, and data as a service. As organizations look at how to deliver a real-time, fully agile, integrated and secure data platform to support applications, they think of data virtualization. Between Forrester's previous evaluation and its most recent one, data virtualization vendors have improved their security, scalability, big data, data discovery, data quality and cloud capabilities. Data virtualization can also support transactions that write back to the data sources.

Hybrid Data Storage:

With exponentially increasing volumes of data, it is becoming difficult for Data Warehouses (DW) to integrate, store and handle large volumes of data in conventional DW storage, which is normally an RDBMS-based SQL system. This results in poor data integration and data retrieval performance, and in expensive data storage and administration. We can handle this situation by keeping only a limited amount of data in the DW, i.e. by either removing data or moving portions of it to offline storage. In both scenarios, quick historical analysis of the data is not possible. The third option is to move some of the data to an inexpensive online storage system such as the Hadoop ecosystem, which can handle large volumes of data at very low price points. But the way you interface with a Hadoop system is completely different from the way you interface with a normal SQL database. That is where a Data Virtualization layer comes in: it can connect to both the Hadoop system and the SQL databases and provide unified and complete access to the data for the reporting layer. In this scenario, the DW is not one single, uniform store but consists of multiple, different types of storage. This also eases cloud migration, because the reporting layer is insulated from where the data physically resides.
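The hybrid storage idea can be illustrated with a minimal sketch. It is not taken from any data virtualization product: an in-memory SQLite database stands in for the SQL warehouse, a plain Python list stands in for rows archived in a Hadoop-style store, and the names `VirtualSales`, `make_warehouse`, `ARCHIVE_ROWS` and `cutoff_year` are all hypothetical. The point is only that the reporting side calls one `query` method and never needs to know which store holds which years.

```python
import sqlite3


def make_warehouse() -> sqlite3.Connection:
    """Hypothetical stand-in for the SQL data warehouse holding recent rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(2023, "EU", 120.0), (2024, "EU", 150.0), (2024, "US", 200.0)],
    )
    return conn


# Hypothetical stand-in for older rows moved to a Hadoop-style archive.
ARCHIVE_ROWS = [
    {"year": 2015, "region": "EU", "amount": 80.0},
    {"year": 2016, "region": "US", "amount": 95.0},
]


class VirtualSales:
    """Presents warehouse and archive rows as one logical 'sales' dataset."""

    def __init__(self, warehouse, archive_rows, cutoff_year):
        self.warehouse = warehouse
        self.archive_rows = archive_rows
        # Assumption: rows older than cutoff_year live only in the archive.
        self.cutoff_year = cutoff_year

    def query(self, from_year, to_year):
        rows = []
        # Touch the archive only when the requested range reaches into it.
        if from_year < self.cutoff_year:
            rows += [
                (r["year"], r["region"], r["amount"])
                for r in self.archive_rows
                if from_year <= r["year"] <= to_year
            ]
        # Touch the warehouse only when the range includes recent years.
        if to_year >= self.cutoff_year:
            cursor = self.warehouse.execute(
                "SELECT year, region, amount FROM sales "
                "WHERE year BETWEEN ? AND ? ORDER BY year, region",
                (from_year, to_year),
            )
            rows += cursor.fetchall()
        return rows


if __name__ == "__main__":
    layer = VirtualSales(make_warehouse(), ARCHIVE_ROWS, cutoff_year=2020)
    for row in layer.query(2015, 2024):
        print(row)
```

A real virtualization layer would push filters down to each source and speak SQL to the consumer; this sketch only shows the routing decision that lets historical and recent data be analysed together.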

Real Time Data Integration & Virtual Data Marts/Applications:

With ever-increasing volumes, data sharing, collaboration and widespread data movement are a major challenge. Traditional data marts/applications (which are tied to direct physical storage) also have many issues. The first is that they are largely inflexible: when these objects need to be extended or changed, all the ETL programs behind them need to be changed, the data needs to be recast, and so on. There are data quality issues as well: in our experience, organizations tend to build many overlapping objects like these over time, so maintaining a single version of the truth is difficult. With virtual data marts/applications, which Data Virtualization makes possible, these issues can be avoided while providing the same level of functionality to the reporting layer. All data storage objects, such as tables, are defined as virtual objects/views in the virtualization layer. The performance of the virtual objects can be improved with techniques such as caching. The logical structure is separated from the physical structure and can leverage advances in techniques such as real-time and push-down query optimization, selective ETL, in-memory caching and distributed in-memory processing, thus reducing or eliminating the wasteful full replication of traditional ETL.
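As a rough illustration of a virtual data mart, the sketch below defines the mart as a SQL view over two source tables instead of copying data into a new physical table, and puts a small time-based cache in front of it. It assumes SQLite as the only source, and the names `build_sources`, `revenue_by_region` and `CachedView` are made up for this example; commercial virtualization layers offer their own view and caching mechanisms.

```python
import sqlite3
import time


def build_sources() -> sqlite3.Connection:
    """Create two hypothetical source tables and a view acting as the virtual mart."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
        INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0);
        CREATE VIEW revenue_by_region AS
            SELECT c.region AS region, SUM(o.amount) AS revenue
            FROM orders AS o JOIN customers AS c ON c.id = o.customer_id
            GROUP BY c.region;
        """
    )
    return conn


class CachedView:
    """Serves a view's rows from memory until the cache expires or is invalidated."""

    def __init__(self, conn, view_name, ttl_seconds=60.0):
        self.conn = conn
        self.view_name = view_name  # trusted identifier, never user input
        self.ttl_seconds = ttl_seconds
        self._rows = None
        self._loaded_at = 0.0

    def rows(self):
        expired = time.monotonic() - self._loaded_at > self.ttl_seconds
        if self._rows is None or expired:
            # The view is evaluated against the sources only on a cache miss.
            query = f"SELECT region, revenue FROM {self.view_name} ORDER BY region"
            self._rows = self.conn.execute(query).fetchall()
            self._loaded_at = time.monotonic()
        return self._rows

    def invalidate(self):
        """Force the next rows() call to re-read the view."""
        self._rows = None


if __name__ == "__main__":
    mart = CachedView(build_sources(), "revenue_by_region")
    print(mart.rows())
```

Changing the mart here means changing one view definition; there is no ETL job to rewrite and no stored copy to reload, which is the flexibility argument made above. The trade-off is that cached results can be stale until the cache expires or is invalidated.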

Common Data Layer with data as a service:

C-level leadership (in particular CIOs and CDOs) looks not only at short-term tactical needs but also at future needs (both strategic and tactical) when it comes to satisfying the data needs of the organization's different stakeholders. Data Virtualization addresses this concern by exposing the required data through a common, unified data layer in the form of data as a service, while still satisfying short-term data integration and reporting needs. Data is treated as an asset. This unified layer helps business and technical users with quick consumption, navigation, discovery, lineage analysis and so on. It also helps maintain high levels of security, because data is not fragmented into too many silos and access can be controlled through a centralized mechanism.
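The centralized access mechanism can be sketched in a few lines. This is an illustration under assumptions, not a description of any product: `DataService`, its `register`, `grant` and `read` methods, and the role and dataset names are all hypothetical. Every consumer reads through the same service, so a single policy table decides which columns each role may see.

```python
from dataclasses import dataclass, field


@dataclass
class DataService:
    """Hypothetical common data layer with one central access policy."""

    datasets: dict = field(default_factory=dict)
    # Maps (role, dataset name) to the set of columns that role may read.
    policies: dict = field(default_factory=dict)

    def register(self, name, rows):
        """Expose a dataset (a list of dict rows) through the common layer."""
        self.datasets[name] = rows

    def grant(self, role, dataset, columns):
        """Allow a role to read the given columns of a dataset."""
        self.policies[(role, dataset)] = set(columns)

    def read(self, role, dataset):
        """Return the dataset with only the columns the role is allowed to see."""
        allowed = self.policies.get((role, dataset))
        if allowed is None:
            raise PermissionError(f"role {role!r} may not read {dataset!r}")
        return [
            {key: value for key, value in row.items() if key in allowed}
            for row in self.datasets[dataset]
        ]


if __name__ == "__main__":
    service = DataService()
    service.register(
        "customers",
        [{"id": 1, "name": "Asha", "email": "asha@example.com", "region": "EU"}],
    )
    service.grant("analyst", "customers", ["id", "region"])
    service.grant("support", "customers", ["id", "name", "email"])
    print(service.read("analyst", "customers"))
```

Because the policy lives in one place rather than in each silo, tightening or auditing access means changing or inspecting a single table.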

In the coming years, data virtualization can be taken to the next level through vertical integration, i.e. integrating it with the end-to-end physical infrastructure (servers, network, etc., including the cloud). This should result in effective data movement, completely virtualized environments and minimal performance-related issues.

Disclaimer

All the information provided above is based on my understanding of the subject and on information I have taken from different surveys (as indicated above). The accuracy of the information reflects my current understanding. As time passes, some concepts may change or evolve further. If you find any incorrect information, please feel free to contact me or send me an email.