praveen-reddy
Three reasons why the time has come for Data Virtualization to play a pivotal role in
Data Management
Of late, data management challenges have been growing rapidly. The reasons are many:
the need for quick response, ever-increasing volumes of data, different types of data (text, objects, etc.),
and too many types of sources (in addition to traditional sources such as ERPs and legacy systems, there
are sources such as social media, web logs, and sensors). Apart from this, the number and variety of tools
that handle data is also increasing day by day. Enterprises need to continuously evaluate and
explore better data management technologies, processes, and frameworks to handle these challenges and
provide business value.
Data Virtualization is well suited to address most of these challenges: it can provide a unified and
secure data access layer, business agility in meeting customer needs, and data as a
service. As organizations look at how to deliver a real-time, fully agile, integrated, and secure data
platform to support applications, they think of data virtualization. Between Forrester’s previous evaluation
and its most recent one, data virtualization vendors have improved their security, scalability, big data,
data discovery, data quality, and cloud capabilities. Data virtualization can also support transactions that
write back to the data sources.
Hybrid Data Storage:
With rapidly increasing volumes of data, it is becoming difficult for Data Warehouses
(DW) to integrate, store, and handle large volumes of data in conventional DW storage, which is
normally an RDBMS-based SQL system. This results in poor performance in data integration and data
retrieval, and in expensive data storage and administration. We can handle this situation by keeping a limited
amount of data in the DW, either by removing data or by moving portions of it to offline storage.
In both scenarios, quick historical analysis of the data is not possible. The third option is to move some
of the data to an inexpensive online storage system such as the Hadoop ecosystem, which can handle large
volumes of data at very low price points. But the way you interface with a Hadoop system
is completely different from the way you interface with a normal SQL database. That is where a Data
Virtualization layer helps: it can connect to both the Hadoop system and the SQL databases and provide unified and
complete access to the data for the reporting layer. In this scenario, the DW is not one single, uniform
store but consists of multiple, different types of storage. The same approach can ease cloud migration
as well.
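The hybrid setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular data virtualization product works: an in-memory SQLite database stands in for the warehouse, a plain list of records stands in for the low-cost archive (for example, files on Hadoop), and the class name VirtualSalesView, the sales table, and all sample figures are made up for the example.

```python
import sqlite3

# Hypothetical stand-in for the warehouse: recent data in a SQL database.
def make_warehouse():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(2023, "EU", 120.0), (2024, "EU", 150.0), (2024, "US", 200.0)],
    )
    return conn

# Hypothetical stand-in for the archive: older data kept outside the warehouse.
ARCHIVE_ROWS = [
    {"year": 2015, "region": "EU", "amount": 80.0},
    {"year": 2016, "region": "US", "amount": 95.0},
]

class VirtualSalesView:
    """Presents warehouse and archive rows as one logical sales data set."""

    def __init__(self, warehouse, archive_rows):
        self.warehouse = warehouse
        self.archive_rows = archive_rows

    def total_by_region(self, region):
        # Push the filter and the aggregation down to the SQL source.
        (recent,) = self.warehouse.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
            (region,),
        ).fetchone()
        # The archive stand-in has no query engine, so filter it in the layer.
        historical = sum(
            row["amount"] for row in self.archive_rows if row["region"] == region
        )
        return recent + historical

view = VirtualSalesView(make_warehouse(), ARCHIVE_ROWS)
eu_total = view.total_by_region("EU")
us_total = view.total_by_region("US")
print(eu_total, us_total)
```

The reporting code only calls total_by_region and never needs to know which store holds which years, which is the point of the unified access layer.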
Real Time Data Integration & Virtual Data Marts/Applications:
With ever-increasing volumes, data sharing, collaboration, and widespread data movement are a
major challenge. Traditional data marts/applications (which are tied to direct physical
storage) also have many issues. The first is that they are largely inflexible. When these objects need to be
extended or changed, all the ETL programs behind them need to be changed, data needs to be
recast, and so on. There are data quality issues as well: our experience shows that
organizations tend to build many overlapping objects like these over time, so maintaining a single
version of the truth is difficult. With virtual data marts/applications, which are possible using Data
Virtualization, these issues can be avoided while providing the same level of functionality to the
reporting layer. All the data storage objects, such as tables, are defined as virtual objects/views in the
virtualization layer. The performance of the virtual objects can be improved by applying techniques such as
caching. The logical structure is separated from the physical structure, and the layer can leverage advances in
techniques such as real-time and push-down query optimization, selective ETL, in-memory caching, and
distributed in-memory processing, thus reducing or eliminating the wasteful full replication of traditional ETL.
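A virtual data mart with caching can be illustrated with a small Python sketch. It assumes an in-memory SQLite database as the only source, uses an ordinary SQL view as the virtual object, and adds a deliberately naive in-process cache; the names orders, customer_revenue, and read_mart are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders (customer, amount) VALUES
        ('acme', 100.0), ('acme', 50.0), ('globex', 75.0);
    -- The mart is only a definition; no rows are copied into it.
    CREATE VIEW customer_revenue AS
        SELECT customer, SUM(amount) AS revenue
        FROM orders GROUP BY customer;
""")

_cache = {}

def read_mart(name, use_cache=True):
    """Return the rows of a virtual object, serving repeat reads from memory."""
    if use_cache and name in _cache:
        return _cache[name]
    # In this sketch, name is a trusted, hard-coded identifier.
    rows = conn.execute(f"SELECT * FROM {name} ORDER BY 1").fetchall()
    _cache[name] = rows
    return rows

first = read_mart("customer_revenue")
conn.execute("INSERT INTO orders (customer, amount) VALUES ('globex', 25.0)")
cached = read_mart("customer_revenue")                  # served from the cache
fresh = read_mart("customer_revenue", use_cache=False)  # re-evaluates the view
print(first, cached, fresh)
```

Changing the mart means changing one view definition rather than an ETL job and a physical table. The trade-off is visible in the last three reads: the cached result is fast but stale until it is refreshed.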
Common Data Layer with data as a service:
C-level leaders (in particular CIOs and CDOs) look not only at short-term tactical
needs but also at future needs (both strategic and tactical) when it comes to satisfying the data needs of
the organization's different stakeholders. Data Virtualization can address this concern by
exposing the required data through a common, unified data layer in the form of data as a service,
while still satisfying short-term data integration and reporting needs. Data is treated as an asset. This
unified layer can help business and technical users with quick consumption, navigation, discovery,
lineage analysis, and so on. It also helps in maintaining a high level of security, because data is not
fragmented into too many silos and can be controlled through a centralized access mechanism.
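The centralized access mechanism mentioned above can be pictured as a single entry point that checks a policy before returning any data. The following Python sketch is an assumption-laden illustration: the roles, the customers data set, the column lists, and the get_data function are all invented for the example and do not come from any specific product.

```python
# Hypothetical catalog: data sets exposed through the common data layer.
CATALOG = {
    "customers": [
        {"id": 1, "name": "Asha", "email": "asha@example.com"},
        {"id": 2, "name": "Ben", "email": "ben@example.com"},
    ],
}

# Hypothetical policy: which columns each role may read, per data set.
POLICY = {
    "analyst": {"customers": ["id", "name"]},
    "steward": {"customers": ["id", "name", "email"]},
}

def get_data(role, dataset):
    """Single entry point: every consumer goes through the same policy check."""
    allowed = POLICY.get(role, {}).get(dataset)
    if allowed is None:
        raise PermissionError(f"role {role!r} may not read {dataset!r}")
    # Return only the permitted columns of each row.
    return [{col: row[col] for col in allowed} for row in CATALOG[dataset]]

analyst_rows = get_data("analyst", "customers")
steward_rows = get_data("steward", "customers")
print(analyst_rows)
```

Because every request passes through one function, a policy change takes effect for all consumers at once instead of having to be repeated in each silo.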
In the coming years, data virtualization can be taken to the next level through vertical integration, i.e.
integrating it with the end-to-end physical infrastructure (servers, network, etc., including the cloud). This
could result in effective data movement, completely virtualized environments, and minimal
performance-related issues.
Disclaimer
All the information provided above is based on my understanding of the subject/concept and
on information I have taken from different surveys etc. (as indicated above). The accuracy of the
information reflects my current understanding. As time passes, some concepts may
change or evolve further. If you find any incorrect information, please feel free to contact me or drop me
an email.