DATABASE TRANSFORMATION
November 2016

ABSTRACT

Continuous IT budget reductions, high database license and maintenance prices, the upcoming end of SAP support for ORACLE, new Hadoop and in-memory database technologies, and increasing requirements for unstructured data and real-time big data analysis are pushing CIOs across all business verticals to undertake a complete transformation of their database architecture. We discuss here what this new architecture could look like and which steps are necessary to walk this route safely and master this new challenge.

Proventa AG · 2019-07-12




Overview

Over the last twenty years ORACLE has managed to remain the #1 database solution vendor on the market. The main reason for this was the great stability and transaction safety of its product. Other factors, such as alliances with key application and hardware vendors, multiple OS support, a good training concept, the large number of specialists and aggressive sales and acquisition strategies, have also helped to preserve its lead in this kind of technology.

During this period ORACLE was very careful about enhancing and upgrading its core technology. The main features of the core database system remained almost unchanged over the years, and most new technologies were added as new software layers (e.g. RAC, queuing, partitioning, Data Guard, Spatial, OLAP, In-Memory). This strategy benefited robustness and backwards compatibility, and it also helped the sales cycle by raising the price of the database through the addition of independent but critical features.

Thanks to this strategy, the quality of the core software and ORACLE's good reputation, CIOs around the world did not hesitate to buy the product and remained good customers. A vendor change would have been too risky: no cheaper, reliable alternatives were on the market, and IT budgets were generous.

However, during the last five years the situation has changed, and CIOs are, or should start, thinking about a change. This paper covers the reasons for a change, the factors to be considered, a list of key technologies that will help on the way, an example of a state-of-the-art DB architecture, a recommended methodology and a business case example.


7 reasons for a change

1 – Database license costs

In recent years, server costs have fallen dramatically while server performance has increased even more. The main factor in this cost reduction was hardware standardisation, but the acceptance of open source operating systems (Linux), open source web technology (Apache) and open source development tools (e.g. Java EE, Eclipse or Git) has also contributed to this trend.

Database license costs, on the other hand, have not decreased and are now one of the top costs of the IT infrastructure. These costs rise even further once companies start implementing spatial or in-memory data analytics features on top of the existing database architecture.

2 – SAP database and mobile support

SAP has communicated that it will support ORACLE only until 2025 (Source: SAP Roadmap). Even now, SAP HANA is mandatory for fact sheets, analytics apps and most other FIORI apps. If you want to transform your IT to meet the new mobile requirements before 2025, you will need to migrate your SAP database to SAP HANA sooner.

3 – Open source maturity

The open source movement has led to a number of alternatives to large, complex and expensive relational database management systems (RDBMSs) for addressing most enterprise data management problems. Open source RDBMSs (OSRDBMSs) have matured significantly and can now be used to replace commercial RDBMSs. Gartner stated in its April 2015 report: “The most demonstrable benefit of OSRDBMs given their increasing suitability from a technology perspective, is the TCO of these products. When skills were at a minimum, management tools were few and the software was relatively immature the TCO was not necessarily lower than those of commercial vendor offerings. That has changed to the point where we now believe that the cost of managing OSDBMSs and the availability of skills are now close to parity with those of the commercial DBMS offerings.”

A leading example of this development is EDB Postgres, an open source variant of the pioneering RDBMS project PostgreSQL, which was originally developed by Dr. Michael Stonebraker and his team at the University of California at Berkeley. Thanks to a very active open source community, this RDBMS has continued to evolve aggressively to meet the needs of business users for both analytics and transaction support.


Figure: RDBMS Maturity Evaluation, 2009 and 2015 (Source: Gartner)

4 – Unstructured data

Unstructured data is growing significantly faster than structured data. According to Gartner, 80% of business is conducted on unstructured information. As a result, enterprise expenditure on filers is growing, and IT executives know that action is required to keep this expenditure from growing out of control. RDBMSs, however, were not originally designed to process non-relational data. Additional software layers and features were added to commercial RDBMSs such as ORACLE to handle this kind of data, but even with these enhancements performance on such data is poor, and processing unstructured data slows down the whole database system. New data storage and data processing concepts such as Hadoop, Kafka, Spark and Storm, originally developed at large companies such as Google, IBM or Yahoo, were donated to open source communities, where they were enhanced. They now make it easy to reliably process unbounded quantities of unstructured data without paying any licence fees.

5 – Big Data

According to Wikipedia, “Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate to deal with them. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy.” In 2012 Gartner defined Big Data as "high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." This data can be either structured or unstructured. As explained above, unstructured data cannot be processed efficiently on RDBMSs. But classical RDBMSs even have difficulties handling structured big data: the work may require "massively parallel software running on tens, hundreds, or even thousands of servers" (Jacobs, A., The Pathologies of Big Data, ACM Queue, 6 July 2009).

6 – Real time analysis

With the increasing amount of available real-time data, new business requirements and new business cases for this data are being developed across all industries. In order to monetize such new concepts, the data also has to be analysed in real time. Wikipedia explains the term Real-Time Analytics, also called Real-Time Business Intelligence, as follows: “Real-time business


intelligence (RTBI) is the process of delivering business intelligence (BI) or information about

[business operations] as they occur. Real time means near to zero latency and access to information

whenever it is required.”

Classical transactional RDBMSs such as ORACLE are not fast enough at processing this data for analysis. The main reason is that transactional systems have to guarantee transactional integrity; to do so, they are optimized to store information permanently, which is done on the hard drive (HD). Saving information to HDs takes much longer than processing the data in RAM only, which is why new in-memory databases were developed for analysis purposes. Such databases perform most operations in RAM and store only a small amount of data on the hard drive.

ORACLE now offers in-memory DB features of its own. These features still do not perform as well as the products of newer, innovative in-memory database companies, and the combined price of the regular ORACLE database plus the in-memory features increases the database TCO significantly.

7 – Standard hardware and operating systems

The first operating systems were developed for mainframes. These operating systems were extremely proprietary and could only work on specific hardware. Both hardware and OS were very expensive, and only a few companies could afford them. In the mid-1970s, with the introduction of microcomputers, new operating systems were developed. Both hardware and OS were considerably less expensive than mainframes, and hardware and OS started to be developed independently of each other. However, such systems were not reliable or performant enough to compete with mainframes as commercial servers. With the release of Intel's 32-bit architecture and multitasking operating systems for microcomputers in the mid-1980s, the gap between microcomputers and mainframes started to narrow. The Linux kernel, first released in 1991, was a milestone on this path. In the mid-1990s organizations such as NASA started to replace their increasingly expensive mainframes with clusters of inexpensive commodity computers running Linux. Nowadays Linux is the leading operating system on servers, and more than 95% of servers run on Intel's microarchitectures. Nevertheless, a large number of traditional companies such as banks, telcos and insurers still use obsolete hardware and operating systems for some of their core applications. The main reason is that these applications were developed over decades, and a migration of data and logic to standard software and hardware was considered too risky. Today, this lack of innovation is both expensive and dangerous for the company and its CIO, as hardware, software and staff are ageing and becoming difficult to maintain. Additionally, this obsolete infrastructure slows down business innovation, making these sectors a good target for market disrupters such as Google, Amazon or Apple.

In order to retire the obsolete hardware and operating systems, the data has to be migrated to a new database architecture. The target should be a state-of-the-art, future-oriented database system. Open source RDBMSs are to databases what Linux was to operating systems and should therefore be the preferred migration target.

6 Factors to be considered

1 – Limited IT Budget


According to Gartner, worldwide IT spending has declined over the past two years. Gartner estimates that data center costs account for more than 40% of the total budget, which makes them a prime target for cost-savings programs. Any change in this area should therefore have a clear business case, a good return on investment and, if possible, a break-even point of less than two years.
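The break-even criterion can be checked with simple arithmetic. The following is a minimal sketch; the function name and all cost figures are hypothetical and serve only to illustrate the calculation.

```python
def break_even_months(one_off_cost, old_annual_cost, new_annual_cost):
    """Return the months needed for cumulative savings to cover the one-off cost.

    Returns None if the new architecture is not cheaper to run.
    """
    monthly_saving = (old_annual_cost - new_annual_cost) / 12
    if monthly_saving <= 0:
        return None
    return one_off_cost / monthly_saving


# Hypothetical figures, for illustration only.
months = break_even_months(
    one_off_cost=300_000,     # migration project, training, new hardware
    old_annual_cost=400_000,  # current licenses and maintenance
    new_annual_cost=200_000,  # subscriptions and support after migration
)

# The target named in the text: break-even in less than two years.
meets_target = months is not None and months < 24
```

With these assumed numbers the project pays for itself after about 18 months, so it would meet the two-year target.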

2 – IT Safety

Databases are a critical part of the IT infrastructure. If the database system fails, most business processes will stop working. Database bugs, in turn, may cause data corruption, which has a major economic impact on any company. Reliability and robustness therefore remain the most important aspects of any database infrastructure and should be the main requirements for the new one.

3 – IT availability and continuity

Nowadays, business processes run 24/7 and any disruption has a direct monetary impact. Database downtime, even during the database migration, should be kept to a minimum and, if possible, avoided.

4 – Staff

The shortage of IT experts on the market is a major concern for any CIO. At the same time, missing or obsolete IT skills are not valid grounds for lay-offs in most European countries. Any database transformation in Europe should therefore make sure that the current staff is trained on the new technology. The technology should be intuitive, easy to use and well documented, and its interfaces should be similar to the existing ones.

5 – Legacy systems

Because ORACLE has been the leading database system over the last decades, most of the non-standard (legacy) applications that companies built from scratch are based on ORACLE databases and use the most popular non-ANSI-standard ORACLE features, such as PL/SQL, hints, partitioning, SQL*Loader and dblinks. In order to migrate such databases easily, the new database architecture should support these features.
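To estimate how heavily a legacy application depends on such features, its SQL sources can be scanned. The sketch below is illustrative only: the patterns are deliberately simplified assumptions, the function name is hypothetical, and a real assessment would need a proper SQL parser. SQL*Loader is a separate utility with its own control files and is not covered here.

```python
import re
from collections import Counter

# Simplified, illustrative patterns for ORACLE-specific constructs.
ORACLE_FEATURE_PATTERNS = {
    "plsql_unit": re.compile(
        r"\bcreate\s+(?:or\s+replace\s+)?(?:procedure|function|package|trigger)\b",
        re.IGNORECASE,
    ),
    "optimizer_hint": re.compile(r"/\*\+.*?\*/", re.DOTALL),
    "partitioning": re.compile(r"\bpartition\s+by\b", re.IGNORECASE),
    "dblink": re.compile(r"\b\w+@\w+"),
}


def inventory_features(sql_text):
    """Count occurrences of each feature pattern in a piece of SQL source."""
    counts = Counter()
    for feature, pattern in ORACLE_FEATURE_PATTERNS.items():
        counts[feature] = len(pattern.findall(sql_text))
    return counts


# Hypothetical legacy code used as sample input.
sample = """
CREATE OR REPLACE PROCEDURE sync_orders AS
BEGIN
  INSERT INTO local_orders
  SELECT /*+ PARALLEL(o, 4) */ * FROM orders@crm_link o;
END;
/
CREATE TABLE sales (id NUMBER, sold DATE)
  PARTITION BY RANGE (sold) (PARTITION p2016 VALUES LESS THAN (DATE '2017-01-01'));
"""

report = inventory_features(sample)
```

A report of this kind indicates which compatibility features the target database has to provide before the migration effort can be estimated.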

6 – Cloud

The next step in hardware standardisation and data center cost reduction, after the roll-out of the microcomputer architecture and Linux as the OS, is server virtualization. It allows the CIO to move the IT easily to newer, faster and more cost-effective data centers.

Any new database architecture should therefore be designed to run on virtual machines in a cloud.


6 Technology assets

1 – SAP/HANA

Since SAP is the leading ERP system, a migration of SAP-based applications to SAP's new state-of-the-art SAP/HANA database should always be considered. The main arguments for SAP/HANA are its embedded SAP application support, its in-memory features and its unique support of FIORI apps, which enhance the mobile use of SAP software. Open source RDBMSs are not well suited for SAP applications because they are not supported, and using them would put the SAP applications at risk. ORACLE, on the other hand, will only be supported until 2025. Nevertheless, SAP/HANA is an expensive database system and is not suited to reducing the costs of non-SAP applications.

2 – EnterpriseDB

Many enterprises are using open source RDBMSs to reduce costs. Because these RDBMSs are often

easier to administer and more flexible than alternatives, they yield staff time savings and greater

operational flexibility. Such enterprises have not relaxed their operational requirements, however.

These open source RDBMSs not only must meet the same standards of reliability, scalability, and

manageability as the RDBMSs they replace but also, in many cases, must exceed them.

The EDB Postgres Platform features the full range of capabilities one would expect of an enterprise-

class RDBMS, building on PostgreSQL and adding greater performance, security, database

administrator (DBA) and developer productivity features, and compatibility with traditional

enterprise RDBMSs. The EDB Postgres Platform can be deployed to a wide range of infrastructure

options from virtualized and container environments to public, private, and hybrid clouds.

Professional services, training, 24 x 7 support, and Remote DBA round out the platform ensuring

enterprise customer success.

According to Wikipedia, Gartner positioned EnterpriseDB in the Leaders Quadrant in its Magic

Quadrant for Operational Database Management Systems in October 2014 and again in September

2015. EnterpriseDB was recognized in the Challengers Quadrant in the Magic Quadrant for

Operational Database Management Systems in October 2016.

Thanks to its ORACLE compatibility features (e.g. data structures, syntax, semantics, PL/SQL, functions, packages, utilities and replication services), EDB is a perfect target RDBMS for reducing the costs of non-SAP transactional applications running on ORACLE.

3 – Exasol

EXASOL is an analytic database that is in-memory, column-based, compressed, massively parallel, highly scalable and tuning-free, and that also includes support for Hadoop HDFS formats. Gartner acknowledges this high-speed database in its "Magic Quadrant for Data Warehouse Database Management Systems", where EXASOL is the only German database vendor besides SAP.

According to the TPC-H benchmark, EXASOL is the #1 ad-hoc decision support (BI) database system and has the best price-performance ratio for database analytics.


EXASOL is a perfect choice for applications requiring real-time data analytics and for data-driven businesses.
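Why a column-based, compressed layout suits analytics can be shown with a toy example. The sketch below illustrates the general idea only and makes no claim about EXASOL's internal implementation; the data and function names are hypothetical.

```python
# Row-oriented sample data (hypothetical).
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "EU", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
    {"order_id": 4, "region": "US", "amount": 200.0},
    {"order_id": 5, "region": "US", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dictionaries into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An aggregate only has to read the one column it needs.
total_amount = sum(columns["amount"])

# Columns with many repeated values compress well.
compressed_region = run_length_encode(columns["region"])
```

An analytic query such as a sum over `amount` touches a single contiguous column instead of every full row, and repetitive columns such as `region` shrink to a few value/count pairs.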

4 – Hadoop

Apache Hadoop is an open source software framework used for distributed storage of very large data sets. The storage part at the core of Apache Hadoop is the Hadoop Distributed File System (HDFS). Hadoop splits files into large blocks and distributes them across the nodes of a cluster. Hadoop is an ideal solution for the storage and management of unstructured data. Major cloud platforms offering Hadoop include Microsoft Azure, Amazon Web Services, Google Cloud Platform and CenturyLink Cloud. Apache Hadoop is therefore the perfect complement to an RDBMS.
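The block-splitting idea can be sketched in a few lines. This is a simplified illustration, not HDFS code: the function names are hypothetical, the block size is tiny for readability (HDFS blocks are typically on the order of 128 MB), and the round-robin placement is an assumption that ignores the rack awareness of a real cluster.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes in round-robin order."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + offset) % len(nodes)] for offset in range(replication)
        ]
    return placement


# Hypothetical 1000-byte file stored on a four-node cluster.
file_content = b"x" * 1000
blocks = split_into_blocks(file_content, block_size=256)
placement = place_blocks(len(blocks), ["node1", "node2", "node3", "node4"])
```

Because every block lives on several nodes, the loss of a single node does not lose data, and different blocks of the same file can be read in parallel.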

5 – Pentaho Data Integration

The wide range of information gathered by a business is rarely stored in a single database or format.

Data integration is the process by which information from multiple databases is consolidated. With

an intuitive, graphical, drag and drop design environment and a proven, scalable, standards-based

architecture, Data Integration is increasingly the choice for organizations over traditional,

proprietary ETL or data integration tools.

Pentaho Data Integration (PDI) is a powerful ETL application. Thanks to its visual interface, you can extract information from any data source, transform it and deliver it to a target without writing a single line of code. It supports deployment on single-node computers as well as in a cloud or on a cluster. PDI is written in Java and runs in almost any environment.

PDI is a very helpful data migration tool: it allows you to design and create all the data migration processes visually and to schedule and run them automatically. The community version of PDI is open source and free of charge.
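For readers unfamiliar with ETL, the three stages that such a tool lets you design visually can be written by hand as follows. This is a minimal sketch using only the Python standard library; it does not use or represent PDI's API, and the sample data, table and function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with untidy values.
SOURCE_CSV = """customer_id,name,country
1, Alice ,de
2,Bob,fr
3,,de
"""


def extract(text):
    """Extract: read the raw records from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: trim names, normalise country codes, drop unnamed customers."""
    cleaned = []
    for record in records:
        name = record["name"].strip()
        if not name:
            continue
        country = record["country"].strip().upper()
        cleaned.append((int(record["customer_id"]), name, country))
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    connection.commit()


target = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), target)
loaded = target.execute(
    "SELECT customer_id, name, country FROM customers ORDER BY customer_id"
).fetchall()
```

A graphical ETL tool replaces each of these functions with a configurable step and adds scheduling, logging and error handling around them.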

Gartner has recognized Pentaho in the February 2016 Magic Quadrant for Business Intelligence and

Analytics Platforms as a Visionary Platform.

6 – Shareplex

As explained above, database downtime during a migration should be reduced as much as possible. A key technology here is database replication. It uses change data capture (CDC) methods that determine and track the data that has changed in the source database so that action can be taken on the changed data. With this kind of software, it is possible to start a data migration and, while it is running, track data changes and propagate them to the target database. This way, you can create an identical copy of a database that receives all changes of the original in real time. Once the replication is complete and stable, you can switch the application from the source database to the target database with virtually no downtime.
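The replication principle described above can be sketched with plain Python dictionaries standing in for the two databases. This is a conceptual illustration under simplifying assumptions (a single table, an in-memory change log, hypothetical function names); it does not represent how any particular replication product works.

```python
source = {1: "open", 2: "open"}  # stands in for the source database
change_log = []                  # ordered record of every change (the CDC stream)


def write(operation, key, value=None):
    """Apply a change to the source and capture it in the change log."""
    if operation == "delete":
        source.pop(key, None)
    else:  # "insert" or "update"
        source[key] = value
    change_log.append((operation, key, value))


def apply_changes(target, log, from_position):
    """Replay captured changes on the target; return the new log position."""
    for operation, key, value in log[from_position:]:
        if operation == "delete":
            target.pop(key, None)
        else:
            target[key] = value
    return len(log)


# 1. Initial copy, taken at log position 0.
target = dict(source)
position = 0

# 2. The application keeps writing to the source during the migration.
write("update", 1, "shipped")
write("insert", 3, "open")
write("delete", 2)

# 3. Captured changes are propagated until the target has caught up.
position = apply_changes(target, change_log, position)

# 4. Once both sides are identical, the application can be switched over.
in_sync = target == source
```

The switch-over only requires a brief pause of writes while the last captured changes are replayed, which is why the downtime stays minimal.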

Quest’s Shareplex offers database connectors to ORACLE, SAP/HANA and EDB and is therefore a perfect tool for a database transformation with extremely short downtimes.


Recommended architecture

In most industries, the current architecture typically resembles the figure below:

The recommended target architecture is a mixture of the technologies explained above:


Recommended Methodology

1 – Holistic approach

If an organization considers a database transformation, the database infrastructure should be considered and planned as a whole in order to achieve optimal savings and performance. Partial or local database transformations often do not achieve good results in terms of license or maintenance savings. For example, if the organization migrates only the SAP environment to SAP/HANA but still uses ORACLE for the rest of the transactional applications and the DWH, the TCO may increase instead of decreasing. Likewise, an isolated transformation of the DWH to an in-memory database will increase performance but will not significantly reduce costs.

2 – Professional support

In order to master this complex challenge, database transformation experts with good know-how

and experience on several RDBMS systems and an understanding of both the old and the new

architecture as well as ETL and replication technologies are necessary.

3 – Good Assessment

The first step towards a transformation should be a good assessment of the current infrastructure. This assessment can be performed in four steps:

Collect information
Number and size of databases, features used, available environments and licenses.

Design new architecture
Size, hardware, software and database features.

Calculate savings
Hardware savings, OS savings and DB savings.

Define the first transformation as a POC
Select the system, define hardware, define OS and DB, select ETL and replication technology and calculate price and schedule.

4 – Fixed price offers

Professional, experienced partners should be able to make a fixed price offer for both a POC and the global database transformation once they have collected all relevant data.

5 – 1st Transformation as a POC

In order to test the database transformation, a first database system should be selected for a POC. This POC can then be conducted in four steps:

Document
Document storage, DB software, DB features, backup and deployment.

Implementation
Implement DB installation packages. Convert DB software. Implement missing features. Implement ETL processes using appropriate technology (Pentaho and Shareplex).

Test
Test installation, data transfer, backup and recovery and the application layer.

Deployment
Install hardware and software. Connect old and new database. Transfer data and redirect the application.

6 – Complete transformation as a cycle

Once a POC has been conducted successfully, the whole transformation can be performed by repeating the same steps in a cycle. Additionally, the old hardware has to be retired and the licence contracts cancelled.
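In every cycle, the data transfer test can be partly automated by comparing the source and target tables after the transfer. The sketch below is one possible approach, not a prescribed method: it uses in-memory SQLite databases as stand-ins for the real systems, and the table, column and function names are hypothetical. Between different database products, values would first have to be normalised to a common representation.

```python
import hashlib
import sqlite3


def table_fingerprint(connection, table, key_column):
    """Return (row count, SHA-256 digest) of a table read in key order.

    table and key_column are trusted identifiers here, not user input.
    """
    cursor = connection.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def make_db(rows):
    """Create a small stand-in database with one orders table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    connection.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return connection


source_db = make_db([(1, "shipped"), (2, "open")])
target_db = make_db([(1, "shipped"), (2, "open")])
broken_db = make_db([(1, "shipped"), (2, "OPEN")])  # simulates a faulty transfer

source_print = table_fingerprint(source_db, "orders", "id")
transfer_ok = source_print == table_fingerprint(target_db, "orders", "id")
mismatch_detected = source_print != table_fingerprint(broken_db, "orders", "id")
```

Comparing a count and a digest per table is much cheaper than comparing rows one by one, and still flags both missing rows and altered values.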

Business Case Example

Based on the architecture example defined above, a possible business case for the database transformation of an SAP ERP system, two transactional applications (e.g. CRM and e-shop) and a conventional DWH, all running on redundant ORACLE database servers, is calculated below:


PROVENTA AG Untermainkai 29

60329 Frankfurt am Main www.proventa.de

069 - 23 25 50

Diego Calvo de Nó (Member of the Executive Board)

[email protected]

+49 160 478 11 69