
HA100 - SAP HANA Introduction (SPS08 - 2104), Unit 3: Architecture

Unit 3: Architecture

Unit Overview

This unit provides an overview of the architecture behind SAP HANA.

Unit Objectives

After completing this unit, you will be able to:

Understand the architecture of SAP HANA

Explain the necessity of a persistence layer

Understand the difference between data and log volumes

Describe the reboot process after a power failure

Explain the principles of shadow paging

Unit Contents

Lesson: Architecture

Lesson: Persistence Layer

Procedure: Exercise 2: Working with Catalog Objects

Lesson: Architecture

Lesson Duration: 10 Minutes

Lesson Overview

This lesson describes the architecture and components of the SAP HANA database.

Lesson Objectives

After completing this lesson, you will be able to:


Understand the architecture of SAP HANA

This lesson presents an overview of the SAP HANA architecture without going into too much detail (the details are typically the scope of course HA200). The overall lesson duration includes 10 minutes for the exercise.

Business Example

A customer needs to transform their landscape to improve overall performance and reduce IT administration effort. They turn to SAP HANA technology and must learn the new architecture of the SAP HANA Appliance.

HANA System Architecture

Figure: SAP HANA Modeler


SAP HANA System Example Deployment Architecture

SAP HANA Index Server Simplified Architecture

What are the components of the SAP HANA Appliance?

1. Name Server (maintains landscape information)
2. Index Server (holds data and executes all operations)
3. Statistics Server (collects performance data about HANA)
4. XS Engine (manages XS Services)
5. SAP HANA Studio Repository (repository for HANA Contents LM)


6. SAP Host Agent (enables remote start / stop)
7. SAP HANA Lifecycle Manager (LM) (manages SW updates for HANA)

Components of SAP HANA Index Server

1. External Interfaces
2. Request Processing and Execution Control
3. Relational Engines
4. Storage Engine
5. Disk Storage

These five components make up the Index Server from an architectural point of view. The following four components are also part of its operation:

6. Session Manager
7. Transaction Manager
8. Authorization Manager
9. Metadata Manager


Figure 53: Index Server Architecture

The Index Server of the SAP HANA database is a core component that orchestrates the database’s operations.

The Connection and Session Management component creates and manages sessions and connections for database clients such as SAP BusinessObjects reporting tools or applications.

The Transaction Manager coordinates transactions, controls transactional isolation, and keeps track of running and closed transactions.

The client requests are analyzed and executed by a set of specialized engines and processors that handle Request Processing and Execution Control. Once a session is established, the database client typically uses SQL statements to communicate with this module. For analytical applications, the multidimensional query language MDX is also supported.

Incoming SQL requests are received by the SQL Processor. This component executes the Data Manipulation Language (DML) statements, such as INSERT, SELECT, UPDATE or DELETE. Other types of requests are delegated to other components. For example, Data Definition Language (DDL) statements, such as the definition of relational tables, columns, views, indexes and procedures, are dispatched to the Metadata Manager.
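As an illustration of this routing, a DDL statement (dispatched to the Metadata Manager) and a few DML statements (executed by the SQL Processor) might look as follows; the table and column names are invented for the example:

```sql
-- DDL: the table definition is dispatched to the Metadata Manager
CREATE COLUMN TABLE sales_orders (
    order_id  INTEGER PRIMARY KEY,
    customer  NVARCHAR(40),
    amount    DECIMAL(15,2)
);

-- DML: executed directly by the SQL Processor
INSERT INTO sales_orders VALUES (1, 'ACME', 199.90);
SELECT customer, SUM(amount) FROM sales_orders GROUP BY customer;
```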


Figure: Network Connectivity

Planning commands are routed to the Planning Engine that allows financial planning applications to execute basic planning operations in the database layer.

The SAP HANA database offers programming capabilities to execute application-specific calculations inside the database system. The SAP HANA database has its own programming languages. SQLScript is used to write database stored procedures. Procedure calls are forwarded to the Stored Procedure processor.

Incoming MDX requests are processed by the MDX Engine and also forwarded to the Calculation Engine, which is a common infrastructure that also supports SQLScript, MDX, and Planning operations.

The Persistence Layer component manages the communication between the Index Server and the file system that stores the data volume and the transaction log volume.

The SAP HANA Index Server performs 7 key functions to accelerate and optimize analytics. Together, these functions provide robust security and data protection and enhanced data access.

1. Connection and Session Management – This component initializes and manages sessions and connections for the SAP HANA Database using pre-established session parameters. SAP has long been known for excellence in session management through its integration of SAPRouter into the SAPGUI product used as a front end for accessing the ABAP stack. SAP HANA retains the ability to configure connection and session management parameters to accommodate complex security and data transfer policies.

2. Authentication – User and role-based privileges are authenticated by the SAP HANA Database. (The Users, Authorizations and Roles within the SAP ERP system are not applicable or transportable to the SAP HANA instance.) The SAP HANA authentication model allows granting of privileges to users or roles, and a privilege grants the right to perform a specified SQL operation on a specific Object. SAP HANA also utilizes a set of Analytic Privileges that represent filters or hierarchy drilldown limitations for analytic queries to protect sensitive data from unauthorized users. This model enforces “Segregation of Duty” for clients that have regulatory requirements for the security of data.

3. SQL Processor – The SQL Processor segments data queries and directs them to specialty query processing engines for optimized performance. It also ensures that SQL statements are accurately authored and provides some error handling to make queries more efficient. The SQL processor contains several engines and processors that optimize query execution:

a. The Multidimensional Expressions (MDX) Engine queries and manipulates the multidimensional data stored in OLAP (OnLine Analytical Processing) data cubes.

b. The Planning Engine enables the basic planning operations within the SAP HANA Database for financial planning operations.

c. The Stored Procedure Processor executes procedure calls for optimized processing without reinterpretation (for example, converting a standard InfoCube into an SAP HANA-optimized InfoCube).

d. The Calculation Engine converts data into calculation models and creates logical execution plans to support parallel processing.

4. Relational Stores – SAP has further segmented the storage of in-memory data into compartments within memory for speedier access. Data not needed immediately is stored on physical disk as opposed to RAM. This allows quick access to the most relevant data. The SAP HANA Database houses four relational stores that optimize query performance:

a. The Row Store stores data in a row-based fashion and is optimized for high performance of write operations. It is derived from the P*Time in-memory system, which SAP acquired in 2005. The Row Store is held fully in RAM.

b. The Column Store stores data in a column-based fashion and is optimized for high performance of read operations. It is derived from TREX (Text Retrieval and Extraction), which SAP unveiled in the SAP NetWeaver Search and Classification product and later developed into a full relational column-based store. The Column Store is held fully in RAM.


c. The Object Store is an integration of SAP Live Cache Technology into the SAP HANA Database.

d. The Disk Based Store is used for data that does not need to be held in memory, for example "tracing data" or old data that is no longer used. The Disk Based Store is located on hard disk and is pulled into RAM as needed.

5. Transaction Manager – The SAP HANA Database processes individual SQL statements as transactions.  The Transaction Manager controls and coordinates transactions and sends relevant data to appropriate engines and to the Persistence Layer. This segmentation simplifies administration and troubleshooting.

6. Persistence Layer – The Persistence Layer provides built-in disaster recovery for the SAP HANA Database. Its algorithms and technology are based on concepts pioneered by MaxDB and ensure that the database is restored to the most recent committed state after a planned or unplanned restart. Backups are stored as savepoints in the data volumes via a savepoint coordinator, which is typically set to run every five minutes. Changes that occur after a savepoint are recorded in the transaction log volume. Typically, these volumes are saved to media and shipped offsite for a cold-backup disaster recovery remedy.

7. Repository – The Repository manages the versioning of metadata objects such as Attribute Views, Analytic Views, and Stored Procedures. It also enables the import and export of repository content.

Persistence Layer

Persistence


Figure : In-Memory Data Is Regularly Saved to Disk

Disk storage is still required to ensure the ability to restart after a power failure and to provide permanent persistence. The SAP HANA persistence layer stores data in persistent disk volumes that are organized in pages. It is divided into a log area and a data area:

Data changes such as insert, delete, and update are saved on disk immediately in the logs (synchronously). This is required to make a transaction durable. It is not necessary to persist the complete data, but the transaction log can be used to replay changes after a crash or database restart.

At customizable intervals (default: every five minutes), a new savepoint is created; that is, all pages that were changed are written to the data area of the persistence.

Whether or not disk access becomes a performance bottleneck depends on the usage. Since changes are written to the data volumes asynchronously, the user or application does not need to wait for this. When data that already resides in main memory is read, there is no need to access the persistent storage.

However, when applying changes to data the transaction cannot be successfully committed before the changes are persisted to the log area.

To optimize performance, fast storage such as SSD or Fusion-io drives is used for the log area (see the certified hardware configurations in the Product Availability Matrix).


Figure : Storing Data in Data Volumes: Details

Like many modern database management systems, SAP HANA can use the host operating system's file abstraction layer. Each data volume contains one file in which data is organized into pages, ranging in size from 4 KB to 16 MB (page size class). Data is written to and loaded from the data volume page-wise. Over time, pages are created, changed, overwritten, and deleted. The size of the data file is automatically increased as more space is required, but it is not automatically decreased when less space is required.

This means that at any given time, the actual payload of a data volume (that is the cumulative size of the pages currently in use) may be less than its total size. This is not necessarily significant – it simply means that the amount of data in the file is currently less than at some point in the past (for example, after a large data load). If a data volume has a considerable amount of free space, it might be appropriate to shrink the data volume. However, a data file that is excessively large for its typical payload can also indicate a more serious problem with the database. SAP support can help to analyze the situation.
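Shrinking a data volume can be sketched with the following statement; the target percentage (here 120, meaning the volume is reduced to roughly 120% of its payload) is an illustrative choice:

```sql
-- Defragment the data volume and release unused space back to the
-- file system; 120 is an example target percentage of the payload.
ALTER SYSTEM RECLAIM DATAVOLUME 120 DEFRAGMENT;
```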


Figure : Dealing with File Size Limitations

With large SAP HANA appliances – in particular, single-host SAP ERP systems – the Ext3 file system's file size limit of 2 TB can be reached. In this case, SAP HANA automatically creates additional files. This allows the use of Ext3 file systems even with applications that have a larger memory requirement per host.


Figure : Shadow Paging Concept

While (redo) log entries are written synchronously, changed data in the data volumes is periodically copied to disk in a so-called savepoint operation. During the savepoint operation, the SAP HANA database flushes all changed data from memory to the data volumes. The data belonging to a savepoint represents a consistent state of the data on disk and remains so until the next savepoint operation has been completed.

Note: The frequency of savepoint creation can be configured. Savepoints are also triggered automatically by a number of other operations, such as data backup and database shutdown and restart. You can trigger a savepoint manually by executing the statement ALTER SYSTEM SAVEPOINT.
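Both actions can be sketched in SQL; the section and parameter names assume the global.ini [persistence] layout, and 600 seconds is an illustrative interval:

```sql
-- Trigger a savepoint manually:
ALTER SYSTEM SAVEPOINT;

-- Change the savepoint interval system-wide (default: 300 seconds):
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
    SET ('persistence', 'savepoint_interval_s') = '600'
    WITH RECONFIGURE;
```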

The phases of the savepoint operation are shown in the graphic above. SAP HANA uses a so-called “Shadow Paging Concept”. This means that write operations write to new physical pages and the previous savepoint version is still kept in shadow pages. Consequently, if a system crashes during a savepoint operation, it can still be restored from the last savepoint.

Figure : Restart Process

In the event of a database restart (for example after a crash) the data from the last completed savepoint can be read from the data volumes and the redo log entries written to the log volumes since the last savepoint can be replayed. This allows restoring the database to the last committed state.


Note: After a system restart, not all tables are loaded into main memory immediately by default.

Figure : Start of SAP HANA Database

While the row store is always loaded entirely, only the essential columns of column tables are loaded into memory. The other columns are loaded on request.

For example, if a query only uses some of the fields (columns) of a table, only these are loaded into the memory at time of query execution. All row-based tables (usually system tables) are available in the main memory. Their size significantly influences the time required to start the database. Other factors that influence startup time are mentioned in the graphic below.

Additionally, column tables that were loaded before restart and their attributes are reloaded. Reloading column tables in this way restores the database to a fully operational state more quickly. However, it does create performance overhead and may not be necessary in non-productive systems. You can deactivate the reload feature in the indexserver.ini file by setting the reload_tables parameter in the sql section to false.
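The reload feature can be deactivated with a configuration statement as well as by editing the file; as a sketch, using the [sql] reload_tables parameter of indexserver.ini described above:

```sql
-- Deactivate automatic reload of previously loaded column tables
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
    SET ('sql', 'reload_tables') = 'false'
    WITH RECONFIGURE;
```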


Figure : Start-Up Process

Note: It is possible to mark individual columns as well as entire column tables for preload.

When the preload flag is set, tables are automatically loaded into memory after an index server start. The current status of the preload flag is visible in the system table TABLES in the PRELOAD column; possible values are 'FULL', 'PARTIALLY', and 'NO'. For individual columns, it is visible in the system table TABLE_COLUMNS in the PRELOAD column, with possible values 'TRUE' and 'FALSE'.
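Setting and inspecting the preload flag can be sketched as follows; the table and column names are illustrative:

```sql
-- Mark a whole column table, or a single column, for preload:
ALTER TABLE sales_orders PRELOAD ALL;
ALTER TABLE sales_orders PRELOAD (customer);
-- Remove the preload flag again:
ALTER TABLE sales_orders PRELOAD NONE;

-- Inspect the current preload status:
SELECT table_name, preload FROM TABLES
    WHERE table_name = 'SALES_ORDERS';
SELECT column_name, preload FROM TABLE_COLUMNS
    WHERE table_name = 'SALES_ORDERS';
```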

Note: When fields of large column tables are not in the main memory, the first access to the table might be significantly slower, because all requested columns are loaded to main memory before the query can be executed. This applies even if a single record shall be selected.

Caution: Simply flagging all tables for preload in order to accelerate initial queries could slow down startup time tremendously. The preload flag is a tuning option and should be used carefully, depending on the individual scenario and requirements.

Memory Usage

The total amount of memory used by SAP HANA is referred to as used memory. It includes program code and stack, all data and system tables, and the memory required for temporary computations. In the Linux operating environment, memory is allocated for the program code (sometimes called the text), the program stack, and data. Most of the data memory, called the heap, is under program control.


Figure : SAP HANA Memory Pool

As an in-memory database, it is critical for SAP HANA to manage and track its own consumption of memory carefully. For this purpose, the SAP HANA database preallocates and manages its own data memory pool. The memory pool is used for storing in-memory tables, for thread stacks, as well as for temporary computations, intermediate results, and other data structures. SAP HANA’s utilization of memory thus includes its program code (exclusive and shared), the program stack, and the memory pool, which includes all data tables (row and column), system tables, and created tables.

At any given time, parts of the pool are in use for temporary computations. The total amount of memory in use is referred to as used memory. This is the most precise indicator of the amount of memory that the SAP HANA database uses.

Figure 141: Virtual, Physical, and Resident Memory


When (part of) the virtually allocated memory actually needs to be used, it is loaded or mapped to the real, physical memory of the host and becomes “resident”. Physical memory is the DRAM memory installed on the host. On SAP HANA hosts, it typically ranges from 128 Gigabytes (GB) to 4 Terabytes (TB). It is used to run the Linux operating system, SAP HANA, and all other programs. Resident memory is the physical memory actually in operational use by a process.

Figure : Memories

The SAP HANA database, across its different processes, reserves a pool of memory before actual use. This pool of allocated memory is preallocated from the operating system over time, up to a predefined global allocation limit, and is then efficiently used as needed by the SAP HANA database code.

When memory is required for table growth or for temporary computations, the SAP HANA code obtains it from the existing memory pool. When the pool cannot satisfy the request, the HANA memory manager will request and reserve more memory from the operating system. At this point, the virtual memory size of the HANA processes grows. Once a temporary computation completes or a table is dropped, the freed memory is returned to the memory manager, which recycles it to its pool, without informing Linux. Thus, from SAP HANA’s perspective, the amount of Used Memory shrinks, but the process virtual and resident sizes are not affected. This creates a situation where the Used Memory may even shrink to below the size of SAP HANA’s resident memory, which is perfectly normal.

Note: The database may also actively unload tables or individual columns from memory, for example, if a query or other processes in the database require more memory than is currently available. It does this based on a least recently used algorithm.
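Tables can also be loaded and unloaded explicitly, which can be sketched as follows (table name illustrative):

```sql
-- Explicitly load a column table into memory (ALL loads every column) ...
LOAD sales_orders ALL;
-- ... or remove it from memory; it is reloaded on the next access.
UNLOAD sales_orders;
```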

Caution: Due to the preallocation of memory as described above, Linux memory indicators such as top and meminfo do not accurately reflect the actual SAP HANA used memory size. Main memory monitoring should always be based on SAP HANA monitoring features.
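In line with the caution above, used memory should be read from SAP HANA's own monitoring views rather than from Linux tools; a minimal sketch, assuming the M_SERVICE_MEMORY column layout:

```sql
-- Used memory per SAP HANA service, in gigabytes
SELECT service_name,
       ROUND(total_memory_used_size / 1024 / 1024 / 1024, 2) AS used_gb
FROM M_SERVICE_MEMORY;
```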

Memory Management in the Column Store


The column store is optimized for read operations but also provides good performance for write operations. This is achieved through two data structures: main storage and delta storage.

Figure : Column Store Memory Management

The column store uses efficient compression algorithms that help to keep all relevant application data in memory. Write operations on this compressed data would be costly, as they would require reorganizing the storage structure.

Therefore, write operations in column store do not directly modify compressed data. All changes go into a separate area called the delta storage. The delta storage exists only in main memory. Only delta log entries are written to the persistence layer when delta entries are inserted.

Delta merge operation:

The delta merge operation is executed on table level. Its purpose is to move changes collected in the write-optimized delta storage into the compressed and read-optimized main storage.

Read operations always have to read from both main storage and delta storage and merge the results.

The delta merge operation is decoupled from the execution of the transaction that performs the changes. It happens asynchronously at a later point in time.

Note: For the delta merge operation a double buffer concept is used. This has the advantage that the table only needs to be locked for a short time. Details can be found in the Administration Guide.
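The size of a table's delta storage, which drives these merge decisions, can be inspected in the monitoring view M_CS_TABLES; a sketch with an illustrative table name:

```sql
-- Main vs. delta memory footprint of a column table
SELECT table_name,
       memory_size_in_main,
       memory_size_in_delta,
       raw_record_count_in_delta
FROM M_CS_TABLES
WHERE table_name = 'SALES_ORDERS';
```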


Caution: The minimum memory requirement for the delta merge operation includes the current size of main storage + future size of main storage + current size of delta storage + some additional memory. It is important to understand that even if a column store table is unloaded or partly loaded, the whole table is loaded into memory to perform the delta merge.

Figure : The Art of Merging

The request to merge the delta storage of a table into its main storage can be triggered in several ways:

Auto Merge:

The standard method for initiating a merge in SAP HANA is the auto merge. A system process called mergedog periodically checks the column store tables that are loaded locally and determines for each individual table (or single partition of a split table) whether or not a merge is necessary based on configurable criteria (for example, size of delta storage, available memory, time since last merge, and others).

Smart Merge:

If an application powered by SAP HANA requires more direct control over the merge process, SAP HANA supports a function that enables the application to request the system to check whether or not a delta merge makes sense now. This function is called smart merge. For example, if an application starts loading relatively large data volumes, a delta merge during the load may have a negative impact both on the load performance and on other system users. Therefore, the application can disable the auto merge for those tables being loaded and send a "hint" to the database to do a merge once the load has completed. When the application issues a smart merge hint to the database to trigger a merge, the database evaluates the criteria that determine whether or not a merge is necessary. If the criteria are met, the merge is executed.
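This smart merge flow can be sketched in SQL (table name illustrative):

```sql
-- Disable auto merge for the table about to be loaded ...
ALTER TABLE sales_orders DISABLE AUTOMERGE;
-- ... run the data load, then hint the database to evaluate a merge:
MERGE DELTA OF sales_orders WITH PARAMETERS ('SMART_MERGE' = 'ON');
-- Re-enable auto merge afterwards:
ALTER TABLE sales_orders ENABLE AUTOMERGE;
```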

Hard and Forced Merges:

Delta merge operations for a table can be manually triggered using an SQL statement. This is called a hard merge and results in the database executing the delta merge immediately once sufficient system resources are available. An immediate merge (regardless of the system resource availability) can be triggered by passing an optional parameter in the statement.
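A hard merge and a forced merge can be sketched as follows (table name illustrative):

```sql
-- Hard merge: executed once sufficient system resources are available
MERGE DELTA OF sales_orders;
-- Forced merge: executed immediately, regardless of resource availability
MERGE DELTA OF sales_orders WITH PARAMETERS ('FORCED_MERGE' = 'ON');
```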

Critical Merge:

The database can trigger a critical merge in order to keep the system stable. For example, in a situation where auto merge has been disabled and no smart merge hints are sent to the system, the size of the delta storage could grow too large for a successful delta merge to be possible. The system initiates a critical merge automatically when a certain threshold is passed. Critical merge is inactive by default.

Hint: The delta merge operation is potentially expensive and must be managed according to available resources and priority. There are various options for controlling and monitoring delta merge operations; for details, see the SAP HANA Administration Guide.

Note: Detailed information on memory management can be found in Appendix 1, "Deep Diving into Memory Management and Persistence".

High Availability and Disaster Recovery


Figure 54: SAP HANA High Availability – Disaster Tolerance

It is crucial for all business applications to make sure they can support daily business without disruption.

SAP HANA offers concepts to make sure that the system is available even in the event of a disaster. There are concepts to guarantee this within a datacenter and also between physically different datacenters.

Figure 55: High Availability (Scale Out) Per Datacenter


Figure 56: Scale Out – SAP HANA Database Landscape

High availability enables the failover of a node within one distributed SAP HANA appliance. Failover uses a cold standby node and is triggered automatically.

Landscape

Up to three master name servers can be defined. During startup, one server is elected as the active master. The active master assigns a volume to each starting index server, or no volume in the case of standby servers.

Master name-server failure

In case of a master name-server failure, another of the remaining name-servers will become active master.

Index-server failure

The master name-server detects an index-server failure and executes the failover. During the failover the master name-server assigns the volume of the failed index-server to the standby server.


Figure: SAP HANA High Availability - Node Failover (Standby)


Figure 57: High Availability Between Different Data Centers (Disaster Tolerance - Cluster across Datacentres)

The mirroring is offered at the storage system level. It is offered together with the appliance by an HA-certified partner; the hardware partner defines how this concept is finally realized with its operational capabilities.

A performance impact on data-changing operations is to be expected as soon as synchronous mirroring is activated. The impact depends strongly on external factors such as distance and the connection between the data centers. The synchronous writing of the log with the concluding COMMITs is the crucial part here.

In an emergency, the primary data center is no longer available and a takeover process must be initiated. So far, many customers have preferred a manual process here, but an automated process can also be implemented.

This takeover process officially ends the mirroring, mounts the disks to the already installed HANA software and instances, and starts up the secondary database side of the cluster. If the host names and instance names on both sides of the cluster are identical, no further steps with hdbrename are necessary.

It would be possible to run a development and/or QA instance of the three-tier installation on the secondary cluster hardware, simply to utilize it until the takeover is executed. The takeover would then stop these development and/or QA instances and mount the production disks to the hosts. This would require an additional set of disks for the development and QA instances.

So far, no hot standby via log shipping is available, nor log shipping by recovery of log backups on a standby host. Both would require changes to the engines of the HANA database, which take time to realize. Both solutions are on the agenda of HANA's future development.

Data Replication Methods

Figure: Data Replication Methods

There are different technologies for loading data into SAP HANA (different data provisioning scenarios); these are covered in the unit "Data Provisioning".

The methods are:

SAP Landscape Transformation

SAP Data Services

Flat file upload

Direct Extractor Connection (DXC)


Figure: SAP HANA Typical System Landscape

SAP HANA Components

The following tables show which components are installed in an SAP HANA landscape, depending on the actual scenario. These technologies are reflected in the different editions of SAP HANA and correspond to different ranges of customer requirements.


Figure 58: Bill of Material / SAP HANA Appliance Software Components

Figure 59: Bill of Material / SAP HANA Peripheral Components


Lesson Summary

You should now be able to:

Understand the architecture of SAP HANA

Lesson: Persistence Layer

Lesson Duration: 30 Minutes

Lesson Overview

This lesson explains how data persistence works in SAP HANA and presents the role of each component of the persistence storage layer.

Lesson Objectives

After completing this lesson, you will be able to:

Explain the necessity of a persistence layer

Understand the difference between data and log volumes

Describe the reboot process after a power failure

Explain the principles of shadow paging

The purpose of this lesson is to introduce the concept of persistence, which is somewhat particular in the context of an in-memory database compared with a classical one. The most technical aspects are dealt with in course HA200 and are not presented here.

Business Example

As an SAP HANA database administrator, you need to understand how data persistence works so that you know how the system will recover in the event of a power failure.

Persistence Layer Components

The persistence layer of SAP HANA relies on Data and Log Volumes. The in-memory data is regularly saved to these volumes.

Figure 60: SAP HANA Persistence – Data and Log Volumes

Data and log volumes are used as follows:

On a regular basis, data pages and before images (undo log pages) are written to the data volumes. This process is called a savepoint.

Between two savepoints, after images (redo log pages) are written to the log volumes. This happens each time a transaction is committed.

The savepoint process relies on a concept called shadow memory.

Figure 61: SAP HANA Persistence - Shadow Memory Concept

Shadow paging is used to undo changes that were persisted since the last savepoint. With the shadow page concept, physical disk pages written by the last savepoint are not overwritten until the next savepoint is successfully completed. Instead, new physical pages are used to persist changed logical pages. Until the next savepoint is complete, two physical pages may exist for one logical page:

The shadow page, which still contains the version of the last savepoint.

The current physical page which contains the changes written to disk after the last savepoint.
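To make the principle concrete, the following Python sketch simulates the shadow page concept. All names and structures here are invented for illustration; this is not SAP HANA code, only a minimal model of the idea that a savepoint's physical pages are never overwritten before the next savepoint completes.

```python
# Illustrative sketch of shadow paging (invented structures, not SAP HANA internals).

class ShadowPagedStore:
    def __init__(self):
        self.physical = {}        # physical page id -> content on "disk"
        self.current = {}         # logical page -> physical page (working mapping)
        self.savepoint_map = {}   # logical page -> physical page (last savepoint)
        self._next_phys = 0

    def write(self, logical_page, content):
        # Never overwrite a physical page belonging to the last savepoint:
        # allocate a NEW physical page for the changed logical page.
        phys = self._next_phys
        self._next_phys += 1
        self.physical[phys] = content
        self.current[logical_page] = phys

    def savepoint(self):
        # Atomically promote the current mapping to the new consistent state.
        self.savepoint_map = dict(self.current)

    def recover(self):
        # After a crash, fall back to the mapping of the last savepoint.
        self.current = dict(self.savepoint_map)


store = ShadowPagedStore()
store.write("A", "v1")
store.savepoint()          # "v1" is now the savepoint version of page A
store.write("A", "v2")     # goes to a new physical page; the shadow page keeps "v1"
store.recover()            # crash before the next savepoint: back to "v1"
print(store.physical[store.current["A"]])   # -> v1
```

Until `savepoint()` runs again, two physical pages exist for logical page A: the shadow page holding "v1" and the current page holding "v2", exactly as described above.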

Savepoint - Writing Data in Persistence Layer (IMCE)

Figure: Savepoint - Writing Data in IMCE

The persistence layer (through the savepoint coordinator) periodically writes savepoints to disk. The time interval for writing savepoints is configurable (for example, 5 minutes). Savepoints are self-contained, consistent states of the database that can be loaded without reading the log volumes.

During a savepoint:

All data that has been changed since the last savepoint is marked and written to disk. Data belonging to the last completed savepoint is not overwritten (shadow page concept).

The redo log entries are written to the log volume.

The converter table (the mapping of logical pages to physical pages in savepoints) is written.

A row store checkpoint is performed (persisting the main part of the row store data).

The savepoint consists of three phases:

In phase one, all modified pages that have not yet been written to disk are determined. The savepoint coordinator triggers the writing of these pages.

In phase two, the write operations for phase three are prepared:

A consistent change lock is acquired, so no write operations are allowed.

All pages that were modified during phase one are determined and written to a temporary buffer.

The list of open transactions is retrieved.

Row store information for uncommitted changes made during phase one is written to disk.

The current log position is determined (the log position from which logs must be read during restart).

The change lock is released.

In phase three, all data is written to disk; changes are allowed again during this phase:

1. The temporary buffers created in phase two are written.

2. The list of open transactions is written.

3. The row store checkpoint is invoked.

4. The log queue is flushed up to the savepoint log position.

5. The restart record is written (containing, for example, the savepoint log position).
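The three phases above can be walked through as a small simulation. All names here are invented for illustration; the real savepoint coordinator is internal to SAP HANA, and this sketch only models the ordering of the steps.

```python
# Hypothetical simulation of the three savepoint phases (invented names,
# not SAP HANA code).

def run_savepoint(modified_pages, pages_changed_during_phase1,
                  open_transactions, log_position):
    pages_written = []

    # Phase 1: write all pages modified since the last savepoint.
    pages_written.extend(modified_pages)

    # Phase 2: acquire the consistent change lock and capture a snapshot.
    change_lock_held = True
    temp_buffer = list(pages_changed_during_phase1)  # pages touched while phase 1 ran
    snapshot = {
        "open_transactions": list(open_transactions),
        "savepoint_log_position": log_position,      # restart replays the log from here
    }
    change_lock_held = False                         # lock released; changes allowed again

    # Phase 3: write the temporary buffers, then the restart record.
    pages_written.extend(temp_buffer)
    restart_record = dict(snapshot, pages_written=len(pages_written))
    return restart_record


record = run_savepoint(["page_1", "page_2"], ["page_3"], ["tx_42"],
                       log_position=1000)
print(record["pages_written"])   # -> 3
```

Note how the lock in phase two only protects the snapshot of metadata (open transactions, log position), while the bulk of the page writing happens outside the lock in phases one and three; this is the point of splitting the savepoint into phases.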

System Restart Procedure

After a restart, the system is restored from the savepoint versions of the data pages. Data changes written after the last savepoint are therefore not part of this restored state.

After the savepoint is restored, the log is replayed to restore the most recent committed state.

Figure 62: Persistence Layer in SAP HANA Database – System Restart

Actions during system restart

Restore data:

o Reload the last savepoint.

o Search the undo log for uncommitted transactions saved with the last savepoint (stored on the data volume) and roll them back.

o Search the redo log for committed transactions since the last savepoint (stored on the log volume) and re-execute them.

Load all the tables of the row store into memory.

Load the tables of the column store that are marked for preload into memory.

Note: Only tables marked for preload are loaded into memory during startup. Tables marked for loading on demand will only be loaded into memory at first access.
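The restore sequence above can be illustrated with a small Python simulation. The structures and values are invented for illustration; actual recovery is handled internally by the SAP HANA database.

```python
# Illustrative sketch of the restart sequence (invented structures,
# not SAP HANA code).

def restart(savepoint_data, undo_log, redo_log, savepoint_log_position):
    # 1. Reload the last savepoint.
    db = dict(savepoint_data)

    # 2. Roll back uncommitted transactions using the undo log (before images).
    for key, before_image in undo_log:
        db[key] = before_image

    # 3. Replay committed transactions from the redo log entries written
    #    after the savepoint log position.
    for pos, key, after_image in redo_log:
        if pos > savepoint_log_position:
            db[key] = after_image
    return db


state = restart(
    savepoint_data={"MARA": "v_savepoint", "KNA1": "uncommitted"},
    undo_log=[("KNA1", "v_old")],              # roll back an in-flight change
    redo_log=[(1001, "MARA", "v_committed")],  # re-apply a committed change
    savepoint_log_position=1000,
)
print(state)   # -> {'MARA': 'v_committed', 'KNA1': 'v_old'}
```

The result is the most recent committed state: the uncommitted change persisted with the savepoint is undone, and the committed change recorded only in the redo log is re-executed.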

SAP HANA Backup and Recovery

External Backup

Figure: Save to External Backup Destination

Backup Steps

1. The user starts the backup via the SAP in-memory computing studio.

2. Database triggers a global database snapshot (backup manager in master name server)

a. Commits for all transactions are put on hold

b. Master name and index servers create snapshots of their persistent storage

c. Commits allowed again

d. Master name server and index servers write snapshots to backup destinations

e. SAP in-memory computing studio monitors progress

3. Manual configuration backup (recommended: every time a data backup is carried out):

a. /usr/sap/<SID>/SYS/global/hdb/custom/config

b. /usr/sap/<SID>/HDB<instance no>/<hostname> (without sub-directories!)

Figure: Database copy from Multi-node to Single-node system

Figure: Database copy: Steps to perform and limitation

Disk Failure

Figure: Recovery Scenario - Disk Failure (Data Volume)

Recovery: The master name server is restored first; it then triggers the restore of the other index and name servers.

If the log is available, transactions are restored.

Recovery: In general, there are three data sources involved in the recovery process:

Data backups stored in the file system

Log backups stored in the file system

Online logs

Figure: SAP HANA 1.0 Backup and Recovery Feature Overview

Exercise 2: Working with Catalog Objects

Use

After completing this exercise, you will be able to:

See whether the content of a table is stored in rows or in columns in the SAP HANA stores

Display the structure of a table

Identify which indexes are defined for a table

Preview the content of a table

During this exercise, you will use two tables from the TRAINING schema:

Table MARA. This is the table that contains all the products in an SAP ERP database.

Table HANA_SEARCH_DATA.

Procedure

1. Start the SAP HANA Studio and log on to the SAP HANA system.

For this exercise you can use either the SAP HANA Administration Console perspective or the SAP HANA Modeler perspective.

So let’s start looking for table MARA.

2. Display table MARA in the Catalog of the SAP HANA system.

Hint: To find the MARA table easily, you will apply a filter to the list of tables.

In the Systems view, expand the catalog to display the content of the Tables node for the TRAINING schema.

Right-click the Tables node and choose Filters....

Figure 63: Filtering Catalog Objects

Enter the filter pattern MA and choose OK.

Double-click the Tables node.

The table list is now filtered and displays only a few tables, including table MARA.

Figure 64: List of Tables

3. Open the definition of table MARA and identify the storage type.

Right-click table MARA and choose Open Definition.

Figure 65: Opening a Table Definition

Figure 66: Table Definition

This screen shows the table structure, that is, all the columns of the table with their types, lengths, and so on.

The table type (column store or row store) is displayed in the top-right corner of this screen.

Here, you see that table MARA is using the column-based storage type.

4. Identify the key columns of table MARA.

The key fields of the table are marked in the Key column.

For table MARA, the key columns are MANDT (client) and MATNR (material number).

Figure 67: Key Fields

5. Preview the data of table MARA.

In the context menu for the table, you can use one of the following options:

Open Content, to simply display the table content.

Open Data Preview, to explore the table contents as raw data (flat table view), build a quick report on distinct values, or build an analysis (such as a chart or table) based on the data, as in a BI tool.

Here, you will use the data preview and display raw data.

Right-click the MARA table and choose Open Data Preview.

6. Open the definition of table HANA_SEARCH_DATA and identify the table type and indexes.

Right-click the Tables node and choose Filters....

Enter the filter pattern HANA and choose OK.

If needed, double-click the Tables node to display the filtered table list.

Right-click table HANA_SEARCH_DATA and choose Open Definition.

Figure 68: Table HANA_SEARCH_DATA

Identify the type of the HANA_SEARCH_DATA table.

To display the list of indexes created on the table, select the Indexes tab.

Your observations should be as follows:

The table is a row-store table.

There are no indexes defined.

Facilitated Discussion

Discussion Questions

Use the following questions to engage the participants in the discussion. Feel free to use your own additional questions.

Lesson Summary

You should now be able to:

Explain the necessity of a persistence layer

Understand the difference between data and log volumes

Describe the reboot process after a power failure

Explain the principles of shadow paging

Unit Summary

You should now be able to:

Understand the architecture of SAP HANA

Explain the necessity of a persistence layer

Understand the difference between data and log volumes

Describe the reboot process after a power failure

Explain the principles of shadow paging
