TERADATA- DAY 1 Teradata Introduction Teradata Architecture Types of spaces Teradata Data Protection Mechanisms Prepared By AnilKumar P

Teradata-Day1


Page 1: Teradata-Day1

TERADATA- DAY 1

Teradata Introduction
Teradata Architecture
Types of spaces
Teradata Data Protection Mechanisms

Prepared by AnilKumar P

Page 2: Teradata-Day1

Teradata Introduction:

What Is Teradata?

-Teradata is a relational database management system (RDBMS) that drives a company's data warehouse.

-The name Teradata comes from the prefix "tera-", derived from Greek, which denotes "trillion".

-Teradata was the first commercial database system to scale to and support a trillion bytes of data.

Page 3: Teradata-Day1

Teradata Advantages :

-Single data store
-Scalability
-Unconditional parallelism (parallel architecture)
-Ability to model the business
-Mature, parallel-aware Optimizer

Page 4: Teradata-Day1

Single Data Store

Teradata acts as a single data store, with multiple client applications querying it concurrently.

Page 5: Teradata-Day1

Scalability

Adding new components to the system increases performance linearly.

Adding components allows the system to accommodate an increased workload without decreased throughput.

Complexity

Teradata is adept at complex data models that satisfy information needs throughout an enterprise.

It can perform large aggregations at query run time and can perform up to 64 joins in a single query.

Page 6: Teradata-Day1

Concurrent Users

Teradata can handle hundreds to thousands of users, who are often running multiple complex queries on the system simultaneously.

Unconditional Parallelism

Teradata's ability to manage large amounts of data rests on parallelism: many individual processors perform smaller tasks concurrently to complete an operation against a huge repository of data.

Teradata's parallelism does not depend on query tuning, limited data quantity, column range constraints, or specialized data models. Teradata has "unconditional parallelism."
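The idea of many processors each handling a smaller task can be illustrated with a minimal Python sketch. It is not Teradata code: `parallel_sum` and the `workers` parameter are hypothetical names, and threads stand in for the system's processors. Each worker sums its own slice of the data, and the partial results are then combined.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, workers=4):
    """Split `values` into `workers` slices, sum each slice concurrently,
    then combine the partial sums (hypothetical illustration only)."""
    # Slice i takes every `workers`-th element starting at index i.
    slices = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, slices))
    return sum(partials)


total = parallel_sum(list(range(1, 101)))
```

The combine step is cheap compared with the per-slice work, which is why adding workers helps as the data grows.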

Page 7: Teradata-Day1

Ability to Model the Business

It supports all types of data models.

Mature, Parallel-Aware Optimizer

Teradata's Optimizer is the most robust in the industry, able to handle:
-Multiple complex queries
-64 joins per query
-Unlimited ad-hoc processing

The Optimizer is parallel-aware, meaning that it has knowledge of system components and determines the least expensive plan (in terms of time) to process queries fast and in parallel.

Page 8: Teradata-Day1

Teradata System :

A Teradata system contains one or more nodes, where the processing for the Teradata Database occurs.

There are two types of Teradata systems:

Symmetric multiprocessing (SMP) - An SMP Teradata system has a single node that contains multiple CPUs sharing a memory pool.

Massively parallel processing (MPP) - Multiple SMP nodes working together comprise a larger, MPP implementation of Teradata. The nodes are connected using the BYNET, which allows multiple virtual processors on multiple nodes to communicate with each other.

Page 9: Teradata-Day1
Page 10: Teradata-Day1

BYNET

The BYNET is a high-speed interconnect that enables nodes in the system to communicate. It has several unique features:

Scalable: Adding nodes increases the system size without a performance penalty, and sometimes even increases performance.

High performance: An MPP system typically has two BYNET networks (BYNET 0 and BYNET 1). Because both networks in a system are active, the system benefits from having full use of the aggregate bandwidth of both the networks.

Fault tolerant: Each network has multiple connection paths. If the BYNET detects an unusable path in either network, it will automatically reconfigure that network so all messages avoid the unusable path.

Load balanced: Traffic is automatically and dynamically distributed between both BYNETs.
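The fault-tolerance and load-balancing behaviour listed above can be pictured with a small Python sketch. This is only a toy model, not the BYNET driver: the class `DualInterconnect`, its methods, the number of paths per network, and the simple alternating policy are all assumptions made for illustration.

```python
from itertools import cycle


class DualInterconnect:
    """Toy model of two active networks (0 and 1), each with several paths.
    All names and the alternating policy are hypothetical."""

    def __init__(self, paths_per_network=2):
        self.usable = {net: set(range(paths_per_network)) for net in (0, 1)}
        self._next_net = cycle((0, 1))  # alternate traffic between networks

    def mark_unusable(self, net, path):
        """Record that a path failed so later messages avoid it."""
        self.usable[net].discard(path)

    def route(self):
        """Return (network, path) for the next message."""
        for _ in range(2):  # try each network at most once
            net = next(self._next_net)
            if self.usable[net]:
                return net, min(self.usable[net])
        raise RuntimeError("no usable path on either network")


ic = DualInterconnect()
first = ic.route()
second = ic.route()
```

With both networks healthy, successive messages alternate between network 0 and network 1. Once a path is marked unusable, `route` simply never returns it again.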

Page 11: Teradata-Day1

BYNET Hardware and Software

The BYNET hardware and software handle the communication between the vprocs and the nodes.

Hardware: The nodes of an MPP system are connected with the BYNET hardware, consisting of BYNET boards and cables.

Software: The BYNET software is installed on every node. This BYNET driver is an interface between the PDE software and the BYNET hardware.

Parallel Database Extensions (PDE)

The Parallel Database Extensions (PDE) software layer was added to the operating system to support the parallel software environment.

Page 12: Teradata-Day1

Teradata Architecture

Page 13: Teradata-Day1

Channel Driver

Channel Driver software is the means of communication between an application and the PEs assigned to channel-attached clients. There is one Channel Driver per node.

Teradata Gateway

Teradata Gateway software is the means of communication between an application and the PEs assigned to network-attached clients. There is one Teradata Gateway per node.

Page 14: Teradata-Day1

Basic components of Teradata Architecture:

-The Parsing Engine
-Message Passing Layer
-Access Module Processor

Parsing Engine: A Parsing Engine (PE) is a vproc that manages the dialogue between a client application and the Teradata Database once a valid session has been established. Each PE can support a maximum of 120 sessions.

Components:
1. PARSER
2. OPTIMIZER
3. DISPATCHER

Message Passing Layer:
•Carrying messages between the AMPs and PEs.
•Point-to-Point, Multi-Cast, and Broadcast communications.
•Merging answer sets back to the PE.

Page 15: Teradata-Day1

Access Module Processor (AMP): The AMP is a virtual processor that controls its portion of the data on the system. The AMPs work in parallel, each AMP managing the data rows stored on its vdisk. AMPs are involved in data distribution and data access in different ways:

-Finding the rows requested
-Lock management
-Sorting rows
-Aggregating columns
-Join processing
-Output conversion and formatting
-Creating the answer set for the client
-Disk space management
-Recovery processing
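How each AMP ends up owning "its portion of the data" can be sketched in Python. Teradata assigns rows to AMPs by hashing a row's primary index value. The sketch below substitutes MD5 and a plain modulo for Teradata's own hashing algorithm and hash map, so `amp_for`, `distribute`, and the column name `id` are illustrative assumptions only.

```python
import hashlib
from collections import defaultdict


def amp_for(key, amp_count):
    """Map a key to an AMP number. MD5 is a stand-in hash, not Teradata's."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def distribute(rows, key_column, amp_count):
    """Group rows by the AMP that would own them."""
    amps = defaultdict(list)
    for row in rows:
        amps[amp_for(row[key_column], amp_count)].append(row)
    return amps


rows = [{"id": i, "amount": i * 10} for i in range(1000)]
amps = distribute(rows, "id", amp_count=4)
rows_per_amp = {amp: len(owned) for amp, owned in sorted(amps.items())}
```

Because the same key always hashes to the same AMP, a lookup by that key needs to visit only one AMP, while a full scan lets every AMP work on its own rows in parallel.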

Page 16: Teradata-Day1

TERADATA Database or Users :

-A database or user must be created from DBC or from an existing database.
-Perm space must be taken from the immediate owner.
-Perm space is used only by tables, join indexes, or stored procedures.
-Unused Perm space is available for Temp/Spool space.

Page 17: Teradata-Day1

Teradata Database Spaces :

PERM – Permanent space
Tables, indexes, subtables, join indexes, and stored procedures use PERM space. PERM space is deducted from the owner database.

CREATE DATABASE WMT_EDW FROM DBC AS
    PERM = 2000000,
    SPOOL = 5000000,
    NO FALLBACK,
    NO AFTER JOURNAL,
    DUAL BEFORE JOURNAL,
    DEFAULT JOURNAL TABLE = WMT.journals;

SPOOL – Working space
Temporary working space that stores intermediate query results and answer sets. SELECT statements use spool space. A large number of non-unique values, poor distribution of data, or joins on such columns can result in an "Insufficient spool" error. Volatile and derived tables use SPOOL space.

TEMP – Working space
TEMP space is acquired by a GTT (Global Temporary Table) when it is materialized.
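The rule that PERM space is deducted from the owner can be sketched as simple bookkeeping in Python. The `Database` class and the 10,000,000-byte starting figure for DBC are hypothetical. Only the name WMT_EDW and its PERM value come from the statement above.

```python
class Database:
    """Toy model of PERM accounting (hypothetical, not a Teradata API)."""

    def __init__(self, name, perm, owner=None):
        if owner is not None:
            if perm > owner.perm:
                raise ValueError(f"{owner.name} has only {owner.perm} bytes of PERM left")
            owner.perm -= perm  # the child's PERM is carved out of its owner
        self.name = name
        self.perm = perm
        self.owner = owner


dbc = Database("DBC", perm=10_000_000)  # assumed starting allocation
edw = Database("WMT_EDW", perm=2_000_000, owner=dbc)
remaining = dbc.perm
```

After the second call, the owner holds 8,000,000 bytes and the new database holds 2,000,000, so the total is unchanged.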

Page 18: Teradata-Day1

Data Protection:

-Locks
-RAID
-Fallback
-Journals
-Clique

Locks: Four types of locks can be applied at three levels.

Levels:
-Database level
-Table level
-Row hash level

Types of Locks:
-Exclusive locks
-Write locks
-Read locks
-Access locks
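How the four lock types interact can be sketched as a compatibility table in Python. The table reflects the commonly documented Teradata behaviour (Access is blocked only by Exclusive, Read coexists with Read and Access, Write coexists only with Access, Exclusive coexists with nothing) and should be checked against the vendor documentation for the release in use. `COMPATIBLE` and `can_grant` are hypothetical names.

```python
# For each requested lock: the set of already-held locks it can coexist with.
COMPATIBLE = {
    "ACCESS": {"ACCESS", "READ", "WRITE"},
    "READ": {"ACCESS", "READ"},
    "WRITE": {"ACCESS"},
    "EXCLUSIVE": set(),
}


def can_grant(requested, held):
    """Return True if `requested` can be granted immediately given the
    locks in `held` on the same object; otherwise the request must wait."""
    return all(lock in COMPATIBLE[requested] for lock in held)


reader_blocked_by_writer = not can_grant("READ", ["WRITE"])
```

An Access lock lets a query read through a concurrent Write lock, at the cost of possibly seeing data that is mid-change.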

Page 19: Teradata-Day1

RAID: Redundant Array of Inexpensive Disks (RAID) is a storage technology that provides data protection at the disk drive level.
-RAID 1: Disk mirroring technique
-RAID 5: Parity checking method

Fallback: Fallback is a Teradata feature that protects data against AMP failure. Fallback uses groups of AMPs that provide data availability and consistency if an AMP is unavailable.
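The Fallback idea can be sketched in Python: every row gets a second copy on a different AMP, and reads switch to that copy when the primary AMP is down. The placement rule used here (the next AMP number) is an assumption made to keep the example short, and `place` and `read_amp` are hypothetical names. It is not how Teradata actually chooses the fallback AMP within a group.

```python
def place(row_id, amp_count):
    """Return (primary_amp, fallback_amp) for a row; the two always differ."""
    if amp_count < 2:
        raise ValueError("fallback needs at least two AMPs")
    primary = row_id % amp_count
    fallback = (primary + 1) % amp_count  # simplified placement rule
    return primary, fallback


def read_amp(row_id, amp_count, down=frozenset()):
    """Pick the AMP that serves a read, skipping AMPs that are down."""
    for amp in place(row_id, amp_count):
        if amp not in down:
            return amp
    raise RuntimeError("both copies of the row are unavailable")


normal = read_amp(5, amp_count=4)
degraded = read_amp(5, amp_count=4, down={1})
```

With four AMPs, row 5 normally reads from AMP 1 and falls over to AMP 2 when AMP 1 is unavailable. The row is lost to readers only if both AMPs are down together.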

Clique: A set of SMPs/nodes that share a common set of disk arrays. A clique provides protection from node failure. If a node fails, all its vprocs migrate to the remaining nodes in the clique (vproc migration). A clique can support up to 128 vprocs.
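Vproc migration within a clique can be pictured with a short Python sketch. The function `migrate`, the node and vproc names, and the round-robin spreading policy are assumptions for illustration. The source only states that the vprocs move to the remaining nodes.

```python
def migrate(clique, failed_node):
    """Return a new node -> vprocs mapping after `failed_node` fails.
    Its vprocs are spread round-robin over the surviving nodes."""
    survivors = sorted(node for node in clique if node != failed_node)
    if not survivors:
        raise RuntimeError("no surviving node left in the clique")
    result = {node: list(clique[node]) for node in survivors}
    for i, vproc in enumerate(clique[failed_node]):
        result[survivors[i % len(survivors)]].append(vproc)
    return result


clique = {
    "node1": ["amp0", "amp1"],
    "node2": ["amp2", "amp3"],
    "node3": ["amp4", "amp5"],
}
after = migrate(clique, "node2")
```

The migrated vprocs can keep working because every node in the clique is attached to the same disk arrays, so the data they manage remains reachable.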

Page 20: Teradata-Day1

Journal:

Teradata journals are used for specific types of data recovery or process recovery.

1. Recovery Journals: Automatically activated when an AMP is taken offline.

2. Transaction Journal: A journal of transaction "before images", used for automatic rollback in the event of transaction failure.

3. Permanent Journal: A user-specified, system-maintained journal, used for recovery from unexpected software and hardware disasters.
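The "before image" mechanism of the Transaction Journal can be sketched in Python. A dictionary stands in for a table, and the `Transaction` class is hypothetical: before each change the old value is saved, and rollback replays the saved values in reverse order.

```python
_MISSING = object()  # marks "key did not exist before the change"


class Transaction:
    """Toy before-image journal over a dict (illustration only)."""

    def __init__(self, table):
        self.table = table
        self.before_images = []  # (key, previous value) in write order

    def write(self, key, value):
        # Save the before image first, then apply the change.
        self.before_images.append((key, self.table.get(key, _MISSING)))
        self.table[key] = value

    def rollback(self):
        # Undo changes newest-first so the oldest image wins.
        for key, old in reversed(self.before_images):
            if old is _MISSING:
                self.table.pop(key, None)
            else:
                self.table[key] = old
        self.before_images.clear()

    def commit(self):
        self.before_images.clear()  # before images are no longer needed


table = {"a": 1}
tx = Transaction(table)
tx.write("a", 2)
tx.write("b", 3)
tx.rollback()
```

After the rollback the table holds exactly what it held before the transaction started, including the removal of the key that the transaction had added.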

Page 21: Teradata-Day1

Thank you