Upload
harishkode
View
223
Download
0
Embed Size (px)
Citation preview
7/31/2019 Teradata 12.0
1/4
ctive data warehousing is the
process of mixing strategic and
operational intelligence on the same plat-
form by utilizing up-to-date enterprise
data. With active data warehousing, very
high requirements are placed on the
underlying technology.
For one, the active data warehouse
must perform considerably well on all
parts of the workload. Not only must
it manage and track a widely mixed
workload under all conditions, but it
must also be capable of managing the
flow of changing data. Second, it must
be possible to implement new applica-tions quickly and easily.
Teradata 12.0 is the next major step
toward making it easy for anyone to
build a high-performance active
data warehouse.
QWhat changes deliver new levels
of performance?
A Partitioned primary index (PPI)is extended to multiple levels,statistics are enhanced to deliver better
estimates, many query estimating and
planning improvements have been
added, and the explain plans are
extended with information that
previously had to be assumed.
Q
How does multi-level PPI
(ML-PPI) work?
A The current PPI capability allowsfor one function to determine thehorizontal partitioning of the table.With
ML-PPI, those partitioning capabilities
are extended to multiple levels. Separate
functions within ML-PPI define the parti-
tioning for each level, which can then be
utilized together or independently for much
more granular access to the desired data in
the table.
For example, an insurance company
may partition first by state, then month.Questions may be asked by state, month
or both, allowing the quantity of data
accessed to better match the requested
analysis. Reducing the data accessed reduces
the I/O load on the system and improves
overall system throughput.
QHow is statistics-based estimation
improved?
A More and varied users can submita greater assortment of queries,making accurate estimation for query
planning extremely important. Statistics
are crucial for guiding the cost-based
optimization of queries.
The number of intervals in the statistics
structure has been doubled, significantlyincreasing the detail, accuracy and under-
standing of skew. Demographics of mul-
tiple column statistics are improved with
TeradataMagazin
Teradata 12.0:Active performance
A
New release adds more value to your enterprise data warehouse.by Todd Walter with contributions from Alan Greenspan
TECH2TECH ASK THE EXPERTS
Benefits of multi-level partitioned primary index
> Increases performance of multi-
dimensional queries where the set of
constrained dimensions changes from
query to query
>Enhances performance of queries withconstraints (including range con-
straints) on multiple columns without
the overhead of secondary indexes
> Decreases the amount of I/O needed
to execute a query
> Provides an access method that
allows multiple ways to access the
data efficiently
> Allows DBAs more partitioning expres-
sions flexibility when there are multiplechoices; the DBA can define multiple
partitioning expressions and obtain
the benefits from each of them and
in combination
A.G.
PAGE 1 | Teradata Magazine | September 2007 | 2007 Teradata Corporation | AR-5386
7/31/2019 Teradata 12.0
2/4
a better picture of how nulls appear
in the data.
Q
What happens if my statistics are
not current?
A Stale statistics can lead to invalidestimations that, in turn, can nega-tively affect query planning. Balancing the
overhead of re-collecting statistics against
the potential of inaccurate statistics often
demands trade-offs. For example, with cer-
tain columns (especially dates), it has been
necessary to collect statistics frequently to
get reasonable plans for queries at or near
the end of the data range.
Teradata 12.0 significantly reduces this
overhead by extrapolating appropriate values
when the collected statistics are not current,
thus avoiding daily collection and reducing
overall system utilization. To continue with
the above scenario, when statistics in a date
column have not been recently re-collected,
the optimizer will determine how much
additional data has been added to the table
and will extend the statistics to account for
the new date ranges. Query planning will
improve even when the statistics are not
collected every day. (See figure.)
QWhat is new in query estimation
and planning?
A Query complexity continues to risewith many nested views, subqueries,derived tables and complex joins. The
Teradata optimizer has been enhanced to
capture additional information about each
nested level in the query and then to carry
that information throughout the planning
process. Improvements have been made
to the estimation of the cost and results
of complex joins. New query rewrite
rules have been added to address many
customer-provided optimization oppor-
tunities, and the rewrite engine can now
execute in multiple passes, giving further
opportunities to improve the plan. More
accurate and consistent query plans will
be produced as a result.
QWhat changes have been made in
the explain plans?
A Size of spools, join and groupkey columns, and qualifiedreferenced object names in the query,
along with additional information, are
included in the explain plan output to
provide more information to the usersreading the explain.
QHow does Teradata improve the
ability to manage the ever-increasing
active workload?
A All of the aforementioned queryestimation accuracy improvementspull double-duty by helping the accuracy
of the workload management rules. More
information can be provided by the user
or application about each query. Work-
load exceptions have been extended and
a traffic cop has been added to automate
changes to workload rules and settings
based on user-defined or system events.
An entirely new way of accessing infor-
mation is being developed to provide
additional management information t
more data warehouse users.
QWhat is a traffic cop?
A New workload management profcan be implemented when systemconditions change or when a user-defin
event such as the end of the load wind
is reached. For example, if a system com
nent has failed or the system is exceptiobusy, the traffic cop can switch rule set
appropriately adjust the workflow base
the current system conditions.
QHow do users and applications pr
more information about the work
on the system?
A Like banding a bird to track itsflight path, a query band can nbe provided with each query. The ba
can contain any number of attribute
and values that provide detailed info
mation about the query and may be
referenced by workload managemen
rules during the query classification
phase. The information is then captu
in the query log, where it can be use
Figure Extrapolated statistics reducecollection requirements
Teradata 12.0 can extrapolate appropriate statistics values when collected statistics are not current.
As this example shows, when a query references a date beyond the range with collected statistics, tdatabase will automatically extrapolate the collected statistics to create an accurate plan. This proce
allows the reduction of frequent statistics collection while getting accurate cost-based optimizations
PAGE 2 | Teradata Magazine | September 2007 | 2007 Teradata Corporation | AR-5386
7/31/2019 Teradata 12.0
3/4
future analysis of the work flowing
through the system.
Query banding is especially valuable for
applications that send work through pooled
sessions, such as traditional session pool
applications, analytic applications using
pooled sessions, business intelligence (BI)
tools and new Web service applications.
Information useful for workload manage-
ment, or for tracking the use of the data
warehouse (such as the application, work unit,
requesting user, etc.,) can be acquired through
the query banding. After the application is
adjusted to provide the information, all of
the tracking and linking to workload man-
agement is handled automatically.
QWhat new workload exceptions
are available?
A System-level workload exceptions havebeen added in Teradata 12.0. Now asingle rule can be used to raise exceptions,
regardless of the querys workload group.
Multiple rules can be assigned to a single work-
load group to allow multiple levels of control
for requests that are not behaving as expected.
QHow will systems management
information be provided in the future?
A Teradata 12.0 includes a new appli-cation programming interface (API)to make the system management informa-
tion available via standard SQL interfaces.
With this capability, data can be retrieved
and control settings altered. This makes the
information available through the standard
ODBC and JDBC interfaces.
On top of those APIs, a new system
management user interface has been
implemented. Using Web services, portal
delivered displays and a fully extensible
architecture, a fundamentally new form
of management interface is provided to
administrative and end users.
QHow will the increasing flow of change
to the data warehouse be handled?
A The continuous flow of data needsto get into the database efficiently andthen must be backed up without affecting
the flow. If availability and disaster recovery
requirements lead to a dual system implemen-
tation, the data must be synchronized with asystem in another data center.
QWhat is new for the data acquisition
and integration process?
A Many implementations use extract,load and transform (ELT) processes,which perform the transformation and
apply steps within the database, taking full
advantage of the parallel data engine to do
the work. Teradata 12.0 adds the ability to
perform bulk merge (upsert), working from
a table of change data and doing all the work
within the database.
Insert and merge operations have been
enhanced with an option to log errors to an
error table rather than initiating an abort. This
means that less effort needs to be spent on get-
ting the data perfect before applying it. Togeth-
er, these two functions significantly improve
the ELT capabilities for the warehouse.
ELT using bulk SQL operations avoids
constraints of the load utilities,allowing
greater use of physical tuning techniques
for active queries (unique secondary and
join indexes) and allowing more use of
active features (triggers, referential integrity).
QHow do we perform backup
without affecting the continuous
loading processes?
A Online Archive now allows backupsto be performed without stopping theload processes. It will automatically initiate a
checkpoint, save a log of changes and back up
the log as part of the backup. On a restore, thelog will automatically be restored and rolled
back to the checkpoint ensuring that a consis-
tent restore has been done.
QHow will the increasing volume of data
changes be synchronized?
A Teradata Replication Services has beenupgraded to significantly increase thebandwidth per replication group. Greater load
volumes and more tables can be handled by a
single replication group.
QWhat is new for building applications?
A Teradatas extensibility strategy takesseveral steps forward in Teradata 12.0,
TeradataMagazin
TECH2TECH ASK THE EXPERTS
Partitioned primary index is extended to multiple levels,
statistics are enhanced to deliver better estimates, many
query estimating and planning improvements have beenadded, and the explain plans are extended with information
that previously had to be assumed.
PAGE 3 | Teradata Magazine | September 2007 | 2007 Teradata Corporation | AR-5386
7/31/2019 Teradata 12.0
4/4
allowing for more applications to be im-
plemented closer to the data in the data
warehouse. The extensibility functions will
be used to deliver Teradata functionality as
well; XML support and spatial data supportwill be the first examples.
QAre we able to wr ite our applications
in Java?
A Stored procedures (SPs) can nowbe implemented in Java.An implemen-ter can choose the ANSI Stored Procedure
Language, C/C++ or Java as the implemen-
tation language for procedures to run in the
database. Application developers who wish to
implement their entire application in Java can
write the database portion in Java as well.
QWhen can we return result sets from
Teradata stored procedures?
A Teradata 12.0 includes the ability toreturn sets of result rows from a SQLstored procedure. This will allow SPs to per-
form a wider range of application functions.
QHow can we write C/C++ proce-
dures that easily utilize data from
the warehouse?
A With a newly defined interface, itis easier to submit SQL to Teradatafrom within a C/C++ stored procedure.
This makes it simpler to utilize data from
the warehouse while also accessing external
interfaces or libraries such as connecting
to the message bus or linking to another
database engine.
Q
What if we want to return more general
result sets from a table function?
A This release includes a new methodfor defining the result row from atable function that allows the result para-
meters and data types at the invocation
time of the function.
A general-purpose function can now be
written to operate upon a more general
source of data and return data in the form
desired by each request.
QWhat else is included in Teradata 12.0?
A Many additional features addressopportunities in scalability, secur-ity, unicode support, system tools (such as
scandisk and checktable) and index wizard
support for recommending PPIs for new
or existing tables. To review the detailed
release documentation for a complete view,
visit Teradata.com/resources and select
Technical Documentation.
Activate your enterpriseTeradata has made major continuous
progress in delivering the supporting
technology for building an active data
warehouse. Each step makes it easier and
more automatic to deliver the endemic
access to integrated data that every enterprise
is demanding. Teradata 12.0 is the next major
step in that evolution, making major progress
in query planning, statistics collection, per-
formance, workload management, system
operations and extensibility.Installing and taking advantage of this tech-
nology will position your Teradata platform
for the easiest possible active data warehouse
implementation or as an extension of your
current implementation to provide more users
the data they need to deliver the best possible
value to your organization every day.T
Todd Walter, CTO, Teradata Development
Division, oversees R&D efforts for Teradata
Database software and systems. He is also
responsible for the future vision and devel-
opment of the active data warehouse.
Alan Greenspan is the product marketing
manager for Teradata Database and
load utilities. He has been with Teradata
for 15 years.
The next release of the core Teradat
product set has a new name in add
tion to a broad collection of exciting and
innovative capabilities: Teradata 12.0. N
all of the software products at the core o
the data warehouse infrastructure will ha
a consistent, unified numbering sequen
Beginning with this release, the set of
products that are designed, developed
and tested together will have a single
release number.
Teradata selected the number 12 to s
the unified numbering sequence becaus
it is the 12th major database release in
Teradatas history. This current release w
preceded by five major database releas
of Version 1 (TOS-based, Teradata
Operating System), and six major datab
releases of Version 2 (Open OS-based
UNIX, Linux and Windows).
Effective with the new release, the
numbering system will be:
> Teradata 12.0
> Teradata Database 12.0
> Teradata Tools and Utilities (TTU) 1
Teradata Database 12.0 and Teradata
Tools and Utilities 12.0 are included in thTeradata 12.0 release. The individual clie
Tools and Utilities products will also adju
to the 12.0 numbering sequence.
The term Teradata Warehouse, init
developed for use within the company,
adopted externally to package releases
and indicate system-level certification of
the various software products with differ
release numbers. This certification will
continue as an important step in the dev
opment cycle, but the unified numberin
scheme will replace the need for a sepa
rate warehouse package designation.
Minor releases will continue to be
numbered in the current method. For
example, the next major release will be
Teradata 13.0 and the minor release
following it will be Teradata 13.1.
A
Teradata 12.0: The 12tmajor database release
PAGE 4 | Teradata Magazine | September 2007 | 2007 Teradata Corporation | AR-5386